Problem and Goal

This article provides a fast and clean way to unmap a Ceph RBD image when a Kubernetes PVC has a problem mounting into PrimeHub.

Environment

A Kubernetes cluster with Rook Ceph installed.

Step-by-step Method

  1. Check the NFS pod that cannot be created (stuck in ContainerCreating).

    $ kubectl -n hub get pods | grep ContainerCreating
    nfs-project-test-0            0/1        ContainerCreating          0     1m1s
    $ kubectl -n hub describe pods nfs-{target-project}-0
    .....
    rpc error: code = Internal desc = rbd image replicapool/{csi-vol-name} is still being used.
    **Or, if your Rook Ceph version is below 1.0, you will see a message like this:**
    rpc error: code = Internal desc = rbd image replicapool/{pv-name} is still being used.
    
    Unable to attach or mount volumes: unmounted volumes=[XXX], unattached volumes=[XXX], ... timed out waiting for the condition.
    .....
    
  2. The NFS pod has trouble mounting the PV because of an incorrect (stale) RBD image assignment from Ceph. We need to unmap the incorrect RBD image that points to the k8s PV so that Ceph can remap a new RBD image to the PV and resolve the issue.
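
    To confirm the image really is still mapped somewhere, you can check its watchers. This is an optional sketch that assumes the rook-ceph-tools (toolbox) deployment is installed in the rook-system namespace; replace replicapool/{csi-vol-name} with the pool/image shown in your own error message, and note that the output below is only illustrative.

    $ kubectl -n rook-system exec -it deploy/rook-ceph-tools -- rbd status replicapool/{csi-vol-name}
    Watchers:
            watcher=192.168.1.1:0/1234567890 client.54321 cookie=1

    A non-empty watcher list means some node still has the image mapped.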

  3. Based on the Rook Ceph version you have installed, follow the corresponding section below to proceed.

    1. Newer Rook Ceph (≥ v1.0): run the commands in the csi-rbdplugin pod.
    2. Older Rook Ceph (< v1.0): run the commands in the rook-ceph-agent pod.
  4. Use the command below to find your Rook Ceph version:

    $ kubectl -n rook-system get deployment rook-ceph-operator -o yaml | grep 'image:'
    image: rook/ceph:v1.1.9
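
    If you prefer to print only the image string, a jsonpath query like the one below should also work; this is just a sketch and assumes the operator container is the first container in the deployment.

    $ kubectl -n rook-system get deployment rook-ceph-operator -o jsonpath='{.spec.template.spec.containers[0].image}{"\n"}'
    rook/ceph:v1.1.9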
    

Rook Ceph version ≥ 1.0

  1. Find the target pv name.

    $ kubectl -n hub get pvc | grep {target-project}
    

    In the VOLUME column you will get a string with the pvc- prefix, for example pvc-048ff4b5-c60d-41c2-ba97-11adf4aa845e.

    This is the name of the target PV; write it down for later use.
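
    Alternatively, if you already know the exact PVC name, you can read the PV name directly with a jsonpath query. This is a sketch; {target-project-pvc-name} is a placeholder for your actual PVC name.

    $ kubectl -n hub get pvc {target-project-pvc-name} -o jsonpath='{.spec.volumeName}{"\n"}'
    pvc-048ff4b5-c60d-41c2-ba97-11adf4aa845e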

  2. Map the PV name we noted earlier to a csi-vol name.

    $ kubectl get pv {pv-name} -o json | jq -r ".spec.csi.volumeHandle[-36:]"
    

    You will get a 36-character string, for example b760fe24-5080-11ed-b3db-eae54ebba3c1.

    This is the suffix of the csi-vol name of the target PV (the full image name is csi-vol-{suffix}); write it down for later use.
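
    If you want to see why the last 36 characters are taken, you can test the slice locally on a string. The volumeHandle below is only an illustrative example, not a value from your cluster.

    $ echo '0001-0009-rook-ceph-0000000000000001-b760fe24-5080-11ed-b3db-eae54ebba3c1' | jq -Rr '.[-36:]'
    b760fe24-5080-11ed-b3db-eae54ebba3c1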

  3. Put the {csi-vol-name} in the command below and run.

    $ kubectl -n rook-system get pods | grep csi-rbdplugin | grep -v provisioner | cut -d' ' -f1 | xargs -I{} sh -c 'echo {} ; kubectl -n rook-system exec -ti {} -c csi-rbdplugin -- rbd device list | grep {csi-vol-name}' 2>/dev/null
    csi-rbdplugin-xxxxx
    1  replicapool           csi-vol-b760fe24-5080-11ed-b3db-eae54ebba3c1 -    /dev/rbdX 
    csi-rbdplugin-qqqqq
    csi-rbdplugin-wwwww
    

    You will get the RBD image name and the csi-rbdplugin pod to which the RBD image is mapped.

    Find the node on which that csi-rbdplugin pod is running and write it down for later use.

    $ kubectl -n rook-system get pods -o wide | grep {csi-rbdplugin-pod-name}
    csi-rbdplugin-xxxxx                          3/3     Running   3          1d   192.168.1.1      node01   <none>           <none>
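
    If the xargs one-liner at the start of this step is hard to read, the loop below is an equivalent sketch that checks each csi-rbdplugin pod in turn.

    $ for p in $(kubectl -n rook-system get pods -o name | grep csi-rbdplugin | grep -v provisioner); do
        echo "$p"
        kubectl -n rook-system exec "$p" -c csi-rbdplugin -- rbd device list 2>/dev/null | grep {csi-vol-name}
      done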
    
  4. Connect to the node hosting the target RBD image and list all the mount points of this RBD image.

    $ mount | grep '/dev/rbdX'
    .....
    /dev/rbdX on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-..... type ext4 (rw,relatime,stripe=16)
    /dev/rbdX on /var/lib/kubelet/pods/...../mount type ext4 (rw,relatime,stripe=16)
    .....
    
  5. Unmount all the mount points of this RBD image that you found in the previous step.

    $ sudo umount -fl {mountpoint-path}
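
    If there are many mount points, the sketch below unmounts everything matching the device in one go; double-check that the grep pattern only matches the intended /dev/rbdX before running it.

    $ mount | grep '/dev/rbdX on ' | awk '{print $3}' | xargs -r sudo umount -fl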
    
  6. Go back to the k8s control node and unmap the RBD image from the k8s PV.

    Remove the -o force argument if you prefer a gentler approach.

    $ kubectl -n rook-system exec -it {rook-csi-rbdplugin-pod-name} -c csi-rbdplugin -- rbd unmap -o force /dev/rbdX
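
    To verify that the unmap succeeded, list the devices in the same pod again; the csi-vol entry should no longer appear (same placeholders as above).

    $ kubectl -n rook-system exec -it {rook-csi-rbdplugin-pod-name} -c csi-rbdplugin -- rbd device list | grep {csi-vol-name}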
    
  7. After the unmapping, restart the NFS pod to reconnect to the k8s PV; in the meantime, a new RBD image mapping will be created to link the k8s PV to Ceph.

    $ kubectl -n hub get pods | grep nfs-{target-project}-0
    $ kubectl -n hub delete pods {nfs-target-project-pod-name}
    

    After the NFS pod restarts, check the project volume again to confirm it is connected successfully.
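
    One way to confirm the pod comes back healthy, assuming it is managed by a StatefulSet that recreates it under the same name, is to wait for its Ready condition:

    $ kubectl -n hub wait --for=condition=Ready pod/nfs-{target-project}-0 --timeout=300s
    pod/nfs-{target-project}-0 condition met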

Rook Ceph version < 1.0

  1. Find the target pv name.

    $ kubectl -n hub get pvc | grep {target-project}
    

    In the VOLUME column you will get a string with the pvc- prefix, for example pvc-048ff4b5-c60d-41c2-ba97-11adf4aa845e.

    This is the name of the target PV; write it down for later use.

  2. Put the {pv-name} in the command below and run.

    $ kubectl -n rook-system get pods | grep rook-ceph-agent | cut -d' ' -f1 | xargs -I{} sh -c 'echo {} ; kubectl -n rook-system exec -ti {} -- rbd device list | grep {pv-name}' 2>/dev/null
    1  replicapool           pvc-048ff4b5-c60d-41c2-ba97-11adf4aa845e -    /dev/rbdX 
    

    You will get the RBD image name and the rook-ceph-agent pod to which the RBD image is mapped.

    Find the node on which that rook-ceph-agent pod is running and write it down for later use.

    $ kubectl -n rook-system get pods -o wide | grep {rook-ceph-agent-pod-name}
    rook-ceph-agent-xxxxx                          3/3     Running   3          1d   192.168.1.1      node01   <none>           <none>