Operation: Rebuild Dataset Volume
<aside> 💡 This document describes how to create a new dataset volume and migrate the data from the old volume to the new one. The new volume is created with the file system type given by the current filesystem setting.
</aside>
Step 1: Check the PV and PVC status and export the YAML files.
Check that the reclaim policy of each PV is Retain and get the {pv-name} (written as pvc-<pv-name-old> in the following steps).
Record the PV size; the new PV created in a later step must have the same size.
$ kubectl get pv | grep dataset-{dataset_name}
hub-nfs-dataset-{dataset_name} 2Gi RWX Retain Bound hub/dataset-{dataset_name} 70d
pvc-<pv-name-old> 2Gi RWO Retain Bound hub/data-nfs-dataset-{dataset_name}-0 rook-block 70d
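The reclaim policy can also be read from the PV object itself instead of from the table columns. The following is a minimal sketch, assuming kubectl is already configured for the cluster; the helper name require_retain is hypothetical and not part of any tool.

```shell
# require_retain PV_NAME
# Succeeds only if the PV's reclaim policy is Retain.
# The helper name is illustrative; the jsonpath field is part of the PV spec.
require_retain() (
  pv="$1"
  policy=$(kubectl get pv "$pv" -o jsonpath='{.spec.persistentVolumeReclaimPolicy}') || exit 1
  if [ "$policy" != "Retain" ]; then
    echo "PV $pv has reclaim policy '$policy', expected Retain" >&2
    exit 1
  fi
  echo "PV $pv: reclaim policy is Retain"
)

# Usage (replace the placeholders first):
#   require_retain "pvc-<pv-name-old>" && require_retain "hub-nfs-dataset-{dataset_name}"
```

If the check fails, do not continue with the deletion steps until the policy has been changed to Retain.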
Check that the PVC is not mounted.
$ kubectl -n hub describe pvc dataset-{dataset_name} | grep -e '\(Mounted\|Used\) By'
Mounted By: <none>
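Depending on the kubectl version, the field is labelled Mounted By or Used By, which is why the grep above matches both. For use in a script, the same check can be turned into an exit status. This is a sketch only; pvc_is_unmounted is a hypothetical helper that reads the describe output on stdin.

```shell
# pvc_is_unmounted
# Reads `kubectl describe pvc` output on stdin and succeeds only if the
# "Mounted By" / "Used By" field is <none>. A missing field counts as failure.
pvc_is_unmounted() {
  grep -E '^[[:space:]]*(Mounted|Used) By:' | grep -q '<none>'
}

# Usage (replace the placeholder first):
#   kubectl -n hub describe pvc dataset-{dataset_name} | pvc_is_unmounted \
#     && echo "PVC is not mounted"
```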
Export the YAML files.
$ kubectl -n hub get pvc data-nfs-dataset-{dataset_name}-0 -o yaml > data-nfs-dataset-{dataset_name}-0.yaml
$ kubectl -n hub get pvc dataset-{dataset_name} -o yaml > dataset-{dataset_name}.yaml
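Because the next steps delete these objects, it can help to confirm that both exports were actually written. A minimal sketch, assuming the same PVC names as above; export_pvcs and the output directory argument are hypothetical.

```shell
# export_pvcs DATASET_NAME [OUT_DIR]
# Writes both PVC manifests into OUT_DIR and fails if either export is empty.
export_pvcs() (
  dataset="$1"
  out="${2:-.}"
  mkdir -p "$out" || exit 1
  for pvc in "data-nfs-dataset-${dataset}-0" "dataset-${dataset}"; do
    kubectl -n hub get pvc "$pvc" -o yaml > "$out/$pvc.yaml" || exit 1
    # An empty file means nothing was exported; stop before anything is deleted.
    [ -s "$out/$pvc.yaml" ] || { echo "empty export: $out/$pvc.yaml" >&2; exit 1; }
  done
)

# Usage (replace the placeholder first):
#   export_pvcs {dataset_name} ./backup
```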
Step 2: Delete dataset PVC and PV.
$ kubectl -n hub delete pvc dataset-{dataset_name}
$ kubectl delete pv hub-nfs-dataset-{dataset_name}
Step 3: Delete data-nfs PVC.
Notice: Please do NOT delete the data-nfs PV. Deleting it might cause data loss.
$ kubectl -n hub delete pvc data-nfs-dataset-{dataset_name}-0
Step 4: Make sure the data-nfs PV is no longer bound to a PVC.
Check that the PV status is Released.
$ kubectl get pv pvc-<pv-name-old>
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-<pv-name-old> 2Gi RWO Retain Released hub/data-nfs-dataset-{dataset_name}-0 rook-block 48d
Remove claimRef from the PV.
$ kubectl patch pv pvc-<pv-name-old> -p '{"spec":{"claimRef":null}}'
After the command, check that the PV status is Available.
$ kubectl get pv pvc-<pv-name-old>
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-<pv-name-old> 2Gi RWO Retain Available rook-block 48d
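The patch and the follow-up check can be combined so that the procedure only continues once the PV has really become Available. This is a sketch under assumptions: release_pv is a hypothetical helper, and the 60-second default timeout and 2-second poll interval are arbitrary choices.

```shell
# release_pv PV_NAME [TIMEOUT_SECONDS]
# Clears claimRef, then polls the PV phase until it is Available.
release_pv() (
  pv="$1"
  timeout="${2:-60}"
  kubectl patch pv "$pv" -p '{"spec":{"claimRef":null}}' >/dev/null || exit 1
  waited=0
  while [ "$waited" -le "$timeout" ]; do
    phase=$(kubectl get pv "$pv" -o jsonpath='{.status.phase}')
    if [ "$phase" = "Available" ]; then
      echo "PV $pv is Available"
      exit 0
    fi
    sleep 2
    waited=$((waited + 2))
  done
  echo "PV $pv still in phase '$phase' after ${timeout}s" >&2
  exit 1
)

# Usage (replace the placeholder first):
#   release_pv "pvc-<pv-name-old>"
```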
Step 5: Create a temporary data-nfs PVC and bind it to the old data-nfs PV.
Create a YAML file for the temporary PVC.
$ vim old-data-nfs-{dataset_name}.yaml
Note: The requested storage must be the same as that of the previous PVC.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-nfs-dataset-{dataset_name}-old-0 # Need to be changed
  namespace: hub
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi # Change to target pv size.
  storageClassName: rook-block
  volumeMode: Filesystem
  volumeName: pvc-<pv-name-old> # Need to be changed
Apply the old-data-nfs-{dataset_name}.yaml
$ kubectl -n hub apply -f old-data-nfs-{dataset_name}.yaml
persistentvolumeclaim/data-nfs-dataset-{dataset_name}-old-0 created
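Instead of editing the three placeholders by hand, the manifest can be generated from arguments. The sketch below is one possible way to do it; render_old_pvc is a hypothetical helper, and the values in the usage line are examples rather than names from a real cluster.

```shell
# render_old_pvc DATASET_NAME OLD_PV_NAME SIZE
# Prints the temporary PVC manifest that pins the claim to the old PV.
render_old_pvc() {
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-nfs-dataset-$1-old-0
  namespace: hub
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: $3
  storageClassName: rook-block
  volumeMode: Filesystem
  volumeName: $2
EOF
}

# Usage (example values only):
#   render_old_pvc mydataset pvc-1234 2Gi > old-data-nfs-mydataset.yaml
#   kubectl -n hub apply -f old-data-nfs-mydataset.yaml
```

Writing the output to a file first keeps a reviewable copy of what was applied.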
Step 6: Create a new data-nfs PVC; its new PV is provisioned automatically.
Create a YAML file for the new PVC.
$ vim new-data-nfs-{dataset_name}.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: primehub-group
    primehub-group: {dataset_name} # Need to be changed
    role: nfs-server
  name: data-nfs-dataset-{dataset_name}-0 # Need to be changed
  namespace: hub
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi # Change to target pv size.
  storageClassName: rook-block
  volumeMode: Filesystem
Apply new-data-nfs-{dataset_name}.yaml
$ kubectl -n hub apply -f new-data-nfs-{dataset_name}.yaml
persistentvolumeclaim/data-nfs-dataset-{dataset_name}-0 created
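Before copying data it is worth confirming that the old and new claims are bound to two different PVs. A minimal sketch follows; pvc_volume and check_distinct_volumes are hypothetical helpers. Note the assumption about binding: if the storage class delays binding until a pod uses the claim, the new claim may stay Pending (and this check will fail) until the rsync pod in the next step is running.

```shell
# pvc_volume NAMESPACE PVC_NAME
# Prints the PV name the claim is bound to (empty while the claim is Pending).
pvc_volume() {
  kubectl -n "$1" get pvc "$2" -o jsonpath='{.spec.volumeName}'
}

# check_distinct_volumes DATASET_NAME
# Succeeds only if the old and new data-nfs claims are bound to different PVs.
check_distinct_volumes() (
  old=$(pvc_volume hub "data-nfs-dataset-$1-old-0")
  new=$(pvc_volume hub "data-nfs-dataset-$1-0")
  if [ -z "$old" ] || [ -z "$new" ]; then
    echo "a claim is not bound yet (old='$old', new='$new')" >&2
    exit 1
  fi
  if [ "$old" = "$new" ]; then
    echo "both claims point at $old" >&2
    exit 1
  fi
  echo "old=$old new=$new"
)

# Usage (replace the placeholder first):
#   check_distinct_volumes {dataset_name}
```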
Step 7: Mount both PVCs and rsync the data from the old data-nfs PV to the new data-nfs PV.
$ kubectl mountpvc --pvc data-nfs-dataset-{dataset_name}-old-0 --pvc data-nfs-dataset-{dataset_name}-0 -n hub rsync-pvc --image ubuntu:18.04 --command -- sleep 100000 | kubectl apply -f -
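Open a shell in the rsync pod and run rsync until all files are copied.
$ kubectl -n hub exec -it $(kubectl -n hub get pod -l app=rsync-pvc | cut -d' ' -f1 | grep -v NAME) -- bash
Note: The trailing slashes in the rsync command matter. With a trailing slash on the source, rsync copies the contents of the directory; without it, rsync creates the source directory itself inside the destination.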
$ apt-get update && apt-get install -yq rsync
$ rsync -avP /pvcs/data-nfs-dataset-{dataset_name}-old-0/ /pvcs/data-nfs-dataset-{dataset_name}-0/
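After rsync finishes, the two trees can be compared inside the pod before anything is deleted. The sketch below assumes the GNU coreutils and findutils that ship with the ubuntu image; tree_digest is a hypothetical helper. It covers only the relative paths and contents of regular files, not ownership, permissions, or symlinks.

```shell
# tree_digest DIR
# Prints one SHA-256 digest covering the relative path and content of every
# regular file under DIR, so two copies can be compared with a single string.
tree_digest() (
  cd "$1" || exit 1
  find . -type f -print0 | LC_ALL=C sort -z | xargs -0 -r sha256sum | sha256sum | cut -d' ' -f1
)

# Usage inside the rsync pod (replace the placeholder first):
#   [ "$(tree_digest /pvcs/data-nfs-dataset-{dataset_name}-old-0)" = \
#     "$(tree_digest /pvcs/data-nfs-dataset-{dataset_name}-0)" ] && echo "copies match"
```

On large datasets this reads every file twice, so it can take a while.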
Step 8: Delete the rsync deployment.
$ kubectl -n hub delete deploy rsync-pvc
Step 9: Create a dataset pvc:
$ vim new-dataset-{dataset_name}.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    primehub-group: {dataset_name} # Need to be changed
    primehub-group-sc: standard
  name: dataset-{dataset_name} # Need to be changed
  namespace: hub
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 2Gi # Change to target pv size
  selector:
    matchLabels:
      primehub-group: dataset-{dataset_name} # Need to be changed
      primehub-namespace: hub
  storageClassName: ""
  volumeMode: Filesystem
  volumeName: hub-nfs-dataset-{dataset_name} # Need to be changed
$ kubectl -n hub apply -f new-dataset-{dataset_name}.yaml
persistentvolumeclaim/dataset-{dataset_name} created
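The requested size in the new manifest should match the size in the YAML exported in Step 1. A small sketch for comparing the two files; requested_storage and same_size are hypothetical helpers, and the approach assumes that the first storage: key in each file is the one under spec.resources.requests, which holds for the manifests shown in this document.

```shell
# requested_storage FILE
# Prints the value of the first "storage:" key in a PVC manifest.
requested_storage() {
  awk '/^[ \t]*storage:/ { print $2; exit }' "$1"
}

# same_size OLD_FILE NEW_FILE
# Succeeds if both manifests request the same, non-empty size.
same_size() {
  [ -n "$(requested_storage "$1")" ] &&
    [ "$(requested_storage "$1")" = "$(requested_storage "$2")" ]
}

# Usage (replace the placeholder first):
#   same_size dataset-{dataset_name}.yaml new-dataset-{dataset_name}.yaml \
#     || echo "requested sizes differ"
```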
Step 10: Delete the old PVC.
NOTICE: Run the following command only after the files have finished syncing and the user has checked them.
$ kubectl -n hub delete pvc data-nfs-dataset-{dataset_name}-old-0
Check that a PrimeHub Notebook can see the target volume files in the corresponding folder.
→ Create a PrimeHub Notebook and check the specific folder.