Why

If you need to perform maintenance on a node, follow this document to cordon the node, shut it down, and uncordon it again afterwards.

Tutorial
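
Note: throughout this document, k is assumed to be a local shell helper that forwards ceph subcommands into the Rook toolbox pod and passes everything else to kubectl. A minimal sketch of such a helper (the toolbox label and pod lookup are assumptions, not part of this document):

    # Hypothetical helper: run ceph commands inside the Rook toolbox pod,
    # pass everything else straight to kubectl.
    k() {
      if [ "$1" = "ceph" ]; then
        kubectl -n rook exec -i "$(kubectl -n rook get pod -l app=rook-ceph-tools -o name)" -- "$@"
      else
        kubectl "$@"
      fi
    }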

I. Cordon and shut down the Kubernetes node

  1. Set the Ceph noout flag so that OSDs on the node are not marked out and data is not rebalanced while it is down:

    $ k ceph osd set noout
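
    To confirm the flag is set (ceph osd dump is a standard Ceph command, run here through the assumed k helper):

    $ k ceph osd dump | grep flags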
    
  2. Cordon the node <node-name>:

    $ k get nodes
    $ k cordon <node-name>
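
    A cordoned node reports SchedulingDisabled in its STATUS column; for example (output illustrative):

    $ k get nodes | grep <node-name>
    <node-name>   Ready,SchedulingDisabled   worker    3y98d    v1.15.3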
    
  3. Stop the notebook pods on node <node-name>. Notebook pods in the hub namespace are typically standalone pods with no controller, so delete them explicitly before draining:

    $ kubectl -n hub get pods -o wide | grep <node-name>
    $ kubectl -n hub delete pods <notebook-name>
    
  4. Drain node <node-name>:

    $ k drain <node-name> --delete-local-data --ignore-daemonsets
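
    Note: on newer kubectl releases (v1.20 and later) the --delete-local-data flag was renamed, so the equivalent command would be:

    $ k drain <node-name> --delete-emptydir-data --ignore-daemonsets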
    
  5. Shut down <node-name>:

    $ sudo systemctl poweroff

II. Uncordon the node after the machine boots up.

  1. SSH to the node, use lsblk to check the block devices, and reactivate the Ceph LVM volume groups so the OSDs can start:

    $ ssh <node-name>
    $ lsblk
    $ sudo vgs | grep '^\s\sceph-' | awk '{print $1}' | xargs -I{} sudo vgchange -a y {}
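
    Activating the volume groups lets the Rook OSD pods on the node start again; a quick check (using the rook namespace from the commands below):

    $ kubectl -n rook get pod -o wide | grep <node-name>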
    
  2. Uncordon <node-name>:

    $ k uncordon <node-name>
    <node-name> uncordoned
    
  3. Check the node status:

    $ kubectl get nodes | grep <node-name>
    <node-name>   Ready    worker                     3y98d    v1.15.3
    
  4. Unset noout:

    $ k ceph osd unset noout
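
    To confirm the flag was cleared, noout should no longer appear in the flags line (same assumed k helper):

    $ k ceph osd dump | grep flags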
    
  5. Wait until the Ceph status is HEALTH_OK:

    ubuntu@<node-name>:~$ k ceph status
      cluster:
        id:     e3c254e8-7e80-4d0c-bef3-1142bd30a0a1
        health: HEALTH_OK
     
      services:
        mon: 3 daemons, quorum v,u,s
        mgr: a(active)
        osd: 29 osds: 29 up, 29 in
     
      data:
        pools:   8 pools, 1212 pgs
        objects: 2.54 M objects, 9.4 TiB
        usage:   28 TiB used, 105 TiB / 133 TiB avail
        pgs:     1211 active+clean
                 1    active+clean+scrubbing+deep
     
      io:
        client:   21 KiB/s wr, 0 op/s rd, 2 op/s wr
    
    Also confirm that all OSD pods are Running:

    $ kubectl -n rook get pod | grep osd
    
