How can I prepare an LKE Cluster for Scheduled Maintenance?
I received a ticket letting me know some of my Linodes will be undergoing scheduled maintenance which involves a reboot. Are there any steps to follow to ensure my workloads aren't impacted by this?
There are a few things to take care of before a reboot occurs. Below you'll find some steps you can take to prepare one of your LKE nodes for maintenance. To avoid potential impacts to your workloads, we recommend you drain any impacted nodes well ahead of time.
You can find any impacted nodes, and download a list of them as needed, from the Linodes page of the Cloud Manager.
For customers with scheduled maintenance and sensitive workloads that cannot tolerate downtime (such as applications that require a quorum like CockroachDB or clustered Redis), we recommend the following steps:
If you use `hostPath` or other local storage techniques, back up the data on those Linodes scheduled for maintenance. For example, you can attach a Block Storage PV and copy the data there, pipe the data to another Linode using network utilities, or back up the data to Linode Object Storage. (If you are using Block Storage Persistent Volumes, you likely do not need to take this step.)
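As a rough sketch of the "pipe the data elsewhere" option, you can stream a tarball of a pod's `hostPath` directory out through `kubectl exec` and then upload it to Object Storage. The pod name, mount path, and bucket name below are placeholders for your own values:

```shell
# Stream the contents of the hostPath mount out of the pod as a tarball.
# $PODNAME and /data are placeholders for your pod and its hostPath mount.
kubectl exec $PODNAME -- tar czf - /data > data-backup.tar.gz

# Optionally upload the archive to Linode Object Storage, e.g. with s3cmd
# (my-backup-bucket is a placeholder bucket name).
s3cmd put data-backup.tar.gz s3://my-backup-bucket/
```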
Add an additional Node Pool to your LKE cluster, of a plan type and size which can accommodate your existing workloads.
After the new Linodes have joined the cluster, drain any Linodes scheduled for maintenance. This causes the workloads to be rescheduled onto other Linodes in the cluster. We recommend draining the Linodes in your cluster one at a time, confirming that the workloads have been rescheduled to new Linodes and are running before moving on to the next one. Depending on your workloads, `kubectl drain` may also require the `--ignore-daemonsets` and `--delete-emptydir-data` flags. An example Node drain command:
kubectl drain lke123-634-4b0e9f2a4718 --ignore-daemonsets --delete-emptydir-data
At this point you may delete the old Node Pool, or keep it until after the maintenance is complete. Note that if you keep the Node Pool, you will continue to be charged for it.
If maintenance has been completed and you kept your previous Linodes, once they have booted you can mark them as schedulable again by using the following command:
kubectl uncordon lke123-634-4b0e9f2a4718
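Putting the drain and uncordon steps together, a per-node maintenance loop might look like the following sketch. The node names are examples; substitute the names of your impacted Linodes:

```shell
# Drain each affected node one at a time, letting workloads reschedule
# onto the new Node Pool before moving on to the next node.
for NODE in lke123-634-4b0e9f2a4718 lke123-634-5c1fa3b5829a; do
    kubectl drain "$NODE" --ignore-daemonsets --delete-emptydir-data
    # Check that no application pods remain on the drained node and that
    # rescheduled pods are Running elsewhere before continuing.
    kubectl get pods -A -o wide | grep "$NODE"
done

# After maintenance completes and the nodes have booted, mark them
# schedulable again.
kubectl uncordon lke123-634-4b0e9f2a4718
```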
In addition to the above, we strongly recommend avoiding the use of local storage on LKE nodes when possible. Kubernetes workloads move around the cluster, which enables use cases like highly available distributed systems. We recommend that any customers using storage on the filesystem of the Linodes in an LKE cluster move this data to Persistent Volumes with network attached storage.
A volume is considered to be using local storage if it is of type `hostPath`, `emptyDir`, or `local`. You can use the following command to determine if any of your pods are using local storage volumes:
kubectl get pods -A -oyaml | grep -E 'hostPath|emptyDir|local'
Some workloads use these volume types only for temporary scratch storage, which does not need to be addressed. Consider whether you are using the "host" (Node) disk for your application data.
If you are using hostPath to store your application data, you can follow this process to copy it over to a Persistent Volume.
- Create a Persistent Volume Claim
- Deploy a new Pod that mounts the new Persistent Volume Claim along with the hostPath of your existing pod. It should look similar to the example below:
Existing Pod's Volumes and VolumeMounts
```yaml
# shared-data is using local storage
volumes:
  - name: shared-data
    hostPath:
      path: $HOSTPATH
# ........
volumeMounts:
  - name: shared-data
    mountPath: $MOUNTPATH
```
New Pod With Both Volumes Mounted
```yaml
# shared-data is using local storage, pvc-test is a Persistent Volume Claim
volumes:
  - name: shared-data
    hostPath:
      path: $HOSTPATH
  - name: pvc-test
    persistentVolumeClaim:
      claimName: pvc-test
# ........
volumeMounts:
  - name: shared-data
    mountPath: $MOUNTPATH
  - name: pvc-test
    mountPath: $CSIVolumePath
```
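For the first step above, a minimal PVC manifest might look like the following sketch. The claim name and requested size are placeholders, and the storage class is an assumption (the Linode CSI driver provides `linode-block-storage` and `linode-block-storage-retain` classes):

```yaml
# pvc-test.yaml -- example claim; name, size, and storage class are
# placeholders to adapt to your cluster.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-test
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: linode-block-storage-retain
  resources:
    requests:
      storage: 10Gi
```

The claim can then be created with `kubectl apply -f pvc-test.yaml`.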
Connect to a shell in your new pod using `kubectl exec -it $PODNAME -- /bin/bash`, where `$PODNAME` is the name of the new pod you created.
From the shell, copy the files from local storage to the PVC. Since the mount path is a directory, the copy needs to be recursive; based on the example above, the command would look like `cp -a $MOUNTPATH/. $CSIVolumePath/`.
Delete the pod you created in step 2, and then re-create it. You should now see that all the data is stored in the CSI Volume.
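As a sketch of that final step, assuming the pod is defined in a manifest file named `pod.yaml` (a placeholder filename):

```shell
# Delete and re-create the pod, then confirm the data now lives on the
# Persistent Volume. $PODNAME and $CSIVolumePath match the earlier example.
kubectl delete pod $PODNAME
kubectl apply -f pod.yaml
kubectl exec -it $PODNAME -- ls $CSIVolumePath
```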