Worker nodes with GPUs require the NVIDIA Docker runtime.
Note: If your environment is air-gapped (it cannot access the public internet), you must first upload the required files into the air-gapped environment.
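When files are moved into an air-gapped environment, it can help to record checksums before the transfer and verify them afterwards. The following is a minimal sketch and not part of the original procedure: the function names `write_manifest` and `verify_manifest` and the manifest name `SHA256SUMS` are hypothetical, and it assumes `sha256sum` from GNU coreutils is available on both sides.

```shell
# Sketch: checksum downloaded files before transfer, then verify them
# inside the air-gapped environment. All names here are illustrative.

# Write a SHA-256 manifest for every regular file in the given directory.
write_manifest() {
  local dir="$1"
  (
    cd "$dir" || exit 1
    # Exclude the manifest itself so it is not checksummed.
    find . -maxdepth 1 -type f ! -name SHA256SUMS -exec sha256sum {} + > SHA256SUMS
  )
}

# Verify the files in the given directory against the manifest.
# Returns non-zero if any file is missing or has changed.
verify_manifest() {
  local dir="$1"
  ( cd "$dir" && sha256sum --check --quiet SHA256SUMS )
}
```

For example, run `write_manifest ./airgap-files` on the connected machine and `verify_manifest ./airgap-files` after the upload, where `./airgap-files` is a placeholder directory holding the downloaded installer and chart.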
Before installing the NVIDIA driver, install the required build packages.
sudo apt-get update
sudo apt-get install gcc make libc6-dev --no-install-recommends -y
Use the following commands to download and then install the NVIDIA device driver.
curl -OL https://download.nvidia.com/XFree86/Linux-x86_64/510.54/NVIDIA-Linux-x86_64-510.54.run
sudo bash NVIDIA-Linux-x86_64-510.54.run --ui=none
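After the installer finishes, a quick check that the driver responds can save debugging later. This sketch assumes the installer placed `nvidia-smi` on the `PATH`; the function name `check_nvidia_driver` is hypothetical.

```shell
# Sketch: report whether the NVIDIA driver utilities are present and respond.
check_nvidia_driver() {
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi not found; the driver may not be installed" >&2
    return 1
  fi
  # Print the GPU name and driver version for each GPU.
  # This call fails if the kernel module is not loaded.
  nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
}
```

On a working node, the reported driver version should match the installed package, 510.54 in this case.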
Download the Helm chart for nvidia-device-plugin.
curl -OL https://nvidia.github.io/k8s-device-plugin/stable/nvidia-device-plugin-0.11.0.tgz
Use the following command to set up the Docker stable repository (x86_64/amd64).
echo \
"deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
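The repository entry points `signed-by` at a keyring file, so `apt-get update` will reject the repository if that file is missing. The steps here do not show how the keyring was created. The sketch below only checks for it; the function name `check_docker_keyring` is hypothetical, and the hint it prints is the key download command from Docker's Ubuntu install guide, included on the assumption that the key was meant to be fetched that way.

```shell
# Sketch: confirm the keyring referenced by signed-by exists and is non-empty.
check_docker_keyring() {
  local keyring="${1:-/usr/share/keyrings/docker-archive-keyring.gpg}"
  if [ -s "$keyring" ]; then
    echo "keyring present: $keyring"
  else
    echo "keyring missing: $keyring" >&2
    # Assumed remedy, taken from Docker's Ubuntu install guide.
    echo "hint: curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o $keyring" >&2
    return 1
  fi
}
```

In an air-gapped environment the key file would need to be uploaded with the other required files rather than downloaded.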
Install nvidia-docker2.
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-docker2
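The `distribution` variable selects which repository list is downloaded, so it is worth checking its value if the `curl` call returns an unexpected file. As an illustration, the same expression can be evaluated against any os-release style file; the helper name `distribution_id` is hypothetical.

```shell
# Sketch: compute the distribution string used in the repository URL.
distribution_id() {
  local os_release="${1:-/etc/os-release}"
  # Source the file in a subshell so its variables do not leak into the caller.
  ( . "$os_release" && echo "${ID}${VERSION_ID}" )
}
```

On Ubuntu 20.04, for example, `distribution_id` prints `ubuntu20.04`.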
Edit /etc/docker/daemon.json to set the NVIDIA runtime as the default:
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
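A typo in this file can prevent Docker from starting, so a quick check before restarting may be useful. The sketch below is a simple pattern match and not a full JSON validation; the function name `check_default_runtime` is hypothetical.

```shell
# Sketch: text check that daemon.json names the NVIDIA runtime as default
# and points at the expected runtime binary. Not a JSON parser.
check_default_runtime() {
  local file="${1:-/etc/docker/daemon.json}"
  grep -Eq '"default-runtime"[[:space:]]*:[[:space:]]*"nvidia"' "$file" &&
    grep -Eq '"path"[[:space:]]*:[[:space:]]*"/usr/bin/nvidia-container-runtime"' "$file"
}
```

Called with no argument, it inspects /etc/docker/daemon.json and returns a non-zero status if either setting is absent.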
Restart Docker so that the new configuration takes effect.
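One way to restart the daemon and confirm the change is sketched below. It assumes the node uses systemd to manage Docker, which the steps above do not state; the function names `parse_default_runtime` and `restart_and_check_docker` are hypothetical.

```shell
# Sketch: extract the default runtime from `docker info` output read on stdin.
parse_default_runtime() {
  sed -n 's/^[[:space:]]*Default Runtime:[[:space:]]*//p'
}

# Restart the daemon (assumes systemd) and print the runtime it now reports.
# Defined here only; call it explicitly when ready to restart.
restart_and_check_docker() {
  sudo systemctl restart docker &&
    docker info 2>/dev/null | parse_default_runtime
}
```

If the configuration was picked up, `restart_and_check_docker` prints `nvidia`.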