If your machine has GPU devices, you need to install a GPU driver and a device plugin. The GPU can be used only when both are correctly installed.
In this section, you will install the driver and device plugin for your GPU device and verify that they are correctly detected.
Note: If you only have CPU devices, you do not need a GPU driver or device plugin and can skip this step.
The installation method for MicroK8s is different. If you installed MicroK8s as your container runtime, follow the MicroK8s instructions below.
Configure and Install MicroK8s GPU Driver and Device Plugin
[Appendix] Install and Configure MicroK8s GPU Driver and Device Plugin
Configure and Install NVIDIA GPU Driver and Device Plugin
[Appendix] Install and Configure NVIDIA GPU Driver and Device Plugin
Check GPU status and gpu-operator
Run the following command to check that the installation succeeded:
nvidia-smi
You should see output similar to the following:
Tue Mar 12 22:44:42 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.07             Driver Version: 535.161.07   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA RTX A5000               Off | 00000000:51:00.0 Off |                    0 |
| 30%   37C    P8              24W / 230W |      0MiB / 23028MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
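If you want to script this check instead of reading the table by eye, a minimal sketch follows. It assumes bash and uses the query options of nvidia-smi to print one compact line per GPU. The function name check_gpu_driver is hypothetical and only used for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: check_gpu_driver is a hypothetical helper name, not part of any tool.
check_gpu_driver() {
  # nvidia-smi ships with the NVIDIA driver, so its absence suggests a missing driver.
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi not found: the GPU driver may not be installed" >&2
    return 1
  fi
  # Print one CSV line per GPU: name, driver version, total memory.
  nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader
}

if check_gpu_driver; then
  echo "GPU driver detected"
else
  echo "GPU driver check failed" >&2
fi
```

A non-zero exit status from nvidia-smi (for example, when the driver cannot communicate with the device) is reported as a failed check as well.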
Verify the NVIDIA device plugin for MicroK8s
kubectl logs -n gpu-operator-resources -lapp=nvidia-operator-validator -c nvidia-operator-validator
If the pod logs show the following message, the validation completed successfully:
all validations are successful
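The same verification can be automated by searching the validator logs for the success message. The sketch below assumes bash, a kubectl that is configured for your cluster, and the namespace and labels shown in the command above. The function name validator_ok is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: validator_ok is a hypothetical helper name.
# It succeeds when the validator logs contain the success message quoted above.
validator_ok() {
  kubectl logs -n gpu-operator-resources \
    -l app=nvidia-operator-validator \
    -c nvidia-operator-validator 2>/dev/null \
    | grep -q "all validations are successful"
}

if validator_ok; then
  echo "GPU validation passed"
else
  echo "GPU validation not confirmed; inspect the validator pod logs" >&2
fi
```

If the validator pod is still starting, the message may not have been logged yet, so rerunning the check after a short wait is reasonable.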