Helm chart: add volumeMounts for nvidiaDriverRoot + custom env. for NFD pod#1678
plevart wants to merge 1 commit into NVIDIA:main
Conversation
Signed-off-by: Peter Levart <peter.levart@gmail.com>
Although it may be useful to set the env vars for the GFD pod, I would argue that using this to set the location of libnvidia-ml.so.1 is not the correct mechanism. We already support loading a SPECIFIC path to libnvidia-ml.so.1, and the initialisation of the library here should be updated. See, for example, how this is done in the DRA driver: https://github.com/kubernetes-sigs/dra-driver-nvidia-gpu/blob/ceb2f34a763def3531062efef01593f40861258e/cmd/gpu-kubelet-plugin/nvlib.go#L54-L69
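The pattern in the linked DRA-driver code can be sketched roughly as follows. This is a hypothetical illustration using NVIDIA's go-nvml bindings (`nvml.New` with the `WithLibraryPath` option); the function name `newNvmlLib` and the `DRIVER_LIB_PATH` env var are assumptions for the sketch, not names from this PR or the linked file:

```go
package main

import (
	"fmt"
	"os"

	"github.com/NVIDIA/go-nvml/pkg/nvml"
)

// newNvmlLib initialises NVML from an explicit library path when one is
// configured (e.g. derived from nvidiaDriverRoot), falling back to the
// default dynamic-loader search behaviour otherwise.
func newNvmlLib(driverLibraryPath string) (nvml.Interface, error) {
	opts := []nvml.LibraryOption{}
	if driverLibraryPath != "" {
		// Point NVML at a specific libnvidia-ml.so.1 instead of
		// relying on the container's default loader paths.
		opts = append(opts, nvml.WithLibraryPath(driverLibraryPath))
	}
	lib := nvml.New(opts...)
	if ret := lib.Init(); ret != nvml.SUCCESS {
		return nil, fmt.Errorf("failed to initialize NVML: %v", ret)
	}
	return lib, nil
}

func main() {
	lib, err := newNvmlLib(os.Getenv("DRIVER_LIB_PATH"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer lib.Shutdown()
}
```

With this approach the library path is resolved at initialisation time in code, rather than via pod-level environment variables.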
Note that in the case of the GPU Operator, the Device Plugin and GFD are considered "management" containers and are explicitly run using a specific runtime class to ensure that they have access to the required devices and libraries. In cases where native CDI is used, we have an NRI plugin that ensures that these pods have GPU access based on pod annotations.
This is a proposed fix for #1677
With this patch, I can specify something like the following in the Helm chart custom values.yaml:
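The actual snippet is not reproduced here; a hypothetical values.yaml fragment along the lines the PR title describes (extra volume mounts for the driver root plus custom env vars on the GFD pod) might look like the following. The key names and paths are assumptions for illustration and may not match the chart's real schema:

```yaml
gfd:
  # Hypothetical keys; the chart's actual value names may differ.
  env:
    - name: LD_PRELOAD
      value: /driver-root/usr/lib64/libnvidia-ml.so.1
  volumeMounts:
    - name: driver-root
      mountPath: /driver-root
      readOnly: true
```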
...and the gfd Pod is happy.
I also cleaned up and aligned the logic of this part of the template with the device plugin template.