Helm chart: add volumeMounts for nvidiaDriverRoot + custom env. for NFD pod#1678
plevart wants to merge 1 commit into NVIDIA:main
Conversation
Signed-off-by: Peter Levart <peter.levart@gmail.com>
Although it may be useful to set the env vars for the GFD pod, I would argue that using this to set the location of libnvidia-ml.so.1 is not the correct mechanism. We already support loading a SPECIFIC path to libnvidia-ml.so.1, and the initialisation of the library here should be updated. See, for example, how this is done in the DRA driver: https://github.com/kubernetes-sigs/dra-driver-nvidia-gpu/blob/ceb2f34a763def3531062efef01593f40861258e/cmd/gpu-kubelet-plugin/nvlib.go#L54-L69
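The pattern in the linked DRA-driver code can be sketched roughly as follows. This is a hypothetical illustration using NVIDIA's go-nvml bindings (`nvml.New` with the `WithLibraryPath` option); the function name `newNvmlLib` and the `DRIVER_LIB_PATH` env var are assumptions for the sketch, not names from this PR or the linked file:

```go
package main

import (
	"fmt"
	"os"

	"github.com/NVIDIA/go-nvml/pkg/nvml"
)

// newNvmlLib initialises NVML from an explicit library path when one is
// configured (e.g. derived from nvidiaDriverRoot), falling back to the
// default dynamic-loader search behaviour otherwise.
func newNvmlLib(driverLibraryPath string) (nvml.Interface, error) {
	opts := []nvml.LibraryOption{}
	if driverLibraryPath != "" {
		// Point NVML at a specific libnvidia-ml.so.1 instead of
		// relying on the container's default loader paths.
		opts = append(opts, nvml.WithLibraryPath(driverLibraryPath))
	}
	lib := nvml.New(opts...)
	if ret := lib.Init(); ret != nvml.SUCCESS {
		return nil, fmt.Errorf("failed to initialize NVML: %v", ret)
	}
	return lib, nil
}

func main() {
	lib, err := newNvmlLib(os.Getenv("DRIVER_LIB_PATH"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer lib.Shutdown()
}
```

With this approach the library path is resolved at initialisation time in code, rather than via pod-level environment variables.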
Note that in the case of the GPU Operator, the Device Plugin and GFD are considered "management" containers and are explicitly run using a specific runtime class to ensure that they have access to the required devices and libraries. In cases where native CDI is used, we have an NRI plugin that ensures that these pods have GPU access based on pod annotations.
This is a proposed fix for #1677
With this patch, I can specify something like the following in the Helm chart custom values.yaml:
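The actual snippet is not reproduced here; a hypothetical values.yaml fragment along the lines the PR title describes (extra volume mounts for the driver root plus custom env vars on the GFD pod) might look like the following. The key names and paths are assumptions for illustration and may not match the chart's real schema:

```yaml
gfd:
  # Hypothetical keys; the chart's actual value names may differ.
  env:
    - name: LD_PRELOAD
      value: /driver-root/usr/lib64/libnvidia-ml.so.1
  volumeMounts:
    - name: driver-root
      mountPath: /driver-root
      readOnly: true
```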
...and the gfd Pod is happy.
I also cleaned up and aligned the logic of this part of the template with the device plugin template.