
feat: exclude HAMI virtualized GPUs from NVIDIA device plugin via Node annotation watch#1674

Open
Zhangxiurui520 wants to merge 1 commit into NVIDIA:main from Zhangxiurui520:zhangxr_runwithhami

Conversation

@Zhangxiurui520

Background

In clusters where HAMi virtualizes selected GPUs, the Node annotation hami.io/node-nvidia-register contains the GPU UUIDs managed by HAMi (virtualized GPUs).
The NVIDIA device plugin should avoid reporting these GPUs to kubelet, to prevent resource overlap and scheduling conflicts.

What this PR changes

  1. Adds a Node annotation watcher in [main.go]:
  • Uses the Kubernetes Watch API (not polling) to watch only the current Node (metadata.name=<NODE_NAME>).

  • Watches the annotation hami.io/node-nvidia-register.

  • Parses GPU UUIDs (supports both the GPU-... and hami-core:GPU-... token formats).

  • Triggers a plugin update when the annotation changes.

  2. Extends the plugin interface for dynamic filtering:
  • Adds [HandleAllowedDeviceIDs([]string)] in [api.go].

  • The device plugin implementation now handles runtime device-set updates.

  3. Enables dynamic re-reporting to kubelet:
  • In [server.go], adds an update signal channel.

  • ListAndWatch now re-sends the device list on filter updates, so kubelet sees changes without restarting the plugin.

  4. Adds filtering capability in the resource manager:
  • In [rm.go], stores both the full discovered device set ([allDevices]) and the currently exposed device set ([devices]).

  • HandleAllowedDeviceIDs excludes GPUs present in the HAMi annotation UUID list.

  • An empty UUID list restores the full device set.

  • Adds RW mutex protection for concurrent read/update access.

  5. Keeps resource manager constructors aligned:

The NVML and Tegra resource managers initialize both [allDevices] and [devices], so runtime filtering is safe and reversible.
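The annotation parsing described in step 1 can be sketched as a small helper. This is illustrative only: the function name (parseHamiUUIDs) and the assumption that tokens are comma-separated are mine, not the PR's; the real HAMi annotation encodes additional per-device fields. The sketch only shows acceptance of both the GPU-... and hami-core:GPU-... token forms mentioned above.

```go
package main

import "strings"

// parseHamiUUIDs extracts GPU UUIDs from the hami.io/node-nvidia-register
// annotation value. Hypothetical sketch: assumes comma-separated tokens and
// accepts both "GPU-<uuid>" and "hami-core:GPU-<uuid>" forms.
func parseHamiUUIDs(value string) []string {
	var uuids []string
	for _, tok := range strings.Split(value, ",") {
		tok = strings.TrimSpace(tok)
		// Strip an optional "hami-core:" prefix before matching.
		tok = strings.TrimPrefix(tok, "hami-core:")
		if strings.HasPrefix(tok, "GPU-") {
			uuids = append(uuids, tok)
		}
	}
	return uuids
}
```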

Behavior

On startup: the watcher performs an initial Node GET to sync state.
On Node annotation update: the plugin recalculates the exposed GPUs and updates kubelet via ListAndWatch.
On annotation cleanup/removal: the plugin restores the full discovered GPU set.
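The exclude/restore behavior above (steps 4–5) can be sketched as a small filtering state with RW-mutex protection. The type and method names here (deviceFilter, Exposed) are illustrative, not the PR's exact identifiers; only HandleAllowedDeviceIDs and the allDevices/devices split come from the PR description.

```go
package main

import "sync"

// deviceFilter sketches the filtering state the PR adds to the resource
// manager in [rm.go]: the full discovered set plus the currently exposed set.
type deviceFilter struct {
	mu         sync.RWMutex
	allDevices map[string]bool // full discovered set, keyed by GPU UUID
	devices    map[string]bool // currently exposed set
}

// HandleAllowedDeviceIDs hides the UUIDs claimed by HAMi. An empty
// exclude list restores the full discovered device set.
func (f *deviceFilter) HandleAllowedDeviceIDs(excluded []string) {
	f.mu.Lock()
	defer f.mu.Unlock()
	hidden := make(map[string]bool, len(excluded))
	for _, id := range excluded {
		hidden[id] = true
	}
	f.devices = make(map[string]bool, len(f.allDevices))
	for id := range f.allDevices {
		if !hidden[id] {
			f.devices[id] = true
		}
	}
}

// Exposed returns a snapshot of the currently exposed UUIDs under a read lock.
func (f *deviceFilter) Exposed() []string {
	f.mu.RLock()
	defer f.mu.RUnlock()
	out := make([]string, 0, len(f.devices))
	for id := range f.devices {
		out = append(out, id)
	}
	return out
}
```

Because filtering is always recomputed from allDevices rather than by mutating devices in place, clearing the annotation trivially restores the full set.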

Why Watch instead of polling

Faster reaction to annotation changes.
Lower API overhead compared with periodic GET loops.
Cleaner event-driven behavior for production clusters.
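The event-driven update path (the signal channel from step 3) is commonly built as a buffered channel of capacity 1 with a non-blocking send, so that a burst of annotation events coalesces into one re-send instead of blocking the watcher goroutine. This is a sketch of that pattern under my assumptions, not the PR's exact code; notifyUpdate is a hypothetical name.

```go
package main

// notifyUpdate signals the ListAndWatch loop that the exposed device set
// changed. With a capacity-1 channel, a pending signal absorbs later ones,
// so the watcher goroutine never blocks on a slow consumer.
func notifyUpdate(updates chan struct{}) {
	select {
	case updates <- struct{}{}: // signal queued
	default: // a signal is already pending; coalesce
	}
}
```

On the receiving side, ListAndWatch would select on this channel alongside its stop channel and re-send the device list whenever a signal arrives.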

Compatibility / Notes

  • Backward compatible when HAMi annotation is absent: plugin behavior remains unchanged.

  • Requires running in-cluster (InClusterConfig) and RBAC permission to get/watch Nodes.

  • Recommended to inject NODE_NAME via the Downward API ([spec.nodeName]) for accurate node targeting.


…ect, supporting dynamic adjustment of GPU devices

Signed-off-by: zhangxr <944702164@qq.com>

copy-pr-bot Bot commented Mar 25, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

