fix: avoid panic on empty cmdline in config-manager findPidToSignal #1735

Open
SAY-5 wants to merge 1 commit into NVIDIA:main from SAY-5:fix/config-manager-oor-1726

SAY-5 commented May 4, 2026

Fixes #1726.

procfs.Proc.CmdLine() returns an empty slice for kernel threads and for processes that exit between AllProcs() and CmdLine(). The current code in findPidToSignal indexes cmdline[0] without a length check, so it panics with "index out of range [0] with length 0" and sends the config-manager into CrashLoopBackOff when dynamically switching to a config such as mps-10x.

This change skips processes with an empty cmdline and continues the scan, so the manager can either locate the target PID or fall through to the existing "no process found" error.

panic: runtime error: index out of range [0] with length 0
goroutine 1 [running]:
main.findPidToSignal(...)
        /build/cmd/config-manager/main.go:456 +0x206
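
For readers without the diff open, the guard described above can be sketched roughly as follows. This is a minimal, stdlib-only illustration and not the actual code in cmd/config-manager/main.go: the procInfo type, the errNoProcessFound variable, the exact-match rule on cmdline[0], and the process name "example-target" are all assumptions made for the example, and the real function reads its data through procfs rather than from a slice.

```go
package main

import (
	"errors"
	"fmt"
)

// procInfo is a hypothetical stand-in for one process entry. The real code
// obtains the PID and cmdline through the procfs package instead.
type procInfo struct {
	PID     int
	CmdLine []string // empty for kernel threads or processes that already exited
}

// errNoProcessFound is an assumed name for the existing "no process found" error.
var errNoProcessFound = errors.New("no process found")

// findPidToSignal returns the PID of the first process whose cmdline[0]
// equals target. Entries with an empty cmdline are skipped, so cmdline[0]
// is never indexed on a zero-length slice.
func findPidToSignal(procs []procInfo, target string) (int, error) {
	for _, p := range procs {
		if len(p.CmdLine) == 0 {
			continue // nothing to compare; keep scanning
		}
		// Assumption: exact match on argv[0]; the real matching rule may differ.
		if p.CmdLine[0] == target {
			return p.PID, nil
		}
	}
	return 0, fmt.Errorf("%w: %s", errNoProcessFound, target)
}

func main() {
	procs := []procInfo{
		{PID: 2, CmdLine: nil},         // e.g. a kernel thread
		{PID: 17, CmdLine: []string{}}, // e.g. a process that just exited
		{PID: 42, CmdLine: []string{"example-target"}},
	}
	pid, err := findPidToSignal(procs, "example-target")
	fmt.Println(pid, err) // prints "42 <nil>"
}
```

Without the length check, the first entry in this list would reproduce the panic shown in the trace above. With it, the scan reaches PID 42, and a list with no match falls through to the "no process found" error.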

procfs.Proc.CmdLine() returns an empty slice for kernel threads and
processes that exit between AllProcs() and CmdLine(). Indexing
cmdline[0] without a length check panics with 'index out of range [0]
with length 0', crashing the config-manager into CrashLoopBackOff.

Skip such processes and continue scanning so the manager can either
locate the target PID or return the existing 'no process found' error.

Fixes NVIDIA#1726

Signed-off-by: SAY-5 <saiasish.cnp@gmail.com>

copy-pr-bot Bot commented May 4, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Development

Successfully merging this pull request may close these issues.

[Bug]: Panic: index out of range [0] in config-manager when dynamically switching to MPS mode