Skip to content

fix(security): stop passing NVIDIA_API_KEY into sandbox and command lines#675

Open
ericksoa wants to merge 2 commits intomainfrom
fix/issue-429-credential-environ-exposure
Open

fix(security): stop passing NVIDIA_API_KEY into sandbox and command lines#675
ericksoa wants to merge 2 commits intomainfrom
fix/issue-429-credential-environ-exposure

Conversation

@ericksoa
Copy link
Contributor

@ericksoa ericksoa commented Mar 23, 2026

Summary

Closes #429. Lands independently of #617.

The OpenShell gateway proxies inference and injects stored credentials server-side — the raw NVIDIA_API_KEY was never needed inside the sandbox but was passed via env args, setup.sh, walkthrough commands, and the setupSpark sudo call, exposing it in ps aux, /proc/pid/cmdline, docker inspect, and k3s audit logs.

  • Remove NVIDIA_API_KEY from openshell sandbox create env args (onboard.js, setup.sh)
  • Use env-name-only --credential NVIDIA_API_KEY form in setup.sh (same pattern as fix(security): stop leaking API keys in process args visible via ps aux #330)
  • Remove key from walkthrough.sh tmux/connect commands
  • Remove unnecessary key + ensureApiKey() from setupSpark (script never reads it)
  • Clear key from process.env after setupInference handoff
  • Add 6 regression tests for credential exposure

What this does NOT fix

Why it works

Verified in OpenShell source:

  • proxy.rs:1068 — gateway strips all Authorization / X-Api-Key headers from sandbox requests
  • backend.rs:101 — gateway re-authenticates upstream using stored provider key
  • grpc.rs:3446 — inference provider credentials are NOT injected into sandbox env
  • nemoclaw-start.sh:19write_auth_profile() gracefully no-ops if key is absent

Test plan

  • 225/225 tests pass (including 6 new regression tests)
  • E2E: nemoclaw onboard completes, inference works through gateway without key in sandbox

Summary by CodeRabbit

  • Tests

    • Added comprehensive test suite to verify proper credential handling across agent setup and sandbox configuration.
  • Chores

    • Refined agent initialization scripts and sandbox environment configuration.
    • Updated documentation in setup walkthrough to reflect streamlined agent startup process.
    • Improved code clarity with enhanced inline comments regarding gateway-side configuration.

…ines

The OpenShell gateway proxies inference requests and injects stored
credentials server-side (proxy.rs strips client auth headers,
backend.rs re-authenticates upstream). The raw key was never needed
inside the sandbox but was passed via env args, setup.sh, walkthrough
commands, and the setupSpark sudo call — exposing it in ps aux,
/proc/pid/cmdline, docker inspect, and k3s audit logs.

Changes:
- Remove NVIDIA_API_KEY from openshell sandbox create env args
- Use env-name-only credential form in setup.sh
- Remove key from walkthrough.sh tmux/connect commands
- Remove unnecessary key + ensureApiKey() from setupSpark
- Clear key from process.env after setupInference handoff
- Add 6 regression tests for credential exposure

Does NOT fix /proc/pid/environ (kernel snapshot is immutable after
exec — requires file-based credential loading in OpenShell).
Messaging tokens left in sandbox env pending #617 merge.

Closes #429.
@coderabbitai
Copy link

coderabbitai bot commented Mar 23, 2026

📝 Walkthrough

Walkthrough

These changes relocate NVIDIA API key handling from client-side environment injection to server-side gateway credential injection. The API key is removed from sandbox and child process environments across multiple scripts and binaries, while an explicit cleanup step deletes the key from the parent process. New tests validate that the key is not exposed in command invocations.

Changes

Cohort / File(s) Summary
Core credential handling
bin/lib/onboard.js, bin/nemoclaw.js, scripts/setup.sh, scripts/walkthrough.sh
Removed NVIDIA_API_KEY from sandbox environment construction, child process spawning, and setup script invocations. Added delete process.env.NVIDIA_API_KEY in onboard flow and inline documentation clarifying server-side credential injection via OpenShell gateway.
Security test suite
test/runner.test.js
Added comprehensive test suite validating absence of NVIDIA_API_KEY in credential-sensitive locations: sandbox env args, setup script commands, spark setup invocations, and walkthrough instructions. Prevents regression of credential exposure.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A rabbit's verse on keys kept safe:

No secrets whispered to the wind,
No logs that leave a trail of sin,
The gateway guards them, tried and true,
So credentials stay from plaintext view!
— Bun the Security Rabbit 🔐

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 22.22% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely describes the main security fix: stopping NVIDIA_API_KEY from being passed into sandbox and command lines.
Linked Issues check ✅ Passed The PR comprehensively addresses #429 by removing NVIDIA_API_KEY from sandbox env args, command lines, and process.env, implementing server-side credential handling and adding 6 regression tests.
Out of Scope Changes check ✅ Passed All changes are tightly scoped to removing NVIDIA_API_KEY credential exposure from sandbox creation, command lines, and process environment inheritance as required by #429.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/issue-429-credential-environ-exposure

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/walkthrough.sh`:
- Around line 46-49: Remove the hard failure that requires NVIDIA_API_KEY by
deleting or changing the conditional block that checks [ -n
"${NVIDIA_API_KEY:-}" ] and exits (the NVIDIA_API_KEY guard); instead, either
remove the check entirely or replace it with a non-fatal warning (echo a message
if empty but do not call exit 1) so the walkthrough can run without that secret
in the host environment; update any references to NVIDIA_API_KEY in the script
to handle an empty value safely where actually needed.

In `@test/runner.test.js`:
- Around line 263-274: The current test "walkthrough.sh does not embed
NVIDIA_API_KEY in tmux or sandbox commands" only scans tmux/openshell lines
(cmdLines) so other exposures can slip through; update the test to scan the full
script content (src) — or at least all non-comment/non-echo executable lines —
for the string "NVIDIA_API_KEY" and assert no matches, replacing the filtered
cmdLines logic with a broader filter over src.split("\n") (reusing variables
like src, fs, path) so any direct env usage anywhere in walkthrough.sh fails the
test.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 10e304b6-c6bb-4666-9a41-7b37e1097a57

📥 Commits

Reviewing files that changed from the base of the PR and between d37a09f and f166577.

📒 Files selected for processing (5)
  • bin/lib/onboard.js
  • bin/nemoclaw.js
  • scripts/setup.sh
  • scripts/walkthrough.sh
  • test/runner.test.js

Comment on lines +46 to +49
[ -n "${NVIDIA_API_KEY:-}" ] || {
echo "NVIDIA_API_KEY required"
exit 1
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Remove hard NVIDIA_API_KEY runtime requirement from walkthrough.

The walkthrough no longer injects this key into sandbox commands, so failing fast here unnecessarily keeps the secret in host process environments.

🔧 Proposed fix
-[ -n "${NVIDIA_API_KEY:-}" ] || {
-  echo "NVIDIA_API_KEY required"
-  exit 1
-}
+# NVIDIA_API_KEY is not required at walkthrough runtime.
+# Credentials are resolved server-side by the OpenShell gateway provider config.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
[ -n "${NVIDIA_API_KEY:-}" ] || {
echo "NVIDIA_API_KEY required"
exit 1
}
# NVIDIA_API_KEY is not required at walkthrough runtime.
# Credentials are resolved server-side by the OpenShell gateway provider config.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/walkthrough.sh` around lines 46 - 49, Remove the hard failure that
requires NVIDIA_API_KEY by deleting or changing the conditional block that
checks [ -n "${NVIDIA_API_KEY:-}" ] and exits (the NVIDIA_API_KEY guard);
instead, either remove the check entirely or replace it with a non-fatal warning
(echo a message if empty but do not call exit 1) so the walkthrough can run
without that secret in the host environment; update any references to
NVIDIA_API_KEY in the script to handle an empty value safely where actually
needed.

Comment on lines +263 to +274
it("walkthrough.sh does not embed NVIDIA_API_KEY in tmux or sandbox commands", () => {
const fs = require("fs");
const src = fs.readFileSync(path.join(__dirname, "..", "scripts", "walkthrough.sh"), "utf-8");
// Check only executable lines (tmux spawn, openshell connect) — not comments/docs
const cmdLines = src.split("\n").filter(
(l) => !l.trim().startsWith("#") && !l.trim().startsWith("echo") &&
(l.includes("tmux") || l.includes("openshell sandbox connect"))
);
for (const line of cmdLines) {
expect(line.includes("NVIDIA_API_KEY")).toBe(false);
}
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Credential regression guard is too narrow for walkthrough script.

This check only inspects tmux/connect command lines, so a direct env dependency/exposure path can slip through undetected.

🧪 Proposed test hardening
     it("walkthrough.sh does not embed NVIDIA_API_KEY in tmux or sandbox commands", () => {
       const fs = require("fs");
       const src = fs.readFileSync(path.join(__dirname, "..", "scripts", "walkthrough.sh"), "utf-8");
+      // Guard against runtime env preconditions that force secret presence.
+      expect(src.includes('[ -n "${NVIDIA_API_KEY:-}" ]')).toBe(false);
       // Check only executable lines (tmux spawn, openshell connect) — not comments/docs
       const cmdLines = src.split("\n").filter(
         (l) => !l.trim().startsWith("#") && !l.trim().startsWith("echo") &&
                (l.includes("tmux") || l.includes("openshell sandbox connect"))
       );
       for (const line of cmdLines) {
         expect(line.includes("NVIDIA_API_KEY")).toBe(false);
       }
     });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/runner.test.js` around lines 263 - 274, The current test "walkthrough.sh
does not embed NVIDIA_API_KEY in tmux or sandbox commands" only scans
tmux/openshell lines (cmdLines) so other exposures can slip through; update the
test to scan the full script content (src) — or at least all
non-comment/non-echo executable lines — for the string "NVIDIA_API_KEY" and
assert no matches, replacing the filtered cmdLines logic with a broader filter
over src.split("\n") (reusing variables like src, fs, path) so any direct env
usage anywhere in walkthrough.sh fails the test.

ericksoa added a commit that referenced this pull request Mar 23, 2026
Main switched to --credential "OPENAI_API_KEY" (env-lookup form from
#675). We set the proxy token in the environment so openshell reads it
via the env-name-only pattern while still using our random per-instance
token instead of the static "ollama" dummy value.
Copy link
Contributor

@cv cv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — important security fix. The gateway already injects credentials server-side, so the sandbox never needed the raw key. Good documentation of the /proc/pid/environ limitation and solid regression tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Security] NVIDIA API key passed via environment variable is logged by k3s/systemd journal in plaintext — key exposed in system logs

2 participants