Skip to content

fix(sandbox): stop openclaw doctor tightening mutable config perms (#4538, #4859)#5208

Closed
yimoj wants to merge 2 commits into
NVIDIA:mainfrom
yimoj:fix/4538-raw-doctor-mutable-perms
Closed

fix(sandbox): stop openclaw doctor tightening mutable config perms (#4538, #4859)#5208
yimoj wants to merge 2 commits into
NVIDIA:mainfrom
yimoj:fix/4538-raw-doctor-mutable-perms

Conversation

@yimoj

@yimoj yimoj commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Summary

openclaw doctor --fix, run inside a NemoClaw mutable sandbox, tightens the
group-writable OpenClaw config contract (/sandbox/.openclaw 2770 / openclaw.json
660) back to single-user 700/600, durably breaking the gateway UID's ability to
persist control-UI config writes — and prints scary "permissions are too open" /
"Tightened permissions" output. This patches OpenClaw's doctor at the source so
it tolerates the intended group-shared perms inside an OpenShell sandbox while
still catching genuinely world-accessible modes.

Related Issue

Fixes #4538
Fixes #4859

Changes

  • New scripts/patch-openclaw-doctor-mutable-perms.mts — build-time patch that
    narrows OpenClaw doctor's over-open test from group+other bits (stat.mode & 63)
    to OTHER (world) bits only (stat.mode & 7) when running inside an OpenShell
    sandbox (OPENSHELL_SANDBOX set). Group bits — the gateway's intended shared
    access — are tolerated, so the 2770/660 contract is neither warned about nor
    tightened; world-accessible modes (e.g. 2777/664) are still flagged/repaired.
    Out-of-sandbox behavior is byte-identical to upstream. Classifies the compiled
    doctor state-integrity module by content signature, requires exactly the two
    reviewed permission gates, fails closed on shape drift, and is idempotent.
  • Dockerfile — COPY + chmod + RUN node --experimental-strip-types the new
    patch against /usr/local/lib/node_modules/openclaw/dist, alongside the existing
    OpenClaw patches.
  • src/lib/sandbox/build-context.ts — stage the new patch script into the
    optimized sandbox build context (so onboard's image build can COPY it).
  • Teststest/openclaw-doctor-mutable-perms-patch.test.ts (behavioral:
    tolerates 2770/660 in-sandbox, still flags world bits, upstream behavior when
    unset, idempotent, fails loud on shape drift) and a build-context staging
    assertion.

Why a source patch (vs the #5017 guard alone)

PR #5017's always-on openclaw() shell guard only restores 2770/660 after doctor
exits — the warning/tighten still runs, and any path that bypasses the connect-shell
guard (stale image, raw command openclaw, the macOS reporter flow in #4859) leaves
the tree durably locked at 700/600. Fixing doctor at the OpenClaw source makes the
contract hold regardless of host OS or whether the guard is loaded, resolving both
the All-Platforms #4538 and the macOS #4859 reports with one image-level fix. The
#5017 guard is kept as defense-in-depth for sandboxes built before this patch.

Note on the OPENSHELL_SANDBOX gate

The gate keys on OPENSHELL_SANDBOX being non-empty rather than === "1": OpenShell
injects the sandbox name into that variable (e.g. OPENSHELL_SANDBOX=my-sandbox),
verified against OpenShell 0.0.44 — it is not the literal "1". A non-empty check
("inside any OpenShell sandbox") is what actually fires in the runtime where the
reporter runs openclaw doctor.

Type of Change

  • Code change (feature, bug fix, or refactor)

Verification

  • npm test passes (643 files, 7929 tests)
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed

E2E (real onboarded sandbox, patched image, OpenClaw 2026.5.27)

Scenario doctor exit scary messages bashrc EACCES final perms
connect-shell openclaw doctor --fix (#4538) 0 0 0 2770/660
raw command openclaw doctor --fix, guard bypassed (#4859 path) 0 0 2770/660
unpatched baseline 1 4 1 700/600

Built the sandbox image through nemoclaw onboard (which stages the worktree
Dockerfile), confirmed the patch is baked into the image's doctor module, then ran
the exact reporter workflow (statopenclaw doctor --fixstat) through a
real login shell and through a raw guard-bypassing shell.


Signed-off-by: Yimo Jiang yimoj@nvidia.com

Summary by CodeRabbit

  • New Features

    • Sandbox-aware permission handling: runtime relaxes group/shared checks when detected in a sandbox, preserving stricter checks otherwise.
  • Tests

    • End-to-end tests added to validate patch behavior, idempotency, sandbox vs non-sandbox outcomes, and failure on unexpected module shapes.
  • Chores

    • Build now stages and applies the permission compatibility patch automatically and fails fast if the expected distribution shape is not found.

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5326dcb7-b2bd-41db-869c-d05ddc0c9800

📥 Commits

Reviewing files that changed from the base of the PR and between bb46f0f and b1c0dfa.

📒 Files selected for processing (2)
  • scripts/patch-openclaw-doctor-mutable-perms.mts
  • test/openclaw-doctor-mutable-perms-patch.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/openclaw-doctor-mutable-perms-patch.test.ts

📝 Walkthrough

Walkthrough

A Node.js CLI is added that scans compiled OpenClaw doctor modules and rewrites two permission-check gates to a sandbox-aware conditional when OPENSHELL_SANDBOX is set. The script is staged into sandbox build contexts, executed in the Dockerfile build, and covered by end-to-end tests for rewrite correctness, idempotency, no-op, and failure modes.

Changes

Sandbox-aware OpenClaw doctor permission patch

Layer / File(s) Summary
Patch script core implementation
scripts/patch-openclaw-doctor-mutable-perms.mts
CLI script that scans OpenClaw's compiled doctor module(s), enforces exact preconditions and idempotency, and rewrites two (stat.mode & 63) !== 0 gates to (process.env.OPENSHELL_SANDBOX ? 7 : 63) with a patch marker and verification.
Docker build-time patch application
Dockerfile
Copies the patch script into /usr/local/lib/nemoclaw/, makes it executable, and runs it via node --experimental-strip-types against /usr/local/lib/node_modules/openclaw/dist during image build.
Build context fixture staging
src/lib/sandbox/build-context.ts, test/sandbox-build-context.test.ts
Stages the patch script into optimized sandbox build contexts and updates the test fixture to include it in generated build-context stubs.
End-to-end patch behavior tests
test/openclaw-doctor-mutable-perms-patch.test.ts
Vitest tests that synthesize doctor module fixtures to validate sandbox-aware rewrites, runtime predicate behavior under OPENSHELL_SANDBOX vs unset, idempotency across runs, no-op when no module exists, and failure when gate count/shape mismatches occur.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

area: sandbox, integration: openclaw, bug-fix, v0.0.63

Suggested reviewers

  • cv
  • ericksoa

Poem

🐰 A rabbit hops with a careful plea,
"Let gateways write where they need to be."
I stitch the doctor to mind the sand,
So group-writes keep their gentle hand,
Patches snug, and systems free.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: patching OpenClaw's doctor to stop tightening mutable config permissions in sandbox mode by narrowing permission checks.
Linked Issues check ✅ Passed All code changes directly address linked issues #4538 and #4859: the patch script narrows permission checks when OPENSHELL_SANDBOX is set, preserving group-writable 2770/660 contract; Docker and build-context updates apply it at image build time; tests verify sandbox-aware behavior and idempotency.
Out of Scope Changes check ✅ Passed All changes are in scope: patch script, Dockerfile, build-context staging, and comprehensive tests directly support the objective of preventing doctor from tightening mutable sandbox permissions.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

…VIDIA#4538, NVIDIA#4859)

OpenClaw's `doctor` state-integrity check flags the NemoClaw mutable-config
contract — /sandbox/.openclaw (2770 setgid + group-writable) and openclaw.json
(660) — as "permissions are too open" and, with --fix, tightens them to
single-user 700/600. That durably breaks the group-shared contract the gateway
UID (a member of the sandbox group) needs to persist control-UI config writes,
and prints scary "too open"/"Tightened permissions" output that misrepresents
the intended contract.

PR NVIDIA#5017's always-on openclaw() shell guard only restores 2770/660 *after* a
raw in-sandbox `openclaw doctor --fix` exits — so the warning/tighten still
runs, and any path that bypasses the connect-shell guard (stale image, raw
`command openclaw`, the macOS reporter flow in NVIDIA#4859) leaves the tree durably
locked at 700/600.

Fix the root cause at the OpenClaw source via the established Dockerfile-patch
mechanism: a new patch-openclaw-doctor-mutable-perms.mts narrows doctor's
over-open test from group+other bits (`stat.mode & 63`) to OTHER (world) bits
only (`stat.mode & 7`) when running inside an OpenShell sandbox. Group bits —
the gateway's intended shared access — are then tolerated, so the 2770/660
contract is neither warned about nor tightened, while genuinely world-accessible
modes (e.g. 2777/664) are still caught and repaired. Out-of-sandbox behavior is
byte-identical to upstream. The gate keys on OPENSHELL_SANDBOX being set (not
`=== "1"`): OpenShell injects the sandbox name there, verified on 0.0.44.

Because the fix lives in the baked image, it applies regardless of host OS and
whether the connect-shell guard is loaded, resolving both the All-Platforms
NVIDIA#4538 and the macOS NVIDIA#4859 reports. The NVIDIA#5017 guard is kept as defense-in-depth
for sandboxes built before this patch.

E2E (real onboarded sandbox, patched image, OpenClaw 2026.5.27):
- connect-shell `openclaw doctor --fix`: exit 0, 0 scary messages, 0 bashrc
  EACCES, perms stay 2770/660.
- raw `command openclaw doctor --fix` (guard bypassed, NVIDIA#4859 path): exit 0,
  0 scary messages, perms stay 2770/660.
- unpatched baseline: 4 scary messages, perms regress to 700/600.

The patch classifies the compiled doctor state-integrity module by content
signature, requires exactly the two reviewed permission gates, fails closed on
shape drift, and is idempotent and a no-op when absent.

Signed-off-by: Yimo Jiang <yimoj@nvidia.com>

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@yimoj yimoj force-pushed the fix/4538-raw-doctor-mutable-perms branch from d7ded9d to bb46f0f Compare June 11, 2026 07:46

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
Dockerfile (1)

525-526: Run the recommended E2E matrix against this image change before merge.

Because this is Dockerfile patching of installed runtime bits, validate with the listed sandbox E2Es (cloud-e2e, sandbox-survival-e2e, hermes-e2e, rebuild-openclaw-e2e, openclaw-tui-chat-correlation-e2e) on this branch image to confirm no runtime regressions.

As per coding guidelines: Dockerfile layer/config behavior is only fully testable with a real container build, and those E2E jobs are the recommended selective set.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile` around lines 525 - 526, This change modifies runtime bits via the
Dockerfile RUN step that executes
/usr/local/lib/nemoclaw/patch-openclaw-doctor-mutable-perms.mts against the
installed openclaw runtime, so before merging you must build the branch image
and run the recommended E2E matrix (cloud-e2e, sandbox-survival-e2e, hermes-e2e,
rebuild-openclaw-e2e, openclaw-tui-chat-correlation-e2e) against that image to
validate there are no runtime regressions; ensure you reference the updated RUN
invocation in the Dockerfile when selecting the image and attach the E2E job
results to the PR.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/patch-openclaw-doctor-mutable-perms.mts`:
- Around line 154-161: The code currently exits with success when
locateDoctorModules(dirs) returns no modules, allowing an unpatched image to
ship; change this to fail closed by logging an error (use console.error or
processLogger.error) with context including the searched dirs and then call
process.exit with a non-zero code (e.g., process.exit(1)) instead of
process.exit(0); update the block that checks modules.length === 0 and any
related messaging to reflect the failure semantics so the patch flow stops on
drift.

---

Nitpick comments:
In `@Dockerfile`:
- Around line 525-526: This change modifies runtime bits via the Dockerfile RUN
step that executes
/usr/local/lib/nemoclaw/patch-openclaw-doctor-mutable-perms.mts against the
installed openclaw runtime, so before merging you must build the branch image
and run the recommended E2E matrix (cloud-e2e, sandbox-survival-e2e, hermes-e2e,
rebuild-openclaw-e2e, openclaw-tui-chat-correlation-e2e) against that image to
validate there are no runtime regressions; ensure you reference the updated RUN
invocation in the Dockerfile when selecting the image and attach the E2E job
results to the PR.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 8efbb1d5-99a2-4c20-87f9-60ea5c728bf4

📥 Commits

Reviewing files that changed from the base of the PR and between d629786 and d7ded9d.

📒 Files selected for processing (5)
  • Dockerfile
  • scripts/patch-openclaw-doctor-mutable-perms.mts
  • src/lib/sandbox/build-context.ts
  • test/openclaw-doctor-mutable-perms-patch.test.ts
  • test/sandbox-build-context.test.ts

Comment thread scripts/patch-openclaw-doctor-mutable-perms.mts
Address CodeRabbit review on NVIDIA#5208: the doctor state-integrity module is core
OpenClaw and always present where this patch runs, so an absent module means the
OpenClaw dist shape/path drifted. Exiting 0 (no-op) there would silently ship an
image whose `doctor --fix` resumes tightening the NemoClaw 2770/660 mutable
contract — the exact NVIDIA#4538/NVIDIA#4859 regression. Fail closed (non-zero) instead,
matching the fetch-guard patches' "fail on unknown OpenClaw dist shape" stance,
and update the unit test to assert the fail-closed exit.

Signed-off-by: Yimo Jiang <yimoj@nvidia.com>

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@yimoj

yimoj commented Jun 11, 2026

Copy link
Copy Markdown
Contributor Author

Thanks @coderabbitai — addressed in b1c0dfa:

  • Fail closed when no doctor module is discovered (actionable): Agreed. The doctor state-integrity module is core OpenClaw and always present where this patch runs (the OpenClaw runtime dist), so an absent module means dist drift. Changed the no-module branch from process.exit(0) to fail(...) (non-zero), matching the fetch-guard patches' "fail on unknown OpenClaw dist shape" stance, and updated the unit test to assert the fail-closed exit.
  • E2E matrix (nitpick): This image change was validated end-to-end against a real branch-built sandbox image (nemoclaw onboard → patched image confirmed baked in → exact reporter workflow stat → openclaw doctor --fix → stat, both via the connect-shell guard path and a raw guard-bypassed shell): doctor exits 0, no "too open"/"Tightened" output, no bashrc EACCES, and perms stay 2770/660; the unpatched baseline regresses to 700/600 with 4 warnings. CI's sandbox E2E jobs (test-e2e-sandbox, test-e2e-gateway-isolation, test-e2e-port-overrides, test-non-root-sandbox-smoke, test-e2e-ollama-proxy) are all green on this branch image.

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Tip

For best results, initiate chat on the files or code changes.

@yimoj Thanks for the detailed follow-up! Both points look well resolved:

  • Fail-closed on missing module – Changing the no-module branch to fail(...) is the right call; it's consistent with the fetch-guard patches and ensures dist drift is caught at image build time rather than silently succeeding. Good that the unit test was updated to assert the non-zero exit too.
  • E2E coverage – The stat→doctor --fix→stat matrix across both the guarded connect-shell path and the raw bypass path, with a confirmed unpatched regression baseline (700/600 + 4 warnings), is solid validation. All five CI E2E jobs being green gives good confidence this doesn't regress adjacent scenarios.

@yimoj

yimoj commented Jun 11, 2026

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@yimoj yimoj closed this Jun 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants