fix(sandbox): stop openclaw doctor tightening mutable config perms (#4538, #4859)#5208
fix(sandbox): stop openclaw doctor tightening mutable config perms (#4538, #4859)#5208yimoj wants to merge 2 commits into
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughA Node.js CLI is added that scans compiled OpenClaw doctor modules and rewrites two permission-check gates to a sandbox-aware conditional when ChangesSandbox-aware OpenClaw doctor permission patch
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
…VIDIA#4538, NVIDIA#4859) OpenClaw's `doctor` state-integrity check flags the NemoClaw mutable-config contract — /sandbox/.openclaw (2770 setgid + group-writable) and openclaw.json (660) — as "permissions are too open" and, with --fix, tightens them to single-user 700/600. That durably breaks the group-shared contract the gateway UID (a member of the sandbox group) needs to persist control-UI config writes, and prints scary "too open"/"Tightened permissions" output that misrepresents the intended contract. PR NVIDIA#5017's always-on openclaw() shell guard only restores 2770/660 *after* a raw in-sandbox `openclaw doctor --fix` exits — so the warning/tighten still runs, and any path that bypasses the connect-shell guard (stale image, raw `command openclaw`, the macOS reporter flow in NVIDIA#4859) leaves the tree durably locked at 700/600. Fix the root cause at the OpenClaw source via the established Dockerfile-patch mechanism: a new patch-openclaw-doctor-mutable-perms.mts narrows doctor's over-open test from group+other bits (`stat.mode & 63`) to OTHER (world) bits only (`stat.mode & 7`) when running inside an OpenShell sandbox. Group bits — the gateway's intended shared access — are then tolerated, so the 2770/660 contract is neither warned about nor tightened, while genuinely world-accessible modes (e.g. 2777/664) are still caught and repaired. Out-of-sandbox behavior is byte-identical to upstream. The gate keys on OPENSHELL_SANDBOX being set (not `=== "1"`): OpenShell injects the sandbox name there, verified on 0.0.44. Because the fix lives in the baked image, it applies regardless of host OS and whether the connect-shell guard is loaded, resolving both the All-Platforms NVIDIA#4538 and the macOS NVIDIA#4859 reports. The NVIDIA#5017 guard is kept as defense-in-depth for sandboxes built before this patch. E2E (real onboarded sandbox, patched image, OpenClaw 2026.5.27): - connect-shell `openclaw doctor --fix`: exit 0, 0 scary messages, 0 bashrc EACCES, perms stay 2770/660. - raw `command openclaw doctor --fix` (guard bypassed, NVIDIA#4859 path): exit 0, 0 scary messages, perms stay 2770/660. - unpatched baseline: 4 scary messages, perms regress to 700/600. The patch classifies the compiled doctor state-integrity module by content signature, requires exactly the two reviewed permission gates, fails closed on shape drift, and is idempotent and a no-op when absent. Signed-off-by: Yimo Jiang <yimoj@nvidia.com> Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
d7ded9d to
bb46f0f
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
Dockerfile (1)
525-526: Run the recommended E2E matrix against this image change before merge.Because this is Dockerfile patching of installed runtime bits, validate with the listed sandbox E2Es (
cloud-e2e,sandbox-survival-e2e,hermes-e2e,rebuild-openclaw-e2e,openclaw-tui-chat-correlation-e2e) on this branch image to confirm no runtime regressions.As per coding guidelines:
Dockerfilelayer/config behavior is only fully testable with a real container build, and those E2E jobs are the recommended selective set.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Dockerfile` around lines 525 - 526, This change modifies runtime bits via the Dockerfile RUN step that executes /usr/local/lib/nemoclaw/patch-openclaw-doctor-mutable-perms.mts against the installed openclaw runtime, so before merging you must build the branch image and run the recommended E2E matrix (cloud-e2e, sandbox-survival-e2e, hermes-e2e, rebuild-openclaw-e2e, openclaw-tui-chat-correlation-e2e) against that image to validate there are no runtime regressions; ensure you reference the updated RUN invocation in the Dockerfile when selecting the image and attach the E2E job results to the PR.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scripts/patch-openclaw-doctor-mutable-perms.mts`:
- Around line 154-161: The code currently exits with success when
locateDoctorModules(dirs) returns no modules, allowing an unpatched image to
ship; change this to fail closed by logging an error (use console.error or
processLogger.error) with context including the searched dirs and then call
process.exit with a non-zero code (e.g., process.exit(1)) instead of
process.exit(0); update the block that checks modules.length === 0 and any
related messaging to reflect the failure semantics so the patch flow stops on
drift.
---
Nitpick comments:
In `@Dockerfile`:
- Around line 525-526: This change modifies runtime bits via the Dockerfile RUN
step that executes
/usr/local/lib/nemoclaw/patch-openclaw-doctor-mutable-perms.mts against the
installed openclaw runtime, so before merging you must build the branch image
and run the recommended E2E matrix (cloud-e2e, sandbox-survival-e2e, hermes-e2e,
rebuild-openclaw-e2e, openclaw-tui-chat-correlation-e2e) against that image to
validate there are no runtime regressions; ensure you reference the updated RUN
invocation in the Dockerfile when selecting the image and attach the E2E job
results to the PR.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 8efbb1d5-99a2-4c20-87f9-60ea5c728bf4
📒 Files selected for processing (5)
Dockerfilescripts/patch-openclaw-doctor-mutable-perms.mtssrc/lib/sandbox/build-context.tstest/openclaw-doctor-mutable-perms-patch.test.tstest/sandbox-build-context.test.ts
Address CodeRabbit review on NVIDIA#5208: the doctor state-integrity module is core OpenClaw and always present where this patch runs, so an absent module means the OpenClaw dist shape/path drifted. Exiting 0 (no-op) there would silently ship an image whose `doctor --fix` resumes tightening the NemoClaw 2770/660 mutable contract — the exact NVIDIA#4538/NVIDIA#4859 regression. Fail closed (non-zero) instead, matching the fetch-guard patches' "fail on unknown OpenClaw dist shape" stance, and update the unit test to assert the fail-closed exit. Signed-off-by: Yimo Jiang <yimoj@nvidia.com> Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Thanks @coderabbitai — addressed in b1c0dfa:
|
|
Tip For best results, initiate chat on the files or code changes.
|
|
@coderabbitai review |
✅ Action performedReview finished.
|
Summary
openclaw doctor --fix, run inside a NemoClaw mutable sandbox, tightens thegroup-writable OpenClaw config contract (
/sandbox/.openclaw2770 /openclaw.json660) back to single-user 700/600, durably breaking the gateway UID's ability to
persist control-UI config writes — and prints scary "permissions are too open" /
"Tightened permissions" output. This patches OpenClaw's
doctorat the source soit tolerates the intended group-shared perms inside an OpenShell sandbox while
still catching genuinely world-accessible modes.
Related Issue
Fixes #4538
Fixes #4859
Changes
scripts/patch-openclaw-doctor-mutable-perms.mts— build-time patch thatnarrows OpenClaw doctor's over-open test from group+other bits (
stat.mode & 63)to OTHER (world) bits only (
stat.mode & 7) when running inside an OpenShellsandbox (
OPENSHELL_SANDBOXset). Group bits — the gateway's intended sharedaccess — are tolerated, so the 2770/660 contract is neither warned about nor
tightened; world-accessible modes (e.g. 2777/664) are still flagged/repaired.
Out-of-sandbox behavior is byte-identical to upstream. Classifies the compiled
doctor state-integrity module by content signature, requires exactly the two
reviewed permission gates, fails closed on shape drift, and is idempotent.
Dockerfile— COPY + chmod +RUN node --experimental-strip-typesthe newpatch against
/usr/local/lib/node_modules/openclaw/dist, alongside the existingOpenClaw patches.
src/lib/sandbox/build-context.ts— stage the new patch script into theoptimized sandbox build context (so onboard's image build can COPY it).
test/openclaw-doctor-mutable-perms-patch.test.ts(behavioral:tolerates 2770/660 in-sandbox, still flags world bits, upstream behavior when
unset, idempotent, fails loud on shape drift) and a build-context staging
assertion.
Why a source patch (vs the #5017 guard alone)
PR #5017's always-on
openclaw()shell guard only restores 2770/660 after doctorexits — the warning/tighten still runs, and any path that bypasses the connect-shell
guard (stale image, raw
command openclaw, the macOS reporter flow in #4859) leavesthe tree durably locked at 700/600. Fixing doctor at the OpenClaw source makes the
contract hold regardless of host OS or whether the guard is loaded, resolving both
the All-Platforms #4538 and the macOS #4859 reports with one image-level fix. The
#5017 guard is kept as defense-in-depth for sandboxes built before this patch.
Note on the
OPENSHELL_SANDBOXgateThe gate keys on
OPENSHELL_SANDBOXbeing non-empty rather than=== "1": OpenShellinjects the sandbox name into that variable (e.g.
OPENSHELL_SANDBOX=my-sandbox),verified against OpenShell 0.0.44 — it is not the literal
"1". A non-empty check("inside any OpenShell sandbox") is what actually fires in the runtime where the
reporter runs
openclaw doctor.Type of Change
Verification
npm testpasses (643 files, 7929 tests)E2E (real onboarded sandbox, patched image, OpenClaw 2026.5.27)
openclaw doctor --fix(#4538)2770/660command openclaw doctor --fix, guard bypassed (#4859 path)2770/660700/600Built the sandbox image through
nemoclaw onboard(which stages the worktreeDockerfile), confirmed the patch is baked into the image's doctor module, then ran
the exact reporter workflow (
stat→openclaw doctor --fix→stat) through areal login shell and through a raw guard-bypassing shell.
Signed-off-by: Yimo Jiang yimoj@nvidia.com
Summary by CodeRabbit
New Features
Tests
Chores