fix(openclaw): fail closed on config restore merge errors#5178
Conversation
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughThe PR adds OpenClaw ChangesOpenClaw config hash refresh validation and integration
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
PR Review AdvisorFindings: 0 needs attention, 1 worth checking, 0 nice ideas Review findings🛠️ Needs attention
🔎 Worth checking
🌱 Nice ideas
Consider writing more tests for
This is an automated advisory review. A human maintainer must make the final merge decision. |
E2E Advisor RecommendationRequired E2E: None Full advisor summaryE2E Recommendation AdvisorFailed: Could not parse JSON from advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/e2e-advisor/e2e-advisor-raw-output.txt |
Vitest E2E Scenario RecommendationRequired Vitest E2E scenarios: None Full Vitest E2E advisor summaryVitest E2E Scenario AdvisorFailed: Could not parse JSON from advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/e2e-advisor/e2e-scenario-advisor-raw-output.txt |
Selective E2E Results — ❌ Some jobs failedRun: 27307941381
|
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
Selective E2E Results — ❌ Some jobs failedRun: 27315161743
|
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/lib/actions/sandbox/rebuild.ts (1)
100-102: ⚡ Quick winConsider atomic file operations to prevent hash file corruption.
Lines 100-102 have a narrow time-of-check-time-of-use window where
openclaw.jsoncould be deleted orsha256sumcould fail after the file existence check. The> .config-hashredirection truncates the hash file beforesha256sumruns, so failure leaves.config-hashempty or corrupted. While the non-zero exit status is detected and the user is warned (line 1290), atomic operations would eliminate this risk entirely.🔄 Refactor to use atomic file operations
'[ -f "$config_file" ] || exit 0', 'cd "$config_dir" || exit 13', -"sha256sum openclaw.json > .config-hash", +"sha256sum openclaw.json > .config-hash.tmp && mv .config-hash.tmp .config-hash", "chmod 660 .config-hash 2>/dev/null || true",This ensures
.config-hashis never left in a partially-written state — either the hash is fully computed and atomically moved into place, or the original.config-hashremains untouched on failure.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/lib/actions/sandbox/rebuild.ts` around lines 100 - 102, Replace the direct redirection "sha256sum openclaw.json > .config-hash" with an atomic write: compute the hash into a temporary file (use mktemp or similar) while ensuring you are in the directory pointed to by config_dir and that config_file exists, then move the temp file into place with mv (atomic rename) only on successful sha256sum; also ensure the temp file is removed on failure (use a trap/cleanup). Target the block that checks '[ -f "$config_file" ] || exit 0', the cd "$config_dir" || exit 13 step, and the "sha256sum openclaw.json > .config-hash" command when making this change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/lib/actions/sandbox/rebuild.ts`:
- Around line 100-102: Replace the direct redirection "sha256sum openclaw.json >
.config-hash" with an atomic write: compute the hash into a temporary file (use
mktemp or similar) while ensuring you are in the directory pointed to by
config_dir and that config_file exists, then move the temp file into place with
mv (atomic rename) only on successful sha256sum; also ensure the temp file is
removed on failure (use a trap/cleanup). Target the block that checks '[ -f
"$config_file" ] || exit 0', the cd "$config_dir" || exit 13 step, and the
"sha256sum openclaw.json > .config-hash" command when making this change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 80e9a7a3-b9b9-4267-8523-41e415817c19
📒 Files selected for processing (4)
scripts/nemoclaw-start.shsrc/lib/actions/sandbox/rebuild-config-hash.test.tssrc/lib/actions/sandbox/rebuild.tstest/repro-4538-raw-doctor-perms.test.ts
Selective E2E Results — ✅ All requested jobs passedRun: 27389516622
|
…estore-unit-tests # Conflicts: # test/repro-4538-raw-doctor-perms.test.ts
Selective E2E Results — ✅ All requested jobs passedRun: 27390976297
|
Selective E2E Results — ✅ All requested jobs passedRun: 27399232492
|
|
✨ |
## Summary Refreshes release-prep documentation for NemoClaw v0.0.65. Adds the v0.0.65 release-notes section and refreshes generated `nemoclaw-user-*` skills from the Fern MDX source docs. ## Changes - Added the v0.0.65 release notes to `docs/about/release-notes.mdx` with links to the deeper docs pages for lifecycle, troubleshooting, inference, CLI commands, messaging, credentials, network policy, Hermes, and sub-agents. - Regenerated the `nemoclaw-user-*` skills with `scripts/docs-to-skills.py` so release-prep skill output matches the merged source docs. - Used the v0.0.65 announcement discussion as release context: #5472. ## Source Summary - #2492 -> `docs/about/release-notes.mdx`: Documents deadline-based gateway wait reliability in the v0.0.65 recovery summary. - #4958 -> `docs/about/release-notes.mdx`: Documents re-execed OpenClaw gateway health check recovery in the sandbox recovery summary. - #5163 -> `docs/about/release-notes.mdx`: Documents safer uninstall TTY confirmation behavior in the day-two CLI summary. - #5178 -> `docs/about/release-notes.mdx`: Documents fail-closed config restore merge behavior in the rebuild and restore summary. - #5179 -> `docs/about/release-notes.mdx`: Documents WeChat QR token redaction in the messaging summary. - #5182 -> `docs/about/release-notes.mdx`: Documents sustained gateway serving checks in the recovery summary. - #5194 -> `docs/about/release-notes.mdx`: Documents model-router teardown during uninstall in the day-two CLI summary. - #5195 -> `docs/about/release-notes.mdx`: Documents Shields auto-restore lock reconfirmation in the rebuild and restore summary. - #5198 -> `docs/about/release-notes.mdx`: Documents Docker Desktop WSL CDI injection failure handling in the onboarding diagnostics summary. - #5201 -> `docs/about/release-notes.mdx`: Documents sandbox download/upload wrappers and sessions export in the day-two CLI summary. - #5205 -> `docs/about/release-notes.mdx`: Documents reporter-owned model metadata preservation in the rebuild and restore summary. - #5214 -> `docs/about/release-notes.mdx`: Documents managed vLLM model preflight before side effects in the inference setup summary. - #5215 -> `docs/about/release-notes.mdx`: Documents managed vLLM extra serve arguments in the inference setup summary. - #5216 -> `docs/about/release-notes.mdx`: Documents silent OpenClaw runtime fallback surfacing in the onboarding diagnostics summary. - #5225 -> `docs/about/release-notes.mdx`: Documents persisted sandbox gateway lookup in the gateway recovery summary. - #5238 -> `docs/about/release-notes.mdx`: Documents sub-agent gateway dial-back through the sandbox interface in the Hermes and sub-agent summary. - #5248 -> `docs/about/release-notes.mdx`: Documents Discord per-account proxy resolution in the messaging summary. - #5264 -> `docs/about/release-notes.mdx`: Documents reserved Hermes port `8642` handling in the Hermes compatibility summary. - #5267 -> `docs/about/release-notes.mdx`: Documents the narrower Hermes baseline policy in the Hermes compatibility summary. - #5321 -> `docs/about/release-notes.mdx`: Documents restored gateway guard chains in the gateway recovery summary. - #5328 -> `docs/about/release-notes.mdx`: Documents compact persisted messaging plans in the messaging summary. - #5338 -> `docs/about/release-notes.mdx`: Documents manifest channel migration in the messaging summary. - #5352 -> `docs/about/release-notes.mdx`: Documents persisted agent preservation through registry recovery in the rebuild and restore summary. - #5371 -> `.agents/skills/nemoclaw-user-reference/references/commands.md`: Refreshes generated skill output for custom build cache and layer-ordering source docs. - #5379 -> `docs/about/release-notes.mdx`: Documents dashboard port allocation across multiple NemoClaw gateways in the recovery summary. - #5382 -> `docs/about/release-notes.mdx`: Documents recovery when an active gateway has no sandbox spec in the recovery summary. - #5389 -> `.agents/skills/nemoclaw-user-reference/references/troubleshooting.md`: Refreshes generated skill output for declared agent `forward_ports` recovery source docs. - #5400 -> `docs/about/release-notes.mdx`: Documents bounded compatible endpoint probes in the inference setup summary. - #5410 -> `docs/about/release-notes.mdx`: Documents provider credential hash removal from sandbox registry entries in the messaging summary. - #5418 -> `docs/about/release-notes.mdx`: Documents summarized inference validation failures in the onboarding diagnostics summary. - #5457 -> `docs/about/release-notes.mdx`: Documents context-window recomputation after runtime model switches in the inference setup summary. - #5463 -> `docs/about/release-notes.mdx`: Documents cleanup of hard-coded messaging channel stragglers in the messaging summary. ## Skipped - #5366 matched `docs/.docs-skip` entries through skipped experimental paths, so this PR does not add new release-note text for that commit. ## Type of Change - [ ] Code change (feature, bug fix, or refactor) - [ ] Code change with doc updates - [ ] Doc only (prose changes, no code sample modifications) - [x] Doc only (includes code sample changes) ## Verification - [x] Git hooks passed during commit and push, or `npx prek run --from-ref main --to-ref HEAD` passes - [ ] Targeted tests pass for changed behavior - [ ] Full `npm test` passes (broad runtime changes only) - [ ] Tests added or updated for new or changed behavior - [x] No secrets, API keys, or credentials committed - [x] Docs updated for user-facing behavior changes - [ ] `npm run docs` builds without warnings (doc changes only) - [x] Doc pages follow the [style guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md) (doc changes only) - [ ] New doc pages include SPDX header and frontmatter (new pages only) Verification notes: - `npm run docs` passed after rerunning outside the sandbox. Fern reported 0 errors and 1 hidden warning. - The first sandboxed `npm run docs` attempt failed before validation because `tsx` could not create its local IPC pipe under sandbox restrictions. - `npm run build:cli` passed before push to refresh the local `dist/` artifacts used by the CLI typecheck hook. - `npm test` was not run because this is a docs-only release refresh. --- Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * Released NemoClaw v0.0.65 with improved gateway/sandbox recovery, safer day-two workflows, and enhanced Hermes compatibility. * Added managed vLLM extra-arguments configuration via `NEMOCLAW_VLLM_EXTRA_ARGS_JSON`. * Added Hermes troubleshooting guidance for port forwarding and health checks. * **Documentation** * Updated NVIDIA Endpoints/NIM setup and examples to use `NVIDIA_INFERENCE_API_KEY`. * Refined NVIDIA network policy and Model Router API base configuration. * Expanded CLI/environment variable documentation (including sub-agent gateway connectivity) and plugin build performance tips. * **Tests** * Expanded Vitest-backed E2E release validation coverage. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
Summary
Follow-up to #5174 that pins the risky OpenClaw config restore fallback cases. The restore now fails closed when selective merging cannot read or parse the current rebuilt config, or cannot parse the backup, leaving the fresh runtime config intact instead of writing a stale sanitized backup wholesale.
Related Issue
Follow-up to #5174.
Changes
openclaw.jsonand invalid backed-upopenclaw.json.openclaw.jsonstate files instead of zero-byte successful reads.Type of Change
Verification
npx prek run --all-filespassesnpm testpassesnpm run docsbuilds without warnings (doc changes only)Signed-off-by: Carlos Villela cvillela@nvidia.com
Summary by CodeRabbit
New Features
Tests