fix(recovery): restore OpenClaw guard chain #5340

Closed
jyaunches wants to merge 1 commit into NVIDIA:main from jyaunches:fix/2701-guard-reemit-simple

@jyaunches jyaunches commented Jun 12, 2026

Contributor

Summary

Fixes #2701 with a narrow OpenClaw recovery change: when /tmp/nemoclaw-proxy-env.sh is missing after a pod/container recreate, the recovery script re-emits the guard preload files from the baked /usr/local/lib/nemoclaw/preloads/ directory and regenerates /tmp/nemoclaw-proxy-env.sh before launching the gateway.

This preserves the failing-test-first guard from #5049 without adding new framework, registry, workflow, or migration infrastructure.
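The recovery flow described in the summary can be sketched as a small TypeScript generator. This is a minimal sketch under stated assumptions, not the actual fragment inside buildOpenClawRecoveryScript(): the function name sketchGuardRestore, the guard file name passed in, the SKETCH_GATEWAY_PORT variable, and the use of --require to build the preload chain are all illustrative. Only the two paths are taken from the description above.

```typescript
// Hypothetical sketch; identifiers here are illustrative, not the PR's code.
const PRELOAD_DIR = "/usr/local/lib/nemoclaw/preloads"; // baked directory named in the PR
const PROXY_ENV = "/tmp/nemoclaw-proxy-env.sh"; // file that goes missing after a recreate

/** Wrap a value in single quotes so a POSIX shell treats it literally. */
function shQuote(value: string): string {
  return `'${value.replace(/'/g, `'\\''`)}'`;
}

/**
 * Build a shell fragment that, only when the proxy env file is missing,
 * copies guard preloads back into /tmp and rewrites the env file.
 */
function sketchGuardRestore(port: number, guardFiles: string[]): string {
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`invalid port: ${port}`);
  }
  const lines: string[] = [`if [ ! -f ${shQuote(PROXY_ENV)} ]; then`];
  for (const name of guardFiles) {
    // Fail fast if a baked source is missing or cannot be copied.
    lines.push(
      `  cp ${shQuote(`${PRELOAD_DIR}/${name}`)} ${shQuote(`/tmp/${name}`)} || exit 1`,
    );
  }
  // Assumption: guards are loaded through Node's --require option.
  const nodeOptions = guardFiles.map((n) => `--require /tmp/${n}`).join(" ");
  lines.push(`  {`);
  lines.push(`    echo ${shQuote(`export SKETCH_GATEWAY_PORT=${port}`)}`);
  lines.push(`    echo ${shQuote(`export NODE_OPTIONS="${nodeOptions}"`)}`);
  lines.push(`  } > ${shQuote(PROXY_ENV)} || exit 1`);
  lines.push(`fi`);
  return lines.join("\n");
}
```

Generating the fragment as a plain string keeps it easy to check with sh -n and to assert on in a unit test, which matches the verification steps listed in this PR.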

Related Issues

Fixes #2701
Refs #2478
Refs #5049
Refs #5098

Simplicity / #5098 alignment

  • Test shape: focused existing Vitest unit coverage in src/lib/agent/runtime.test.ts.
  • New shared helpers/framework/registry/ledger: none.
  • Workflow changes: none.
  • Legacy bash changes: none.
  • Production change: one localized shell fragment inside buildOpenClawRecoveryScript().

Verification

  • npm run build:cli
  • sh -n against generated buildOpenClawRecoveryScript(18789) output
  • npx vitest run src/lib/agent/runtime.test.ts

Note: the full pre-push Test (CLI) run is still noisy locally because of pre-existing environment issues and timeouts (json5 fixture path, local OpenShell/gateway state, shell supervisor signal assertions); the targeted validation above passed.

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced OpenClaw gateway recovery with stricter validation and improved error handling for missing proxy environment files.
  • Tests

    • Added regression tests for gateway recovery scenarios to ensure proper guard chain restoration and execution order.

coderabbitai Bot commented Jun 12, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 8dfed37f-9bec-43ec-92ef-3a02945ddb53

📥 Commits

Reviewing files that changed from the base of the PR and between 7104567 and 9c7b497.

📒 Files selected for processing (2)
  • src/lib/agent/runtime.test.ts
  • src/lib/agent/runtime.ts

Walkthrough

This PR fixes a crash loop on aarch64 systems after pod recreate by implementing automatic re-emission of OpenClaw guard preload chains in the recovery path. A new buildOpenClawGuardRestoreCommand() helper generates shell to restore guard files from bundled sources, integrated into buildOpenClawRecoveryScript() with stricter failure checks, and validated by updated and new tests.
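The stricter failure checks mentioned in the walkthrough might look roughly like the following. This is a sketch under assumptions, not the PR's code: the function name sketchPostRestoreChecks, the log messages, the example log path, and the grep pattern used to detect guard preloads are all hypothetical.

```typescript
// Hypothetical sketch of post-restore checks; messages and pattern are assumptions.
function sketchPostRestoreChecks(envFile: string, logFile: string): string {
  for (const path of [envFile, logFile]) {
    // Keep the sketch simple: refuse paths that would need shell escaping.
    if (!/^[A-Za-z0-9_.\/-]+$/.test(path)) {
      throw new Error(`unsupported path for sketch: ${path}`);
    }
  }
  return [
    // Check 1: the env file must exist after the restore step.
    `if [ ! -f ${envFile} ]; then`,
    `  echo "[sketch] ${envFile} missing after guard restore" >> ${logFile}`,
    `  exit 1`,
    `fi`,
    // Check 2: the env file must carry a preload chain in NODE_OPTIONS,
    // so the gateway is never relaunched unguarded.
    `if ! grep -q "NODE_OPTIONS=.*--require" ${envFile}; then`,
    `  echo "[sketch] ${envFile} has no guard preloads in NODE_OPTIONS" >> ${logFile}`,
    `  exit 1`,
    `fi`,
  ].join("\n");
}
```

Both branches append a line to the log before exiting, so a failed recovery leaves a trace instead of silently relaunching.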

Changes

Guard Chain Restoration for Crash Loop Fix

| Layer / File(s) | Summary |
| --- | --- |
| Guard restoration helper and preload restoration (src/lib/agent/runtime.ts) | New buildOpenClawGuardRestoreCommand(port) generates shell to restore /tmp/nemoclaw-*.js guard preloads from /usr/local/lib/nemoclaw/preloads/, validates sources, re-creates /tmp/nemoclaw-proxy-env.sh with gateway environment and NODE_OPTIONS preload chain, enforces permissions, and hard-fails with gateway-log entries on validation errors. |
| Recovery script integration and stricter failure handling (src/lib/agent/runtime.ts) | buildOpenClawRecoveryScript(port) calls the new guard-restore command and adds explicit post-restoration checks: missing proxy-env file becomes a hard error, and presence of the env file without required NODE_OPTIONS guards becomes a hard error (preventing unguarded gateway relaunch). |
| Test updates and regression test (src/lib/agent/runtime.test.ts) | Updated existing test to assert guard-chain restoration marker emission and log tail. New regression test verifies correct execution order (log selection → guard-chain restore → sourcing → nohup) when proxy-env.sh is missing. |
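The execution-order check described for the regression test can be illustrated with a small helper. This is a sketch, not the test in src/lib/agent/runtime.test.ts: the helper name firstOutOfOrder and the sample script and marker strings in the usage lines are hypothetical.

```typescript
/**
 * Scan `script` for each marker in turn, starting after the previous match.
 * Returns the first marker that is absent or appears too early, or null
 * when every marker occurs in the given order.
 */
function firstOutOfOrder(script: string, markers: string[]): string | null {
  let cursor = 0;
  for (const marker of markers) {
    const at = script.indexOf(marker, cursor);
    if (at === -1) return marker;
    cursor = at + marker.length;
  }
  return null;
}

// Hypothetical script text standing in for generated recovery output.
const sampleScript = [
  "LOG=/tmp/example-gateway.log",
  "# restore guard chain",
  ". /tmp/nemoclaw-proxy-env.sh",
  "nohup example-gateway &",
].join("\n");

// Order under test: log selection, guard-chain restore, sourcing, nohup.
const orderProblem = firstOutOfOrder(sampleScript, [
  "LOG=",
  "restore guard chain",
  ". /tmp/nemoclaw-proxy-env.sh",
  "nohup",
]);
```

Returning the offending marker rather than a boolean makes a failing assertion point directly at the step that moved.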

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

Possibly related PRs

  • NVIDIA/NemoClaw#5049: Adds E2E and recovery helper tests that validate guard-chain restoration and marker/log contracts modified in this PR.

Suggested labels

bug-fix, integration: openclaw, area: sandbox, platform: dgx-spark, v0.0.64

Suggested reviewers

  • cv

Poem

🐰 When /tmp guards are swept away by a pod's final breath,
Our recovery script now restores them before gateway's death.
No more crash loops on aarch64 shores—
We rebuild the chains and then relaunch floors! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title 'fix(recovery): restore OpenClaw guard chain' directly describes the main change: adding guard-chain restoration to the recovery script to fix issue #2701. |
| Linked Issues check | ✅ Passed | The PR implements the core recovery fix from #2701 by adding guard-restore logic and stricter failure handling to buildOpenClawRecoveryScript, plus comprehensive unit tests validating re-emission behavior and execution order. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to test and recovery-script generation; no modifications to unrelated files, CLI flags, nemoclaw-start.sh, or permission models. |

@jyaunches jyaunches closed this Jun 12, 2026