Skip to content

test(e2e): skip TC-SBX-09 lifecycle drive when sandbox blocks tmux fork#4640

Merged
cv merged 2 commits into
mainfrom
fix/4606-tc-sbx-09-tmux-fork-eperm
Jun 2, 2026
Merged

test(e2e): skip TC-SBX-09 lifecycle drive when sandbox blocks tmux fork#4640
cv merged 2 commits into
mainfrom
fix/4606-tc-sbx-09-tmux-fork-eperm

Conversation

@laitingsheng

@laitingsheng laitingsheng commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Summary

TC-SBX-09: Tmux Session Flow has been failing on every scheduled nightly E2E run since #4606 merged. The first assertion (tmux binary present) still passes; the second assertion — drive a full detached new-sessionlist-sessionskill-session cycle inside the sandbox — consistently fails with create window failed: fork failed: Permission denied.

Root cause: OpenShell sandbox hardening (seccomp, no-new-privileges, nproc=512 ulimit) blocks tmux's fork-to-spawn child window when invoked under the e2e SSH session account. The binary-presence assertion already covers the surface of issue #4513; the lifecycle drive depends on sandbox runtime capabilities that are environment-dependent and out of scope for this case. Degrade that branch to skip with the observed fork failed output so the suite reports the limitation without failing the nightly.

Latest failing scheduled nightly: run 26790528855. Same failure also blocks PR review on #4388 via inherited advisor reruns run 26790735708 and run 26791599457.

Related Issue

Follow-up to #4606 (which added TC-SBX-09 alongside the sandbox-image tmux pin). The PR body of #4606 noted "A full image-build + live-sandbox E2E was not run in this environment" — the lifecycle drive added by that PR turned out to be incompatible with the live OpenShell sandbox seccomp + capability profile, so every scheduled E2E / Nightly run since the merge has reported sandbox-operations-e2e as failing on this single assertion. This PR keeps the binary-presence guard from #4606 intact (the actual surface of #4513) while making the lifecycle drive a soft skip when the sandbox refuses to fork, so the nightly pipeline can go green again without masking real regressions (a non-fork failed error still hits the fail branch).

Changes

  • test/e2e/test-sandbox-operations.sh: in test_sbx_09_tmux_session_flow, add a skip branch matching fork failed: (Permission denied|Resource temporarily unavailable) between the existing pass/fail branches; keeps best-effort kill-session cleanup before recording the skip.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Tinson Lai tinsonl@nvidia.com

Summary by CodeRabbit

  • Tests
    • Improved tmux sandbox operations test to better detect and handle fork failures with enhanced error recovery and clearer skip messages.

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
@coderabbitai

coderabbitai Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f821f8b0-6fa2-4e89-806b-4241816b7736

📥 Commits

Reviewing files that changed from the base of the PR and between fa73997 and 40de946.

📒 Files selected for processing (1)
  • test/e2e/test-sandbox-operations.sh
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/e2e/test-sandbox-operations.sh

📝 Walkthrough

Walkthrough

This PR updates the test_sbx_09_tmux_session_flow test to detect tmux fork-policy/resource-limit failures, perform best-effort tmux session cleanup, and mark the test as skipped while including truncated failure output.

Changes

Sandbox Hardening Fork Error Handling

Layer / File(s) Summary
Fork failure pattern matching and skip handling
test/e2e/test-sandbox-operations.sh
When tmux operations emit fork failed: (Permission denied|Resource temporarily unavailable|Operation not permitted), the test matches that pattern, runs a best-effort tmux kill-session cleanup for the session, and records the test as skip including the first three lines of flow_out rather than following the generic failure path.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related issues

Possibly related PRs

  • NVIDIA/NemoClaw#4606: Directly overlaps in modifying the tmux session flow test and sandbox handling logic.

Suggested labels

Sandbox, nightly-e2e

Suggested reviewers

  • ericksoa
  • cv
  • cjagwani

Poem

"I'm a rabbit in the CI wood, hopping light and quick,
When tmux won't fork I tidy up—my cleanup trick.
I nibble logs, trim three lines neat,
Then skip along on silent feet.
Hooray for tests that fail with grace! 🐇✨"

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: skipping TC-SBX-09 test when sandbox prevents tmux from forking, which is the core modification to test_sbx_09_tmux_session_flow.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/4606-tc-sbx-09-tmux-fork-eperm

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: None
Optional E2E: None

Workflow run

Full advisor summary

E2E Recommendation Advisor

Failed: Could not parse JSON from advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/e2e-advisor/e2e-advisor-raw-output.txt

@github-actions

github-actions Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

E2E Scenario Advisor Recommendation

Required scenario E2E: None
Optional scenario E2E: None

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Failed: Could not parse JSON from advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/e2e-advisor/e2e-scenario-advisor-raw-output.txt

@github-actions

github-actions Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

PR Review Advisor

Findings: 0 needs attention, 3 worth checking, 0 nice ideas
Since last review: 1 prior item resolved, 2 still apply, 0 new items found

Review findings

🛠️ Needs attention

  • None.

🔎 Worth checking

  • Source-of-truth review needed: TC-SBX-09 tmux lifecycle fork-denial skip: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: Lines 484-491 add a matched fork-denial skip with cleanup and a comment naming hardening sources, but do not reference a tracked follow-up, alternate lifecycle validation, or removal condition.
  • TC-SBX-09 fork-denial skip weakens sandbox lifecycle coverage (test/e2e/test-sandbox-operations.sh:484): The new branch turns matched tmux fork-denial lifecycle failures into a skip. That may be appropriate for hardened sandbox environments, but TC-SBX-09 previously enforced the bundled tmux-session lifecycle and now can report only binary presence plus a skipped lifecycle drive. This is a security-adjacent coverage reduction because sandbox session lifecycle behavior can regress without this e2e failing in fork-denial environments.
    • Recommendation: Document where the full tmux new/list/kill lifecycle is validated when this e2e skips, or add a lightweight regression/static guard that proves the intended lifecycle path remains required in environments that support forking.
    • Evidence: At lines 482-491, `TMUX_FLOW_OK` still passes, but `fork failed: (Permission denied|Resource temporarily unavailable|Operation not permitted)` now executes cleanup and `skip "TC-SBX-09" ...` instead of failing. Existing nearby unit coverage in `test/runner.test.ts` only checks that the tmux e2e exists, contains `command -v tmux`, and is wired into the suite.
  • Fork-denial workaround lacks source-of-truth removal criteria (test/e2e/test-sandbox-operations.sh:484): This is localized tolerant behavior for an invalid state: tmux is installed, but the sandbox e2e SSH account cannot fork the tmux child window. The comment now identifies likely sandbox hardening sources, but the code does not explain why the source cannot be fixed here, what source-owned regression test prevents the behavior from drifting, or when this skip should be removed.
    • Recommendation: Add a short breadcrumb in the comment or a tracked follow-up identifying the source boundary, why the source behavior is intentionally out of scope for this PR, the validation that covers tmux lifecycle elsewhere, and the condition for removing the skip.
    • Evidence: The added comment cites `seccomp + no-new-privileges + nproc cap` and says the lifecycle is environment-dependent, but no removal condition or alternate lifecycle validation is referenced before the new `skip` branch.

🌱 Nice ideas

  • None.
Since last review details

Current findings:

  • Source-of-truth review needed: TC-SBX-09 tmux lifecycle fork-denial skip: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: Lines 484-491 add a matched fork-denial skip with cleanup and a comment naming hardening sources, but do not reference a tracked follow-up, alternate lifecycle validation, or removal condition.
  • TC-SBX-09 fork-denial skip weakens sandbox lifecycle coverage (test/e2e/test-sandbox-operations.sh:484): The new branch turns matched tmux fork-denial lifecycle failures into a skip. That may be appropriate for hardened sandbox environments, but TC-SBX-09 previously enforced the bundled tmux-session lifecycle and now can report only binary presence plus a skipped lifecycle drive. This is a security-adjacent coverage reduction because sandbox session lifecycle behavior can regress without this e2e failing in fork-denial environments.
    • Recommendation: Document where the full tmux new/list/kill lifecycle is validated when this e2e skips, or add a lightweight regression/static guard that proves the intended lifecycle path remains required in environments that support forking.
    • Evidence: At lines 482-491, `TMUX_FLOW_OK` still passes, but `fork failed: (Permission denied|Resource temporarily unavailable|Operation not permitted)` now executes cleanup and `skip "TC-SBX-09" ...` instead of failing. Existing nearby unit coverage in `test/runner.test.ts` only checks that the tmux e2e exists, contains `command -v tmux`, and is wired into the suite.
  • Fork-denial workaround lacks source-of-truth removal criteria (test/e2e/test-sandbox-operations.sh:484): This is localized tolerant behavior for an invalid state: tmux is installed, but the sandbox e2e SSH account cannot fork the tmux child window. The comment now identifies likely sandbox hardening sources, but the code does not explain why the source cannot be fixed here, what source-owned regression test prevents the behavior from drifting, or when this skip should be removed.
    • Recommendation: Add a short breadcrumb in the comment or a tracked follow-up identifying the source boundary, why the source behavior is intentionally out of scope for this PR, the validation that covers tmux lifecycle elsewhere, and the condition for removing the skip.
    • Evidence: The added comment cites `seccomp + no-new-privileges + nproc cap` and says the lifecycle is environment-dependent, but no removal condition or alternate lifecycle validation is referenced before the new `skip` branch.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

…mment

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
@laitingsheng laitingsheng added the v0.0.57 Release target label Jun 2, 2026
@github-actions

github-actions Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26793795334
Target ref: fix/4606-tc-sbx-09-tmux-fork-eperm
Workflow ref: main
Requested jobs: sandbox-operations-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job Result
sandbox-operations-e2e ✅ success

@cv cv merged commit ad1c83f into main Jun 2, 2026
32 of 33 checks passed
@cv cv deleted the fix/4606-tc-sbx-09-tmux-fork-eperm branch June 2, 2026 02:59
@wscurran wscurran added area: e2e End-to-end tests, nightly failures, or validation infrastructure bug-fix PR fixes a bug or regression and removed priority: high labels Jun 3, 2026
yimoj added a commit to yimoj/NemoClaw that referenced this pull request Jun 9, 2026
…te a PTY (NVIDIA#4513)

PR NVIDIA#4606 installed tmux in the sandbox image, but the bundled tmux-session
flow still failed with `create window failed: fork failed: Permission denied`
(QA reopened NVIDIA#4513; NVIDIA#4640 then degraded the e2e drive to a soft skip).

Root cause is not fork/seccomp/nproc — plain fork() works and nproc is nearly
empty. It is PTY allocation: the OpenShell sandbox landlock filesystem policy
grants /dev/null and /dev/urandom but never /dev/pts, so forkpty() opening
/dev/ptmx (-> /dev/pts/ptmx) and a /dev/pts/<n> slave is denied with EACCES.
tmux surfaces that EACCES as "fork failed: Permission denied".

Grant /dev/pts (read_write) in the OpenClaw base + permissive policies. Grant
the directory, not /dev/ptmx — the supervisor refuses to chown the symlink, and
the devpts directory grant covers ptmx + slaves via the landlock path hierarchy.
This follows the same base+permissive pattern as the prior /home/linuxbrew
(NVIDIA#3913) and GPU /proc additions, which permissive-runtime already unions on
shields-down.

Verified hermetically with the real openshell-sandbox supervisor + the actual
edited policy: before, openpty()=EACCES and tmux new/list/kill fails; after,
openpty() returns /dev/pts/0 and the full tmux lifecycle succeeds.

Changes:
- nemoclaw-blueprint/policies/openclaw-sandbox.yaml: add /dev/pts to
  filesystem_policy.read_write with the PTY rationale.
- nemoclaw-blueprint/policies/openclaw-sandbox-permissive.yaml and
  agents/openclaw/policy-permissive.yaml: mirror /dev/pts so `shields down`
  never removes a live filesystem path (OpenShell rejects removals).
- test/e2e/test-sandbox-operations.sh: restore TC-SBX-09 to a hard assertion
  (drop the NVIDIA#4640 fork-failure soft skip), add an explicit openpty() probe that
  pins the root cause, and fail loudly if devpts ever regresses.
- test/runner.test.ts: assert both policies grant /dev/pts (and not /dev/ptmx)
  and that TC-SBX-09 no longer skips on fork failure.
- docs/reference/network-policies.mdx, docs/security/best-practices.mdx: list
  /dev/pts as a default writable path with the tmux rationale.

Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
cv pushed a commit that referenced this pull request Jun 9, 2026
…te a PTY (#4513) (#5019)

## Summary

The reopened #4513: PR #4606 installed `tmux` in the sandbox image, but
OpenClaw's bundled tmux-session flow still failed with `create window
failed: fork failed: Permission denied`, and #4640 then degraded the e2e
drive to a soft skip. This grants `/dev/pts` in the OpenClaw sandbox
landlock filesystem policy so the flow can actually allocate a PTY, and
restores the e2e to a hard assertion.

## Related Issue

Fixes #4513

## Changes

**Root cause (diagnosed, not assumed):** it is not fork/seccomp/nproc —
plain `fork()` works and `nproc` (512) is nearly empty. It is **PTY
allocation**. The OpenShell sandbox landlock `filesystem_policy` grants
`/dev/null` and `/dev/urandom` but never `/dev/pts`, so `forkpty()`
opening `/dev/ptmx` (→ `/dev/pts/ptmx`) and a `/dev/pts/<n>` slave is
denied with `EACCES`. tmux surfaces that `EACCES` as `create window
failed: fork failed: Permission denied`.

- `nemoclaw-blueprint/policies/openclaw-sandbox.yaml`: add `/dev/pts` to
`filesystem_policy.read_write` with the PTY rationale. Grant the
**directory**, not `/dev/ptmx` — the supervisor refuses to `chown` the
`/dev/ptmx` symlink, and the devpts directory grant covers `ptmx` +
slaves via the landlock path hierarchy.
- `nemoclaw-blueprint/policies/openclaw-sandbox-permissive.yaml` and
`agents/openclaw/policy-permissive.yaml`: mirror `/dev/pts` so `shields
down` never removes a live filesystem path (OpenShell rejects live
filesystem removals). Same base+permissive pattern as the prior
`/home/linuxbrew` (#3913) and GPU `/proc` additions, which
`permissive-runtime` already unions on shields-down.
- `test/e2e/test-sandbox-operations.sh` (`TC-SBX-09`): restore the
lifecycle drive to a **hard assertion** (drop the #4640 fork-failure
soft skip), add an explicit `openpty()` probe that pins the root cause,
and fail loudly if devpts ever regresses. The `PTY_OK` sentinel is
emitted by a shell `&& echo` (not inside the python source) so a Python
traceback echoing the source line cannot make the probe falsely pass.
- `test/runner.test.ts`: assert both OpenClaw policies grant `/dev/pts`
(and not `/dev/ptmx`), and that `TC-SBX-09` no longer skips on fork
failure.
- `docs/reference/network-policies.mdx`,
`docs/security/best-practices.mdx`: list `/dev/pts` as a default
writable path with the tmux rationale.

## Verification

**Reporter-workflow E2E — exact tmux-session/forkpty flow inside the
hardened NemoClaw sandbox.**

Reproduced and verified hermetically with the **real `openshell-sandbox`
supervisor** (which applies the live seccomp + landlock + `RLIMIT_NPROC`
hardening; no gateway/GPU needed), running inside
`ghcr.io/nvidia/nemoclaw/sandbox-base` + `tmux`, driven by the **actual
edited `openclaw-sandbox.yaml`** as the policy data.

Command (same shape as `TC-SBX-09`):

```
openshell-sandbox --policy-rules sandbox-policy.rego --policy-data <openclaw-sandbox.yaml> -- \
  bash -lc 'export TMUX_TMPDIR=/tmp; python3 -c "import os; _,s=os.openpty(); print(os.ttyname(s))";
            tmux new-session -d -s smoke "sleep 30"; tmux list-sessions; tmux kill-session -t smoke; echo TMUX_FLOW_OK'
```

Logs:

```
# BEFORE (committed policy, no /dev/pts):
-- openpty --  PermissionError: [Errno 13] Permission denied
-- tmux --     create window failed: fork failed: Permission denied      (new/list/kill RC=1/1/1)

# AFTER (this PR's policy, /dev/pts granted):
-- openpty --  /dev/pts/0
-- tmux --     smoke: 1 windows (created ...)                            (new/list/kill RC=0/0/0)  TMUX_FLOW_OK
```

This is the exact bundled tmux-session lifecycle (`new-session` →
`list-sessions` → `kill-session`) the reporter hit. **Pipeline
coverage:** the restored `TC-SBX-09`
(`test/e2e/test-sandbox-operations.sh`) drives this same lifecycle plus
the `openpty()` root-cause probe against a live OpenShell sandbox on the
scheduled E2E job, now as a hard assertion (a `fork failed` is a
failure, not a skip).

PR CI on head `b75befddb`: all required checks green — `static-checks`,
`build-typecheck`, `cli-test-shards (1–5)`, `plugin-tests`,
`unit-vitest-linux`, `ShellCheck`, `CodeQL`, `wsl-e2e`, `macos-e2e`,
`dco-check`, `commit-lint`, `codebase-growth-guardrails` (run
`27184680225`).

- [x] `npx prek run --all-files` passes (all
formatter/lint/security/docs hooks green; the heavy `Test (CLI)` hook
shows only load-induced timeout flakes that pass in isolation — this
change touches no `src/` code)
- [x] `npm test` — targeted suites pass (`runner`, `policies`,
`initial-policy`, `permissive-runtime`, `policy-tiers`, `shields`); CI
`cli-test-shards (1–5)` + `plugin-tests` + `unit-vitest-linux` all green
on PR head
- [x] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [x] Docs updated for user-facing behavior changes

## Type of Change

- [x] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates

## Notes — known, pre-existing limitation (out of scope)

On an OpenClaw sandbox **created before this change and upgraded without
recreate**, a `shields down` then `shields up` cycle can leave the
sandbox in shields-down state: `shields down` adds `/dev/pts` via the
permissive policy, and `shields up` restores the pre-upgrade live
snapshot, which omits `/dev/pts` (OpenShell rejects the resulting live
filesystem removal). This is identical to the existing `/home/linuxbrew`
(#3913), GPU `/proc`, and Hermes `/opt/hermes` base-policy additions —
shields-up restore has never normalized the snapshot for any path. Such
a sandbox has not received the creation-locked filesystem fix anyway;
the remediation (recreate / re-onboard) resolves both the tmux fix and
this edge case. A symmetric snapshot-normalization in shields-restore
would address the whole class and belongs in its own change.

---
Signed-off-by: Yimo Jiang <yimoj@nvidia.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* PTY allocation now works inside sandboxes, allowing terminal tools
(tmux, script, interactive shells) and preventing "fork failed:
Permission denied" failures.

* **Documentation**
* Sandbox filesystem docs updated to include PTY device access and
clarify configuration for terminal allocation tooling.

* **Tests**
* End-to-end and unit tests tightened to assert PTY support and to fail
when PTY allocation regresses.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: e2e End-to-end tests, nightly failures, or validation infrastructure bug-fix PR fixes a bug or regression v0.0.57 Release target

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants