Skip to content

test(e2e): migrate dashboard remote bind to Vitest#5133

Closed
cv wants to merge 18 commits into
mainfrom
codex/e2e-migrate-dashboard-remote-bind
Closed

test(e2e): migrate dashboard remote bind to Vitest#5133
cv wants to merge 18 commits into
mainfrom
codex/e2e-migrate-dashboard-remote-bind

Conversation

@cv

@cv cv commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator

Summary

Migrates the dashboard remote-bind guard from a standalone bash script into the live Vitest E2E project. Branch validation now invokes the Vitest live test remotely for the existing dashboard-remote-bind suite name, preserving the workflow contract without adding another runner.

Related Issue

Part of #5098.

Changes

  • Add test/e2e-scenario/live/dashboard-remote-bind.test.ts with opt-in coverage for CLI availability, dashboard forward restart, NEMOCLAW_DASHBOARD_BIND=0.0.0.0, forward-list capture, loopback rejection, and all-interface bind proof.
  • Update test/e2e/brev-e2e.test.ts so the dashboard-remote-bind branch-validation suite runs the live Vitest test remotely.
  • Delete test/e2e/test-dashboard-remote-bind.sh.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Carlos Villela cvillela@nvidia.com

@cv cv self-assigned this Jun 10, 2026
@copy-pr-bot

copy-pr-bot Bot commented Jun 10, 2026

Copy link
Copy Markdown

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c00e1afb-dfd6-4ef8-97ba-e02da0e4f44c

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex/e2e-migrate-dashboard-remote-bind

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: None
Optional E2E: dashboard-remote-bind

Dispatch hint: test_suite=dashboard-remote-bind

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/codex/e2e-simplify-migration-tracking
Head: HEAD
Confidence: high

Required E2E

  • None. No required E2E is recommended because this PR only changes E2E tests and the E2E branch-validation harness, with no production CLI, installer, sandbox lifecycle, credentials, policy, inference, deployment, or assistant-flow code changed.

Optional E2E

  • dashboard-remote-bind (Brev CPU instance; short remote-bind validation): Optional confidence check for the test migration itself: this is the existing E2E / Branch Validation workflow with test_suite=dashboard-remote-bind, now routed through the new Vitest live scenario. It is not required for runtime risk because only E2E test files changed.

New E2E recommendations

  • None.

Dispatch hint

  • Workflow: .github/workflows/e2e-branch-validation.yaml
  • jobs input: test_suite=dashboard-remote-bind

@github-actions

github-actions Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

E2E Scenario Advisor Recommendation

Required scenario E2E: e2e-scenarios-all
Optional scenario E2E: None

Dispatch required scenario E2E:

  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref>

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/codex/e2e-simplify-migration-tracking
Head: HEAD
Confidence: high

Required scenario E2E

  • e2e-scenarios-all: The PR adds a new live Vitest scenario test under test/e2e-scenario/live/. This is directly scenario-E2E-relevant, but it is not a trusted-main live-supported typed registry scenario ID that can be targeted through the scenarios input, so use the all-scenarios fan-out rather than inventing a targeted ID.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref>

Optional scenario E2E

  • None.

Relevant changed files

  • test/e2e-scenario/live/dashboard-remote-bind.test.ts

@github-actions

github-actions Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

PR Review Advisor

Findings: 0 needs attention, 4 worth checking, 0 nice ideas
Since last review: 0 prior items resolved, 2 still apply, 1 new item found

Review findings

🛠️ Needs attention

  • None.

🔎 Worth checking

  • Source-of-truth review needed: Tolerant parsing of `openshell forward list` output: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: `matchingForwardLine()` uses `line.includes(sandboxName) && line.includes(dashboardPort)` and returns the first match before applying bind assertions.
  • Quote or validate the remote Vitest test path before shell execution (test/e2e/brev-e2e.test.ts:439): `runRemoteVitest()` builds a remote shell command by joining string fragments and inserts `testPath` directly. The current caller passes a hardcoded repository path, so this is not an immediate exploit in this PR, but the helper is generic branch-validation infrastructure. If a future caller passes PR-, env-, or workflow-selected paths, shell metacharacters in the path could alter the command executed on the Brev VM.
    • Recommendation: Shell-quote `testPath` with the existing shell-escaping pattern, or validate it against a narrow allowlist/pattern of repository-relative Vitest files before adding it to the command. Add a regression test that rejects or safely quotes paths containing shell metacharacters.
    • Evidence: The command array includes `"npx vitest run --project e2e-scenarios-live"`, `testPath`, and reporter flags, then joins them with spaces. Env values are escaped, but `testPath` is not.
  • Scope forward cleanup to the target sandbox (test/e2e-scenario/live/dashboard-remote-bind.test.ts:73): The migrated test stops the dashboard forward with only the port even though it already knows `sandboxName`. Nearby source code documents the sandbox-scoped `forward stop <port> <sandbox>` form as the safer option because port-only stops can kill whichever sandbox owns that port, including through a TOCTOU window. This is test-only and inherited from the deleted shell script, but it is still sandbox lifecycle code in a high-risk area.
    • Recommendation: Call `sandbox.openshell(["forward", "stop", dashboardPort, sandboxName], ...)` so the cleanup is scoped to the sandbox under test. If OpenShell compatibility requires a fallback, make the fallback explicit and tested.
    • Evidence: The test calls `sandbox.openshell(["forward", "stop", dashboardPort], ...)`; `lib/onboard/forward-cleanup.ts` says sandbox-aware call sites should prefer `forward stop <port> <sandbox>` to avoid collateral cleanup.
  • Make forward-list matching unambiguous for the target sandbox (test/e2e-scenario/live/dashboard-remote-bind.test.ts:15): `matchingForwardLine()` uses substring checks for both sandbox name and port. That preserves the deleted shell script's broad `awk` behavior, but it can select an all-interface row for another sandbox whose name or port contains the target values before a localhost-only row for the actual sandbox. In that case the migrated guard could false-pass the behavior it is meant to catch.
    • Recommendation: Parse the `openshell forward list` row shape more tightly, or at least split rows into fields and require an exact sandbox and port match before applying the bind assertions. Add a support/unit test with multiple rows where another sandbox has the all-interface bind and the target sandbox remains localhost-only.
    • Evidence: `matchingForwardLine()` returns the first trimmed line where `line.includes(sandboxName) && line.includes(dashboardPort)`, then `bindsLoopback()` and `bindsAllInterfaces()` assert only on that selected line.

🌱 Nice ideas

  • None.
Consider writing more tests for
  • **Runtime validation** — runRemoteVitest quotes or rejects test paths containing shell metacharacters before building the remote shell command. The PR is test-only and mostly migrates an existing live guard, but it adds reusable remote command construction and keeps sandbox-forward cleanup/parsing behavior in a high-risk E2E lifecycle path.
  • **Runtime validation** — runRemoteVitest rejects invalid environment variable names and preserves single-quoted env values. The PR is test-only and mostly migrates an existing live guard, but it adds reusable remote command construction and keeps sandbox-forward cleanup/parsing behavior in a high-risk E2E lifecycle path.
  • **Runtime validation** — dashboard remote-bind cleanup calls openshell forward stop with the dashboard port and target sandbox name. The PR is test-only and mostly migrates an existing live guard, but it adds reusable remote command construction and keeps sandbox-forward cleanup/parsing behavior in a high-risk E2E lifecycle path.
  • **Runtime validation** — dashboard remote-bind forward-list matching does not accept an all-interface row for another sandbox when the target sandbox row is localhost-only. The PR is test-only and mostly migrates an existing live guard, but it adds reusable remote command construction and keeps sandbox-forward cleanup/parsing behavior in a high-risk E2E lifecycle path.
  • **Acceptance clause:** Add `test/e2e-scenario/live/dashboard-remote-bind.test.ts` with opt-in coverage for CLI availability, dashboard forward restart, `NEMOCLAW_DASHBOARD_BIND=0.0.0.0`, forward-list capture, loopback rejection, and all-interface bind proof. — add test evidence or identify existing coverage. The new test is gated by `NEMOCLAW_E2E_DASHBOARD_REMOTE_BIND=1`, probes both CLIs, restarts the forward, runs `nemoclaw connect` with `NEMOCLAW_DASHBOARD_BIND=0.0.0.0`, captures `forward-list.txt`, rejects loopback, and requires an all-interface bind. The restart remains port-only rather than sandbox-scoped, and row matching is substring-based.
  • **Acceptance clause:** Legacy contract: validates dashboard remote-bind behavior and the associated Brev branch-validation path. — add test evidence or identify existing coverage. The Vitest scenario preserves the key live assertions and the Brev suite invokes it remotely. The parser can still false-match ambiguous `forward list` output, so the validation would be stronger with exact row matching.
  • **Acceptance clause:** Intentionally retired behavior: the bash wrapper/direct shell lane is retired; no product behavior assertion is intentionally dropped. — add test evidence or identify existing coverage. The main assertions from the shell script are represented in Vitest. The direct shell lane is removed as intended, but cleanup and forward-list matching semantics remain broad rather than improved during migration.
  • **Acceptance clause:** Part of Epic: Migrate legacy bash E2E into the Vitest E2E system #5098. — add test evidence or identify existing coverage. The deterministic context did not include issue Epic: Migrate legacy bash E2E into the Vitest E2E system #5098 body or comments, so no literal linked-issue acceptance clauses could be extracted or verified.
Since last review details

Current findings:

  • Source-of-truth review needed: Tolerant parsing of `openshell forward list` output: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: `matchingForwardLine()` uses `line.includes(sandboxName) && line.includes(dashboardPort)` and returns the first match before applying bind assertions.
  • Quote or validate the remote Vitest test path before shell execution (test/e2e/brev-e2e.test.ts:439): `runRemoteVitest()` builds a remote shell command by joining string fragments and inserts `testPath` directly. The current caller passes a hardcoded repository path, so this is not an immediate exploit in this PR, but the helper is generic branch-validation infrastructure. If a future caller passes PR-, env-, or workflow-selected paths, shell metacharacters in the path could alter the command executed on the Brev VM.
    • Recommendation: Shell-quote `testPath` with the existing shell-escaping pattern, or validate it against a narrow allowlist/pattern of repository-relative Vitest files before adding it to the command. Add a regression test that rejects or safely quotes paths containing shell metacharacters.
    • Evidence: The command array includes `"npx vitest run --project e2e-scenarios-live"`, `testPath`, and reporter flags, then joins them with spaces. Env values are escaped, but `testPath` is not.
  • Scope forward cleanup to the target sandbox (test/e2e-scenario/live/dashboard-remote-bind.test.ts:73): The migrated test stops the dashboard forward with only the port even though it already knows `sandboxName`. Nearby source code documents the sandbox-scoped `forward stop <port> <sandbox>` form as the safer option because port-only stops can kill whichever sandbox owns that port, including through a TOCTOU window. This is test-only and inherited from the deleted shell script, but it is still sandbox lifecycle code in a high-risk area.
    • Recommendation: Call `sandbox.openshell(["forward", "stop", dashboardPort, sandboxName], ...)` so the cleanup is scoped to the sandbox under test. If OpenShell compatibility requires a fallback, make the fallback explicit and tested.
    • Evidence: The test calls `sandbox.openshell(["forward", "stop", dashboardPort], ...)`; `lib/onboard/forward-cleanup.ts` says sandbox-aware call sites should prefer `forward stop <port> <sandbox>` to avoid collateral cleanup.
  • Make forward-list matching unambiguous for the target sandbox (test/e2e-scenario/live/dashboard-remote-bind.test.ts:15): `matchingForwardLine()` uses substring checks for both sandbox name and port. That preserves the deleted shell script's broad `awk` behavior, but it can select an all-interface row for another sandbox whose name or port contains the target values before a localhost-only row for the actual sandbox. In that case the migrated guard could false-pass the behavior it is meant to catch.
    • Recommendation: Parse the `openshell forward list` row shape more tightly, or at least split rows into fields and require an exact sandbox and port match before applying the bind assertions. Add a support/unit test with multiple rows where another sandbox has the all-interface bind and the target sandbox remains localhost-only.
    • Evidence: `matchingForwardLine()` returns the first trimmed line where `line.includes(sandboxName) && line.includes(dashboardPort)`, then `bindsLoopback()` and `bindsAllInterfaces()` assert only on that selected line.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@wscurran wscurran added area: e2e End-to-end tests, nightly failures, or validation infrastructure refactor PR restructures code without intended behavior change labels Jun 10, 2026
@wscurran

Copy link
Copy Markdown
Contributor

Base automatically changed from codex/e2e-simplify-migration-tracking to main June 10, 2026 20:53
@jyaunches

Copy link
Copy Markdown
Contributor

Closing in favor of the smaller replacement PR #5186. I used this branch as the seed/reference, but rebuilt it from current main with only the direct dashboard remote-bind migration and without the broader #5126-era framework/tooling noise.

@jyaunches

Copy link
Copy Markdown
Contributor

Superseded by #5186.

@jyaunches jyaunches closed this Jun 11, 2026
cv pushed a commit that referenced this pull request Jun 11, 2026
## Summary
Migrate the active `dashboard-remote-bind` Brev suite path to a small
live Vitest test while keeping `test/e2e/test-dashboard-remote-bind.sh`
in place per maintainer request.

This uses Carlos's PR #5133 as the seed, but keeps only the direct
dashboard remote-bind migration and drops the stale #5126-era
framework/ledger/workflow noise.

## Related Issues
Refs #5098
Refs #5133

## Contract mapping
- Legacy assertion: required CLIs are available before the remote-bind
check runs.
- Replacement: `test/e2e-scenario/live/dashboard-remote-bind.test.ts`
runs `command -v nemoclaw && command -v openshell`.
  - Boundary preserved: real remote host shell/PATH.
- Legacy assertion: dashboard forward is restarted before checking bind
behavior.
- Replacement: the Vitest test runs `openshell forward stop
<dashboardPort>` then `nemoclaw <sandbox> connect`.
- Boundary preserved: real OpenShell forward + real NemoClaw connect
path.
- Legacy assertion: `NEMOCLAW_DASHBOARD_BIND=0.0.0.0` is honored.
- Replacement: `host.nemoclaw([...], { env: { NEMOCLAW_DASHBOARD_BIND:
"0.0.0.0" } })`.
  - Boundary preserved: real environment-driven CLI behavior.
- Legacy assertion: loopback-only binding is rejected/avoided and
all-interface bind is proven.
- Replacement: parse `openshell forward list`, fail on
`127.0.0.1`/`localhost`, require `0.0.0.0`/`*` for the dashboard port.
  - Boundary preserved: real OpenShell forward-list output.

## Simplicity check
- Test shape: simple live Vitest test with local helper functions.
- New shared helpers: none.
- New framework/registry/ledger: **none**.
- Workflow changes: no workflow YAML change. The existing
`dashboard-remote-bind` suite now invokes the live Vitest test from
`test/e2e/brev-e2e.test.ts`.
- Legacy shell status: `test/e2e/test-dashboard-remote-bind.sh` remains
in place; this PR no longer deletes any `test/e2e` shell script.

## Verification
- `NEMOCLAW_RUN_E2E_SCENARIOS=1 npx vitest run --project
e2e-scenarios-live test/e2e-scenario/live/dashboard-remote-bind.test.ts
--silent=false --reporter=default` (local opt-in live test is skipped
unless `NEMOCLAW_E2E_DASHBOARD_REMOTE_BIND=1` is set by branch
validation)
- `npx vitest run --project cli test/e2e-script-workflow.test.ts
--silent=false --reporter=default`
- `git diff --check origin/main...HEAD`
- `npm run build:cli && npm run typecheck:cli`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: e2e End-to-end tests, nightly failures, or validation infrastructure refactor PR restructures code without intended behavior change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants