Skip to content

test(e2e): simplify legacy migration tracking#5126

Merged
jyaunches merged 18 commits into
mainfrom
codex/e2e-simplify-migration-tracking
Jun 10, 2026
Merged

test(e2e): simplify legacy migration tracking#5126
jyaunches merged 18 commits into
mainfrom
codex/e2e-simplify-migration-tracking

Conversation

@cv

@cv cv commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator

Summary

Simplifies legacy E2E migration tracking now that #5106 retired the typed-shell/YAML scenario runner. The PR removes stale repo-local JSON migration ledgers, documents GitHub issues and PRs as the source of truth for migration state, adopts the #5156-style deterministic legacy bash workflow guard, and clarifies that NemoClaw owns one Vitest E2E system with fixtures/support code rather than a second framework.

Related Issue

Refs #5098

Changes

  • Removes test/e2e-scenario/migration/legacy-inventory.json so it no longer acts as a detailed script-by-script migration roadmap.
  • Removes the orphaned generated assertion inventory at test/e2e/docs/parity-inventory.generated.json; its recorded generator no longer exists, and it was already stale against migrated/deleted legacy scripts.
  • Adds deterministic workflow contract tests that freeze the current top-level test/e2e/test-*.sh legacy script set, freeze scheduled nightly-e2e.yaml legacy script wiring, and assert every nightly-wired legacy script still exists.
  • Replaces inventory validation with migration policy/source-of-truth checks that keep repo-local durable taxonomy and generated assertion ledgers out of migration docs.
  • Removes the PR Review Advisor PR-body deletion-evidence checker/contract; the advisor still blocks reintroducing retired migration ledgers.
  • Updates MIGRATION.md, README.md, RETIREMENT.md, and release-note wording to describe one Vitest E2E system, workflow allowlist guardrails, and fixture/support code rather than a second E2E framework or runner.
  • Renames the fast Vitest support project from e2e-scenario-framework to e2e-vitest-support, moves tests from framework-tests/ to support-tests/, and renames the shared helper path to test/e2e-scenario/fixtures/.
  • Rewords the E2E scenario advisor prompt, summary, and sticky-comment text so it recommends Vitest-backed E2E scenario dispatches without teaching “scenario E2E” as a separate E2E class.
  • Fixes shared sandbox fixture helpers so SandboxClient.exec() uses openshell sandbox exec -n <name> and exposes execShell() / upload() for follow-on migrations.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx vitest run --project e2e-vitest-support --silent=false --reporter=default

  • npx vitest run --project cli test/pr-workflow-contract.test.ts test/e2e-scenario-advisor.test.ts --silent=false --reporter=default

  • npx vitest run --project cli test/e2e-script-workflow.test.ts test/pr-review-advisor.test.ts --reporter=default

  • npx vitest run --project e2e-vitest-support test/e2e-scenario/support-tests/e2e-migration-policy.test.ts --reporter=default

  • npm run test-size:check -- test/e2e-script-workflow.test.ts

  • npx markdownlint-cli2 docs/about/release-notes.mdx test/e2e-scenario/docs/MIGRATION.md test/e2e-scenario/docs/README.md test/e2e-scenario/docs/RETIREMENT.md "!node_modules/**" "!skills/**"

  • node -e "JSON.parse(require("fs").readFileSync("tools/e2e-advisor/scenarios-schema.json","utf8"))"

  • npm run typecheck:cli

  • git diff --check

  • Commit and push hooks passed with SSH_AUTH_SOCK=/tmp/ssh-Nu4xGJZo0F/agent.712827.

  • npx prek run --all-files passes

  • npm test passes

  • Tests added or updated for new or changed behavior

  • No secrets, API keys, or credentials committed

  • Docs updated for user-facing behavior changes

  • npm run docs builds without warnings (doc changes only)

  • Doc pages follow the style guide (doc changes only)

  • New doc pages include SPDX header and frontmatter (new pages only)


Signed-off-by: Carlos Villela cvillela@nvidia.com

Summary by CodeRabbit

  • Refactor

    • Consolidated E2E test infra under a Vitest-driven fixtures/support layer and renamed the E2E support project.
  • Documentation

    • Updated migration guidance to use GitHub issues/PRs as the source of truth and clarified fixture-owned responsibilities and migration checklist.
  • Tests

    • Added migration-policy/source-of-truth tests; removed legacy repo-local migration ledger; updated scenario and PR-review advisor outputs to target Vitest E2E and to block reintroduced ledgers.
  • New Features

    • Fixture sandbox helpers added (shell-run/upload) and command usage normalized for named sandboxes.

@cv cv self-assigned this Jun 10, 2026
@copy-pr-bot

copy-pr-bot Bot commented Jun 10, 2026

Copy link
Copy Markdown

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Consolidates E2E into a Vitest-owned model: rewires tests to fixture modules, renames the Vitest support project, removes repo-local migration ledger and its validation, adds support-tests enforcing migration-policy, updates scenario-advisor wording/schema to “Vitest E2E”, and injects deterministic PR-review blockers for retired-ledger reintroductions.

Changes

Vitest E2E consolidation and deterministic governance

Layer / File(s) Summary
Fixture boundary contracts and redaction/ENV refactor
test/e2e-scenario/fixtures/redaction.ts, test/e2e-scenario/fixtures/availability-env.ts, test/e2e-scenario/fixtures/*
Rename framework→fixture terminology: overlay/allowlist identifiers, BuildChildEnvOptions now uses fixtureOverlay, buildChildEnv filters use fixture allowlists/prefixes, and a fixtureEnvAllowlistSnapshot helper replaces the framework snapshot.
Sandbox client command contract and shell-probe updates
test/e2e-scenario/fixtures/clients/sandbox.ts, test/e2e-scenario/support-tests/e2e-*
SandboxClient.exec switches to openshell sandbox exec -n <name> -- ...; new execShell and upload helpers added; tests adjusted to assert -n arg insertion and shifted payload indices.
Test import rewiring to fixtures
test/e2e-scenario/live/*.test.ts, test/e2e-scenario/support-tests/*.test.ts, vitest.config.ts
Repoint many tests from ../framework/* to ../fixtures/*; update vitest project replacement e2e-scenario-frameworke2e-vitest-support and include globs.
Migration governance documentation
test/e2e-scenario/docs/MIGRATION.md, test/e2e-scenario/docs/README.md, test/e2e-scenario/docs/RETIREMENT.md
Docs rewritten to make GitHub issues/PRs the authoritative source for migration state, remove local legacy-inventory.json reliance, and document the Vitest fixture/support project naming.
Migration policy enforcement in support-tests
test/e2e-scenario/support-tests/e2e-migration-policy.test.ts, test/e2e-scenario/support-tests/e2e-migration-source-of-truth.test.ts
Add tests that assert forbidden repo-local migration ledger files do not exist, verify migration docs include required GitHub-sourced phrasing, and forbid durable taxonomy markers in repo docs.
Legacy shell script allowlists and workflow tests
test/e2e-script-workflow.test.ts
Add frozen allowlists for legacy top-level E2E scripts and nightly-workflow referenced scripts; helpers enumerate and assert directory and workflow-referenced scripts exactly match allowlists and exist.
Vitest E2E advisor wording, schema, and sticky comment
tools/e2e-advisor/scenarios.mts, tools/e2e-advisor/scenarios-schema.json, tools/e2e-advisor/scenario-comment.mts, test/e2e-scenario-advisor.test.ts, docs/about/release-notes.mdx
Update prompts, rendered summary headings, schema title, and sticky comment templates to consistently use “Vitest E2E” terminology; adjust advisor tests and release notes lines.
PR-review deterministic retired-ledger detection & enforcement
tools/pr-review-advisor/analyze.mts, test/pr-review-advisor.test.ts, test/pr-workflow-contract.test.ts
Add retiredE2eMigrationLedgerChanges to deterministic context, export findRetiredE2eMigrationLedgerChanges(diff), include evidence in model payload, and inject blocker findings when retired ledger files are added/modified; update tests and workflow contract expectation to e2e-vitest-support.
Release test git config
test/release-latest-tag.test.ts
Add commit.gpgSign=false to test environment builders alongside existing tag.gpgSign=false.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

Possibly related PRs

  • NVIDIA/NemoClaw#5052: Related removal/changes to the legacy migration inventory and its validation.
  • NVIDIA/NemoClaw#5109: Overlaps on edits to the legacy migration inventory entries and retirement planning.
  • NVIDIA/NemoClaw#5105: Related scenario-advisor wording/normalization changes toward Vitest E2E.

Suggested labels

area: architecture

🐰 I hopped from framework fields to fixtures bright and merry,
Vitest now tends the tests where bash once did carry,
Ledgers retired, allowlists fixed and tidy,
I nibble docs and prompts—migration paths made tidy,
One harness, one home — the rabbit cheers: hippity-hoppy!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 12.90% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'test(e2e): simplify legacy migration tracking' directly and specifically describes the main change: removal of complex repo-local migration ledgers and simplification of E2E migration tracking to rely on GitHub issues/PRs and deterministic policy tests instead.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex/e2e-simplify-migration-tracking

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: e2e-scenarios-all, openshell-version-pin-vitest
Optional E2E: None

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • e2e-scenarios-all (high): Shared Vitest E2E fixtures and live registry scenario entrypoints changed. Run the full live Vitest scenario fan-out to validate that onboarding, sandbox lifecycle, inference/state probes, artifact capture, and renamed fixture imports still work through the maintained E2E workflow.
  • openshell-version-pin-vitest (low): The free-standing OpenShell version-pin live test is directly touched and shares the renamed fixture layer. Run its dedicated workflow job to validate the hermetic installer/OpenShell pin contract still executes under the updated Vitest E2E project layout.

Optional E2E

  • None.

New E2E recommendations

  • Vitest free-standing live tests (medium): test/e2e-scenario/live/ubuntu-repo-cli-smoke.test.ts exists and is touched, but the inspected e2e-vitest-scenarios.yaml workflow only dispatches registry-scenarios.test.ts plus the openshell-version-pin free-standing job. There is no existing workflow job that runs the CLI smoke live test.
    • Suggested test: Add a discrete e2e-vitest-scenarios.yaml job for test/e2e-scenario/live/ubuntu-repo-cli-smoke.test.ts, or fold the CLI smoke into the dispatchable Vitest live matrix.

@github-actions

github-actions Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

E2E Scenario Advisor Recommendation

Required scenario E2E: e2e-scenarios-all
Optional scenario E2E: None

Dispatch required scenario E2E:

  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref>

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required scenario E2E

  • e2e-scenarios-all: This PR changes shared Vitest scenario fixture/client/phase code, live scenario entry points, scenario typing, support coverage, and Vitest project configuration. Those surfaces can affect every live-supported scenario, so the full Vitest scenario fan-out is required.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref>

Optional scenario E2E

  • None.

Relevant changed files

  • test/e2e-scenario/docs/MIGRATION.md
  • test/e2e-scenario/docs/README.md
  • test/e2e-scenario/docs/RETIREMENT.md
  • test/e2e-scenario/fixtures/artifacts.ts
  • test/e2e-scenario/fixtures/availability-env.ts
  • test/e2e-scenario/fixtures/cleanup.ts
  • test/e2e-scenario/fixtures/clients/command.ts
  • test/e2e-scenario/fixtures/clients/gateway.ts
  • test/e2e-scenario/fixtures/clients/host.ts
  • test/e2e-scenario/fixtures/clients/index.ts
  • test/e2e-scenario/fixtures/clients/provider.ts
  • test/e2e-scenario/fixtures/clients/sandbox.ts
  • test/e2e-scenario/fixtures/clients/state.ts
  • test/e2e-scenario/fixtures/e2e-test.ts
  • test/e2e-scenario/fixtures/live-project-gate.ts
  • test/e2e-scenario/fixtures/phases/environment.ts
  • test/e2e-scenario/fixtures/phases/index.ts
  • test/e2e-scenario/fixtures/phases/lifecycle.ts
  • test/e2e-scenario/fixtures/phases/onboarding.ts
  • test/e2e-scenario/fixtures/phases/runtime.ts
  • test/e2e-scenario/fixtures/phases/state-validation.ts
  • test/e2e-scenario/fixtures/redaction.ts
  • test/e2e-scenario/fixtures/secrets.ts
  • test/e2e-scenario/fixtures/shell-probe.ts
  • test/e2e-scenario/fixtures/shell/supervisor.ts
  • test/e2e-scenario/fixtures/shell/trusted-command.ts
  • test/e2e-scenario/framework-tests/e2e-migration-inventory.test.ts
  • test/e2e-scenario/live/openshell-version-pin.test.ts
  • test/e2e-scenario/live/registry-scenarios.test.ts
  • test/e2e-scenario/live/ubuntu-repo-cli-smoke.test.ts
  • test/e2e-scenario/migration/legacy-inventory.json
  • test/e2e-scenario/scenarios/types.ts
  • test/e2e-scenario/support-tests/e2e-clients.test.ts
  • test/e2e-scenario/support-tests/e2e-expected-state.test.ts
  • test/e2e-scenario/support-tests/e2e-fixture-context.test.ts
  • test/e2e-scenario/support-tests/e2e-live-project-config.test.ts
  • test/e2e-scenario/support-tests/e2e-live-registry-discovery.test.ts
  • test/e2e-scenario/support-tests/e2e-live-skip-name-contract.test.ts
  • test/e2e-scenario/support-tests/e2e-manifests.test.ts
  • test/e2e-scenario/support-tests/e2e-migration-policy.test.ts
  • test/e2e-scenario/support-tests/e2e-migration-source-of-truth.test.ts
  • test/e2e-scenario/support-tests/e2e-phase-environment.test.ts
  • test/e2e-scenario/support-tests/e2e-phase-lifecycle.test.ts
  • test/e2e-scenario/support-tests/e2e-phase-onboarding.test.ts
  • test/e2e-scenario/support-tests/e2e-phase-runtime.test.ts
  • test/e2e-scenario/support-tests/e2e-phase-state-validation.test.ts
  • test/e2e-scenario/support-tests/e2e-redaction-entry.test.ts
  • test/e2e-scenario/support-tests/e2e-redaction-parity.test.ts
  • test/e2e-scenario/support-tests/e2e-scenario-matrix.test.ts
  • test/e2e-scenario/support-tests/e2e-scenario-registry.test.ts
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
  • test/e2e-scenario/support-tests/e2e-shell-supervisor.test.ts
  • vitest.config.ts

@github-actions

github-actions Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

PR Review Advisor

Findings: 0 needs attention, 0 worth checking, 0 nice ideas
Since last review: 0 prior items resolved, 0 still apply, 0 new items found

Consider writing more tests for
  • **Runtime validation** — Verify the `e2e-vitest-support` Vitest project discovers renamed `support-tests` and no stale `e2e-scenario-framework` project name is required by scripts or docs.. Unit/support tests cover the refactor and helper contracts, but the changed surfaces include runtime/sandbox fixture helpers, advisor rendering, and workflow-contract tooling where targeted behavioral validation improves confidence.
  • **Runtime validation** — Verify the scenario advisor normalization and sticky-comment path emits only canonical `e2e-vitest-scenarios.yaml` dispatch commands after the Vitest wording/schema rename.. Unit/support tests cover the refactor and helper contracts, but the changed surfaces include runtime/sandbox fixture helpers, advisor rendering, and workflow-contract tooling where targeted behavioral validation improves confidence.
  • **Runtime validation** — Verify `SandboxClient.exec()` against a real or CLI-stubbed OpenShell boundary accepts `openshell sandbox exec -n <name> -- <argv>` and rejects the old positional-name form if unsupported.. Unit/support tests cover the refactor and helper contracts, but the changed surfaces include runtime/sandbox fixture helpers, advisor rendering, and workflow-contract tooling where targeted behavioral validation improves confidence.
  • **Runtime validation** — Verify retired migration ledger reintroduction produces the PR Review Advisor blocker while deleting those ledgers remains allowed.. Unit/support tests cover the refactor and helper contracts, but the changed surfaces include runtime/sandbox fixture helpers, advisor rendering, and workflow-contract tooling where targeted behavioral validation improves confidence.
  • **Runtime validation** — Verify the workflow freeze contract fails when a top-level `test/e2e/test-*.sh` is added without updating the allowlist or when `nightly-e2e.yaml` references a missing script.. Unit/support tests cover the refactor and helper contracts, but the changed surfaces include runtime/sandbox fixture helpers, advisor rendering, and workflow-contract tooling where targeted behavioral validation improves confidence.
  • **Acceptance clause:** Refs Epic: Migrate legacy bash E2E into the Vitest E2E system #5098 — add test evidence or identify existing coverage. The PR body references Epic: Migrate legacy bash E2E into the Vitest E2E system #5098, but deterministic linkedIssues was empty, so no literal Epic: Migrate legacy bash E2E into the Vitest E2E system #5098 issue body or comments were available to extract and map. The changed migration docs state that Epic: Migrate legacy bash E2E into the Vitest E2E system #5098 tracks direct legacy bash-suite migration decisions.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@github-actions

Copy link
Copy Markdown
Contributor

@wscurran wscurran added area: docs Documentation, examples, guides, or docs build area: e2e End-to-end tests, nightly failures, or validation infrastructure chore Build, CI, dependency, or tooling maintenance labels Jun 10, 2026
@wscurran

Copy link
Copy Markdown
Contributor

@cv

cv commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator Author

Superseded by commit 052375164: #5126 no longer adds a PR-body legacy deletion-evidence checker.

The current foundation adopts the #5156-style deterministic guard instead: freeze the top-level test/e2e/test-*.sh set, freeze scheduled nightly-e2e.yaml script wiring, and assert nightly-wired scripts exist. The PR Review Advisor still blocks reintroducing retired repo-local migration ledgers, but it no longer tries to judge shell-script deletion fidelity from PR-body prose.

@cv cv marked this pull request as ready for review June 10, 2026 14:51

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/about/release-notes.mdx (1)

2-3: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use HTML comments for SPDX in Markdown files.

This .mdx file currently uses # comments for SPDX metadata; the Markdown rule requires HTML comment form instead.

As per coding guidelines, Markdown files must use HTML comments for SPDX headers (shell scripts are the only files that use # comments).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/about/release-notes.mdx` around lines 2 - 3, Replace the SPDX header
lines that currently use Markdown/Hash comments with HTML comment form required
for MDX: change the two lines beginning with "# SPDX-FileCopyrightText:
Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved." and "#
SPDX-License-Identifier: Apache-2.0" into HTML comments like "<!--
SPDX-FileCopyrightText: ... -->" and "<!-- SPDX-License-Identifier: Apache-2.0
-->" so the SPDX metadata is preserved but uses the correct MDX comment syntax.

Source: Coding guidelines

🧹 Nitpick comments (1)
test/e2e-scenario/docs/README.md (1)

6-7: ⚡ Quick win

Remove decorative bolding and punctuation-style colon in Line 6.

Use plain text for routine statements and avoid using a colon here unless it introduces a list.

As per coding guidelines: “Bold is reserved for UI labels, parameter names, and genuine warnings.” and “Colons should only introduce a list.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e-scenario/docs/README.md` around lines 6 - 7, The sentence "NemoClaw
E2E now has one target execution model: **Vitest as the harness** and" uses
decorative bolding and a colon; remove the markdown bold around "Vitest as the
harness" and replace the colon with plain punctuation (e.g., a comma or em dash)
so the line reads in plain text (for example: "NemoClaw E2E now has one target
execution model, Vitest as the harness and GitHub Actions as the matrix.").

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tools/e2e-advisor/scenarios-schema.json`:
- Around line 1-4: Add SPDX metadata to the root JSON object by inserting SPDX
entries before the existing "$schema" key: add a "$comment" string containing
the SPDX-FileCopyrightText notice and add top-level keys
"SPDX-FileCopyrightText" (string) and "SPDX-License-Identifier" (string) or at
minimum the "SPDX-License-Identifier" matching repo license; update the root
object (where "$schema", "$id", and "title" are defined) so the new keys appear
at the top of the JSON document.

---

Outside diff comments:
In `@docs/about/release-notes.mdx`:
- Around line 2-3: Replace the SPDX header lines that currently use
Markdown/Hash comments with HTML comment form required for MDX: change the two
lines beginning with "# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA
CORPORATION & AFFILIATES. All rights reserved." and "# SPDX-License-Identifier:
Apache-2.0" into HTML comments like "<!-- SPDX-FileCopyrightText: ... -->" and
"<!-- SPDX-License-Identifier: Apache-2.0 -->" so the SPDX metadata is preserved
but uses the correct MDX comment syntax.

---

Nitpick comments:
In `@test/e2e-scenario/docs/README.md`:
- Around line 6-7: The sentence "NemoClaw E2E now has one target execution
model: **Vitest as the harness** and" uses decorative bolding and a colon;
remove the markdown bold around "Vitest as the harness" and replace the colon
with plain punctuation (e.g., a comma or em dash) so the line reads in plain
text (for example: "NemoClaw E2E now has one target execution model, Vitest as
the harness and GitHub Actions as the matrix.").
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: cd385387-36e3-49a9-9fbc-9403fcd322c3

📥 Commits

Reviewing files that changed from the base of the PR and between 5f60ed6 and f9422bc.

📒 Files selected for processing (36)
  • docs/about/release-notes.mdx
  • test/e2e-scenario-advisor.test.ts
  • test/e2e-scenario/docs/MIGRATION.md
  • test/e2e-scenario/docs/README.md
  • test/e2e-scenario/docs/RETIREMENT.md
  • test/e2e-scenario/framework-tests/e2e-migration-inventory.test.ts
  • test/e2e-scenario/framework/redaction.ts
  • test/e2e-scenario/migration/legacy-inventory.json
  • test/e2e-scenario/support-tests/e2e-clients.test.ts
  • test/e2e-scenario/support-tests/e2e-expected-state.test.ts
  • test/e2e-scenario/support-tests/e2e-fixture-context.test.ts
  • test/e2e-scenario/support-tests/e2e-live-project-config.test.ts
  • test/e2e-scenario/support-tests/e2e-live-registry-discovery.test.ts
  • test/e2e-scenario/support-tests/e2e-live-skip-name-contract.test.ts
  • test/e2e-scenario/support-tests/e2e-manifests.test.ts
  • test/e2e-scenario/support-tests/e2e-migration-policy.test.ts
  • test/e2e-scenario/support-tests/e2e-migration-source-of-truth.test.ts
  • test/e2e-scenario/support-tests/e2e-phase-environment.test.ts
  • test/e2e-scenario/support-tests/e2e-phase-lifecycle.test.ts
  • test/e2e-scenario/support-tests/e2e-phase-onboarding.test.ts
  • test/e2e-scenario/support-tests/e2e-phase-runtime.test.ts
  • test/e2e-scenario/support-tests/e2e-phase-state-validation.test.ts
  • test/e2e-scenario/support-tests/e2e-redaction-entry.test.ts
  • test/e2e-scenario/support-tests/e2e-redaction-parity.test.ts
  • test/e2e-scenario/support-tests/e2e-scenario-matrix.test.ts
  • test/e2e-scenario/support-tests/e2e-scenario-registry.test.ts
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
  • test/e2e-scenario/support-tests/e2e-shell-supervisor.test.ts
  • test/e2e/docs/parity-inventory.generated.json
  • test/pr-review-advisor.test.ts
  • test/pr-workflow-contract.test.ts
  • tools/e2e-advisor/scenario-comment.mts
  • tools/e2e-advisor/scenarios-schema.json
  • tools/e2e-advisor/scenarios.mts
  • tools/pr-review-advisor/analyze.mts
  • vitest.config.ts
💤 Files with no reviewable changes (2)
  • test/e2e-scenario/framework-tests/e2e-migration-inventory.test.ts
  • test/e2e-scenario/migration/legacy-inventory.json

Comment thread tools/e2e-advisor/scenarios-schema.json
@cv

cv commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator Author

Follow-up to the latest advisor notes: commit debe9586b tightens the deletion-evidence guard and docs.

Changes added after the advisor comment:

  • Replacement Vitest coverage: now counts only if it points at an existing .test.ts file in the PR checkout.
  • test/pr-review-advisor.test.ts now covers a nonexistent replacement path and expects the guard to reject it.
  • MIGRATION.md now says the PR body is the machine-checkable boundary; linked issues can summarize/link the PR evidence, but the advisor does not pretend to validate issue comments.

That keeps the guard lightweight while avoiding placeholder evidence and resolving the PR-body-vs-linked-issue mismatch.

@jyaunches jyaunches left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pushing this cleanup forward. I agree with the direction of removing stale repo-local migration inventories, but I think the new PR Review Advisor deletion-evidence contract is too heavy for the invariant we actually need.

The part I would like changed is the PR-body evidence/blocker flow for deleting test/e2e/test-*.sh scripts: requiring Legacy contract, replacement/retirement rationale, intentionally retired behavior, fidelity verification, and parsing that from tools/pr-review-advisor/analyze.mts. That feels over-engineered and also asks the advisor to infer migration fidelity from PR text, which is not a great deterministic source of truth.

I opened #5156 as a proposed smaller alternative: freeze the current top-level legacy bash E2E script set, freeze the scheduled nightly-e2e.yaml legacy script wiring, and assert every nightly-wired script still exists. With that model, when we are ready to retire a nightly-wired legacy test, the retiring PR removes it from nightly and deletes the script together. New E2E coverage should go to the newer Vitest/regression path unless maintainers intentionally update the allowlist.

Could we remove the PR Review Advisor PR-body evidence checker/contract from this PR and either depend on #5156 or adopt that deterministic workflow-test approach here?

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e-scenario/fixtures/redaction.ts`:
- Around line 18-21: The comment in test/e2e-scenario/fixtures/redaction.ts
incorrectly states the fixture layer "imports" the canonical regex sets from
src/lib/security/secret-patterns.ts; update the wording to reflect that this
module maintains a local mirror of those patterns which is validated against the
canonical source by parity tests (e.g., change "import the canonical regex sets
and apply them here" to something like "maintains a local mirror of the
canonical regex sets, validated by parity tests, to stay in lockstep with
product runtime"). Ensure the revised comment mentions the canonical source
(src/lib/security/secret-patterns.ts) and the parity test validation approach to
avoid future architecture drift.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 6d842ee2-e93f-48b5-b802-c251ac27b16c

📥 Commits

Reviewing files that changed from the base of the PR and between 6062f53 and 3c1ffcc.

📒 Files selected for processing (46)
  • test/e2e-scenario-advisor.test.ts
  • test/e2e-scenario/docs/MIGRATION.md
  • test/e2e-scenario/docs/README.md
  • test/e2e-scenario/docs/RETIREMENT.md
  • test/e2e-scenario/fixtures/artifacts.ts
  • test/e2e-scenario/fixtures/availability-env.ts
  • test/e2e-scenario/fixtures/cleanup.ts
  • test/e2e-scenario/fixtures/clients/command.ts
  • test/e2e-scenario/fixtures/clients/gateway.ts
  • test/e2e-scenario/fixtures/clients/host.ts
  • test/e2e-scenario/fixtures/clients/index.ts
  • test/e2e-scenario/fixtures/clients/provider.ts
  • test/e2e-scenario/fixtures/clients/sandbox.ts
  • test/e2e-scenario/fixtures/clients/state.ts
  • test/e2e-scenario/fixtures/e2e-test.ts
  • test/e2e-scenario/fixtures/live-project-gate.ts
  • test/e2e-scenario/fixtures/phases/environment.ts
  • test/e2e-scenario/fixtures/phases/index.ts
  • test/e2e-scenario/fixtures/phases/lifecycle.ts
  • test/e2e-scenario/fixtures/phases/onboarding.ts
  • test/e2e-scenario/fixtures/phases/runtime.ts
  • test/e2e-scenario/fixtures/phases/state-validation.ts
  • test/e2e-scenario/fixtures/redaction.ts
  • test/e2e-scenario/fixtures/secrets.ts
  • test/e2e-scenario/fixtures/shell-probe.ts
  • test/e2e-scenario/fixtures/shell/supervisor.ts
  • test/e2e-scenario/fixtures/shell/trusted-command.ts
  • test/e2e-scenario/live/openshell-version-pin.test.ts
  • test/e2e-scenario/live/registry-scenarios.test.ts
  • test/e2e-scenario/live/ubuntu-repo-cli-smoke.test.ts
  • test/e2e-scenario/scenarios/types.ts
  • test/e2e-scenario/support-tests/e2e-clients.test.ts
  • test/e2e-scenario/support-tests/e2e-fixture-context.test.ts
  • test/e2e-scenario/support-tests/e2e-live-project-config.test.ts
  • test/e2e-scenario/support-tests/e2e-migration-policy.test.ts
  • test/e2e-scenario/support-tests/e2e-phase-environment.test.ts
  • test/e2e-scenario/support-tests/e2e-phase-lifecycle.test.ts
  • test/e2e-scenario/support-tests/e2e-phase-onboarding.test.ts
  • test/e2e-scenario/support-tests/e2e-phase-runtime.test.ts
  • test/e2e-scenario/support-tests/e2e-phase-state-validation.test.ts
  • test/e2e-scenario/support-tests/e2e-redaction-entry.test.ts
  • test/e2e-scenario/support-tests/e2e-redaction-parity.test.ts
  • test/e2e-scenario/support-tests/e2e-shell-supervisor.test.ts
  • test/release-latest-tag.test.ts
  • tools/e2e-advisor/scenarios.mts
  • vitest.config.ts
✅ Files skipped from review due to trivial changes (17)
  • test/e2e-scenario/fixtures/shell/trusted-command.ts
  • test/e2e-scenario/fixtures/clients/provider.ts
  • test/e2e-scenario/fixtures/shell/supervisor.ts
  • test/e2e-scenario/support-tests/e2e-fixture-context.test.ts
  • test/e2e-scenario/live/registry-scenarios.test.ts
  • test/e2e-scenario/fixtures/secrets.ts
  • test/e2e-scenario/scenarios/types.ts
  • test/e2e-scenario/support-tests/e2e-live-project-config.test.ts
  • test/e2e-scenario/fixtures/phases/state-validation.ts
  • test/e2e-scenario/live/openshell-version-pin.test.ts
  • test/e2e-scenario/fixtures/shell-probe.ts
  • test/e2e-scenario/fixtures/phases/lifecycle.ts
  • test/e2e-scenario/support-tests/e2e-phase-onboarding.test.ts
  • test/e2e-scenario/support-tests/e2e-clients.test.ts
  • test/e2e-scenario/docs/RETIREMENT.md
  • tools/e2e-advisor/scenarios.mts
  • test/e2e-scenario/docs/README.md
🚧 Files skipped from review as they are similar to previous changes (4)
  • test/e2e-scenario/support-tests/e2e-migration-policy.test.ts
  • vitest.config.ts
  • test/e2e-scenario/docs/MIGRATION.md
  • test/e2e-scenario-advisor.test.ts

Comment thread test/e2e-scenario/fixtures/redaction.ts Outdated
@cv

cv commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator Author

Foundation follow-up: pushed 3c1ffccf8, which renames the E2E helper namespace from test/e2e-scenario/framework/ to test/e2e-scenario/fixtures/ and removes the remaining framework terminology from the E2E scenario/advisor surface. This makes the repo layout match the intended end state: Vitest is the harness; NemoClaw owns fixtures/support helpers, not a second framework/runner.\n\nI also refreshed all 13 child draft branches onto this foundation and verified their remote diffs do not reintroduce retired ledgers, framework-tests/, or test/e2e-scenario/framework/.

@cv

cv commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator Author

Merge-gate status after the fixtures rename, sandbox helper follow-ups, deterministic guard pivot, and helper-boundary tightening: #5126 is at 69ad9666b. The PR adopts the #5156-style workflow allowlist guard, removes the PR-body deletion-evidence checker that triggered requested changes, and makes SandboxClient.execShell() require trustedSandboxShellScript() while rejecting flag-shaped upload paths. GitHub Actions are rerunning for the new head. The branch remains mergeable, with the existing requested-changes review state pending maintainer re-review.

@cv

cv commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator Author

All checks are green again on 69ad9666b.

I believe this now addresses the requested-changes review by taking the simpler deterministic shape from #5156:

  • frozen top-level legacy bash allowlist
  • frozen nightly-e2e.yaml legacy wiring allowlist
  • scheduled legacy script existence check
  • no legacy-inventory.json ledger
  • no PR-body deletion evidence contract
  • helper boundary tightened with trustedSandboxShellScript() and upload path validation

The PR is still blocked by the earlier requested-changes review rather than CI. @jyaunches, could you re-review when you have a chance?

…gration-tracking

# Conflicts:
#	test/e2e-scenario/migration/legacy-inventory.json

@jyaunches jyaunches left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks — this now matches the deterministic guard direction: the PR-body deletion-evidence contract is gone, the legacy/script allowlists cover the new main changes, and the redaction wording/CodeRabbit item is fixed.\n\nValidated locally after merging main:\n- npm run build:cli\n- npm run typecheck:cli\n- npx vitest run --project cli test/e2e-script-workflow.test.ts test/e2e-scenario/support-tests/e2e-migration-policy.test.ts test/e2e-scenario/support-tests/e2e-redaction-parity.test.ts test/pr-review-advisor.test.ts --silent=false --reporter=default

@jyaunches jyaunches enabled auto-merge (squash) June 10, 2026 20:49
@jyaunches jyaunches disabled auto-merge June 10, 2026 20:51
@jyaunches jyaunches merged commit 2c2cc5c into main Jun 10, 2026
32 checks passed
@jyaunches jyaunches deleted the codex/e2e-simplify-migration-tracking branch June 10, 2026 20:53
jyaunches added a commit that referenced this pull request Jun 10, 2026
## Summary
Retires the legacy `test/e2e/test-strict-tool-call-probe.sh` regression
and replaces it with a focused Vitest process test that drives the same
caller-level Local Ollama strict Chat Completions tool-call validation
behavior against a hermetic mock endpoint.

The legacy script was process-level mock-driven (`node -e ...` against
in-process module mocks plus a localhost stub) — the same shape
`nemoclaw-e2e-legacy-migrate` Phase 0 explicitly refuses, with a
recommendation to follow #5119's retirement pattern: author the
replacement directly as a Vitest test in `test/`, delete the bash
script, remove the regression-e2e job.

## Related Issue
Refs #5098
Refs #4537
Refs #4349
Refs #5119

## Changes
- Adds `test/strict-tool-call-probe.test.ts` — a Vitest test that spawns
`test/fixtures/strict-tool-call-probe-driver.ts` via `tsx` and asserts
the four legacy `[PASS]` markers (success, onboarding caller,
transient-502 retry, plain-text fail-closed).
- Adds `test/fixtures/strict-tool-call-probe-driver.ts` containing the
former inline `node -e` block from the bash script. The driver is
authored as TypeScript (rather than `.cjs`) per the codebase-growth
guardrail forbidding newly added `.js`/`.cjs`/`.mjs` files; body is
JS-shaped because the embedded `node -e` strings must remain plain
CommonJS for the spawned children, and the dist/lib/* targets are CJS
modules. Uses `createRequire(import.meta.url)` plus
`fileURLToPath`-derived `__dirname` so tsx's ESM-default execution can
still load the CJS dist artifacts. `// @ts-nocheck` keeps the surface
identical to the retired bash heredoc.
- Driver runs in a fresh `tsx` child process so its `curl`-based
validation subprocess is identical to production runtime conditions and
is not affected by Vitest's worker pool, fetch shim, or signal handling.
- Deletes `test/e2e/test-strict-tool-call-probe.sh`.
- Removes the `strict-tool-call-probe-e2e` job, its selector branch,
output declaration, and `Valid:` description entry from
`.github/workflows/regression-e2e.yaml`.
- Adds a workflow-contract test in
`test/regression-e2e-workflow.test.ts` proving the retired job is not
advertised or selected (mirrors the docker-unreachable contract test
added in #5119).
- Removes the deleted shell script's row from
`test/e2e-scenario/migration/legacy-inventory.json` (mirrors #5119's
inventory hygiene); the retirement rationale is captured in this PR and
the linked issue trail.

### Why not live E2E scenario fixtures?
This probe asserts payload shape and retry behavior on production caller
code against a localhost OpenAI-compatible stub. It does not need a live
sandbox lifecycle, scenario registry entry, shared fixture, or matrix
runner. Keeping it as a focused process Vitest test preserves the same
caller boundary without adding another live E2E surface.


### Simplicity check
- Test shape: focused process Vitest test.
- New shared helpers: none.
- New live E2E fixtures/registry/ledger: none.
- Workflow changes: remove the retired regression lane.
- Pre-#5126 compatibility: the `legacy-inventory.json` row is removed
only because the current deletion gate requires the inventory to match
existing `test/e2e/test-*.sh` entry points.

### Build dependency
This test consumes built artifacts from `dist/lib/...` (matches the
existing CLI-shard convention). CI runs `npm run build:cli` before
vitest. For local standalone runs, run `npm run build:cli` first;
without it, the wrapper fails fast with a clear `npm run build:cli`
diagnostic before launching the driver.

## Type of Change
- [x] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [ ] Doc only (prose changes, no code sample modifications)
- [ ] Doc only (includes code sample changes)

## Verification
- `npm run build:cli`
- `node_modules/.bin/tsx test/fixtures/strict-tool-call-probe-driver.ts`
(legacy parity check — all four `[PASS]` markers print; matches `bash
test/e2e/test-strict-tool-call-probe.sh` output line-for-line)
- `npx vitest run --project cli test/strict-tool-call-probe.test.ts
test/regression-e2e-workflow.test.ts
test/e2e-scenario/framework-tests/e2e-migration-inventory.test.ts
test/e2e-scenario/framework-tests/e2e-migration-inventory-lock.test.ts`
→ 10/10 passing
- `npx tsc --noEmit -p tsconfig.cli.json` clean for the touched files
- `npx biome check test/strict-tool-call-probe.test.ts
test/fixtures/strict-tool-call-probe-driver.ts` clean
- `npx tsx scripts/check-test-file-size-budget.ts` passes
- YAML syntax check on `.github/workflows/regression-e2e.yaml` passes

- [ ] `npx prek run --all-files` passes
- [ ] `npm test` passes
- [x] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [ ] Docs updated for user-facing behavior changes
- [ ] `npm run docs` builds without warnings (doc changes only)


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Chores**
* Removed strict tool-call probe regression test lane from the
continuous integration workflow
* Updated test infrastructure with modernized validation for Chat
Completions tool-call enforcement on local integration paths
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>
Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-authored-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
jyaunches added a commit that referenced this pull request Jun 10, 2026
## Summary
Retires the legacy caller-level onboard inference smoke bash regression
now that the same contract is covered by a focused Vitest process test.

## Related Issues
Refs #3253
Refs #4349
Refs #5098
Refs #5119

## Changes
- Adds `test/onboard-inference-smoke.test.ts`, which drives
`setupInference()` in a subprocess with PATH-shimmed `openshell` and
`curl`.
- Verifies onboard probes `/chat/completions`, fails closed on upstream
HTTP 503, surfaces provider/model/API base/credential diagnostics, and
does not print route success after smoke failure.
- Deletes `test/e2e/test-onboard-inference-smoke.sh`.
- Removes the retired `onboard-inference-smoke-e2e` lane from
`regression-e2e.yaml`.
- Updates the post-#5126 frozen legacy E2E shell allowlist and
regression workflow contract test to keep the retired lane out of
dispatch selection.

## Simplicity check
- Test shape: focused process Vitest test.
- New shared helpers: none.
- New framework/registry/ledger: none.
- Workflow changes: remove retired regression lane and update frozen
shell/workflow contract allowlists.

## Verification
- `npm run build:cli`
- `npx vitest run test/onboard-inference-smoke.test.ts`
- `npx tsc --noEmit -p jsconfig.json`
- `npx vitest run --project cli test/onboard-inference-smoke.test.ts
test/regression-e2e-workflow.test.ts test/e2e-script-workflow.test.ts
--silent=false --reporter=default`
- `git diff --check`

Notes:
- A local full hook/all-files run was attempted and failed on existing
environment-dependent failures unrelated to this change (missing nested
`nemoclaw/dist` / `nemoclaw/node_modules/json5`, Docker daemon
unavailable, and several local timeout-sensitive tests). CI should run
the repository's normal validation in the correct environment.

## Type of Change
- [x] Test-only change
- [x] CI/workflow cleanup for retired legacy coverage
- [x] No secrets, API keys, or credentials committed


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Tests**
* Migrated onboard inference smoke test from shell script to
Vitest-based test for improved reliability and isolation.

* **Chores**
* Removed `onboard-inference-smoke-e2e` from regression workflow
selectable jobs.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
cv pushed a commit that referenced this pull request Jun 11, 2026
## Summary
Migrate the `test/e2e/test-vm-driver-privileged-exec-routing.sh`
contract into focused Vitest coverage while keeping the legacy shell
file present for now.

Seeded from Carlos's PR #5144, but rebuilt on current `main` with only
the one-script migration changes and no #5126-era framework diff.

## Related Issues
Refs #5098
Refs #4245

## Contract mapping
- Legacy assertion: VM-driver privileged exec routes to the direct
sandbox container, not `openshell-gateway-nemoclaw`.
- Replacement: `test/vm-driver-privileged-exec-routing.test.ts` asserts
`alpha` selects `openshell-alpha-abc123` while ignoring
`openshell-alpha-child` and gateway containers.
- Boundary preserved: real built `dist/lib/sandbox/privileged-exec.js`
helper plus fake registry and fake `docker ps` subprocess boundary.
- Legacy assertion: exact direct container wins when present.
  - Replacement: `alpha-child` selects `openshell-alpha-child`.
- Boundary preserved: fake Docker output exercises the same
direct-container selection path.
- Legacy assertion: Docker-driver and older registry entries without
recorded driver still route to direct sandbox containers.
- Replacement: `dockerbox` and `unknown-driver` cases assert direct
container argv.
- Boundary preserved: built helper reads a temp
`~/.nemoclaw/sandboxes.json`.
- Legacy assertion: missing direct VM container fails clearly instead of
falling back to gateway routing.
- Replacement: missing-container case asserts the `No running direct
OpenShell sandbox container...driver: vm` error.
- Boundary preserved: fake `docker ps` contains only gateway/unrelated
containers.

## Simplicity check
- Test shape: focused process/unit-shaped Vitest test in `test/`.
- New shared helpers: none; fake registry/Docker setup is local to the
test.
- New framework/registry/ledger: **none**.
- Workflow changes: retired/unwired the nightly shell-script lane from
selective dispatch and nightly shell-script wiring. The legacy shell
file remains in `test/e2e/` and in the top-level frozen shell allowlist;
replacement coverage runs in normal CLI Vitest.

## Verification
- `npm run build:cli`
- `npx vitest run --project cli
test/vm-driver-privileged-exec-routing.test.ts --silent=false
--reporter=default`
- `npx vitest run --project cli test/e2e-script-workflow.test.ts
--silent=false --reporter=default`
- includes a contract assertion that the unwired
`vm-driver-privileged-exec-routing-e2e` lane is absent from nightly
dispatch/wiring, the legacy shell file remains present, the replacement
test is in the CLI Vitest include path, and the CLI coverage shard
rebuilds `dist` before Vitest
- `npm run test-size:check --
test/vm-driver-privileged-exec-routing.test.ts
test/e2e-script-workflow.test.ts`
- `git diff --check`

## CI follow-up
The first selective E2E auto-dispatch still used the trusted `main`
nightly workflow, which attempted to run the shell-script lane. This PR
now removes `vm-driver-privileged-exec-routing-e2e` from the nightly
dispatch input/list/job definitions so the unwired lane is not
auto-dispatched. The legacy shell file has been restored per maintainer
preference; this PR no longer deletes any file under `test/e2e/`.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Tests**
* Added a regression test for privileged-exec routing to validate
correct OpenShell sandbox selection across VM, Docker, and other
drivers.
* Added a contract test to ensure the retired nightly job is absent and
that the CLI coverage shard runs the build step before executing CLI
tests.

* **Chores**
* Retired a nightly E2E job and updated CI guidance to reference the new
CLI test shard and narrowed nightly run targets.
  * Test helper updated to include CLI coverage shard metadata.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
cv pushed a commit that referenced this pull request Jun 11, 2026
## Summary
Migrate the active `dashboard-remote-bind` Brev suite path to a small
live Vitest test while keeping `test/e2e/test-dashboard-remote-bind.sh`
in place per maintainer request.

This uses Carlos's PR #5133 as the seed, but keeps only the direct
dashboard remote-bind migration and drops the stale #5126-era
framework/ledger/workflow noise.

## Related Issues
Refs #5098
Refs #5133

## Contract mapping
- Legacy assertion: required CLIs are available before the remote-bind
check runs.
- Replacement: `test/e2e-scenario/live/dashboard-remote-bind.test.ts`
runs `command -v nemoclaw && command -v openshell`.
  - Boundary preserved: real remote host shell/PATH.
- Legacy assertion: dashboard forward is restarted before checking bind
behavior.
- Replacement: the Vitest test runs `openshell forward stop
<dashboardPort>` then `nemoclaw <sandbox> connect`.
- Boundary preserved: real OpenShell forward + real NemoClaw connect
path.
- Legacy assertion: `NEMOCLAW_DASHBOARD_BIND=0.0.0.0` is honored.
- Replacement: `host.nemoclaw([...], { env: { NEMOCLAW_DASHBOARD_BIND:
"0.0.0.0" } })`.
  - Boundary preserved: real environment-driven CLI behavior.
- Legacy assertion: loopback-only binding is rejected/avoided and
all-interface bind is proven.
- Replacement: parse `openshell forward list`, fail on
`127.0.0.1`/`localhost`, require `0.0.0.0`/`*` for the dashboard port.
  - Boundary preserved: real OpenShell forward-list output.

## Simplicity check
- Test shape: simple live Vitest test with local helper functions.
- New shared helpers: none.
- New framework/registry/ledger: **none**.
- Workflow changes: no workflow YAML change. The existing
`dashboard-remote-bind` suite now invokes the live Vitest test from
`test/e2e/brev-e2e.test.ts`.
- Legacy shell status: `test/e2e/test-dashboard-remote-bind.sh` remains
in place; this PR no longer deletes any `test/e2e` shell script.

## Verification
- `NEMOCLAW_RUN_E2E_SCENARIOS=1 npx vitest run --project
e2e-scenarios-live test/e2e-scenario/live/dashboard-remote-bind.test.ts
--silent=false --reporter=default` (local opt-in live test is skipped
unless `NEMOCLAW_E2E_DASHBOARD_REMOTE_BIND=1` is set by branch
validation)
- `npx vitest run --project cli test/e2e-script-workflow.test.ts
--silent=false --reporter=default`
- `git diff --check origin/main...HEAD`
- `npm run build:cli && npm run typecheck:cli`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: docs Documentation, examples, guides, or docs build area: e2e End-to-end tests, nightly failures, or validation infrastructure chore Build, CI, dependency, or tooling maintenance

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants