test(e2e): typed-shell-runner cutover (parity → retirement) by jyaunches · Pull Request #5106 · NVIDIA/NemoClaw

jyaunches · 2026-06-10T03:44:59Z

Implements Phase 0 PR-1 of #5098: cut over from the typed-shell scenario runner to the Vitest scenario path in one coordinated PR.

What changed

Added Vitest workflow parity for the retiring typed-shell workflows: dispatch matrix summary, per-scenario structured summary, run-plan.json, per-phase result artifacts, explicit artifact allowlist, and 14-day retention.
Persisted environment.result.json, onboarding.result.json, and state-validation.result.json from the phase fixtures.
Slimmed test/e2e-scenario/scenarios/run.ts to the surviving --emit-live-matrix entry point.
Moved shared redaction helpers into the Vitest framework surface.
Deleted the retired typed-shell subsystem: legacy scenario workflows, typed orchestrators, YAML/bash scenario workers, validation suites, runtime helper shell libraries, and dead tests.
Added test/e2e-scenario/docs/RETIREMENT.md and refreshed the scenario docs/inventory tests to describe the Vitest-only scenario runner state.
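
For readers unfamiliar with the per-phase result artifacts, here is a minimal sketch of how a phase fixture could persist its `<phase>.result.json` file. Only the three file names come from the PR description. The `PhaseResult` shape and the `persistPhaseResult` and `readPhaseResult` helpers are hypothetical and are not taken from the repository.

```typescript
import { mkdirSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Phase names mirror the result files named in the PR description.
type PhaseName = "environment" | "onboarding" | "state-validation";

// Hypothetical result shape; the real fixtures may record different fields.
interface PhaseResult {
  phase: PhaseName;
  status: "passed" | "failed" | "skipped";
  durationMs: number;
  details?: Record<string, unknown>;
}

// Write <phase>.result.json into the scenario's artifact directory and return its path.
function persistPhaseResult(artifactDir: string, result: PhaseResult): string {
  mkdirSync(artifactDir, { recursive: true });
  const file = join(artifactDir, `${result.phase}.result.json`);
  writeFileSync(file, `${JSON.stringify(result, null, 2)}\n`, "utf8");
  return file;
}

// Read a previously persisted result back, for example when building a summary.
function readPhaseResult(artifactDir: string, phase: PhaseName): PhaseResult {
  const raw = readFileSync(join(artifactDir, `${phase}.result.json`), "utf8");
  return JSON.parse(raw) as PhaseResult;
}

// Usage: write into a throwaway temp directory and read the result back.
const demoDir = mkdtempSync(join(tmpdir(), "phase-results-"));
persistPhaseResult(demoDir, { phase: "environment", status: "passed", durationMs: 1200 });
console.log(readPhaseResult(demoDir, "environment").status); // prints "passed"
```

Writing one file per phase keeps each artifact small and lets a failed phase be inspected without parsing the whole run.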

Verification

npx vitest run --project e2e-scenario-framework --silent=false --reporter=default
npm run typecheck:cli
npx tsx test/e2e-scenario/scenarios/run.ts --emit-live-matrix
git diff --check

Refs #5098, #4990, #4941

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

Refs #5098 Empty placeholder commit so the branch has a parent commit for the draft PR. Implementation lands in subsequent commits per the phased plan in the PR body. Signed-off-by: J. Yaunches <jyaunches@nvidia.com>

copy-pr-bot · 2026-06-10T03:45:02Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

coderabbitai · 2026-06-10T03:45:09Z

Warning

Review limit reached

@cv, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 51 minutes and 9 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 6573fc84-1c0e-4bca-8472-66e1ac3ce06a

📥 Commits

Reviewing files that changed from the base of the PR and between 56cbecf and 5496811.

📒 Files selected for processing (177)

.github/workflows/e2e-scenarios-all.yaml
.github/workflows/e2e-scenarios.yaml
.github/workflows/e2e-vitest-scenarios.yaml
scripts/e2e/lint-conventions.ts
test/e2e-scenario/docs/MIGRATION.md
test/e2e-scenario/docs/README.md
test/e2e-scenario/docs/RETIREMENT.md
test/e2e-scenario/framework-tests/e2e-assertion-modules.test.ts
test/e2e-scenario/framework-tests/e2e-context-helper.test.ts
test/e2e-scenario/framework-tests/e2e-convention-lint.test.ts
test/e2e-scenario/framework-tests/e2e-expected-state.test.ts
test/e2e-scenario/framework-tests/e2e-fixture-context.test.ts
test/e2e-scenario/framework-tests/e2e-lib-helpers.test.ts
test/e2e-scenario/framework-tests/e2e-live-registry-discovery.test.ts
test/e2e-scenario/framework-tests/e2e-manifests.test.ts
test/e2e-scenario/framework-tests/e2e-migration-inventory.test.ts
test/e2e-scenario/framework-tests/e2e-negative-matcher.test.ts
test/e2e-scenario/framework-tests/e2e-phase-environment.test.ts
test/e2e-scenario/framework-tests/e2e-phase-onboarding.test.ts
test/e2e-scenario/framework-tests/e2e-phase-orchestrators.test.ts
test/e2e-scenario/framework-tests/e2e-phase-state-validation.test.ts
test/e2e-scenario/framework-tests/e2e-plan-compiler.test.ts
test/e2e-scenario/framework-tests/e2e-probes.test.ts
test/e2e-scenario/framework-tests/e2e-redaction-entry.test.ts
test/e2e-scenario/framework-tests/e2e-redaction-parity.test.ts
test/e2e-scenario/framework-tests/e2e-scenario-matrix.test.ts
test/e2e-scenario/framework-tests/e2e-scenario-registry.test.ts
test/e2e-scenario/framework-tests/e2e-scenarios-workflow.test.ts
test/e2e-scenario/framework/availability-env.ts
test/e2e-scenario/framework/e2e-test.ts
test/e2e-scenario/framework/phases/environment.ts
test/e2e-scenario/framework/phases/onboarding.ts
test/e2e-scenario/framework/phases/state-validation.ts
test/e2e-scenario/framework/redaction.ts
test/e2e-scenario/framework/secrets.ts
test/e2e-scenario/live/registry-scenarios.test.ts
test/e2e-scenario/live/run-plan.ts
test/e2e-scenario/manifests/openclaw-nvidia-rebuild.yaml
test/e2e-scenario/migration/legacy-inventory.json
test/e2e-scenario/nemoclaw_scenarios/dispatch-action.sh
test/e2e-scenario/nemoclaw_scenarios/fixtures/_fake-http-stub.sh
test/e2e-scenario/nemoclaw_scenarios/fixtures/fake-discord.sh
test/e2e-scenario/nemoclaw_scenarios/fixtures/fake-openai.sh
test/e2e-scenario/nemoclaw_scenarios/fixtures/fake-slack.sh
test/e2e-scenario/nemoclaw_scenarios/fixtures/fake-telegram.sh
test/e2e-scenario/nemoclaw_scenarios/fixtures/older-base-image.sh
test/e2e-scenario/nemoclaw_scenarios/helpers/emit-context-from-plan.sh
test/e2e-scenario/nemoclaw_scenarios/install/dispatch.sh
test/e2e-scenario/nemoclaw_scenarios/install/helpers/install-path-refresh.sh
test/e2e-scenario/nemoclaw_scenarios/install/launchable.sh
test/e2e-scenario/nemoclaw_scenarios/install/ollama.sh
test/e2e-scenario/nemoclaw_scenarios/install/public-curl.sh
test/e2e-scenario/nemoclaw_scenarios/install/repo-current.sh
test/e2e-scenario/nemoclaw_scenarios/lifecycle/dispatch.sh
test/e2e-scenario/nemoclaw_scenarios/lifecycle/rebuild-current-version.sh
test/e2e-scenario/nemoclaw_scenarios/onboard/cloud-hermes.sh
test/e2e-scenario/nemoclaw_scenarios/onboard/cloud-openclaw-no-docker.sh
test/e2e-scenario/nemoclaw_scenarios/onboard/cloud-openclaw.sh
test/e2e-scenario/nemoclaw_scenarios/onboard/dispatch.sh
test/e2e-scenario/nemoclaw_scenarios/onboard/local-ollama-openclaw.sh
test/e2e-scenario/nemoclaw_scenarios/probes/cli-installed.sh
test/e2e-scenario/nemoclaw_scenarios/probes/dispatch.sh
test/e2e-scenario/nemoclaw_scenarios/probes/gateway-absent.sh
test/e2e-scenario/nemoclaw_scenarios/probes/gateway-healthy.sh
test/e2e-scenario/nemoclaw_scenarios/probes/sandbox-absent.sh
test/e2e-scenario/nemoclaw_scenarios/probes/sandbox-running.sh
test/e2e-scenario/nemoclaw_scenarios/scenarios.yaml
test/e2e-scenario/onboarding_assertions/base/00-cli-installed.sh
test/e2e-scenario/onboarding_assertions/preflight/00-preflight-expected-failed.sh
test/e2e-scenario/onboarding_assertions/preflight/00-preflight-passed.sh
test/e2e-scenario/runtime/lib/context.sh
test/e2e-scenario/runtime/lib/env.sh
test/e2e-scenario/runtime/lib/logging.sh
test/e2e-scenario/runtime/lib/onboard-state.sh
test/e2e-scenario/runtime/lib/sandbox-teardown.sh
test/e2e-scenario/runtime/reports/render-gap-report.ts
test/e2e-scenario/scenarios/assertions/diagnostics.ts
test/e2e-scenario/scenarios/assertions/hermes.ts
test/e2e-scenario/scenarios/assertions/inference.ts
test/e2e-scenario/scenarios/assertions/lifecycle.ts
test/e2e-scenario/scenarios/assertions/messaging.ts
test/e2e-scenario/scenarios/assertions/negative.ts
test/e2e-scenario/scenarios/assertions/platform.ts
test/e2e-scenario/scenarios/assertions/registry.ts
test/e2e-scenario/scenarios/assertions/security.ts
test/e2e-scenario/scenarios/compiler.ts
test/e2e-scenario/scenarios/expected-states.ts
test/e2e-scenario/scenarios/matrix.ts
test/e2e-scenario/scenarios/orchestrators/context.ts
test/e2e-scenario/scenarios/orchestrators/environment.ts
test/e2e-scenario/scenarios/orchestrators/lifecycle.ts
test/e2e-scenario/scenarios/orchestrators/negative-matcher.ts
test/e2e-scenario/scenarios/orchestrators/onboarding.ts
test/e2e-scenario/scenarios/orchestrators/phase.ts
test/e2e-scenario/scenarios/orchestrators/runner.ts
test/e2e-scenario/scenarios/orchestrators/runtime.ts
test/e2e-scenario/scenarios/orchestrators/state-validation.ts
test/e2e-scenario/scenarios/probes/builtin.ts
test/e2e-scenario/scenarios/probes/diagnostics.ts
test/e2e-scenario/scenarios/probes/docs-validation.ts
test/e2e-scenario/scenarios/probes/injection-blocked.ts
test/e2e-scenario/scenarios/probes/network-policy.ts
test/e2e-scenario/scenarios/probes/registry.ts
test/e2e-scenario/scenarios/probes/shields-config.ts
test/e2e-scenario/scenarios/probes/types.ts
test/e2e-scenario/scenarios/probes/util.ts
test/e2e-scenario/scenarios/run.ts
test/e2e-scenario/scenarios/scenarios/baseline.ts
test/e2e-scenario/scenarios/types.ts
test/e2e-scenario/validation_suites/assert/gateway-alive.sh
test/e2e-scenario/validation_suites/assert/inference-works.sh
test/e2e-scenario/validation_suites/assert/messaging-bridge-reachable.sh
test/e2e-scenario/validation_suites/assert/no-credentials-leaked.sh
test/e2e-scenario/validation_suites/assert/policy-preset-applied.sh
test/e2e-scenario/validation_suites/assert/sandbox-alive.sh
test/e2e-scenario/validation_suites/baseline-onboarding/00-cli-and-openshell.sh
test/e2e-scenario/validation_suites/baseline-onboarding/01-sandbox-state.sh
test/e2e-scenario/validation_suites/baseline-onboarding/02-route-and-smoke.sh
test/e2e-scenario/validation_suites/hermes/00-hermes-health.sh
test/e2e-scenario/validation_suites/hermes/01-history-writable.sh
test/e2e-scenario/validation_suites/inference/cloud/00-models-health.sh
test/e2e-scenario/validation_suites/inference/cloud/01-chat-completion.sh
test/e2e-scenario/validation_suites/inference/cloud/02-inference-local-from-sandbox.sh
test/e2e-scenario/validation_suites/inference/kimi-compatibility/00-plugin-wiring.sh
test/e2e-scenario/validation_suites/inference/kimi-compatibility/01-kimi-compatible-models-route.sh
test/e2e-scenario/validation_suites/inference/model-router/00-healthy-endpoint.sh
test/e2e-scenario/validation_suites/inference/model-router/01-provider-routed-completion.sh
test/e2e-scenario/validation_suites/inference/ollama-auth-proxy/00-proxy-reachable.sh
test/e2e-scenario/validation_suites/inference/ollama-auth-proxy/01-auth-enforcement.sh
test/e2e-scenario/validation_suites/inference/ollama-gpu/00-ollama-models-health.sh
test/e2e-scenario/validation_suites/inference/ollama-gpu/01-ollama-chat-completion.sh
test/e2e-scenario/validation_suites/inference/routing/00-inference-local-chat-completion.sh
test/e2e-scenario/validation_suites/inference/routing/01-provider-route-health.sh
test/e2e-scenario/validation_suites/inference/switch/00-route-state-updated.sh
test/e2e-scenario/validation_suites/inference/switch/01-switched-inference-local-chat.sh
test/e2e-scenario/validation_suites/lib/baseline_onboarding.sh
test/e2e-scenario/validation_suites/lib/inference_routing.sh
test/e2e-scenario/validation_suites/lib/messaging_providers.sh
test/e2e-scenario/validation_suites/lib/rebuild_upgrade.sh
test/e2e-scenario/validation_suites/lib/sandbox_lifecycle.sh
test/e2e-scenario/validation_suites/lib/security_policy_credentials.sh
test/e2e-scenario/validation_suites/messaging/common/00-provider-attached.sh
test/e2e-scenario/validation_suites/messaging/common/01-placeholder-configured.sh
test/e2e-scenario/validation_suites/messaging/common/02-no-secret-leak.sh
test/e2e-scenario/validation_suites/messaging/common/03-bridge-reachable.sh
test/e2e-scenario/validation_suites/messaging/discord/00-discord-gateway-path.sh
test/e2e-scenario/validation_suites/messaging/slack/00-slack-provider-state.sh
test/e2e-scenario/validation_suites/messaging/telegram/00-telegram-injection-safety.sh
test/e2e-scenario/validation_suites/messaging/telegram/01-telegram-injection-payload-classes.sh
test/e2e-scenario/validation_suites/messaging/token-rotation/00-provider-rotation-isolated.sh
test/e2e-scenario/validation_suites/onboarding/state/00-registry-provider-model-policies.sh
test/e2e-scenario/validation_suites/onboarding/state/01-session-provider-model-policies.sh
test/e2e-scenario/validation_suites/platform/macos/00-macos-smoke.sh
test/e2e-scenario/validation_suites/platform/wsl/00-wsl-smoke.sh
test/e2e-scenario/validation_suites/rebuild_upgrade/00-state-preserved.sh
test/e2e-scenario/validation_suites/rebuild_upgrade/01-agent-version-upgraded.sh
test/e2e-scenario/validation_suites/rebuild_upgrade/02-post-rebuild-inference.sh
test/e2e-scenario/validation_suites/rebuild_upgrade/03-policy-config-preserved.sh
test/e2e-scenario/validation_suites/rebuild_upgrade/04-upgrade-survivor-reachable.sh
test/e2e-scenario/validation_suites/sandbox-exec.sh
test/e2e-scenario/validation_suites/sandbox/lifecycle/00-gateway-health.sh
test/e2e-scenario/validation_suites/sandbox/lifecycle/01-gateway-recovery.sh
test/e2e-scenario/validation_suites/sandbox/operations/00-list-and-status.sh
test/e2e-scenario/validation_suites/sandbox/operations/01-logs-and-exec.sh
test/e2e-scenario/validation_suites/sandbox/snapshot/00-create-list-restore.sh
test/e2e-scenario/validation_suites/security/credentials/00-credentials-present.sh
test/e2e-scenario/validation_suites/security/credentials/01-no-plaintext-host-store.sh
test/e2e-scenario/validation_suites/security/injection/00-telegram-message-not-shell-executed.sh
test/e2e-scenario/validation_suites/security/policy/00-telegram-preset-applied.sh
test/e2e-scenario/validation_suites/security/policy/01-openshell-version-supports-credential-rewrite.sh
test/e2e-scenario/validation_suites/security/shields/00-config-consistent.sh
test/e2e-scenario/validation_suites/smoke/00-cli-available.sh
test/e2e-scenario/validation_suites/smoke/01-gateway-health.sh
test/e2e-scenario/validation_suites/smoke/02-sandbox-listed.sh
test/e2e-scenario/validation_suites/smoke/03-sandbox-shell.sh
test/e2e-scenario/validation_suites/suites.yaml
tools/e2e-scenarios/workflow-boundary.mts

github-actions · 2026-06-10T03:45:40Z

E2E Advisor Recommendation

Required E2E: e2e-vitest-scenarios:live-scenarios
Optional E2E: cloud-e2e, rebuild-openclaw-e2e, credential-sanitization-e2e, telegram-injection-e2e, messaging-providers-e2e, network-policy-e2e

Dispatch hint: scenarios=ubuntu-repo-cloud-openclaw,ubuntu-repo-docker-post-reboot-recovery

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

e2e-vitest-scenarios:live-scenarios (high; two ubuntu-latest live scenario matrix jobs, up to 45 minutes each): This is the surviving scenario E2E workflow and it is directly modified. Run the supported live Vitest scenario matrix to verify workflow dispatch, matrix emission, CLI build, install/onboarding, state-validation probes, run-plan summaries, redaction-safe artifact paths, and upload allowlists after retiring the typed-shell workflows.
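
To make the "upload allowlists" check above concrete, here is a rough sketch of an allowlist filter over candidate artifact paths. The allowlist entries are assumptions for illustration (`run-plan.json` and the `.result.json` suffix appear in the PR description, but the workflow's actual allowlist is not shown here), and `isAllowedArtifact` and `partitionArtifacts` are hypothetical helpers.

```typescript
import { basename } from "node:path";

// Assumed allowlist: one exact file name plus one suffix rule.
// The real workflow's allowlist may contain different entries.
const ALLOWED_NAMES = new Set<string>(["run-plan.json"]);
const ALLOWED_SUFFIXES: readonly string[] = [".result.json"];

// An artifact is uploadable only if its file name matches the allowlist.
function isAllowedArtifact(path: string): boolean {
  const name = basename(path);
  return ALLOWED_NAMES.has(name) || ALLOWED_SUFFIXES.some((suffix) => name.endsWith(suffix));
}

// Split candidates into files to upload and files to reject, preserving input order.
function partitionArtifacts(paths: readonly string[]): { upload: string[]; rejected: string[] } {
  const upload: string[] = [];
  const rejected: string[] = [];
  for (const path of paths) {
    (isAllowedArtifact(path) ? upload : rejected).push(path);
  }
  return { upload, rejected };
}
```

An explicit allowlist fails closed: a new file type is rejected until someone adds it deliberately, which is the safer default for artifacts that might contain secrets.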

Optional E2E

cloud-e2e (high): Useful independent confidence that the core install → onboard → sandbox → live inference user journey still works through the legacy direct E2E path after removing the typed-shell scenario runner.
rebuild-openclaw-e2e (high): Recommended because rebuild manifest/lifecycle coverage is touched and the retired validation suites included rebuild-upgrade assertions that are not yet broadly represented in the supported Vitest live matrix.
credential-sanitization-e2e (medium-high): Useful security backstop because the removed validation_suites/security coverage included no-secret-leak and host credential-store hardening assertions, while redaction and secrets fixture code changed.
telegram-injection-e2e (medium-high): Optional security boundary confidence because Telegram injection validation scripts and probes were removed from the scenario-runner tree; the direct legacy E2E remains the stronger existing coverage for shell-injection prevention.
messaging-providers-e2e (high): Optional confidence for provider credential isolation and messaging bridge behavior after deleting the scenario validation suites for Telegram/Discord/Slack provider attachment, placeholder configuration, no-secret-leak, and bridge reachability.
network-policy-e2e (medium-high): Optional network/security confidence because network-policy and shields probes were removed from the retired scenario-runner tree; this verifies the direct legacy network policy path still protects the boundary.

New E2E recommendations

security-credentials-coverage (high): After this retirement, the supported Vitest live scenario matrix covers only the baseline OpenClaw and post-reboot-recovery paths. The deleted typed-shell validation suites contained credential presence, no-plaintext-host-store, shields, policy preset, and no-secret-leak assertions that are not yet equivalent live Vitest probes.
- Suggested test: Add a live Vitest scenario fixture path for credential/security boundary validation that records credential presence, no raw secret output, shields config consistency, and policy preset evidence under the new artifact allowlist.
messaging-and-injection-coverage (high): Messaging validation suites and Telegram injection probes are deleted, but the Vitest live scenario support whitelist does not yet wire Telegram/Discord/Slack onboarding profiles. Existing legacy E2Es remain, but the new scenario framework lacks parity.
- Suggested test: Add supported Vitest live messaging scenarios for Telegram, Discord, and Slack that validate provider attachment, bridge reachability, token isolation, and injection payload non-execution.
network-policy-and-egress-coverage (medium): Network-policy probes were removed from the scenario runner, and the current supported Vitest matrix does not exercise Brave/common-egress or gateway credential-rewrite behavior.
- Suggested test: Add a Vitest live network-policy scenario covering allowed and blocked egress, policy preset application, and credential rewrite behavior with redacted evidence artifacts.
rebuild-upgrade-coverage (medium): The old scenario workflow explicitly included ubuntu-rebuild-openclaw, and the deleted validation suites covered rebuild state preservation, post-rebuild inference, policy preservation, and upgrade survivor reachability. The surviving supported Vitest matrix does not yet include a rebuild profile.
- Suggested test: Add a supported Vitest live rebuild lifecycle scenario that runs rebuild, verifies workspace/state preservation, policy/config preservation, and post-rebuild inference.
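
As a starting point for the "no raw secret output" assertion suggested above, here is a minimal sketch of a leak check over captured output. It does not reflect the API of `test/e2e-scenario/framework/redaction.ts`. The helper names and the secret names in the usage comment are hypothetical.

```typescript
// Map of secret name to secret value, for example collected from the scenario's environment.
type NamedSecrets = Record<string, string>;

// Replace every occurrence of each secret value with a placeholder.
// Longer values are replaced first so a secret containing another is fully covered.
function redactSecrets(text: string, secrets: NamedSecrets, placeholder = "[REDACTED]"): string {
  const values = Object.values(secrets)
    .filter((value) => value.length > 0)
    .sort((a, b) => b.length - a.length);
  let out = text;
  for (const value of values) {
    out = out.split(value).join(placeholder);
  }
  return out;
}

// Return the names (never the values) of secrets that appear verbatim in the text.
function findLeakedSecretNames(text: string, secrets: NamedSecrets): string[] {
  return Object.entries(secrets)
    .filter(([, value]) => value.length > 0 && text.includes(value))
    .map(([name]) => name);
}

// Usage inside a test body (secret names are placeholders):
//   const leaked = findLeakedSecretNames(capturedOutput, { TELEGRAM_BOT_TOKEN: token });
//   expect(leaked).toEqual([]);
```

Reporting names rather than values keeps the failure message itself from leaking the secret into logs or artifacts.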

Dispatch hint

Workflow: .github/workflows/e2e-vitest-scenarios.yaml
jobs input: scenarios=ubuntu-repo-cloud-openclaw,ubuntu-repo-docker-post-reboot-recovery
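
For illustration, a dispatch input of this form could be parsed and validated roughly as follows. The two scenario IDs come from the hint above. `parseScenariosInput` is a hypothetical helper, and how the workflow actually consumes the input is not shown in this thread.

```typescript
// Parse "scenarios=a,b" (or just "a,b") into a de-duplicated list of scenario IDs,
// rejecting any ID that is not in the supported set.
function parseScenariosInput(input: string, supported: ReadonlySet<string>): string[] {
  const prefix = "scenarios=";
  const value = input.startsWith(prefix) ? input.slice(prefix.length) : input;
  const ids = value
    .split(",")
    .map((id) => id.trim())
    .filter((id) => id.length > 0);
  const unknown = ids.filter((id) => !supported.has(id));
  if (unknown.length > 0) {
    throw new Error(`Unsupported scenario id(s): ${unknown.join(", ")}`);
  }
  return Array.from(new Set(ids));
}

// The two IDs named in the dispatch hint; treating them as the full supported set is an assumption.
const SUPPORTED_SCENARIOS: ReadonlySet<string> = new Set([
  "ubuntu-repo-cloud-openclaw",
  "ubuntu-repo-docker-post-reboot-recovery",
]);
```

Failing on unknown IDs, rather than silently dropping them, surfaces a typo in the dispatch input before any runner time is spent.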

github-actions · 2026-06-10T03:45:41Z

E2E Scenario Advisor Recommendation

Required scenario E2E: e2e-scenarios-all
Optional scenario E2E: None

Dispatch required scenario E2E:

gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref>

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required scenario E2E

e2e-scenarios-all: Required fan-out: this PR changes the Vitest scenario workflow plus broad scenario runner/runtime surfaces, live registry entrypoint, framework fixtures, phase helpers, expected-state and scenario metadata, manifests, retirement of typed-shell scenario surfaces, and artifact/reporting behavior. The safest and policy-required validation is the full live-supported Vitest scenario matrix.
- Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref>

Optional scenario E2E

None.

github-actions · 2026-06-10T03:49:38Z

PR Review Advisor

Findings: 0 needs attention, 1 worth checking, 0 nice ideas
Top item: PR review advisor unavailable

Review findings

🛠️ Needs attention

None.

🔎 Worth checking

PR review advisor unavailable: The automated advisor could not complete: Could not parse JSON from PR review advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/pr-review-advisor/pr-review-advisor-raw-output.txt
- Recommendation: Re-run the PR Review Advisor or perform a manual review.
- Evidence: Could not parse JSON from PR review advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/pr-review-advisor/pr-review-advisor-raw-output.txt

🌱 Nice ideas

None.

Consider writing more tests for

**Runtime validation** — Add or identify targeted runtime/integration validation for the changed behavior; do not report external E2E job pass/fail here. Runtime/sandbox/infrastructure paths need behavioral runtime validation: .github/workflows/e2e-scenarios-all.yaml, .github/workflows/e2e-scenarios.yaml, .github/workflows/e2e-vitest-scenarios.yaml, scripts/e2e/lint-conventions.ts, tools/e2e-scenarios/workflow-boundary.mts.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

copy-pr-bot · 2026-06-10T05:25:23Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

github-actions · 2026-06-10T06:42:37Z

Selective E2E Results — ✅ All requested jobs passed

Run: 27257661280
Target ref: 3d3149346c68573d03e9ef5490ebc85a236d8af3
Workflow ref: codex/e2e-advisor-vitest-workflow
Requested jobs: cloud-onboard-e2e,rebuild-openclaw-e2e,credential-sanitization-e2e
Summary: 3 passed, 0 failed, 0 skipped

Job	Result
cloud-onboard-e2e	✅ success
credential-sanitization-e2e	✅ success
rebuild-openclaw-e2e	✅ success

## Summary

Simplifies legacy E2E migration tracking now that #5106 retired the typed-shell/YAML scenario runner. The PR removes stale repo-local JSON migration ledgers, documents GitHub issues and PRs as the source of truth for migration state, adopts the #5156-style deterministic legacy bash workflow guard, and clarifies that NemoClaw owns one Vitest E2E system with fixtures/support code rather than a second framework.

## Related Issue

Refs #5098

## Changes

- Removes `test/e2e-scenario/migration/legacy-inventory.json` so it no longer acts as a detailed script-by-script migration roadmap.
- Removes the orphaned generated assertion inventory at `test/e2e/docs/parity-inventory.generated.json`; its recorded generator no longer exists, and it was already stale against migrated/deleted legacy scripts.
- Adds deterministic workflow contract tests that freeze the current top-level `test/e2e/test-*.sh` legacy script set, freeze scheduled `nightly-e2e.yaml` legacy script wiring, and assert every nightly-wired legacy script still exists.
- Replaces inventory validation with migration policy/source-of-truth checks that keep repo-local durable taxonomy and generated assertion ledgers out of migration docs.
- Removes the PR Review Advisor PR-body deletion-evidence checker/contract; the advisor still blocks reintroducing retired migration ledgers.
- Updates `MIGRATION.md`, `README.md`, `RETIREMENT.md`, and release-note wording to describe one Vitest E2E system, workflow allowlist guardrails, and fixture/support code rather than a second E2E framework or runner.
- Renames the fast Vitest support project from `e2e-scenario-framework` to `e2e-vitest-support`, moves tests from `framework-tests/` to `support-tests/`, and renames the shared helper path to `test/e2e-scenario/fixtures/`.
- Rewords the E2E scenario advisor prompt, summary, and sticky-comment text so it recommends Vitest-backed E2E scenario dispatches without teaching “scenario E2E” as a separate E2E class.
- Fixes shared sandbox fixture helpers so `SandboxClient.exec()` uses `openshell sandbox exec -n <name>` and exposes `execShell()` / `upload()` for follow-on migrations.

## Type of Change

- [ ] Code change (feature, bug fix, or refactor)
- [x] Code change with doc updates
- [ ] Doc only (prose changes, no code sample modifications)
- [ ] Doc only (includes code sample changes)

## Verification

- `npx vitest run --project e2e-vitest-support --silent=false --reporter=default`
- `npx vitest run --project cli test/pr-workflow-contract.test.ts test/e2e-scenario-advisor.test.ts --silent=false --reporter=default`
- `npx vitest run --project cli test/e2e-script-workflow.test.ts test/pr-review-advisor.test.ts --reporter=default`
- `npx vitest run --project e2e-vitest-support test/e2e-scenario/support-tests/e2e-migration-policy.test.ts --reporter=default`
- `npm run test-size:check -- test/e2e-script-workflow.test.ts`
- `npx markdownlint-cli2 docs/about/release-notes.mdx test/e2e-scenario/docs/MIGRATION.md test/e2e-scenario/docs/README.md test/e2e-scenario/docs/RETIREMENT.md "!node_modules/**" "!skills/**"`
- `node -e "JSON.parse(require("fs").readFileSync("tools/e2e-advisor/scenarios-schema.json","utf8"))"`
- `npm run typecheck:cli`
- `git diff --check`
- Commit and push hooks passed with `SSH_AUTH_SOCK=/tmp/ssh-Nu4xGJZo0F/agent.712827`.
- [ ] `npx prek run --all-files` passes
- [ ] `npm test` passes
- [x] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [x] Docs updated for user-facing behavior changes
- [ ] `npm run docs` builds without warnings (doc changes only)
- [ ] Doc pages follow the [style guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md) (doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

---

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

## Summary by CodeRabbit

* **Refactor**
  * Consolidated E2E test infra under a Vitest-driven fixtures/support layer and renamed the E2E support project.
* **Documentation**
  * Updated migration guidance to use GitHub issues/PRs as the source of truth and clarified fixture-owned responsibilities and migration checklist.
* **Tests**
  * Added migration-policy/source-of-truth tests; removed legacy repo-local migration ledger; updated scenario and PR-review advisor outputs to target Vitest E2E and to block reintroduced ledgers.
* **New Features**
  * Fixture sandbox helpers added (shell-run/upload) and command usage normalized for named sandboxes.

---------

Co-authored-by: Julie Yaunches <jyaunches@nvidia.com>
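The "freeze the current top-level `test/e2e/test-*.sh` legacy script set" guard described in the commit body above can be pictured as a set comparison between a checked-in allowlist and the scripts actually present. The following is a minimal sketch under that assumption, not the repository's test code: `isTopLevelLegacyScript`, `compareToFrozenSet`, and the example script names are hypothetical, and the real contract test may be structured differently.

```typescript
// Hypothetical sketch of a "frozen set" guard; names are illustrative only.

type FreezeReport = { added: string[]; removed: string[] };

// Matches only top-level legacy scripts such as test/e2e/test-foo.sh,
// not scripts nested in subdirectories.
function isTopLevelLegacyScript(path: string): boolean {
  return /^test\/e2e\/test-[^/]+\.sh$/.test(path);
}

// Compares the frozen allowlist against the scripts currently present.
// Any entry in `added` means a new legacy script was introduced;
// any entry in `removed` means the allowlist is stale and should shrink.
function compareToFrozenSet(
  frozen: readonly string[],
  current: readonly string[],
): FreezeReport {
  const frozenSet = new Set(frozen);
  const currentSet = new Set(current.filter(isTopLevelLegacyScript));
  return {
    added: Array.from(currentSet).filter((p) => !frozenSet.has(p)).sort(),
    removed: Array.from(frozenSet).filter((p) => !currentSet.has(p)).sort(),
  };
}

// Example with made-up script names.
const report = compareToFrozenSet(
  ["test/e2e/test-alpha.sh", "test/e2e/test-beta.sh"],
  ["test/e2e/test-beta.sh", "test/e2e/test-gamma.sh", "test/e2e/lib/helper.sh"],
);
console.log(JSON.stringify(report));
```

A test built on this idea would fail whenever either list is non-empty, so adding a new legacy bash script or deleting one without updating the allowlist both require an explicit, reviewable change.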

cv and others added 2 commits June 9, 2026 20:39

ci(e2e): route scenario advisor to Vitest workflow

664f486

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

Initial cutover plan placeholder

3570edf

Refs #5098 Empty placeholder commit so the branch has a parent commit for the draft PR. Implementation lands in subsequent commits per the phased plan in the PR body. Signed-off-by: J. Yaunches <jyaunches@nvidia.com>

jyaunches added the area: e2e End-to-end tests, nightly failures, or validation infrastructure label Jun 10, 2026

cv added 3 commits June 9, 2026 20:49

test(e2e): enforce scenario advisor registry ids

cf77030

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

test(e2e): require live-supported scenario recommendations

74f37d4

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

test(e2e): document advisor trusted registry boundary

47e20ac

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

cv changed the base branch from main to codex/e2e-advisor-vitest-workflow June 10, 2026 04:15

jyaunches and others added 3 commits June 9, 2026 21:16

Initial cutover plan placeholder

55c4998

Refs #5098 Empty placeholder commit so the branch has a parent commit for the draft PR. Implementation lands in subsequent commits per the phased plan in the PR body. Signed-off-by: J. Yaunches <jyaunches@nvidia.com>

test(e2e): retire typed-shell scenario runner

94ab779

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

merge: reconcile remote cutover placeholder

4b89504

cv marked this pull request as ready for review June 10, 2026 05:26

cv added 3 commits June 9, 2026 22:29

docs(e2e): fix retirement note lint

04ba517

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

fix(e2e): align Vitest scenario artifacts

3184f9f

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

fix(e2e): include shell probe artifacts

3d31493

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

github-actions Bot mentioned this pull request Jun 10, 2026

refactor(test/e2e): consolidate shell-probe with scenario spawn infra #5033

Merged

12 tasks

Base automatically changed from codex/e2e-advisor-vitest-workflow to main June 10, 2026 06:40

merge: sync typed-shell cutover with main

5496811

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

cv approved these changes Jun 10, 2026

cv merged commit 82499f8 into main Jun 10, 2026
34 checks passed

cv deleted the issue-5098-typed-shell-cutover branch June 10, 2026 07:08

This was referenced Jun 10, 2026

Adopt Vitest fixtures as the E2E scenario execution model #4941

Open

Epic: Migrate legacy bash E2E into the Vitest E2E system #5098

Open

test(e2e): freeze legacy YAML/bash scenario runner #5073

Closed

cv added the v0.0.63 Release target label Jun 10, 2026
