test(e2e): migrate full e2e journey to Vitest by cv · Pull Request #5493 · NVIDIA/NemoClaw

cv · 2026-06-16T04:53:55Z

Summary

Migrate test-full-e2e.sh into the live Vitest E2E system. Adds test/e2e-scenario/live/full-e2e.test.ts and full-e2e-vitest workflow wiring. The Vitest test runs install.sh --non-interactive, verifies installed CLI/OpenShell, list/status, inference configuration, policy, direct hosted inference, sandbox inference.local, logs, and cleanup.

Related Issue

Refs #5098

Changes

Adds or wires the free-standing live Vitest scenario full-e2e.
Adds selective workflow dispatch via full-e2e-vitest in .github/workflows/e2e-vitest-scenarios.yaml.
Preserves the legacy system boundaries from test-full-e2e.sh while leaving legacy shell retirement to Phase 11 of epic #5098 (Migrate legacy bash E2E into the Vitest E2E system).

Type of Change

Code change (feature, bug fix, or refactor)
Code change with doc updates
Doc only (prose changes, no code sample modifications)
Doc only (includes code sample changes)

Verification

Git hooks passed during commit and push, or npx prek run --from-ref main --to-ref HEAD passes
Targeted tests pass for changed behavior
Full npm test passes (broad runtime changes only)
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
Docs updated for user-facing behavior changes
npm run docs builds without warnings (doc changes only)
Doc pages follow the style guide (doc changes only)
New doc pages include SPDX header and frontmatter (new pages only)

Targeted local checks run while preparing these branches:

npx vitest run --project e2e-vitest-support test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts --silent=false --reporter=default
npm run typecheck:cli for branches adding new TypeScript tests
git diff --check

Selective same-runner dispatch: https://github.com/NVIDIA/NemoClaw/actions/runs/27638939083 — passed after merge-from-main refresh

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

Summary by CodeRabbit

Tests
- Added a new live “full E2E” Vitest scenario covering the complete workflow end-to-end, including sandbox setup/teardown, policy output validation, hosted inference verification, in-sandbox inference behavior, and log checks, with post-run cleanup and registry removal.
Chores
- Extended the CI pipeline with a dedicated full-e2e-vitest job and wired it into pull request result aggregation, including uploading full-e2e artifacts for review.


copy-pr-bot · 2026-06-16T04:53:59Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.


coderabbitai · 2026-06-16T04:54:02Z

Walkthrough

Adds a new live Vitest E2E test file (test/e2e-scenario/live/full-e2e.test.ts) that covers a complete install-through-inference workflow including CLI validation, OpenShell policy checks, hosted and sandbox inference, log fetching, and cleanup. A corresponding full-e2e-vitest CI job is added to the workflow and wired into report-to-pr.

Changes

Full E2E Vitest test and CI wiring

Layer / File(s)	Summary
Test constants, env builder, and helpers `test/e2e-scenario/live/full-e2e.test.ts`	Defines sandbox/CLI constants, live-test gating, `env()` composer, shell probe formatter, `nemoclaw` host-runner with artifact capture and timeout, cleanup helper (nemoclaw destroy + OpenShell sandbox/gateway teardown with swallowed errors), and `chatRequest`/`parseReplyCommand` utilities.
Full E2E test body and CI job `test/e2e-scenario/live/full-e2e.test.ts`, `.github/workflows/e2e-vitest-scenarios.yaml`	Implements the multi-phase test (Docker check, install, CLI probe, nemoclaw list/status, OpenShell policy assertions, hosted curl inference, sandbox curl inference asserting "42", logs validation, cleanup, registry assertion, and `scenario-result.json` artifact). Adds the `full-e2e-vitest` free-standing CI job (with OpenShell install, vitest run, and artifact upload) and includes it in `report-to-pr.needs`.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

Epic: Migrate legacy bash E2E into the Vitest E2E system #5098: This PR directly implements the Phase 3 migration described in that issue — creating test/e2e-scenario/live/full-e2e.test.ts and wiring it into .github/workflows/e2e-vitest-scenarios.yaml.

Possibly related PRs

NVIDIA/NemoClaw#5243: Introduced the shared free-standing job-selector and validate-jobs plumbing in the same workflow that this PR's full-e2e-vitest job plugs into.
NVIDIA/NemoClaw#5219: Also wires a live Vitest job into report-to-pr.needs using the same pattern.
NVIDIA/NemoClaw#5354: Updates the same workflow to add another free-standing live Vitest job with identical wiring structure.

Suggested labels

area: e2e, chore, v0.0.65

🐇 A fresh install, a sandbox to roam,
nemoclaw list confirms it's home.
Ask the model "what's 6 times 7?"
"42" comes back — pure coding heaven!
Logs fetched, cleanup done, the test is passed,
Full E2E vitest wired up at last! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main change: migrating the full e2e journey test from shell to Vitest, which is the primary objective reflected in both the file additions and workflow changes.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.


github-actions · 2026-06-16T04:54:27Z

PR Review Advisor

Findings: 2 needs attention, 2 worth checking, 0 nice ideas
Since last review: 0 prior items resolved, 4 still apply, 0 new items found

Review findings

🛠️ Needs attention

Avoid shell-injecting hosted model values and exposing the API key in argv (test/e2e-scenario/live/full-e2e.test.ts:197): The sandbox inference probe embeds an env-derived hosted model inside a single-quoted `sh -lc` command. `JSON.stringify` protects JSON syntax but does not escape shell single quotes, so a model override containing `'` can break out of the `--data` argument and execute shell inside the sandbox. The direct hosted curl probe also passes `Authorization: Bearer ${hosted.apiKey}` as a process argument, so artifact redaction cannot prevent the secret from being visible in argv while curl runs. Because `hosted.endpointUrl` is also env-derived, this same direct probe can send the bearer token to an unallowlisted host if the endpoint is overridden.
- Recommendation: Avoid constructing the sandbox request as a shell string: pass curl args directly with `--data-raw`, write a temporary payload/header file with restrictive permissions, or use a small wrapper that reads the secret from env without exposing it in argv. Validate or allowlist the hosted endpoint before attaching the bearer token.
- Evidence: `sandbox.exec(..., ["sh", "-lc", `curl ... --data '${chatRequest(hosted.model)}' | ...`])` interpolates the model into shell single quotes. Earlier, `host.command("curl", [..., "-H", `Authorization: Bearer ${hosted.apiKey}`, `${hosted.endpointUrl}/models`])` puts the secret and env-derived endpoint into the curl argv path.
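The quoting problem in this finding can be illustrated with a minimal sketch. It assumes a `chatRequest` helper shaped like the one named in the evidence; the body shown here, and the helper names `unsafeShellCommand`, `safeCurlArgs`, and `shellQuote`, are hypothetical and not taken from the PR. It shows that `JSON.stringify` leaves single quotes untouched, and two ways to keep the model value from being interpreted by a shell.

```typescript
// Hypothetical stand-in for the PR's chatRequest helper (real body not shown in the thread).
function chatRequest(model: string): string {
  return JSON.stringify({
    model,
    messages: [{ role: "user", content: "What is 6 multiplied by 7?" }],
  });
}

// Unsafe pattern: the JSON payload is spliced into a single-quoted shell string.
// JSON.stringify does not escape ', so a model value containing ' ends the quoting.
function unsafeShellCommand(model: string): string {
  return `curl -s https://inference.local/v1/chat/completions --data '${chatRequest(model)}'`;
}

// Safer pattern: build an argv array so no shell ever parses the payload.
function safeCurlArgs(model: string): string[] {
  return [
    "curl",
    "-s",
    "https://inference.local/v1/chat/completions",
    "-H",
    "Content-Type: application/json",
    "--data-raw",
    chatRequest(model),
  ];
}

// If a shell string cannot be avoided, POSIX single-quote escaping closes the
// quote, emits an escaped quote, and reopens it: ' becomes '\''.
function shellQuote(value: string): string {
  return `'${value.replace(/'/g, "'\\''")}'`;
}
```

With the argv form, any post-processing that the original pipeline did after `|` would have to move into the test code, which is usually easier to assert on anyway.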
Restore the OpenClaw-mediated inference and SSRF-guard boundary (test/e2e-scenario/live/full-e2e.test.ts:193): The new Vitest test proves raw sandbox curl can reach `https://inference.local`, but it does not preserve the legacy Phase 4c boundary that ran `openclaw agent --agent main --json` inside the sandbox. Raw curl bypasses OpenClaw's HTTP client and SSRF-guard path, so this migration can pass while the OpenClaw-mediated inference route is broken.
- Recommendation: Add a Vitest phase that executes `openclaw agent --agent main --json` in the onboarded sandbox and asserts the parsed model reply contains `42` for the `6 multiplied by 7` prompt, using stdout-only JSON parsing so prompt echo or stderr cannot satisfy the check.
- Evidence: The new test only runs `curl ... https://inference.local/v1/chat/completions ...` via `sandbox.exec`. The legacy `test/e2e/test-full-e2e.sh` Phase 4c ran `openclaw agent --agent main --json --session-id ... -m 'What is 6 multiplied by 7?...'` and parsed the JSON reply for `42`.
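The "stdout-only JSON parsing" recommended above could look roughly like the sketch below. The JSON shape emitted by `openclaw agent --json` is not shown in this thread, so the sketch makes no schema assumption and simply walks every string value in the parsed document. `CommandResult`, `collectStrings`, and `agentReplyContains` are hypothetical names.

```typescript
// Hypothetical result shape for a sandbox command; the real helper may differ.
interface CommandResult {
  exitCode: number;
  stdout: string;
  stderr: string;
}

// Recursively gather every string leaf from a parsed JSON value.
function collectStrings(value: unknown, out: string[] = []): string[] {
  if (typeof value === "string") {
    out.push(value);
  } else if (Array.isArray(value)) {
    value.forEach((item) => collectStrings(item, out));
  } else if (value !== null && typeof value === "object") {
    Object.values(value as Record<string, unknown>).forEach((item) => collectStrings(item, out));
  }
  return out;
}

// True only when the command succeeded, stdout alone is valid JSON, and some
// string value contains the expected token as a whole word. stderr is ignored.
function agentReplyContains(result: CommandResult, expected: string): boolean {
  if (result.exitCode !== 0) return false;
  let parsed: unknown;
  try {
    parsed = JSON.parse(result.stdout);
  } catch {
    return false;
  }
  const escaped = expected.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(`\\b${escaped}\\b`);
  return collectStrings(parsed).some((text) => pattern.test(text));
}
```

Because the prompt asks for `6 multiplied by 7` and never mentions `42`, an echoed prompt inside the JSON would not satisfy the check either.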

🔎 Worth checking

Preserve the NemoClaw OpenClaw plugin registration check (test/e2e-scenario/live/full-e2e.test.ts:151): The migration verifies CLI availability, list/status, inference configuration, and policy output, but it omits the legacy plugin registry/slash-alias/help assertion. That legacy check covered the user-facing `/nemoclaw` OpenClaw command surface after gateway policy initialization.
- Recommendation: Add a post-onboard assertion equivalent to the legacy check: inspect the `nemoclaw` OpenClaw plugin, verify the manifest has `name: nemoclaw` and `kind: runtime-slash`, and ensure `openclaw nemoclaw --help` is not a missing-command failure.
- Evidence: New Phase 3 checks `nemoclaw list`, `nemoclaw status`, `openshell inference get`, and `openshell policy get`. Legacy Phase 3e checked `openclaw plugins inspect nemoclaw`, `/sandbox/.openclaw/extensions/nemoclaw/openclaw.plugin.json`, runtime slash alias, and command help.
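A manifest assertion along the lines recommended above might be sketched as follows. The field names `name` and `kind` come from the recommendation text; everything else about the manifest format is unknown here, and `manifestProblems` is a hypothetical helper that takes the file contents of `openclaw.plugin.json` as a string.

```typescript
// Returns a list of human-readable problems; an empty list means the
// manifest carries the two fields the legacy check looked for.
function manifestProblems(raw: string): string[] {
  let manifest: unknown;
  try {
    manifest = JSON.parse(raw);
  } catch {
    return ["manifest is not valid JSON"];
  }
  if (manifest === null || typeof manifest !== "object" || Array.isArray(manifest)) {
    return ["manifest is not a JSON object"];
  }
  const fields = manifest as Record<string, unknown>;
  const problems: string[] = [];
  if (fields.name !== "nemoclaw") {
    problems.push(`expected name "nemoclaw", got ${JSON.stringify(fields.name)}`);
  }
  if (fields.kind !== "runtime-slash") {
    problems.push(`expected kind "runtime-slash", got ${JSON.stringify(fields.kind)}`);
  }
  return problems;
}
```

Returning the problems as a list, rather than a boolean, lets a test print every mismatch in one failure message.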
Direct hosted inference is weaker than the legacy chat-completion proof (test/e2e-scenario/live/full-e2e.test.ts:179): The direct hosted phase calls `/models` and checks for `data`. That proves authentication/connectivity and model listing, but not that the configured hosted model can complete a chat request as legacy Phase 4a did.
- Recommendation: Replace or augment the `/models` probe with a `/chat/completions` request for the configured hosted model and parse/assert a `PONG`-style model reply. Prefer an existing validated provider request helper so endpoint overrides are host-validated and request bodies avoid ad hoc curl construction.
- Evidence: The new direct phase calls `${hosted.endpointUrl}/models`. Legacy Phase 4a posted to `${HOSTED_INFERENCE_BASE_URL}/chat/completions`, parsed the model reply, and required `PONG`.
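As a rough sketch of the two ideas in this finding, the helpers below validate the endpoint host before a bearer token would be attached, and extract the model reply from a chat-completion response. The response shape (`choices[0].message.content`) assumes an OpenAI-compatible API, which the thread implies but does not confirm. The function names and the example host `inference.example.com` are hypothetical.

```typescript
// Accept only https endpoints whose hostname is explicitly allowlisted.
function isAllowedEndpoint(endpointUrl: string, allowedHosts: readonly string[]): boolean {
  let url: URL;
  try {
    url = new URL(endpointUrl);
  } catch {
    return false;
  }
  return url.protocol === "https:" && allowedHosts.includes(url.hostname);
}

// Pull the assistant text out of an OpenAI-style chat completion body (assumed shape).
function extractChatReply(body: string): string | undefined {
  let parsed: unknown;
  try {
    parsed = JSON.parse(body);
  } catch {
    return undefined;
  }
  if (parsed === null || typeof parsed !== "object") return undefined;
  const choices = (parsed as { choices?: unknown }).choices;
  if (!Array.isArray(choices) || choices.length === 0) return undefined;
  const first = choices[0] as { message?: { content?: unknown } } | null;
  const content = first?.message?.content;
  return typeof content === "string" ? content : undefined;
}
```

A test could then refuse to send the request unless `isAllowedEndpoint` returns true, and assert that `extractChatReply` of the response body contains `PONG`.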

🌱 Nice ideas

None.

Consider writing more tests for

**Runtime validation:** Run `openclaw agent --agent main --json` inside the onboarded sandbox and assert the stdout-only parsed reply contains `42` for `6 multiplied by 7`, so prompt echo or stderr cannot satisfy the check. The changed behavior crosses workflow dispatch, installer execution, Docker/OpenShell sandbox lifecycle, hosted inference credentials, policy configuration, inference routing, OpenClaw client behavior, logging, and cleanup. Static inspection and unit tests cannot prove these live contracts.
**Runtime validation:** Verify OpenClaw plugin registration after onboarding: `openclaw plugins inspect nemoclaw`, manifest `name: nemoclaw`, manifest `kind: runtime-slash`, and `openclaw nemoclaw --help` is not a missing-command failure. The changed behavior crosses workflow dispatch, installer execution, Docker/OpenShell sandbox lifecycle, hosted inference credentials, policy configuration, inference routing, OpenClaw client behavior, logging, and cleanup. Static inspection and unit tests cannot prove these live contracts.
**Runtime validation:** Exercise direct hosted `/chat/completions` for the configured hosted model and assert a parsed `PONG` reply. The changed behavior crosses workflow dispatch, installer execution, Docker/OpenShell sandbox lifecycle, hosted inference credentials, policy configuration, inference routing, OpenClaw client behavior, logging, and cleanup. Static inspection and unit tests cannot prove these live contracts.
**Runtime validation:** Verify hosted model values containing single quotes or shell metacharacters are rejected or passed without executing shell. The changed behavior crosses workflow dispatch, installer execution, Docker/OpenShell sandbox lifecycle, hosted inference credentials, policy configuration, inference routing, OpenClaw client behavior, logging, and cleanup. Static inspection and unit tests cannot prove these live contracts.
**Runtime validation:** Verify the hosted API key is not visible in process argv during direct hosted inference. The changed behavior crosses workflow dispatch, installer execution, Docker/OpenShell sandbox lifecycle, hosted inference credentials, policy configuration, inference routing, OpenClaw client behavior, logging, and cleanup. Static inspection and unit tests cannot prove these live contracts.
**Acceptance clause:** Migrate `test-full-e2e.sh` into the live Vitest E2E system. — add test evidence or identify existing coverage. The PR adds `test/e2e-scenario/live/full-e2e.test.ts`, records `legacySource: "test/e2e/test-full-e2e.sh"`, and wires `full-e2e-vitest`; however, the replacement omits legacy Phase 3e plugin registry/slash-alias/help coverage and legacy Phase 4c OpenClaw-mediated inference.
**Acceptance clause:** The Vitest test runs `install.sh --non-interactive`, verifies installed CLI/OpenShell, list/status, inference configuration, policy, direct hosted inference, sandbox `inference.local`, logs, and cleanup. — add test evidence or identify existing coverage. The test runs `bash install.sh --non-interactive --fresh`, checks `command -v nemoclaw`, `command -v openshell`, list/status, `openshell inference get`, `openshell policy get`, sandbox `inference.local`, `nemoclaw logs`, and registry cleanup. The direct hosted phase uses `/models` rather than legacy chat completion, and list/status/logs are run through the repo `bin/nemoclaw.js` entrypoint rather than the installed `nemoclaw` wrapper.
**Acceptance clause:** Preserves the legacy system boundaries from `test-full-e2e.sh` while leaving legacy shell retirement to Epic: Migrate legacy bash E2E into the Vitest E2E system #5098 Phase 11. — add test evidence or identify existing coverage. The legacy shell file is not removed, but key boundaries are not preserved in the Vitest replacement: legacy Phase 3e plugin registry/slash-alias/help coverage is absent, direct hosted chat completion is downgraded to `/models`, and legacy Phase 4c OpenClaw-mediated inference is replaced by raw curl to `inference.local`.


This is an automated advisory review. A human maintainer must make the final merge decision.

github-actions · 2026-06-16T04:54:32Z

E2E Advisor Recommendation

Required E2E: full-e2e-vitest
Optional E2E: None

Dispatch hint: full-e2e-vitest

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

full-e2e-vitest (high): This PR adds the full-e2e live Vitest scenario and its workflow job; run it to validate the new job wiring and the end-to-end install/onboard/inference/sandbox cleanup flow.

Optional E2E

None.

New E2E recommendations

None.

Dispatch hint

Workflow: .github/workflows/e2e-vitest-scenarios.yaml
jobs input: full-e2e-vitest

github-actions · 2026-06-16T04:54:33Z

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: full-e2e-vitest
Optional Vitest E2E scenarios: None

Dispatch required Vitest E2E scenarios:

gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=full-e2e-vitest

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required Vitest E2E scenarios

full-e2e-vitest: Focused free-standing Vitest job wired for changed live test test/e2e-scenario/live/full-e2e.test.ts.
- Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=full-e2e-vitest

Optional Vitest E2E scenarios

None.

Relevant changed files

.github/workflows/e2e-vitest-scenarios.yaml
test/e2e-scenario/live/full-e2e.test.ts

github-code-quality · 2026-06-16T15:36:54Z

Code Coverage Overview

Languages: TypeScript

TypeScript / code-coverage/plugin

The overall coverage in the e2e-migrate/test-ful... branch is 96%. Coverage data for the main branch is not yet available.

Show a code coverage summary of the most covered files.

File	main	e2e-migrate/test-ful... `f1eb293`	+/-
`nemoclaw/src/se...cret-scanner.ts`	—	100%	—
`nemoclaw/src/commands/slash.ts`	—	100%	—
`nemoclaw/src/li...bprocess-env.ts`	—	100%	—
`nemoclaw/src/bl...eprint/state.ts`	—	98%	—
`nemoclaw/src/onboard/config.ts`	—	98%	—
`nemoclaw/src/bl...int/snapshot.ts`	—	97%	—
`nemoclaw/src/bl...print/runner.ts`	—	95%	—
`nemoclaw/src/co...ration-state.ts`	—	94%	—
`nemoclaw/src/bl...ate-networks.ts`	—	94%	—
`nemoclaw/src/index.ts`	—	94%	—

TypeScript / code-coverage/cli

The overall coverage in the e2e-migrate/test-ful... branch is 46%. Coverage data for the main branch is not yet available.

Show a code coverage summary of the most covered files.

File	main	e2e-migrate/test-ful... `f1eb293`	+/-
`src/lib/state/o...oard-session.ts`	—	90%	—
`src/lib/inference/local.ts`	—	76%	—
`src/lib/sandbox/config.ts`	—	72%	—
`src/lib/actions...dbox/rebuild.ts`	—	67%	—
`src/lib/onboard/preflight.ts`	—	64%	—
`src/lib/actions...licy-channel.ts`	—	56%	—
`src/lib/state/sandbox.ts`	—	55%	—
`src/lib/policy/index.ts`	—	49%	—
`src/lib/onboard...er-gpu-patch.ts`	—	44%	—
`src/lib/onboard.ts`	—	18%	—

Updated June 16, 2026 18:24 UTC. Code Coverage is in Public Preview.

github-actions · 2026-06-16T15:50:31Z

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 27629698074
Workflow ref: e2e-migrate/test-full-e2e
Requested scenarios: (default — all supported)
Requested jobs: full-e2e-vitest
Summary: 1 passed, 1 failed, 35 skipped

Job	Result
bedrock-runtime-compatible-anthropic-vitest	⏭️ skipped
channels-add-remove-vitest	⏭️ skipped
cloud-inference-vitest	⏭️ skipped
common-egress-agent-vitest	⏭️ skipped
credential-migration-vitest	⏭️ skipped
credential-sanitization-vitest	⏭️ skipped
double-onboard-vitest	⏭️ skipped
full-e2e-vitest	❌ failure
gateway-drift-preflight-vitest	⏭️ skipped
gateway-guard-recovery	⏭️ skipped
gateway-health-honest-vitest	⏭️ skipped
generate-matrix	✅ success
hermes-e2e-vitest	⏭️ skipped
hermes-root-entrypoint-smoke-vitest	⏭️ skipped
inference-routing-vitest	⏭️ skipped
issue-2478-crash-loop-recovery-vitest	⏭️ skipped
issue-4434-tui-unreachable-inference-vitest	⏭️ skipped
launchable-smoke-vitest	⏭️ skipped
live-scenarios	⏭️ skipped
messaging-compatible-endpoint-vitest	⏭️ skipped
messaging-providers-vitest	⏭️ skipped
model-router-provider-routed-inference-vitest	⏭️ skipped
network-policy-vitest	⏭️ skipped
onboard-negative-paths-vitest	⏭️ skipped
openclaw-inference-switch-vitest	⏭️ skipped
openclaw-skill-cli-vitest	⏭️ skipped
openclaw-tui-chat-correlation-vitest	⏭️ skipped
openshell-version-pin-vitest	⏭️ skipped
rebuild-openclaw-vitest	⏭️ skipped
runtime-overrides-vitest	⏭️ skipped
sandbox-rebuild-vitest	⏭️ skipped
sandbox-survival-vitest	⏭️ skipped
sessions-agents-cli-vitest	⏭️ skipped
shields-config-vitest	⏭️ skipped
skill-agent-vitest	⏭️ skipped
state-backup-restore-vitest	⏭️ skipped
token-rotation-vitest	⏭️ skipped

Failed jobs: full-e2e-vitest. Check run artifacts for logs.

github-actions · 2026-06-16T15:56:37Z

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 27630400974
Workflow ref: e2e-migrate/test-full-e2e
Requested scenarios: (default — all supported)
Requested jobs: full-e2e-vitest
Summary: 1 passed, 1 failed, 35 skipped

Job	Result
full-e2e-vitest	❌ failure
generate-matrix	✅ success

Failed jobs: full-e2e-vitest. Check run artifacts for logs.

github-actions · 2026-06-16T16:08:35Z

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 27630771256
Workflow ref: e2e-migrate/test-full-e2e
Requested scenarios: (default — all supported)
Requested jobs: full-e2e-vitest
Summary: 1 passed, 1 failed, 35 skipped

Job	Result
full-e2e-vitest	❌ failure
generate-matrix	✅ success

Failed jobs: full-e2e-vitest. Check run artifacts for logs.

github-actions · 2026-06-16T16:23:33Z

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27631598887
Workflow ref: e2e-migrate/test-full-e2e
Requested scenarios: (default — all supported)
Requested jobs: full-e2e-vitest
Summary: 2 passed, 0 failed, 35 skipped

Job	Result
full-e2e-vitest	✅ success
generate-matrix	✅ success

github-actions · 2026-06-16T16:42:22Z

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27632892525
Workflow ref: e2e-migrate/test-full-e2e
Requested scenarios: (default — all supported)
Requested jobs: full-e2e-vitest
Summary: 2 passed, 0 failed, 35 skipped

Job	Result
full-e2e-vitest	✅ success
generate-matrix	✅ success

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/e2e-vitest-scenarios.yaml:
- Around line 1792-1840: The `timeout-minutes: 60` value for the full-e2e-vitest
job is insufficient because the test itself requires up to 50 minutes, and the
job also includes additional setup steps (checkout, dependency installation, CLI
build, and OpenShell installation) that consume additional time. This leaves
inadequate headroom and causes workflow preemption. Increase the
`timeout-minutes` value from 60 to a higher value (such as 90 or 120 minutes) to
provide sufficient buffer for all steps including the test execution.

In `@test/e2e-scenario/live/full-e2e.test.ts`:
- Around line 209-217: The test for the nemoclaw logs command currently only
validates that the combined output (stdout and stderr) is non-empty, but this
allows the test to pass even if the command failed and only stderr contains
error messages. Add an explicit assertion to check that the repoNemoclaw call
succeeded (verify the exit code is 0 or the success status is true) before the
existing expect statement that validates the output length. This ensures the
logs command actually executed successfully, not just that something was written
to stderr.
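
The second finding can be illustrated with a minimal sketch. It assumes the command helper returns an exit code alongside stdout and stderr; the `CommandResult` shape, the `checkLogsResult` function, and the sample values are hypothetical and are not taken from `full-e2e.test.ts` or from the real `repoNemoclaw` helper.

```typescript
// Hypothetical result shape; the real repoNemoclaw helper may expose
// different field names (for example a boolean success flag).
interface CommandResult {
  exitCode: number;
  stdout: string;
  stderr: string;
}

// Returns a list of problems; an empty list means the logs check passes.
function checkLogsResult(result: CommandResult): string[] {
  const problems: string[] = [];

  // Check the exit code first, so a failed command cannot pass
  // just because it wrote an error message to stderr.
  if (result.exitCode !== 0) {
    problems.push(`logs command exited with code ${result.exitCode}`);
  }

  // Then keep the original check: the combined output must be non-empty.
  const combined = `${result.stdout}${result.stderr}`.trim();
  if (combined.length === 0) {
    problems.push("logs command produced no output");
  }

  return problems;
}

// Made-up sample results for illustration only.
const failedWithStderr: CommandResult = {
  exitCode: 1,
  stdout: "",
  stderr: "error: something went wrong",
};
const succeeded: CommandResult = {
  exitCode: 0,
  stdout: "sample log line\n",
  stderr: "",
};

console.log(checkLogsResult(failedWithStderr));
console.log(checkLogsResult(succeeded));
```

With the output-length check alone, `failedWithStderr` would pass because stderr is non-empty. In a Vitest test the same ordering would likely be written as `expect(result.exitCode).toBe(0)` followed by the existing length assertion, again assuming the helper exposes an exit code under that name.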

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: fd663c40-eccd-4dd6-8d6d-46f06c46c012

📥 Commits

Reviewing files that changed from the base of the PR and between 6c0fb04 and 6e78436.

📒 Files selected for processing (2)

.github/workflows/e2e-vitest-scenarios.yaml
test/e2e-scenario/live/full-e2e.test.ts

github-actions · 2026-06-16T17:54:54Z

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27636902315
Workflow ref: e2e-migrate/test-full-e2e
Requested scenarios: (default — all supported)
Requested jobs: full-e2e-vitest
Summary: 2 passed, 0 failed, 35 skipped

Job	Result
bedrock-runtime-compatible-anthropic-vitest	⏭️ skipped
channels-add-remove-vitest	⏭️ skipped
cloud-inference-vitest	⏭️ skipped
common-egress-agent-vitest	⏭️ skipped
credential-migration-vitest	⏭️ skipped
credential-sanitization-vitest	⏭️ skipped
double-onboard-vitest	⏭️ skipped
full-e2e-vitest	✅ success
gateway-drift-preflight-vitest	⏭️ skipped
gateway-guard-recovery	⏭️ skipped
gateway-health-honest-vitest	⏭️ skipped
generate-matrix	✅ success
hermes-e2e-vitest	⏭️ skipped
hermes-root-entrypoint-smoke-vitest	⏭️ skipped
inference-routing-vitest	⏭️ skipped
issue-2478-crash-loop-recovery-vitest	⏭️ skipped
issue-4434-tui-unreachable-inference-vitest	⏭️ skipped
launchable-smoke-vitest	⏭️ skipped
live-scenarios	⏭️ skipped
messaging-compatible-endpoint-vitest	⏭️ skipped
messaging-providers-vitest	⏭️ skipped
model-router-provider-routed-inference-vitest	⏭️ skipped
network-policy-vitest	⏭️ skipped
onboard-negative-paths-vitest	⏭️ skipped
openclaw-inference-switch-vitest	⏭️ skipped
openclaw-skill-cli-vitest	⏭️ skipped
openclaw-tui-chat-correlation-vitest	⏭️ skipped
openshell-version-pin-vitest	⏭️ skipped
rebuild-openclaw-vitest	⏭️ skipped
runtime-overrides-vitest	⏭️ skipped
sandbox-rebuild-vitest	⏭️ skipped
sandbox-survival-vitest	⏭️ skipped
sessions-agents-cli-vitest	⏭️ skipped
shields-config-vitest	⏭️ skipped
skill-agent-vitest	⏭️ skipped
state-backup-restore-vitest	⏭️ skipped
token-rotation-vitest	⏭️ skipped

github-actions · 2026-06-16T18:31:01Z

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27638939083
Workflow ref: e2e-migrate/test-full-e2e
Requested scenarios: (default — all supported)
Requested jobs: full-e2e-vitest
Summary: 2 passed, 0 failed, 36 skipped

Job	Result
bedrock-runtime-compatible-anthropic-vitest	⏭️ skipped
channels-add-remove-vitest	⏭️ skipped
cloud-inference-vitest	⏭️ skipped
common-egress-agent-vitest	⏭️ skipped
credential-migration-vitest	⏭️ skipped
credential-sanitization-vitest	⏭️ skipped
double-onboard-vitest	⏭️ skipped
full-e2e-vitest	✅ success
gateway-drift-preflight-vitest	⏭️ skipped
gateway-guard-recovery	⏭️ skipped
gateway-health-honest-vitest	⏭️ skipped
generate-matrix	✅ success
hermes-e2e-vitest	⏭️ skipped
hermes-root-entrypoint-smoke-vitest	⏭️ skipped
inference-routing-vitest	⏭️ skipped
issue-2478-crash-loop-recovery-vitest	⏭️ skipped
issue-4434-tui-unreachable-inference-vitest	⏭️ skipped
launchable-smoke-vitest	⏭️ skipped
live-scenarios	⏭️ skipped
messaging-compatible-endpoint-vitest	⏭️ skipped
messaging-providers-vitest	⏭️ skipped
model-router-provider-routed-inference-vitest	⏭️ skipped
network-policy-vitest	⏭️ skipped
onboard-negative-paths-vitest	⏭️ skipped
onboard-resume-vitest	⏭️ skipped
openclaw-inference-switch-vitest	⏭️ skipped
openclaw-skill-cli-vitest	⏭️ skipped
openclaw-tui-chat-correlation-vitest	⏭️ skipped
openshell-version-pin-vitest	⏭️ skipped
rebuild-openclaw-vitest	⏭️ skipped
runtime-overrides-vitest	⏭️ skipped
sandbox-rebuild-vitest	⏭️ skipped
sandbox-survival-vitest	⏭️ skipped
sessions-agents-cli-vitest	⏭️ skipped
shields-config-vitest	⏭️ skipped
skill-agent-vitest	⏭️ skipped
state-backup-restore-vitest	⏭️ skipped
token-rotation-vitest	⏭️ skipped

test(e2e): migrate full e2e journey to Vitest

7070216

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

cv self-assigned this Jun 16, 2026

test(e2e): add full journey Vitest migration

f3ffeec

ci(e2e): use NVIDIA_API_KEY for full e2e dispatch

ca51c04

test(e2e): make full direct inference probe deterministic

6aa9270

cv added 2 commits June 16, 2026 09:12

test(e2e): avoid heredoc in full sandbox inference

7006f0a

test(e2e): fix full inference parser quoting

60bdfb8

test(e2e): avoid prompt echo in full sandbox inference

6e78436

cv marked this pull request as ready for review June 16, 2026 17:03

coderabbitai Bot reviewed Jun 16, 2026

Comment thread .github/workflows/e2e-vitest-scenarios.yaml Outdated

Comment thread test/e2e-scenario/live/full-e2e.test.ts

cv added 2 commits June 16, 2026 10:47

test(e2e): address full e2e review feedback

179427f

test(e2e): assert full logs command succeeds

2a9f341

Merge remote-tracking branch 'origin/main' into e2e-migrate/test-full…

f1eb293

…-e2e # Conflicts: # .github/workflows/e2e-vitest-scenarios.yaml

cv merged commit eb665ad into main Jun 16, 2026
83 checks passed

cv deleted the e2e-migrate/test-full-e2e branch June 16, 2026 20:57

coderabbitai Bot mentioned this pull request Jun 16, 2026

test(e2e): migrate GPU double onboard to Vitest #5495

Open

13 tasks

Conversation

cv commented Jun 16, 2026 • edited

Summary

Related Issue

Changes

Type of Change

Verification

Summary by CodeRabbit

copy-pr-bot Bot commented Jun 16, 2026

coderabbitai Bot commented Jun 16, 2026 • edited

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Possibly related PRs

Suggested labels

❌ Failed checks (1 warning)

github-actions Bot commented Jun 16, 2026 • edited

PR Review Advisor

🛠️ Needs attention

🔎 Worth checking

🌱 Nice ideas

github-actions Bot commented Jun 16, 2026 • edited

E2E Advisor Recommendation

E2E Recommendation Advisor

Required E2E

Optional E2E

New E2E recommendations

Dispatch hint

github-actions Bot commented Jun 16, 2026 • edited

Vitest E2E Scenario Recommendation

Vitest E2E Scenario Advisor

Required Vitest E2E scenarios

Optional Vitest E2E scenarios

Relevant changed files

github-code-quality Bot commented Jun 16, 2026 • edited

Code Coverage Overview

TypeScript / code-coverage/plugin

TypeScript / code-coverage/cli

github-actions Bot commented Jun 16, 2026

Vitest E2E Scenario Results — ❌ Some jobs failed

github-actions Bot commented Jun 16, 2026

Vitest E2E Scenario Results — ❌ Some jobs failed

github-actions Bot commented Jun 16, 2026

Vitest E2E Scenario Results — ❌ Some jobs failed

github-actions Bot commented Jun 16, 2026

Vitest E2E Scenario Results — ✅ All jobs passed

github-actions Bot commented Jun 16, 2026

Vitest E2E Scenario Results — ✅ All jobs passed

coderabbitai Bot left a comment

github-actions Bot commented Jun 16, 2026

Vitest E2E Scenario Results — ✅ All jobs passed

github-actions Bot commented Jun 16, 2026

Vitest E2E Scenario Results — ✅ All jobs passed
