test(e2e): migrate GPU double onboard to Vitest by cv · Pull Request #5495 · NVIDIA/NemoClaw

cv · 2026-06-16T04:54:04Z

Summary

Migrate test-gpu-double-onboard.sh into the live Vitest E2E system. Adds test/e2e-scenario/live/gpu-double-onboard.test.ts and gpu-double-onboard-vitest workflow wiring on the GPU runner. The Vitest test verifies GPU/Docker prerequisites, Ollama install, first onboard, persisted auth-proxy token, re-onboard, token consistency, auth rejection, sandbox inference, and cleanup.

Related Issue

Refs #5098

Changes

Adds or wires the free-standing live Vitest scenario gpu-double-onboard.
Adds selective workflow dispatch via gpu-double-onboard-vitest in .github/workflows/e2e-vitest-scenarios.yaml.
Preserves the legacy system boundaries from test-gpu-double-onboard.sh and leaves retirement of the legacy shell script to Phase 11 of epic #5098 (Migrate legacy bash E2E into the Vitest E2E system).

Type of Change

Code change (feature, bug fix, or refactor)
Code change with doc updates
Doc only (prose changes, no code sample modifications)
Doc only (includes code sample changes)

Verification

Git hooks passed during commit and push, or npx prek run --from-ref main --to-ref HEAD passes
Targeted tests pass for changed behavior
Full npm test passes (broad runtime changes only)
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
Docs updated for user-facing behavior changes
npm run docs builds without warnings (doc changes only)
Doc pages follow the style guide (doc changes only)
New doc pages include SPDX header and frontmatter (new pages only)

Targeted local checks run while preparing these branches:

npx vitest run --project e2e-vitest-support test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts --silent=false --reporter=default
npm run typecheck:cli for branches adding new TypeScript tests
git diff --check

Selective same-runner dispatch: https://github.com/NVIDIA/NemoClaw/actions/runs/27649245791 — passed after merge-from-main refresh

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

Summary by CodeRabbit

Tests
- Added a new live GPU end-to-end Vitest scenario, replacing the prior shell-based GPU flow.
- Verifies Docker/NVIDIA availability, runs onboarding with an Ollama-backed proxy, and checks the persisted proxy token (including expected permissions).
- Confirms correct proxy authentication (authorized vs unauthorized) and performs in-sandbox inference with expected output.
- Re-runs onboarding to ensure the token remains unchanged, then cleans up and removes the sandbox from the registry.
Chores
- Added a dedicated GPU CI job and included it in PR status/reporting, with updated artifact uploads.


copy-pr-bot · 2026-06-16T04:54:08Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai · 2026-06-16T04:54:11Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Walkthrough

Adds a new live Vitest E2E test file (gpu-double-onboard.test.ts) covering the GPU double-onboard scenario with Ollama provider, proxy token persistence, HTTP endpoint authentication, and in-sandbox inference validation. A corresponding CI workflow job (gpu-double-onboard-vitest) is added with a GPU runner and Ollama configuration, wired into the report-to-pr aggregation.

Changes

GPU Double-Onboard E2E Test and CI Job

Layer / File(s)	Summary
Test configuration, env helpers, and shell utilities `test/e2e-scenario/live/gpu-double-onboard.test.ts`	Defines repo/CLI paths, sandbox/proxy env config, live timeout gating, `env()` helper for merging availability-probe environment, and shell command utilities: `resultText` formatter, `nemoclaw` CLI invocation wrapper, `cleanup` (destroy/delete/kill processes), `httpStatus` (curl-based status extraction with optional bearer token), `fileMode` (file permission checker), `chatRequest` (payload builder), and `parseReplyCommand` (JSON reply extraction).
Main live E2E scenario body `test/e2e-scenario/live/gpu-double-onboard.test.ts`	Implements the full scenario: scenario metadata JSON creation, Docker/NVIDIA GPU prerequisite checks, Ollama install and service cleanup, first onboard with sandbox listing validation, proxy token file creation/persistence with mode 600 verification, HTTP 200/401 endpoint assertions (`/v1/models` with valid/invalid/missing bearer token, `/api/tags` shape validation), first in-sandbox chat completion inference asserting `"42"`, re-onboard with sandbox recreation and token persistence validation, proxy and inference re-validation, final cleanup, sandbox registry removal confirmation, and scenario-result.json artifact write.
CI workflow job and report-to-pr integration `.github/workflows/e2e-vitest-scenarios.yaml`	Adds `gpu-double-onboard-vitest` free-standing job with GPU runner selection, `NEMOCLAW_PROVIDER=ollama` and proxy port env vars, Node/dependency/CLI/OpenShell installation, Vitest test execution with `openshell` on PATH, and artifact upload to `e2e-artifacts/vitest/gpu-double-onboard/`; updates `report-to-pr` job `needs` to include the new job for PR comment aggregation.
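
The `fileMode` helper and the mode 600 check summarized in the table could look roughly like the sketch below. This is an illustration under assumptions, not the PR's code: `permissionBits` is a hypothetical name and the `fileMode` signature is guessed. Only `statSync` from Node's `fs` module is a real API here.

```typescript
import { statSync } from "node:fs";

// Hypothetical helper: keep only the permission bits of a stat mode
// and render them as a three-digit octal string such as "600".
export function permissionBits(mode: number): string {
  return (mode & 0o777).toString(8).padStart(3, "0");
}

// Assumed shape of a file-permission checker; the PR's own fileMode may differ.
export function fileMode(path: string): string {
  return permissionBits(statSync(path).mode);
}
```

A test would then compare `fileMode(tokenPath)` against `"600"`, where `tokenPath` points at `~/.nemoclaw/ollama-proxy-token`.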

Sequence Diagram(s)

sequenceDiagram
  participant Test as gpu-double-onboard.test.ts
  participant CLI as nemoclaw CLI
  participant Proxy as Ollama Auth Proxy
  participant Sandbox as Docker/GPU Sandbox

  Test->>Sandbox: docker info / nvidia-smi (check prerequisites)
  Test->>CLI: install (Ollama service)
  Test->>CLI: onboard (first time, non-interactive)
  CLI->>Proxy: create and persist auth token
  Proxy-->>Test: token file created with mode 600
  Test->>Proxy: GET /v1/models with valid bearer token → 200
  Test->>Proxy: GET /api/tags (any token) → numeric status
  Test->>Proxy: GET /v1/models unauthenticated → 401
  Test->>Proxy: GET /v1/models with wrong token → 401
  Test->>Sandbox: curl inference.local chat completion (HTTPS)
  Sandbox-->>Test: JSON response containing "42"
  Test->>CLI: onboard again (sandbox recreation, same token expected)
  Proxy-->>Test: token persisted with same content
  Test->>Sandbox: curl inference.local chat completion (second time)
  Sandbox-->>Test: JSON response containing "42"
  Test->>CLI: cleanup (destroy sandbox, delete gateway, kill processes)
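
The two inference steps in the diagram send a chat completion request and look for "42" in the JSON reply. A minimal sketch of that pair follows, assuming an OpenAI-compatible response shape. The prompt text, the `stream` field, and the name `replyText` are assumptions for illustration; the PR's `chatRequest` and `parseReplyCommand` helpers may be built differently.

```typescript
// Assumed OpenAI-compatible reply shape; only the fields read below are typed.
interface ChatReply {
  choices?: Array<{ message?: { content?: string } }>;
}

// Hypothetical prompt: the source only states that the reply must contain "42".
export function chatRequest(model: string): string {
  return JSON.stringify({
    model,
    messages: [{ role: "user", content: "What is 6 times 7? Reply with the number only." }],
    stream: false,
  });
}

// Pulls the assistant text out of a JSON reply; returns "" when the body
// is not JSON or does not have the expected shape.
export function replyText(body: string): string {
  try {
    const parsed = JSON.parse(body) as ChatReply;
    return parsed.choices?.[0]?.message?.content ?? "";
  } catch {
    return "";
  }
}
```

Returning an empty string on malformed bodies keeps the assertion simple: the test only has to check that the extracted text contains "42".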

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

NVIDIA/NemoClaw#5243: Both PRs modify .github/workflows/e2e-vitest-scenarios.yaml by adding/wiring free-standing Vitest jobs and updating report-to-pr's needs dependencies; the GPU double-onboard job's reporting integration is directly affected by the selector/validation workflow plumbing.
NVIDIA/NemoClaw#5218: Both PRs add/wire a free-standing double-onboard Vitest E2E job and corresponding live test file into .github/workflows/e2e-vitest-scenarios.yaml and report-to-pr dependencies, covering the same scenario (base vs GPU variant).
NVIDIA/NemoClaw#5493: Both PRs update the same e2e-vitest-scenarios.yaml workflow by adding a new live Vitest job and wiring it into report-to-pr.needs, with both introducing live Vitest E2E tests that perform in-sandbox inference validation and cleanup steps.

Suggested labels

area: e2e

Poem

🐇 Hop hop, the GPU awakes,
A double onboard the test harness makes!
The Ollama proxy guards the door,
Returns a 200—then 401 once more.
"42" emerges from inference deep,
This rabbit's CI never misses a beep! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main change—migrating a GPU double onboard test from shell script to Vitest. It is specific, concise, and clearly indicates the primary objective of the changeset.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

github-actions · 2026-06-16T04:54:43Z

E2E Advisor Recommendation

Required E2E: None
Optional E2E: gpu-double-onboard-vitest

Dispatch hint: gpu-double-onboard-vitest

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

None.

Optional E2E

gpu-double-onboard-vitest (high): Useful to validate the newly added workflow job and live scenario wiring on the intended GPU runner, including Ollama provider onboarding and re-onboarding behavior. Not merge-blocking because the PR changes only E2E test/workflow files, not runtime user-flow code.

New E2E recommendations

None.

Dispatch hint

Workflow: .github/workflows/e2e-vitest-scenarios.yaml
jobs input: gpu-double-onboard-vitest

github-actions · 2026-06-16T04:54:44Z

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: gpu-double-onboard-vitest
Optional Vitest E2E scenarios: None

Dispatch required Vitest E2E scenarios:

gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=gpu-double-onboard-vitest

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required Vitest E2E scenarios

gpu-double-onboard-vitest: Focused free-standing Vitest job wired for changed live test test/e2e-scenario/live/gpu-double-onboard.test.ts.
- Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=gpu-double-onboard-vitest

Optional Vitest E2E scenarios

None.

Relevant changed files

.github/workflows/e2e-vitest-scenarios.yaml
test/e2e-scenario/live/gpu-double-onboard.test.ts

github-actions · 2026-06-16T04:56:20Z

PR Review Advisor

Findings: 1 needs attention, 7 worth checking, 0 nice ideas
Since last review: 0 prior items resolved, 6 still apply, 1 new item found

Review findings

🛠️ Needs attention

Generated Ollama proxy token can leak into ShellProbe artifacts (test/e2e-scenario/live/gpu-double-onboard.test.ts:95): The test reads ~/.nemoclaw/ollama-proxy-token, interpolates that generated bearer token into a bash -lc curl command, and uploads ShellProbe artifacts for the command. ShellProbe redacts environment-derived secrets or explicit redactionValues, but this call provides neither for the generated token, so the bearer can be persisted in .result.json command metadata and exposed in process argv.
- Recommendation: Avoid placing the token in a shell command string. Pass it via a dedicated environment variable, stdin, or a file descriptor and pass redactionValues: [token] to ShellProbe for every token-auth probe; ideally avoid bash -lc for token-auth curl probes.
- Evidence: httpStatus() builds `const header = token ? `-H 'Authorization: Bearer ${token}'` : ""` and passes a single `bash -lc` command. `ShellProbe.run()` writes the redacted command array to `${artifactBase}.result.json`, but `httpStatus()` does not pass `redactionValues`.
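
One way to follow this recommendation is to run curl without a shell and hand the header to it on stdin through `--config -`, so the token never appears in argv. The sketch below uses Node's `spawnSync` directly and does not show ShellProbe integration; `bearerConfig`, `statusArgs`, and `httpStatusNoShell` are hypothetical names, and passing `redactionValues: [token]` to ShellProbe would still be needed for any probe that goes through it.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical helper: builds a curl config snippet carrying the bearer header.
// curl reads it from stdin via `--config -`, so the token stays out of argv.
export function bearerConfig(token: string): string {
  if (/[\r\n]/.test(token)) throw new Error("token must not contain line breaks");
  // Escape backslashes and double quotes for a quoted curl config value.
  const escaped = token.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
  return `header = "Authorization: Bearer ${escaped}"\n`;
}

// Argument vector for a status-only probe; it contains the URL but no secret.
export function statusArgs(url: string): string[] {
  return ["--config", "-", "-s", "-o", "/dev/null", "-w", "%{http_code}", "--max-time", "10", url];
}

// Runs curl without a shell and returns the HTTP status as text.
export function httpStatusNoShell(url: string, token?: string): string {
  const result = spawnSync("curl", statusArgs(url), {
    input: token ? bearerConfig(token) : "",
    encoding: "utf8",
  });
  return (result.stdout ?? "").trim();
}
```

Because the command array is the same for every token, whatever records the command metadata has nothing secret to redact.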

🔎 Worth checking

Source-of-truth review needed: test/e2e-scenario/live/gpu-double-onboard.test.ts cleanup helper: The advisor marked localized patch analysis as needs_followup.
- Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
- Evidence: `cleanup()` catches and suppresses failures for NemoClaw destroy, OpenShell sandbox delete, OpenShell gateway destroy, and process cleanup; final cleanup only checks that the NemoClaw registry file text does not contain the sandbox name.
Env-derived port and model values are interpolated into shell strings (test/e2e-scenario/live/gpu-double-onboard.test.ts:95): PROXY_PORT comes from NEMOCLAW_OLLAMA_PROXY_PORT without numeric TCP-port validation and is embedded in host bash -lc curl commands. NEMOCLAW_MODEL is JSON-stringified and then embedded inside a single-quoted sandbox sh -lc payload. Workflow defaults are safe, but custom or local runs can supply quotes or shell metacharacters that break quoting at a host or sandbox command boundary.
- Recommendation: Validate NEMOCLAW_OLLAMA_PROXY_PORT as a numeric TCP port before use. Build curl invocations without shell interpolation where possible, or use a shell-safe quoting helper for every dynamic value including the model JSON.
- Evidence: `const PROXY_PORT = process.env.NEMOCLAW_OLLAMA_PROXY_PORT ?? "11435"`; `httpStatus()` embeds `${PROXY_PORT}` through a URL inside `bash -lc`; `expectSandboxInference42()` embeds `--data '${chatRequest(model)}'` where `model` can come from `process.env.NEMOCLAW_MODEL`.
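
A sketch of the validation this finding asks for is shown below, assuming the test keeps `"11435"` as its default. `parseTcpPort` and `shellQuote` are hypothetical helpers, not functions from the PR.

```typescript
// Hypothetical validator: accepts only a decimal TCP port in 1..65535.
export function parseTcpPort(raw: string | undefined, fallback = 11435): number {
  if (raw === undefined) return fallback;
  if (!/^[0-9]{1,5}$/.test(raw)) {
    throw new Error(`invalid TCP port: ${JSON.stringify(raw)}`);
  }
  const port = Number(raw);
  if (port < 1 || port > 65535) {
    throw new Error(`TCP port out of range: ${port}`);
  }
  return port;
}

// Hypothetical POSIX single-quote helper for values that must cross a shell
// boundary: wrap in single quotes and rewrite each embedded quote as '\''.
export function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}
```

With these, `PROXY_PORT` could be computed as `parseTcpPort(process.env.NEMOCLAW_OLLAMA_PROXY_PORT)` before any command is built, and the model payload could be passed as `shellQuote(chatRequest(model))` instead of being wrapped in literal single quotes.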
Live GPU scenario executes an unpinned remote Ollama installer (test/e2e-scenario/live/gpu-double-onboard.test.ts:194): The new live scenario executes mutable third-party code from https://ollama.com/install.sh on a high-trust GPU runner that also manages sandbox lifecycle and an auth-proxy boundary. The installer is not pinned to a version, checksum, or signature in this PR.
- Recommendation: Pin the Ollama package/version or verify the downloaded installer/package with a maintained checksum or signature before execution. If this must remain a live installer boundary, document the trust assumption and keep credentials isolated from the job.
- Evidence: The install step runs `command -v ollama >/dev/null 2>&1 || curl -fsSL https://ollama.com/install.sh | sh`; the workflow job sets `NEMOCLAW_ACCEPT_THIRD_PARTY_SOFTWARE=1`.
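
If the installer stays, a checksum gate before execution is one option. The sketch below only shows the comparison step using Node's `crypto` module. The expected digest is supplied by the caller and would have to be maintained by the project; no real Ollama checksum is implied, and `assertPinnedInstaller` is a hypothetical name.

```typescript
import { createHash } from "node:crypto";

// SHA-256 of the given bytes or text, as lowercase hex.
export function sha256Hex(data: string | Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

// Hypothetical gate: refuse to run a downloaded installer unless its digest
// matches a pinned value kept in the repository.
export function assertPinnedInstaller(script: string | Uint8Array, expectedSha256: string): void {
  const actual = sha256Hex(script);
  if (actual !== expectedSha256.trim().toLowerCase()) {
    throw new Error(`installer checksum mismatch: got ${actual}`);
  }
}
```

The download would be saved to a file first, verified, and only then executed, rather than piped straight from curl into `sh`.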
Final cleanup can pass while sandbox, gateway, or proxy state remains (test/e2e-scenario/live/gpu-double-onboard.test.ts:61): The same best-effort cleanup helper is used for pre-run cleanup and final cleanup. It suppresses failures from NemoClaw destroy, OpenShell sandbox delete, OpenShell gateway destroy, and Ollama/proxy process cleanup, then the final assertion only checks the NemoClaw registry file. This only partially supports the PR's cleanup acceptance claim and can write scenario-result.json as passed while OpenShell or proxy state remains.
- Recommendation: Keep best-effort cleanup for pre-run state, but use a strict final cleanup path that fails on unexpected destroy/delete errors and verifies the sandbox is absent from NemoClaw registry and OpenShell listings, the nemoclaw gateway is removed, and no ollama-auth-proxy remains for the selected port.
- Evidence: `cleanup()` catches and ignores errors at lines 62, 71, 78, and 92. The final path calls `await cleanup(host, sandbox)`, checks only `~/.nemoclaw/sandboxes.json`, and then writes `scenario-result.json` with status `passed`.
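
A strict final path can still run every step, but it should report failures instead of swallowing them. The sketch below is one possible shape, with the actual destroy, delete, and verification calls injected as steps; `CleanupStep` and `strictCleanup` are hypothetical names and nothing here reflects the PR's `cleanup()` signature.

```typescript
// Hypothetical step type: a label plus an action that rejects on failure.
export interface CleanupStep {
  name: string;
  run: () => Promise<void>;
}

// Runs every step even if an earlier one fails, then throws once with all
// collected failures so the scenario cannot be recorded as passed.
export async function strictCleanup(steps: CleanupStep[]): Promise<void> {
  const failures: string[] = [];
  for (const step of steps) {
    try {
      await step.run();
    } catch (error) {
      const detail = error instanceof Error ? error.message : String(error);
      failures.push(`${step.name}: ${detail}`);
    }
  }
  if (failures.length > 0) {
    throw new Error(`final cleanup left state behind:\n${failures.join("\n")}`);
  }
}
```

The pre-run call can keep the existing best-effort helper, while the final call passes steps that also verify the sandbox, gateway, and auth proxy are gone.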
Inference model selection drifts from the legacy test (test/e2e-scenario/live/gpu-double-onboard.test.ts:231): When NEMOCLAW_MODEL is unset, the Vitest migration hardcodes llama3.2:1b. The legacy shell test instead discovered the model actually available after onboard from Ollama. The new workflow does not set NEMOCLAW_MODEL, so this test can fail because it asks for a model that onboard did not install or select, instead of validating the double-onboard token regression.
- Recommendation: After first onboard, discover the configured or installed model from the same source as the product, such as Ollama /api/tags or NemoClaw/OpenShell inference state, and use that resolved model for both first-onboard and post-re-onboard sandbox inference assertions.
- Evidence: The new test uses `const model = process.env.NEMOCLAW_MODEL ?? "llama3.2:1b"`. The legacy `test/e2e/test-gpu-double-onboard.sh` used NEMOCLAW_MODEL when set, otherwise queried `http://127.0.0.1:11434/api/tags` for the first installed model.
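
The legacy fallback described in this finding can be sketched as a pure function over the body of Ollama's `GET /api/tags` response, which lists installed models under `models[].name`. `firstInstalledModel` and `resolveModel` are hypothetical names; fetching the body from `http://127.0.0.1:11434/api/tags` is left to the caller.

```typescript
// Subset of the Ollama /api/tags response that this sketch reads.
interface OllamaTags {
  models?: Array<{ name?: string }>;
}

// First installed model name, or undefined when none is listed.
export function firstInstalledModel(tagsBody: string): string | undefined {
  const parsed = JSON.parse(tagsBody) as OllamaTags;
  const found = (parsed.models ?? []).find(
    (entry) => typeof entry.name === "string" && entry.name.length > 0,
  );
  return found?.name;
}

// Prefer NEMOCLAW_MODEL when set, otherwise fall back to what Ollama reports.
export function resolveModel(envModel: string | undefined, tagsBody: string): string {
  const model = envModel && envModel.length > 0 ? envModel : firstInstalledModel(tagsBody);
  if (!model) throw new Error("no model configured and none installed in Ollama");
  return model;
}
```

Resolving the model once after the first onboard and reusing it after re-onboard keeps both inference assertions aligned with what onboard actually installed.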
Token stability assertion may be stricter than the legacy regression contract (test/e2e-scenario/live/gpu-double-onboard.test.ts:254): The new test requires the token after re-onboard to equal the first-onboard token. The legacy regression check primarily verified that the token persisted after re-onboard is accepted by the running proxy. If the product ever intentionally rotates the token during re-onboard while keeping disk and proxy state consistent, this migration would fail for behavior that still satisfies the original token-divergence contract.
- Recommendation: Confirm the product contract. If token stability is not required, remove the equality assertion and assert instead that the post-re-onboard persisted token is accepted by the running proxy and wrong/unauthenticated requests are rejected.
- Evidence: The new test asserts `expect(tokenAfterSecond).toBe(tokenAfterFirst)` before probing the proxy. The legacy script's core check reads `TOKEN_AFTER_SECOND` after re-onboard and verifies the running proxy accepts that persisted token.
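
If the contract turns out to be acceptance rather than stability, the assertion could be phrased against the proxy's behavior only. The sketch below takes an injected probe that returns an HTTP status for an optional bearer token; `StatusProbe` and `checkAuthContract` are hypothetical names and the wrong-token value is arbitrary.

```typescript
// Hypothetical probe: resolves to the HTTP status for an optional bearer token.
export type StatusProbe = (token?: string) => Promise<number>;

// Returns a list of contract violations; an empty list means the persisted
// token is accepted and both missing and wrong tokens are rejected.
export async function checkAuthContract(probe: StatusProbe, persistedToken: string): Promise<string[]> {
  const problems: string[] = [];
  const accepted = await probe(persistedToken);
  if (accepted !== 200) problems.push(`persisted token returned ${accepted}, expected 200`);
  const anonymous = await probe();
  if (anonymous !== 401) problems.push(`missing token returned ${anonymous}, expected 401`);
  const wrong = await probe(`${persistedToken}-wrong`);
  if (wrong !== 401) problems.push(`wrong token returned ${wrong}, expected 401`);
  return problems;
}
```

Running this after each onboard, with the token re-read from disk each time, checks the disk and proxy stay consistent whether or not the token rotates.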
Security-sensitive runtime edges need explicit negative assertions (test/e2e-scenario/live/gpu-double-onboard.test.ts:95): The scenario covers important positive behavior and proxy auth rejection, but it does not assert that the generated proxy token is absent from ShellProbe artifacts, that unsafe env-derived shell inputs are rejected or safely quoted, or that final cleanup failures fail the scenario.
- Recommendation: Add behavior-specific runtime assertions for artifact redaction, invalid proxy port/model shell metacharacters, and strict final cleanup. These should exercise the real host/sandbox boundary rather than mocks.
- Evidence: The test uploads ShellProbe artifacts for token-auth curl probes, uses env-derived PROXY_PORT and model values in shell commands, and suppresses cleanup failures while only reading the NemoClaw registry file at the end.
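
For the artifact redaction assertion, one approach is to scan the uploaded artifact directory for the token text after the probes have run. The sketch below separates the pure check from the directory walk; `entriesContaining` and `readArtifactTree` are hypothetical names, and the artifact directory path is whatever the scenario already writes to.

```typescript
import { readdirSync, readFileSync } from "node:fs";
import { join } from "node:path";

// Pure check: names of entries whose text contains the secret.
export function entriesContaining(entries: Map<string, string>, secret: string): string[] {
  if (secret.length === 0) return [];
  return Array.from(entries.entries())
    .filter(([, text]) => text.includes(secret))
    .map(([name]) => name);
}

// Reads every regular file under dir, recursively, as UTF-8 text.
export function readArtifactTree(dir: string): Map<string, string> {
  const entries = new Map<string, string>();
  for (const item of readdirSync(dir, { withFileTypes: true })) {
    const path = join(dir, item.name);
    if (item.isDirectory()) {
      readArtifactTree(path).forEach((text, name) => entries.set(name, text));
    } else if (item.isFile()) {
      entries.set(path, readFileSync(path, "utf8"));
    }
  }
  return entries;
}
```

The scenario would then expect `entriesContaining(readArtifactTree(artifactDir), token)` to be an empty array before writing its result file.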

🌱 Nice ideas

None.

Consider writing more tests for

**Runtime validation**: GPU double-onboard with no `NEMOCLAW_MODEL` discovers the model selected or pulled by onboard and uses it for both first-onboard and post-re-onboard sandbox inference. This change exercises live GPU, Docker, mutable third-party installer execution, OpenShell sandbox lifecycle, Ollama auth-proxy token persistence, and in-sandbox inference. Direct runtime validation is appropriate; mocks would not validate these security-sensitive system boundaries.
**Runtime validation**: `~/.nemoclaw/ollama-proxy-token` never appears in ShellProbe `.result.json`, command metadata, stdout, stderr, or uploaded artifact contents for token-auth curl probes. This change exercises live GPU, Docker, mutable third-party installer execution, OpenShell sandbox lifecycle, Ollama auth-proxy token persistence, and in-sandbox inference. Direct runtime validation is appropriate; mocks would not validate these security-sensitive system boundaries.
**Runtime validation**: A custom `NEMOCLAW_OLLAMA_PROXY_PORT` containing non-numeric characters or shell metacharacters is rejected before any `bash -lc` command is built. This change exercises live GPU, Docker, mutable third-party installer execution, OpenShell sandbox lifecycle, Ollama auth-proxy token persistence, and in-sandbox inference. Direct runtime validation is appropriate; mocks would not validate these security-sensitive system boundaries.
**Runtime validation**: A custom `NEMOCLAW_MODEL` containing quotes or shell metacharacters cannot break the sandbox inference command, or the command is built without shell interpolation. This change exercises live GPU, Docker, mutable third-party installer execution, OpenShell sandbox lifecycle, Ollama auth-proxy token persistence, and in-sandbox inference. Direct runtime validation is appropriate; mocks would not validate these security-sensitive system boundaries.
**Runtime validation**: Final cleanup fails the scenario when NemoClaw destroy, OpenShell sandbox delete, gateway destroy, or auth-proxy cleanup leaves runtime state behind. This change exercises live GPU, Docker, mutable third-party installer execution, OpenShell sandbox lifecycle, Ollama auth-proxy token persistence, and in-sandbox inference. Direct runtime validation is appropriate; mocks would not validate these security-sensitive system boundaries.
**Security-sensitive runtime edges need explicit negative assertions**: Add behavior-specific runtime assertions for artifact redaction, invalid proxy port/model shell metacharacters, and strict final cleanup. These should exercise the real host/sandbox boundary rather than mocks.
**Acceptance clause:** Migrate `test-gpu-double-onboard.sh` into the live Vitest E2E system. Add test evidence or identify existing coverage. The PR adds `test/e2e-scenario/live/gpu-double-onboard.test.ts` and a `gpu-double-onboard-vitest` workflow job. Most legacy phases are represented, but model discovery differs from the legacy shell test and final cleanup verification is weaker.
**Acceptance clause:** The Vitest test verifies GPU/Docker prerequisites, Ollama install, first onboard, persisted auth-proxy token, re-onboard, token consistency, auth rejection, sandbox inference, and cleanup. Add test evidence or identify existing coverage. The test runs `docker info` and `nvidia-smi`, installs Ollama, runs `install.sh --non-interactive`, checks token file presence/mode, runs re-onboard, compares tokens, checks correct-token/unauth/wrong-token proxy responses, and performs sandbox inference before and after re-onboard. Cleanup is only partially verified because cleanup failures are suppressed and only `~/.nemoclaw/sandboxes.json` is checked.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

github-code-quality · 2026-06-16T15:41:47Z

Code Coverage Overview

Languages: TypeScript

TypeScript / code-coverage/plugin

The overall coverage in the e2e-migrate/test-gpu... branch is 96%. Coverage data for the main branch is not yet available.

Show a code coverage summary of the most covered files.

File	main	e2e-migrate/test-gpu... `c2f23f9`	+/-
`nemoclaw/src/se...cret-scanner.ts`	—	100%	—
`nemoclaw/src/commands/slash.ts`	—	100%	—
`nemoclaw/src/li...bprocess-env.ts`	—	100%	—
`nemoclaw/src/bl...eprint/state.ts`	—	98%	—
`nemoclaw/src/onboard/config.ts`	—	98%	—
`nemoclaw/src/bl...int/snapshot.ts`	—	97%	—
`nemoclaw/src/bl...print/runner.ts`	—	95%	—
`nemoclaw/src/co...ration-state.ts`	—	94%	—
`nemoclaw/src/bl...ate-networks.ts`	—	94%	—
`nemoclaw/src/index.ts`	—	94%	—

TypeScript / code-coverage/cli

The overall coverage in the e2e-migrate/test-gpu... branch is 46%. Coverage data for the main branch is not yet available.

Show a code coverage summary of the most covered files.

File	main	e2e-migrate/test-gpu... `c2f23f9`	+/-
`src/lib/state/o...oard-session.ts`	—	90%	—
`src/lib/inference/local.ts`	—	76%	—
`src/lib/sandbox/config.ts`	—	72%	—
`src/lib/actions...dbox/rebuild.ts`	—	67%	—
`src/lib/onboard/preflight.ts`	—	64%	—
`src/lib/actions...licy-channel.ts`	—	56%	—
`src/lib/state/sandbox.ts`	—	55%	—
`src/lib/policy/index.ts`	—	49%	—
`src/lib/onboard...er-gpu-patch.ts`	—	44%	—
`src/lib/onboard.ts`	—	18%	—

Updated June 16, 2026 21:28 UTC. Code Coverage is in Public Preview.

github-actions · 2026-06-16T16:05:24Z

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 27629702812
Workflow ref: e2e-migrate/test-gpu-double-onboard
Requested scenarios: (default — all supported)
Requested jobs: gpu-double-onboard-vitest
Summary: 1 passed, 1 failed, 35 skipped

Job	Result
bedrock-runtime-compatible-anthropic-vitest	⏭️ skipped
channels-add-remove-vitest	⏭️ skipped
cloud-inference-vitest	⏭️ skipped
common-egress-agent-vitest	⏭️ skipped
credential-migration-vitest	⏭️ skipped
credential-sanitization-vitest	⏭️ skipped
double-onboard-vitest	⏭️ skipped
gateway-drift-preflight-vitest	⏭️ skipped
gateway-guard-recovery	⏭️ skipped
gateway-health-honest-vitest	⏭️ skipped
generate-matrix	✅ success
gpu-double-onboard-vitest	❌ failure
hermes-e2e-vitest	⏭️ skipped
hermes-root-entrypoint-smoke-vitest	⏭️ skipped
inference-routing-vitest	⏭️ skipped
issue-2478-crash-loop-recovery-vitest	⏭️ skipped
issue-4434-tui-unreachable-inference-vitest	⏭️ skipped
launchable-smoke-vitest	⏭️ skipped
live-scenarios	⏭️ skipped
messaging-compatible-endpoint-vitest	⏭️ skipped
messaging-providers-vitest	⏭️ skipped
model-router-provider-routed-inference-vitest	⏭️ skipped
network-policy-vitest	⏭️ skipped
onboard-negative-paths-vitest	⏭️ skipped
openclaw-inference-switch-vitest	⏭️ skipped
openclaw-skill-cli-vitest	⏭️ skipped
openclaw-tui-chat-correlation-vitest	⏭️ skipped
openshell-version-pin-vitest	⏭️ skipped
rebuild-openclaw-vitest	⏭️ skipped
runtime-overrides-vitest	⏭️ skipped
sandbox-rebuild-vitest	⏭️ skipped
sandbox-survival-vitest	⏭️ skipped
sessions-agents-cli-vitest	⏭️ skipped
shields-config-vitest	⏭️ skipped
skill-agent-vitest	⏭️ skipped
state-backup-restore-vitest	⏭️ skipped
token-rotation-vitest	⏭️ skipped

Failed jobs: gpu-double-onboard-vitest. Check run artifacts for logs.

github-actions · 2026-06-16T16:35:02Z

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27631600832
Workflow ref: e2e-migrate/test-gpu-double-onboard
Requested scenarios: (default — all supported)
Requested jobs: gpu-double-onboard-vitest
Summary: 2 passed, 0 failed, 35 skipped

Job	Result
bedrock-runtime-compatible-anthropic-vitest	⏭️ skipped
channels-add-remove-vitest	⏭️ skipped
cloud-inference-vitest	⏭️ skipped
common-egress-agent-vitest	⏭️ skipped
credential-migration-vitest	⏭️ skipped
credential-sanitization-vitest	⏭️ skipped
double-onboard-vitest	⏭️ skipped
gateway-drift-preflight-vitest	⏭️ skipped
gateway-guard-recovery	⏭️ skipped
gateway-health-honest-vitest	⏭️ skipped
generate-matrix	✅ success
gpu-double-onboard-vitest	✅ success
hermes-e2e-vitest	⏭️ skipped
hermes-root-entrypoint-smoke-vitest	⏭️ skipped
inference-routing-vitest	⏭️ skipped
issue-2478-crash-loop-recovery-vitest	⏭️ skipped
issue-4434-tui-unreachable-inference-vitest	⏭️ skipped
launchable-smoke-vitest	⏭️ skipped
live-scenarios	⏭️ skipped
messaging-compatible-endpoint-vitest	⏭️ skipped
messaging-providers-vitest	⏭️ skipped
model-router-provider-routed-inference-vitest	⏭️ skipped
network-policy-vitest	⏭️ skipped
onboard-negative-paths-vitest	⏭️ skipped
openclaw-inference-switch-vitest	⏭️ skipped
openclaw-skill-cli-vitest	⏭️ skipped
openclaw-tui-chat-correlation-vitest	⏭️ skipped
openshell-version-pin-vitest	⏭️ skipped
rebuild-openclaw-vitest	⏭️ skipped
runtime-overrides-vitest	⏭️ skipped
sandbox-rebuild-vitest	⏭️ skipped
sandbox-survival-vitest	⏭️ skipped
sessions-agents-cli-vitest	⏭️ skipped
shields-config-vitest	⏭️ skipped
skill-agent-vitest	⏭️ skipped
state-backup-restore-vitest	⏭️ skipped
token-rotation-vitest	⏭️ skipped

github-actions · 2026-06-16T17:01:09Z

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27632894484
Workflow ref: e2e-migrate/test-gpu-double-onboard
Requested scenarios: (default — all supported)
Requested jobs: gpu-double-onboard-vitest
Summary: 2 passed, 0 failed, 35 skipped

Job	Result
bedrock-runtime-compatible-anthropic-vitest	⏭️ skipped
channels-add-remove-vitest	⏭️ skipped
cloud-inference-vitest	⏭️ skipped
common-egress-agent-vitest	⏭️ skipped
credential-migration-vitest	⏭️ skipped
credential-sanitization-vitest	⏭️ skipped
double-onboard-vitest	⏭️ skipped
gateway-drift-preflight-vitest	⏭️ skipped
gateway-guard-recovery	⏭️ skipped
gateway-health-honest-vitest	⏭️ skipped
generate-matrix	✅ success
gpu-double-onboard-vitest	✅ success
hermes-e2e-vitest	⏭️ skipped
hermes-root-entrypoint-smoke-vitest	⏭️ skipped
inference-routing-vitest	⏭️ skipped
issue-2478-crash-loop-recovery-vitest	⏭️ skipped
issue-4434-tui-unreachable-inference-vitest	⏭️ skipped
launchable-smoke-vitest	⏭️ skipped
live-scenarios	⏭️ skipped
messaging-compatible-endpoint-vitest	⏭️ skipped
messaging-providers-vitest	⏭️ skipped
model-router-provider-routed-inference-vitest	⏭️ skipped
network-policy-vitest	⏭️ skipped
onboard-negative-paths-vitest	⏭️ skipped
openclaw-inference-switch-vitest	⏭️ skipped
openclaw-skill-cli-vitest	⏭️ skipped
openclaw-tui-chat-correlation-vitest	⏭️ skipped
openshell-version-pin-vitest	⏭️ skipped
rebuild-openclaw-vitest	⏭️ skipped
runtime-overrides-vitest	⏭️ skipped
sandbox-rebuild-vitest	⏭️ skipped
sandbox-survival-vitest	⏭️ skipped
sessions-agents-cli-vitest	⏭️ skipped
shields-config-vitest	⏭️ skipped
skill-agent-vitest	⏭️ skipped
state-backup-restore-vitest	⏭️ skipped
token-rotation-vitest	⏭️ skipped

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e-scenario/live/gpu-double-onboard.test.ts`:
- Around line 208-223: The test currently validates only the new token
(tokenAfterSecond) without verifying it remains unchanged across the re-onboard
cycle, allowing a token rotation regression to pass undetected. Capture the
first token from TOKEN_FILE before the re-onboard operation (prior to line 208),
then add an explicit equality assertion comparing tokenAfterSecond to this first
captured token to enforce that the token identity is preserved across
re-onboard. Additionally, reuse the first token object for the post-reonboard
authentication check to validate the same token works for both requests.
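The check requested above could look roughly like the following sketch. It assumes the token is stored as plain text in `TOKEN_FILE` and that the test reads the file once before and once after re-onboard; `parseToken` and `assertTokenPreserved` are hypothetical helper names, and the actual test may use Vitest `expect` assertions instead of thrown errors.

```typescript
// Hypothetical helper: normalize the raw contents of TOKEN_FILE.
function parseToken(raw: string): string {
  const token = raw.trim();
  if (token.length === 0) {
    throw new Error("auth-proxy token file is empty");
  }
  return token;
}

// Fails when re-onboard rotated the token instead of preserving it.
// rawBefore: file contents captured before the re-onboard step.
// rawAfter:  file contents read after the re-onboard step (tokenAfterSecond).
function assertTokenPreserved(rawBefore: string, rawAfter: string): string {
  const tokenBeforeReonboard = parseToken(rawBefore);
  const tokenAfterSecond = parseToken(rawAfter);
  if (tokenAfterSecond !== tokenBeforeReonboard) {
    throw new Error("auth-proxy token changed across re-onboard");
  }
  // Return the first token so the post-re-onboard auth probe can reuse it.
  return tokenBeforeReonboard;
}
```

Returning the first token lets the later authentication probe use the same value for both requests, which is the second half of the review suggestion.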

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 04a46ee6-a453-4db3-82ba-edb219d47255

📥 Commits

Reviewing files that changed from the base of the PR and between 6c0fb04 and cd198f3.

📒 Files selected for processing (2)

.github/workflows/e2e-vitest-scenarios.yaml
test/e2e-scenario/live/gpu-double-onboard.test.ts

github-actions · 2026-06-16T17:42:19Z

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27635624904
Workflow ref: e2e-migrate/test-gpu-double-onboard
Requested scenarios: (default — all supported)
Requested jobs: gpu-double-onboard-vitest
Summary: 2 passed, 0 failed, 35 skipped

Job	Result
bedrock-runtime-compatible-anthropic-vitest	⏭️ skipped
channels-add-remove-vitest	⏭️ skipped
cloud-inference-vitest	⏭️ skipped
common-egress-agent-vitest	⏭️ skipped
credential-migration-vitest	⏭️ skipped
credential-sanitization-vitest	⏭️ skipped
double-onboard-vitest	⏭️ skipped
gateway-drift-preflight-vitest	⏭️ skipped
gateway-guard-recovery	⏭️ skipped
gateway-health-honest-vitest	⏭️ skipped
generate-matrix	✅ success
gpu-double-onboard-vitest	✅ success
hermes-e2e-vitest	⏭️ skipped
hermes-root-entrypoint-smoke-vitest	⏭️ skipped
inference-routing-vitest	⏭️ skipped
issue-2478-crash-loop-recovery-vitest	⏭️ skipped
issue-4434-tui-unreachable-inference-vitest	⏭️ skipped
launchable-smoke-vitest	⏭️ skipped
live-scenarios	⏭️ skipped
messaging-compatible-endpoint-vitest	⏭️ skipped
messaging-providers-vitest	⏭️ skipped
model-router-provider-routed-inference-vitest	⏭️ skipped
network-policy-vitest	⏭️ skipped
onboard-negative-paths-vitest	⏭️ skipped
openclaw-inference-switch-vitest	⏭️ skipped
openclaw-skill-cli-vitest	⏭️ skipped
openclaw-tui-chat-correlation-vitest	⏭️ skipped
openshell-version-pin-vitest	⏭️ skipped
rebuild-openclaw-vitest	⏭️ skipped
runtime-overrides-vitest	⏭️ skipped
sandbox-rebuild-vitest	⏭️ skipped
sandbox-survival-vitest	⏭️ skipped
sessions-agents-cli-vitest	⏭️ skipped
shields-config-vitest	⏭️ skipped
skill-agent-vitest	⏭️ skipped
state-backup-restore-vitest	⏭️ skipped
token-rotation-vitest	⏭️ skipped

github-actions · 2026-06-16T18:08:05Z

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27636905667
Workflow ref: e2e-migrate/test-gpu-double-onboard
Requested scenarios: (default — all supported)
Requested jobs: gpu-double-onboard-vitest
Summary: 2 passed, 0 failed, 35 skipped

Job	Result
bedrock-runtime-compatible-anthropic-vitest	⏭️ skipped
channels-add-remove-vitest	⏭️ skipped
cloud-inference-vitest	⏭️ skipped
common-egress-agent-vitest	⏭️ skipped
credential-migration-vitest	⏭️ skipped
credential-sanitization-vitest	⏭️ skipped
double-onboard-vitest	⏭️ skipped
gateway-drift-preflight-vitest	⏭️ skipped
gateway-guard-recovery	⏭️ skipped
gateway-health-honest-vitest	⏭️ skipped
generate-matrix	✅ success
gpu-double-onboard-vitest	✅ success
hermes-e2e-vitest	⏭️ skipped
hermes-root-entrypoint-smoke-vitest	⏭️ skipped
inference-routing-vitest	⏭️ skipped
issue-2478-crash-loop-recovery-vitest	⏭️ skipped
issue-4434-tui-unreachable-inference-vitest	⏭️ skipped
launchable-smoke-vitest	⏭️ skipped
live-scenarios	⏭️ skipped
messaging-compatible-endpoint-vitest	⏭️ skipped
messaging-providers-vitest	⏭️ skipped
model-router-provider-routed-inference-vitest	⏭️ skipped
network-policy-vitest	⏭️ skipped
onboard-negative-paths-vitest	⏭️ skipped
openclaw-inference-switch-vitest	⏭️ skipped
openclaw-skill-cli-vitest	⏭️ skipped
openclaw-tui-chat-correlation-vitest	⏭️ skipped
openshell-version-pin-vitest	⏭️ skipped
rebuild-openclaw-vitest	⏭️ skipped
runtime-overrides-vitest	⏭️ skipped
sandbox-rebuild-vitest	⏭️ skipped
sandbox-survival-vitest	⏭️ skipped
sessions-agents-cli-vitest	⏭️ skipped
shields-config-vitest	⏭️ skipped
skill-agent-vitest	⏭️ skipped
state-backup-restore-vitest	⏭️ skipped
token-rotation-vitest	⏭️ skipped

github-actions · 2026-06-16T18:49:34Z

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27638943887
Workflow ref: e2e-migrate/test-gpu-double-onboard
Requested scenarios: (default — all supported)
Requested jobs: gpu-double-onboard-vitest
Summary: 2 passed, 0 failed, 36 skipped

Job	Result
bedrock-runtime-compatible-anthropic-vitest	⏭️ skipped
channels-add-remove-vitest	⏭️ skipped
cloud-inference-vitest	⏭️ skipped
common-egress-agent-vitest	⏭️ skipped
credential-migration-vitest	⏭️ skipped
credential-sanitization-vitest	⏭️ skipped
double-onboard-vitest	⏭️ skipped
gateway-drift-preflight-vitest	⏭️ skipped
gateway-guard-recovery	⏭️ skipped
gateway-health-honest-vitest	⏭️ skipped
generate-matrix	✅ success
gpu-double-onboard-vitest	✅ success
hermes-e2e-vitest	⏭️ skipped
hermes-root-entrypoint-smoke-vitest	⏭️ skipped
inference-routing-vitest	⏭️ skipped
issue-2478-crash-loop-recovery-vitest	⏭️ skipped
issue-4434-tui-unreachable-inference-vitest	⏭️ skipped
launchable-smoke-vitest	⏭️ skipped
live-scenarios	⏭️ skipped
messaging-compatible-endpoint-vitest	⏭️ skipped
messaging-providers-vitest	⏭️ skipped
model-router-provider-routed-inference-vitest	⏭️ skipped
network-policy-vitest	⏭️ skipped
onboard-negative-paths-vitest	⏭️ skipped
onboard-resume-vitest	⏭️ skipped
openclaw-inference-switch-vitest	⏭️ skipped
openclaw-skill-cli-vitest	⏭️ skipped
openclaw-tui-chat-correlation-vitest	⏭️ skipped
openshell-version-pin-vitest	⏭️ skipped
rebuild-openclaw-vitest	⏭️ skipped
runtime-overrides-vitest	⏭️ skipped
sandbox-rebuild-vitest	⏭️ skipped
sandbox-survival-vitest	⏭️ skipped
sessions-agents-cli-vitest	⏭️ skipped
shields-config-vitest	⏭️ skipped
skill-agent-vitest	⏭️ skipped
state-backup-restore-vitest	⏭️ skipped
token-rotation-vitest	⏭️ skipped

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

.github/workflows/e2e-vitest-scenarios.yaml (1)

1939-1951: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add Docker Hub authentication to reduce GPU E2E flakiness.

This job skips the Docker Hub login guard used by most other live onboarding jobs in this workflow. On GPU runners, anonymous pull limits can make this scenario fail intermittently.

Suggested patch

   gpu-double-onboard-vitest:
@@
     steps:
       - uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
         with:
           persist-credentials: false
+
+      - name: Authenticate to Docker Hub
+        env:
+          DOCKERHUB_USERNAME: ${{ secrets.DOCKERHUB_USERNAME }}
+          DOCKERHUB_TOKEN: ${{ secrets.DOCKERHUB_TOKEN }}
+        shell: bash
+        run: |
+          set -euo pipefail
+          if [[ -z "${DOCKERHUB_USERNAME}" || -z "${DOCKERHUB_TOKEN}" ]]; then
+            echo "::notice::Docker Hub credentials not configured; continuing with anonymous pulls."
+            exit 0
+          fi
+          login_succeeded=0
+          for attempt in 1 2 3; do
+            if echo "${DOCKERHUB_TOKEN}" | timeout 30s docker login docker.io --username "${DOCKERHUB_USERNAME}" --password-stdin; then
+              login_succeeded=1
+              break
+            fi
+            if [[ "$attempt" -lt 3 ]]; then
+              echo "::warning::Docker Hub login attempt ${attempt} failed; retrying."
+              sleep 5
+            fi
+          done
+          if [[ "$login_succeeded" -ne 1 ]]; then
+            echo "::warning::Docker Hub login failed after 3 attempts; continuing with anonymous pulls."
+          fi
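The control flow of the suggested step (try up to three times, then continue with anonymous pulls rather than failing the job) can be sketched in TypeScript as below. This is a model of the loop only, under the assumption that a failed login should never abort the run; `loginWithRetry`, `tryLogin`, and `onRetry` are hypothetical names, and the sketch does not call Docker.

```typescript
type LoginOutcome = "authenticated" | "anonymous";

// Hypothetical model of the retry loop in the suggested workflow step.
// tryLogin: performs one login attempt and reports success.
// onRetry:  called between attempts (the shell version warns and sleeps 5s).
function loginWithRetry(
  tryLogin: () => boolean,
  maxAttempts: number = 3,
  onRetry: (attempt: number) => void = () => {},
): LoginOutcome {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (tryLogin()) {
      return "authenticated";
    }
    if (attempt < maxAttempts) {
      onRetry(attempt);
    }
  }
  // All attempts failed: fall back to anonymous pulls instead of throwing.
  return "anonymous";
}
```

The fallback keeps the job running when credentials are missing or Docker Hub is briefly unavailable, at the cost of still being exposed to anonymous pull limits in that case.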

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/e2e-vitest-scenarios.yaml around lines 1939 - 1951, Add
Docker Hub authentication to this job to prevent intermittent pull failures on
GPU runners due to anonymous rate limits. After the "Set up Node" step and
before the "Install root dependencies" step, insert a new step that
authenticates with Docker Hub using the docker/login-action GitHub Action with
appropriate credentials. This aligns the authentication approach with other live
onboarding jobs in the workflow that use the Docker Hub login guard.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d4b06322-5678-48f6-9007-13d5e18d990a

📥 Commits

Reviewing files that changed from the base of the PR and between 6b837fa and 28499df.

📒 Files selected for processing (1)

.github/workflows/e2e-vitest-scenarios.yaml

github-actions · 2026-06-16T21:46:56Z

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27649245791
Workflow ref: e2e-migrate/test-gpu-double-onboard
Requested scenarios: (default — all supported)
Requested jobs: gpu-double-onboard-vitest
Summary: 2 passed, 0 failed, 38 skipped

Job	Result
bedrock-runtime-compatible-anthropic-vitest	⏭️ skipped
channels-add-remove-vitest	⏭️ skipped
cloud-inference-vitest	⏭️ skipped
cloud-onboard-vitest	⏭️ skipped
common-egress-agent-vitest	⏭️ skipped
credential-migration-vitest	⏭️ skipped
credential-sanitization-vitest	⏭️ skipped
double-onboard-vitest	⏭️ skipped
full-e2e-vitest	⏭️ skipped
gateway-drift-preflight-vitest	⏭️ skipped
gateway-guard-recovery	⏭️ skipped
gateway-health-honest-vitest	⏭️ skipped
generate-matrix	✅ success
gpu-double-onboard-vitest	✅ success
hermes-e2e-vitest	⏭️ skipped
hermes-root-entrypoint-smoke-vitest	⏭️ skipped
inference-routing-vitest	⏭️ skipped
issue-2478-crash-loop-recovery-vitest	⏭️ skipped
issue-4434-tui-unreachable-inference-vitest	⏭️ skipped
launchable-smoke-vitest	⏭️ skipped
live-scenarios	⏭️ skipped
messaging-compatible-endpoint-vitest	⏭️ skipped
messaging-providers-vitest	⏭️ skipped
model-router-provider-routed-inference-vitest	⏭️ skipped
network-policy-vitest	⏭️ skipped
onboard-negative-paths-vitest	⏭️ skipped
onboard-resume-vitest	⏭️ skipped
openclaw-inference-switch-vitest	⏭️ skipped
openclaw-skill-cli-vitest	⏭️ skipped
openclaw-tui-chat-correlation-vitest	⏭️ skipped
openshell-version-pin-vitest	⏭️ skipped
rebuild-openclaw-vitest	⏭️ skipped
runtime-overrides-vitest	⏭️ skipped
sandbox-rebuild-vitest	⏭️ skipped
sandbox-survival-vitest	⏭️ skipped
sessions-agents-cli-vitest	⏭️ skipped
shields-config-vitest	⏭️ skipped
skill-agent-vitest	⏭️ skipped
state-backup-restore-vitest	⏭️ skipped
token-rotation-vitest	⏭️ skipped

test(e2e): migrate GPU double onboard to Vitest

b35230d

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

cv self-assigned this Jun 16, 2026

test(e2e): add GPU double onboard Vitest migration

f435eb5

cv added 2 commits June 16, 2026 09:13

test(e2e): avoid heredoc in GPU sandbox inference

a9fbcd9

test(e2e): fix GPU inference parser quoting

f5e48b4

test(e2e): avoid prompt echo in GPU sandbox inference

cd198f3

cv marked this pull request as ready for review June 16, 2026 17:06

coderabbitai Bot reviewed Jun 16, 2026

Comment thread test/e2e-scenario/live/gpu-double-onboard.test.ts

test(e2e): restore GPU double-onboard parity checks

5c05377

Merge remote-tracking branch 'origin/main' into e2e-migrate/test-gpu-…

6b837fa

…double-onboard # Conflicts: # .github/workflows/e2e-vitest-scenarios.yaml

Merge remote-tracking branch 'origin/main' into e2e-migrate/test-gpu-…

28499df

…double-onboard # Conflicts: # .github/workflows/e2e-vitest-scenarios.yaml

coderabbitai Bot reviewed Jun 16, 2026

Merge remote-tracking branch 'origin/main' into e2e-migrate/test-gpu-…

c2f23f9

…double-onboard # Conflicts: # .github/workflows/e2e-vitest-scenarios.yaml
