test(e2e): migrate GPU double onboard to Vitest #5495

Open
cv wants to merge 9 commits into main from e2e-migrate/test-gpu-double-onboard
cv commented Jun 16, 2026

Collaborator

Summary

Migrate test-gpu-double-onboard.sh into the live Vitest E2E system. Adds test/e2e-scenario/live/gpu-double-onboard.test.ts and gpu-double-onboard-vitest workflow wiring on the GPU runner. The Vitest test verifies GPU/Docker prerequisites, Ollama install, first onboard, persisted auth-proxy token, re-onboard, token consistency, auth rejection, sandbox inference, and cleanup.

Related Issue

Refs #5098

Changes

  • Adds or wires the free-standing live Vitest scenario gpu-double-onboard.
  • Adds selective workflow dispatch via gpu-double-onboard-vitest in .github/workflows/e2e-vitest-scenarios.yaml.
  • Preserves the legacy system boundaries from test-gpu-double-onboard.sh while leaving legacy shell retirement to Epic: Migrate legacy bash E2E into the Vitest E2E system #5098 Phase 11.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • Git hooks passed during commit and push, or npx prek run --from-ref main --to-ref HEAD passes
  • Targeted tests pass for changed behavior
  • Full npm test passes (broad runtime changes only)
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Targeted local checks run while preparing these branches:

  • npx vitest run --project e2e-vitest-support test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts --silent=false --reporter=default
  • npm run typecheck:cli for branches adding new TypeScript tests
  • git diff --check

Selective same-runner dispatch: https://github.com/NVIDIA/NemoClaw/actions/runs/27649245791 — passed after merge-from-main refresh


Signed-off-by: Carlos Villela <cvillela@nvidia.com>

Summary by CodeRabbit

  • Tests
    • Added a new live GPU end-to-end Vitest scenario, replacing the prior shell-based GPU flow.
    • Verifies Docker/NVIDIA availability, runs onboarding with an Ollama-backed proxy, and checks the persisted proxy token (including expected permissions).
    • Confirms correct proxy authentication (authorized vs unauthorized) and performs in-sandbox inference with expected output.
    • Re-runs onboarding to ensure the token remains unchanged, then cleans up and removes the sandbox from the registry.
  • Chores
    • Added a dedicated GPU CI job and included it in PR status/reporting, with updated artifact uploads.

Signed-off-by: Carlos Villela <cvillela@nvidia.com>
cv self-assigned this Jun 16, 2026

copy-pr-bot Bot commented Jun 16, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai Bot commented Jun 16, 2026

Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds a new live Vitest E2E test file (gpu-double-onboard.test.ts) covering the GPU double-onboard scenario with Ollama provider, proxy token persistence, HTTP endpoint authentication, and in-sandbox inference validation. A corresponding CI workflow job (gpu-double-onboard-vitest) is added with a GPU runner and Ollama configuration, wired into the report-to-pr aggregation.

Changes

GPU Double-Onboard E2E Test and CI Job

| Layer | File(s) | Summary |
| --- | --- | --- |
| Test configuration, env helpers, and shell utilities | test/e2e-scenario/live/gpu-double-onboard.test.ts | Defines repo/CLI paths, sandbox/proxy env config, live timeout gating, env() helper for merging availability-probe environment, and shell command utilities: resultText formatter, nemoclaw CLI invocation wrapper, cleanup (destroy/delete/kill processes), httpStatus (curl-based status extraction with optional bearer token), fileMode (file permission checker), chatRequest (payload builder), and parseReplyCommand (JSON reply extraction). |
| Main live E2E scenario body | test/e2e-scenario/live/gpu-double-onboard.test.ts | Implements the full scenario: scenario metadata JSON creation, Docker/NVIDIA GPU prerequisite checks, Ollama install and service cleanup, first onboard with sandbox listing validation, proxy token file creation/persistence with mode 600 verification, HTTP 200/401 endpoint assertions (/v1/models with valid/invalid/missing bearer token, /api/tags shape validation), first in-sandbox chat completion inference asserting "42", re-onboard with sandbox recreation and token persistence validation, proxy and inference re-validation, final cleanup, sandbox registry removal confirmation, and scenario-result.json artifact write. |
| CI workflow job and report-to-pr integration | .github/workflows/e2e-vitest-scenarios.yaml | Adds gpu-double-onboard-vitest free-standing job with GPU runner selection, NEMOCLAW_PROVIDER=ollama and proxy port env vars, Node/dependency/CLI/OpenShell installation, Vitest test execution with openshell on PATH, and artifact upload to e2e-artifacts/vitest/gpu-double-onboard/; updates report-to-pr job needs to include the new job for PR comment aggregation. |
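
The walkthrough names a chatRequest payload builder and a parseReplyCommand reply extractor without showing them. The following is a minimal sketch of what such helpers might look like, assuming an OpenAI-style chat completion body (`choices[0].message.content`). The function signatures, the prompt text, and the `parseReply` name are hypothetical and are not taken from the PR.

```typescript
// Hypothetical stand-ins for the helpers named in the walkthrough.
// The real signatures in gpu-double-onboard.test.ts are not shown on this page.

interface ChatCompletion {
  choices?: Array<{ message?: { content?: string } }>;
}

/** Builds an OpenAI-style chat completion payload as a JSON string. */
function chatRequest(
  model: string,
  prompt = "What is 6 times 7? Reply with only the number.", // assumed prompt
): string {
  return JSON.stringify({
    model,
    messages: [{ role: "user", content: prompt }],
    stream: false,
  });
}

/** Returns the first reply text from a chat completion body, or null if it is malformed. */
function parseReply(body: string): string | null {
  try {
    const parsed = JSON.parse(body) as ChatCompletion;
    const content = parsed.choices?.[0]?.message?.content;
    return typeof content === "string" ? content : null;
  } catch {
    // Invalid JSON or a non-object body: treat as "no reply".
    return null;
  }
}
```

Returning null instead of throwing lets the caller decide whether a malformed body is an assertion failure or a retryable condition.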

Sequence Diagram(s)

sequenceDiagram
  participant Test as gpu-double-onboard.test.ts
  participant CLI as nemoclaw CLI
  participant Proxy as Ollama Auth Proxy
  participant Sandbox as Docker/GPU Sandbox

  Test->>Sandbox: docker info / nvidia-smi (check prerequisites)
  Test->>CLI: install (Ollama service)
  Test->>CLI: onboard (first time, non-interactive)
  CLI->>Proxy: create and persist auth token
  Proxy-->>Test: token file created with mode 600
  Test->>Proxy: GET /v1/models with valid bearer token → 200
  Test->>Proxy: GET /api/tags (any token) → numeric status
  Test->>Proxy: GET /v1/models unauthenticated → 401
  Test->>Proxy: GET /v1/models with wrong token → 401
  Test->>Sandbox: curl inference.local chat completion (HTTPS)
  Sandbox-->>Test: JSON response containing "42"
  Test->>CLI: onboard again (sandbox recreation, same token expected)
  Proxy-->>Test: token persisted with same content
  Test->>Sandbox: curl inference.local chat completion (second time)
  Sandbox-->>Test: JSON response containing "42"
  Test->>CLI: cleanup (destroy sandbox, delete gateway, kill processes)
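
The "token file created with mode 600" step in the diagram can be checked with a small permission helper. This is a sketch under stated assumptions: it runs on a POSIX host, it uses a throwaway temp file rather than the real token file, and the `fileMode` signature shown here is a guess at the helper named in the walkthrough.

```typescript
import { chmodSync, mkdtempSync, rmSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

/** Returns the permission bits of a file as an octal string such as "600". */
function fileMode(path: string): string {
  return (statSync(path).mode & 0o777).toString(8);
}

// Demonstrate against a temporary file, not ~/.nemoclaw/ollama-proxy-token.
const dir = mkdtempSync(join(tmpdir(), "token-mode-"));
const tokenPath = join(dir, "ollama-proxy-token");
writeFileSync(tokenPath, "example-token\n");
chmodSync(tokenPath, 0o600); // explicit chmod so the result does not depend on umask
const mode = fileMode(tokenPath);
rmSync(dir, { recursive: true, force: true });
console.log(mode);
```

Masking with `0o777` drops the file-type bits that `stat` also reports, so only owner, group, and other permissions are compared.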

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#5243: Both PRs modify .github/workflows/e2e-vitest-scenarios.yaml by adding/wiring free-standing Vitest jobs and updating report-to-pr's needs dependencies; the GPU double-onboard job's reporting integration is directly affected by the selector/validation workflow plumbing.

  • NVIDIA/NemoClaw#5218: Both PRs add/wire a free-standing double-onboard Vitest E2E job and corresponding live test file into .github/workflows/e2e-vitest-scenarios.yaml and report-to-pr dependencies, covering the same scenario (base vs GPU variant).

  • NVIDIA/NemoClaw#5493: Both PRs update the same e2e-vitest-scenarios.yaml workflow by adding a new live Vitest job and wiring it into report-to-pr.needs, with both introducing live Vitest E2E tests that perform in-sandbox inference validation and cleanup steps.

Suggested labels

area: e2e

Poem

🐇 Hop hop, the GPU awakes,
A double onboard the test harness makes!
The Ollama proxy guards the door,
Returns a 200—then 401 once more.
"42" emerges from inference deep,
This rabbit's CI never misses a beep! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main change: migrating a GPU double onboard test from shell script to Vitest. It is specific, concise, and clearly indicates the primary objective of the changeset. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


github-actions Bot commented Jun 16, 2026

Contributor

E2E Advisor Recommendation

Required E2E: None
Optional E2E: gpu-double-onboard-vitest

Dispatch hint: gpu-double-onboard-vitest

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • None.

Optional E2E

  • gpu-double-onboard-vitest (high): Useful to validate the newly added workflow job and live scenario wiring on the intended GPU runner, including Ollama provider onboarding and re-onboarding behavior. Not merge-blocking because the PR changes only E2E test/workflow files, not runtime user-flow code.

New E2E recommendations

  • None.

Dispatch hint

  • Workflow: .github/workflows/e2e-vitest-scenarios.yaml
  • jobs input: gpu-double-onboard-vitest

github-actions Bot commented Jun 16, 2026

Contributor

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: gpu-double-onboard-vitest
Optional Vitest E2E scenarios: None

Dispatch required Vitest E2E scenarios:

  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=gpu-double-onboard-vitest

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required Vitest E2E scenarios

  • gpu-double-onboard-vitest: Focused free-standing Vitest job wired for changed live test test/e2e-scenario/live/gpu-double-onboard.test.ts.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=gpu-double-onboard-vitest

Optional Vitest E2E scenarios

  • None.

Relevant changed files

  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/live/gpu-double-onboard.test.ts

github-actions Bot commented Jun 16, 2026

Contributor

PR Review Advisor

Findings: 1 needs attention, 7 worth checking, 0 nice ideas
Since last review: 0 prior items resolved, 6 still apply, 1 new item found

Review findings

🛠️ Needs attention

  • Generated Ollama proxy token can leak into ShellProbe artifacts (test/e2e-scenario/live/gpu-double-onboard.test.ts:95): The test reads ~/.nemoclaw/ollama-proxy-token, interpolates that generated bearer token into a bash -lc curl command, and uploads ShellProbe artifacts for the command. ShellProbe redacts environment-derived secrets or explicit redactionValues, but this call provides neither for the generated token, so the bearer can be persisted in .result.json command metadata and exposed in process argv.
    • Recommendation: Avoid placing the token in a shell command string. Pass it via a dedicated environment variable, stdin, or a file descriptor and pass redactionValues: [token] to ShellProbe for every token-auth probe; ideally avoid bash -lc for token-auth curl probes.
    • Evidence: httpStatus() builds `const header = token ? `-H 'Authorization: Bearer ${token}'` : ""` and passes a single `bash -lc` command. `ShellProbe.run()` writes the redacted command array to `${artifactBase}.result.json`, but `httpStatus()` does not pass `redactionValues`.
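
One way to act on this recommendation is to run curl without a shell and pass the Authorization header on stdin, so the token never appears in argv, and to redact it from any captured text. The sketch below is not the PR's code and does not use ShellProbe. `redact`, `curlStatusArgs`, and `httpStatusWithToken` are hypothetical names, and `-H @-` assumes curl 7.55.0 or newer.

```typescript
import { spawnSync } from "node:child_process";

/** Replaces every occurrence of each secret with a fixed marker. */
function redact(text: string, secrets: string[]): string {
  let out = text;
  for (const secret of secrets) {
    if (secret.length > 0) out = out.split(secret).join("[REDACTED]");
  }
  return out;
}

/** curl argv that reads extra headers from stdin ("-H @-"), keeping the token out of argv. */
function curlStatusArgs(url: string): string[] {
  return ["-sS", "-o", "/dev/null", "-w", "%{http_code}", "-H", "@-", url];
}

/** Runs curl directly (no bash -lc) and feeds the Authorization header on stdin. */
function httpStatusWithToken(url: string, token: string): string {
  const result = spawnSync("curl", curlStatusArgs(url), {
    input: `Authorization: Bearer ${token}\n`,
    encoding: "utf8",
  });
  // stdout is null when curl could not be spawned; redact defensively either way.
  return redact(result.stdout ?? "", [token]).trim();
}
```

Because the argv array is static apart from the URL, it can be written to artifact metadata as-is. The same token value would still need to be supplied to whatever redaction hook the probe layer offers.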

🔎 Worth checking

  • Source-of-truth review needed: test/e2e-scenario/live/gpu-double-onboard.test.ts cleanup helper: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: `cleanup()` catches and suppresses failures for NemoClaw destroy, OpenShell sandbox delete, OpenShell gateway destroy, and process cleanup; final cleanup only checks that the NemoClaw registry file text does not contain the sandbox name.
  • Env-derived port and model values are interpolated into shell strings (test/e2e-scenario/live/gpu-double-onboard.test.ts:95): PROXY_PORT comes from NEMOCLAW_OLLAMA_PROXY_PORT without numeric TCP-port validation and is embedded in host bash -lc curl commands. NEMOCLAW_MODEL is JSON-stringified and then embedded inside a single-quoted sandbox sh -lc payload. Workflow defaults are safe, but custom or local runs can supply quotes or shell metacharacters that break quoting at a host or sandbox command boundary.
    • Recommendation: Validate NEMOCLAW_OLLAMA_PROXY_PORT as a numeric TCP port before use. Build curl invocations without shell interpolation where possible, or use a shell-safe quoting helper for every dynamic value including the model JSON.
    • Evidence: `const PROXY_PORT = process.env.NEMOCLAW_OLLAMA_PROXY_PORT ?? "11435"`; `httpStatus()` embeds `${PROXY_PORT}` through a URL inside `bash -lc`; `expectSandboxInference42()` embeds `--data '${chatRequest(model)}'` where `model` can come from `process.env.NEMOCLAW_MODEL`.
  • Live GPU scenario executes an unpinned remote Ollama installer (test/e2e-scenario/live/gpu-double-onboard.test.ts:194): The new live scenario executes mutable third-party code from https://ollama.com/install.sh on a high-trust GPU runner that also manages sandbox lifecycle and an auth-proxy boundary. The installer is not pinned to a version, checksum, or signature in this PR.
    • Recommendation: Pin the Ollama package/version or verify the downloaded installer/package with a maintained checksum or signature before execution. If this must remain a live installer boundary, document the trust assumption and keep credentials isolated from the job.
    • Evidence: The install step runs `command -v ollama >/dev/null 2>&1 || curl -fsSL https://ollama.com/install.sh | sh`; the workflow job sets `NEMOCLAW_ACCEPT_THIRD_PARTY_SOFTWARE=1`.
  • Final cleanup can pass while sandbox, gateway, or proxy state remains (test/e2e-scenario/live/gpu-double-onboard.test.ts:61): The same best-effort cleanup helper is used for pre-run cleanup and final cleanup. It suppresses failures from NemoClaw destroy, OpenShell sandbox delete, OpenShell gateway destroy, and Ollama/proxy process cleanup, then the final assertion only checks the NemoClaw registry file. This only partially supports the PR's cleanup acceptance claim and can write scenario-result.json as passed while OpenShell or proxy state remains.
    • Recommendation: Keep best-effort cleanup for pre-run state, but use a strict final cleanup path that fails on unexpected destroy/delete errors and verifies the sandbox is absent from NemoClaw registry and OpenShell listings, the nemoclaw gateway is removed, and no ollama-auth-proxy remains for the selected port.
    • Evidence: `cleanup()` catches and ignores errors at lines 62, 71, 78, and 92. The final path calls `await cleanup(host, sandbox)`, checks only `~/.nemoclaw/sandboxes.json`, and then writes `scenario-result.json` with status `passed`.
  • Inference model selection drifts from the legacy test (test/e2e-scenario/live/gpu-double-onboard.test.ts:231): When NEMOCLAW_MODEL is unset, the Vitest migration hardcodes llama3.2:1b. The legacy shell test instead discovered the model actually available after onboard from Ollama. The new workflow does not set NEMOCLAW_MODEL, so this test can fail because it asks for a model that onboard did not install or select, instead of validating the double-onboard token regression.
    • Recommendation: After first onboard, discover the configured or installed model from the same source as the product, such as Ollama /api/tags or NemoClaw/OpenShell inference state, and use that resolved model for both first-onboard and post-re-onboard sandbox inference assertions.
    • Evidence: The new test uses `const model = process.env.NEMOCLAW_MODEL ?? "llama3.2:1b"`. The legacy `test/e2e/test-gpu-double-onboard.sh` used NEMOCLAW_MODEL when set, otherwise queried `http://127.0.0.1:11434/api/tags` for the first installed model.
  • Token stability assertion may be stricter than the legacy regression contract (test/e2e-scenario/live/gpu-double-onboard.test.ts:254): The new test requires the token after re-onboard to equal the first-onboard token. The legacy regression check primarily verified that the token persisted after re-onboard is accepted by the running proxy. If the product ever intentionally rotates the token during re-onboard while keeping disk and proxy state consistent, this migration would fail for behavior that still satisfies the original token-divergence contract.
    • Recommendation: Confirm the product contract. If token stability is not required, remove the equality assertion and assert instead that the post-re-onboard persisted token is accepted by the running proxy and wrong/unauthenticated requests are rejected.
    • Evidence: The new test asserts `expect(tokenAfterSecond).toBe(tokenAfterFirst)` before probing the proxy. The legacy script's core check reads `TOKEN_AFTER_SECOND` after re-onboard and verifies the running proxy accepts that persisted token.
  • Security-sensitive runtime edges need explicit negative assertions (test/e2e-scenario/live/gpu-double-onboard.test.ts:95): The scenario covers important positive behavior and proxy auth rejection, but it does not assert that the generated proxy token is absent from ShellProbe artifacts, that unsafe env-derived shell inputs are rejected or safely quoted, or that final cleanup failures fail the scenario.
    • Recommendation: Add behavior-specific runtime assertions for artifact redaction, invalid proxy port/model shell metacharacters, and strict final cleanup. These should exercise the real host/sandbox boundary rather than mocks.
    • Evidence: The test uploads ShellProbe artifacts for token-auth curl probes, uses env-derived PROXY_PORT and model values in shell commands, and suppresses cleanup failures while only reading the NemoClaw registry file at the end.
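
The port-validation and quoting recommendations above could be sketched as two small helpers. This is a minimal illustration, not the PR's code: `parseTcpPort` and `shQuote` are hypothetical names, and the quoting helper assumes a POSIX `sh` on the receiving side.

```typescript
/** Parses a TCP port from an env-style string; throws unless it is an integer in 1..65535. */
function parseTcpPort(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw === "") return fallback;
  // Digits only: rejects whitespace, signs, and shell metacharacters up front.
  if (!/^[0-9]{1,5}$/.test(raw)) {
    throw new Error(`invalid TCP port: ${JSON.stringify(raw)}`);
  }
  const port = Number(raw);
  if (port < 1 || port > 65535) throw new Error(`TCP port out of range: ${port}`);
  return port;
}

/** Quotes a value for POSIX sh: wrap in single quotes and escape embedded single quotes. */
function shQuote(value: string): string {
  return `'${value.replace(/'/g, `'\\''`)}'`;
}

// Example: 11435 is the default seen in the finding's evidence.
const proxyPort = parseTcpPort(process.env.NEMOCLAW_OLLAMA_PROXY_PORT, 11435);
```

Validating the port once at module load means every later URL can embed a number rather than an arbitrary string. `shQuote` is only needed where a shell boundary cannot be avoided, for example around the JSON payload passed to the sandbox.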

🌱 Nice ideas

  • None.
Consider writing more tests for
  • **Runtime validation**: This change exercises live GPU, Docker, mutable third-party installer execution, OpenShell sandbox lifecycle, Ollama auth-proxy token persistence, and in-sandbox inference. Direct runtime validation is appropriate; mocks would not validate these security-sensitive system boundaries. Cases to cover:
    • GPU double-onboard with no `NEMOCLAW_MODEL` discovers the model selected or pulled by onboard and uses it for both first-onboard and post-re-onboard sandbox inference.
    • `~/.nemoclaw/ollama-proxy-token` never appears in ShellProbe `.result.json`, command metadata, stdout, stderr, or uploaded artifact contents for token-auth curl probes.
    • A custom `NEMOCLAW_OLLAMA_PROXY_PORT` containing non-numeric characters or shell metacharacters is rejected before any `bash -lc` command is built.
    • A custom `NEMOCLAW_MODEL` containing quotes or shell metacharacters cannot break the sandbox inference command, or the command is built without shell interpolation.
    • Final cleanup fails the scenario when NemoClaw destroy, OpenShell sandbox delete, gateway destroy, or auth-proxy cleanup leaves runtime state behind.
  • **Security-sensitive runtime edges need explicit negative assertions** — Add behavior-specific runtime assertions for artifact redaction, invalid proxy port/model shell metacharacters, and strict final cleanup. These should exercise the real host/sandbox boundary rather than mocks.
  • **Acceptance clause:** Migrate `test-gpu-double-onboard.sh` into the live Vitest E2E system. — add test evidence or identify existing coverage. The PR adds `test/e2e-scenario/live/gpu-double-onboard.test.ts` and a `gpu-double-onboard-vitest` workflow job. Most legacy phases are represented, but model discovery differs from the legacy shell test and final cleanup verification is weaker.
  • **Acceptance clause:** The Vitest test verifies GPU/Docker prerequisites, Ollama install, first onboard, persisted auth-proxy token, re-onboard, token consistency, auth rejection, sandbox inference, and cleanup. — add test evidence or identify existing coverage. The test runs `docker info` and `nvidia-smi`, installs Ollama, runs `install.sh --non-interactive`, checks token file presence/mode, runs re-onboard, compares tokens, checks correct-token/unauth/wrong-token proxy responses, and performs sandbox inference before and after re-onboard. Cleanup is only partially verified because cleanup failures are suppressed and only `~/.nemoclaw/sandboxes.json` is checked.
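
For the model-discovery case in the list above, a resolver could prefer an explicit override and otherwise take the first model reported by Ollama's `/api/tags`. The sketch below assumes the response has the shape `{ "models": [{ "name": ... }] }`. `resolveModel` is a hypothetical helper, and fetching the body is left to the caller.

```typescript
interface OllamaTags {
  models?: Array<{ name?: string }>;
}

/** Picks the model to test: explicit override first, else the first model listed by /api/tags. */
function resolveModel(override: string | undefined, tagsBody: string): string {
  if (override !== undefined && override.trim() !== "") return override.trim();
  const tags = JSON.parse(tagsBody) as OllamaTags;
  const first = tags.models?.find(
    (m) => typeof m.name === "string" && m.name !== "",
  )?.name;
  if (!first) {
    // Failing here points at onboarding, not at the token regression under test.
    throw new Error("no installed Ollama model found in /api/tags response");
  }
  return first;
}
```

Resolving the model once after the first onboard and reusing it after re-onboard keeps both inference assertions pointed at a model that is actually installed.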

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

github-code-quality Bot commented Jun 16, 2026


Code Coverage Overview

Languages: TypeScript

TypeScript / code-coverage/plugin

The overall coverage in the branch is 96%. Coverage data for the branch is not yet available.

Show a code coverage summary of the most covered files.
| File | c2f23f9 | +/- |
| --- | --- | --- |
| nemoclaw/src/se...cret-scanner.ts | 100% | |
| nemoclaw/src/commands/slash.ts | 100% | |
| nemoclaw/src/li...bprocess-env.ts | 100% | |
| nemoclaw/src/bl...eprint/state.ts | 98% | |
| nemoclaw/src/onboard/config.ts | 98% | |
| nemoclaw/src/bl...int/snapshot.ts | 97% | |
| nemoclaw/src/bl...print/runner.ts | 95% | |
| nemoclaw/src/co...ration-state.ts | 94% | |
| nemoclaw/src/bl...ate-networks.ts | 94% | |
| nemoclaw/src/index.ts | 94% | |

TypeScript / code-coverage/cli

The overall coverage in the branch is 46%. Coverage data for the branch is not yet available.

Show a code coverage summary of the most covered files.
File c2f23f9 +/-
src/lib/state/o...oard-session.ts 90%
src/lib/inference/local.ts 76%
src/lib/sandbox/config.ts 72%
src/lib/actions...dbox/rebuild.ts 67%
src/lib/onboard/preflight.ts 64%
src/lib/actions...licy-channel.ts 56%
src/lib/state/sandbox.ts 55%
src/lib/policy/index.ts 49%
src/lib/onboard...er-gpu-patch.ts 44%
src/lib/onboard.ts 18%

Updated June 16, 2026 21:28 UTC

@github-actions

Contributor

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 27629702812
Workflow ref: e2e-migrate/test-gpu-double-onboard
Requested scenarios: (default — all supported)
Requested jobs: gpu-double-onboard-vitest
Summary: 1 passed, 1 failed, 35 skipped

Job Result
bedrock-runtime-compatible-anthropic-vitest ⏭️ skipped
channels-add-remove-vitest ⏭️ skipped
cloud-inference-vitest ⏭️ skipped
common-egress-agent-vitest ⏭️ skipped
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
gateway-drift-preflight-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
gateway-health-honest-vitest ⏭️ skipped
generate-matrix ✅ success
gpu-double-onboard-vitest ❌ failure
hermes-e2e-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-2478-crash-loop-recovery-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
messaging-compatible-endpoint-vitest ⏭️ skipped
messaging-providers-vitest ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-inference-switch-vitest ⏭️ skipped
openclaw-skill-cli-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
sessions-agents-cli-vitest ⏭️ skipped
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
state-backup-restore-vitest ⏭️ skipped
token-rotation-vitest ⏭️ skipped

Failed jobs: gpu-double-onboard-vitest. Check run artifacts for logs.

@github-actions

Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27631600832
Workflow ref: e2e-migrate/test-gpu-double-onboard
Requested scenarios: (default — all supported)
Requested jobs: gpu-double-onboard-vitest
Summary: 2 passed, 0 failed, 35 skipped

Job Result
bedrock-runtime-compatible-anthropic-vitest ⏭️ skipped
channels-add-remove-vitest ⏭️ skipped
cloud-inference-vitest ⏭️ skipped
common-egress-agent-vitest ⏭️ skipped
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
gateway-drift-preflight-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
gateway-health-honest-vitest ⏭️ skipped
generate-matrix ✅ success
gpu-double-onboard-vitest ✅ success
hermes-e2e-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-2478-crash-loop-recovery-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
messaging-compatible-endpoint-vitest ⏭️ skipped
messaging-providers-vitest ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-inference-switch-vitest ⏭️ skipped
openclaw-skill-cli-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
sessions-agents-cli-vitest ⏭️ skipped
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
state-backup-restore-vitest ⏭️ skipped
token-rotation-vitest ⏭️ skipped

@github-actions

Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27632894484
Workflow ref: e2e-migrate/test-gpu-double-onboard
Requested scenarios: (default — all supported)
Requested jobs: gpu-double-onboard-vitest
Summary: 2 passed, 0 failed, 35 skipped

Job Result
bedrock-runtime-compatible-anthropic-vitest ⏭️ skipped
channels-add-remove-vitest ⏭️ skipped
cloud-inference-vitest ⏭️ skipped
common-egress-agent-vitest ⏭️ skipped
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
gateway-drift-preflight-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
gateway-health-honest-vitest ⏭️ skipped
generate-matrix ✅ success
gpu-double-onboard-vitest ✅ success
hermes-e2e-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-2478-crash-loop-recovery-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
messaging-compatible-endpoint-vitest ⏭️ skipped
messaging-providers-vitest ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-inference-switch-vitest ⏭️ skipped
openclaw-skill-cli-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
sessions-agents-cli-vitest ⏭️ skipped
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
state-backup-restore-vitest ⏭️ skipped
token-rotation-vitest ⏭️ skipped

@cv cv marked this pull request as ready for review June 16, 2026 17:06

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e-scenario/live/gpu-double-onboard.test.ts`:
- Around line 208-223: The test currently validates only the new token
(tokenAfterSecond) without verifying it remains unchanged across the re-onboard
cycle, allowing a token rotation regression to pass undetected. Capture the
first token from TOKEN_FILE before the re-onboard operation (prior to line 208),
then add an explicit equality assertion comparing tokenAfterSecond to this first
captured token to enforce that the token identity is preserved across
re-onboard. Additionally, reuse the first token object for the post-reonboard
authentication check to validate the same token works for both requests.
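
The check described in this comment can be sketched as a small comparison helper. This is an illustration under assumptions, not the code in the PR: `normalizeToken` and `assertTokenPreserved` are hypothetical names, and the inputs stand for the raw contents of the token file read before and after re-onboard.

```typescript
// Hypothetical helpers for illustration only; the names are not from the PR.

/** Trim surrounding whitespace and reject an empty token file. */
function normalizeToken(fileContents: string): string {
  const token = fileContents.trim();
  if (token.length === 0) {
    throw new Error("auth-proxy token file is empty");
  }
  return token;
}

/**
 * Compare the token captured before re-onboard with the one read afterwards.
 * Returns the preserved token so the caller can reuse the first value for the
 * post-re-onboard authentication probe. The error text deliberately omits
 * both token values so a failure does not leak them into logs.
 */
function assertTokenPreserved(before: string, after: string): string {
  const first = normalizeToken(before);
  const second = normalizeToken(after);
  if (first !== second) {
    throw new Error("auth-proxy token changed across re-onboard");
  }
  return first;
}
```

Inside the Vitest scenario itself, the same invariant is the `expect(tokenAfterSecond).toBe(tokenAfterFirst)` assertion quoted in the advisory review.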

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 04a46ee6-a453-4db3-82ba-edb219d47255

📥 Commits

Reviewing files that changed from the base of the PR and between 6c0fb04 and cd198f3.

📒 Files selected for processing (2)
  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/live/gpu-double-onboard.test.ts

@github-actions

Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27635624904
Workflow ref: e2e-migrate/test-gpu-double-onboard
Requested scenarios: (default — all supported)
Requested jobs: gpu-double-onboard-vitest
Summary: 2 passed, 0 failed, 35 skipped

Job Result
bedrock-runtime-compatible-anthropic-vitest ⏭️ skipped
channels-add-remove-vitest ⏭️ skipped
cloud-inference-vitest ⏭️ skipped
common-egress-agent-vitest ⏭️ skipped
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
gateway-drift-preflight-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
gateway-health-honest-vitest ⏭️ skipped
generate-matrix ✅ success
gpu-double-onboard-vitest ✅ success
hermes-e2e-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-2478-crash-loop-recovery-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
messaging-compatible-endpoint-vitest ⏭️ skipped
messaging-providers-vitest ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-inference-switch-vitest ⏭️ skipped
openclaw-skill-cli-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
sessions-agents-cli-vitest ⏭️ skipped
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
state-backup-restore-vitest ⏭️ skipped
token-rotation-vitest ⏭️ skipped

@github-actions

Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27636905667
Workflow ref: e2e-migrate/test-gpu-double-onboard
Requested scenarios: (default — all supported)
Requested jobs: gpu-double-onboard-vitest
Summary: 2 passed, 0 failed, 35 skipped

Job Result
bedrock-runtime-compatible-anthropic-vitest ⏭️ skipped
channels-add-remove-vitest ⏭️ skipped
cloud-inference-vitest ⏭️ skipped
common-egress-agent-vitest ⏭️ skipped
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
gateway-drift-preflight-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
gateway-health-honest-vitest ⏭️ skipped
generate-matrix ✅ success
gpu-double-onboard-vitest ✅ success
hermes-e2e-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-2478-crash-loop-recovery-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
messaging-compatible-endpoint-vitest ⏭️ skipped
messaging-providers-vitest ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-inference-switch-vitest ⏭️ skipped
openclaw-skill-cli-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
sessions-agents-cli-vitest ⏭️ skipped
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
state-backup-restore-vitest ⏭️ skipped
token-rotation-vitest ⏭️ skipped

…double-onboard

# Conflicts:
#	.github/workflows/e2e-vitest-scenarios.yaml
@github-actions

Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27638943887
Workflow ref: e2e-migrate/test-gpu-double-onboard
Requested scenarios: (default — all supported)
Requested jobs: gpu-double-onboard-vitest
Summary: 2 passed, 0 failed, 36 skipped

Job Result
bedrock-runtime-compatible-anthropic-vitest ⏭️ skipped
channels-add-remove-vitest ⏭️ skipped
cloud-inference-vitest ⏭️ skipped
common-egress-agent-vitest ⏭️ skipped
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
gateway-drift-preflight-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
gateway-health-honest-vitest ⏭️ skipped
generate-matrix ✅ success
gpu-double-onboard-vitest ✅ success
hermes-e2e-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-2478-crash-loop-recovery-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
messaging-compatible-endpoint-vitest ⏭️ skipped
messaging-providers-vitest ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
onboard-resume-vitest ⏭️ skipped
openclaw-inference-switch-vitest ⏭️ skipped
openclaw-skill-cli-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
sessions-agents-cli-vitest ⏭️ skipped
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
state-backup-restore-vitest ⏭️ skipped
token-rotation-vitest ⏭️ skipped

…double-onboard

# Conflicts:
#	.github/workflows/e2e-vitest-scenarios.yaml

@coderabbitai coderabbitai Bot left a comment

Contributor

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.github/workflows/e2e-vitest-scenarios.yaml (1)

1939-1951: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add Docker Hub authentication to reduce GPU E2E flakiness.

This job skips the Docker Hub login guard used by most other live onboarding jobs in this workflow. On GPU runners, anonymous pull limits can make this scenario fail intermittently.

Suggested patch
   gpu-double-onboard-vitest:
@@
     steps:
       - uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
         with:
           persist-credentials: false
+
+      - name: Authenticate to Docker Hub
+        env:
+          DOCKERHUB_USERNAME: ${{ secrets.DOCKERHUB_USERNAME }}
+          DOCKERHUB_TOKEN: ${{ secrets.DOCKERHUB_TOKEN }}
+        shell: bash
+        run: |
+          set -euo pipefail
+          if [[ -z "${DOCKERHUB_USERNAME}" || -z "${DOCKERHUB_TOKEN}" ]]; then
+            echo "::notice::Docker Hub credentials not configured; continuing with anonymous pulls."
+            exit 0
+          fi
+          login_succeeded=0
+          for attempt in 1 2 3; do
+            if echo "${DOCKERHUB_TOKEN}" | timeout 30s docker login docker.io --username "${DOCKERHUB_USERNAME}" --password-stdin; then
+              login_succeeded=1
+              break
+            fi
+            if [[ "$attempt" -lt 3 ]]; then
+              echo "::warning::Docker Hub login attempt ${attempt} failed; retrying."
+              sleep 5
+            fi
+          done
+          if [[ "$login_succeeded" -ne 1 ]]; then
+            echo "::warning::Docker Hub login failed after 3 attempts; continuing with anonymous pulls."
+          fi
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/e2e-vitest-scenarios.yaml around lines 1939 - 1951, Add
Docker Hub authentication to this job to prevent intermittent pull failures on
GPU runners due to anonymous rate limits. After the "Set up Node" step and
before the "Install root dependencies" step, insert a new step that
authenticates with Docker Hub using the docker/login-action GitHub Action with
appropriate credentials. This aligns the authentication approach with other live
onboarding jobs in the workflow that use the Docker Hub login guard.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d4b06322-5678-48f6-9007-13d5e18d990a

📥 Commits

Reviewing files that changed from the base of the PR and between 6b837fa and 28499df.

📒 Files selected for processing (1)
  • .github/workflows/e2e-vitest-scenarios.yaml

…double-onboard

# Conflicts:
#	.github/workflows/e2e-vitest-scenarios.yaml
@github-actions

Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27649245791
Workflow ref: e2e-migrate/test-gpu-double-onboard
Requested scenarios: (default — all supported)
Requested jobs: gpu-double-onboard-vitest
Summary: 2 passed, 0 failed, 38 skipped

Job Result
bedrock-runtime-compatible-anthropic-vitest ⏭️ skipped
channels-add-remove-vitest ⏭️ skipped
cloud-inference-vitest ⏭️ skipped
cloud-onboard-vitest ⏭️ skipped
common-egress-agent-vitest ⏭️ skipped
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
full-e2e-vitest ⏭️ skipped
gateway-drift-preflight-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
gateway-health-honest-vitest ⏭️ skipped
generate-matrix ✅ success
gpu-double-onboard-vitest ✅ success
hermes-e2e-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-2478-crash-loop-recovery-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
messaging-compatible-endpoint-vitest ⏭️ skipped
messaging-providers-vitest ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
onboard-resume-vitest ⏭️ skipped
openclaw-inference-switch-vitest ⏭️ skipped
openclaw-skill-cli-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
sessions-agents-cli-vitest ⏭️ skipped
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
state-backup-restore-vitest ⏭️ skipped
token-rotation-vitest ⏭️ skipped
