Skip to content

test(e2e): add inference routing Vitest coverage [ANCHOR-3]#5231

Merged
cv merged 15 commits into
mainfrom
e2e-migrate/test-inference-routing
Jun 12, 2026
Merged

test(e2e): add inference routing Vitest coverage [ANCHOR-3]#5231
cv merged 15 commits into
mainfrom
e2e-migrate/test-inference-routing

Conversation

@jyaunches

@jyaunches jyaunches commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Summary

Migrate test/e2e/test-inference-routing.sh with the simplest equivalent live Vitest coverage.

Related Issues

Refs #5098

Contract mapping

  • Legacy assertion: TC-INF-06 invalid NVIDIA API key fails with credential-classified output, no raw stack trace, no key exposure, and no active sandbox left behind.
    • Replacement: test/e2e-scenario/live/inference-routing.test.ts / TC-INF-06 invalid API key fails with credential classification and cleanup
    • Boundary preserved: real node bin/nemoclaw.js onboard on a Docker/OpenShell-capable runner.
  • Legacy assertion: TC-INF-07 unreachable custom endpoint fails with transport-classified output, no raw stack trace, and no active sandbox left behind.
    • Replacement: test/e2e-scenario/live/inference-routing.test.ts / TC-INF-07 unreachable endpoint fails with transport classification and cleanup
    • Boundary preserved: real non-interactive onboard with custom endpoint env.
  • Legacy assertion: TC-INF-05 real NVIDIA_API_KEY is absent from sandbox env, process list, and sampled filesystem, and sandbox placeholder is not the real key.
    • Replacement: test/e2e-scenario/live/inference-routing.test.ts / TC-INF-05 real NVIDIA key is isolated from sandbox env, process list, and filesystem
    • Boundary preserved: real onboard plus openshell sandbox exec probes.
  • Legacy assertion: TC-INF-02/03/09 optional provider smokes route through https://inference.local for OpenAI, Anthropic, and compatible endpoints when their secrets/opt-in env are present.
    • Replacement: same Vitest file, provider-specific tests gated by env/secrets.
    • Boundary preserved: real sandbox-side curl to inference.local.

Simplicity check

  • Test shape: simple live Vitest test.
  • New shared helpers: none; one-off process/redaction helpers stay local to the test.
  • New framework/registry/ledger: none.
  • Workflow changes: adds a PR-safe inference-routing-vitest free-standing live job; legacy shell script and nightly shell lane remain for Epic: Migrate legacy bash E2E into the Vitest E2E system #5098 Phase 11 retirement.

Verification

  • npm run build:cli
  • npx @biomejs/biome lint test/e2e-scenario/live/inference-routing.test.ts
  • npx tsc --noEmit --target ES2022 --module NodeNext --moduleResolution NodeNext --types node,vitest/globals --skipLibCheck --allowImportingTsExtensions test/e2e-scenario/live/inference-routing.test.ts
  • NEMOCLAW_RUN_E2E_SCENARIOS=1 npx vitest run --project e2e-scenarios-live test/e2e-scenario/live/inference-routing.test.ts --silent=false --reporter=default
  • npx vitest run test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts --project e2e-vitest-support
  • Live Docker-enabled validation: https://github.com/NVIDIA/NemoClaw/actions/runs/27350901619 (inference-routing-vitest passed)

Summary by CodeRabbit

  • New Features

    • Added a live inference-routing E2E run with artifact upload (14-day retention) and included its result in aggregated PR reports.
  • Tests

    • Comprehensive inference-routing smoke tests covering sandbox lifecycle, provider routing, timeouts, output capture, credential-exposure checks, and provider request validations.
    • Expanded workflow-boundary tests to validate the new job selector and matrix behavior.
  • Chores

    • Added selectable live-test job input with per-job validation; live-scenarios matrix now only runs when no job input is supplied.
    • Switched workflow step Python invocations to python3.

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c971d81c-f8a5-4d46-b2b4-cba0d2eb3fc7

📥 Commits

Reviewing files that changed from the base of the PR and between ef988df and d8fba9c.

📒 Files selected for processing (3)
  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
  • tools/e2e-scenarios/workflow-boundary.mts
💤 Files with no reviewable changes (1)
  • tools/e2e-scenarios/workflow-boundary.mts

📝 Walkthrough

Walkthrough

Adds a Vitest live E2E inference-routing test suite with sandbox lifecycle, subprocess execution, credential-isolation and provider smoke checks, and integrates it into the e2e Vitest GitHub Actions workflow with a new inference-routing-vitest job plus validator/test updates for dispatch/job selection.

Changes

E2E Inference Routing Tests

Layer / File(s) Summary
Test Foundation and Configuration
test/e2e-scenario/live/inference-routing.test.ts
Test initialization, repository/entrypoint paths, environment-based gating, provider smoke opt-in, skip helpers, result types, redaction and failure-classification regexes.
Subprocess Execution Pipeline
test/e2e-scenario/live/inference-routing.test.ts
Implements runRawCommand with detached process groups, timed SIGTERM→SIGKILL, stdout/stderr capture and redaction, JSON/text artifact emission; adds runNemoclawCli and runOpenShell wrappers.
Test Lifecycle and Sandbox Orchestration
test/e2e-scenario/live/inference-routing.test.ts
Prerequisite checks (built CLI/Docker/OpenShell), sandbox cleanup modes with evidence collection, onboarding-state clearing, and assert-no-active-sandbox helpers.
Onboard Helper and Timeouts
test/e2e-scenario/live/inference-routing.test.ts
onboardSandbox implementation, timeout handling, and success/failure assertions for onboarding CLI runs.
Provider Response Parsing & In-Sandbox Requests
test/e2e-scenario/live/inference-routing.test.ts
JSON parsing and provider-specific extractors plus sandbox-executed curl helpers targeting https://inference.local.
Credential Isolation & Failure Classification Tests
test/e2e-scenario/live/inference-routing.test.ts
TC-INF-06 (invalid key), TC-INF-07 (unreachable endpoint), TC-INF-05 (real-key isolation via env/process/filesystem canary).
Provider Integration Smoke Tests
test/e2e-scenario/live/inference-routing.test.ts
TC-INF-02 (OpenAI), TC-INF-03 (Anthropic), TC-INF-09 (OpenAI-compatible) onboard-and-request smoke tests asserting non-empty provider content.

CI Workflow & Validators

Layer / File(s) Summary
Workflow allowlist & matrix wiring
.github/workflows/e2e-vitest-scenarios.yaml
Adds inference-routing-vitest to validate-jobs allowlist, updates generate-matrix allowlist/free-standing scenario list, and switches Python invocations to python3 for summary steps.
New inference-routing-vitest job
.github/workflows/e2e-vitest-scenarios.yaml
New job that sets env, checks out code, sets up Node 22, installs deps, builds the CLI, runs test/e2e-scenario/live/inference-routing.test.ts under e2e-scenarios-live, and uploads artifacts to e2e-artifacts/vitest/inference-routing/ with 14-day retention.
Workflow boundary test updates
test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
Expands the in-test workflow fixture to accept inference-routing / inference-routing-vitest selections and adds matrix-generation assertions for those selectors.
Workflow-boundary validator changes
tools/e2e-scenarios/workflow-boundary.mts
Adds mapping for inference-routinginference-routing-vitest, validates validate-jobs and generate-matrix content for the new job/scenario, and requires inference-routing-vitest in report-to-pr needs.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#5243: Earlier additions to Vitest job-selector plumbing and workflow-boundary adjustments that this PR builds on.
  • NVIDIA/NemoClaw#5152: Overlapping workflow-boundary and free-standing job registration changes.
  • NVIDIA/NemoClaw#5233: Related free-standing Vitest job additions and report aggregation wiring.

Suggested labels

area: e2e, area: inference, area: ci

Suggested reviewers

  • cv
  • prekshivyas

Poem

🐰 In burrows bright the tests take flight,
Sandboxes spin through day and night,
Keys hidden safe, canaries survive,
Curl requests hop, responses arrive,
A rabbit cheers: the routes keep right!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly and clearly describes the main change: adding inference routing test coverage using Vitest in the e2e testing suite.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch e2e-migrate/test-inference-routing

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

Copy link
Copy Markdown
Contributor

This repository limits contributors to 10 open pull requests. Please close or merge existing PRs before opening new ones.

@github-actions github-actions Bot closed this Jun 11, 2026
@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: inference-routing-vitest
Optional E2E: issue-4434-tui-unreachable-inference-vitest

Dispatch hint: inference-routing-vitest

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • inference-routing-vitest (medium): This PR introduces and wires the new inference-routing live Vitest job; run it directly to validate the workflow selector, CLI build path, OpenShell/Docker prerequisites, invalid-key and unreachable-endpoint behavior, artifact upload root, and cleanup contract.

Optional E2E

  • issue-4434-tui-unreachable-inference-vitest (high): Adjacent inference-unreachable coverage through the OpenClaw TUI/gateway path. Useful confidence if reviewers want to verify the new inference-routing lane does not diverge from the existing unreachable-inference user-flow coverage, but it is expensive and not required because no runtime TUI or gateway code changed.

New E2E recommendations

  • None.

Dispatch hint

  • Workflow: .github/workflows/e2e-vitest-scenarios.yaml
  • jobs input: inference-routing-vitest

@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: inference-routing-vitest
Optional Vitest E2E scenarios: None

Dispatch required Vitest E2E scenarios:

  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=inference-routing-vitest

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required Vitest E2E scenarios

  • inference-routing-vitest: Focused free-standing Vitest job wired for changed live test test/e2e-scenario/live/inference-routing.test.ts.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=inference-routing-vitest

Optional Vitest E2E scenarios

  • None.

Relevant changed files

  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/live/inference-routing.test.ts
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
  • tools/e2e-scenarios/workflow-boundary.mts

@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

PR Review Advisor

Findings: 0 needs attention, 3 worth checking, 0 nice ideas
Since last review: 0 prior items resolved, 4 still apply, 0 new items found

Review findings

🛠️ Needs attention

  • None.

🔎 Worth checking

  • Redact compatible endpoint URLs as sensitive artifact data (test/e2e-scenario/live/inference-routing.test.ts:939): The compatible-endpoint smoke writes the configured endpoint URL to scenario artifacts and only includes the compatible API key in explicit redaction values. If a future secret-backed lane supplies an endpoint URL containing userinfo, query tokens, fragments, or another credential that differs from COMPATIBLE_API_KEY, scenario JSON or raw command evidence can preserve that secret.
    • Recommendation: Treat NEMOCLAW_ENDPOINT_URL as sensitive when writing artifacts and raw command evidence. Redact the full URL, or store only a parsed non-sensitive origin/host with username, password, query, and fragment stripped. Include the endpoint URL itself in redactionValues for compatible endpoint onboard/probe commands when it may contain credentials.
    • Evidence: TC-INF-09 writes endpointUrl: redactString(endpointUrl, [apiKey]); the compatible endpoint onboardSandbox call passes redactionValues [apiKey] while endpointUrl comes from process.env.NEMOCLAW_ENDPOINT_URL.
  • Extend workflow-boundary governance to the new inference-routing job (tools/e2e-scenarios/workflow-boundary.mts:1333): The workflow YAML for inference-routing-vitest currently uses the expected shape, but the deterministic workflow-boundary validator only checks selector/needs behavior for this new job. It does not make the job's pinned actions, checkout credential persistence, npm ci --ignore-scripts, artifact safety, secret-env exclusion, or exact test target a source of truth for future edits.
    • Recommendation: Add a dedicated validateInferenceRoutingVitestJob validator with the same security/artifact/job-selector invariants used for other free-standing live jobs, plus negative support tests for missing needs, unpinned actions, persist-credentials true, unsafe artifact paths, secret env exposure, and missing report-to-pr dependency.
    • Evidence: .github/workflows/e2e-vitest-scenarios.yaml adds inference-routing-vitest and report-to-pr.needs includes it, but tools/e2e-scenarios/workflow-boundary.mts calls only validateFreeStandingJobSelector(errors, jobs, "inference-routing-vitest", "inference-routing").
  • Cover and document the provider-smoke explicit opt-in behavior (test/e2e-scenario/live/inference-routing.test.ts:46): The migration intentionally changes optional provider smokes from legacy run-when-secret-exists behavior to run only when both secrets and NEMOCLAW_INFERENCE_ROUTING_PROVIDER_SMOKE opt-in are present. That is a reasonable quota-safety boundary, but it is a behavior/source-of-truth change without deterministic coverage proving the selector matrix or a documented removal condition.
    • Recommendation: Add a small deterministic helper-level test showing provider smokes skip when a provider secret exists but the selector is unset, run only for the requested provider, and run for all providers when the selector is all. Also document the source boundary and whether this opt-in is permanent or removable once a formal secret-backed provider-smoke lane exists.
    • Evidence: shouldRunProviderSmoke() requires 1, true, all, or the provider name even when the provider API key is present, while the legacy shell comments state provider smokes skipped only when required API keys were not set.

🌱 Nice ideas

  • None.
Consider writing more tests for
  • **Runtime validation** — workflow-boundary rejects inference-routing-vitest without validate-jobs and generate-matrix needs. The change crosses GitHub Actions trusted workflow boundaries, live Docker/OpenShell sandbox lifecycle, credential isolation, inference routing, and artifact redaction. The PR adds useful live coverage, but deterministic support tests should lock down the new workflow job and sensitive endpoint artifact behavior.
  • **Runtime validation** — workflow-boundary rejects inference-routing-vitest with unpinned checkout setup-node or upload-artifact actions. The change crosses GitHub Actions trusted workflow boundaries, live Docker/OpenShell sandbox lifecycle, credential isolation, inference routing, and artifact redaction. The PR adds useful live coverage, but deterministic support tests should lock down the new workflow job and sensitive endpoint artifact behavior.
  • **Runtime validation** — workflow-boundary rejects inference-routing-vitest when checkout persist-credentials is true. The change crosses GitHub Actions trusted workflow boundaries, live Docker/OpenShell sandbox lifecycle, credential isolation, inference routing, and artifact redaction. The PR adds useful live coverage, but deterministic support tests should lock down the new workflow job and sensitive endpoint artifact behavior.
  • **Runtime validation** — workflow-boundary rejects inference-routing-vitest when npm install replaces npm ci --ignore-scripts. The change crosses GitHub Actions trusted workflow boundaries, live Docker/OpenShell sandbox lifecycle, credential isolation, inference routing, and artifact redaction. The PR adds useful live coverage, but deterministic support tests should lock down the new workflow job and sensitive endpoint artifact behavior.
  • **Runtime validation** — workflow-boundary rejects inference-routing-vitest when NVIDIA_API_KEY or GitHub tokens are exposed outside an intentional secret-backed run step. The change crosses GitHub Actions trusted workflow boundaries, live Docker/OpenShell sandbox lifecycle, credential isolation, inference routing, and artifact redaction. The PR adds useful live coverage, but deterministic support tests should lock down the new workflow job and sensitive endpoint artifact behavior.
  • **Cover and document the provider-smoke explicit opt-in behavior** — Add a small deterministic helper-level test showing provider smokes skip when a provider secret exists but the selector is unset, run only for the requested provider, and run for all providers when the selector is all. Also document the source boundary and whether this opt-in is permanent or removable once a formal secret-backed provider-smoke lane exists.
  • **Acceptance clause:** No linked issue clauses were available in deterministic context. — add test evidence or identify existing coverage. The trusted context reports linkedIssues: []; PR body says Refs Epic: Migrate legacy bash E2E into the Vitest E2E system #5098, but no issue text or comments were provided for literal clause extraction.
  • **Acceptance clause:** Legacy assertion: TC-INF-05 real NVIDIA_API_KEY is absent from sandbox env, process list, and sampled filesystem, and sandbox placeholder is not the real key. — add test evidence or identify existing coverage. The TC-INF-05 test performs real onboard and sandbox env/process/filesystem probes when NVIDIA_API_KEY is present, but the new workflow job does not pass NVIDIA_API_KEY, so this slice skips in the PR-safe workflow by default.
Since last review details

Current findings:

  • Redact compatible endpoint URLs as sensitive artifact data (test/e2e-scenario/live/inference-routing.test.ts:939): The compatible-endpoint smoke writes the configured endpoint URL to scenario artifacts and only includes the compatible API key in explicit redaction values. If a future secret-backed lane supplies an endpoint URL containing userinfo, query tokens, fragments, or another credential that differs from COMPATIBLE_API_KEY, scenario JSON or raw command evidence can preserve that secret.
    • Recommendation: Treat NEMOCLAW_ENDPOINT_URL as sensitive when writing artifacts and raw command evidence. Redact the full URL, or store only a parsed non-sensitive origin/host with username, password, query, and fragment stripped. Include the endpoint URL itself in redactionValues for compatible endpoint onboard/probe commands when it may contain credentials.
    • Evidence: TC-INF-09 writes endpointUrl: redactString(endpointUrl, [apiKey]); the compatible endpoint onboardSandbox call passes redactionValues [apiKey] while endpointUrl comes from process.env.NEMOCLAW_ENDPOINT_URL.
  • Extend workflow-boundary governance to the new inference-routing job (tools/e2e-scenarios/workflow-boundary.mts:1333): The workflow YAML for inference-routing-vitest currently uses the expected shape, but the deterministic workflow-boundary validator only checks selector/needs behavior for this new job. It does not make the job's pinned actions, checkout credential persistence, npm ci --ignore-scripts, artifact safety, secret-env exclusion, or exact test target a source of truth for future edits.
    • Recommendation: Add a dedicated validateInferenceRoutingVitestJob validator with the same security/artifact/job-selector invariants used for other free-standing live jobs, plus negative support tests for missing needs, unpinned actions, persist-credentials true, unsafe artifact paths, secret env exposure, and missing report-to-pr dependency.
    • Evidence: .github/workflows/e2e-vitest-scenarios.yaml adds inference-routing-vitest and report-to-pr.needs includes it, but tools/e2e-scenarios/workflow-boundary.mts calls only validateFreeStandingJobSelector(errors, jobs, "inference-routing-vitest", "inference-routing").
  • Cover and document the provider-smoke explicit opt-in behavior (test/e2e-scenario/live/inference-routing.test.ts:46): The migration intentionally changes optional provider smokes from legacy run-when-secret-exists behavior to run only when both secrets and NEMOCLAW_INFERENCE_ROUTING_PROVIDER_SMOKE opt-in are present. That is a reasonable quota-safety boundary, but it is a behavior/source-of-truth change without deterministic coverage proving the selector matrix or a documented removal condition.
    • Recommendation: Add a small deterministic helper-level test showing provider smokes skip when a provider secret exists but the selector is unset, run only for the requested provider, and run for all providers when the selector is all. Also document the source boundary and whether this opt-in is permanent or removable once a formal secret-backed provider-smoke lane exists.
    • Evidence: shouldRunProviderSmoke() requires 1, true, all, or the provider name even when the provider API key is present, while the legacy shell comments state provider smokes skipped only when required API keys were not set.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@jyaunches jyaunches reopened this Jun 11, 2026
@github-actions

Copy link
Copy Markdown
Contributor

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 27354444694
Workflow ref: e2e-migrate/test-inference-routing
Requested scenarios: (default — all supported)
Summary: 4 passed, 2 failed, 0 skipped

Job Result
gateway-guard-recovery ❌ failure
generate-matrix ✅ success
live-scenarios ✅ success
onboard-negative-paths-vitest ✅ success
openclaw-tui-chat-correlation-vitest ❌ failure
openshell-version-pin-vitest ✅ success

Failed jobs: gateway-guard-recovery, openclaw-tui-chat-correlation-vitest. Check run artifacts for logs.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/e2e-vitest-scenarios.yaml:
- Around line 258-303: The new GitHub Actions job inference-routing-vitest is
omitted from the report-to-pr dependency list so PR result aggregation can
finish before this job completes; update the report-to-pr job's needs array to
include inference-routing-vitest (add the job name inference-routing-vitest to
the needs/dependencies referenced by report-to-pr) so the PR comment waits for
and reflects this job's status.

In `@test/e2e-scenario/live/inference-routing.test.ts`:
- Around line 869-871: The code uses the nullish coalescing operator (??) so an
empty NEMOCLAW_ENDPOINT_URL string won't trigger skipLive; change the resolution
to treat empty/whitespace as missing by trimming and checking truthiness before
calling skipLive. For example, read process.env.NEMOCLAW_ENDPOINT_URL into a
variable, do const endpointUrl = raw?.trim() && raw.trim() || skipLive(skip,
"Missing NEMOCLAW_ENDPOINT_URL, NEMOCLAW_COMPAT_MODEL, or COMPATIBLE_API_KEY");
so the variable endpointUrl (and the env var name NEMOCLAW_ENDPOINT_URL) and
function skipLive/skip are the reference points to find and update.
- Around line 351-355: The test currently flags any occurrence of "running" or
"ready" in resultText(status) as active even when negated; update the assertion
that uses resultText(status) and sandboxName so it only treats standalone
positive statuses as active — e.g., mirror the stricter status check used
earlier by matching whole words (word boundaries) or additionally ensuring
phrases like "not running"/"not ready" are excluded before declaring the sandbox
active; modify the expect call that wraps /running|ready/i.test(text) to use
this stricter logic.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 68a81c62-b111-483c-9094-a91aba1ed4c5

📥 Commits

Reviewing files that changed from the base of the PR and between be0ed7b and c63d311.

📒 Files selected for processing (2)
  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/live/inference-routing.test.ts

Comment thread .github/workflows/e2e-vitest-scenarios.yaml
Comment on lines +351 to +355
const text = resultText(status);
expect(
/running|ready/i.test(text),
`sandbox '${sandboxName}' is still active after failed onboard: ${text}`,
).toBe(false);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t treat negated status text as an active sandbox.

/running|ready/i also matches "not running" and "not ready", so a successful cleanup can still fail this assertion if the CLI reports the inactive state in prose. Mirror the stricter status handling used above, or at least exclude negated phrases before declaring the sandbox active.

Suggested fix
 async function expectNoActiveSandbox(host: HostCliClient, sandboxName: string): Promise<void> {
   const status = await host.command(process.execPath, [CLI_ENTRYPOINT, sandboxName, "status"], {
     artifactName: `post-failure-status-${sandboxName}`,
     env: buildAvailabilityProbeEnv(),
     timeoutMs: 30_000,
   });
   const text = resultText(status);
+  const reportsActive =
+    /\b(?:running|ready)\b/i.test(text) && !/\bnot\s+(?:running|ready)\b/i.test(text);
   expect(
-    /running|ready/i.test(text),
+    reportsActive,
     `sandbox '${sandboxName}' is still active after failed onboard: ${text}`,
   ).toBe(false);
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e-scenario/live/inference-routing.test.ts` around lines 351 - 355, The
test currently flags any occurrence of "running" or "ready" in
resultText(status) as active even when negated; update the assertion that uses
resultText(status) and sandboxName so it only treats standalone positive
statuses as active — e.g., mirror the stricter status check used earlier by
matching whole words (word boundaries) or additionally ensuring phrases like
"not running"/"not ready" are excluded before declaring the sandbox active;
modify the expect call that wraps /running|ready/i.test(text) to use this
stricter logic.

Comment on lines +869 to +871
const endpointUrl =
process.env.NEMOCLAW_ENDPOINT_URL ??
skipLive(skip, "Missing NEMOCLAW_ENDPOINT_URL, NEMOCLAW_COMPAT_MODEL, or COMPATIBLE_API_KEY");

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Treat an empty NEMOCLAW_ENDPOINT_URL as missing.

?? only skips on null/undefined. If the env var is present but empty, this path tries to onboard with a blank endpoint instead of cleanly skipping the smoke test.

Suggested fix
-    const endpointUrl =
-      process.env.NEMOCLAW_ENDPOINT_URL ??
+    const endpointUrl =
+      process.env.NEMOCLAW_ENDPOINT_URL?.trim() ||
       skipLive(skip, "Missing NEMOCLAW_ENDPOINT_URL, NEMOCLAW_COMPAT_MODEL, or COMPATIBLE_API_KEY");
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const endpointUrl =
process.env.NEMOCLAW_ENDPOINT_URL ??
skipLive(skip, "Missing NEMOCLAW_ENDPOINT_URL, NEMOCLAW_COMPAT_MODEL, or COMPATIBLE_API_KEY");
const endpointUrl =
process.env.NEMOCLAW_ENDPOINT_URL?.trim() ||
skipLive(skip, "Missing NEMOCLAW_ENDPOINT_URL, NEMOCLAW_COMPAT_MODEL, or COMPATIBLE_API_KEY");
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e-scenario/live/inference-routing.test.ts` around lines 869 - 871, The
code uses the nullish coalescing operator (??) so an empty NEMOCLAW_ENDPOINT_URL
string won't trigger skipLive; change the resolution to treat empty/whitespace
as missing by trimming and checking truthiness before calling skipLive. For
example, read process.env.NEMOCLAW_ENDPOINT_URL into a variable, do const
endpointUrl = raw?.trim() && raw.trim() || skipLive(skip, "Missing
NEMOCLAW_ENDPOINT_URL, NEMOCLAW_COMPAT_MODEL, or COMPATIBLE_API_KEY"); so the
variable endpointUrl (and the env var name NEMOCLAW_ENDPOINT_URL) and function
skipLive/skip are the reference points to find and update.

@jyaunches jyaunches changed the title test(e2e): add inference routing Vitest coverage test(e2e): P1 anchor 3 migrate test-inference-routing.sh to vitest Jun 11, 2026
@github-actions

Copy link
Copy Markdown
Contributor

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 27355075383
Workflow ref: e2e-migrate/test-inference-routing
Requested scenarios: (default — all supported)
Summary: 1 passed, 1 failed, 4 skipped

Job Result
gateway-guard-recovery ❌ failure
generate-matrix ✅ success
live-scenarios ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped

Failed jobs: gateway-guard-recovery. Check run artifacts for logs.

@github-actions

Copy link
Copy Markdown
Contributor

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 27355314519
Workflow ref: e2e-migrate/test-inference-routing
Requested scenarios: (default — all supported)
Summary: 1 passed, 1 failed, 4 skipped

Job Result
gateway-guard-recovery ❌ failure
generate-matrix ✅ success
live-scenarios ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped

Failed jobs: gateway-guard-recovery. Check run artifacts for logs.

@jyaunches jyaunches changed the title test(e2e): P1 anchor 3 migrate test-inference-routing.sh to vitest test(e2e): add inference routing Vitest coverage Jun 11, 2026
@github-actions

Copy link
Copy Markdown
Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27355680619
Workflow ref: e2e-migrate/test-inference-routing
Requested scenarios: (default — all supported)
Summary: 2 passed, 0 failed, 5 skipped

Job Result
gateway-guard-recovery ⏭️ skipped
generate-matrix ✅ success
inference-routing-vitest ✅ success
live-scenarios ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tools/e2e-scenarios/workflow-boundary.mts (1)

417-419: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Require workflow_dispatch.inputs.jobs explicitly.

At Line 418, the validator only requires scenarios, but job gating now depends on inputs.jobs (enforced by requireFreeStandingJobSelector). Without validating the input exists, this boundary check can pass a workflow that violates the selector contract.

Suggested fix
   const dispatchInputs = asRecord(workflowDispatch.inputs);
   requireInput(errors, dispatchInputs, "scenarios");
+  requireInput(errors, dispatchInputs, "jobs");
   if (Object.hasOwn(dispatchInputs, "test_filter")) {
     errors.push("workflow_dispatch must not expose legacy test_filter input");
   }

Based on learnings from the provided PR objectives/review stack context, this PR introduces workflow dispatch/job selection via inputs.jobs, so the validator should assert that contract directly.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/e2e-scenarios/workflow-boundary.mts` around lines 417 - 419, The
validator currently only requires "scenarios" (via requireInput(errors,
dispatchInputs, "scenarios")) but must also assert the presence of
workflowDispatch.inputs.jobs because job gating relies on it; update the
validation to call requireInput for "jobs" (or otherwise check
Object.hasOwn(dispatchInputs, "jobs")) before invoking
requireFreeStandingJobSelector so the selector contract is enforced (refer to
dispatchInputs, requireInput, requireFreeStandingJobSelector and
workflowDispatch.inputs.jobs).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tools/e2e-scenarios/workflow-boundary.mts`:
- Around line 307-309: The current check only trims the whole uploadPath string
and misses multiline values; update the validator around uploadPath and errors
so you split uploadPath into lines (e.g. uploadPath.split(/\r?\n/).map(s =>
s.trim())), then fail when any line equals "e2e-artifacts/vitest/" or when any
line startsWith("e2e-artifacts/vitest/") while none of the lines
startWith("e2e-artifacts/vitest/inference-routing/"); push the same error
message into errors for the inference-routing-vitest case.

---

Outside diff comments:
In `@tools/e2e-scenarios/workflow-boundary.mts`:
- Around line 417-419: The validator currently only requires "scenarios" (via
requireInput(errors, dispatchInputs, "scenarios")) but must also assert the
presence of workflowDispatch.inputs.jobs because job gating relies on it; update
the validation to call requireInput for "jobs" (or otherwise check
Object.hasOwn(dispatchInputs, "jobs")) before invoking
requireFreeStandingJobSelector so the selector contract is enforced (refer to
dispatchInputs, requireInput, requireFreeStandingJobSelector and
workflowDispatch.inputs.jobs).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f5c4a24c-5706-4859-a694-38841868d977

📥 Commits

Reviewing files that changed from the base of the PR and between c63d311 and c3b9a68.

📒 Files selected for processing (4)
  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/live/inference-routing.test.ts
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
  • tools/e2e-scenarios/workflow-boundary.mts
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/e2e-scenario/live/inference-routing.test.ts

Comment on lines +307 to +309
if (uploadPath.trim() === "e2e-artifacts/vitest/") {
errors.push("inference-routing-vitest artifact upload path must not list all Vitest artifacts");
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Harden inference artifact path validation for multiline path values.

At Line 307, uploadPath.trim() === "e2e-artifacts/vitest/" only catches a single-line exact value. A multiline path that includes both e2e-artifacts/vitest/ and e2e-artifacts/vitest/inference-routing/ bypasses this check and can still upload overly broad artifacts.

Suggested fix
-  if (uploadPath.trim() === "e2e-artifacts/vitest/") {
-    errors.push("inference-routing-vitest artifact upload path must not list all Vitest artifacts");
-  }
+  for (const line of uploadPath.split("\n")) {
+    if (line.trim() === "e2e-artifacts/vitest/") {
+      errors.push("inference-routing-vitest artifact upload path must not list all Vitest artifacts");
+    }
+  }

Based on learnings from the provided workflow contract snippet, this validator is intended to enforce scoped artifact upload for inference-routing-vitest, so broad parent-path inclusion should be blocked robustly.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (uploadPath.trim() === "e2e-artifacts/vitest/") {
errors.push("inference-routing-vitest artifact upload path must not list all Vitest artifacts");
}
for (const line of uploadPath.split("\n")) {
if (line.trim() === "e2e-artifacts/vitest/") {
errors.push("inference-routing-vitest artifact upload path must not list all Vitest artifacts");
}
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/e2e-scenarios/workflow-boundary.mts` around lines 307 - 309, The
current check only trims the whole uploadPath string and misses multiline
values; update the validator around uploadPath and errors so you split
uploadPath into lines (e.g. uploadPath.split(/\r?\n/).map(s => s.trim())), then
fail when any line equals "e2e-artifacts/vitest/" or when any line
startsWith("e2e-artifacts/vitest/") while none of the lines
startWith("e2e-artifacts/vitest/inference-routing/"); push the same error
message into errors for the inference-routing-vitest case.

@jyaunches jyaunches changed the title test(e2e): add inference routing Vitest coverage test(e2e): add inference routing Vitest coverage [ANCHOR-3] Jun 11, 2026
@github-actions

Copy link
Copy Markdown
Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27356301918
Workflow ref: e2e-migrate/test-inference-routing
Requested scenarios: (default — all supported)
Summary: 3 passed, 0 failed, 5 skipped

Job Result
gateway-guard-recovery ⏭️ skipped
generate-matrix ✅ success
inference-routing-vitest ✅ success
live-scenarios ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
validate-jobs ✅ success

@cv cv added the v0.0.64 Release target label Jun 11, 2026
…rence-routing

# Conflicts:
#	.github/workflows/e2e-vitest-scenarios.yaml
#	test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
#	tools/e2e-scenarios/workflow-boundary.mts
@github-actions

Copy link
Copy Markdown
Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27377304492
Workflow ref: e2e-migrate/test-inference-routing
Requested scenarios: (default — all supported)
Requested jobs: inference-routing-vitest
Summary: 3 passed, 0 failed, 5 skipped

Job Result
gateway-guard-recovery ⏭️ skipped
generate-matrix ✅ success
inference-routing-vitest ✅ success
live-scenarios ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
validate-jobs ✅ success

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts`:
- Around line 34-40: generateStep?.run is being cast to string and passed to
spawnSync, which can silently convert undefined to "undefined"; instead,
explicitly guard and fail if missing: read generateStep?.run into a local (e.g.,
const cmd = generateStep?.run), assert it is defined (throw or
expect(cmd).toBeDefined() / if (!cmd) throw new Error('generateStep.run is
undefined')), then pass that confirmed string to spawnSync (spawnSync("bash",
["-c", cmd], ...)). Update references to generateStep?.run in this block (the
spawnSync call) to use the validated local variable so the test fails clearly
when generateStep is missing.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c971d81c-f8a5-4d46-b2b4-cba0d2eb3fc7

📥 Commits

Reviewing files that changed from the base of the PR and between ef988df and d8fba9c.

📒 Files selected for processing (3)
  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
  • tools/e2e-scenarios/workflow-boundary.mts
💤 Files with no reviewable changes (1)
  • tools/e2e-scenarios/workflow-boundary.mts

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts`:
- Around line 34-40: generateStep?.run is being cast to string and passed to
spawnSync, which can silently convert undefined to "undefined"; instead,
explicitly guard and fail if missing: read generateStep?.run into a local (e.g.,
const cmd = generateStep?.run), assert it is defined (throw or
expect(cmd).toBeDefined() / if (!cmd) throw new Error('generateStep.run is
undefined')), then pass that confirmed string to spawnSync (spawnSync("bash",
["-c", cmd], ...)). Update references to generateStep?.run in this block (the
spawnSync call) to use the validated local variable so the test fails clearly
when generateStep is missing.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c971d81c-f8a5-4d46-b2b4-cba0d2eb3fc7

📥 Commits

Reviewing files that changed from the base of the PR and between ef988df and d8fba9c.

📒 Files selected for processing (3)
  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
  • tools/e2e-scenarios/workflow-boundary.mts
💤 Files with no reviewable changes (1)
  • tools/e2e-scenarios/workflow-boundary.mts
🛑 Comments failed to post (1)
test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts (1)

34-40: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Guard against undefined generateStep before type assertion.

Line 34's expectation will correctly fail if generateStep?.run is undefined, but line 40's type assertion generateStep?.run as string will cast undefined to the string "undefined" and pass it to bash, causing a cryptic runtime error instead of a clear test failure.

🛡️ Proposed fix
   const generateStep = jobs["generate-matrix"]?.steps?.find(
     (step) => step.name === "Generate Vitest scenario matrix",
   );
-  expect(generateStep?.run).toEqual(expect.any(String));
+  expect(generateStep).toBeDefined();
+  expect(generateStep!.run).toEqual(expect.any(String));
 
   const tmp = fs.mkdtempSync(path.join(os.tmpdir(), "e2e-vitest-matrix-"));
   const outputPath = path.join(tmp, "github-output");
   const summaryPath = path.join(tmp, "github-summary");
   try {
-    const result = spawnSync("bash", ["-c", generateStep?.run as string], {
+    const result = spawnSync("bash", ["-c", generateStep!.run as string], {
       cwd: process.cwd(),
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  expect(generateStep).toBeDefined();
  expect(generateStep!.run).toEqual(expect.any(String));

  const tmp = fs.mkdtempSync(path.join(os.tmpdir(), "e2e-vitest-matrix-"));
  const outputPath = path.join(tmp, "github-output");
  const summaryPath = path.join(tmp, "github-summary");
  try {
    const result = spawnSync("bash", ["-c", generateStep!.run as string], {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts` around lines
34 - 40, generateStep?.run is being cast to string and passed to spawnSync,
which can silently convert undefined to "undefined"; instead, explicitly guard
and fail if missing: read generateStep?.run into a local (e.g., const cmd =
generateStep?.run), assert it is defined (throw or expect(cmd).toBeDefined() /
if (!cmd) throw new Error('generateStep.run is undefined')), then pass that
confirmed string to spawnSync (spawnSync("bash", ["-c", cmd], ...)). Update
references to generateStep?.run in this block (the spawnSync call) to use the
validated local variable so the test fails clearly when generateStep is missing.

@github-actions

Copy link
Copy Markdown
Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27390570782
Workflow ref: e2e-migrate/test-inference-routing
Requested scenarios: (default — all supported)
Requested jobs: inference-routing-vitest
Summary: 3 passed, 0 failed, 14 skipped

Job Result
credential-migration-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
generate-matrix ✅ success
hermes-e2e-vitest ⏭️ skipped
inference-routing-vitest ✅ success
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
network-policy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
token-rotation-vitest ⏭️ skipped
validate-jobs ✅ success

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: e2e End-to-end tests, nightly failures, or validation infrastructure chore Build, CI, dependency, or tooling maintenance v0.0.64 Release target

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants