test(e2e): add inference routing Vitest coverage [ANCHOR-3]#5231
Conversation
ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (3)
💤 Files with no reviewable changes (1)
📝 WalkthroughWalkthroughAdds a Vitest live E2E inference-routing test suite with sandbox lifecycle, subprocess execution, credential-isolation and provider smoke checks, and integrates it into the e2e Vitest GitHub Actions workflow with a new ChangesE2E Inference Routing Tests
CI Workflow & Validators
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
|
This repository limits contributors to 10 open pull requests. Please close or merge existing PRs before opening new ones. |
E2E Advisor RecommendationRequired E2E: Dispatch hint: Full advisor summaryE2E Recommendation AdvisorBase: Required E2E
Optional E2E
New E2E recommendations
Dispatch hint
|
Vitest E2E Scenario RecommendationRequired Vitest E2E scenarios: Dispatch required Vitest E2E scenarios:
Full Vitest E2E advisor summaryVitest E2E Scenario AdvisorBase: Required Vitest E2E scenarios
Optional Vitest E2E scenarios
Relevant changed files
|
PR Review AdvisorFindings: 0 needs attention, 3 worth checking, 0 nice ideas Review findings🛠️ Needs attention
🔎 Worth checking
🌱 Nice ideas
Consider writing more tests for
Since last review detailsCurrent findings:
This is an automated advisory review. A human maintainer must make the final merge decision. |
Vitest E2E Scenario Results — ❌ Some jobs failedRun: 27354444694
|
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/e2e-vitest-scenarios.yaml:
- Around line 258-303: The new GitHub Actions job inference-routing-vitest is
omitted from the report-to-pr dependency list so PR result aggregation can
finish before this job completes; update the report-to-pr job's needs array to
include inference-routing-vitest (add the job name inference-routing-vitest to
the needs/dependencies referenced by report-to-pr) so the PR comment waits for
and reflects this job's status.
In `@test/e2e-scenario/live/inference-routing.test.ts`:
- Around line 869-871: The code uses the nullish coalescing operator (??) so an
empty NEMOCLAW_ENDPOINT_URL string won't trigger skipLive; change the resolution
to treat empty/whitespace as missing by trimming and checking truthiness before
calling skipLive. For example, read process.env.NEMOCLAW_ENDPOINT_URL into a
variable, do const endpointUrl = raw?.trim() && raw.trim() || skipLive(skip,
"Missing NEMOCLAW_ENDPOINT_URL, NEMOCLAW_COMPAT_MODEL, or COMPATIBLE_API_KEY");
so the variable endpointUrl (and the env var name NEMOCLAW_ENDPOINT_URL) and
function skipLive/skip are the reference points to find and update.
- Around line 351-355: The test currently flags any occurrence of "running" or
"ready" in resultText(status) as active even when negated; update the assertion
that uses resultText(status) and sandboxName so it only treats standalone
positive statuses as active — e.g., mirror the stricter status check used
earlier by matching whole words (word boundaries) or additionally ensuring
phrases like "not running"/"not ready" are excluded before declaring the sandbox
active; modify the expect call that wraps /running|ready/i.test(text) to use
this stricter logic.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 68a81c62-b111-483c-9094-a91aba1ed4c5
📒 Files selected for processing (2)
.github/workflows/e2e-vitest-scenarios.yamltest/e2e-scenario/live/inference-routing.test.ts
| const text = resultText(status); | ||
| expect( | ||
| /running|ready/i.test(text), | ||
| `sandbox '${sandboxName}' is still active after failed onboard: ${text}`, | ||
| ).toBe(false); |
There was a problem hiding this comment.
Don’t treat negated status text as an active sandbox.
/running|ready/i also matches "not running" and "not ready", so a successful cleanup can still fail this assertion if the CLI reports the inactive state in prose. Mirror the stricter status handling used above, or at least exclude negated phrases before declaring the sandbox active.
Suggested fix
async function expectNoActiveSandbox(host: HostCliClient, sandboxName: string): Promise<void> {
const status = await host.command(process.execPath, [CLI_ENTRYPOINT, sandboxName, "status"], {
artifactName: `post-failure-status-${sandboxName}`,
env: buildAvailabilityProbeEnv(),
timeoutMs: 30_000,
});
const text = resultText(status);
+ const reportsActive =
+ /\b(?:running|ready)\b/i.test(text) && !/\bnot\s+(?:running|ready)\b/i.test(text);
expect(
- /running|ready/i.test(text),
+ reportsActive,
`sandbox '${sandboxName}' is still active after failed onboard: ${text}`,
).toBe(false);
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/e2e-scenario/live/inference-routing.test.ts` around lines 351 - 355, The
test currently flags any occurrence of "running" or "ready" in
resultText(status) as active even when negated; update the assertion that uses
resultText(status) and sandboxName so it only treats standalone positive
statuses as active — e.g., mirror the stricter status check used earlier by
matching whole words (word boundaries) or additionally ensuring phrases like
"not running"/"not ready" are excluded before declaring the sandbox active;
modify the expect call that wraps /running|ready/i.test(text) to use this
stricter logic.
| const endpointUrl = | ||
| process.env.NEMOCLAW_ENDPOINT_URL ?? | ||
| skipLive(skip, "Missing NEMOCLAW_ENDPOINT_URL, NEMOCLAW_COMPAT_MODEL, or COMPATIBLE_API_KEY"); |
There was a problem hiding this comment.
Treat an empty NEMOCLAW_ENDPOINT_URL as missing.
?? only skips on null/undefined. If the env var is present but empty, this path tries to onboard with a blank endpoint instead of cleanly skipping the smoke test.
Suggested fix
- const endpointUrl =
- process.env.NEMOCLAW_ENDPOINT_URL ??
+ const endpointUrl =
+ process.env.NEMOCLAW_ENDPOINT_URL?.trim() ||
skipLive(skip, "Missing NEMOCLAW_ENDPOINT_URL, NEMOCLAW_COMPAT_MODEL, or COMPATIBLE_API_KEY");📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const endpointUrl = | |
| process.env.NEMOCLAW_ENDPOINT_URL ?? | |
| skipLive(skip, "Missing NEMOCLAW_ENDPOINT_URL, NEMOCLAW_COMPAT_MODEL, or COMPATIBLE_API_KEY"); | |
| const endpointUrl = | |
| process.env.NEMOCLAW_ENDPOINT_URL?.trim() || | |
| skipLive(skip, "Missing NEMOCLAW_ENDPOINT_URL, NEMOCLAW_COMPAT_MODEL, or COMPATIBLE_API_KEY"); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/e2e-scenario/live/inference-routing.test.ts` around lines 869 - 871, The
code uses the nullish coalescing operator (??) so an empty NEMOCLAW_ENDPOINT_URL
string won't trigger skipLive; change the resolution to treat empty/whitespace
as missing by trimming and checking truthiness before calling skipLive. For
example, read process.env.NEMOCLAW_ENDPOINT_URL into a variable, do const
endpointUrl = raw?.trim() && raw.trim() || skipLive(skip, "Missing
NEMOCLAW_ENDPOINT_URL, NEMOCLAW_COMPAT_MODEL, or COMPATIBLE_API_KEY"); so the
variable endpointUrl (and the env var name NEMOCLAW_ENDPOINT_URL) and function
skipLive/skip are the reference points to find and update.
Vitest E2E Scenario Results — ❌ Some jobs failedRun: 27355075383
|
Vitest E2E Scenario Results — ❌ Some jobs failedRun: 27355314519
|
Vitest E2E Scenario Results — ✅ All jobs passedRun: 27355680619
|
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tools/e2e-scenarios/workflow-boundary.mts (1)
417-419:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRequire
workflow_dispatch.inputs.jobsexplicitly.At Line 418, the validator only requires
scenarios, but job gating now depends oninputs.jobs(enforced byrequireFreeStandingJobSelector). Without validating the input exists, this boundary check can pass a workflow that violates the selector contract.Suggested fix
const dispatchInputs = asRecord(workflowDispatch.inputs); requireInput(errors, dispatchInputs, "scenarios"); + requireInput(errors, dispatchInputs, "jobs"); if (Object.hasOwn(dispatchInputs, "test_filter")) { errors.push("workflow_dispatch must not expose legacy test_filter input"); }Based on learnings from the provided PR objectives/review stack context, this PR introduces workflow dispatch/job selection via
inputs.jobs, so the validator should assert that contract directly.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tools/e2e-scenarios/workflow-boundary.mts` around lines 417 - 419, The validator currently only requires "scenarios" (via requireInput(errors, dispatchInputs, "scenarios")) but must also assert the presence of workflowDispatch.inputs.jobs because job gating relies on it; update the validation to call requireInput for "jobs" (or otherwise check Object.hasOwn(dispatchInputs, "jobs")) before invoking requireFreeStandingJobSelector so the selector contract is enforced (refer to dispatchInputs, requireInput, requireFreeStandingJobSelector and workflowDispatch.inputs.jobs).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tools/e2e-scenarios/workflow-boundary.mts`:
- Around line 307-309: The current check only trims the whole uploadPath string
and misses multiline values; update the validator around uploadPath and errors
so you split uploadPath into lines (e.g. uploadPath.split(/\r?\n/).map(s =>
s.trim())), then fail when any line equals "e2e-artifacts/vitest/" or when any
line startsWith("e2e-artifacts/vitest/") while none of the lines
startWith("e2e-artifacts/vitest/inference-routing/"); push the same error
message into errors for the inference-routing-vitest case.
---
Outside diff comments:
In `@tools/e2e-scenarios/workflow-boundary.mts`:
- Around line 417-419: The validator currently only requires "scenarios" (via
requireInput(errors, dispatchInputs, "scenarios")) but must also assert the
presence of workflowDispatch.inputs.jobs because job gating relies on it; update
the validation to call requireInput for "jobs" (or otherwise check
Object.hasOwn(dispatchInputs, "jobs")) before invoking
requireFreeStandingJobSelector so the selector contract is enforced (refer to
dispatchInputs, requireInput, requireFreeStandingJobSelector and
workflowDispatch.inputs.jobs).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: f5c4a24c-5706-4859-a694-38841868d977
📒 Files selected for processing (4)
.github/workflows/e2e-vitest-scenarios.yamltest/e2e-scenario/live/inference-routing.test.tstest/e2e-scenario/support-tests/e2e-scenarios-workflow.test.tstools/e2e-scenarios/workflow-boundary.mts
🚧 Files skipped from review as they are similar to previous changes (1)
- test/e2e-scenario/live/inference-routing.test.ts
| if (uploadPath.trim() === "e2e-artifacts/vitest/") { | ||
| errors.push("inference-routing-vitest artifact upload path must not list all Vitest artifacts"); | ||
| } |
There was a problem hiding this comment.
Harden inference artifact path validation for multiline path values.
At Line 307, uploadPath.trim() === "e2e-artifacts/vitest/" only catches a single-line exact value. A multiline path that includes both e2e-artifacts/vitest/ and e2e-artifacts/vitest/inference-routing/ bypasses this check and can still upload overly broad artifacts.
Suggested fix
- if (uploadPath.trim() === "e2e-artifacts/vitest/") {
- errors.push("inference-routing-vitest artifact upload path must not list all Vitest artifacts");
- }
+ for (const line of uploadPath.split("\n")) {
+ if (line.trim() === "e2e-artifacts/vitest/") {
+ errors.push("inference-routing-vitest artifact upload path must not list all Vitest artifacts");
+ }
+ }Based on learnings from the provided workflow contract snippet, this validator is intended to enforce scoped artifact upload for inference-routing-vitest, so broad parent-path inclusion should be blocked robustly.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (uploadPath.trim() === "e2e-artifacts/vitest/") { | |
| errors.push("inference-routing-vitest artifact upload path must not list all Vitest artifacts"); | |
| } | |
| for (const line of uploadPath.split("\n")) { | |
| if (line.trim() === "e2e-artifacts/vitest/") { | |
| errors.push("inference-routing-vitest artifact upload path must not list all Vitest artifacts"); | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tools/e2e-scenarios/workflow-boundary.mts` around lines 307 - 309, The
current check only trims the whole uploadPath string and misses multiline
values; update the validator around uploadPath and errors so you split
uploadPath into lines (e.g. uploadPath.split(/\r?\n/).map(s => s.trim())), then
fail when any line equals "e2e-artifacts/vitest/" or when any line
startsWith("e2e-artifacts/vitest/") while none of the lines
startWith("e2e-artifacts/vitest/inference-routing/"); push the same error
message into errors for the inference-routing-vitest case.
Vitest E2E Scenario Results — ✅ All jobs passedRun: 27356301918
|
…rence-routing # Conflicts: # .github/workflows/e2e-vitest-scenarios.yaml # test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts # tools/e2e-scenarios/workflow-boundary.mts
Vitest E2E Scenario Results — ✅ All jobs passedRun: 27377304492
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts`:
- Around line 34-40: generateStep?.run is being cast to string and passed to
spawnSync, which can silently convert undefined to "undefined"; instead,
explicitly guard and fail if missing: read generateStep?.run into a local (e.g.,
const cmd = generateStep?.run), assert it is defined (throw or
expect(cmd).toBeDefined() / if (!cmd) throw new Error('generateStep.run is
undefined')), then pass that confirmed string to spawnSync (spawnSync("bash",
["-c", cmd], ...)). Update references to generateStep?.run in this block (the
spawnSync call) to use the validated local variable so the test fails clearly
when generateStep is missing.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: c971d81c-f8a5-4d46-b2b4-cba0d2eb3fc7
📒 Files selected for processing (3)
.github/workflows/e2e-vitest-scenarios.yamltest/e2e-scenario/support-tests/e2e-scenarios-workflow.test.tstools/e2e-scenarios/workflow-boundary.mts
💤 Files with no reviewable changes (1)
- tools/e2e-scenarios/workflow-boundary.mts
There was a problem hiding this comment.
Caution
Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts`:
- Around line 34-40: generateStep?.run is being cast to string and passed to
spawnSync, which can silently convert undefined to "undefined"; instead,
explicitly guard and fail if missing: read generateStep?.run into a local (e.g.,
const cmd = generateStep?.run), assert it is defined (throw or
expect(cmd).toBeDefined() / if (!cmd) throw new Error('generateStep.run is
undefined')), then pass that confirmed string to spawnSync (spawnSync("bash",
["-c", cmd], ...)). Update references to generateStep?.run in this block (the
spawnSync call) to use the validated local variable so the test fails clearly
when generateStep is missing.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: c971d81c-f8a5-4d46-b2b4-cba0d2eb3fc7
📒 Files selected for processing (3)
.github/workflows/e2e-vitest-scenarios.yamltest/e2e-scenario/support-tests/e2e-scenarios-workflow.test.tstools/e2e-scenarios/workflow-boundary.mts
💤 Files with no reviewable changes (1)
- tools/e2e-scenarios/workflow-boundary.mts
🛑 Comments failed to post (1)
test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts (1)
34-40:
⚠️ Potential issue | 🔴 Critical | ⚡ Quick winGuard against undefined
generateStepbefore type assertion.Line 34's expectation will correctly fail if
generateStep?.runis undefined, but line 40's type assertiongenerateStep?.run as stringwill castundefinedto the string"undefined"and pass it to bash, causing a cryptic runtime error instead of a clear test failure.🛡️ Proposed fix
const generateStep = jobs["generate-matrix"]?.steps?.find( (step) => step.name === "Generate Vitest scenario matrix", ); - expect(generateStep?.run).toEqual(expect.any(String)); + expect(generateStep).toBeDefined(); + expect(generateStep!.run).toEqual(expect.any(String)); const tmp = fs.mkdtempSync(path.join(os.tmpdir(), "e2e-vitest-matrix-")); const outputPath = path.join(tmp, "github-output"); const summaryPath = path.join(tmp, "github-summary"); try { - const result = spawnSync("bash", ["-c", generateStep?.run as string], { + const result = spawnSync("bash", ["-c", generateStep!.run as string], { cwd: process.cwd(),📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.expect(generateStep).toBeDefined(); expect(generateStep!.run).toEqual(expect.any(String)); const tmp = fs.mkdtempSync(path.join(os.tmpdir(), "e2e-vitest-matrix-")); const outputPath = path.join(tmp, "github-output"); const summaryPath = path.join(tmp, "github-summary"); try { const result = spawnSync("bash", ["-c", generateStep!.run as string], {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts` around lines 34 - 40, generateStep?.run is being cast to string and passed to spawnSync, which can silently convert undefined to "undefined"; instead, explicitly guard and fail if missing: read generateStep?.run into a local (e.g., const cmd = generateStep?.run), assert it is defined (throw or expect(cmd).toBeDefined() / if (!cmd) throw new Error('generateStep.run is undefined')), then pass that confirmed string to spawnSync (spawnSync("bash", ["-c", cmd], ...)). Update references to generateStep?.run in this block (the spawnSync call) to use the validated local variable so the test fails clearly when generateStep is missing.
Vitest E2E Scenario Results — ✅ All jobs passedRun: 27390570782
|
Summary
Migrate
test/e2e/test-inference-routing.shwith the simplest equivalent live Vitest coverage.Related Issues
Refs #5098
Contract mapping
test/e2e-scenario/live/inference-routing.test.ts/TC-INF-06 invalid API key fails with credential classification and cleanupnode bin/nemoclaw.js onboardon a Docker/OpenShell-capable runner.test/e2e-scenario/live/inference-routing.test.ts/TC-INF-07 unreachable endpoint fails with transport classification and cleanupNVIDIA_API_KEYis absent from sandbox env, process list, and sampled filesystem, and sandbox placeholder is not the real key.test/e2e-scenario/live/inference-routing.test.ts/TC-INF-05 real NVIDIA key is isolated from sandbox env, process list, and filesystemopenshell sandbox execprobes.https://inference.localfor OpenAI, Anthropic, and compatible endpoints when their secrets/opt-in env are present.curltoinference.local.Simplicity check
inference-routing-vitestfree-standing live job; legacy shell script and nightly shell lane remain for Epic: Migrate legacy bash E2E into the Vitest E2E system #5098 Phase 11 retirement.Verification
npm run build:clinpx @biomejs/biome lint test/e2e-scenario/live/inference-routing.test.tsnpx tsc --noEmit --target ES2022 --module NodeNext --moduleResolution NodeNext --types node,vitest/globals --skipLibCheck --allowImportingTsExtensions test/e2e-scenario/live/inference-routing.test.tsNEMOCLAW_RUN_E2E_SCENARIOS=1 npx vitest run --project e2e-scenarios-live test/e2e-scenario/live/inference-routing.test.ts --silent=false --reporter=defaultnpx vitest run test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts --project e2e-vitest-supportinference-routing-vitestpassed)Summary by CodeRabbit
New Features
Tests
Chores