Skip to content

test(e2e): flag unwired live Vitest tests [SUPPORT]#5234

Merged
jyaunches merged 3 commits into
mainfrom
fix/vitest-scenario-advisor-unwired-live
Jun 11, 2026
Merged

test(e2e): flag unwired live Vitest tests [SUPPORT]#5234
jyaunches merged 3 commits into
mainfrom
fix/vitest-scenario-advisor-unwired-live

Conversation

@jyaunches

@jyaunches jyaunches commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Summary

Update the Vitest E2E scenario advisor so it no longer recommends the fan-out workflow as proof for a newly added free-standing live Vitest file that is not wired into e2e-vitest-scenarios.yaml.

Context

Recent #5098 migration PRs can add test/e2e-scenario/live/*.test.ts without adding a workflow job. The advisor previously fell back to e2e-scenarios-all, but that dispatch cannot execute a free-standing test file unless the workflow references it.

Changes

  • Detect changed free-standing live Vitest tests that are not referenced by .github/workflows/e2e-vitest-scenarios.yaml.
  • Suppress misleading fan-out/targeted recommendations for PRs where the only scenario-relevant change is an unwired free-standing live test.
  • Return a no-scenario reason that tells authors to add a discrete workflow job or register the test as a typed live scenario.
  • Add regression coverage for unwired vs wired free-standing live tests.

Verification

  • npm ci --ignore-scripts
  • npx vitest run test/e2e-scenario-advisor.test.ts
  • git diff --check

Summary by CodeRabbit

  • Tests

    • Expanded tests for E2E scenario advisor normalization, covering selector-type handling and free-standing Vitest job extraction and normalization.
  • Chores

    • Recommendation schema now requires a selectorType discriminator ("all", "scenario", "job").
    • Normalization now suppresses scenario fan-out when changed free-standing live tests aren’t wired into the trusted workflow and surfaces a noScenarioE2eReason message.

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1262f39d-66db-49eb-b5cf-8771b5a265aa

📥 Commits

Reviewing files that changed from the base of the PR and between 68c40aa and a3dc52a.

📒 Files selected for processing (3)
  • test/e2e-scenario-advisor.test.ts
  • tools/e2e-advisor/scenarios-schema.json
  • tools/e2e-advisor/scenarios.mts

📝 Walkthrough

Walkthrough

Adds workflow-aware normalization for the E2E scenario advisor: it parses trusted e2e-vitest-scenarios.yaml, introduces a required selectorType, detects unwired free-standing Vitest live tests to suppress fan-out (setting noScenarioE2eReason), adjusts dispatch command generation, and updates tests/fixtures to cover the new behaviors.

Changes

Free-standing test unwiring detection

Layer / File(s) Summary
Types and workflow constants
tools/e2e-advisor/scenarios.mts
Introduce ScenarioSelectorType usage, trusted workflow path/regex constants, and extend canonicalDispatchCommand to accept selectorType.
Normalization, call-site, and decision policy
tools/e2e-advisor/scenarios.mts
Pass vitestWorkflowText into normalizeScenarioAdvisorResult; implement unwired free-standing-test detection, fan-out suppression rules, deterministic job recommendation generation, and update the system prompt decision policy.
Workflow parsing and helper utilities
tools/e2e-advisor/scenarios.mts
Add helpers to read/parse workflow YAML, extract free-standing Vitest jobs and their referenced live test lists, normalize selector types, detect unwired tests by comparing changed files, and deduplicate lists.
Schema: selectorType required
tools/e2e-advisor/scenarios-schema.json
Add selectorType property (enum: all, scenario, job) and require it in scenarioRecommendation.
Normalization contract tests and fixtures
test/e2e-scenario-advisor.test.ts
Update fixtures to include selectorType; add tests for suppression of e2e-scenarios-all on unwired free-standing tests, extraction of job selectors from workflow, preference of wired focused jobs over fan-out, acceptance of model-provided wired job recommendations, and update sampleResult for summary rendering.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#5105: Prior changes to tools/e2e-advisor/scenarios.mts and related tests for dispatch normalization; closely related code paths.
  • NVIDIA/NemoClaw#5107: Adds free-standing Vitest job wiring that this normalization must recognize.
  • NVIDIA/NemoClaw#5236: Adds another free-standing Vitest job wiring relevant to selector/job parsing.

Suggested labels

area: e2e

Suggested reviewers

  • cv
  • prekshivyas

Poem

🐰 I sniff the workflow's hidden ways,

I see which tests the actions praise,
If live tests hop but fail to bind,
I'll hush the fan-out for peace of mind. 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly describes the main change: flagging unwired live Vitest tests in the E2E scenario advisor. It directly reflects the core objective of suppressing misleading recommendations for free-standing live tests not wired into the workflow.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/vitest-scenario-advisor-unwired-live

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

Copy link
Copy Markdown
Contributor

This repository limits contributors to 10 open pull requests. Please close or merge existing PRs before opening new ones.

@github-actions github-actions Bot closed this Jun 11, 2026
@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: None
Optional E2E: None

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • None. No runtime/user-flow E2E is needed. This PR only changes CI advisor tooling/schema and its tests, which cannot directly affect NemoClaw installer/onboarding, sandbox lifecycle, credentials, security boundaries, network policy, inference routing, deployment, or real assistant behavior.

Optional E2E

  • None.

New E2E recommendations

  • None.

@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: None
Optional Vitest E2E scenarios: None

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required Vitest E2E scenarios

  • None. Changes are limited to the Vitest E2E scenario advisor tooling, its schema, and advisor tests outside test/e2e-scenario/; they do not affect the live Vitest scenario workflow, registry, runtime support, fixtures, manifests, or scenario execution behavior.

Optional Vitest E2E scenarios

  • None.

Relevant changed files

  • None.

@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

PR Review Advisor

Findings: 2 needs attention, 5 worth checking, 0 nice ideas
Since last review: 0 prior items resolved, 4 still apply, 1 new item found

Review findings

🛠️ Needs attention

  • Focused job dispatches are not backed by the current workflow contract (tools/e2e-advisor/scenarios.mts:74): The normalizer can emit `gh workflow run e2e-vitest-scenarios.yaml --field jobs=<id>` for free-standing live tests, but the current `.github/workflows/e2e-vitest-scenarios.yaml` only declares `scenarios` and `pr_number` workflow_dispatch inputs and has no `inputs.jobs` selectors for its free-standing jobs. That makes the new recommendation shape non-dispatchable against the trusted workflow source of truth.
    • Recommendation: Either add and test an actual `jobs` workflow_dispatch input plus per-job selectors in `.github/workflows/e2e-vitest-scenarios.yaml`, or keep recommendations limited to dispatch shapes the current workflow supports. Add a regression test using the real workflow shape so `--field jobs=...` is only produced when the workflow can honor it.
    • Evidence: `canonicalDispatchCommand()` returns `--field jobs=${id}` for `selectorType === "job"` at `tools/e2e-advisor/scenarios.mts:72-74`; `extractFreeStandingVitestJobs()` looks for `inputs.jobs` at line 422. The current workflow inputs at `.github/workflows/e2e-vitest-scenarios.yaml:8-18` are `scenarios` and `pr_number`, with no `jobs` input.
  • Workflow wiring detection still trusts non-executing text mentions (tools/e2e-advisor/scenarios.mts:442): The guard still treats a free-standing live test as wired whenever the workflow text contains the file path. A YAML comment, artifact path, environment variable, summary text, or unrelated shell snippet can therefore keep fan-out or targeted recommendations even though the workflow cannot execute the test. This leaves the core acceptance behavior only partially satisfied and weakens the workflow trusted-code boundary.
    • Recommendation: Validate wiring against an executable workflow shape, such as a parsed job with a `run:` step invoking `npx vitest run --project e2e-scenarios-live <file>`, or move dispatchability to an explicit registry/source of truth. Add negative tests for comments, artifact paths, environment variables, and summaries that mention the path without executing it.
    • Evidence: `findUnwiredFreeStandingLiveTests()` filters with `!(vitestWorkflowText ?? "").includes(file)` at `tools/e2e-advisor/scenarios.mts:434-442`. The added tests cover no path and synthetic executable text, but not non-executing references.

🔎 Worth checking

  • Source-of-truth review needed: Free-standing live Vitest workflow wiring detection: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: `findUnwiredFreeStandingLiveTests()` uses raw `includes(file)` and `extractFreeStandingVitestJobs()` scans job bodies for `inputs.jobs`, `,<id>,`, and test paths.
  • Unwired-test suppression can drop recommendations for unrelated product changes (tools/e2e-advisor/scenarios.mts:451): The suppression decision filters the changed file set down to scenario-relevant paths before checking whether all remaining paths are unwired live tests or the workflow. If a PR changes product/source code plus an unwired free-standing live test, the product files are ignored and otherwise valid model recommendations can be removed.
    • Recommendation: Only suppress recommendations when the full changed-file set is limited to the unwired free-standing live test and optionally the workflow, or explicitly prove and test that non-scenario files cannot independently justify scenario recommendations.
    • Evidence: `shouldSuppressFanoutForUnwiredLiveTests()` computes `const relevantFiles = changedFiles.filter(isVitestScenarioRelevantFile)` and then checks only `relevantFiles.every(...)` at `tools/e2e-advisor/scenarios.mts:446-456`.
  • Free-standing job extraction is a raw-text heuristic rather than an executable workflow source of truth (tools/e2e-advisor/scenarios.mts:422): The new extractor accepts a job when its body contains `inputs.jobs`, `,<job-id>,`, and a live test file path. It does not verify that the workflow declares the input, that the `if:` expression gates the job on that input, or that the matching path appears in an executable Vitest run step. This can create deterministic focused recommendations from text that is not actually dispatchable.
    • Recommendation: Parse or otherwise validate the executable workflow structure, including workflow_dispatch input declaration, per-job selector predicate, and a `run:` step that executes the matching live test with the expected Vitest project. Keep the parser small and covered by negative tests.
    • Evidence: `extractFreeStandingVitestJobs()` checks `body.includes("inputs.jobs")`, `body.includes(`,${id},`)`, and a regex path match at `tools/e2e-advisor/scenarios.mts:405-431`.
  • Workflow text read fallback silently masks source-of-truth failures (tools/e2e-advisor/scenarios.mts:387): If the trusted workflow file cannot be read, the normalizer silently continues with `undefined` workflow text. That hides whether recommendations are based on the canonical workflow source, and there is no focused regression test for unavailable workflow text.
    • Recommendation: Handle unavailable workflow text explicitly: either fail normalization, surface a deterministic no-scenario reason, or narrowly handle expected filesystem errors with tests. Document why fallback is needed and when it can be removed.
    • Evidence: `readVitestWorkflowText()` catches all errors and returns `undefined` at `tools/e2e-advisor/scenarios.mts:384-389`.
  • Regression tests use synthetic wiring but miss current workflow and negative trusted-boundary cases (test/e2e-scenario-advisor.test.ts:355): The added tests cover a basic unwired case and synthetic `inputs.jobs` examples, but they do not pin the real current workflow shape or the non-executing-reference cases that previously caused the trusted workflow-boundary concern.
    • Recommendation: Add behavior tests that use the real workflow text or representative fixtures: no `jobs` input means no `--field jobs=...`, comments/artifact paths/env vars do not count as wiring, and mixed product changes preserve independent recommendations.
    • Evidence: New tests around `test/e2e-scenario-advisor.test.ts:355-452` exercise synthetic workflow text with `inputs.jobs`, while the actual `.github/workflows/e2e-vitest-scenarios.yaml` has no `jobs` input.

🌱 Nice ideas

  • None.
Consider writing more tests for
  • **Runtime validation** — Does not emit `--field jobs=<id>` when `e2e-vitest-scenarios.yaml` does not declare a `jobs` workflow_dispatch input.. The changed code affects CI/advisor infrastructure and workflow dispatch recommendations. Unit coverage exists for basic normalization, but stronger behavioral validation is needed around the trusted workflow boundary and current workflow dispatch contract.
  • **Runtime validation** — Treats a YAML comment mentioning `test/e2e-scenario/live/<name>.test.ts` as unwired.. The changed code affects CI/advisor infrastructure and workflow dispatch recommendations. Unit coverage exists for basic normalization, but stronger behavioral validation is needed around the trusted workflow boundary and current workflow dispatch contract.
  • **Runtime validation** — Treats artifact upload paths, environment variables, or step summaries mentioning a live test path as unwired unless a Vitest run step executes it.. The changed code affects CI/advisor infrastructure and workflow dispatch recommendations. Unit coverage exists for basic normalization, but stronger behavioral validation is needed around the trusted workflow boundary and current workflow dispatch contract.
  • **Runtime validation** — Requires an executable `npx vitest run --project e2e-scenarios-live <file>` step before considering a free-standing test wired.. The changed code affects CI/advisor infrastructure and workflow dispatch recommendations. Unit coverage exists for basic normalization, but stronger behavioral validation is needed around the trusted workflow boundary and current workflow dispatch contract.
  • **Runtime validation** — Keeps required scenario or fan-out recommendations when a product/source file and an unwired free-standing live test are both changed.. The changed code affects CI/advisor infrastructure and workflow dispatch recommendations. Unit coverage exists for basic normalization, but stronger behavioral validation is needed around the trusted workflow boundary and current workflow dispatch contract.
  • **Regression tests use synthetic wiring but miss current workflow and negative trusted-boundary cases** — Add behavior tests that use the real workflow text or representative fixtures: no `jobs` input means no `--field jobs=...`, comments/artifact paths/env vars do not count as wiring, and mixed product changes preserve independent recommendations.
  • **Acceptance clause:** Update the Vitest E2E scenario advisor so it no longer recommends the fan-out workflow as proof for a newly added free-standing live Vitest file that is not wired into `e2e-vitest-scenarios.yaml`. — add test evidence or identify existing coverage. `normalizeScenarioAdvisorResult()` suppresses recommendations for detected unwired live tests and the new unwired test covers workflow text with no path. However, detection still uses raw substring matching, so non-executing path references can bypass the guard.
  • **Acceptance clause:** Detect changed free-standing live Vitest tests that are not referenced by `.github/workflows/e2e-vitest-scenarios.yaml`. — add test evidence or identify existing coverage. `FREE_STANDING_LIVE_TEST_PATTERN` and `findUnwiredFreeStandingLiveTests()` cover changed `test/e2e-scenario/live/*.test.ts` files excluding `registry-scenarios.test.ts`, but `referenced` is implemented as `workflowText.includes(file)` rather than executable dispatchability.
Since last review details

Current findings:

  • Source-of-truth review needed: Free-standing live Vitest workflow wiring detection: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: `findUnwiredFreeStandingLiveTests()` uses raw `includes(file)` and `extractFreeStandingVitestJobs()` scans job bodies for `inputs.jobs`, `,<id>,`, and test paths.
  • Focused job dispatches are not backed by the current workflow contract (tools/e2e-advisor/scenarios.mts:74): The normalizer can emit `gh workflow run e2e-vitest-scenarios.yaml --field jobs=<id>` for free-standing live tests, but the current `.github/workflows/e2e-vitest-scenarios.yaml` only declares `scenarios` and `pr_number` workflow_dispatch inputs and has no `inputs.jobs` selectors for its free-standing jobs. That makes the new recommendation shape non-dispatchable against the trusted workflow source of truth.
    • Recommendation: Either add and test an actual `jobs` workflow_dispatch input plus per-job selectors in `.github/workflows/e2e-vitest-scenarios.yaml`, or keep recommendations limited to dispatch shapes the current workflow supports. Add a regression test using the real workflow shape so `--field jobs=...` is only produced when the workflow can honor it.
    • Evidence: `canonicalDispatchCommand()` returns `--field jobs=${id}` for `selectorType === "job"` at `tools/e2e-advisor/scenarios.mts:72-74`; `extractFreeStandingVitestJobs()` looks for `inputs.jobs` at line 422. The current workflow inputs at `.github/workflows/e2e-vitest-scenarios.yaml:8-18` are `scenarios` and `pr_number`, with no `jobs` input.
  • Workflow wiring detection still trusts non-executing text mentions (tools/e2e-advisor/scenarios.mts:442): The guard still treats a free-standing live test as wired whenever the workflow text contains the file path. A YAML comment, artifact path, environment variable, summary text, or unrelated shell snippet can therefore keep fan-out or targeted recommendations even though the workflow cannot execute the test. This leaves the core acceptance behavior only partially satisfied and weakens the workflow trusted-code boundary.
    • Recommendation: Validate wiring against an executable workflow shape, such as a parsed job with a `run:` step invoking `npx vitest run --project e2e-scenarios-live <file>`, or move dispatchability to an explicit registry/source of truth. Add negative tests for comments, artifact paths, environment variables, and summaries that mention the path without executing it.
    • Evidence: `findUnwiredFreeStandingLiveTests()` filters with `!(vitestWorkflowText ?? "").includes(file)` at `tools/e2e-advisor/scenarios.mts:434-442`. The added tests cover no path and synthetic executable text, but not non-executing references.
  • Unwired-test suppression can drop recommendations for unrelated product changes (tools/e2e-advisor/scenarios.mts:451): The suppression decision filters the changed file set down to scenario-relevant paths before checking whether all remaining paths are unwired live tests or the workflow. If a PR changes product/source code plus an unwired free-standing live test, the product files are ignored and otherwise valid model recommendations can be removed.
    • Recommendation: Only suppress recommendations when the full changed-file set is limited to the unwired free-standing live test and optionally the workflow, or explicitly prove and test that non-scenario files cannot independently justify scenario recommendations.
    • Evidence: `shouldSuppressFanoutForUnwiredLiveTests()` computes `const relevantFiles = changedFiles.filter(isVitestScenarioRelevantFile)` and then checks only `relevantFiles.every(...)` at `tools/e2e-advisor/scenarios.mts:446-456`.
  • Free-standing job extraction is a raw-text heuristic rather than an executable workflow source of truth (tools/e2e-advisor/scenarios.mts:422): The new extractor accepts a job when its body contains `inputs.jobs`, `,<job-id>,`, and a live test file path. It does not verify that the workflow declares the input, that the `if:` expression gates the job on that input, or that the matching path appears in an executable Vitest run step. This can create deterministic focused recommendations from text that is not actually dispatchable.
    • Recommendation: Parse or otherwise validate the executable workflow structure, including workflow_dispatch input declaration, per-job selector predicate, and a `run:` step that executes the matching live test with the expected Vitest project. Keep the parser small and covered by negative tests.
    • Evidence: `extractFreeStandingVitestJobs()` checks `body.includes("inputs.jobs")`, `body.includes(`,${id},`)`, and a regex path match at `tools/e2e-advisor/scenarios.mts:405-431`.
  • Workflow text read fallback silently masks source-of-truth failures (tools/e2e-advisor/scenarios.mts:387): If the trusted workflow file cannot be read, the normalizer silently continues with `undefined` workflow text. That hides whether recommendations are based on the canonical workflow source, and there is no focused regression test for unavailable workflow text.
    • Recommendation: Handle unavailable workflow text explicitly: either fail normalization, surface a deterministic no-scenario reason, or narrowly handle expected filesystem errors with tests. Document why fallback is needed and when it can be removed.
    • Evidence: `readVitestWorkflowText()` catches all errors and returns `undefined` at `tools/e2e-advisor/scenarios.mts:384-389`.
  • Regression tests use synthetic wiring but miss current workflow and negative trusted-boundary cases (test/e2e-scenario-advisor.test.ts:355): The added tests cover a basic unwired case and synthetic `inputs.jobs` examples, but they do not pin the real current workflow shape or the non-executing-reference cases that previously caused the trusted workflow-boundary concern.
    • Recommendation: Add behavior tests that use the real workflow text or representative fixtures: no `jobs` input means no `--field jobs=...`, comments/artifact paths/env vars do not count as wiring, and mixed product changes preserve independent recommendations.
    • Evidence: New tests around `test/e2e-scenario-advisor.test.ts:355-452` exercise synthetic workflow text with `inputs.jobs`, while the actual `.github/workflows/e2e-vitest-scenarios.yaml` has no `jobs` input.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@jyaunches jyaunches reopened this Jun 11, 2026
@jyaunches jyaunches changed the title test(e2e): flag unwired live Vitest tests test(e2e): #5098 support flag unwired live vitest tests Jun 11, 2026
@jyaunches jyaunches changed the title test(e2e): #5098 support flag unwired live vitest tests test(e2e): flag unwired live Vitest tests Jun 11, 2026
@jyaunches jyaunches changed the title test(e2e): flag unwired live Vitest tests test(e2e): flag unwired live Vitest tests [SUPPORT] Jun 11, 2026
…dvisor-unwired-live

Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>
@cv cv added the v0.0.64 Release target label Jun 11, 2026
@jyaunches jyaunches merged commit 76ed5cc into main Jun 11, 2026
42 checks passed
@jyaunches jyaunches deleted the fix/vitest-scenario-advisor-unwired-live branch June 11, 2026 21:40
@wscurran wscurran added area: e2e End-to-end tests, nightly failures, or validation infrastructure bug-fix PR fixes a bug or regression labels Jun 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: e2e End-to-end tests, nightly failures, or validation infrastructure bug-fix PR fixes a bug or regression v0.0.64 Release target

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants