test(e2e): migrate strict tool-call probe to Vitest#5145

Closed
cv wants to merge 18 commits into main from codex/e2e-migrate-strict-tool-call-probe

@cv cv commented Jun 10, 2026

Collaborator

Summary

Migrates the strict Chat Completions tool-call probe from the legacy shell script into a Vitest live scenario. The regression workflow now installs root dependencies, runs the targeted e2e-scenarios-live test directly, and uploads fixture artifacts.

Related Issue

Refs #4941
Refs #4537

Changes

  • Add test/e2e-scenario/live/strict-tool-call-probe.test.ts with strict payload, onboarding caller, transient retry, and fail-closed mock coverage.
  • Replace test/e2e/test-strict-tool-call-probe.sh in regression-e2e.yaml with direct Vitest execution and artifact upload.
  • Extend test/regression-e2e-workflow.test.ts to lock the new direct Vitest workflow contract.
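
The coverage listed for the new scenario (strict payload, transient retry, fail-closed behavior) can be illustrated with a small sketch. This is not the code in strict-tool-call-probe.test.ts; every name here (`buildStrictPayload`, `hasValidToolCall`, `runProbe`, the `echo` tool, the `Sender` type) is hypothetical, and treating 429 and 5xx as "transient" is an assumption. The payload shape follows the public Chat Completions function-calling format.

```typescript
// Hypothetical sketch; none of these names come from the PR's actual test file.
export type ProbeResponse = { status: number; body: unknown };
export type Sender = (payload: object) => Promise<ProbeResponse>;

// Build a Chat Completions request that declares one strict function tool.
export function buildStrictPayload(model: string): object {
  return {
    model,
    messages: [{ role: "user", content: "Call the echo tool with text 'ping'." }],
    tools: [
      {
        type: "function",
        function: {
          name: "echo",
          strict: true,
          parameters: {
            type: "object",
            properties: { text: { type: "string" } },
            required: ["text"],
            additionalProperties: false,
          },
        },
      },
    ],
    tool_choice: "required",
  };
}

// Assumption: HTTP 429 and 5xx responses count as transient and are retried.
const isTransient = (status: number): boolean => status === 429 || status >= 500;

// True only when the first tool call names the expected tool and carries
// JSON arguments with a string `text` field.
export function hasValidToolCall(body: unknown, toolName: string): boolean {
  const calls = (body as any)?.choices?.[0]?.message?.tool_calls;
  if (!Array.isArray(calls) || calls.length === 0) return false;
  const fn = calls[0]?.function;
  if (fn?.name !== toolName || typeof fn?.arguments !== "string") return false;
  try {
    const args = JSON.parse(fn.arguments);
    return typeof args === "object" && args !== null && typeof args.text === "string";
  } catch {
    return false; // malformed JSON arguments: fail closed
  }
}

// Send the strict payload, retrying transient failures up to maxAttempts.
export async function runProbe(send: Sender, maxAttempts = 3): Promise<boolean> {
  const payload = buildStrictPayload("test-model");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const res = await send(payload);
    if (isTransient(res.status)) continue;
    // Fail closed: only a 200 with a valid tool call passes.
    return res.status === 200 && hasValidToolCall(res.body, "echo");
  }
  return false; // retries exhausted: fail closed
}
```

In a test, `send` would be a mock that returns canned responses, so the retry and fail-closed paths can be exercised without a live endpoint.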

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

@cv cv self-assigned this Jun 10, 2026
@copy-pr-bot

copy-pr-bot Bot commented Jun 10, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Contributor

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f099514c-0a08-46d9-8efc-189875157ab2

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented Jun 10, 2026

Contributor

E2E Advisor Recommendation

Required E2E: strict-tool-call-probe-e2e
Optional E2E: onboard-inference-smoke-e2e

Dispatch hint: strict-tool-call-probe-e2e

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/codex/e2e-simplify-migration-tracking
Head: HEAD
Confidence: high

Required E2E

  • strict-tool-call-probe-e2e (low): This PR directly rewires and migrates the strict tool-call probe E2E lane; run the exact affected regression job to validate the new Vitest scenario, npm install/build path, live-scenario gate, and artifact upload configuration.

Optional E2E

  • onboard-inference-smoke-e2e (low): Optional adjacent confidence for onboarding/inference validation behavior, but not merge-blocking because this PR changes the strict-tool-call E2E harness rather than runtime onboarding or inference source.

New E2E recommendations

  • None.

Dispatch hint

  • Workflow: regression-e2e.yaml
  • jobs input: strict-tool-call-probe-e2e

@github-actions

github-actions Bot commented Jun 10, 2026

Contributor

E2E Scenario Advisor Recommendation

Required scenario E2E: e2e-scenarios-all
Optional scenario E2E: None

Dispatch required scenario E2E:

  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref>

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/codex/e2e-simplify-migration-tracking
Head: HEAD
Confidence: medium

Required scenario E2E

  • e2e-scenarios-all: The PR adds a Vitest live test under test/e2e-scenario/live, but it is not a trusted live-supported typed registry scenario ID that can be targeted through the canonical scenario workflow. Per scenario advisor rules, use the canonical all-scenarios fan-out rather than inventing a targeted ID.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref>

Optional scenario E2E

  • None.

Relevant changed files

  • test/e2e-scenario/live/strict-tool-call-probe.test.ts

@github-actions

github-actions Bot commented Jun 10, 2026

Contributor

PR Review Advisor

Findings: 0 needs attention, 0 worth checking, 0 nice ideas
Since last review: 1 prior item resolved, 0 still apply, 0 new items found

Consider writing more tests for
  • **Runtime validation:** Exercise the `strict-tool-call-probe-e2e` regression workflow lane and verify that it invokes the `e2e-scenarios-live` Vitest scenario with `NEMOCLAW_RUN_E2E_SCENARIOS=1`. The migrated scenario and the workflow contract test cover the behavior statically and hermetically, but .github/workflows/regression-e2e.yaml is workflow/infrastructure code, so a targeted runtime run of the changed lane would increase confidence that the GitHub Actions wiring matches the contract.
  • **Runtime validation:** Verify that the changed workflow uploads `e2e-artifacts/vitest/strict-tool-call-probe/` artifacts on both the success path and a scenario failure path. The same reasoning applies: only a runtime run of the lane confirms that the Actions wiring matches the contract.
  • **Acceptance clause:** Refs #4941 ("Adopt Vitest fixtures as the E2E scenario execution model"): add test evidence or identify existing coverage. The PR body references #4941, but the deterministic context did not include linked issue bodies or comments from which to extract literal acceptance clauses.
  • **Acceptance clause:** Refs #4537 ("Add hermetic E2E coverage for strict chat-completions tool-call validation"): add test evidence or identify existing coverage. The PR body references #4537, but the deterministic context did not include linked issue bodies or comments. The changed test comments and assertions do cover the stated strict Chat Completions tool-call probe behavior.
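
The static side of the runtime-validation items above can be sketched as a plain text check on the workflow file. This is only an illustration, not the contents of test/regression-e2e-workflow.test.ts: `checkWorkflowContract` and `sampleWorkflow` are hypothetical, the YAML fragment is invented for the example, and the assumption that `e2e-scenarios-live` is selected through Vitest's `--project` flag is not confirmed by the PR. The required strings themselves are taken from the comments on this page.

```typescript
// Hypothetical contract check: substring matching on the workflow text.
export interface ContractResult {
  missing: string[];        // required strings not found
  forbiddenFound: string[]; // strings that should no longer appear
}

export function checkWorkflowContract(
  workflowText: string,
  required: string[],
  forbidden: string[],
): ContractResult {
  return {
    missing: required.filter((s) => !workflowText.includes(s)),
    forbiddenFound: forbidden.filter((s) => workflowText.includes(s)),
  };
}

// Strings the migrated lane is expected to mention, per the PR discussion.
export const REQUIRED = [
  "strict-tool-call-probe-e2e",
  "e2e-scenarios-live",
  "NEMOCLAW_RUN_E2E_SCENARIOS",
  "e2e-artifacts/vitest/strict-tool-call-probe/",
];

// The legacy shell script should no longer be referenced.
export const FORBIDDEN = ["test/e2e/test-strict-tool-call-probe.sh"];

// Illustrative fragment only; the real regression-e2e.yaml may differ.
export const sampleWorkflow = `
jobs:
  strict-tool-call-probe-e2e:
    runs-on: ubuntu-latest
    env:
      NEMOCLAW_RUN_E2E_SCENARIOS: "1"
    steps:
      - run: npm ci
      - run: npx vitest run --project e2e-scenarios-live test/e2e-scenario/live/strict-tool-call-probe.test.ts
      - if: always()
        uses: actions/upload-artifact@v4
        with:
          name: strict-tool-call-probe
          path: e2e-artifacts/vitest/strict-tool-call-probe/
`;

const result = checkWorkflowContract(sampleWorkflow, REQUIRED, FORBIDDEN);
console.log(result.missing.length === 0 && result.forbiddenFound.length === 0);
```

A check like this only proves that the strings are present in the file. Whether the upload step actually runs on a failed scenario (the `if: always()` condition in the fragment) still needs a real dispatch of the lane, which is the point the advisor raises.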

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@wscurran wscurran added labels on Jun 10, 2026:

  • area: ci (CI workflows, checks, release automation, or GitHub Actions)
  • area: e2e (End-to-end tests, nightly failures, or validation infrastructure)
  • refactor (PR restructures code without intended behavior change)
@copy-pr-bot

copy-pr-bot Bot commented Jun 10, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Base automatically changed from codex/e2e-simplify-migration-tracking to main June 10, 2026 20:53
@jyaunches

Contributor

Closing as superseded by merged PR #5153, which retired test-strict-tool-call-probe.sh under the simplified #5098 migration rules.

@jyaunches jyaunches closed this Jun 11, 2026
Labels

  • area: ci (CI workflows, checks, release automation, or GitHub Actions)
  • area: e2e (End-to-end tests, nightly failures, or validation infrastructure)
  • refactor (PR restructures code without intended behavior change)

3 participants