test(e2e): migrate smoke and onboarding scenarios#5054

Closed
cv wants to merge 10 commits into codex/e2e-fanout-02-onboarding-variant-fixtures from codex/e2e-fanout-03-smoke-onboarding-scenarios

Conversation

@cv

@cv cv commented Jun 9, 2026

Collaborator

Summary

Migrate the smoke/onboarding family from placeholder status into registry-backed live Vitest coverage.

This PR expands the live Vitest scenario runner so onboarding fixtures can execute the supported smoke/onboarding variants directly from typed scenario metadata. It also keeps the legacy typed shell runner aligned for the migrated negative onboarding paths until that runner is retired.

Related Issue

Refs #4941
Refs #4990
Refs #4348
Depends on #5046, #5052, and #5053.
Stacked on branch codex/e2e-fanout-02-onboarding-variant-fixtures.

Changes

  • Mark the smoke/onboarding family as live-supported where fixtures now own the setup:
    • cloud OpenClaw smoke
    • no-Docker preflight negative
    • resume after interrupt
    • repair existing config
    • double same-provider onboarding
    • custom policies
    • invalid NVIDIA key negative
    • gateway port conflict negative
  • Keep provider-switch onboarding unsupported until inference/provider-switch fixtures own that behavior.
  • Stop requiring a real NVIDIA_API_KEY for the invalid-key negative scenario; both the Vitest fixture path and shell dispatcher path inject not-a-nvidia-key.
  • Assert and record expected-failure contracts in scenario-result.json for negative live scenarios.
  • Hold the configured gateway port in the legacy shell dispatcher for cloud-openclaw-gateway-port-conflict, with validation and cleanup coverage.
  • Exclude test/e2e-scenario/** from the regular cli Vitest project so live scenario tests remain opt-in through e2e-scenarios-live.
  • Update matrix, support, compiler, dispatcher, and project-gating tests for the migrated scenario family.
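The expected-failure contract recorded for negative scenarios can be pictured with a small checker. This is a minimal sketch, not the framework's code: the `ExpectedFailure` and `ScenarioResult` shapes and the `checkExpectedFailure` helper are hypothetical, and only the `phase` and `errorClass` fields are taken from the contract values quoted in this PR.

```typescript
// Hypothetical shapes: the PR only shows `phase` and `errorClass`
// inside the expected-failure contract; everything else is assumed.
interface ExpectedFailure {
  phase: string;
  errorClass: string;
}

interface ScenarioResult {
  scenarioId: string;
  expectedFailure?: ExpectedFailure;
}

// Compare the recorded contract against the declared one.
// Returns a list of human-readable mismatches; an empty list means the contract holds.
function checkExpectedFailure(
  result: ScenarioResult,
  expected: ExpectedFailure,
): string[] {
  const problems: string[] = [];
  const actual = result.expectedFailure;
  if (actual === undefined) {
    problems.push(`${result.scenarioId}: no expectedFailure recorded`);
    return problems;
  }
  if (actual.phase !== expected.phase) {
    problems.push(
      `${result.scenarioId}: phase was "${actual.phase}", expected "${expected.phase}"`,
    );
  }
  if (actual.errorClass !== expected.errorClass) {
    problems.push(
      `${result.scenarioId}: errorClass was "${actual.errorClass}", expected "${expected.errorClass}"`,
    );
  }
  return problems;
}

// Stand-in for the parsed contents of scenario-result.json.
const recorded = JSON.parse(
  '{"scenarioId":"ubuntu-invalid-nvidia-key-negative",' +
    '"expectedFailure":{"phase":"onboarding","errorClass":"invalid-nvidia-api-key"}}',
) as ScenarioResult;

const problems = checkExpectedFailure(recorded, {
  phase: "onboarding",
  errorClass: "invalid-nvidia-api-key",
});
```

Returning a list of mismatches instead of throwing on the first one lets a single test failure report both a wrong phase and a wrong error class.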

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx biome check --write test/e2e-scenario/framework-tests/e2e-onboard-dispatch.test.ts test/e2e-scenario/nemoclaw_scenarios/onboard/dispatch.sh
  • npx vitest run --project e2e-scenario-framework test/e2e-scenario/framework-tests/e2e-onboard-dispatch.test.ts --silent=false --reporter=default -> 1 file, 3 tests passed
  • npx prek run --files test/e2e-scenario/nemoclaw_scenarios/onboard/dispatch.sh test/e2e-scenario/framework-tests/e2e-onboard-dispatch.test.ts --skip test-cli
  • npx vitest run --project e2e-scenario-framework --silent=false --reporter=default -> 27 files, 340 tests passed
  • npm run typecheck:cli
  • git diff --check
  • PR checks on head 2c149cab786a809d3f41fc21b2bbfe88277e5217 are green, including static-checks, build-typecheck, cli-tests, plugin-tests, macos-e2e, wsl-e2e, CodeQL, E2E recommendation, and PR review advisor.
  • Live Vitest scenario tiles passed for the migrated family:
    • ubuntu-repo-cloud-openclaw — run 27235877649
    • ubuntu-no-docker-preflight-negative — run 27235876285
    • ubuntu-repo-cloud-openclaw-resume — run 27235883637
    • ubuntu-repo-cloud-openclaw-repair — run 27235882301
    • ubuntu-repo-cloud-openclaw-double-same-provider — run 27235880807
    • ubuntu-repo-cloud-openclaw-custom-policies — run 27235879233
    • ubuntu-invalid-nvidia-key-negative — run 27235874787
    • ubuntu-gateway-port-conflict-negative — run 27235873305
  • Typed shell scenario runner evidence:
    • ubuntu-invalid-nvidia-key-negative — run 27238468247 on head 2c149cab786a809d3f41fc21b2bbfe88277e5217; log shows expected failure { "phase": "onboarding", "errorClass": "invalid-nvidia-api-key" }, gateway-absent and sandbox-absent state-validation actions, and negative-contract: passed.
    • ubuntu-no-docker-preflight-negative — run 27235876305
    • ubuntu-gateway-port-conflict-negative — run 27237583030 after shell bridge hardening
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Note: the full touched-file commit hook remains blocked in this local environment by unrelated SSH signing failures inside test/release-latest-tag.test.ts temporary git repositories. The relevant touched-file hooks were run with --skip test-cli, and the full framework/typecheck/CI checks above passed.


Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@cv cv self-assigned this Jun 9, 2026
@copy-pr-bot

copy-pr-bot Bot commented Jun 9, 2026


Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@cv cv added the area: e2e (End-to-end tests, nightly failures, or validation infrastructure), area: onboarding (Onboarding FSM, provider setup, sandbox launch, or first-run flow), and chore (Build, CI, dependency, or tooling maintenance) labels Jun 9, 2026
@coderabbitai

coderabbitai Bot commented Jun 9, 2026

Contributor

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 682b603a-4c16-425f-9322-8f16e38c3146

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


@github-actions

github-actions Bot commented Jun 9, 2026

Contributor

E2E Advisor Recommendation

Required E2E: .github/workflows/e2e-vitest-scenarios.yaml#live-scenarios, .github/workflows/e2e-scenarios.yaml#run-scenario
Optional E2E: .github/workflows/e2e-vitest-scenarios.yaml#live-scenarios, .github/workflows/e2e-scenarios-all.yaml#run-scenario

Dispatch hint: scenarios=ubuntu-repo-cloud-openclaw,ubuntu-no-docker-preflight-negative,ubuntu-repo-cloud-openclaw-resume,ubuntu-repo-cloud-openclaw-repair,ubuntu-repo-cloud-openclaw-double-same-provider,ubuntu-repo-cloud-openclaw-custom-policies,ubuntu-invalid-nvidia-key-negative,ubuntu-gateway-port-conflict-negative

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/codex/e2e-fanout-02-onboarding-variant-fixtures
Head: HEAD
Confidence: high

Required E2E

  • .github/workflows/e2e-vitest-scenarios.yaml#live-scenarios (high; one ubuntu-latest live E2E matrix job per selected scenario, installs/builds CLI and mutates real NemoClaw/OpenShell state): Run the live Vitest-supported onboarding matrix because this PR changes the live support whitelist, matrix generation, fixture dispatcher, expected-failure contract, and onboarding fixture behavior. Include the baseline plus newly supported onboarding variants: ubuntu-repo-cloud-openclaw, ubuntu-no-docker-preflight-negative, ubuntu-repo-cloud-openclaw-resume, ubuntu-repo-cloud-openclaw-repair, ubuntu-repo-cloud-openclaw-double-same-provider, ubuntu-repo-cloud-openclaw-custom-policies, ubuntu-invalid-nvidia-key-negative, and ubuntu-gateway-port-conflict-negative.
  • .github/workflows/e2e-scenarios.yaml#run-scenario (high; one ubuntu-latest scenario-runner job executing real install/onboard negative flows): Run the typed shell scenario runner for the negative onboarding paths because compiler.ts and nemoclaw_scenarios/onboard/dispatch.sh changed the real shell dispatch path, secretEnv propagation, invalid-key injection, gateway-port holder, and docker-missing executable profile routing.

Optional E2E

  • .github/workflows/e2e-vitest-scenarios.yaml#live-scenarios (high): Run ubuntu-repo-docker-post-reboot-recovery as an optional confidence check because live registry and matrix code were touched and this scenario remains part of the supported live set, but the PR does not directly change lifecycle fixture dispatch.
  • .github/workflows/e2e-scenarios-all.yaml#run-scenario (very high; all typed scenario runner jobs across supported runners): Optional full fan-out after merge or as a maintainer-triggered soak if the team wants broad confidence that typed registry matrix changes did not disturb unrelated platform routes.

New E2E recommendations

  • Workflow route and typed registry parity (medium): The scenario runner workflow keeps a static ROUTES table while the typed registry and live matrix are generated dynamically. This PR changes supported scenario sets, so a future gap could leave a valid scenario unrunnable by workflow_dispatch.
    • Suggested test: Add a workflow-contract test that validates .github/workflows/e2e-scenarios.yaml route entries against buildScenarioMatrix()/registered scenario ids.
  • Unsupported onboarding variants (medium): Provider-switch and token-rotation OpenClaw onboarding scenarios remain skipped by live Vitest support even though they are security- and credential-sensitive user flows adjacent to the changed onboarding profile resolver.
    • Suggested test: Add live fixture coverage for ubuntu-repo-cloud-openclaw-double-provider-switch and ubuntu-repo-cloud-openclaw-token-rotation once their inference/credential fixtures are wired.
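The route and registry parity check suggested above could look roughly like the following. This is a sketch under assumptions: `compareRoutesToRegistry` and `ParityReport` are hypothetical names, and the two input lists stand in for the ids parsed from the workflow's static ROUTES table and the ids produced by the typed registry, whose real extraction is not shown in this PR.

```typescript
// Hypothetical report shape for a workflow-contract test.
interface ParityReport {
  // Registered scenario ids that the static route table cannot dispatch.
  missingRoutes: string[];
  // Route entries that no longer match any registered scenario.
  staleRoutes: string[];
}

// Compare route ids against registered scenario ids in both directions.
function compareRoutesToRegistry(
  routeIds: readonly string[],
  registeredIds: readonly string[],
): ParityReport {
  const routes = new Set(routeIds);
  const registered = new Set(registeredIds);
  return {
    missingRoutes: registeredIds.filter((id) => !routes.has(id)).sort(),
    staleRoutes: routeIds.filter((id) => !registered.has(id)).sort(),
  };
}

// Example inputs; in a real test these would come from the workflow file
// and from the scenario registry rather than literals.
const report = compareRoutesToRegistry(
  ["ubuntu-repo-cloud-openclaw", "ubuntu-retired-scenario"],
  ["ubuntu-repo-cloud-openclaw", "ubuntu-gateway-port-conflict-negative"],
);
```

A contract test would then assert that both lists are empty, so that a scenario added to the registry without a route (or a route left behind after a scenario is removed) fails fast.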

Dispatch hint

  • Workflow: .github/workflows/e2e-vitest-scenarios.yaml
  • jobs input: scenarios=ubuntu-repo-cloud-openclaw,ubuntu-no-docker-preflight-negative,ubuntu-repo-cloud-openclaw-resume,ubuntu-repo-cloud-openclaw-repair,ubuntu-repo-cloud-openclaw-double-same-provider,ubuntu-repo-cloud-openclaw-custom-policies,ubuntu-invalid-nvidia-key-negative,ubuntu-gateway-port-conflict-negative

@github-actions

github-actions Bot commented Jun 9, 2026

Contributor

E2E Scenario Advisor Recommendation

Required scenario E2E: e2e-scenarios-all
Optional scenario E2E: None

Dispatch required scenario E2E:

  • gh workflow run e2e-scenarios-all.yaml --ref <pr-head-ref>

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/codex/e2e-fanout-02-onboarding-variant-fixtures
Head: HEAD
Confidence: high

Required scenario E2E

  • e2e-scenarios-all: Scenario framework/runner logic and typed scenario catalog/runtime support changed, including onboarding phase execution, onboarding dispatch helpers, compiler routing, live scenario support, and baseline scenario metadata. Policy requires the all-scenarios fan-out for scenario runtime/runner or catalog-affecting changes.
    • Dispatch: gh workflow run e2e-scenarios-all.yaml --ref <pr-head-ref>

Optional scenario E2E

  • None.

Relevant changed files

  • test/e2e-scenario/framework-tests/e2e-live-project-config.test.ts
  • test/e2e-scenario/framework-tests/e2e-live-registry-discovery.test.ts
  • test/e2e-scenario/framework-tests/e2e-onboard-dispatch.test.ts
  • test/e2e-scenario/framework-tests/e2e-phase-onboarding.test.ts
  • test/e2e-scenario/framework-tests/e2e-phase-orchestrators.test.ts
  • test/e2e-scenario/framework-tests/e2e-scenario-matrix.test.ts
  • test/e2e-scenario/framework/phases/onboarding.ts
  • test/e2e-scenario/live/registry-scenarios.test.ts
  • test/e2e-scenario/nemoclaw_scenarios/onboard/dispatch.sh
  • test/e2e-scenario/scenarios/compiler.ts
  • test/e2e-scenario/scenarios/onboarding-profiles.ts
  • test/e2e-scenario/scenarios/runtime-support.ts
  • test/e2e-scenario/scenarios/scenarios/baseline.ts
  • vitest.config.ts

@github-actions

github-actions Bot commented Jun 9, 2026

Contributor

PR Review Advisor

Findings: 0 needs attention, 0 worth checking, 0 nice ideas
Since last review: 1 prior item resolved, 0 still apply, 0 new items found

Consider writing more tests for
  • **Runtime validation:** Validate that ubuntu-invalid-nvidia-key-negative runs without a parent NVIDIA_API_KEY, injects not-a-nvidia-key, writes scenario-result.json expectedFailure { phase: "onboarding", errorClass: "invalid-nvidia-api-key" }, and records gateway-absent and sandbox-absent probes. Framework and shell tests cover routing, secret-env behavior, expected-failure classification, and port-holder mechanics. Because this PR expands opt-in live scenarios that mutate real NemoClaw/OpenShell gateway, sandbox, and Docker state, targeted runtime validation would improve confidence in artifact contents and forbidden side effects.
  • **Runtime validation:** Validate that ubuntu-no-docker-preflight-negative writes scenario-result.json expectedFailure { phase: "preflight", errorClass: "docker-missing" } and records gateway-absent and sandbox-absent probes. Framework and shell tests cover routing, secret-env behavior, expected-failure classification, and port-holder mechanics. Because this PR expands opt-in live scenarios that mutate real NemoClaw/OpenShell gateway, sandbox, and Docker state, targeted runtime validation would improve confidence in artifact contents and forbidden side effects.
  • **Runtime validation:** Validate that ubuntu-gateway-port-conflict-negative holds the selected gateway port during onboarding, releases it afterward, writes scenario-result.json expectedFailure { phase: "onboarding", errorClass: "gateway-port-conflict" }, and proves that the gateway and sandbox are absent. Framework and shell tests cover routing, secret-env behavior, expected-failure classification, and port-holder mechanics. Because this PR expands opt-in live scenarios that mutate real NemoClaw/OpenShell gateway, sandbox, and Docker state, targeted runtime validation would improve confidence in artifact contents and forbidden side effects.
  • **Runtime validation:** Add a framework contract test that every default buildLiveScenarioMatrix() entry resolves through resolveExecutableOnboardingProfile() to a profile accepted by both OnboardingPhaseFixture.from() and the shell dispatcher profile set. Framework and shell tests cover routing, secret-env behavior, expected-failure classification, and port-holder mechanics. Because this PR expands opt-in live scenarios that mutate real NemoClaw/OpenShell gateway, sandbox, and Docker state, targeted runtime validation would improve confidence in artifact contents and forbidden side effects.
  • **Runtime validation:** Validate that ubuntu-repo-cloud-openclaw-resume, ubuntu-repo-cloud-openclaw-repair, ubuntu-repo-cloud-openclaw-double-same-provider, and ubuntu-repo-cloud-openclaw-custom-policies write the expected expectedStateId and probe IDs to scenario-result.json. Framework and shell tests cover routing, secret-env behavior, expected-failure classification, and port-holder mechanics. Because this PR expands opt-in live scenarios that mutate real NemoClaw/OpenShell gateway, sandbox, and Docker state, targeted runtime validation would improve confidence in artifact contents and forbidden side effects.
  • **Acceptance clause:** Refs Adopt Vitest fixtures as the E2E scenario execution model #4941 — add test evidence or identify existing coverage. The PR body references this issue, but deterministic linkedIssues was empty and issue body/comment clauses were not available for literal extraction.
  • **Acceptance clause:** Refs Adopt phase fixtures + registry-driven test discovery for Vitest E2E scenarios #4990 — add test evidence or identify existing coverage. The PR body references this issue, but deterministic linkedIssues was empty and issue body/comment clauses were not available for literal extraction.
  • **Acceptance clause:** Refs Phase 2: Onboarding and Installer Audit Coverage (E2E audit-coverage) #4348 — add test evidence or identify existing coverage. The PR body references this issue, but deterministic linkedIssues was empty and issue body/comment clauses were not available for literal extraction.
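The framework contract test proposed in the list above (every live matrix entry resolves to a profile that both the fixture and the shell dispatcher accept) might be sketched as follows. The function name, the `ContractViolation` shape, and the resolver signature are hypothetical stand-ins; the real buildLiveScenarioMatrix(), resolveExecutableOnboardingProfile(), and OnboardingPhaseFixture.from() signatures are not shown in this conversation, so the sketch takes plain ids, a resolver callback, and two profile sets instead.

```typescript
// Hypothetical resolver signature: scenario id in, profile name out,
// or undefined when no executable profile exists.
type ProfileResolver = (scenarioId: string) => string | undefined;

interface ContractViolation {
  scenarioId: string;
  reason: string;
}

// Check each scenario id against both consumers of the resolved profile.
function findProfileContractViolations(
  scenarioIds: readonly string[],
  resolveProfile: ProfileResolver,
  fixtureProfiles: ReadonlySet<string>,
  shellProfiles: ReadonlySet<string>,
): ContractViolation[] {
  const violations: ContractViolation[] = [];
  for (const scenarioId of scenarioIds) {
    const profile = resolveProfile(scenarioId);
    if (profile === undefined) {
      violations.push({ scenarioId, reason: "no executable profile" });
      continue;
    }
    if (!fixtureProfiles.has(profile)) {
      violations.push({ scenarioId, reason: `fixture rejects "${profile}"` });
    }
    if (!shellProfiles.has(profile)) {
      violations.push({ scenarioId, reason: `shell dispatcher rejects "${profile}"` });
    }
  }
  return violations;
}

// Illustrative data only; the scenario-to-profile mapping here is assumed.
const profileByScenario = new Map<string, string>([
  ["ubuntu-repo-cloud-openclaw", "cloud-openclaw"],
  ["ubuntu-gateway-port-conflict-negative", "cloud-openclaw-gateway-port-conflict"],
]);

const violations = findProfileContractViolations(
  [...profileByScenario.keys(), "ubuntu-unmapped-scenario"],
  (id) => profileByScenario.get(id),
  new Set(["cloud-openclaw", "cloud-openclaw-gateway-port-conflict"]),
  new Set(["cloud-openclaw"]),
);
```

With the illustrative data, the checker flags the scenario that has no profile and the one whose profile is missing from the shell set, which is the kind of drift the suggested contract test is meant to catch.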

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@copy-pr-bot

copy-pr-bot Bot commented Jun 9, 2026


This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@cv

cv commented Jun 10, 2026

Collaborator Author

Closing as superseded by #5106 and the post-#5098 one-E2E migration plan.

This branch belongs to the pre-cutover fanout stack. Any useful helper/scenario work should come back as a fresh, focused draft PR from current main: Vitest as the only E2E harness, GitHub Actions as the matrix, no revived runner path, no long-lived legacy-inventory.json roadmap expansion, and replacement/deletion evidence carried in the PR body plus linked issue.

@cv cv closed this Jun 10, 2026
2 participants