test(e2e): wire onboarding variant fixtures #5053

Closed
cv wants to merge 17 commits into main from codex/e2e-fanout-02-onboarding-variant-fixtures

Conversation

cv (Collaborator) commented Jun 9, 2026

Summary

Implements the onboarding-variant fixture layer for the E2E Vitest fan-out stack. This makes the onboarding family executable through typed fixture methods before the next PR flips individual scenarios into the live Vitest registry.

Related Issue

Refs #4941
Refs #4990
Refs #4348
Depends on #5046 and #5052.
Stacked on branch codex/e2e-fanout-01-inventory-internals.

Changes

  • Adds typed onboarding fixture branches for:
    • cloud-openclaw-custom-policies
    • cloud-openclaw-invalid-nvidia-key
    • cloud-openclaw-gateway-port-conflict
    • cloud-nvidia-openclaw-resume-after-interrupt
    • cloud-nvidia-openclaw-repair-existing-config
    • cloud-nvidia-openclaw-double-same-provider
  • Adds reusable negative-failure signature checks, stack-trace rejection, gateway port-holder support, and resume-onboard command handling.
  • Keeps no-Docker negative onboarding on the typed fixture path and adds the same no-stack-trace guard used by the new negative variants.
  • Validates repair cleanup before resume, tolerating only documented already-missing sandbox/forward state and failing on unrelated cleanup errors.
  • Keeps legacy compiled shell-runner secret metadata unchanged; #5054 (test(e2e): migrate smoke and onboarding scenarios) should flip supported live scenarios onto these typed fixtures before changing shell-dispatch contracts.
  • Adds framework tests for custom-policy env, bad-key injection, gateway-port conflict, resume, repair, double-onboard, Docker gating, cleanup failure handling, and typed fixture context behavior.
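The negative-failure signature check and the stack-trace guard described above can be pictured roughly as follows. This is a minimal sketch, not the PR's code: `CommandResult`, `NegativeExpectation`, `checkNegativeFailure`, and the stack-frame pattern are all hypothetical names and shapes chosen for illustration.

```typescript
// Hypothetical shapes for illustration; the PR's real types may differ.
interface CommandResult {
  exitCode: number;
  stdout: string;
  stderr: string;
}

interface NegativeExpectation {
  // Substrings that must all appear somewhere in the combined output.
  signatures: string[];
}

// Matches typical Node.js stack frames such as "    at main (/app/cli.js:10:5)".
const STACK_FRAME = /^\s+at .+:\d+:\d+\)?$/m;

// Returns a list of problems; an empty list means the failure looked as expected.
function checkNegativeFailure(
  result: CommandResult,
  expected: NegativeExpectation,
): string[] {
  const problems: string[] = [];
  const output = `${result.stdout}\n${result.stderr}`;
  if (result.exitCode === 0) {
    problems.push("command succeeded but a failure was expected");
  }
  for (const signature of expected.signatures) {
    if (!output.includes(signature)) {
      problems.push(`missing failure signature: ${signature}`);
    }
  }
  if (STACK_FRAME.test(output)) {
    problems.push("output contains a raw stack trace");
  }
  return problems;
}
```

Returning a list of problems rather than throwing on the first one lets a test report a missing signature and a leaked stack trace in the same run.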

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx vitest run --project e2e-scenario-framework test/e2e-scenario/framework-tests/e2e-phase-onboarding.test.ts test/e2e-scenario/framework-tests/e2e-phase-orchestrators.test.ts --silent=false --reporter=default
  • npx vitest run --project e2e-scenario-framework test/e2e-scenario/framework-tests/e2e-scenario-matrix.test.ts test/e2e-scenario/framework-tests/e2e-scenario-registry.test.ts test/e2e-scenario/framework-tests/e2e-plan-compiler.test.ts test/e2e-scenario/framework-tests/e2e-negative-matcher.test.ts --silent=false --reporter=default
  • npx vitest run --project e2e-scenario-framework --silent=false --reporter=default
  • npm run validate:configs && npx prek run --all-files --stage pre-push --skip tsc-plugin --skip tsc-js --skip tsc-cli --skip version-tag-sync --skip test-cli --skip test-plugin --skip source-shape-test-budget --skip test-file-size-budget --skip test-skills-yaml && npm run source-shape:check && npm run test-size:check && npx vitest run test/skills-frontmatter.test.ts && python3 scripts/generate-platform-docs.py --check
  • npm run typecheck:cli
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

jyaunches and others added 7 commits June 9, 2026 12:24
`liveScenarioSupport` previously rejected any scenario that declared an
`environment.lifecycle`, so post-onboard host mutations (reboot, rebuild,
upgrade, drift) could not surface in the live Vitest matrix at all.

Replace the unconditional reject with a `SUPPORTED_LIFECYCLES` whitelist
that starts with the single profile the upcoming post-reboot-recovery
fixture dispatches: `post-reboot-recovery`. Future profiles must land the
dispatcher branch and an expected-state in the same change set, so the
whitelist stays in lockstep with what the runner can actually execute.
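
As a rough illustration of the whitelist gate described above, the sketch below assumes a simplified `liveScenarioSupport` signature and a hypothetical `LiveSupport` result type. Only the `SUPPORTED_LIFECYCLES` name and the `post-reboot-recovery` profile come from this commit message.

```typescript
// Only this profile is named in the commit message.
const SUPPORTED_LIFECYCLES: ReadonlySet<string> = new Set([
  "post-reboot-recovery",
]);

// Hypothetical, simplified shapes for illustration.
interface ScenarioEnvironment {
  lifecycle?: string;
}

type LiveSupport =
  | { supported: true }
  | { supported: false; reason: string };

function liveScenarioSupport(environment: ScenarioEnvironment): LiveSupport {
  const { lifecycle } = environment;
  // Scenarios without a lifecycle are unaffected by the whitelist.
  if (lifecycle === undefined) return { supported: true };
  if (SUPPORTED_LIFECYCLES.has(lifecycle)) return { supported: true };
  return {
    supported: false,
    reason: `unsupported lifecycle profile: ${lifecycle}`,
  };
}
```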

Prepares the runner for #4423's failing-test-first guard, which needs a
post-reboot lifecycle scenario to demonstrate registry preservation +
Docker-backed sandbox recovery on Linux/Spark Docker-driver hosts.

Refs #4423

Adds two host-side state-validation probes the live runner needs to
express the regression target tracked by #4423:

  * `local-registry-entry-present` reads `~/.nemoclaw/sandboxes.json`
    and asserts the scenario's sandbox name is still recorded. This is
    deliberately orthogonal to `sandbox.expected`: post-reboot bugs
    can wipe the local registry while the live OpenShell gateway is
    healthy, and only a host-side probe catches the data-loss
    regression.

  * `docker-sandbox-container-present` runs
    `docker ps -a --filter label=openshell.ai/sandbox-name=<name>` and
    accepts running, stopped, or `*-nemoclaw-gpu-backup-*` sibling
    containers. The label filter mirrors `OPENSHELL_SANDBOX_NAME_LABEL`
    used by `findOpenShellDockerSandboxContainerIds` in
    `src/lib/onboard/docker-gpu-patch.ts`, so the probe stays in lock-
    step with how OpenShell labels containers today.
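
A minimal sketch of how the container probe could build its `docker ps` invocation and interpret the result. The `--format {{.Names}}` flag and both helper names are assumptions for illustration; only the `-a` flag and the label filter come from the description above.

```typescript
// Label key quoted in the commit message.
const SANDBOX_NAME_LABEL = "openshell.ai/sandbox-name";

// Hypothetical helper: argv for `docker`. The --format flag is an assumption
// added here so that each matching container prints one name per line.
function dockerPsArgs(sandboxName: string): string[] {
  return [
    "ps",
    "-a", // include stopped containers, not only running ones
    "--filter",
    `label=${SANDBOX_NAME_LABEL}=${sandboxName}`,
    "--format",
    "{{.Names}}",
  ];
}

// Hypothetical helper: any non-empty line means a labeled container exists.
// Renamed backup siblings keep their labels, so they match the same filter.
function containerPresent(dockerPsStdout: string): boolean {
  return dockerPsStdout.split("\n").some((line) => line.trim().length > 0);
}
```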

Probe wiring:

  * `StateProbeId` extended with the two new probe ids.
  * `ExpectedState` gains `localRegistry` and `dockerSandboxContainer`
    optional dimensions; `probesForState` emits the new probes only
    for `expected: "present"`. Negative-direction probes are
    intentionally omitted today and pinned by a probesForState test.
  * `StateValidationPhaseFixture.from()` now accepts either an
    expected-state ID or an inline `ExpectedState`, so unit tests can
    drive new probes without registering synthetic states in the
    typed registry. The live runner still calls `from(id, instance)`.
  * Fixture takes an optional `ProbeIO` injection so tests can stub
    the registry reader without touching `~/.nemoclaw`.
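
The `ProbeIO` injection point might look something like the sketch below. The interface surface, the function name, and the registry layout (`{ "sandboxes": { "<name>": ... } }`) are all assumptions; the real `sandboxes.json` schema is not shown in this PR.

```typescript
// Hypothetical ProbeIO surface; only the name comes from the commit message.
interface ProbeIO {
  readFile(path: string): string;
}

// Assumed registry layout: { "sandboxes": { "<name>": { ... } } }.
function localRegistryEntryPresent(
  io: ProbeIO,
  registryPath: string,
  sandboxName: string,
): boolean {
  let parsed: unknown;
  try {
    parsed = JSON.parse(io.readFile(registryPath));
  } catch {
    // A missing or corrupt registry counts as "entry not present".
    return false;
  }
  if (typeof parsed !== "object" || parsed === null) return false;
  const sandboxes = (parsed as { sandboxes?: unknown }).sandboxes;
  return (
    typeof sandboxes === "object" &&
    sandboxes !== null &&
    Object.prototype.hasOwnProperty.call(sandboxes, sandboxName)
  );
}
```

A unit test can then pass a stub `ProbeIO` that returns a fixed JSON string, so nothing under `~/.nemoclaw` is read.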

No callers of the existing typed registry are affected: every shipped
expected-state leaves `localRegistry` and `dockerSandboxContainer`
unset, so `probesForState` returns the same probe lists as before.
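
The backward-compatibility argument can be sketched as follows. `hostProbesForState` is a hypothetical helper that covers only the two new dimensions, and the `"absent"` value is an assumption standing in for whatever the negative direction is called.

```typescript
type StateProbeId =
  | "local-registry-entry-present"
  | "docker-sandbox-container-present";

// "absent" is an assumed name for the negative direction.
type Presence = "present" | "absent";

interface ExpectedState {
  localRegistry?: Presence;
  dockerSandboxContainer?: Presence;
}

// Emits the new probes only for `expected: "present"`. Unset or negative
// dimensions contribute nothing, so existing states keep their probe lists.
function hostProbesForState(state: ExpectedState): StateProbeId[] {
  const probes: StateProbeId[] = [];
  if (state.localRegistry === "present") {
    probes.push("local-registry-entry-present");
  }
  if (state.dockerSandboxContainer === "present") {
    probes.push("docker-sandbox-container-present");
  }
  return probes;
}
```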

Refs #4423

Adds a Vitest phase fixture that mutates host state between onboarding
and state-validation, so live scenarios can express post-onboard
invariants the legacy bash runner has no equivalent for.

`LifecyclePhaseFixture.simulate("post-reboot-recovery", instance, opts)`
reproduces the host-side conditions of a DGX Spark / Linux Docker-driver
reboot in two modes:

  * `stop-original` (default)   — `openshell gateway stop` + `docker
                                   stop` of the labeled sandbox
                                   container. Models the common reboot
                                   outcome where OpenShell forgets the
                                   sandbox while Docker keeps the
                                   container exited but labeled.

  * `rename-to-gpu-backup`      — additionally `docker rename`s the
                                   container to a `*-nemoclaw-gpu-
                                   backup-<ts>` sibling, mirroring the
                                   GPU-patch reboot path in
                                   `src/lib/onboard/docker-gpu-patch.ts`.

Both modes register cleanups (in reverse order) to restore the
container so test teardown leaves Docker in a usable state.
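
Reverse-order cleanup can be illustrated with a small last-in, first-out registry. `CleanupRegistry` and its methods are hypothetical names; the fixture's shared cleanup registry may have a different API.

```typescript
type Cleanup = () => Promise<void> | void;

// Hypothetical registry: cleanups run in the reverse of registration order,
// so the last mutation made to the host is the first one undone.
class CleanupRegistry {
  private readonly cleanups: Array<{ label: string; run: Cleanup }> = [];

  register(label: string, run: Cleanup): void {
    this.cleanups.push({ label, run });
  }

  // Runs and drains all cleanups; returns labels in execution order.
  async runAll(): Promise<string[]> {
    const ran: string[] = [];
    for (;;) {
      const entry = this.cleanups.pop();
      if (entry === undefined) break;
      await entry.run();
      ran.push(entry.label);
    }
    return ran;
  }
}
```

In `rename-to-gpu-backup` mode, for example, the rename would be undone before the container is restarted, because the rename was registered last.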

Wiring:

  * `framework/phases/index.ts` re-exports the fixture and types.
  * `framework/e2e-test.ts` registers a `lifecycle` Vitest fixture on
    `E2EScenarioFixtures`, wired with the shared `host`, `sandbox`,
    and `cleanup` registries.
  * `live/registry-scenarios.test.ts` invokes
    `lifecycle.simulate(profile, instance)` between `onboard.from(...)`
    and `stateValidation.from(...)` whenever the scenario declares a
    whitelisted `environment.lifecycle`. Scenarios that omit lifecycle
    are unaffected. A scenario whose lifecycle is whitelisted by
    `runtime-support.ts` but NOT dispatched by the fixture fails fast
    with a clear error so the whitelist and dispatcher stay in lock-
    step.
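
The fail-fast behavior for a profile that is whitelisted but not dispatched might look like this sketch. `DISPATCHERS`, `planLifecycle`, and the idea of returning command strings are assumptions for illustration; the two commands are the ones named for the default `stop-original` mode.

```typescript
// Hypothetical handler shape: returns the host commands a profile would run.
type LifecycleHandler = (containerName: string) => string[];

const DISPATCHERS = new Map<string, LifecycleHandler>([
  [
    "post-reboot-recovery",
    (containerName) => [
      "openshell gateway stop",
      `docker stop ${containerName}`,
    ],
  ],
]);

function planLifecycle(profile: string, containerName: string): string[] {
  const handler = DISPATCHERS.get(profile);
  if (handler === undefined) {
    // Fail fast so a whitelist entry without a dispatcher is caught at once.
    throw new Error(
      `lifecycle profile "${profile}" has no dispatcher branch`,
    );
  }
  return handler(containerName);
}
```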

Coverage in `e2e-phase-lifecycle.test.ts` exercises both modes,
gateway-stop tolerance, the no-labeled-container failure case, the
docker-discover failure case, the unsupported-profile rejection,
the cleanup queue order, and `buildBackupContainerName` truncation.

The fixture is intentionally narrow on profiles: only
`post-reboot-recovery` is dispatched today. Adding rebuild, upgrade,
or drift profiles is a separate, equally narrow change set that must
land the dispatcher branch and `SUPPORTED_LIFECYCLES` whitelist
together.

Refs #4423

Registers the failing-test-first guard for #4423 in the typed scenario
registry so the live Vitest matrix from #5006 fans it out as a
dedicated CI job. Builds on the framework primitives added earlier in
this PR (lifecycle phase fixture, host-side probes, lifecycle whitelist).

Additions:

  * `post-reboot-recovery-ready` expected-state in
    `scenarios/expected-states.ts` declaring the user-visible
    invariants that must hold after a `nemoclaw <name> status` call
    on a freshly-rebooted DGX Spark / Linux Docker-driver host:
      - cli installed,
      - gateway healthy (the user-systemd unit from #4580 brings it
        back up before status runs),
      - sandbox running (recovery completed in time),
      - localRegistry entry preserved (the user-visible regression
        target — destroyed on unfixed `main`),
      - dockerSandboxContainer present (recovery didn't delete the
        labeled container or its `*-nemoclaw-gpu-backup-*` sibling).

  * `ubuntu-repo-docker-post-reboot-recovery` scenario in
    `scenarios/scenarios/baseline.ts` wiring
    `ubuntuRepoDockerLifecycle("cloud-openclaw", "post-reboot-recovery")`
    against the new expected-state and a smoke suite. Carries a
    description that explains the RED/GREEN contract and points to the
    PR-A fix landing in `src/lib/`.

  * `manifests/openclaw-nvidia-post-reboot-recovery.yaml` declares
    `lifecycle: post-reboot-recovery` and the same NVIDIA_API_KEY
    credential ref the cloud-openclaw scenarios use.

  * `.github/workflows/e2e-scenarios.yaml` ROUTES table gains the new
    scenario so the workflow-boundary test
    (`e2e-scenarios-workflow.test.ts`) routes every typed id.

Test pinning:

  * `e2e-scenario-matrix.test.ts` updated from a 1-entry to a 2-entry
    live matrix expectation. The new entry asserts on
    `expectedStateId: "post-reboot-recovery-ready"` so a future
    accidental dropped-lifecycle change to the scenario regresses
    loudly.

  * `e2e-live-registry-discovery.test.ts` swaps the synthetic
    whitelist-coverage test for an assertion against the real
    `ubuntu-repo-docker-post-reboot-recovery` registry entry.

Behavior:

  * On unfixed `main`, the live runner's lifecycle phase stops the
    OpenShell gateway runtime and `docker stop`s the labeled sandbox
    container. State-validation then runs `nemoclaw <name> status`
    (which restarts the gateway via systemd) and the destructive
    `missing` branch in `src/lib/actions/sandbox/status.ts` wipes the
    local registry entry. The `local-registry-entry-present` probe
    fails. Scenario goes RED.

  * On the PR-A fix branch, the new Docker-driver sandbox recovery
    helper restarts the labeled container before stale-removal can
    fire, registry survives, all five probes pass. Scenario flips
    GREEN.

The bash-side legacy compiler emits a
`lifecycle.profile.post-reboot-recovery` PhaseAction pointing at
`nemoclaw_scenarios/lifecycle/dispatch.sh`, but the legacy bash worker
is intentionally not provided: this scenario is Vitest-only. The
typed runner's `LifecyclePhaseFixture` handles dispatch directly. If
the legacy runner is invoked against this scenario it errors out at
the dispatcher; that's the right failure mode while the bash side
stays on its own retirement clock.

Refs #4423
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
cv self-assigned this Jun 9, 2026

copy-pr-bot (Bot) commented Jun 9, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.


cv added the area: e2e (End-to-end tests, nightly failures, or validation infrastructure), area: onboarding (Onboarding FSM, provider setup, sandbox launch, or first-run flow), and chore (Build, CI, dependency, or tooling maintenance) labels Jun 9, 2026

coderabbitai (Bot, Contributor) commented Jun 9, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 8c28daa8-8c5d-4855-8971-a5fb80a0ac86


github-actions (Bot, Contributor) commented Jun 9, 2026

E2E Advisor Recommendation

Required E2E: ubuntu-repo-cloud-openclaw, ubuntu-repo-cloud-openclaw-custom-policies, ubuntu-invalid-nvidia-key-negative, ubuntu-gateway-port-conflict-negative, ubuntu-no-docker-preflight-negative, ubuntu-repo-cloud-openclaw-resume, ubuntu-repo-cloud-openclaw-repair, ubuntu-repo-cloud-openclaw-double-same-provider
Optional E2E: ubuntu-repo-cloud-openclaw-double-provider-switch, ubuntu-repo-cloud-openclaw-token-rotation

Dispatch hint: ubuntu-repo-cloud-openclaw,ubuntu-repo-cloud-openclaw-custom-policies,ubuntu-invalid-nvidia-key-negative,ubuntu-gateway-port-conflict-negative,ubuntu-no-docker-preflight-negative,ubuntu-repo-cloud-openclaw-resume,ubuntu-repo-cloud-openclaw-repair,ubuntu-repo-cloud-openclaw-double-same-provider

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/codex/e2e-fanout-01-inventory-internals
Head: HEAD
Confidence: high

Required E2E

  • ubuntu-repo-cloud-openclaw (high): Baseline cloud OpenClaw onboarding should run because the shared onboarding fixture path and cleanup behavior changed.
  • ubuntu-repo-cloud-openclaw-custom-policies (high): The PR adds a custom policy/model onboarding fixture path with policy preset environment variables and expected ready-state behavior.
  • ubuntu-invalid-nvidia-key-negative (medium): The PR adds invalid NVIDIA API key negative onboarding handling, redaction values, expected failure classification, and stack-trace rejection.
  • ubuntu-gateway-port-conflict-negative (medium): The PR adds gateway port conflict onboarding behavior using a local port holder and validates that failure is classified without unwanted sandbox/gateway side effects.
  • ubuntu-no-docker-preflight-negative (medium): The PR modifies no-Docker negative preflight handling by adding stack-trace rejection and shared expected-failure helpers; this scenario validates the preflight boundary.
  • ubuntu-repo-cloud-openclaw-resume (high): The PR adds the resume-after-interrupt onboarding fixture path, including failure injection, nemoclaw onboard --resume, and resume environment behavior.
  • ubuntu-repo-cloud-openclaw-repair (high): The PR adds repair-existing-config behavior that deletes sandboxes, stops forwards, tolerates missing artifacts, and then resumes onboarding.
  • ubuntu-repo-cloud-openclaw-double-same-provider (high): The PR adds the double-onboard same-provider fixture path using sandbox recreation, which directly affects sandbox lifecycle and onboarding idempotency coverage.
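
The repair path's "tolerate missing artifacts, fail on anything else" rule could be sketched as below. The helper name and the error patterns are hypothetical; the actual documented already-missing messages are not quoted anywhere in this PR.

```typescript
interface CleanupResult {
  exitCode: number;
  stderr: string;
}

// Hypothetical patterns standing in for the documented "already missing"
// messages, which are not shown in this PR.
const ALREADY_MISSING: RegExp[] = [
  /sandbox .* not found/i,
  /no such forward/i,
];

// Throws unless the step succeeded or failed only because its target was
// already gone.
function assertCleanupTolerable(step: string, result: CleanupResult): void {
  if (result.exitCode === 0) return;
  if (ALREADY_MISSING.some((pattern) => pattern.test(result.stderr))) return;
  throw new Error(`${step} failed: ${result.stderr.trim()}`);
}
```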

Optional E2E

  • ubuntu-repo-cloud-openclaw-double-provider-switch (high): Adjacent confidence for repeated onboarding/provider-switch lifecycle behavior, but this PR specifically changes same-provider recreation rather than provider-switch logic.
  • ubuntu-repo-cloud-openclaw-token-rotation (high): Optional adjacent coverage for credential lifecycle after onboarding fixture changes; not directly touched by the diff.

New E2E recommendations

  • None.

Dispatch hint

  • Workflow: .github/workflows/e2e-scenarios.yaml
  • jobs input: ubuntu-repo-cloud-openclaw,ubuntu-repo-cloud-openclaw-custom-policies,ubuntu-invalid-nvidia-key-negative,ubuntu-gateway-port-conflict-negative,ubuntu-no-docker-preflight-negative,ubuntu-repo-cloud-openclaw-resume,ubuntu-repo-cloud-openclaw-repair,ubuntu-repo-cloud-openclaw-double-same-provider

github-actions (Bot, Contributor) commented Jun 9, 2026

E2E Scenario Advisor Recommendation

Required scenario E2E: ubuntu-repo-cloud-openclaw, ubuntu-repo-cloud-openclaw-custom-policies, ubuntu-invalid-nvidia-key-negative, ubuntu-gateway-port-conflict-negative, ubuntu-no-docker-preflight-negative, ubuntu-repo-cloud-openclaw-resume, ubuntu-repo-cloud-openclaw-repair, ubuntu-repo-cloud-openclaw-double-same-provider
Optional scenario E2E: macos-repo-cloud-openclaw, wsl-repo-cloud-openclaw, brev-launchable-cloud-openclaw

Dispatch required scenario E2E:

  • gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw
  • gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-custom-policies
  • gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-invalid-nvidia-key-negative
  • gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-gateway-port-conflict-negative
  • gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-no-docker-preflight-negative
  • gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-resume
  • gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-repair
  • gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-double-same-provider

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/codex/e2e-fanout-01-inventory-internals
Head: HEAD
Confidence: high

Required scenario E2E

  • ubuntu-repo-cloud-openclaw: Covers the baseline cloud OpenClaw onboarding path refactored through shared helpers in test/e2e-scenario/framework/phases/onboarding.ts.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw
  • ubuntu-repo-cloud-openclaw-custom-policies: Directly exercises the newly changed custom model/policy preset onboarding path.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-custom-policies
  • ubuntu-invalid-nvidia-key-negative: Directly exercises the invalid NVIDIA API key negative onboarding behavior added in the onboarding phase fixture.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-invalid-nvidia-key-negative
  • ubuntu-gateway-port-conflict-negative: Directly exercises the gateway port conflict negative onboarding behavior and local port holder logic.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-gateway-port-conflict-negative
  • ubuntu-no-docker-preflight-negative: Covers the modified no-Docker preflight negative path, including stack-trace rejection and failure-signature handling.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-no-docker-preflight-negative
  • ubuntu-repo-cloud-openclaw-resume: Directly exercises resume-after-interrupt onboarding flow added to the onboarding phase fixture.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-resume
  • ubuntu-repo-cloud-openclaw-repair: Directly exercises repair-existing-config onboarding flow and repair cleanup handling added to the onboarding phase fixture.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-repair
  • ubuntu-repo-cloud-openclaw-double-same-provider: Directly exercises the double same-provider onboarding/recreate path added to the onboarding phase fixture.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-double-same-provider

Optional scenario E2E

  • macos-repo-cloud-openclaw: Adjacent platform coverage for cloud OpenClaw onboarding; optional because it uses a special macOS runner and primarily validates platform-specific CLI readiness.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=macos-repo-cloud-openclaw
  • wsl-repo-cloud-openclaw: Adjacent platform coverage for cloud OpenClaw onboarding on WSL; optional because it requires a Windows/WSL runner.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=wsl-repo-cloud-openclaw
  • brev-launchable-cloud-openclaw: Adjacent launchable-image coverage for baseline cloud OpenClaw onboarding; optional because it uses the Brev launchable path rather than the primary repo-current onboarding target.
    • Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=brev-launchable-cloud-openclaw

Relevant changed files

  • test/e2e-scenario/framework-tests/e2e-phase-onboarding.test.ts
  • test/e2e-scenario/framework-tests/e2e-phase-orchestrators.test.ts
  • test/e2e-scenario/framework/phases/onboarding.ts

github-actions (Bot, Contributor) commented Jun 9, 2026

PR Review Advisor

Findings: 0 needs attention, 1 worth checking, 0 nice ideas
Since last review: 2 prior items resolved, 1 still applies, 0 new items found

Review findings

🛠️ Needs attention

  • None.

🔎 Worth checking

  • Typed invalid-key and port-conflict contracts still diverge from compiled shell dispatch (test/e2e-scenario/scenarios/compiler.ts:132): The new typed fixture safely injects `NVIDIA_API_KEY=not-a-nvidia-key` for `cloud-openclaw-invalid-nvidia-key` and binds `127.0.0.1:18080` plus `NEMOCLAW_GATEWAY_PORT=18080` for `cloud-openclaw-gateway-port-conflict`. However, the active compiled shell-runner path still declares live `NVIDIA_API_KEY` passthrough for both profiles and `dispatch.sh` routes both to the normal `e2e_onboard_cloud_openclaw` worker. If these profiles are exercised through compiled shell plans before the live registry flip, the invalid-key scenario can use a real secret and the port-conflict scenario may run normal onboarding instead of the intended negative path.
    • Recommendation: Either make the compiled shell path explicitly reject these typed-fixture-only profiles until #5054 (test(e2e): migrate smoke and onboarding scenarios) flips them, or update the shell dispatcher/workers and compiler metadata to match the typed fixture semantics. Add a plan/dispatcher test proving the temporary rejection or the final fake-key/port-conflict behavior.
    • Evidence: `OnboardingPhaseFixture.cloudOpenClawInvalidNvidiaKey()` injects `INVALID_NVIDIA_API_KEY`; `cloudOpenClawGatewayPortConflict()` sets `NEMOCLAW_GATEWAY_PORT` and uses `withPortHolder()`. Nearby active code still has `ONBOARD_PROFILE_SECRET_ENV["cloud-openclaw-invalid-nvidia-key"] = ["NVIDIA_API_KEY"]`, `ONBOARD_PROFILE_SECRET_ENV["cloud-openclaw-gateway-port-conflict"] = ["NVIDIA_API_KEY"]`, and `dispatch.sh` routes `cloud-openclaw-invalid-nvidia-key | cloud-openclaw-gateway-port-conflict)` to `e2e_onboard_cloud_openclaw`.
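
For readers unfamiliar with the port-holder technique referenced in the evidence, here is a minimal sketch built on Node's `node:net` module. The signature is an assumption; the PR's `withPortHolder()` may differ. The idea is to bind the port for the duration of a callback so the command under test hits an address-in-use failure, then release it.

```typescript
import { createServer } from "node:net";
import type { AddressInfo } from "node:net";

// Hypothetical signature: holds 127.0.0.1:<port> while `run` executes and
// always releases it afterward, even if `run` throws.
async function withPortHolder<T>(
  port: number,
  run: (heldPort: number) => Promise<T>,
): Promise<T> {
  const server = createServer();
  await new Promise<void>((resolve, reject) => {
    server.once("error", reject); // e.g. the port is already taken
    server.listen(port, "127.0.0.1", () => resolve());
  });
  // With port 0 the OS picks a free port; report the one actually bound.
  const heldPort = (server.address() as AddressInfo).port;
  try {
    return await run(heldPort);
  } finally {
    await new Promise<void>((resolve) => server.close(() => resolve()));
  }
}
```

A scenario would pass `18080` and run onboarding with `NEMOCLAW_GATEWAY_PORT=18080` inside the callback. Unit tests can pass `0` to avoid depending on a fixed port being free.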

🌱 Nice ideas

  • None.
Consider writing more tests for
  • **Runtime validation:** the compiled shell plan for cloud-openclaw-invalid-nvidia-key is rejected or explicitly marked typed-fixture-only until the live registry flip.
  • **Runtime validation:** the compiled shell plan for cloud-openclaw-gateway-port-conflict is rejected or explicitly marked typed-fixture-only until the live registry flip.
  • **Runtime validation:** cloud-openclaw-gateway-port-conflict binds 127.0.0.1:18080 while the onboarding command is running and releases it afterward.
  • **Runtime validation:** resume and repair variants reject unavailable Docker before reading NVIDIA_API_KEY when no secret is configured.
  • Rationale for the four runtime-validation items above: the new typed fixture branches have good unit-level FakeRunner coverage, including negative signatures and cleanup failure handling. Runtime/compiled-path validation is still advisable because the same profile names remain routable through the legacy compiled shell dispatcher with different secret and failure-condition setup.
  • **Acceptance clause:** Related issue clauses from #4941 (Adopt Vitest fixtures as the E2E scenario execution model), #4990 (Adopt phase fixtures + registry-driven test discovery for Vitest E2E scenarios), and #4348 (Phase 2: Onboarding and Installer Audit Coverage (E2E audit-coverage)): add test evidence or identify existing coverage. The deterministic context did not include linked issue bodies or comments, so literal issue acceptance clauses could not be extracted beyond the PR body's references.
  • **Acceptance clause:** Adds typed onboarding fixture branches for: `cloud-openclaw-invalid-nvidia-key` — add test evidence or identify existing coverage. The typed fixture branch exists, injects `not-a-nvidia-key`, avoids `secrets.required()`, and has unit coverage for expected failure and stack-trace rejection. The same profile remains divergent in the compiled shell dispatcher path, which still passes live `NVIDIA_API_KEY` and invokes the normal cloud OpenClaw worker.
  • **Acceptance clause:** Adds typed onboarding fixture branches for: `cloud-openclaw-gateway-port-conflict` — add test evidence or identify existing coverage. The typed fixture branch exists, sets `NEMOCLAW_GATEWAY_PORT=18080`, wraps onboarding in a local port holder, and has unit coverage for the env and expected failure. The same profile remains divergent in the compiled shell dispatcher path, which still invokes the normal cloud OpenClaw worker without the port conflict setup.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

Base automatically changed from codex/e2e-fanout-01-inventory-internals to main June 10, 2026 02:41
cv (Collaborator, Author) commented Jun 10, 2026

Closing as superseded by #5106 and the post-#5098 one-E2E migration plan.

This branch belongs to the pre-cutover fanout stack. Any useful helper/scenario work should come back as a fresh, focused draft PR from current main: Vitest as the only E2E harness, GitHub Actions as the matrix, no revived runner path, no long-lived legacy-inventory.json roadmap expansion, and replacement/deletion evidence carried in the PR body plus linked issue.

cv closed this Jun 10, 2026