
refactor(onboard): run initial phases through FSM slice #4499

Merged
cv merged 84 commits into main from stack/onboard-fsm-use-initial-slice on Jun 9, 2026

@cv cv commented May 29, 2026

Collaborator

Summary

Move the fresh-run preflight/gateway live call site onto the initial FSM flow slice. Resume remains on the compatibility path for now so preflight and gateway backstops still run even when saved machine state is already ahead.

Changes

  • Build an initial OnboardFlowContext in src/lib/onboard.ts for preflight/gateway state.
  • Wrap the existing preflight and gateway handler calls as sequence phases.
  • Use runInitialOnboardFlowSequence(...) for fresh runs that start at preflight.
  • Preserve the compatibility path for resume/ahead-state sessions.
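The changes above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the shapes of `Session` and `OnboardFlowContext`, the `SequencePhase` interface, the handlers, and `runPhasesInOrder` (a stand-in for `runInitialOnboardFlowSequence(...)`) are all assumptions, and the handlers are synchronous only to keep the sketch short.

```typescript
// Hypothetical shapes; the real OnboardFlowContext is not shown in this PR.
type MachineState = "preflight" | "gateway" | "provider";

interface Session {
  machineState: MachineState;
  log: string[];
}

interface OnboardFlowContext {
  session: Session;
  resume: boolean;
}

// A sequence phase wraps an existing handler and names the state that follows it.
interface SequencePhase {
  next: MachineState;
  run(ctx: OnboardFlowContext): OnboardFlowContext;
}

// Returns a copy of the context with one more log entry.
function record(ctx: OnboardFlowContext, entry: string): OnboardFlowContext {
  return { ...ctx, session: { ...ctx.session, log: ctx.session.log.concat([entry]) } };
}

// Stand-ins for the existing preflight and gateway handlers.
function handlePreflight(ctx: OnboardFlowContext): OnboardFlowContext {
  return record(ctx, "preflight");
}

function handleGateway(ctx: OnboardFlowContext): OnboardFlowContext {
  return record(ctx, "gateway");
}

const initialPhases: SequencePhase[] = [
  { next: "gateway", run: handlePreflight },
  { next: "provider", run: handleGateway },
];

// Stand-in runner: executes each phase in order and advances the machine state.
function runPhasesInOrder(phases: SequencePhase[], initial: OnboardFlowContext): OnboardFlowContext {
  let ctx = initial;
  for (const phase of phases) {
    ctx = phase.run(ctx);
    ctx = { ...ctx, session: { ...ctx.session, machineState: phase.next } };
  }
  return ctx;
}

function runInitialPhases(ctx: OnboardFlowContext): OnboardFlowContext {
  const freshAtPreflight = !ctx.resume && ctx.session.machineState === "preflight";
  if (freshAtPreflight) {
    return runPhasesInOrder(initialPhases, ctx);
  }
  // Compatibility path: call the handlers directly so the backstops still run
  // even when the saved machine state is already ahead.
  return handleGateway(handlePreflight(ctx));
}
```

In this sketch only the fresh path lets the runner advance `machineState`; the resume path leaves the saved state untouched while still running both handlers.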

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

Summary by CodeRabbit

  • New Features

    • Enhanced GPU passthrough guidance during onboarding
    • Improved support for resuming interrupted onboarding sessions
    • Better GPU configuration detection and handling
  • Refactor

    • Streamlined onboarding flow orchestration
  • Tests

    • Added comprehensive coverage for resume scenarios and GPU detection

cv added 30 commits May 27, 2026 15:18
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
cv added 15 commits June 9, 2026 11:33
…inalization-phases' into stack/onboard-fsm-flow-sequence
… into stack/onboard-fsm-initial-sequence-slice
…ce-slice' into stack/onboard-fsm-core-sequence-slice
…slice' into stack/onboard-fsm-final-sequence-slice
…-slice' into stack/onboard-fsm-use-initial-slice

# Conflicts:
#	src/lib/onboard.ts
…e-sequence-slice

# Conflicts:
#	src/lib/onboard/machine/flow-slices.test.ts
#	src/lib/onboard/machine/flow-slices.ts
…slice' into stack/onboard-fsm-final-sequence-slice
…-slice' into stack/onboard-fsm-use-initial-slice
copy-pr-bot Bot commented Jun 9, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

cv added 3 commits June 9, 2026 13:39
…al-sequence-slice

# Conflicts:
#	src/lib/onboard/machine/flow-slices.test.ts
#	src/lib/onboard/machine/flow-slices.ts
…-slice' into stack/onboard-fsm-use-initial-slice
Base automatically changed from stack/onboard-fsm-final-sequence-slice to main June 9, 2026 20:50
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@cv cv marked this pull request as ready for review June 9, 2026 23:02

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 6517-6523: The code assigns session from initialContext.session
which can discard runner-updated session state; instead use the session returned
on the runner result (initialFlowResult.session) after
runInitialOnboardFlowSlice() so downstream resume sees the latest machine/step
updates—replace uses of initialContext.session for the session handoff with
initialFlowResult.session and keep reading sandboxGpuConfig/gpu from
initialContext as before.
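
The handoff described in this comment might look like the sketch below. The types, the `runInitialSliceSketch` stand-in, and the `handOff` helper are hypothetical; only the names `initialContext`, `initialFlowResult`, `session`, `sandboxGpuConfig`, and `gpu` come from the comment. The sketch assumes the runner returns a new session object rather than mutating the one on the context, which is the situation the comment warns about.

```typescript
// Hypothetical shapes for illustration only.
interface Session {
  step: string;
}

interface InitialContext {
  session: Session;
  sandboxGpuConfig: { mode: string } | null;
  gpu: string | null;
}

interface InitialFlowResult {
  session: Session;
}

// Stand-in runner: fills GPU fields on the context and returns an updated
// session as a new object, leaving ctx.session stale.
function runInitialSliceSketch(ctx: InitialContext): InitialFlowResult {
  ctx.gpu = "gpu-0";
  ctx.sandboxGpuConfig = { mode: "passthrough" };
  return { session: { ...ctx.session, step: "gateway-complete" } };
}

function handOff(initialContext: InitialContext) {
  const initialFlowResult = runInitialSliceSketch(initialContext);
  return {
    // Take the session from the runner result so later steps see its updates.
    session: initialFlowResult.session,
    // GPU state is still read from the context, as before.
    sandboxGpuConfig: initialContext.sandboxGpuConfig,
    gpu: initialContext.gpu,
  };
}
```

Reading `initialContext.session` after the runner returns would yield the pre-run step in this sketch, so a later resume would not see the latest machine or step updates.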
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e3a34e8f-192c-4fbc-9d6f-db03b49bf9e3

📥 Commits

Reviewing files that changed from the base of the PR and between 2f1c2da and 386b1dc.

📒 Files selected for processing (5)
  • src/lib/onboard.ts
  • src/lib/onboard/machine/initial-flow-phases.test.ts
  • src/lib/onboard/machine/initial-flow-phases.ts
  • src/lib/onboard/sandbox-gpu-notes.ts
  • src/lib/onboard/sandbox-gpu-preflight.ts

Comment thread src/lib/onboard.ts

@coderabbitai coderabbitai Bot left a comment

Contributor

🧹 Nitpick comments (1)
src/lib/onboard/machine/initial-flow-phases.test.ts (1)

342-362: 💤 Low value

Consider using exact equality to enforce call order.

The test uses expect.arrayContaining([...]) which verifies all listed calls are present but permits additional calls and doesn't enforce exact ordering. Since the preflight→gateway flow executes operations in a deterministic sequence (context snippet 1), using exact equality would catch ordering regressions and unexpected calls:

-    expect(calls).toEqual(
-      expect.arrayContaining([
+    expect(calls).toEqual([
         "get-sandbox",
         "resume-gpu-overrides",
         ...
         "record-gateway-complete",
-      ]),
-    );
+    ]);

The current form still validates presence of the expected operations and throwing mocks guard against unwanted ones, so this is an optional strengthening rather than a fix.
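
To make the difference concrete, here is a small model of the two assertion styles in plain TypeScript, written without Jest so it runs on its own. `containsAll` approximates what `expect.arrayContaining` checks for string entries (presence only), and `sameSequence` approximates exact equality. Both helpers and the `"unexpected-call"` entry are hypothetical, and only the three call names visible in the diff are used; the elided entries are left out.

```typescript
// Presence-only check: every expected name appears somewhere in `actual`.
function containsAll(actual: string[], expected: string[]): boolean {
  return expected.every((name) => actual.indexOf(name) !== -1);
}

// Exact check: same length, same names, same order.
function sameSequence(actual: string[], expected: string[]): boolean {
  return actual.length === expected.length && actual.every((name, i) => name === expected[i]);
}

const expectedCalls = ["get-sandbox", "resume-gpu-overrides", "record-gateway-complete"];

// Same calls, but the first two are swapped.
const reorderedCalls = ["resume-gpu-overrides", "get-sandbox", "record-gateway-complete"];

// All expected calls in order, plus one extra (hypothetical) call at the end.
const callsWithExtra = expectedCalls.concat(["unexpected-call"]);

const presenceAcceptsReordered = containsAll(reorderedCalls, expectedCalls);
const exactAcceptsReordered = sameSequence(reorderedCalls, expectedCalls);
const presenceAcceptsExtra = containsAll(callsWithExtra, expectedCalls);
const exactAcceptsExtra = sameSequence(callsWithExtra, expectedCalls);
```

The presence-only check accepts both the reordered list and the list with an extra call, while the exact check rejects both, which is the regression class the nitpick is about.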

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard/machine/initial-flow-phases.test.ts` around lines 342 - 362,
The test currently uses expect.arrayContaining([...]) which only asserts
presence and not order or extra calls; change the assertion on the calls array
in initial-flow-phases.test (the expect(calls) assertion) to assert exact
equality and ordering (e.g., replace expect.arrayContaining with a direct
equality assertion such as expect(calls).toEqual([...]) or
expect(calls).toStrictEqual([...])) so the test enforces the precise sequence
and disallows extra calls, referencing the same literal list of call names shown
in the diff.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d60de121-3c35-4b90-a51f-46dd92d27c10

📥 Commits

Reviewing files that changed from the base of the PR and between 386b1dc and d306d71.

📒 Files selected for processing (2)
  • src/lib/onboard/machine/initial-flow-phases.test.ts
  • src/lib/onboard/machine/initial-flow-phases.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/lib/onboard/machine/initial-flow-phases.ts

Signed-off-by: Carlos Villela <cvillela@nvidia.com>
github-actions Bot commented Jun 9, 2026

Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 27242617348
Target ref: 1a6e0067e737d51eb073c204f5c9cfd79982ce60
Workflow ref: main
Requested jobs: cloud-onboard-e2e,onboard-resume-e2e,onboard-negative-paths-e2e,issue-3600-gpu-proof-optional-e2e
Summary: 4 passed, 0 failed, 0 skipped

| Job | Result |
| --- | --- |
| cloud-onboard-e2e | ✅ success |
| issue-3600-gpu-proof-optional-e2e | ✅ success |
| onboard-negative-paths-e2e | ✅ success |
| onboard-resume-e2e | ✅ success |

@cv cv merged commit 4515eda into main Jun 9, 2026
34 checks passed
@cv cv deleted the stack/onboard-fsm-use-initial-slice branch June 9, 2026 23:58

Labels

  • area: onboarding (Onboarding FSM, provider setup, sandbox launch, or first-run flow)
  • refactor (PR restructures code without intended behavior change)
  • v0.0.62 (Release target)

4 participants