Skip to content

test(onboard): correct caller-level coverage citation in gateway-start-failure-integration tests#5118

Closed
jyaunches wants to merge 1 commit into
mainfrom
test/correct-gateway-start-failure-coverage-citation
Closed

test(onboard): correct caller-level coverage citation in gateway-start-failure-integration tests#5118
jyaunches wants to merge 1 commit into
mainfrom
test/correct-gateway-start-failure-coverage-citation

Conversation

@jyaunches

@jyaunches jyaunches commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Summary

Small accuracy fix for documentation that landed in PR #5109. The merged PR's it.todo block in src/lib/onboard/gateway-start-failure-integration.test.ts and the inventory entry's notes field both overstate a coverage gap: they describe three caller-level contracts as "not directly proven", but two of those contracts are already proven by an existing process-level test in test/onboard.test.ts.

This PR was authored as f532050b6 on the #5109 branch but lost a race against the merge — cv approved at 06:36:51Z, the squash-merge happened at 06:38:23Z, the accuracy fix was pushed shortly after and never landed.

What's actually proven (and where)

The post-merge state in main claims the following are unproven caller-level gaps:

  1. ⚠️ startGateway aborts via pRetry.AbortError without entering health-poll loop
  2. ⚠️ startGateway never invokes openshell status or gateway info after docker-unreachable output
  3. ⚠️ startGateway forwards dockerUnreachable=true to handleFinalGatewayStartFailure

test/onboard.test.ts:357 (it("fast-fails gateway start before health polling when Docker is unreachable (#2347)", ...)) already proves items 1 and 2 by spawning a child Node process with a PATH-shimmed openshell, mocking p-retry, loading the compiled startGateway() from dist/lib/onboard.js, and asserting:

  • exit code 1
  • "Docker daemon is not running" in stderr
  • "colima start" macOS hint
  • NO "Waiting for gateway health" (proves item 1)
  • NO HEALTH POLL REACHED (proves item 2)

What that test does not assert is the absence of the "Cleaning up failed gateway state..." line that the non-dockerUnreachable branch prints. That absence is the only signal that distinguishes dockerUnreachable=true forwarding (item 3) from other exit-1 paths in startGatewayWithOptions. Item 3 alone is the genuine remaining gap.

What this PR does

Two files, comment-only changes:

File Change
src/lib/onboard/gateway-start-failure-integration.test.ts File header now cites test/onboard.test.ts as the primary caller-level coverage, with the assertion list copied for reviewers. The it.todo describe block is renamed (narrow remaining gap) and shrinks from 3 entries to 1, focused on the cleanup-messaging-absence signal.
test/e2e-scenario/migration/legacy-inventory.json Entry's notes field updated to cite test/onboard.test.ts and narrow the residual gap.

No production code changes; no test logic changes; only documentation and one it.todo consolidation.

Knock-on effect: re-scope #5113

Issue #5113 (Extract attemptGatewayStart helper from startGatewayWithOptions for caller-level testability) was filed when I thought the gap was 3 contracts. With items 1–2 already covered, the issue's scope can shrink to either:

  • A 5-line addition to the existing test/onboard.test.ts:357 test asserting !result.stderr.includes("Cleaning up failed gateway state"), or
  • The seam refactor as originally described, only if a future PR has a stronger reason to extract the helper.

I'll add a comment on #5113 with this update.

Refs

Summary by CodeRabbit

  • Tests

    • Updated test documentation to clarify Docker-unreachable contract coverage and narrow remaining test gaps.
  • Chores

    • Updated internal migration inventory notes regarding test coverage status.

Note: This release contains no user-facing changes; updates are internal to testing and documentation.

…narrow follow-up scope

Address PR Review Advisor's worth-checking finding 'Coverage-gap notes
understate existing caller-level regression coverage'. The advisor
correctly pointed out that test/onboard.test.ts already contains a
process-level regression test for #2347:

  it("fast-fails gateway start before health polling when Docker is
     unreachable (#2347)", ...)

That test spawns a child Node process with a PATH-shimmed openshell,
mocks p-retry, loads the compiled startGateway from dist/lib/onboard.js,
and asserts exit 1, "Docker daemon is not running", "colima start",
absence of "Waiting for gateway health", and absence of HEALTH POLL
REACHED (i.e. no status/gateway-info probes). It supersedes two of the
three it.todo placeholders this PR previously documented as caller-level
gaps.

Updates:

  - File header in gateway-start-failure-integration.test.ts now cites
    test/onboard.test.ts as the primary caller-level coverage source
    and explains exactly which contract remains un-asserted.

  - The 'call-site contracts (caller-level coverage gap)' describe
    block is renamed to 'call-site contracts (narrow remaining gap)'
    and shrinks from three it.todo entries to one. Items 1 and 2 (no
    health-poll loop entry, no status/gateway-info probes) are removed
    because they are already proven by test/onboard.test.ts. Item 3 is
    rewritten to focus on the actual remaining gap: distinguishing
    dockerUnreachable=true from other exit-1 paths via the absence of
    the 'Cleaning up failed gateway state...' line that the alternative
    branch prints.

  - Inventory entry's notes updated similarly: cites test/onboard.test.ts
    as existing caller-level coverage and narrows the residual gap.

Issue #5113 should be re-scoped down (or potentially closed) given
the much smaller remaining gap; that update is a separate action.

Verified locally: 18 passing tests + 1 todo (was 18 + 3); biome clean.

Refs: #2347, #4355, #5113
@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: ac8f15d7-a480-4738-8ebd-c7e2e77711c4

📥 Commits

Reviewing files that changed from the base of the PR and between 82499f8 and d63be5d.

📒 Files selected for processing (2)
  • src/lib/onboard/gateway-start-failure-integration.test.ts
  • test/e2e-scenario/migration/legacy-inventory.json

📝 Walkthrough

Walkthrough

Test documentation in gateway-start-failure-integration.test.ts is expanded to clarify that process-level docker-unreachable fast-fail behavior is already covered by test/onboard.test.ts, narrowing the remaining call-site assertion gap to a single focused it.todo for dockerUnreachable=true forwarding. Legacy inventory notes are updated to reflect this clarified state.

Changes

Test coverage documentation clarification

Layer / File(s) Summary
Document coverage gap narrowing
src/lib/onboard/gateway-start-failure-integration.test.ts, test/e2e-scenario/migration/legacy-inventory.json
File-level documentation (lines 22–43) clarifies that test/onboard.test.ts proves process-level docker-unreachable fast-fail behavior. The call-site contracts section (lines 324–341) replaces multiple docker-unreachable it.todo placeholders with a single focused todo for dockerUnreachable=true forwarding distinguished by cleanup log absence. Inventory notes are updated to match the narrowed gap and defer retirement pending follow-up work (#5113).

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related issues

  • NVIDIA/NemoClaw#5113: Targets the same dockerUnreachable flag propagation to handleFinalGatewayStartFailure as the remaining coverage gap documented by this PR.

Possibly related PRs

  • NVIDIA/NemoClaw#5109: The primary PR that added process-level Vitest helper coverage for docker-unreachable behavior; this PR refines the remaining call-site assertion gap and updates corresponding inventory notes.

Suggested labels

chore, area: e2e

Suggested reviewers

  • cv

Poem

🐰 Notes grow clear as morning dew,
Coverage gaps now narrowed true,
Docker-unreachable finds its place,
One todo left to chase!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main change: correcting coverage citations in gateway-start-failure-integration tests by reflecting accurate caller-level coverage from test/onboard.test.ts.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch test/correct-gateway-start-failure-coverage-citation

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: None
Optional E2E: None

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • None. No E2E is recommended: this PR changes only test comments/todo documentation and E2E migration inventory metadata. It does not modify installer/onboarding runtime code, sandbox lifecycle behavior, credentials, network policy, inference routing, deployment, workflows, or real assistant user flows.

Optional E2E

  • None.

New E2E recommendations

  • None.

@github-actions

Copy link
Copy Markdown
Contributor

E2E Scenario Advisor Recommendation

Required scenario E2E: None
Optional scenario E2E: None

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required scenario E2E

  • None. No Vitest scenario E2E run is needed. The only test/e2e-scenario change updates migration inventory notes/deletion-gate documentation for a legacy script and does not modify live scenario registry metadata, runtime support, manifests, expected-state metadata, fixtures, live tests, or the e2e-vitest-scenarios workflow. The other changed file is a non-scenario integration test outside test/e2e-scenario.

Optional scenario E2E

  • None.

Relevant changed files

  • test/e2e-scenario/migration/legacy-inventory.json

@github-actions

Copy link
Copy Markdown
Contributor

PR Review Advisor

Findings: 1 needs attention, 1 worth checking, 0 nice ideas
Top item: Keep or prove the pRetry AbortError/no-retry coverage gap

Review findings

🛠️ Needs attention

  • The removed todo still has an unproven pRetry AbortError/no-retry sub-contract (src/lib/onboard/gateway-start-failure-integration.test.ts:35): The updated comments and inventory note say the caller-level contracts are fully covered except for forwarding dockerUnreachable=true to handleFinalGatewayStartFailure. That is accurate for the no-health-polling and no status/gateway-info probe assertions, but not for the deleted todo's 'aborts via pRetry.AbortError' part. The existing process-level test uses a local p-retry mock that executes only one attempt and rethrows; it does not prove that the production path throws pRetry.AbortError specifically or suppresses retries versus throwing a normal error after one mocked attempt.
    • Recommendation: Either keep a separate todo/follow-up note for the pRetry AbortError/no-retry contract, or extend the process-level test/mocked p-retry seam so ordinary errors retry or call onFailedAttempt while AbortError does not, then assert the docker-unreachable path performs exactly one gateway-start attempt and does not invoke onFailedAttempt.
    • Evidence: test/onboard.test.ts:357 asserts exit 1, Docker guidance, no 'Waiting for gateway health', and no 'HEALTH POLL REACHED'. Its p-retry mock catches non-AbortError, may call onFailedAttempt, and rethrows with retriesLeft 0; it never retries ordinary errors, so it cannot distinguish AbortError retry suppression. The PR removes the todo 'startGateway aborts via pRetry.AbortError without entering health-poll loop' and replaces it with only the dockerUnreachable=true forwarding todo.

🔎 Worth checking

  • Source-of-truth review needed: Docker-unreachable gateway-start recovery coverage documentation and migration-inventory retirement notes: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: src/lib/onboard.ts sets dockerUnreachable=true and throws new pRetry.AbortError before 'Waiting for gateway health...'; the existing process test does not distinguish AbortError from a normal one-attempt failure under its p-retry mock.

🌱 Nice ideas

  • None.
Consider writing more tests for
  • **Mocked behavioral coverage** — Extend the p-retry mock so ordinary errors retry or call onFailedAttempt while AbortError does not, then assert Docker-unreachable gateway start invokes the fake openshell gateway start exactly once.. The changed files are documentation/todo metadata, but the PR's core correctness claim depends on what an existing p-retry mock proves. That mock is sufficient for no health polling and no status/gateway-info probes, but not for proving AbortError retry suppression.
  • **Mocked behavioral coverage** — Assert the Docker-unreachable process-level test does not call p-retry onFailedAttempt, proving the path is treated as AbortError rather than a normal retryable failure.. The changed files are documentation/todo metadata, but the PR's core correctness claim depends on what an existing p-retry mock proves. That mock is sufficient for no health polling and no status/gateway-info probes, but not for proving AbortError retry suppression.
  • **Mocked behavioral coverage** — Add the documented process-level assertion that stderr does not include 'Cleaning up failed gateway state...' for Docker-unreachable gateway-start output.. The changed files are documentation/todo metadata, but the PR's core correctness claim depends on what an existing p-retry mock proves. That mock is sufficient for no health polling and no status/gateway-info probes, but not for proving AbortError retry suppression.
  • **Acceptance clause:** 1. ⚠️ `startGateway` aborts via `pRetry.AbortError` without entering health-poll loop — add test evidence or identify existing coverage. The existing process-level test proves the path does not enter the health-poll loop, but its p-retry mock does not prove the thrown error is pRetry.AbortError or that retry behavior is aborted rather than merely single-attempt under the mock.
  • **Acceptance clause:** 3. ⚠️ `startGateway` forwards `dockerUnreachable=true` to `handleFinalGatewayStartFailure` — add test evidence or identify existing coverage. The PR keeps this as the one remaining todo and documents the distinguishing signal: absence of 'Cleaning up failed gateway state...' from the non-docker-unreachable branch. That direct assertion is not yet present.
  • **Acceptance clause:** File header now cites `test/onboard.test.ts` as the primary caller-level coverage, with the assertion list copied for reviewers. The `it.todo` describe block is renamed `(narrow remaining gap)` and shrinks from 3 entries to 1, focused on the cleanup-messaging-absence signal. — add test evidence or identify existing coverage. The header and todo block were updated as described, but shrinking from 3 todos to 1 loses the still-unproven pRetry AbortError/no-retry sub-contract.
  • **Acceptance clause:** Entry's `notes` field updated to cite `test/onboard.test.ts` and narrow the residual gap. — add test evidence or identify existing coverage. The inventory row now cites test/onboard.test.ts and the cleanup-message gap, but it also narrows the residual gap too far by not retaining the pRetry AbortError/no-retry proof gap.
  • **Docker-unreachable gateway-start recovery coverage documentation and migration-inventory retirement notes** — Existing helper/composition tests cover classifier/final-handler behavior, and test/onboard.test.ts covers exit guidance, no health polling, and no status/gateway-info probes. Missing direct evidence remains for pRetry AbortError/no-retry semantics and cleanup-message absence.. src/lib/onboard.ts sets dockerUnreachable=true and throws new pRetry.AbortError before 'Waiting for gateway health...'; the existing process test does not distinguish AbortError from a normal one-attempt failure under its p-retry mock.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@jyaunches

Copy link
Copy Markdown
Contributor Author

Superseded by #5119 (merged 2026-06-10T08:07:23Z), which fully retired test-docker-unreachable-gateway-start.sh and made the documentation inaccuracies this PR was correcting moot.

In particular, #5119:

  • Deleted the legacy bash script (test/e2e/test-docker-unreachable-gateway-start.sh)
  • Removed the inventory entry from legacy-inventory.json entirely
  • Removed the docker-unreachable-gateway-start-e2e job from regression-e2e.yaml
  • Stripped the it.todo describe block from gateway-start-failure-integration.test.ts (the section this PR was rewording)
  • Extracted the existing test/onboard.test.ts regression test into a new dedicated test/onboard-gateway-docker-unreachable.test.ts
  • Closed Extract attemptGatewayStart helper from startGatewayWithOptions for caller-level testability #5113

There's no remaining content in this PR worth landing.

Closing.

@jyaunches jyaunches closed this Jun 10, 2026
@jyaunches jyaunches deleted the test/correct-gateway-start-failure-coverage-citation branch June 10, 2026 08:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants