test(e2e): migrate test-openshell-version-pin.sh to free-standing Vitest live test#5107
Conversation
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
|
Warning Review limit reached
More reviews will be available in 42 minutes and 1 second. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (5)
📝 WalkthroughWalkthroughAdds a hermetic Vitest e2e test that verifies OpenShell version pinning, a CI job to run that test and collect artifacts, and updates the migration inventory to mark the legacy test as covered with convergence evidence. ChangesOpenShell version pinning test migration
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
E2E Advisor RecommendationRequired E2E: Dispatch hint: Full advisor summaryE2E Recommendation AdvisorBase: Required E2E
Optional E2E
New E2E recommendations
Dispatch hint
|
E2E Scenario Advisor RecommendationRequired scenario E2E: Dispatch required scenario E2E:
Full scenario advisor summaryE2E Scenario AdvisorBase: Required scenario E2E
Optional scenario E2E
Relevant changed files
|
PR Review AdvisorFindings: 0 needs attention, 0 worth checking, 0 nice ideas This is an automated advisory review. A human maintainer must make the final merge decision. |
Migrates the four [PASS] assertions of test/e2e/test-openshell-version-pin.sh into an installer-integration Vitest test. End-to-end harness stubs uname, openshell, openshell-gateway/sandbox helpers, gh, curl, tar, and strings so install-openshell.sh runs through download, checksum verify, extraction, and install-bins onto the writable active-bin dir. Assertions translated: - installer-exits-zero: result.status === 0 - download-log-contains-v0.0.44: pinned tag was requested - download-log-excludes-v0.0.45: too-new sticky version never re-fetched - replaced-openshell-reports-0.0.44: binary on disk overwritten with pinned build (and not still 0.0.45) Refs #4355, #3474
Per nemoclaw-e2e-legacy-migrate Phase 5.8: first dispatch passed (installer-integration job 80478221325 succeeded). Skill requires a second consecutive PASS to confirm convergence and rule out false-pass. Refs PR #5107
…ash guard" This reverts a95edc0. Pivoting per skill Phase 5 architectural-fix diagnostics: the legacy-inventory.json schema lock test (e2e-migration-inventory.test.ts) asserts targetVitestScenarios paths match ^test/e2e-scenario/live/.+\.test\.ts$. Adding the migrated test to test/install-openshell-version-check.test.ts violated that contract. Re-homing the test as a free-standing live test in a follow-up commit, with a discrete dispatch job in e2e-vitest-scenarios.yaml. Refs #4355, #3474, PR #5107
…-pin Migrates test/e2e/test-openshell-version-pin.sh (regression guard for #3474) to a free-standing Vitest live test under test/e2e-scenario/live/. The legacy script is a hermetic installer-script behavioral test: it runs scripts/install-openshell.sh under a stubbed PATH where the already-installed openshell reports a too-new version (0.0.45) and the downloaded archives produce a binary that reports the pinned 0.0.44. It does not exercise the registry-driven steady-state probe model, so it lives outside baseline.ts as a free-standing live test (per #5049). Four [PASS] assertions translated faithfully: - installer-exits-zero (result.status === 0) - download-log-contains-v0.0.44 - download-log-excludes-v0.0.45 - replaced-openshell-reports-0.0.44 CI: discrete openshell-version-pin-vitest job in e2e-vitest-scenarios.yaml runs on workflow_dispatch alongside the registry-driven matrix. Bash guard retained in regression-e2e.yaml; deletion is a follow-up PR with #4357 approval per the migration freeze policy. Refs #4355, #3474
targetVitestScenarios -> test/e2e-scenario/live/openshell-version-pin.test.ts deletionReady -> false (deletion of bash guard requires #4357 approval) frozenAtSha and convergenceEvidence will be set in a follow-up commit once the new openshell-version-pin-vitest CI job has two consecutive GREEN runs (skill Phase 5.8 false-pass guard). Refs #4355, #3474, PR #5107
Two consecutive GREEN dispatches of openshell-version-pin-vitest in the correct post-pivot home (test/e2e-scenario/live/openshell-version-pin.test.ts): - run 27252871283 / job 80480862915 - run 27253022534 / job 80481579156 Sets frozenAtSha and convergenceEvidence per skill Phase 6. Refs #4355, #3474, PR #5107
There was a problem hiding this comment.
🧹 Nitpick comments (1)
test/e2e-scenario/live/openshell-version-pin.test.ts (1)
29-284: ⚡ Quick winConsider extracting stub-creation logic into helper functions.
The test body is ~255 lines, violating the "keep function complexity low" guideline. While the sequential setup is straightforward, extracting stub-creation logic (e.g.,
createFakeUname(),createFakeGh(),createFakeCurl(), etc.) would improve readability and maintainability.♻️ Example refactor for `uname` stub
+function createFakeUname(binDir: string): void { + writeExecutable( + path.join(binDir, "uname"), + `#!/usr/bin/env bash +if [ "\${1:-}" = "-m" ]; then echo "x86_64"; else echo "Linux"; fi`, + ); +} + test( "openshell-version-pin: replaces sticky too-new openshell with pinned 0.0.44 end-to-end", async ({ artifacts }) => { // ... - writeExecutable( - path.join(fakeBin, "uname"), - `#!/usr/bin/env bash -if [ "\${1:-}" = "-m" ]; then echo "x86_64"; else echo "Linux"; fi`, - ); + createFakeUname(fakeBin);Apply the same pattern for
gh,curl,tar,strings, and the OpenShell binaries.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e-scenario/live/openshell-version-pin.test.ts` around lines 29 - 284, The test function is too large and repeats stub-creation code; extract each stub setup into small helpers (e.g., createFakeUname, createFakeGh, createFakeCurl, createFakeTar, createFakeStrings, createFakeOpenshell, createFakeHelperBinaries) that accept the tmp/fakeBin and downloadLog paths and call the existing writeExecutable helper, then replace the inline blocks in the test with calls to those helpers and keep the spawnSync/assertion logic unchanged; reference the test name string "openshell-version-pin: replaces sticky too-new openshell with pinned 0.0.44 end-to-end", the INSTALL_SCRIPT invocation and writeExecutable usages to locate where to extract the code.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@test/e2e-scenario/live/openshell-version-pin.test.ts`:
- Around line 29-284: The test function is too large and repeats stub-creation
code; extract each stub setup into small helpers (e.g., createFakeUname,
createFakeGh, createFakeCurl, createFakeTar, createFakeStrings,
createFakeOpenshell, createFakeHelperBinaries) that accept the tmp/fakeBin and
downloadLog paths and call the existing writeExecutable helper, then replace the
inline blocks in the test with calls to those helpers and keep the
spawnSync/assertion logic unchanged; reference the test name string
"openshell-version-pin: replaces sticky too-new openshell with pinned 0.0.44
end-to-end", the INSTALL_SCRIPT invocation and writeExecutable usages to locate
where to extract the code.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 8a3231c7-80ce-4fdb-8a84-0997c72c46d0
📒 Files selected for processing (3)
.github/workflows/e2e-vitest-scenarios.yamltest/e2e-scenario/live/openshell-version-pin.test.tstest/e2e-scenario/migration/legacy-inventory.json
CI's static-checks job auto-fixed test() arg layout (single arrow function with destructured first arg, no trailing arg). Mirrors the project's house style. No behavior change. Refs PR #5107
Addresses CodeRabbit's CHILL nitpick on the migrated live test: the test() body was ~280 lines of inline shell-script string literals. Extracted each stub into a small named helper at module scope: - createFakeUname(binDir) - createFakeStickyOpenshell(binDir, version) - createFakeHelperBinaries(binDir) - createFakeGh(binDir, downloadLog) - createFakeCurl(binDir, downloadLog) - createFakeTar(binDir, replacementVersion) - createFakeStrings(binDir) Plus a SHARED_DOWNLOAD_BASH_HELPERS constant deduplicating the write_asset / sha256_digest / write_checksum bash functions shared between the gh and curl stubs. The test() body shrinks from ~250 lines of stub setup to ~10 lines of helper calls. Behavior is byte-equivalent: same stubs, same env, same assertions. Local Vitest runs in both 'e2e-scenarios-live' and 'cli' projects pass. Refs PR #5107, #4355, #3474
Migration framing noteThis PR is a good place to make the updated framing concrete: the goal is not to create a second class of E2E, and it is not to preserve a parallel bash harness indefinitely. For this migration, the durable target is a Vitest E2E test. If the behavior being tested is installer or shell behavior, the Vitest test should invoke the real shell boundary ( So I would read the legacy bash script’s retention here as convergence/deletion sequencing only:
|
|
Cross-linking the proposed post-#5106 migration path from #5098: #5098 (comment) For this PR, the desired shape is exactly the one-E2E-system model: the durable replacement is a Vitest E2E, and any shell/installer behavior should be invoked as the real boundary from that test. The legacy bash script should be retained only as convergence sequencing until the PR proves equivalent coverage and deletes or retires the old harness. Also worth noting for review: |
cv
left a comment
There was a problem hiding this comment.
Conflicts resolved and reviewer/advisor feedback addressed. I verified the focused local tests, the automatic PR checks, and the required e2e-vitest-scenarios fan-out on run 27262317232.
Discrete free-standing job for the typed migration of test/e2e/test-onboard-resume.sh, mirroring the openshell-version-pin-vitest precedent from #5107. Uses bash install.sh to bootstrap openshell CLI and node, then runs the live Vitest scenario with NVIDIA_API_KEY exposed. Bash script remains scheduled under nightly-e2e `onboard-resume-e2e`; this job runs only on workflow_dispatch until typed coverage soaks per epic #5098 policy. Refs: #4348, #5098
…+ workflow job Ports the live block of test/openclaw-tui-chat-correlation.test.ts into the typed scenario framework as a free-standing live test (#5049 pattern), plus a discrete openclaw-tui-chat-correlation-vitest job in e2e-vitest-scenarios.yaml modeled on #5107's openshell-version-pin-vitest. Coverage: - Onboards a fresh cloud OpenClaw sandbox via OnboardingPhaseFixture.from with the cloud-openclaw profile (already in SUPPORTED_ONBOARDING). - Asserts the pinned 2026.5.27 OpenClaw version (host-side probe via openshell sandbox exec ... openclaw --version). Override via E2E_OPENCLAW_TUI_CORRELATION_PINNED_VERSION when bumping. - Idempotent in-sandbox gateway start on port 18789 via SandboxClient.execShell. - Uploads the websocket repro driver via SandboxClient.upload, runs it via execShell with the gateway auth token read from /sandbox/.openclaw/openclaw.json. - Asserts the #2603 + #3145 contract: no empty finals for submitted runs, no uncorrelated replies, no missing/duplicate replies, no missing/duplicate user turns. Preserves the looksLikeEventCaptureFailure retry-once guard (OpenClaw-side observability race; remove when the runtime exposes a deterministic chat readiness ack). Inventory: bridge-probe entry for test/e2e/test-openclaw-tui-chat-correlation.sh now references this file as its bridge probe (satisfies the inventory invariant that bridge-probe entries have non-empty bridgeProbes pointing at real paths). Bash workflow (nightly-e2e.yaml openclaw-tui-chat-correlation-e2e job) remains untouched per the migration freeze policy; deletion is a follow-up PR with #4357 approval after typed scenario soaks. Refs: #4347 #2603 #3145 #5098 #5049 #5107
Migration:
test-openshell-version-pin.sh→ free-standing Vitest live testRefs: #4355 (owner issue), #3474 (regression target), #4990 (E2E architecture contract)
What landed in this PR
A free-standing Vitest live test at
test/e2e-scenario/live/openshell-version-pin.test.tsplus a discreteopenshell-version-pin-vitestjob in.github/workflows/e2e-vitest-scenarios.yaml, modeled on #5049's free-standing pattern. The legacy bash guardtest/e2e/test-openshell-version-pin.shremains in place; deletion is a follow-up PR with #4357 approval per the migration freeze policy.Architecture notes (course-correction during the loop)
The first pass of this migration tried to land the test in
test/install-openshell-version-check.test.tsunder the existinginstaller-integrationVitest project. It even passed CI twice there (commita95edc02e, runs 27251976992 and 27252048411). The argument was that this is a hermetic installer-script unit test, not a scenario, and sister tests already live in that file.That argument lost to an explicit project contract:
test/e2e-scenario/framework-tests/e2e-migration-inventory.test.tsenforces that every entry intargetVitestScenariosfor acoveredlegacy script must match^test/e2e-scenario/live/.+\.test\.ts$. The schema is the binding architectural decision; the project has already chosentest/e2e-scenario/live/as the canonical home for migrated tests, regardless of whether they exercise the scenario framework's primitives. The pivot adopts that decision.Commits in this PR walk that history:
873aaa2ee— mark inventorybridge-probe(initial in-flight state)a95edc02e— first sketch under installer-integration (rejected by schema)4670cb556— empty commit (verification dispatch trigger from the rejected approach)9215a4f1f— revert (2)920876546— final sketch: free-standing live test + discrete CI job3d9c01931— inventory markcoveredpointing at the new live testShape
ubuntu-latestfrom(scenario.environment, env) → from(scenario.expectedStateId, instance)openshell-version-pin-vitestine2e-vitest-scenarios.yamlAssertion chain (mapped from the bash script's four
[PASS]checks)installer-exits-zero—result.status === 0(happy path completes; no "above the maximum" hard-fail).download-log-contains-v0.0.44— pinned release tag was requested.download-log-excludes-v0.0.45— too-new sticky version is never re-fetched.replaced-openshell-reports-0.0.44— binary on disk in active install dir overwritten with pinned build.Failing-test-first contract
scripts/install-openshell.sh:272-275implements the v0.0.44 reinstall path; sister tests intest/install-openshell-version-check.test.tsalready cover the entry-point invariants). This migration's residual coverage is the happy-path completion, exercised via stubbed PATH binaries.openshell-version-pin-vitestjob before convergence is recorded in the inventory entry'sfrozenAtSha/convergenceEvidencefields.vitest run --project e2e-scenarios-live test/e2e-scenario/live/openshell-version-pin.test.tspasses; inventory schema-lock test passes.Suite separation honored
regression-e2e.yaml(bash) untouched. Legacy script remains in place; deletion is a follow-up PR with Phase 11: Final Audit Reconciliation (E2E audit-coverage) #4357 approval.e2e-vitest-scenarios.yamlgains a discrete free-standing job; matrix path unchanged.vitest.config.ts:e2e-scenarios-liveproject picks up the new test file via thetest/e2e-scenario/live/**/*.test.tsglob with no project-list edit.Inventory delta
Next steps after this PR merges
openshell-version-pin-vitest(one is happening on this PR; the second is enforced post-merge by re-dispatch onmain).#4357approval to deletetest/e2e/test-openshell-version-pin.shfromregression-e2e.yamland the bash file itself, transitioning the inventory entry fromcovered→retired.Convergence
3d9c01931.openshell-version-pin-e2esucceeded (runs 26720769662, 26718460056). Bash guard untouched.3d9c01931f598cf2450bd88028cb3b44bcf367b0. Inventory entry now recordsfrozenAtShaandconvergenceEvidence.Assertions resolved (in order)
installer-exits-zero—result.status === 0(commit920876546)download-log-contains-v0.0.44—expect(downloads).toContain("v0.0.44")(commit920876546)download-log-excludes-v0.0.45—expect(downloads).not.toContain("v0.0.45")(commit920876546)replaced-openshell-reports-0.0.44— re-run$fakeBin/openshell --version; assert0.0.44and not0.0.45(commit920876546)All four legacy
[PASS]assertions translated faithfully. No deferred assertions.Note for reviewers
First-dispatch passed (in both pre- and post-pivot homes). This is the duplicative case the migration skill anticipates:
install-openshell.sh:272-275already lands the v0.0.44 reinstall path, and sister tests already covered the entry point. The legacy bash guard's value-add was the happy-path completion through download + extraction + binary replacement, which is now expressed as a Vitest live test. Please sanity-check the assertion list above against the four[PASS]lines intest/e2e/test-openshell-version-pin.sh— if equivalence is confirmed, the migration is complete and the bash guard can be retired in a follow-up PR with#4357approval.Summary by CodeRabbit