fix(scripts,hooks): install Lefthook in agent VMs + cache-bust pre-push coverage#522
fix(scripts,hooks): install Lefthook in agent VMs + cache-bust pre-push coverage#522isuttell wants to merge 2 commits into
Conversation
…sh coverage
The most-repeated AP-98 orchestrator-friction entry ("Nth consecutive Cursor
handoff with red Validate") root-causes to a real bug, not stale cache:
`install-hooks.mjs` skipped install on generic `CI=true`. Unattended agent VMs
(Cursor/Codex) run non-interactively and `setup-worktree.mjs commandEnv()`
injects `CI=true`, so Lefthook never installed there — remote workers had no
`pre-push` gate and pushed red PRs that only CI caught.
- install-hooks.mjs: skip only on `GITHUB_ACTIONS` (the repo's real-CI marker,
matching deploy-pr-preview.mjs) or explicit `SKIP_LEFTHOOK=1`. Refactor to
pure `shouldSkipInstall(env)` + `installHooks({env,spawn,log})`; add
install-hooks.test.mjs (runs under `pnpm test:scripts`/verify).
- Pre-push coverage now runs cache-busted `test:coverage:strict`
(`turbo run test:coverage --force`) so a stale FULL-TURBO cache can't serve a
coverage false-green (AP-136). Fast cached `test:coverage` stays for iteration.
- docs/development.md Git Hooks section corrected (it claimed pre-push runs
knip+test:coverage; it runs the full verify + coverage) and documents the
GITHUB_ACTIONS/SKIP_LEFTHOOK install rule.
- CLAUDE.md: never edit the synced `ziw-*` skills here; route skill changes to
the AP-98 friction log for the skills agent.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Warning Review limit reached
More reviews will be available in 24 minutes and 33 seconds. Learn how PR review limits work. To continue reviewing without waiting, enable usage-based billing in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Plus Run ID: 📒 Files selected for processing (6)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
Local review caught a latent silent-no-install footgun: the hand-rolled
`import.meta.url === \`file://${process.argv[1]}\`` guard diverges when the
checkout path contains a space or URL-reserved char (import.meta.url
percent-encodes it; the interpolated string does not), so install-hooks would
import with zero output and never install — exactly the failure this PR fixes.
Switch to the repo's established idiom (`pathToFileURL(process.argv[1]).href`,
as in deploy.mjs and 6 sibling scripts). Add a test for the already-forced
first-attempt failure branch.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
There was a problem hiding this comment.
First-pass review
Risk: simple
Decision: approve
Ticket triage
- Intended change: Stop skipping Lefthook install in unattended agent VMs (
CI=truewithoutGITHUB_ACTIONS), and cache-bust the pre-push coverage gate so stale turbo output cannot false-green a push. - Scope match: Yes. Root cause (
setup-worktree.mjs→commandEnv()setsCI=true; oldinstall-hooks.mjsskipped on genericCI) is addressed.test:coverage:strictcompletes the AP-136 follow-through on pre-push coverage.
Review findings
Blocking: none
Non-blocking:
import.meta.url === \file://${process.argv[1]}`is less robust than thepathToFileURLmain-module guard used inscripts/deploy.mjs`; fine for Linux agent VMs today.docs/adr/0025-biome-lefthook-vitest-for-code-quality.mdstill describes the old pre-push commands (orthogonal doc drift).- No open Linear implement ticket — this is friction-log-driven (AP-98 / AP-136); acceptable for this scope.
Merge checklist
| Check | Status |
|---|---|
| Ticket linked & understood | ✓ (AP-98 friction / AP-136 follow-up) |
| PR scope matches | ✓ |
| Checks green | ✓ Validate, Postgres smoke, CodeQL, secret scan |
| Tests/docs appropriate | ✓ install-hooks.test.mjs + docs/development.md |
| No blocking findings | ✓ |
| No high-risk areas | ✓ dev hooks / scripts only |
| Merge-safe | ✓ |
Summary
Focused, well-tested fix for a real agent-handoff failure mode. shouldSkipInstall keyed on GITHUB_ACTIONS matches existing repo CI detection; unit tests cover the agent-VM shape. test:coverage:strict correctly closes the stale-cache gap on pre-push without slowing normal test:coverage iteration.
Sent by Cursor Automation: First Pass PR Reviewer


Why
Mining the AP-98 orchestrator friction log (87 entries) for AP-99, the single most-repeated, most-expensive recurring failure is: remote Cursor workers hand off PRs with red CI despite the docs telling them to run
pnpm verifyfirst ("4th consecutive Cursor handoff with red Validate", PR #500; also #310, #422, AP-204/313/315).It root-causes to a real bug, not stale cache:
.cursor/environment.jsoninstall =pnpm setup:codex→setup-worktree.mjs, which runspnpm hooks:install.setup-worktree.mjs commandEnv()injectsCI=truefor non-TTY processes (every Cursor/Codex VM).install-hooks.mjsskipped install on genericCI === "true", so Lefthook never installed in agent VMs → remote workers had nopre-pushgate (pnpm verify+ coverage) → they pushed red, and CI Validate caught it only after the fact, costing ~1 round-trip + 1 CI run per occurrence, repeatedly.A second, related entry family (AP-128/130/136, "third consecutive slice bitten by turbo stale-cache"): the pre-push coverage gate ran plain
pnpm test:coverage, which turbo serves from cache (FULL TURBO), so a coverage-tipping diff could push a false green.What
scripts/install-hooks.mjs: skip install only inside GitHub Actions (GITHUB_ACTIONS, the repo's real-CI marker, matchingdeploy-pr-preview.mjs) or on explicitSKIP_LEFTHOOK=1. Refactored to pureshouldSkipInstall(env)+installHooks({env,spawn,log})so the decision is unit-testable. GenericCI=trueno longer suppresses install → agent VMs get the gate.scripts/install-hooks.test.mjs(new, runs underpnpm test:scripts/verify): asserts install whenCI=truebutGITHUB_ACTIONSunset, skip underGITHUB_ACTIONS/SKIP_LEFTHOOK, plus thelefthook install/--force/failure paths.test:coverage:strict(turbo run test:coverage --force) wired into thepre-pushcoverage hook so a stale cache can't serve a coverage false-green (AP-136). Fast cachedtest:coveragestays for iteration.docs/development.mdGit Hooks section corrected (it claimed pre-push runsknip+test:coverage; it runs the fullverify+ coverage) and now documents theGITHUB_ACTIONS/SKIP_LEFTHOOKinstall rule.CLAUDE.md: never edit the syncedziw-*skills in this repo (they're pulled fromzaks-io/skills, AP-298); route needed skill changes to the AP-98 friction log for the skills agent.Deferred (not in this PR)
Per owner direction, skill/process changes are not made here — they're recorded on AP-98 as
config-gapnotes for the skills agent (e.g. make "run fullpnpm verifybefore handoff" unmissable inremote-cursor-agent.md; serialize a refactor + coverage-test ticket sharing a function seam per the AP-231 orphaned-PR lesson; scope changes go in the Cursor session thread, never as top-level Linear comments; optional stuck-worker liveness timeout).Verification
pnpm verify✅ andpnpm test:coverage:strict✅ (exit 0, global threshold enforced on freshly recomputed artifacts) run locally — we now eat our own dog food.pnpm test:scripts✅ 231 tests incl. the new 8 install-hooks cases.CI=trueandGITHUB_ACTIONSunset,installHooks()returns{installed:true}and lefthook syncspre-commit, pre-push; withGITHUB_ACTIONS=trueit's a silent no-op. Thepreparelifecycle (pnpm install) still installs hooks.--forceproof: plainturbo run test:coverage→38 cached, 38 total >>> FULL TURBO;--force→cache bypass, force executing/0 cached.Limit: a real Cursor VM can't be fully E2E'd locally — the env-shape repro is the proof; the next remote handoff is the live confirmation.
🤖 Generated with Claude Code