Skip to content

Re-poll CI state at delivery to skip superseded failures#195

Merged
subsetpark merged 4 commits intomainfrom
zax--ci-freshness-gate
Apr 17, 2026
Merged

Re-poll CI state at delivery to skip superseded failures#195
subsetpark merged 4 commits intomainfrom
zax--ci-freshness-gate

Conversation

@subsetpark
Copy link
Copy Markdown
Collaborator

@subsetpark subsetpark commented Apr 17, 2026

Summary

  • When a Ci Respond action finally dispatches, the failure that triggered it may have been superseded by a newer run. The orchestrator snapshot is only refreshed once per poll cycle (30s), and the semaphore wait can push actual delivery minutes later — so we wake agents for failures that have already cleared.
  • The runner now re-polls GitHub just before the delivery slot (same placement as the existing Review_comments prefetch). If the fetch fails or no current failures exist, we log and short-circuit to Skip_empty; otherwise we write the fresh list into orchestrator state via set_ci_checks so respond_delivery reads a current agent.ci_checks.
  • The Ci-specific is_empty path moves out of respond_delivery entirely — freshness is the caller's concern (it requires I/O), the pure decision function just builds the payload from current state.

Test plan

  • dune build clean
  • dune runtest — existing RD-1/2a/2c/3/4/5/6/7 still pass; RD-2b/RD-8/RD-9 removed (tested behavior that moved out of the pure function or a reverted parameter)
  • dune fmt clean
  • Observe on a live CI-failing PR: confirm "Fetching fresh CI state from GitHub" log, then either delivery or "Fresh CI state shows no failures — skipping CI delivery"

🤖 Generated with Claude Code


Summary by cubic

Re-polls GitHub for CI right before delivery and skips superseded failures with precise skip reasons to avoid stale alerts and wasted Claude slots. Fresh CI checks are written under the busy guard so respond_delivery builds from current state.

  • Bug Fixes

    • Re-poll PR state before CI delivery; if no PR number, fetch fails, or no failures remain, log the reason and short-circuit to Skip_empty before taking a Claude slot.
    • Persist fresh ci_checks via Orchestrator.set_ci_checks inside the busy guard (before the agent re-read) to avoid races with the poller.
  • Refactors

    • respond_delivery reads CI checks from agent and keeps a belt-and-suspenders guard using Ci_check.is_failure; the caller owns freshness and the set_ci_checks write. Simplified the CI skip fast path by removing a redundant is_ci guard.
    • Unified GitHub prefetch for Review comments and CI; improved skip logs to distinguish fetch failure vs no current failures vs no PR number.

Written for commit db3a9de. Summary will update on new commits.

By the time a Ci Respond action is dispatched, the failure that triggered
it may already have been superseded by a newer run — the agent's
orchestrator snapshot is only refreshed once per poll cycle (30s), and
the semaphore wait can push actual delivery minutes later. Delivering a
stale failure wakes the agent for a problem that's already fixed.

The runner now re-polls GitHub just before the delivery slot: if the
fetch fails or the fresh state shows no current failures, it logs and
short-circuits to Skip_empty; otherwise it writes the fresh list into
orchestrator state so respond_delivery reads current agent.ci_checks.
The Ci-specific is_empty branch moves out of respond_delivery entirely.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vercel
Copy link
Copy Markdown

vercel bot commented Apr 17, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
onton Ready Ready Preview, Comment Apr 17, 2026 8:33pm

Request Review

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 17, 2026

Warning

Rate limit exceeded

@subsetpark has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 7 minutes and 4 seconds before requesting another review.

Your PR has hit rate limit. Contact your admin to purchase credits for running PR reviews or try again in 7 minutes and 4 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 019c530c-b9b6-4c9b-a7cc-fdb3d089a928

📥 Commits

Reviewing files that changed from the base of the PR and between b75df6e and db3a9de.

📒 Files selected for processing (4)
  • bin/main.ml
  • lib/patch_decision.ml
  • lib/patch_decision.mli
  • test/test_patch_decision.ml
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch zax--ci-freshness-gate

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 4 files

Copy link
Copy Markdown

@flowglad-review-service flowglad-review-service bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: 4 comments

Comment thread bin/main.ml
Comment thread bin/main.ml Outdated
Comment thread lib/patch_decision.ml
Comment thread test/test_patch_decision.ml
@flowglad-review-service
Copy link
Copy Markdown

Review Summary

The PR correctly identifies and addresses the staleness problem for Ci deliveries by adding a GitHub re-poll before the claude slot. The design split (freshness gate in the runner, payload building in the pure function) is sound. However there are two correctness issues and one quality concern:

  1. set_ci_checks fires before with_busy_guard (must-fix, bin/main.ml): The orchestrator write happens outside the busy-guard scope, so in a stale/cancelled delivery the write is spurious and races with the poller's concurrent set_ci_checks. Moving it inside with_busy_guard (after the agent re-read) fixes this.

  2. respond_delivery can emit Ci_payload { failed_checks = [] } (must-fix, lib/patch_decision.ml): Removing the is_empty guard for Ci makes the pure function unsafe in isolation. A belt-and-suspenders not (List.exists agent.ci_checks ~f:Ci_check.is_failure) check should be re-added in the is_empty arm, making the function correct regardless of whether the caller remembered to populate ci_checks. This also justifies reinstating test RD-2b.

  3. Fetch-failure skip reason is lost in the terminal log (should-fix, bin/main.ml): The Skip_empty log inside with_claude_slot says "nothing to deliver" even when the real cause is a transient GitHub API error, making debugging harder.
    | Severity | Count |
    |----------|-------|
    | Must-fix | 2 |
    | Should-fix | 1 |
    | Note | 1 |


4 comments posted · Model: claude-sonnet-4-6 (sonnet) · Tokens: 338840 in / 7590 out · Cache: 0 created / 60644 read · Context: 5 items · Review mode: agentic · Turns: 15 · Tool calls: 19 · Forced submit: yes · Tool mix: read_file=14, search_codebase=5

… guard

- Move the fresh `set_ci_checks` write from the `ci_skip_reason`
  binding to inside `with_busy_guard` (within `with_claude_slot`,
  before the agent re-read). The write now only lands while holding
  the busy slot, so it can't race with the poller's writes or mutate
  state on a delivery that `with_busy_guard` then observes as
  `Respond_stale`.
- Distinguish the CI skip reason in the `Skip_empty` log arm so a
  fetch-failure or "no current failures" short-circuit no longer
  emits the misleading "Skipped ci — nothing to deliver" line.
- Restore the belt-and-suspenders `is_empty` guard in
  `respond_delivery` for `Ci`: if `agent.ci_checks` has no failure
  conclusions, return `Skip_empty` instead of an empty `Ci_payload`.
  Uses `Ci_check.is_failure` (single source of truth).
- Re-add RD-2b to cover the pure-function guard.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@flowglad-review-service flowglad-review-service bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: 3 comments

Comment thread bin/main.ml Outdated
Comment thread bin/main.ml Outdated
Comment thread bin/main.ml Outdated
@flowglad-review-service
Copy link
Copy Markdown

Review Summary

The four must-fix issues from Turn 1 are now correctly addressed: set_ci_checks is placed inside with_busy_guard, the CI skip log distinguishes the reason, the belt-and-suspenders is_empty guard is restored in respond_delivery with documentation, and RD-2b is back in the test suite. The core correctness concern is resolved.

Two new should-fix observations on the newly introduced code:

  1. Silent pr_number = None branch misattributed as "fetch failed" (line 2687): when a CI Respond fires before the PR exists, fresh_pr_state = None without any log, then ci_skip_reason = Some "fetch failed" even though no HTTP fetch was attempted. The comment "Log already emitted by the fetch branch" is only true for the HTTP-error sub-case.

  2. Semaphore slot consumed for a predetermined skip (line 2723): when ci_skip_reason = Some _, with_claude_slot is still entered just to return Skip_empty, needlessly holding a Claude concurrency slot. The short-circuit should happen before the semaphore acquire.

One note about Respond_stale being unreachable on the ci-skip path — observable only in logs, and state cleanup is correct.

Severity Count
Must-fix 0
Should-fix 2
Note 1

3 comments posted · Model: claude-sonnet-4-6 (sonnet) · Tokens: 261669 in / 6796 out · Cache: 0 created / 60644 read · Context: 5 items · Review mode: agentic · Turns: 15 · Tool calls: 19 · Tool mix: read_file=14, search_codebase=5

- Extract [pr_number] to a let binding so the [fresh_pr_state] fetch
  and [ci_skip_reason] can both reference it.
- Emit "No PR number yet — skipping CI state fetch" in the no-PR arm
  of the fetch (previously silent), and encode a distinct reason
  ("no PR number" vs "fetch failed") in [ci_skip_reason] via a
  [(pr_number, fresh_pr_state)] match. The terminal skip log is now
  accurate for the no-PR case.
- Short-circuit the CI skip decision before [with_claude_slot]: when
  [ci_skip_reason = Some _] and [is_ci], the [with_busy_guard] lambda
  logs and yields [`Skip_empty] without entering the Claude slot,
  freeing that slot for another agent doing real work. Normal path
  (no skip, or non-CI) keeps all the work under [with_claude_slot].
- Collapses the stale-during-semaphore-wait window on the skip path:
  no wait happens at all, so the skip log can't be misattributed.
  Removes the now-redundant [ci_skip_reason] check in the [Skip_empty]
  arm; inside the slot, [Skip_empty] only means a genuine empty
  delivery, so the generic log is accurate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@flowglad-review-service flowglad-review-service bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: 1 comment

Comment thread bin/main.ml Outdated
@flowglad-review-service
Copy link
Copy Markdown

Review Summary

All seven issues raised across the two previous review turns have been fully addressed in this diff:

  1. set_ci_checks write now inside with_busy_guard (Turn 1 must-fix) — The write to agent.ci_checks via Runtime.update_orchestrator now occurs inside with_busy_guard, inside with_claude_slot, and before the agent re-read, eliminating both the poller race and the stale-delivery corruption risk.

  2. Distinct skip-reason log messages (Turn 1 should-fix + Turn 2 should-fix) — "no PR number", "fetch failed", and "no current failures" are all distinguishable in logs. The no-PR-number path now emits its own diagnostic ("No PR number yet — skipping CI state fetch") before computing ci_skip_reason.

  3. Belt-and-suspenders guard restored in respond_delivery (Turn 1 must-fix) — is_empty for Ci now uses not (List.exists agent.ci_checks ~f:Ci_check.is_failure), preventing an empty Ci_payload from leaking through if a future caller skips the freshness hop.

  4. RD-2b test re-added (Turn 1 note) — The belt-and-suspenders path is covered by a unit test again.

  5. with_claude_slot no longer consumed on skip (Turn 2 should-fix) — The fast path short-circuits before with_claude_slot, preserving semaphore capacity for agents doing real work.

  6. Stale-during-skip window eliminated (Turn 2 note) — Because the fast path exits before any semaphore wait, the window where a stale agent could produce a misleading log is gone.

One cosmetic observation remains (new in this diff): the when is_ci guard on line 2728 is a dead condition since ci_skip_reason is structurally None when not is_ci.

Severity Count
Must-fix 0
Should-fix 0
Note 1

1 comment posted · Model: claude-sonnet-4-6 (sonnet) · Tokens: 362674 in / 4631 out · Cache: 0 created / 60644 read · Context: 5 items · Review mode: agentic · Turns: 15 · Tool calls: 16 · Tool mix: read_file=11, search_codebase=5

[ci_skip_reason] is [Some _] iff the CI freshness gate fired (it's
bound to [None] in the [else] branch when [not is_ci]), so the
[when is_ci] guard on [Some reason when is_ci] is always true when
that arm matches. Remove the guard and use [None ->] for the normal
path; the match is self-documenting on [ci_skip_reason] alone.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@flowglad-review-service
Copy link
Copy Markdown

Review Summary

All seven outstanding issues from the three previous review turns have been fully addressed in this diff:

  1. set_ci_checks write ordering (Turn 1, must-fix): Now called inside with_busy_guard / with_claude_slot, before the agent re-read, eliminating both the poller race and the stale-delivery state corruption path.

  2. Misleading skip log (Turn 1, should-fix): The skip path emits "Skipped ci delivery — %s" reason with a concrete, accurate reason string rather than the generic "nothing to deliver".

  3. Belt-and-suspenders is_empty guard (Turn 1, must-fix): Restored in respond_delivery using Ci_check.is_failure, ensuring the pure function never emits an empty Ci_payload even if a future caller omits the pre-fetch.

  4. RD-2b test (Turn 1, note): Re-added with an updated comment explaining the belt-and-suspenders role.

  5. No log for pr_number = None (Turn 2, should-fix): Explicit log "No PR number yet — skipping CI state fetch" is now emitted, and the reason string is "no PR number" (not the misleading "fetch failed").

  6. with_claude_slot consumed on skip (Turn 2, should-fix): ci_skip_reason = Some _ short-circuits before entering with_claude_slot, preserving concurrency capacity for real work.

  7. Dead when is_ci guard (Turn 3, note): The ci_skip_reason match has no spurious when is_ci guard; the condition is expressed structurally.

No new issues were found in this diff. The implementation is correct.

Severity Count
Must-fix 0
Should-fix 0
Note 0

0 comments posted · Model: claude-sonnet-4-6 (sonnet) · Tokens: 290692 in / 3381 out · Cache: 0 created / 48446 read · Context: 5 items · Review mode: agentic · Turns: 12 · Tool calls: 13 · Tool mix: read_file=9, search_codebase=4

@subsetpark subsetpark merged commit d7ad8f5 into main Apr 17, 2026
9 checks passed
@subsetpark subsetpark deleted the zax--ci-freshness-gate branch April 17, 2026 20:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant