fix(keepalive): ride forced-continuation chains instead of yielding on stop_hook_active #26
…n stop_hook_active

Claude Code sets stop_hook_active=true on EVERY Stop event that follows a hook-forced continuation: it means "this turn is part of a chain the hook started," not "the host is about to override you." The hook was treating it as the consecutive-block cap and yielding (host_cap) on the very next boundary after each force-continue, so interactive sessions died after exactly one keepalive turn (issues #19, #25: the force_continue → host_cap alternation in keepalive.log).

- Stop hook now force-continues through forced chains; the only stop conditions are the human disk markers (CONCLUDE, STOP) and the structural preconditions (no_run, inactive). host_cap is retired.
- RESPAWN_REQUESTED is written preemptively while blocking mid-chain (the host's 8-blocks-without-progress override fires silently, so the resume breadcrumb must already be on disk) and cleared on fresh user-driven boundaries. New keepalive.blocks counter records chain depth for post-mortems.
- The Claude host installer now pins CLAUDE_CODE_STOP_HOOK_BLOCK_CAP=5000 via .claude/settings.local.json env so the host cap is effectively lifted for overnight runs; uninstall removes it only if unmodified.
- Rules block, skill.md, README, and CLI copy updated to the corrected semantics; RFC 010 (respawn supervisor) drafted as the belt-and-braces follow-up issue #25 requests.
- Bump 0.0.10.

Fixes #19. Root-causes #25 (RFC 010 covers its supervisor ask).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
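The installer bullet in the commit message (pin the cap on install, remove it on uninstall only if unmodified) can be illustrated with a minimal sketch. The function names `merge_env` and `remove_env`, their signatures, and the choice to leave an existing operator value untouched on install are assumptions made for illustration; the project's real helpers are not shown on this page and may differ.

```python
import json

CAP_KEY = "CLAUDE_CODE_STOP_HOOK_BLOCK_CAP"
CAP_VALUE = "5000"


def merge_env(settings: dict, key: str = CAP_KEY, value: str = CAP_VALUE) -> dict:
    """Return a copy of settings with env[key] set (hypothetical helper).

    Assumption: an existing operator-chosen value is left alone, which also
    makes the merge idempotent.
    """
    merged = dict(settings)
    env = dict(merged.get("env", {}))
    env.setdefault(key, value)
    merged["env"] = env
    return merged


def remove_env(settings: dict, key: str = CAP_KEY, value: str = CAP_VALUE) -> dict:
    """Return a copy with env[key] removed only if it still holds our value."""
    cleaned = dict(settings)
    env = dict(cleaned.get("env", {}))
    if env.get(key) == value:
        del env[key]
    if env:
        cleaned["env"] = env
    else:
        cleaned.pop("env", None)  # do not leave an empty "env" object behind
    return cleaned


# Install twice, then uninstall; a customized value survives uninstall.
installed = merge_env({"permissions": {"allow": []}})
installed_twice = merge_env(installed)
uninstalled = remove_env(installed)
customized = remove_env({"env": {CAP_KEY: "200"}})

print(json.dumps(installed, indent=2))
```

Working on plain dictionaries keeps the merge logic separate from reading and writing `.claude/settings.local.json`, so the idempotence and "only if unmodified" rules can be checked without touching the filesystem.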
Caution: Review failed. The pull request is closed.
📝 Walkthrough
v0.0.10 delivers a critical fix to the keepalive Stop-hook semantics for Claude Code's forced-continuation chains, implements preemptive …
Changes
v0.0.10 Release: Keepalive Hook Fix & RFC 010 Proposal
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Root cause (issues #19, #25)
Every keepalive log in the bug reports shows the same alternation: `force_continue` → `host_cap`, one pair per session. The Stop hook treated `stop_hook_active: true` as "Claude Code's consecutive-block cap is about to override us" and yielded immediately.

That interpretation is wrong, verified against the Claude Code hooks guide:

- `stop_hook_active: true` is set on every Stop event that follows a hook-forced continuation. It only means "this turn is part of a chain the hook started."
- … `CLAUDE_CODE_STOP_HOOK_BLOCK_CAP` env var.

So Nightly surrendered on the second turn boundary of every session, exactly the ~47-minute early terminations the operator reported.
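The alternation in the logs follows directly from the misreading described above. The toy simulation below is a hypothetical reconstruction, not the project's code: `old_decision`, `corrected_decision`, and `simulate` are illustrative names, and the host is modelled only by the one rule stated in this section (every Stop event after a forced continuation carries `stop_hook_active: true`).

```python
def old_decision(stop_hook_active: bool) -> str:
    """Pre-fix reading: treat the flag as the host cap and yield."""
    if stop_hook_active:
        return "host_cap"
    return "force_continue"


def corrected_decision(stop_hook_active: bool) -> str:
    """Post-fix reading: the flag only marks a chain, so keep going."""
    return "force_continue"


def simulate(decide, max_boundaries: int = 6) -> list:
    """Replay turn boundaries and collect the reason logged at each one."""
    log = []
    stop_hook_active = False  # the first Stop event is user-driven
    for _ in range(max_boundaries):
        reason = decide(stop_hook_active)
        log.append(reason)
        if reason != "force_continue":
            break  # the hook yielded, so the session ends here
        # Host rule: the Stop event after a forced continuation is flagged.
        stop_hook_active = True
    return log


old_log = simulate(old_decision)
new_log = simulate(corrected_decision)
print(old_log)
print(len(new_log))
```

Under the old reading the session ends at the second boundary, which matches the single `force_continue` → `host_cap` pair per session reported in the bug logs.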
The fix
- `compute_stop_hook_decision` no longer yields on `stop_hook_active`. Decision order: `no_run` → `inactive` → `CONCLUDE` → `STOP` → force-continue. Human disk markers remain the only voluntary off-ramps; the `host_cap` reason code is retired.
- Mid-chain, the hook writes `RESPAWN_REQUESTED` before returning the block: if the host's without-progress override (or a crash) silently kills the session, the resume marker is already on disk for `nightly status` / the skill respawn path. Fresh user-driven boundaries clear the stale marker. New `keepalive.blocks` counter records chain depth for post-mortems.
- The Claude host installer merges `"env": {"CLAUDE_CODE_STOP_HOOK_BLOCK_CAP": "5000"}` into `.claude/settings.local.json` (new idempotent `merge_settings_env` / `remove_settings_env` helpers; uninstall removes the key only if the operator hasn't customized it).
- Rules block (`rules.py` → regenerated AGENTS.md / CLAUDE.md), `skill.md`, README, and CLI echoes updated to the corrected semantics ("9-consecutive-block" story removed everywhere).
- RFC 010 (`.planning/rfcs/010-respawn-supervisor.md`): the detached respawn supervisor the operator asked for in #25 (Nightly bug report, run 2026-05-21T02-14-07Z @ 2026-06-10T02-09-48Z), now belt-and-braces for host-override/crash rather than the primary keep-alive.

Testing
`make check` clean: ruff, pyrefly, 1032 passed. New/reworked coverage: chain blocks force-continue + write the marker + bump `keepalive.blocks`; fresh boundaries reset both; STOP/CONCLUDE/no-run/inactive win even mid-chain; settings env merge/remove (incl. preserving operator overrides).

Fixes #19. Root-causes #25; the supervisor half of #25 is scoped in RFC 010.
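The decision order and the behaviours listed under Testing can be sketched as follows. This is an illustration under stated assumptions, not the actual `compute_stop_hook_decision`: the function name `decide_stop`, its signature, the `(block, reason)` return shape, the reason strings for the disk markers, and storing `keepalive.blocks` as a small text file in the run directory are all hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Optional, Tuple


def decide_stop(run_dir: Optional[Path], active: bool,
                stop_hook_active: bool) -> Tuple[bool, str]:
    """Return (block, reason) following no_run -> inactive -> CONCLUDE -> STOP."""
    if run_dir is None:
        return (False, "no_run")
    if not active:
        return (False, "inactive")
    # Human disk markers are the only voluntary off-ramps, even mid-chain.
    for name in ("CONCLUDE", "STOP"):
        if (run_dir / name).exists():
            return (False, name)

    marker = run_dir / "RESPAWN_REQUESTED"
    counter = run_dir / "keepalive.blocks"  # assumed on-disk representation
    if stop_hook_active:
        # Mid-chain: bump the depth and leave the resume breadcrumb on disk
        # before blocking, in case the host silently overrides the hook.
        depth = int(counter.read_text()) + 1 if counter.exists() else 1
        counter.write_text(str(depth))
        marker.touch()
    else:
        # Fresh user-driven boundary: reset the counter, clear a stale marker.
        counter.write_text("0")
        marker.unlink(missing_ok=True)
    return (True, "force_continue")


with tempfile.TemporaryDirectory() as tmp:
    run = Path(tmp)
    first = decide_stop(run, True, False)   # fresh boundary
    second = decide_stop(run, True, True)   # chain depth 1
    third = decide_stop(run, True, True)    # chain depth 2
    marker_written = (run / "RESPAWN_REQUESTED").exists()
    depth_on_disk = (run / "keepalive.blocks").read_text()
    (run / "STOP").touch()                  # operator asks to stop
    stopped = decide_stop(run, True, True)
    no_run = decide_stop(None, True, True)
```

Checking the disk markers before any chain bookkeeping is what lets STOP and CONCLUDE win mid-chain without leaving a fresh breadcrumb behind.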
🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes v0.0.10
Bug Fixes
New Features
`nightly status` and `nightly session start` commands with respawn/resume signal visibility.
Documentation