Raise default budgets to high-but-bounded; release 0.13.1 #7
The previous defaults were too conservative for the long-running workloads autosentry exists to babysit (ML training, multi-hour data pipelines): a healer that needed ~15 attempts to land a kept fix would hit `max_restarts=10` and give up, while a stuck-loop healer could still spam the API hourly under `max_attempts_per_detector_per_hour=5`. Both knobs are now wide enough that productive workloads never feel capped, narrow enough that broken healing eventually stops.

| knob | was | now |
|---|---|---|
| process.restart_policy.max_restarts | 10 | 50 |
| healing.budget.max_attempts_per_detector_per_hour | 5 | 60 |
| healing.budget.max_wall_seconds_per_incident | 600 | 7200 |

Also reframes `max_restarts`: it's not a *budget*. `state.restarts` zeros on every kept fix, so this only trips when the healer can't land a kept fix for N restarts in a row. Documented as such.

`healing.escalate_to_claude_after` decouples from `max_restarts // 5` to a literal `2`. Previously, raising max_restarts pushed Claude escalation later (exactly backwards: rules monopolized more attempts when the budget grew). Now rules get two cheap shots regardless of cap, and Claude takes over.

New helpers in `autosentry.state`:
- `budget_exhausted(restarts, max_restarts)`: single source of truth for the kill-switch check; honors the `0 = unlimited` sentinel.
- `format_budget(max_restarts)`: renders `∞` for unlimited, integer otherwise. Used by Monitor, status, TUI, doctor, and the incident report so every surface displays consistently.

Existing configs keep their explicit values; only fresh `init` and `init --upgrade` pick up the new defaults (upgrade prompts per-key).

326 tests passing (3 new helper tests, 2 retargeted to literal-2).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
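For readers skimming the commit message, the two helpers might look roughly like the sketch below. The function names and the `0 = unlimited` sentinel come from the message above; the bodies, the `>=` comparison, and the type hints are assumptions for illustration, not the actual code in `autosentry.state`.

```python
def budget_exhausted(restarts: int, max_restarts: int) -> bool:
    """Return True once the consecutive-restart kill-switch should trip."""
    # Sentinel from the PR: a cap of 0 (or less) means "unlimited",
    # so the kill-switch never trips.
    if max_restarts <= 0:
        return False
    # Assumption: the switch trips when the counter reaches the cap.
    return restarts >= max_restarts


def format_budget(max_restarts: int) -> str:
    """Render the cap for display: the infinity sign for unlimited, else the integer."""
    return "∞" if max_restarts <= 0 else str(max_restarts)
```

Routing every surface (Monitor, status, TUI, doctor, incident report) through one pair of functions like this is what keeps the sentinel handling from drifting between call sites.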
📝 Walkthrough

This PR refactors restart and healing budget handling across autosentry by introducing centralized helper functions, increasing default budget limits, and decoupling escalation thresholds.

Changes

Restart/Healing Budget Refactoring
🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@CHANGELOG.md`:
- Around line 9-10: The changelog added a new release heading "## [0.13.1] —
2026-06-04" but the reference-link footer still maps "Unreleased" to "v0.8.4"
and doesn't define links for "0.13.1" and "0.13.0", so update the CHANGELOG.md
reference-link footer: change the Unreleased reference to point to the correct
comparison (or remove if not needed), and add explicit reference-link entries
for [0.13.1] and [0.13.0] with their corresponding tags/compare URLs (matching
the repo's tag naming like v0.13.1 and v0.13.0) so the headings
"0.13.1"/"0.13.0" resolve properly.
In `@src/autosentry/monitor.py`:
- Around line 141-143: Update the outdated comments that claim an "unlimited"
restart budget to reflect the current capped default of 50; locate the comment
blocks around the monitor flow mentioning max_restarts (the one referencing
"Decoupled from ``max_restarts``" and the later block around lines ~378-380) and
change their wording to indicate the restart budget is capped at 50 and that
rules get two cheap shots before the kill-switch at 50 restarts applies,
preserving the original intent about transient retries but matching the actual
capped default semantics.
In `@src/autosentry/state.py`:
- Around line 175-176: Update the stale docstring in the budget_exhausted
function/doc (budget_exhausted in src/autosentry/state.py) to stop claiming that
"max_restarts <= 0 is the sentinel for 'unlimited' — the default"; instead state
that max_restarts <= 0 remains the sentinel for unlimited but the current
configured default restart cap is 50, so the doc should clarify that the default
behavior is a 50 restart cap unless explicitly set to <= 0 for unlimited.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 76a05c29-bac9-432f-87c4-6eb519c04d7a
⛔ Files ignored due to path filters (1)
- `uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (14)
- CHANGELOG.md
- pyproject.toml
- src/autosentry/cli/commands/doctor.py
- src/autosentry/cli/commands/status.py
- src/autosentry/config.py
- src/autosentry/incidents/report.py
- src/autosentry/monitor.py
- src/autosentry/state.py
- src/autosentry/templates/autosentry.yaml.tmpl
- src/autosentry/tui.py
- tests/test_cli.py
- tests/test_pipeline.py
- tests/test_reset.py
- tests/test_restart_budget.py
CHANGELOG.md, around lines 9-10:

    ## [0.13.1] — 2026-06-04

Update changelog reference links for the new release.
The new 0.13.1 section is added, but the reference-link footer still points Unreleased to v0.8.4 and doesn’t define links for 0.13.1/0.13.0, so release headings won’t link correctly.
📝 Proposed fix
-[Unreleased]: https://github.com/ulmentflam/autosentry/compare/v0.8.4...HEAD
+[Unreleased]: https://github.com/ulmentflam/autosentry/compare/v0.13.1...HEAD
+[0.13.1]: https://github.com/ulmentflam/autosentry/releases/tag/v0.13.1
+[0.13.0]: https://github.com/ulmentflam/autosentry/releases/tag/v0.13.0
src/autosentry/monitor.py, around lines 141-143:

    # Decoupled from ``max_restarts`` so the unlimited-budget
    # default doesn't push Claude escalation off to infinity —
    # rules get two cheap shots at known transients, then the

Fix outdated “unlimited default” comments in monitor flow.
Line 141 and Line 378 describe uncapped/unlimited restart budget as the default, but the default is now capped (50). Please update both comments to match current kill-switch semantics.
Also applies to: 378-380
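To make the escalation change concrete, here is a small sketch comparing the old derived threshold with the new literal one. The formula `max_restarts // 5` and the literal `2` come from the PR description; the function and constant names are hypothetical and do not appear in the codebase.

```python
def old_escalation_threshold(max_restarts: int) -> int:
    # Previous behavior per the PR description: derived from the restart cap.
    return max_restarts // 5


# New behavior per the PR description: a literal, independent of the cap.
NEW_ESCALATION_THRESHOLD = 2

for cap in (10, 50):
    print(f"cap={cap}: old={old_escalation_threshold(cap)}, new={NEW_ESCALATION_THRESHOLD}")
# cap=10: old=2, new=2
# cap=50: old=10, new=2
```

Under the old formula, raising the cap from 10 to 50 would have delayed escalation from the 2nd to the 10th restart; with the literal, the cap no longer affects when Claude takes over.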
src/autosentry/state.py, around lines 175-176:

    ``max_restarts <= 0`` is the sentinel for "unlimited" — the default.
    Centralized so every caller (Monitor, doctor, vault) agrees on the

Update stale default wording in budget_exhausted docstring.
Line 175 says the unlimited sentinel is “the default,” but the current default restart cap is 50. Please align this wording to avoid operator confusion.
Summary
The previous defaults gave up too quickly for the long-running workloads autosentry exists to babysit (ML training, multi-hour data pipelines), while still letting a stuck-loop healer spam the API. Both knobs are now wide enough that productive workloads never feel capped, narrow enough that broken healing eventually stops.

| knob | was | now |
|---|---|---|
| process.restart_policy.max_restarts | 10 | 50 |
| healing.budget.max_attempts_per_detector_per_hour | 5 | 60 |
| healing.budget.max_wall_seconds_per_incident | 600 | 7200 |

`max_restarts` is reframed as a kill-switch, not a budget: `state.restarts` zeros on every kept fix, so this only trips when the healer can't land a kept fix for N restarts in a row. A productive healer runs the supervisor indefinitely.

`healing.escalate_to_claude_after` decouples from `max_restarts // 5` to a literal `2`. The old formula was backwards: raising `max_restarts` pushed Claude escalation later, letting rules monopolize a bigger slice of the budget. Now rules get two cheap shots regardless of cap.

New helpers in `autosentry.state`:
- `budget_exhausted(restarts, max_restarts)`: single source of truth, honors the `0 = unlimited` sentinel.
- `format_budget(max_restarts)`: renders `∞` for unlimited.

Existing configs keep their explicit values; only fresh `init` and `init --upgrade` pick up the new defaults (upgrade prompts per-key, so users can decline).

Test plan
- `test_reset.py` (budget_exhausted, format_budget)
- `test_restart_budget.py` (literal-2 escalation)
- `test_cli.py` (template default → 50)
- `test_pipeline.py` (stage inherits new default)
- `ruff check` + `ruff format` + `pyrefly` clean

🤖 Generated with Claude Code
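The kill-switch framing above can be illustrated with a toy simulation. This is a sketch under stated assumptions: `SupervisorState`, `record_attempt`, and `first_trip` are hypothetical names, and exactly where the real supervisor increments and resets `state.restarts` is not shown in the PR. Only the reset-on-kept-fix rule and the `0 = unlimited` sentinel are taken from the description.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class SupervisorState:
    restarts: int = 0  # consecutive restarts since the last kept fix


def record_attempt(state: SupervisorState, kept_fix: bool, max_restarts: int) -> bool:
    """Update the counter for one healing attempt; return True if the kill-switch trips."""
    if kept_fix:
        state.restarts = 0  # a kept fix zeros the counter
        return False
    state.restarts += 1
    # max_restarts <= 0 is the "unlimited" sentinel, so it never trips.
    return max_restarts > 0 and state.restarts >= max_restarts


def first_trip(outcomes: Iterable[bool], max_restarts: int) -> Optional[int]:
    """Return the 1-based attempt at which the switch trips, or None if it never does."""
    state = SupervisorState()
    for attempt, kept in enumerate(outcomes, start=1):
        if record_attempt(state, kept, max_restarts):
            return attempt
    return None


# A healer that lands a kept fix on every 15th attempt.
productive = [n % 15 == 0 for n in range(1, 301)]
# A healer that never lands a kept fix.
stuck = [False] * 100

print(first_trip(productive, 10))  # 10: the old cap gives up before the first kept fix
print(first_trip(productive, 50))  # None: the new cap never trips for this healer
print(first_trip(stuck, 50))       # 50: broken healing still stops
```

Because the counter only measures restarts since the last kept fix, the cap bounds a losing streak rather than the total work a supervisor may do.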
Summary by CodeRabbit
New Features
Configuration Updates