Fix balancer lockup when rejecting probes on empty batteries#339
Fix balancer lockup when rejecting probes on empty batteries#339
Conversation
In _reject_probe, the just-rejected candidate was re-inserted at the front of the deprioritized section of the priority list, so the next _maybe_force_swap_saturated scan kept re-picking the same battery once the saturation flag cleared. The rejection also reset _last_rotation, suppressing any scheduled rotation for a full efficiency_rotation_interval, so no other candidate was ever surfaced. Move the rejected candidate to the end of the priority list and drop the _last_rotation update on rejection, so the next tick is eligible to pick a different untested battery immediately. Adds a regression test that reproduces kiss81's 4-consumer/800 W/single active slot scenario and asserts a full battery is active for >= 50% of the final 600 ticks (0% before the fix, ~100% with it). Fixes #230.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
WalkthroughModified the Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary
Fixes a regression where the load balancer would cycle forever between empty batteries while healthy ones remained permanently deprioritized. The issue occurred when probe rejection logic reinserted candidates near the front of the deprioritized section and suppressed scheduled rotation, causing the same empty batteries to be repeatedly selected.
Changes
Modified
_reject_probemethod inLoadBalancer:self._last_rotation = nowassignment that was suppressing scheduled rotation after each rejectionAdded regression test (
test_balancer_empty_battery_lockup.py):_FakeClockharness to drive the balancer deterministicallyImplementation Details
The fix addresses two interrelated issues:
_maybe_force_swap_saturated_last_rotationupdate allows the normalefficiency_rotation_intervaltimer to trigger scheduled rotation, ensuring all batteries get probed eventually rather than being starved by repeated rejectionshttps://claude.ai/code/session_01AyXR3Hw2TVMu59EQStvt2Z
Summary by CodeRabbit
Bug Fixes
Tests