Skip to content

fix(sync): resolve sporadic deadlocks during rapid db teardown#431

Open
borjamoskv wants to merge 1 commit into
mainfrom
fix/sync-deadlock-413
Open

fix(sync): resolve sporadic deadlocks during rapid db teardown#431
borjamoskv wants to merge 1 commit into
mainfrom
fix/sync-deadlock-413

Conversation

@borjamoskv

Copy link
Copy Markdown
Owner

Fixes #413

  • Wrap asyncio.gather() in asyncio.wait_for() with 5s timeout
  • On TimeoutError, log warning and force loop.close() to avoid indefinite hang when tasks hold SovereignLock during teardown
  • return_exceptions=True ensures atomic collection of CancelledError from all tasks before proceeding to shutdown_asyncgens/executor
  • Homeostasis preserved: existing close() path unchanged

🧠 CORTEX-PERSIST PULL REQUEST

■ EPISTEMIC HUMILITY CHECKLIST

All generative AI code is treated as conjecture until deterministic validation is proven. You MUST check all boxes before this PR can be merged.

  • Determinism: I have not introduced any stochastic behaviour in the core runtime without deterministic guards.
  • Hash Continuity: I have verified that changes to persistence do not break existing SHA-256 Merkle chain validation.
  • C5-REAL Validation: I have executed the test suite locally and the output is deterministically successful.
  • Industrial Noir 2026: No UI/CLI changes in this PR.

■ ARCHITECTURAL IMPACT

Context: SyncMixin.close_sync() runs task cancellation via loop.run_until_complete(asyncio.gather(*tasks)) with no timeout. When tasks hold SovereignLock references during rapid teardown sequences (e.g., test suite 2228-gate), the gather blocks indefinitely causing sporadic CI timeout errors.

Changes: Added asyncio.wait_for(..., timeout=5.0) wrapper around the gather. On TimeoutError, a WARNING is logged and teardown proceeds forcefully — residual blocking calls in teardown() are abandoned. This unblocks the event loop and allows loop.close() to complete.

Telemetry / Performance Delta: No performance regression. Timeout path only triggers on pathological shutdown sequences. Steady-state close latency unchanged.

■ VERIFICATION EVIDENCE

# Expected: 100% pass rate in 2228-gate test suite
# Expected: no more TimeoutError in CI teardown logs
# Expected: Homeostasis verified

Fixes #413

- Wrap asyncio.gather() in asyncio.wait_for() with 5s timeout
- On TimeoutError, log warning and force loop.close() to avoid
  indefinite hang when tasks hold SovereignLock during teardown
- return_exceptions=True ensures atomic collection of CancelledError
  from all tasks before proceeding to shutdown_asyncgens/executor
- Homeostasis preserved: existing close() path unchanged
@github-actions

github-actions Bot commented Jun 6, 2026

Copy link
Copy Markdown
Contributor

∞ MÖBIUS — PR Analysis

Metric Value
Files changed 1
Total changes 30 (+/-)
Complexity low
Est. review time 5 min
Has tests? ⚠️ Missing
Has Rust changes? No

Labels applied: engine

Warning

No test files detected in this PR. Consider adding tests.


Generated by MÖBIUS (Clojure/Babashka) — where code IS data

@github-actions github-actions Bot added the engine label Jun 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix(sync): resolve sporadic deadlocks during rapid db teardown

1 participant