Move pending timers between the virtual and real event loops #2

Open: spuun wants to merge 2 commits into main from move-pending-timers-to-event-loop
spuun commented Jun 12, 2026

Contributor

Makes TimeControl.control non-destructive at both boundaries: fibers that are mid-sleep (or waiting on a select timeout) when control starts are pulled into virtual time, and any that are still pending when control ends are handed back to the real event loop. Previously, leftover timers raised PendingTimersError, and fibers already sleeping before control began were invisible to the virtual clock (they woke in real wall-clock time).

Exit side — re-attach pending timers to the real event loop

Instead of raising PendingTimersError when the block exits with timers still pending, each pending timer is re-attached to the real event loop with the virtual time that remained until it would have fired (wake_at - virtual_now). The parked fibers wake up later in real time; control returns immediately.

  • Tracks pending timers with their remaining time instead of just counting them.
  • Removes PendingTimersError (the abstract TimeControl::Error base remains). This is a breaking API change.
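
The remaining-time bookkeeping described above can be sketched roughly as follows. This is a toy model rather than the code in src/time_control/context.cr: `PendingTimer`, `remaining`, and the use of `Time::Span` offsets from the start of control are assumptions made for illustration, and the clamp at zero is a defensive choice that the PR description does not mention.

```crystal
# Hypothetical model: virtual instants are represented as Time::Span offsets
# from the moment control started. The real code works with Time::Instant.
record PendingTimer, name : String, wake_at : Time::Span

# Virtual time left until the timer would have fired (wake_at - virtual_now).
# The clamp at zero is an assumption, not something the PR states.
def remaining(timer : PendingTimer, virtual_now : Time::Span) : Time::Span
  left = timer.wake_at - virtual_now
  left < Time::Span.zero ? Time::Span.zero : left
end

virtual_now = 3.seconds
pending = [
  PendingTimer.new("sleeper", 5.seconds),
  PendingTimer.new("select timeout", 3.seconds + 250.milliseconds),
]

# On exit, each pending timer would be re-armed on the real loop with this duration.
pending.each do |timer|
  puts "#{timer.name}: re-attach in #{remaining(timer, virtual_now).total_milliseconds} ms"
end
```

Carrying a duration instead of an absolute deadline means the re-armed real timer does not depend on how far virtual time drifted from the wall clock while control was active.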

Entry side — adopt pre-existing real-loop timers into virtual time

At control start, every execution context's event loop is scanned (Fiber::ExecutionContext.each), and its Sleep/SelectTimeout events are removed from the real Crystal::EventLoop::Polling timer heap and re-registered on the virtual clock — so they wake when virtual time is advanced past their deadline rather than in real time. The two clocks share an origin at control start, so the event's real wake_at maps directly to the virtual deadline.

  • Adopted sleeps carry an on_wake proc that marks the real (stack-allocated) sleep event timed out before enqueuing, avoiding the event loop's "manually resumed before the timer expired" guard. The proc is honored on both the advance path and the exit-side re-attach path.
  • Adopted select timeouts reuse the existing native wake/cancel paths.
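
As a rough illustration of the adoption idea, the toy clock below registers timers by deadline and fires them only when virtual time is advanced past that deadline. `ToyVirtualClock`, `adopt`, and `advance` are hypothetical names, deadlines are modeled as `Time::Span` offsets from the shared origin, and the `on_wake` block merely stands in for the proc that marks the real sleep event timed out before enqueuing the fiber.

```crystal
# Hypothetical toy model of a virtual clock; not the library's Context class.
class ToyVirtualClock
  record Timer, wake_at : Time::Span, on_wake : Proc(Nil)

  # Current virtual time, as an offset from the origin shared with the real clock.
  getter now : Time::Span = Time::Span.zero

  def initialize
    @timers = [] of Timer
  end

  # Adopts a timer whose real deadline lies `wake_at` past the shared origin.
  # Because both clocks start from the same origin, no conversion is needed.
  def adopt(wake_at : Time::Span, &on_wake : ->) : Nil
    @timers << Timer.new(wake_at, on_wake)
  end

  # Moves virtual time forward and fires every due timer, earliest deadline first.
  def advance(by : Time::Span) : Nil
    @now += by
    due, rest = @timers.partition { |timer| timer.wake_at <= @now }
    @timers = rest
    due.sort_by!(&.wake_at).each(&.on_wake.call)
  end
end

clock = ToyVirtualClock.new
woken = [] of String
clock.adopt(5.seconds) { woken << "late sleeper" }
clock.adopt(2.seconds) { woken << "early sleeper" }

clock.advance(3.seconds) # only the 2-second deadline has passed
puts woken.inspect
```

In this model no real time elapses at all: a sleeper adopted at entry wakes purely as a consequence of `advance`, which is the behavior the entry-side change is described as providing.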

Scope limits (by design)

  • Polling backends only (kqueue/epoll); no-op on LibEvent/IOCP, same as the existing IO-timeout interception.
  • Only sleeps and select timeouts migrate. Pending IO-operation timeouts stay on the real loop (tied to a live IO wait).

Files

  • src/time_control/context.cr — pending-timer tracking, on_wake, adopt_sleep/adopt_select_timeout.
  • src/time_control/core_ext/crystal/event_loop/timers.cr (new) — read-only traversal of the pairing-heap timer structure.
  • src/time_control/core_ext/crystal/event_loop/polling.cr — extract-and-re-register driver.
  • src/time_control/errors.cr — drop PendingTimersError.
  • src/time_control.cr — wire adoption into control.

Testing

  • New specs cover exit-side re-attach (sleep + select timeout) and entry-side adoption (sleep, select timeout, and a sleeper in a separate Isolated context).
  • spec/time_control_spec.cr passes 8/8 across stress runs; the MT adoption test passes 6/6 in isolation.

⚠️ Note: the multi-threaded IO-timeout specs (spec/time_control_mt_spec.cr) have a pre-existing intermittent hang in the interrupt-based IO path — reproducible on main and unrelated to this change.

🤖 Generated with Claude Code

spuun and others added 2 commits June 12, 2026 12:54
Instead of raising PendingTimersError when the control block exits with
virtual timers still pending, track each pending timer together with the
virtual time remaining until it would have fired (wake_at - virtual_now)
and re-attach it to the real event loop with that remaining duration.
Parked sleeping fibers and select timeouts then wake up later in real
time; control returns immediately.

Removes PendingTimersError (the abstract Error base remains).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Fibers already sleeping or waiting on a select timeout on the real event
loop when control begins are now adopted into virtual time, the mirror
image of the exit-side re-attach. At control start every execution
context's event loop is scanned (Fiber::ExecutionContext.each); sleep and
select-timeout events are removed from the real Crystal::EventLoop::Polling
timer heap and re-registered on the virtual clock, so they wake when
virtual time is advanced past their deadline instead of in real time.

Adopted sleeps carry an on_wake proc that marks the real (stack-allocated)
sleep event timed out before enqueuing, avoiding the event loop's
"manually resumed before the timer expired" guard; this proc is honored
on both the advance path and the exit-side re-attach path.

Polling builds only (kqueue/epoll). IO-operation timeouts are left on the
real loop since they are tied to a live IO wait.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Pull request overview

This PR makes TimeControl.control non-destructive at both boundaries by (1) adopting pre-existing real event-loop sleep/select-timeout timers into virtual time at control entry, and (2) re-attaching any still-pending virtual timers back onto the real event loop at control exit instead of raising an error.

Changes:

  • Remove PendingTimersError and track pending timers with remaining time so they can be rescheduled on exit.
  • Add Polling-only traversal/extraction of real event-loop timers to adopt existing sleeps/select timeouts into the virtual clock.
  • Add/adjust specs and docs to cover adoption and exit-side re-attach behavior (including an isolated execution context case).

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/time_control/errors.cr | Removes PendingTimersError (breaking API change). |
| src/time_control/core_ext/crystal/event_loop/timers.cr | Adds read-only traversal over the Polling timer heap for adoption. |
| src/time_control/core_ext/crystal/event_loop/polling.cr | Extracts pending sleep/select-timeout timers from Polling loops and wires adoption across execution contexts. |
| src/time_control/context.cr | Adds on_wake, pending-timer tracking, adoption helpers, and exit-side rescheduling. |
| src/time_control.cr | Wires timer adoption at entry and pending-timer rescheduling at exit. |
| spec/time_control_spec.cr | Replaces pending-timer error spec with re-attach specs; adds adoption specs for sleep/select timeout. |
| spec/time_control_mt_spec.cr | Adds MT coverage for adopting a pre-existing sleep in an isolated execution context. |
| README.md | Documents adoption behavior and the new pending-timer exit semantics. |
| AGENTS.md | Updates architecture notes and public API list to reflect new behavior and removed error class. |

Comment on lines +93 to +98

    # Adopts a fiber that was already sleeping on the real event loop when
    # control started. *wake_at* is the event's real monotonic deadline, which
    # equals the virtual deadline because both clocks share their origin at
    # control start. *on_wake* marks the real (stack-allocated) sleep event
    # timed out before enqueuing the fiber.
    def adopt_sleep(fiber : Fiber, wake_at : Time::Instant, on_wake : Proc(Nil)) : Nil

Comment on lines +106 to +110

    # Adopts a fiber that was already waiting on a `select` timeout on the real
    # event loop when control started. Reuses the native select-timeout wake and
    # cancel paths.
    def adopt_select_timeout(fiber : Fiber, wake_at : Time::Instant) : Nil
      notify = @timers_mutex.synchronize do

Comment on lines +178 to +182

    # Re-attaches any timers that were still pending when control stopped to the
    # real event loop, each with the virtual time that remained until it would
    # have fired. Must be called after control has stopped (i.e. while time is no
    # longer being intercepted) so that `sleep` uses the real event loop.
    def reschedule_pending_timers : Nil

Comment thread src/time_control.cr

Comment on lines 105 to +109

     ensure
       @@context = nil
       ctx.try &.stop
       isolated.try &.wait
    -  if ctx && ctx.leaked_timer_count > 0
    -    raise PendingTimersError.new(ctx.leaked_timer_count)
    -  end
    +  ctx.try &.reschedule_pending_timers

2 participants