Feat/coroutine pool #37
Conversation
…sity bench

Phase 1 Step 9 of .agents/TODO.md asks the density benchmark to measure event-loop responsiveness (median + p99 task scheduling latency) and to record a hardware fingerprint (CPU, RAM, kernel, Python). The previous version captured RSS, success/failure counts, and elapsed time; these two additions close the gap so the Step 10 results doc has every input it needs.

- Scheduler latency: a background task requests a 10 ms sleep on the loop and records the difference between the requested wakeup time and the actual return time. Under task pressure this delta rises and signals event-loop starvation. The result includes sample count, median, p99, and max in milliseconds.
- Hardware fingerprint: `platform.processor` / `os.cpu_count` / `psutil.virtual_memory().total` / `platform.uname` / `platform.python_version`, captured per run so benchmark numbers are reproducible across machines.

A smoke run with `--sessions 5` confirms the new fields render in the human-readable output and the JSON shape stays valid. Full test suite: 374 passed, 2 skipped; ruff and mypy strict clean.
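The latency-sampling technique described above can be sketched as follows. This is a minimal, self-contained illustration under stated assumptions, not the harness's actual code: `sample_latency`, `SAMPLE_INTERVAL`, and the workload stand-in are hypothetical names, and the real benchmark's interval constant and stop handling may differ.

```python
import asyncio
import statistics
import time

SAMPLE_INTERVAL = 0.010  # request a 10 ms sleep per sample (assumed interval)


async def sample_latency(stop: asyncio.Event, samples: list[float]) -> None:
    """Record scheduler wakeup latency: actual wakeup minus requested wakeup, in ms."""
    while not stop.is_set():
        target = time.monotonic() + SAMPLE_INTERVAL
        await asyncio.sleep(SAMPLE_INTERVAL)
        # Under a loaded loop the sleep returns late; the delta is the latency.
        samples.append(max(0.0, (time.monotonic() - target) * 1000.0))


async def main() -> dict[str, float]:
    samples: list[float] = []
    stop = asyncio.Event()
    sampler = asyncio.create_task(sample_latency(stop, samples))
    await asyncio.sleep(0.2)  # stand-in for the real benchmark workload
    stop.set()
    await sampler
    # "inclusive" keeps cut points inside the observed data range;
    # index 98 of the 100-quantile cut points is the 99th percentile.
    if len(samples) >= 2:
        p99 = statistics.quantiles(samples, n=100, method="inclusive")[98]
    else:
        p99 = samples[0] if samples else 0.0
    return {
        "samples": len(samples),
        "median_ms": statistics.median(samples),
        "p99_ms": p99,
        "max_ms": max(samples),
    }


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The key design point is that the sampler never measures the workload directly; it only measures how late the loop wakes it up, which is exactly the starvation signal the benchmark wants.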
Three back-to-back N=50 runs against the §7 success gate, captured through the extended benchmark harness from 32bde3a (scheduler latency + hardware fingerprint). All three pass: 50/50 sessions, 0 failures, peak RSS 354-367 MB (well under the 4 GB budget), elapsed 1.02-1.08 s. Median scheduler latency is ~1.07 ms, with p99 in the 3-6 ms band. Tail-max spikes on two of the three runs (50 and 64 ms) were single-sample environmental transients; the clean Run C lands at p99 3.19 ms, max 3.36 ms. No regression from the harness instrumentation additions. Gate verdict unchanged: the Phase 1 §7 gate is met.
Important: Review skipped. Auto reviews are disabled on this repository; check the settings in the CodeRabbit UI or the ⚙️ Run configuration. Configuration used: Path: .coderabbit.yaml, Review profile: CHILL, Plan: Pro.
Codecov Report: ✅ All modified and coverable lines are covered by tests.
🧹 Nitpick comments (2)
tests/benchmarks/density.py (2)
69-71: ⚡ Quick win: Replace loose metrics dictionaries with typed dataclasses.

`scheduler_latency_ms: dict[str, float]` and `hardware: dict[str, Any]` lose schema guarantees and force awkward typing (for example, `"samples"` is a count but stored as a float). Use typed dataclasses (or typed models) for these payloads.

Suggested direction:

```diff
+@dataclass(frozen=True)
+class SchedulerLatencyStats:
+    samples: int
+    median_ms: float
+    p99_ms: float
+    max_ms: float
+
+@dataclass(frozen=True)
+class HardwareFingerprint:
+    cpu_model: str
+    cpu_count: int | None
+    total_ram_gb: float | None
+    kernel: str
+    python_version: str
@@
-    scheduler_latency_ms: dict[str, float] = field(default_factory=dict)
-    hardware: dict[str, Any] = field(default_factory=dict)
+    scheduler_latency_ms: SchedulerLatencyStats | None = None
+    hardware: HardwareFingerprint | None = None
```

As per coding guidelines, "Prefer `dataclass` or clear typed classes over loose dictionaries for structured data" and "Prefer concrete types over `Any` in Python type hints."

Also applies to: 236-245
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/benchmarks/density.py` around lines 69 - 71, Replace the loose dicts by defining explicit dataclasses for the scheduler latency payload and the hardware payload, then change the fields scheduler_latency_ms and hardware on the enclosing dataclass to use those concrete types (with default_factory for each dataclass) instead of dict[str, float] and dict[str, Any]; ensure the scheduler dataclass uses proper numeric types for counts (e.g., int for "samples") and floats for times, and the hardware dataclass types each field explicitly (no Any), update any code constructing or accessing scheduler_latency_ms and hardware to use the new attribute names/types on the dataclass (e.g., MySchedulerLatency and MyHardware classes referenced where scheduler_latency_ms and hardware were used), and apply the same replacement for the other occurrence noted around the second block.
162-163: ⚡ Quick win: Replace `suppress(CancelledError)` with explicit stop-event timeout waits.

The latency sampler suppresses `CancelledError`, which weakens shutdown semantics and can mask unexpected cancellations. Use `asyncio.wait_for(stop.wait(), timeout=...)` with `suppress(TimeoutError)` instead, matching the pattern already used in `_sample_loop_rss` in the same file. This respects the stop signal immediately without swallowing exceptions.

Suggested patch:

```diff
 async def _sample_loop_latency(stop: asyncio.Event, samples: list[float]) -> None:
     """Background task: measure scheduler wakeup latency in milliseconds.

     A small ``asyncio.sleep`` is requested every interval; the delta between
     the requested wakeup time and the actual return time is the loop's
     scheduling latency. Under heavy task pressure this rises and signals
     starvation of the event loop.
     """
     while not stop.is_set():
         target = time.monotonic() + _LATENCY_SAMPLE_INTERVAL_SECONDS
-        with contextlib.suppress(asyncio.CancelledError):
-            await asyncio.sleep(_LATENCY_SAMPLE_INTERVAL_SECONDS)
+        try:
+            await asyncio.wait_for(stop.wait(), timeout=_LATENCY_SAMPLE_INTERVAL_SECONDS)
+        except TimeoutError:
+            pass
+        if stop.is_set():
+            break
         actual = time.monotonic()
         samples.append(max(0.0, (actual - target) * 1000.0))
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/benchmarks/density.py` around lines 162 - 163, Replace the current "with contextlib.suppress(asyncio.CancelledError): await asyncio.sleep(_LATENCY_SAMPLE_INTERVAL_SECONDS)" in the latency sampling loop with an explicit wait on the stop event: use "await asyncio.wait_for(stop.wait(), timeout=_LATENCY_SAMPLE_INTERVAL_SECONDS)" and wrap that in "contextlib.suppress(asyncio.TimeoutError)" (mirroring _sample_loop_rss) so the loop respects immediate stop signals instead of swallowing CancelledError; keep the existing _LATENCY_SAMPLE_INTERVAL_SECONDS and stop symbols to locate the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 110e5ee4-fc2d-44f8-9b36-19aa5a23a102
📒 Files selected for processing (2)
- docs/benchmarks/density-v0.1.md
- tests/benchmarks/density.py
Codex review nitpicks on tests/benchmarks/density.py:

1. Loose dicts replaced with explicit dataclasses. `SchedulerLatency` carries `samples` (int) and `median`/`p99`/`max` (float, ms). `Hardware` carries `cpu_model` (str), `cpu_count` (int | None), `total_ram_gb` (float), `kernel` (str), `python_version` (str). `DensityResult.scheduler_latency_ms` and `.hardware` are now typed `Optional[SchedulerLatency]` and `Optional[Hardware]` instead of `dict[str, float]` / `dict[str, Any]`. `_hardware_fingerprint()` returns `Hardware`. `_format_human` accesses attributes by name (no key lookup). The JSON output shape stays stable (`asdict` serialization), with a samples-as-int fix; downstream consumers see the same keys.

2. `_sample_loop_latency` now waits on `stop` with `asyncio.wait_for(...)` and `contextlib.suppress(TimeoutError)`, mirroring `_sample_rss`. This replaces the old `contextlib.suppress(asyncio.CancelledError)` + `asyncio.sleep` pattern, which swallowed cancellation and ignored the stop event between samples. The loop now exits within one sample interval of `stop.set()` instead of always waiting out the full interval.

Verification:
- `uv run python tests/benchmarks/density.py --sessions 5` renders the same human-readable output with the new dataclass fields
- `uv run python tests/benchmarks/density.py --sessions 3 --json` produces the same shape (`samples` is int now, not float)
- `uv run pytest`: 374 passed, 2 skipped
- `uv run ruff check` / `ruff format`: clean
- `uv run mypy src/`: clean (the benchmark lives in tests/, outside the strict typing surface, but the field changes don't ripple into src/ either way)
The density benchmark imports psutil at module load time (tests/benchmarks/density.py:41, added in 32bde3a) to capture total RAM in the hardware fingerprint. psutil currently resolves transitively through livekit-agents, but that is a fragile guarantee, and the benchmark is documented as directly runnable (`uv run python tests/benchmarks/density.py`). Declare it explicitly under `[dependency-groups].dev` so contributors get a clean source install with the benchmark working out of the box. End-user installs are unaffected: psutil stays out of the runtime extras and the wheel.

Codex review (P2 finding from the feat/coroutine-pool branch review): undeclared import.

Verification:
- `uv sync --group dev` resolves cleanly; psutil 7.2.2 installed
- `uv run pytest`: 374 passed, 2 skipped
- `uv run ruff check`: clean
- `uv run mypy src/`: clean
- `uv run python tests/benchmarks/density.py --sessions 3` exits 0 with the hardware fingerprint rendered
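The declaration might look like the following pyproject.toml fragment (PEP 735 dependency groups, as consumed by `uv sync --group dev`); the version floor shown is illustrative, not taken from the repo, and the group would normally also list the project's other dev tools.

```toml
[dependency-groups]
dev = [
    # Needed by tests/benchmarks/density.py for the hardware fingerprint
    # (total RAM via psutil.virtual_memory().total).
    "psutil>=7.0",
]
```

Because dependency groups are not part of the published package metadata, this keeps psutil out of end-user installs while guaranteeing it for contributors.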