perf(cv): decouple frame capture from inference in YOLOPoseSource #64

Merged
rwlove merged 1 commit into main from perf/cv-doublebuffer-vectorize
Jul 2, 2026
rwlove commented Jul 2, 2026

Summary

For live sources, _stream_one_capture now runs cap.read() and model.predict() as independent asyncio tasks communicating through a size-1 slot with drop-newest semantics (a newer frame replaces any older frame the predictor has not consumed yet). Previously they were serialised.

  • Reader keeps publishing _latest_frame / _frame_id (which drives the JPEG cache from PR #61, "perf(cv): cache last-encoded JPEG per frame in healthd's camera snapshot") so the wall preview stays fresh even between predictions.
  • File sources keep the original sequential path (renamed to _stream_sequential) — file replay is predictor-bounded, so double-buffering would race through and drop most frames.
  • Shutdown is careful: signal reader stop → drain slot → cancel reader task → await → THEN release VideoCapture. No zombie threads, no cap.read() racing with cap.release().

Numbers (mocked cap + model, wall-clock accurate)

| Scenario | Old | New | Speedup |
| --- | --- | --- | --- |
| 15 fps camera + 100 ms predict | 5.98 fps | 12.80 fps | 2.14× |
| Fast reader + 200 ms predict | 4.60 fps | 23.93 fps | 5.20× |

Also

Reverted a speculative vectorisation of pose_sequence_to_features: benchmarking showed the NumPy version ran at only 0.6-0.7× the speed of the loop version at realistic clip lengths (T=150-500) because per-call overhead dominates, and the function is called once per completed set anyway, not per frame. Not a hot path; the loop wins.

Test plan

  • 5 new async tests: happy-path yield, drop-newest under slow predictor, clean shutdown on aiter close, EOF sentinel, fps counting
  • 42 pass / 1 skip (was 37/1)
  • Ruff clean

Tag: pump-cv-v0.6.0.

🤖 Generated with Claude Code

…ve sources)

For live sources (RTSP, HTTP, anything not is_file), _stream_one_capture
now runs cap.read() and model.predict() as independent asyncio tasks
communicating through a size-1 slot. Previously they were serialised:
capture waited for predict, predict waited for capture, so a 15 fps
camera + 100 ms/frame model ran end-to-end at ~6 fps.
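
The ~6 fps figure can be checked with a back-of-envelope calculation. This is only a sketch: it assumes a blocking cap.read() on a 15 fps camera waits one full frame interval, and the helper name `serialised_fps` is made up for illustration.

```python
def serialised_fps(camera_fps: float, predict_s: float) -> float:
    """Throughput when every loop iteration reads one frame and then predicts on it."""
    read_s = 1.0 / camera_fps  # assumption: a blocking read waits one frame interval
    return 1.0 / (read_s + predict_s)


if __name__ == "__main__":
    # 1 / (66.7 ms + 100 ms) = 6.00 fps
    print(f"{serialised_fps(15, 0.100):.2f} fps")
```

That lines up with the 5.98 fps measured for the old path in the benchmark.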

Drop-newest slot semantics: if the reader outpaces the predictor, the
older queued frame is discarded and replaced with the newest one, so
despite the name it is the stale frame that gets dropped. The
rep detector only ever needs the most recent frame, so backing up a
FIFO would just add latency between what's happening in the gym and
what the pipeline sees. Reader keeps publishing _latest_frame and
_frame_id (and by extension the JPEG cache from PR #61) on every
decode so the wall preview stays fresh even between predictions.
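
A minimal sketch of such a slot, assuming it is built on asyncio.Queue(maxsize=1). The names `publish_latest` and `demo`, the use of None as the EOF sentinel, and the sleep durations are all hypothetical and not taken from the actual YOLOPoseSource code.

```python
import asyncio


def publish_latest(slot: asyncio.Queue, frame) -> None:
    """Put `frame` into a size-1 slot, discarding any older unconsumed frame."""
    if slot.full():
        slot.get_nowait()  # throw away the stale frame
    slot.put_nowait(frame)


async def demo() -> list[int]:
    slot: asyncio.Queue = asyncio.Queue(maxsize=1)
    seen: list[int] = []

    async def reader() -> None:
        for frame_id in range(10):
            publish_latest(slot, frame_id)
            await asyncio.sleep(0.01)  # fast "camera"
        # EOF sentinel (assumed to be None here); it also replaces an
        # unconsumed last frame, which is acceptable for this sketch.
        publish_latest(slot, None)

    async def predictor() -> None:
        while (frame := await slot.get()) is not None:
            seen.append(frame)
            await asyncio.sleep(0.035)  # slow "model"

    await asyncio.gather(reader(), predictor())
    return seen


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The predictor sees only a subset of the frame ids, always in increasing order, because each time it comes back to the slot it finds the most recent frame rather than a backlog.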

File sources still run the original sequential path (renamed to
_stream_sequential) — file replay is bounded by the predictor rather
than a real-time camera, so double-buffering would race through the
file and drop most frames.

Shutdown is ordered carefully: the outer finally-block signals the reader to stop,
drains the slot to unblock any in-flight put(), cancels the reader
task, awaits it, and only THEN releases the VideoCapture — so we never
release cap while a cap.read() is still on the thread pool.
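
The ordering above can be sketched as follows (Python 3.9+ for asyncio.to_thread). Everything here is illustrative: `FakeCapture` is a stand-in for the VideoCapture object, `reader` and `run_and_shut_down` are hypothetical names, and shielding the in-flight read is just one way to make sure the worker thread has finished before the capture is released. The real code may do this differently.

```python
import asyncio
import threading
import time


class FakeCapture:
    """Stand-in capture that records whether release() raced with a read()."""

    def __init__(self) -> None:
        self._reading = threading.Event()
        self.released = False
        self.raced = False

    def read(self):
        self._reading.set()
        time.sleep(0.02)  # pretend to block on the camera
        self._reading.clear()
        return True, object()

    def release(self) -> None:
        self.raced = self._reading.is_set()
        self.released = True


async def reader(cap, slot: asyncio.Queue, stop: asyncio.Event) -> None:
    while not stop.is_set():
        read = asyncio.ensure_future(asyncio.to_thread(cap.read))
        try:
            ok, frame = await asyncio.shield(read)
        except asyncio.CancelledError:
            await read  # let the worker thread finish its read() first
            raise
        if not ok:
            break
        await slot.put(frame)  # blocks while the slot is full


async def run_and_shut_down(cap) -> None:
    slot: asyncio.Queue = asyncio.Queue(maxsize=1)
    stop = asyncio.Event()
    task = asyncio.create_task(reader(cap, slot, stop))
    try:
        await slot.get()  # consume one frame, then stop (e.g. the aiter was closed)
    finally:
        stop.set()                    # 1. signal the reader to stop
        while not slot.empty():       # 2. drain so a blocked put() can return
            slot.get_nowait()
        task.cancel()                 # 3. cancel the reader task
        await asyncio.gather(task, return_exceptions=True)  # 4. wait for it to unwind
        cap.release()                 # 5. only now release the capture
```

Note that cancelling an asyncio task does not interrupt a thread that is already running, which is why this sketch waits for the pending read inside the reader before re-raising the cancellation.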

Benchmark (mocked cap + model, wall-clock accurate):
- 15 fps camera + 100 ms predict:  5.98 fps → 12.80 fps (2.14x)
- Fast reader + 200 ms predict:    4.60 fps → 23.93 fps (5.20x)

Also reverts a speculative vectorisation of pose_sequence_to_features
that was in my working tree: benchmarking showed the NumPy version ran
at 0.6-0.7x the speed of the loop version at realistic T=150-500 clip
lengths (per-call overhead dominates), and that function is called once
per completed set anyway, not per frame.

5 new async tests cover happy-path yield, drop-newest under a slow
predictor, clean shutdown on aiter close (VideoCapture released, no
zombie reader), EOF sentinel handling, and fps counting still working.
42 pass / 1 skip (was 37/1). Ruff clean.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

rwlove merged commit 9641c40 into main on Jul 2, 2026
5 checks passed