fix(dogstatsd): decouple socket reading from metric processing #76
duncanista merged 2 commits into main
Conversation
Pull request overview
This PR optimizes the DogStatsD metric reader task to reduce per-packet overhead and improve burst absorption through three key optimizations: a buffer pool to eliminate allocations in steady state, a batch drain loop to empty the kernel buffer without re-entering tokio's event loop, and replacing tokio::select! with is_cancelled() checks for lower per-iteration overhead.
Changes:
- Introduced a Packet struct to carry pooled buffers with metadata through the processing pipeline
- Added BufferReader::try_read_into() for non-blocking packet reads to enable batch draining
- Implemented an unbounded channel-based buffer pool where the processor returns buffers to the reader for reuse
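The buffer-recycling pattern behind the pool can be sketched with std channels (the PR itself uses tokio unbounded channels; the helper name `get_buf` and buffer size are illustrative):

```rust
use std::sync::mpsc;

// Pull a recycled buffer from the pool, or allocate on cold start.
// In steady state every buffer the processor finishes with comes back
// through pool_tx, so the reader never hits the allocator.
fn get_buf(pool_rx: &mpsc::Receiver<Vec<u8>>, buf_size: usize) -> Vec<u8> {
    pool_rx.try_recv().unwrap_or_else(|_| vec![0u8; buf_size])
}

fn main() {
    let (pool_tx, pool_rx) = mpsc::channel::<Vec<u8>>();
    // Cold start: pool is empty, so this allocates a fresh buffer.
    let buf = get_buf(&pool_rx, 8192);
    assert_eq!(buf.len(), 8192);
    // Processor is done with the packet: return the buffer to the pool.
    pool_tx.send(buf).unwrap();
    // Steady state: the same allocation is handed back to the reader.
    let reused = get_buf(&pool_rx, 8192);
    assert_eq!(reused.len(), 8192);
}
```

Because buffers only enter the pool when the processor releases them, the pool can never grow past the number of packets in flight, which is what bounds it by the data channel capacity.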
kathiehuang left a comment:
I think this makes sense but this is my first time looking at dogstatsd code 😅 feel free to request other reviewers too!
If you could respond to copilot's comments that would be helpful too!
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.
```rust
// Drain any packets already in the kernel buffer without going
// through tokio's event loop — just raw non-blocking syscalls.
loop {
    let mut buf = pool_rx.try_recv().unwrap_or_else(|_| vec![0u8; buf_size]);
    match reader.try_read_into(&mut buf) {
        Ok(Some((len, source))) => {
            if !try_send_packet(&tx, Packet { buf, len, source }, &mut dropped, &pool_tx) {
                break;
            }
        }
        Ok(None) => {
            let _ = pool_tx.send(buf);
            break;
        }
```
The batch-drain loop runs unconditionally, but try_read_into() returns Ok(None) for non-UDP transports. That means each iteration still pulls a buffer from the pool (or allocates on cold start) and immediately returns it, adding extra channel traffic and potentially an extra allocation per packet for named pipes/mirror transport. Consider gating the drain loop to only run for UdpSocket (e.g., via a supports_try_read()/is_udp() check) so non-UDP paths don’t pay this overhead.
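The suggested gate could be sketched as a default trait method that only the UDP transport overrides (trait and type names here are hypothetical, not the PR's actual API):

```rust
// Hypothetical reader abstraction sketching the reviewer's suggestion.
trait TransportReader {
    // Default: transport does not support non-blocking batch reads.
    fn supports_try_read(&self) -> bool {
        false
    }
}

struct UdpReader;
struct NamedPipeReader;

impl TransportReader for UdpReader {
    fn supports_try_read(&self) -> bool {
        true // only UDP sockets can be drained with raw non-blocking reads
    }
}

impl TransportReader for NamedPipeReader {} // inherits the `false` default

fn main() {
    // The batch-drain loop would be entered only when this returns true.
    assert!(UdpReader.supports_try_read());
    assert!(!NamedPipeReader.supports_try_read());
}
```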
The overhead is trivial — one try_recv() + one send() on unbounded channels per packet for named pipes. Adding a supports_try_read() method adds API surface and branching for a path that barely matters. Named pipes are Windows-only and low-volume compared to UDP.
Reduce per-packet overhead in the reader loop with three optimizations:

- Buffer pool: cycle Vec<u8> buffers between reader and processor via an unbounded channel, eliminating heap allocation per read in steady state. Pool size is naturally bounded by the data channel capacity.
- Batch drain: after each async read, call try_read_into() in a tight loop to drain all pending kernel packets without re-entering tokio's event loop. Most impactful after hypervisor scheduling pauses on Lambda.
- Fast cancellation check: use is_cancelled() (AtomicBool load) at the top of each iteration, with tokio::select! on the async read to ensure the reader exits promptly even on a quiet socket.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
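The fast-cancellation idea reduces to a single atomic load per iteration. A minimal std-only sketch (CancelFlag is a stand-in for a real cancellation token; the actual code pairs this check with tokio::select! on the async read):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// Stand-in for a cancellation token: is_cancelled() is one atomic load,
// much cheaper than polling a cancellation future on every iteration.
#[derive(Clone)]
struct CancelFlag(Arc<AtomicBool>);

impl CancelFlag {
    fn new() -> Self {
        CancelFlag(Arc::new(AtomicBool::new(false)))
    }
    fn cancel(&self) {
        self.0.store(true, Ordering::SeqCst);
    }
    fn is_cancelled(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

fn main() {
    let flag = CancelFlag::new();
    let mut iterations = 0;
    // Reader loop shape: cheap check at the top of each iteration.
    while !flag.is_cancelled() {
        iterations += 1;
        if iterations == 3 {
            flag.cancel(); // simulate a shutdown signal arriving
        }
    }
    assert_eq!(iterations, 3);
}
```

The atomic check alone cannot wake a reader parked on a quiet socket, which is why the commit keeps tokio::select! around the blocking read for prompt exit.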
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.
What does this PR do?
Three optimizations to the reader task introduced in the previous PR, reducing per-packet overhead and improving burst absorption: a buffer pool, a batch drain loop, and a fast cancellation check.
Motivation
The batch drain loop (try_read_into → try_send → repeat) processes all pending packets in a tight syscall loop with zero allocations, emptying the kernel buffer before the next scheduling pause can cause drops.
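The drain semantics can be modeled with a mock in place of the real socket (try_read_into here simulates the kernel buffer with a queue; a real UDP read would surface WouldBlock when nothing is pending):

```rust
use std::collections::VecDeque;
use std::io;

// Mocked non-blocking read: Ok(Some(len)) while packets are pending,
// Ok(None) once the simulated kernel buffer is empty.
fn try_read_into(kernel: &mut VecDeque<Vec<u8>>, buf: &mut Vec<u8>) -> io::Result<Option<usize>> {
    match kernel.pop_front() {
        Some(pkt) => {
            let len = pkt.len();
            buf[..len].copy_from_slice(&pkt);
            Ok(Some(len))
        }
        None => Ok(None), // real code maps WouldBlock to "nothing pending"
    }
}

fn main() {
    let mut kernel: VecDeque<Vec<u8>> = VecDeque::from(vec![vec![1u8; 4], vec![2u8; 4]]);
    let mut buf = vec![0u8; 8192];
    let mut drained = 0;
    // Batch drain: tight loop until the kernel buffer is empty,
    // never yielding back to the async executor in between.
    while let Ok(Some(_len)) = try_read_into(&mut kernel, &mut buf) {
        drained += 1;
    }
    assert_eq!(drained, 2);
}
```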
SVLS-8552