[improve][broker] Improve dispatch performance by summing entry bytes with a loop by void-ptr974 · Pull Request #26055 · apache/pulsar

void-ptr974 · 2026-06-18T06:51:47Z

Motivation

The shared-subscription dispatch path sums entry bytes for every read batch. The existing implementation uses a stream pipeline, which adds avoidable allocation and dispatch overhead on this hot path.

Modifications

Replace the stream-based byte summation in both multiple-consumer dispatchers with a simple indexed loop.
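For readers without the diff open, the change can be sketched roughly as below. This is an illustration, not the code in the PR: `Entry` here is a hypothetical one-method stand-in for Pulsar's entry type, `TotalBytesSketch` is an invented class name, the method names are borrowed from the benchmark names quoted in the Performance section, and the exact stream expression being replaced is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Pulsar's Entry; only the length accessor matters here.
interface Entry {
    int getLength();
}

final class TotalBytesSketch {

    // Stream-based summation (assumed shape of the code being replaced).
    static long streamTotalBytes(List<Entry> entries) {
        return entries.stream().mapToLong(Entry::getLength).sum();
    }

    // Indexed loop: no stream pipeline objects are created per call.
    static long loopTotalBytes(List<Entry> entries) {
        long total = 0;
        for (int i = 0, n = entries.size(); i < n; i++) {
            total += entries.get(i).getLength();
        }
        return total;
    }

    public static void main(String[] args) {
        // Build 1024 entries with lengths 1..1024.
        List<Entry> entries = new ArrayList<>();
        for (int i = 1; i <= 1024; i++) {
            final int length = i;
            entries.add(() -> length);
        }
        System.out.println(streamTotalBytes(entries)); // prints 524800
        System.out.println(loopTotalBytes(entries));   // prints 524800
    }
}
```

An indexed loop is only cheap when the list supports constant-time `get(i)`, as `ArrayList` does; for a linked list an enhanced `for` loop would be the safer choice.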

Performance

This improves the dispatch hot path by reducing per-read-batch CPU work and eliminating stream pipeline allocation.

JMH data gathered locally using DispatcherDispatchPathBenchmark.(streamTotalBytes|loopTotalBytes), with -p entriesCount=32,256,1000 -wi 2 -i 5 -w 1s -r 1s -f 1 -bm avgt -tu ns -prof gc:

| entriesCount | streamTotalBytes | loopTotalBytes | improvement | allocation |
| --- | --- | --- | --- | --- |
| 32 | 112.4 ns/op | 40.3 ns/op | 64.1% | 256 B/op -> ~0 |
| 256 | 407.3 ns/op | 253.7 ns/op | 37.7% | 256 B/op -> ~0 |
| 1000 | 1886.0 ns/op | 1565.8 ns/op | 17.0% | 256 B/op -> ~0 |
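The improvement column is not defined in the description. Reading it as the relative reduction in average time per operation is an assumption, but it reproduces the reported figures:

```latex
\text{improvement} = \frac{t_{\text{stream}} - t_{\text{loop}}}{t_{\text{stream}}},
\qquad
\frac{112.4 - 40.3}{112.4} \approx 64.1\%,
\quad
\frac{407.3 - 253.7}{407.3} \approx 37.7\%,
\quad
\frac{1886.0 - 1565.8}{1886.0} \approx 17.0\%
```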

Verification

Added PersistentDispatcherTotalBytesTest coverage for empty, single-entry, and large 1024-entry batches with varying real EntryImpl payload sizes.
./gradlew :pulsar-broker:compileJava :pulsar-broker:checkstyleMain --max-workers=1
./gradlew :pulsar-broker:test --tests org.apache.pulsar.broker.service.persistent.PersistentDispatcherTotalBytesTest --max-workers=1 -PtestRetryCount=0
./gradlew :pulsar-broker:checkstyleTest --max-workers=1

merlimat · 2026-06-18T18:40:54Z

        }
    }

+    static long getTotalBytesSize(List<Entry> entries) {


Instead of duplicating the code, we should move to a shared place

void-ptr974 · 2026-06-19

Thanks, done. Moved the helper to AbstractPersistentDispatcherMultipleConsumers and both dispatchers now reuse it.

void-ptr974 force-pushed the optimize-dispatch-total-bytes-loop branch 2 times, most recently from e540c3c to 68c04be · June 18, 2026 07:50

void-ptr974 changed the title ~~[improve][broker] Avoid stream allocation when summing dispatch bytes~~ [improve][broker] Improve dispatch performance by summing entry bytes with a loop Jun 18, 2026

void-ptr974 marked this pull request as ready for review June 18, 2026 12:39

merlimat reviewed Jun 18, 2026


[improve][broker] Avoid stream allocation when summing dispatch bytes

cf896a3

void-ptr974 force-pushed the optimize-dispatch-total-bytes-loop branch from 68c04be to cf896a3 · June 19, 2026 03:01

void-ptr974 wants to merge 1 commit into apache:master from void-ptr974:optimize-dispatch-total-bytes-loop
