[improve][broker] Improve dispatch performance by summing entry bytes with a loop #26055

Open
void-ptr974 wants to merge 1 commit into apache:master from void-ptr974:optimize-dispatch-total-bytes-loop

void-ptr974 commented Jun 18, 2026

Contributor

Motivation

The shared-subscription dispatch path sums entry bytes for every read batch. The existing implementation uses a stream pipeline, which adds avoidable allocation and dispatch overhead on this hot path.

Modifications

Replace the stream-based byte summation in both multiple-consumer dispatchers with a simple indexed loop.
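As a rough illustration of the change described above, the sketch below contrasts a stream-based sum with an indexed loop. The nested `Entry` interface and the `entryOfLength` factory are hypothetical stand-ins so the example compiles on its own; the real dispatcher code and the stream expression it previously used may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

final class TotalBytesSketch {
    /** Hypothetical stand-in for the broker's Entry type; only the length accessor matters here. */
    interface Entry {
        int getLength();
    }

    /** Hypothetical factory used only by this sketch. */
    static Entry entryOfLength(int length) {
        return () -> length;
    }

    /** Stream-based summation: a stream pipeline is built on every call. */
    static long streamTotalBytes(List<Entry> entries) {
        return entries.stream().mapToLong(Entry::getLength).sum();
    }

    /** Indexed-loop summation: accumulates into a long without creating pipeline objects. */
    static long loopTotalBytes(List<Entry> entries) {
        long total = 0;
        for (int i = 0, size = entries.size(); i < size; i++) {
            total += entries.get(i).getLength();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Entry> entries = new ArrayList<>();
        entries.add(entryOfLength(10));
        entries.add(entryOfLength(20));
        entries.add(entryOfLength(12));
        // Both variants should agree on the total.
        System.out.println(streamTotalBytes(entries) + " " + loopTotalBytes(entries));
    }
}
```

Indexed access is cheap for random-access lists such as `ArrayList`; for a linked list, `get(i)` is linear in `i`, so an enhanced `for` loop would be the safer choice there.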

Performance

This improves the dispatch hot path by reducing per-read-batch CPU work and eliminating stream pipeline allocation.

JMH data gathered locally using DispatcherDispatchPathBenchmark.(streamTotalBytes|loopTotalBytes), with -p entriesCount=32,256,1000 -wi 2 -i 5 -w 1s -r 1s -f 1 -bm avgt -tu ns -prof gc:

| entriesCount | streamTotalBytes | loopTotalBytes | improvement | allocation |
| --- | --- | --- | --- | --- |
| 32 | 112.4 ns/op | 40.3 ns/op | 64.1% | 256 B/op -> ~0 |
| 256 | 407.3 ns/op | 253.7 ns/op | 37.7% | 256 B/op -> ~0 |
| 1000 | 1886.0 ns/op | 1565.8 ns/op | 17.0% | 256 B/op -> ~0 |
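The improvement figures are consistent with a relative reduction in average time, `(stream - loop) / stream`. The following sketch recomputes them from the reported ns/op values; the class and method names are made up for this illustration.

```java
import java.util.Locale;

final class ImprovementSketch {
    /** Relative reduction of the candidate time versus the baseline, in percent. */
    static double improvementPercent(double baselineNs, double candidateNs) {
        return (baselineNs - candidateNs) / baselineNs * 100.0;
    }

    public static void main(String[] args) {
        // {streamTotalBytes, loopTotalBytes} in ns/op, taken from the benchmark results above.
        double[][] rows = {{112.4, 40.3}, {407.3, 253.7}, {1886.0, 1565.8}};
        for (double[] row : rows) {
            System.out.printf(Locale.ROOT, "%.1f%%%n", improvementPercent(row[0], row[1]));
        }
    }
}
```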

Verification

  • Added PersistentDispatcherTotalBytesTest coverage for empty, single-entry, and large 1024-entry batches with varying real EntryImpl payload sizes.
  • ./gradlew :pulsar-broker:compileJava :pulsar-broker:checkstyleMain --max-workers=1
  • ./gradlew :pulsar-broker:test --tests org.apache.pulsar.broker.service.persistent.PersistentDispatcherTotalBytesTest --max-workers=1 -PtestRetryCount=0
  • ./gradlew :pulsar-broker:checkstyleTest --max-workers=1
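The three cases named in the first bullet (empty, single-entry, and a large batch with varying sizes) could be exercised roughly as follows. This is a dependency-free sketch, not the actual `PersistentDispatcherTotalBytesTest`: it uses a hypothetical `Entry` stand-in instead of real `EntryImpl` instances and a plain `check` helper instead of a test framework.

```java
import java.util.ArrayList;
import java.util.List;

final class TotalBytesChecks {
    /** Hypothetical stand-in for the broker's Entry type. */
    interface Entry {
        int getLength();
    }

    /** Loop-based helper under test (same shape as the one discussed in this PR). */
    static long getTotalBytesSize(List<Entry> entries) {
        long total = 0;
        for (int i = 0, size = entries.size(); i < size; i++) {
            total += entries.get(i).getLength();
        }
        return total;
    }

    /** Builds `count` entries whose lengths are 1, 2, ..., count. */
    static List<Entry> entriesWithLengths(int count) {
        List<Entry> entries = new ArrayList<>(count);
        for (int i = 1; i <= count; i++) {
            final int length = i;
            entries.add(() -> length);
        }
        return entries;
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        check(getTotalBytesSize(entriesWithLengths(0)) == 0L, "empty batch");
        check(getTotalBytesSize(entriesWithLengths(1)) == 1L, "single entry");
        // 1 + 2 + ... + 1024 = 1024 * 1025 / 2 = 524800
        check(getTotalBytesSize(entriesWithLengths(1024)) == 524_800L, "1024 entries");
        System.out.println("all checks passed");
    }
}
```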

void-ptr974 force-pushed the optimize-dispatch-total-bytes-loop branch 2 times, most recently from e540c3c to 68c04be Jun 18, 2026 07:50
void-ptr974 changed the title from "[improve][broker] Avoid stream allocation when summing dispatch bytes" to "[improve][broker] Improve dispatch performance by summing entry bytes with a loop" Jun 18, 2026
void-ptr974 marked this pull request as ready for review June 18, 2026 12:39
}
}

static long getTotalBytesSize(List<Entry> entries) {

Contributor

Instead of duplicating the code, we should move it to a shared place.

Contributor Author

Thanks, done. Moved the helper to AbstractPersistentDispatcherMultipleConsumers and both dispatchers now reuse it.
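One way to picture the resulting layout is a static helper on the shared abstract base class that both subclasses call. The sketch below assumes that shape; `AbstractDispatcherSketch`, `DispatcherA`, `DispatcherB`, `onEntriesRead`, and the `Entry` stand-in are hypothetical names, and only `getTotalBytesSize` mirrors the method shown in the review snippet.

```java
import java.util.List;

final class SharedHelperSketch {
    /** Hypothetical stand-in for the broker's Entry type. */
    interface Entry {
        int getLength();
    }

    /** Hypothetical factory used only by this sketch. */
    static Entry entryOfLength(int length) {
        return () -> length;
    }

    /** Stand-in for the shared abstract dispatcher base class. */
    abstract static class AbstractDispatcherSketch {
        /** Single shared implementation, so subclasses do not duplicate the loop. */
        static long getTotalBytesSize(List<Entry> entries) {
            long total = 0;
            for (int i = 0, size = entries.size(); i < size; i++) {
                total += entries.get(i).getLength();
            }
            return total;
        }

        /** Hypothetical hook; returns the byte total so the sketch stays observable. */
        abstract long onEntriesRead(List<Entry> entries);
    }

    static final class DispatcherA extends AbstractDispatcherSketch {
        @Override
        long onEntriesRead(List<Entry> entries) {
            return getTotalBytesSize(entries); // inherited static helper
        }
    }

    static final class DispatcherB extends AbstractDispatcherSketch {
        @Override
        long onEntriesRead(List<Entry> entries) {
            return getTotalBytesSize(entries); // same helper, no duplicate code
        }
    }
}
```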

void-ptr974 force-pushed the optimize-dispatch-total-bytes-loop branch from 68c04be to cf896a3 June 19, 2026 03:01