[ntuple] Address some RNTupleProcessor performance bottlenecks by enirolf · Pull Request #22593 · root-project/root

enirolf · 2026-06-12T11:34:15Z

This change prevents the unnecessary re-connection of inner RNTuples in a chain upon calling LoadEntry, by first checking whether the requested entry is before or after the currently loaded entry.

In addition, some unnecessary calls to Initialize and Connect in RNTupleSingleProcessor::GetNEntries have been factored out.
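The first change can be illustrated with a small toy model. This is only a sketch of the idea, not the RNTupleProcessor code: `ToyChain`, its members and the step counter are hypothetical names, and the per-processor entry counts are assumed to be known up front. The lookup restarts from the first inner processor only when the requested entry lies before the currently loaded one; otherwise it continues from the processor that is already connected.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a processor chain; not the RNTupleProcessor API.
class ToyChain {
public:
   explicit ToyChain(std::vector<std::uint64_t> entriesPerProcessor) : fNEntries(std::move(entriesPerProcessor)) {}

   // Returns the index of the inner processor that holds `entryNumber`,
   // or std::nullopt if the entry is past the end of the chain.
   std::optional<std::size_t> LoadEntry(std::uint64_t entryNumber)
   {
      std::size_t proc = fCurrentProcessor;
      std::uint64_t firstEntry = fCurrentFirstEntry;
      // Only a backwards seek forces the search to restart from the first processor.
      if (entryNumber < fCurrentEntry) {
         proc = 0;
         firstEntry = 0;
      }
      // Walk forward until the processor containing the entry is found.
      while (proc < fNEntries.size() && entryNumber >= firstEntry + fNEntries[proc]) {
         firstEntry += fNEntries[proc];
         ++proc;
         ++fNSteps; // one step = moving on to the next inner processor
      }
      if (proc == fNEntries.size())
         return std::nullopt; // out of range: keep the current position unchanged
      assert(entryNumber >= firstEntry);
      fCurrentProcessor = proc;
      fCurrentFirstEntry = firstEntry;
      fCurrentEntry = entryNumber;
      return proc;
   }

   // Total number of forward steps taken so far, as a rough cost measure.
   std::size_t Steps() const { return fNSteps; }

private:
   std::vector<std::uint64_t> fNEntries;  // entries per inner processor
   std::size_t fCurrentProcessor = 0;     // index of the connected inner processor
   std::uint64_t fCurrentFirstEntry = 0;  // global number of its first entry
   std::uint64_t fCurrentEntry = 0;       // last successfully loaded global entry
   std::size_t fNSteps = 0;
};
```

With three inner processors of 10 entries each, loading entry 25 and then entry 27 takes two steps in total, because the second lookup starts from the processor that already holds entry 25.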

vepadulano · 2026-06-12T11:45:24Z

+   std::size_t currProcessorNumber = fCurrentProcessorNumber;
+   ROOT::NTupleSize_t entriesSeen = 0;
+   for (unsigned i = 0; i < currProcessorNumber; ++i) {
+      entriesSeen += fInnerProcessors[i]->GetNEntries();
+   }


Not for this PR, but in principle we could have a cache vector with the number of entries per processor, which is filled lazily at discovery time whenever a processor needs to connect to its file(s).

Actually, this is exactly what is done a few lines down, and I somehow didn't think to do it here, so thanks for pointing this out :D. Let me quickly add it here as well.
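A lazily filled cache of entry counts, as discussed in this thread, could look roughly like the following. This is a minimal sketch under assumptions: `EntryCountCache` and the `expensiveCount` callback are hypothetical and stand in for whatever work is needed to connect to an inner processor and ask for its number of entries; the actual member names in the PR may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical helper, not part of ROOT: remembers the entry count of each
// inner processor the first time it is requested.
class EntryCountCache {
public:
   using Counter_t = std::function<std::uint64_t(std::size_t)>;

   EntryCountCache(std::size_t nProcessors, Counter_t expensiveCount)
      : fCounts(nProcessors), fExpensiveCount(std::move(expensiveCount))
   {
   }

   // Returns the cached count if present; otherwise computes and stores it.
   std::uint64_t GetNEntries(std::size_t i)
   {
      assert(i < fCounts.size());
      if (!fCounts[i])
         fCounts[i] = fExpensiveCount(i);
      return *fCounts[i];
   }

   // Number of entries in the inner processors [0, upTo).
   std::uint64_t EntriesBefore(std::size_t upTo)
   {
      std::uint64_t entriesSeen = 0;
      for (std::size_t i = 0; i < upTo; ++i)
         entriesSeen += GetNEntries(i);
      return entriesSeen;
   }

private:
   std::vector<std::optional<std::uint64_t>> fCounts; // empty slot = not yet known
   Counter_t fExpensiveCount;                         // assumed to be costly (e.g. opens a file)
};
```

Repeated calls to `EntriesBefore` with the same argument invoke the callback at most once per processor, so summing the entries seen before the current processor becomes cheap after the first pass.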

github-actions · 2026-06-12T16:07:07Z

Test Results

21 files 21 suites 3d 5h 44m 41s ⏱️
3 863 tests 3 863 ✅ 0 💤 0 ❌
72 810 runs 72 810 ✅ 0 💤 0 ❌

Results for commit d7d839b.

pcanal · 2026-06-12T22:20:33Z

+   // If the requested entry number is lower than the current entry number, we have to again localise the correct local
+   // entry number starting from the first processor in the chain. Otherwise, we can continue looking from the inner
+   // processor that is currently connected, which is much faster when the chain consists of many inner processors.
   if (entryNumber < fCurrentEntryNumber) {


Can't this be sped up in the case where entryNumber is less than fCurrentEntryNumber but more than the starting entry number of the current file? (And/or is the set of lengths cached, and thus fast to go through again?)
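One way to picture the case raised in this comment is shown below, as a sketch only. It assumes that the entry counts of all inner processors are already cached, which the PR does not guarantee, and `MakeOffsets` and `FindProcessor` are hypothetical helpers rather than existing ROOT functions. A backwards seek that stays inside the current processor is answered without any search, and every other seek becomes a binary search over cumulative offsets.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Builds cumulative offsets from cached per-processor entry counts:
// offsets[i] is the global number of the first entry of processor i,
// and offsets.back() is the total number of entries in the chain.
std::vector<std::uint64_t> MakeOffsets(const std::vector<std::uint64_t> &counts)
{
   std::vector<std::uint64_t> offsets(counts.size() + 1, 0);
   std::partial_sum(counts.begin(), counts.end(), offsets.begin() + 1);
   return offsets;
}

// Returns the index of the processor that contains `entryNumber`.
// `current` is the index of the processor that is connected right now.
std::size_t FindProcessor(const std::vector<std::uint64_t> &offsets, std::size_t current, std::uint64_t entryNumber)
{
   assert(offsets.size() >= 2 && current + 1 < offsets.size());
   assert(entryNumber < offsets.back());
   // Fast path: the entry is still inside the current processor, even if it
   // lies before the entry that was loaded last.
   if (entryNumber >= offsets[current] && entryNumber < offsets[current + 1])
      return current;
   // Otherwise, locate the first offset strictly greater than the entry;
   // the processor just before it contains the entry.
   auto it = std::upper_bound(offsets.begin(), offsets.end(), entryNumber);
   return static_cast<std::size_t>(it - offsets.begin()) - 1;
}
```

Empty inner processors are skipped naturally, because `std::upper_bound` moves past repeated offsets.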

enirolf requested review from hahnjo, pcanal and vepadulano June 12, 2026 11:34

enirolf self-assigned this Jun 12, 2026

enirolf requested a review from jblomer as a code owner June 12, 2026 11:34

enirolf added the in:RNTuple label Jun 12, 2026

enirolf requested a review from silverweed as a code owner June 12, 2026 11:34

vepadulano approved these changes Jun 12, 2026

enirolf added 2 commits June 12, 2026 13:47

[ntuple] Prevent redundant processor (re-)connection in chains

5e31171

[ntuple] Remove unnecessary init and connect of single processors

0c7ea38

enirolf force-pushed the ntuple-proc-chain-bottleneck branch from be1986b to 0c7ea38 Compare June 12, 2026 11:57

[ntuple] Use cached entry counts before calling GetNEntries

d7d839b

enirolf force-pushed the ntuple-proc-chain-bottleneck branch from 724d342 to d7d839b Compare June 12, 2026 13:00

pcanal reviewed Jun 12, 2026

enirolf wants to merge 3 commits into root-project:master from enirolf:ntuple-proc-chain-bottleneck

3 participants
