Skip to content

[ntuple] Address some RNTupleProcessor performance bottlenecks#22593

Open
enirolf wants to merge 3 commits into
root-project:masterfrom
enirolf:ntuple-proc-chain-bottleneck
Open

[ntuple] Address some RNTupleProcessor performance bottlenecks#22593
enirolf wants to merge 3 commits into
root-project:masterfrom
enirolf:ntuple-proc-chain-bottleneck

Conversation

@enirolf

@enirolf enirolf commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

This change prevents the unnecessary re-connection of inner RNTuples in a chain upon calling LoadEntry, by first checking whether the requested entry is before or after the currently loaded entry.

In addition, some unnecessary calls to Initialize and Connect in RNTupleSingleProcessor::GetNEntries have been factored out.

@enirolf enirolf requested review from hahnjo, pcanal and vepadulano June 12, 2026 11:34
@enirolf enirolf self-assigned this Jun 12, 2026
@enirolf enirolf requested a review from jblomer as a code owner June 12, 2026 11:34
@enirolf enirolf requested a review from silverweed as a code owner June 12, 2026 11:34
Comment thread tree/ntuple/src/RNTupleProcessor.cxx Outdated
Comment on lines +358 to +362
std::size_t currProcessorNumber = fCurrentProcessorNumber;
ROOT::NTupleSize_t entriesSeen = 0;
for (unsigned i = 0; i < currProcessorNumber; ++i) {
entriesSeen += fInnerProcessors[i]->GetNEntries();
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for this PR, but in principle we could have a cache vector of number of entries per processor which is filled lazily at discovery time whenever a processor needs to connect to file(s)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this is exactly what is done a few lines down and somehow didn't think to do it here, so thanks for pointing this out :D. Let me quickly add it here as well.

@enirolf enirolf force-pushed the ntuple-proc-chain-bottleneck branch from be1986b to 0c7ea38 Compare June 12, 2026 11:57
@enirolf enirolf force-pushed the ntuple-proc-chain-bottleneck branch from 724d342 to d7d839b Compare June 12, 2026 13:00
@github-actions

Copy link
Copy Markdown

Test Results

    21 files      21 suites   3d 5h 44m 41s ⏱️
 3 863 tests  3 863 ✅ 0 💤 0 ❌
72 810 runs  72 810 ✅ 0 💤 0 ❌

Results for commit d7d839b.

Comment on lines +350 to 353
// If the requested entry number is lower than the current entry number, we have to again localise the correct local
// entry number starting from the first processor in the chain. Otherwise, we can continue looking from the inner
// processor that is currently connected, which is much faster when the chain consists of many inner processors.
if (entryNumber < fCurrentEntryNumber) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this be speed up in case the entryNumber is less than the fCurrentEntryNumber but more than the starting entry number of the current file? (and/or is the set of lengths cached and thus fast to go through again?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants