Improve memory usage of unaligned dataset joins in an RDataFrame analysis #21859

@TomasDado

Description

Explain what you would like to see improved and how.

I had a short discussion with @vepadulano regarding an issue we see in RDataFrame.

We run on TTrees that we need to "join" with BuildIndex(). The problem is that in our case the number of events to match is around 500M, so the in-memory hash map needed for the matching is huge. That would be manageable on its own, but each thread keeps its own copy of the map. In our case, 40 threads use more than 120 GB of memory (it would probably need much more, but that is the limit of our hardware). As you can see, this is quite restrictive, since the only workarounds are to:

  • Use fewer threads
  • Somehow split the files so that no single index needs 500M entries
  • "Just get more RAM"

These are not very compelling options.
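For scale, here is a minimal back-of-envelope sketch of the numbers above. It assumes roughly 16 bytes per index entry (two 64-bit words, one for the index value and one for the entry number, which is the order of magnitude of TTreeIndex's internal arrays); the exact per-entry constant is an assumption, not a measured figure.

```python
def index_memory_gb(n_entries, n_threads, bytes_per_entry=16):
    """Total resident memory (GB) if every thread holds its own index copy."""
    return n_entries * bytes_per_entry * n_threads / 1e9

# One copy of a 500M-entry index vs. 40 per-thread copies.
per_thread = index_memory_gb(500_000_000, 1)
total = index_memory_gb(500_000_000, 40)

print(f"per-thread copy: {per_thread:.0f} GB")  # → 8 GB
print(f"40 threads:      {total:.0f} GB")       # → 320 GB
```

Even under this conservative assumption, the 40-thread total far exceeds the 120 GB available on our machines, which is consistent with the jobs hitting the hardware limit.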

We understand that this is probably beyond the scope of TTree and RDF support, but perhaps it is something that could be improved for RNTuple and RDF, as the current situation with TTrees is not sustainable.

ROOT version

Any

Installation method

Any

Operating system

Any

Additional context

No response
