Skip to content

Avoid concatenating record batches in joins to alleviate memory pressure #23031

@maxburke

Description

@maxburke

Is your feature request related to a problem or challenge?

Many of the joins (such as HashJoin) concatenate input record batches into a single record batch for ease of index accounting, but this causes huge memory pressure issues when the datasets are large.

Describe the solution you'd like

If the joins were able to preserve the original record batches without concatenation, this would help alleviate OOMs on large queries.

Describe alternatives you've considered

No response

Additional context

No response

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request
    No fields configured for Feature.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions