[fix](exec) publish copied local exchange rows#63547
Conversation
|
/review |
There was a problem hiding this comment.
I found one blocking test-coverage gap. The implementation is focused and the added to_block() handoff appears correct for the reviewed COW/mem-reuse failure, but the PR changes three exchanger implementations and only proves one of them.
Critical checkpoint conclusions:
- Goal and proof: the goal is to return copied rows after
MutableBlockmutates/reuses an output block with shared columns. The code addresses this in adaptive, shuffle, and broadcast paths, but tests currently prove only the adaptive path. - Scope: the implementation is small and localized.
- Concurrency/lifecycle: no new concurrency, locks, dependencies, atomics, or lifecycle ownership changes were introduced. Existing queue/dependency behavior is unchanged.
- Config/compatibility/protocol: no new configs, storage/protocol formats, or compatibility concerns.
- Parallel paths: this is the main issue;
ShuffleExchangerandBroadcastExchangerare parallel changed paths and need equivalent coverage. - Error handling/memory safety: existing
RETURN_IF_ERRORbehavior is preserved; no new untracked long-lived memory ownership was introduced. - Transaction/data visibility: not applicable.
- Observability/performance: no new observability need identified; extra
to_block()is only after copied rows are materialized and is consistent with the fix.
User focus: no additional user-provided review focus was supplied.
| block, partitioned_block.first->_data_block); | ||
| RETURN_IF_ERROR(get_data()); | ||
| if (mutable_block.rows() > 0) { | ||
| *block = mutable_block.to_block(); |
There was a problem hiding this comment.
This same MutableBlock handoff fix is applied to ShuffleExchanger here and to BroadcastExchanger below, but the new regression test only exercises AdaptivePassthroughExchanger. The failure mode depends on build_mutable_mem_reuse_block() mutating an output block whose column is shared, so existing tests that call these exchangers with a fresh Block would not catch a regression in the shuffle or broadcast paths. Please add equivalent shared-output-column coverage for the changed shuffle and broadcast get_block() paths, or otherwise prove these two paths are covered by an existing test.
|
run buildall |
TPC-H: Total hot run time: 30953 ms |
TPC-DS: Total hot run time: 170006 ms |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: Local exchange copy-on-read paths copy row slices into a
MutableBlock, but the copied columns were not published back to the outputBlock. When output columns are shared and copy-on-write creates new mutable columns, the source side can return the old empty block and lose rows. This change publishes copied rows back to the output block and adds a BE unit test for the shared-output-column case.Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)