fix(vectorized): correct RIGHT and FULL outer join emission logic#72
Conversation
Provides a simple thread pool for executing tasks in parallel. Used by ParallelVectorizedSeqScanOperator and VectorizedGroupByOperator for parallel batch processing.
VectorizedOperator now inherits from Operator (using OperatorType::Result as base type) to enable polymorphism between Volcano and vectorized operator hierarchies. This allows dynamic_cast to VectorizedOperator* and calling next_batch() on vectorized execution roots.
- Add set_parallel() / is_parallel() / set_storage_manager() APIs
- Add build_vectorized_plan() method for vectorized operator tree construction
- Modify execute_select() to branch on use_vectorized flag
- Add has_sort_or_limit guard to route Sort/Limit queries through Volcano path
- Vectorized path: Scan -> Filter -> HashJoin -> GroupBy -> Project via batch iteration

Sort/Limit queries fall back to the Volcano path since SortOperator/LimitOperator do not inherit from VectorizedOperator.
…tion
- Document set_parallel(true) mode and build_vectorized_plan() in Phase 5
- Document ParallelVectorizedSeqScanOperator, ThreadPool, and batch iteration in Phase 8
New doc covering:
- Volcano vs Vectorized execution models
- Class hierarchy and VectorBatch structure
- QueryExecutor set_parallel(true) integration
- ThreadPool and parallel operators
- Build plan comparison
- Performance characteristics
…pport issues
- Fix CMakeLists.txt to include src/executor/thread_pool.cpp (linker error fix)
- Fix execute_select() to use static_cast<VectorizedOperator*>(root.get()) instead of a stale vec_root pointer after move
- Add HeapTable → ColumnarTable migration in build_vectorized_plan() for both the base table and join tables (INSERT writes to HeapTable, the vectorized path reads from ColumnarTable, so data is now migrated)
- Add try-catch for std::out_of_range (vector access errors) in the vectorized batch iteration loop
- Fix variable shadowing bug in VectorizedProjectOperator construction (projection renamed to proj_exprs to avoid shadowing)
- Add INT32/INT16/INT8/FLOAT32/DECIMAL/BOOL type support to ColumnarTable::append_batch() and read_batch() (types promoted for storage, demoted when reading)
- Add ThreadPoolTests fixture with 5 tests to vectorized_operator_tests
Replace exec.execute("ROLLBACK") with tm.abort(txn) + EXPECT_EQ assertion
in 4 fault-injection tests that previously had no verification:
- UndoPhysicalRemoveFailure (FAULT_PHYSICAL_REMOVE)
- UndoIndexInsertFailure (FAULT_INDEX_INSERT)
- UndoIndexRemoveFailure (FAULT_INDEX_REMOVE)
- UpdateUndoWithIndexInsertFault (FAULT_INDEX_INSERT)
All 4 tests now verify the transaction reaches ABORTED state after
rollback with fault injection, matching the pattern used by
CommitWithLogFailure and AbortWithLogFailure.
- Add right_matched_ bitmap to track matched right rows during probe
- Add right_bucket_rows_ global storage for right row payloads
- Add unmatched_right_rows_ to collect unmatched right global indices
- Add emit_unmatched_right_rows() to emit unmatched right rows with NULL left cols
- In next_batch() Done phase: emit unmatched right rows for RIGHT/FULL joins
- When a match is found during probe: mark right_matched_[global_idx] = true
- Support combined LEFT+FULL: emit unmatched left during probe, unmatched right at end
- PHASE_8_ANALYTICS.md: Update VectorizedHashJoinOperator description to cover INNER/LEFT/RIGHT/FULL join types, the right_matched_ bitmap, and emit_unmatched_right_rows()
- PHASE_5_OPTIMIZATION.md: Document join type support in the QueryExecutor integration section
- README.md: Add outer join support bullet to the Phase 8 description
- Add LEFT/FULL condition for tracking unmatched left rows during probe
- Fix Done phase return value to return true when rows are emitted
- Correct test expectations for RIGHT join (3 rows) and FULL join (4 rows)

Fixes emission of unmatched right rows in VectorizedHashJoinOperator for RIGHT and FULL outer join types.
📝 Walkthrough

Adds a parallel, vectorized execution path: a ThreadPool and ThreadPool-backed vectorized operators (scans, joins, group-by), QueryExecutor branching to build/run vectorized plans, an extended VectorizedHashJoin supporting RIGHT/FULL unmatched emission, expanded ColumnarTable type persistence, tests, and documentation; CMake includes the new thread_pool.cpp.

Changes: Parallel Vectorized Execution Pipeline
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant QueryExecutor
    participant StorageManager
    participant ThreadPool
    participant VectorizedOps
    participant ColumnarTable
    participant Result

    Client->>QueryExecutor: execute_select(stmt, txn)
    QueryExecutor->>QueryExecutor: detect has_sort_or_limit
    alt parallel_ && storage_manager_ set && no sort/limit
        QueryExecutor->>QueryExecutor: build_vectorized_plan(stmt, txn)
        QueryExecutor->>ColumnarTable: open / prepare columnar data
        ColumnarTable->>ThreadPool: submit parallel scan tasks
        ThreadPool-->>ColumnarTable: async scan results (futures)
        loop per VectorBatch
            QueryExecutor->>VectorizedOps: next_batch(batch)
            VectorizedOps->>VectorizedOps: filter / join / group-by / project
            VectorizedOps-->>QueryExecutor: VectorBatch
            QueryExecutor->>Result: convert batch rows to Tuples
        end
    else
        QueryExecutor->>QueryExecutor: build_plan(stmt, txn) (Volcano)
        loop per Tuple
            QueryExecutor->>VectorizedOps: next(tuple)
            VectorizedOps-->>QueryExecutor: Tuple
            QueryExecutor->>Result: append tuple
        end
    end
    QueryExecutor-->>Client: QueryResult
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
include/executor/vectorized_operator.hpp (1)
810-817: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

NULL key handling also misses FULL join type.
When left key is NULL, the row should be tracked as unmatched for FULL joins as well as LEFT joins.
🐛 Proposed fix
```diff
 if (key_val.is_null()) {
   // NULL keys never match - mark as unmatched for LEFT/FULL join
-  if (join_type_ == JoinType::Left) {
+  if (join_type_ == JoinType::Left || join_type_ == JoinType::Full) {
     unmatched_left_indices_.push_back(left_row_idx_);
   }
   left_row_idx_++;
   continue;
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@include/executor/vectorized_operator.hpp` around lines 810 - 817: The NULL-key branch currently only treats LEFT joins as unmatched; update the logic in the block handling key_val.is_null() so that rows are also marked unmatched when join_type_ is JoinType::Full (in addition to JoinType::Left). Concretely, modify the condition that pushes left_row_idx_ into unmatched_left_indices_ to check for (join_type_ == JoinType::Left || join_type_ == JoinType::Full), leaving left_row_idx_ increment and continue behavior unchanged.

tests/transaction_manager_tests.cpp (1)
727-737: ⚠️ Potential issue | 🔴 Critical

Fault injection tests do not exercise the intended code paths — the `txn` created by `tm.begin()` has no undo logs from the DML.

All four changed tests follow a broken pattern:

- Call `tm.begin()` to create an external `txn`
- Call `exec.execute(DML)` without a preceding SQL `BEGIN`
- Arm fault injection and call `tm.abort(txn)`
- Assert `txn->get_state() == ABORTED`

The `QueryExecutor::execute()` method checks:

```cpp
const bool is_auto_commit = (current_txn_ == nullptr);
transaction::Transaction* txn = current_txn_;
if (is_auto_commit && (stmt.type() == Insert || stmt.type() == Delete || ...)) {
    txn = transaction_manager_.begin();  // Creates a NEW auto-commit transaction
}
```

Since no SQL `BEGIN` is issued, `current_txn_` remains `nullptr`, and the executor creates a separate internal auto-commit transaction for the DML. The external `txn` obtained from `tm.begin()` is never associated with that DML, so `txn->get_undo_logs()` is empty when `tm.abort(txn)` is called.

Result:

- `undo_transaction(txn)` iterates an empty log and returns immediately.
- `FAULT_PHYSICAL_REMOVE`, `FAULT_INDEX_INSERT`, `FAULT_INDEX_REMOVE`, `FAULT_UNDO_REMOVE` are never triggered.
- `txn->set_state(ABORTED)` is reached unconditionally, so the assertion trivially passes.

These tests create a false sense of fault injection coverage. Compare with working tests like `InsertThenAbort`, which correctly use `exec.execute("BEGIN")` followed by DML, ensuring the DML executes within the executor's managed transaction.

Fix: Wrap the DML with SQL `BEGIN` and `ROLLBACK`:

```cpp
exec.execute("BEGIN");
exec.execute("INSERT INTO ...");
exec.execute("ROLLBACK");
```

Or provide an API to pass an external `Transaction*` to the executor. This applies to all four test cases: lines 727–737, 772–779, 812–819, and 1101–1109.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/transaction_manager_tests.cpp` around lines 727 - 737, The tests call tm.begin() but then execute DML via exec.execute(...) without a preceding SQL BEGIN, so QueryExecutor::execute() creates an internal auto-commit txn (current_txn_ == nullptr) and the external txn has no undo logs, meaning undo_transaction never hits the fault-injection paths; fix each failing test by ensuring the DML runs inside the executor-managed transaction — call exec.execute("BEGIN") before the INSERT/DELETE and then exec.execute("ROLLBACK") (or let tm.abort operate on the same executor transaction) so the undo logs are populated and FaultInjection (e.g., FAULT_PHYSICAL_REMOVE, FAULT_INDEX_INSERT, FAULT_INDEX_REMOVE, FAULT_UNDO_REMOVE) is exercised instead of using an external tm.begin() detached from QueryExecutor::execute().
🧹 Nitpick comments (3)
include/executor/query_executor.hpp (1)
146-149: 💤 Low value

`thread_pool_` member is declared but appears unused.

The `thread_pool_` shared pointer is declared but, based on the provided context, it's never initialized in the constructor (lines 116-126 in query_executor.cpp) and doesn't appear to be used in the vectorized execution path. If parallel execution via ThreadPool is planned for future use, consider removing this member until it's actually needed to avoid confusion, or add a TODO comment clarifying the intent.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@include/executor/query_executor.hpp` around lines 146 - 149: The member thread_pool_ in QueryExecutor is declared but unused; either remove the declaration "std::shared_ptr<ThreadPool> thread_pool_;" from the class to avoid dead code, or if you intend to support parallel execution later, add a clear TODO comment above it and initialize it (e.g., set to nullptr or accept a ThreadPool parameter) in the QueryExecutor constructor so intent is explicit; reference the member name thread_pool_ and the constructor in query_executor.cpp when making the change.

src/executor/query_executor.cpp (1)
1232-1233: ⚡ Quick win

Remove unused `has_sort_or_limit` parameter.

The `has_sort_or_limit` parameter is passed to `build_vectorized_plan()` but is only consumed by a `(void)` cast at line 1496. Since `use_vectorized` is already false when `has_sort_or_limit` is true (line 392), this function is never called with `has_sort_or_limit == true`.

♻️ Remove unused parameter

In header (`include/executor/query_executor.hpp`):

```diff
-    std::unique_ptr<VectorizedOperator> build_vectorized_plan(const parser::SelectStatement& stmt,
-                                                              transaction::Transaction* txn,
-                                                              bool has_sort_or_limit);
+    std::unique_ptr<VectorizedOperator> build_vectorized_plan(const parser::SelectStatement& stmt,
+                                                              transaction::Transaction* txn);
```

In implementation:

```diff
-std::unique_ptr<VectorizedOperator> QueryExecutor::build_vectorized_plan(
-    const parser::SelectStatement& stmt, transaction::Transaction* txn, bool has_sort_or_limit) {
+std::unique_ptr<VectorizedOperator> QueryExecutor::build_vectorized_plan(
+    const parser::SelectStatement& stmt, transaction::Transaction* txn) {
     (void)txn;  // Vectorized path doesn't use txn yet
-    ...
-    /* Sort and Limit are NOT created here in the vectorized path.
-       When has_sort_or_limit is true, use_vectorized is false so this function
-       is only called for pure vectorized queries (no ORDER BY, no LIMIT). 
-       The Volcano path handles Sort/Limit via build_plan(). */
-    (void)has_sort_or_limit;  // Suppress unused warning
```

Update call site (line 395):

```diff
-        vec_root = build_vectorized_plan(stmt, txn, has_sort_or_limit);
+        vec_root = build_vectorized_plan(stmt, txn);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/executor/query_executor.cpp` around lines 1232 - 1233: The has_sort_or_limit parameter on QueryExecutor::build_vectorized_plan is unused; remove it from the function signature in both the declaration (query_executor.hpp) and definition (query_executor.cpp), delete the (void)has_sort_or_limit cast inside the function body, and update all call sites that pass has_sort_or_limit to call build_vectorized_plan(...) with the reduced parameter list (ensure the function name QueryExecutor::build_vectorized_plan and its callers are updated consistently).

docs/VECTORIZED_EXECUTION.md (1)
48-59: 💤 Low value

Consider adding language identifiers to ASCII diagram code blocks.

Static analysis flagged missing language specifiers on fenced code blocks at lines 48, 145, and 156. While these are ASCII diagrams rather than code, you can use `text` or `plaintext` as the language identifier to satisfy the linter.

Also applies to: 145-153, 156-162
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/VECTORIZED_EXECUTION.md` around lines 48 - 59, The fenced ASCII diagram blocks (e.g., the tree showing "Operator (base)" and its children like "VectorizedOperator", "VectorizedSeqScanOperator", "VectorizedFilterOperator", etc., and the other similar diagrams later in the file) are missing a language specifier; update each fenced code block that contains these ASCII diagrams to include a language identifier such as `text` or `plaintext` (i.e., change ``` to ```text) so the linter/static analysis stops flagging missing language specifiers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/VECTORIZED_EXECUTION.md`:
- Line 72: The docs mention a "BoolVector" type that doesn't match the
implementation; update the documentation for VectorBatch/ColumnVector to
reference NumericVector<bool> (or state that boolean columns use
NumericVector<bool>) instead of BoolVector so the terminology aligns with the
types used (e.g., NumericVector<T>, StringVector, and NumericVector<bool> for
booleans).
In `@include/executor/thread_pool.hpp`:
- Around line 54-59: The increment of pending_tasks_ happens outside the
critical section causing a race where a worker may complete a task before
pending_tasks_.fetch_add(1) runs; move the pending_tasks_.fetch_add(1,
std::memory_order_acq_rel) so it executes while holding mutex_ (i.e., inside the
lock scope that pushes into tasks_) and perform cv_.notify_one() after releasing
the lock; ensure wait() and worker code that does pending_tasks_.fetch_sub(1,
...) remain unchanged so the counter cannot underflow.
In `@include/executor/vectorized_operator.hpp`:
- Around line 800-802: The resumed bucket-scan path only records unmatched left
rows for JoinType::Left; update it to also handle JoinType::Full by checking if
(join_type_ == JoinType::Left || join_type_ == JoinType::Full) before pushing
left_row_idx_ into unmatched_left_indices_, so
unmatched_left_indices_.push_back(left_row_idx_) is executed for both LEFT and
FULL join types (mirror the logic used at lines 852-856).
In `@src/executor/query_executor.cpp`:
- Around line 1482-1484: The code currently adds a projected column with a
hardcoded ValueType::TYPE_TEXT in proj_schema.add_column("expr",
common::ValueType::TYPE_TEXT, true) for non-column expressions; instead, infer
the expression's result type and use that when adding the column. Locate where
expressions are processed (the projection handling that calls
proj_schema.add_column) and replace the fixed TYPE_TEXT with a type determined
by the expression node (e.g., call the expression's type-inference or evaluation
metadata function, or consult the input schema/column types if the expression is
a binary op or function), falling back to TYPE_TEXT only if inference fails;
ensure the chosen symbol names include the expression variable and
proj_schema.add_column invocation so the change targets the correct spot.
- Around line 1257-1277: The migration unconditionally copies from
storage::HeapTable to ColumnarTable on every vectorized SELECT, causing
duplicates and extra latency; change the logic in the block that constructs
storage::HeapTable heap_table(...) so that migration only runs when
ColumnarTable is empty or not yet initialized: check col_table's existing row
count or an "migrated" flag before calling col_table->create() and before
iterating heap_table.scan(); skip the append_batch loop if col_table already has
data (or set and persist a migration marker after the first successful
append_batch pass). Ensure you use heap_table.tuple_count(),
col_table->row_count() (or equivalent open()/exists() API) and the
append_batch/clear loop only when migration is needed.
In `@src/executor/thread_pool.cpp`:
- Around line 49-54: ThreadPool::wait deadlocks because workers decrement
pending_tasks_ but never wake the waiting thread; update the worker completion
path (where pending_tasks_.fetch_sub / decrement occurs) to call
cv_.notify_all() after decrement (or notify_one if appropriate) so the condition
in ThreadPool::wait() (tasks_.empty() && pending_tasks_.load(...) == 0) can be
re-evaluated; ensure notify is called after the decrement (holding the mutex_ is
optional but acceptable) to reliably unblock ThreadPool::wait().
In `@src/storage/columnar_table.cpp`:
- Around line 82-97: append_batch writes TYPE_FLOAT32, TYPE_DECIMAL (promoted to
float64) and TYPE_BOOL, but read_batch lacks matching branches so
deserialization throws "Symmetric serialization failure"; add branches in
read_batch (after the TYPE_FLOAT64 branch) to: for TYPE_FLOAT32 and TYPE_DECIMAL
read batch.row_count() * 8 bytes into a std::vector<double>, then set the target
executor::NumericVector<float/destination type> elements using .set(r,
value.is_null()? null : casted value) or appropriate setter converting double
back to the original type; for TYPE_BOOL read batch.row_count() bytes into
std::vector<uint8_t> and set the executor::NumericVector<bool> elements using
.set(r, byte != 0) preserving nulls; ensure types referenced match
common::ValueType::TYPE_FLOAT32, TYPE_DECIMAL, TYPE_BOOL and functions/classes
like read_batch and executor::NumericVector to keep serialization symmetric with
append_batch.
📒 Files selected for processing (13)

- CMakeLists.txt
- docs/VECTORIZED_EXECUTION.md
- docs/phases/PHASE_5_OPTIMIZATION.md
- docs/phases/PHASE_8_ANALYTICS.md
- docs/phases/README.md
- include/executor/query_executor.hpp
- include/executor/thread_pool.hpp
- include/executor/vectorized_operator.hpp
- src/executor/query_executor.cpp
- src/executor/thread_pool.cpp
- src/storage/columnar_table.cpp
- tests/transaction_manager_tests.cpp
- tests/vectorized_operator_tests.cpp
Replace (void) casts for unused parameters with the C++17 [[maybe_unused]] attribute in the build_vectorized_plan function signature. Also remove the redundant (void)has_sort_or_limit line since the parameter is now marked [[maybe_unused]]. Part of PR #72 code review follow-up.
- Use dynamic_cast with assertion instead of static_cast for the VectorizedOperator* cast in execute_select (safer)
- Update build_vectorized_plan comment to clarify parameters are currently unused
- Add thread-safety note to StorageManager class documentation

Part of PR #72 code review follow-up.
ThreadPool::wait() was hanging indefinitely because cv_.notify_one() after task completion could be consumed by an idle worker instead of the main thread in wait(). This is the classic "missed wakeup" problem with condition variables. With notify_one(), a worker finishing a task and calling notify_one() might wake another idle worker (who then immediately goes back to sleep on the empty queue) instead of the main thread waiting on wait(). Using notify_all() ensures all waiting threads—including the main thread in wait()—get notified and re-check their predicates. Workers whose predicates return false (empty queue, still tasks pending) go back to sleep immediately. Fixes the ThreadPoolTests.SubmitAndWait and ThreadPoolTests.MultipleWait tests hanging in CI.
1. docs: Fix "BoolVector" terminology to "NumericVector<T>/NumericVector<bool>" and add 'text' language specifier to ASCII diagram fences.
2. thread_pool.hpp: Fix race condition where pending_tasks_.fetch_add() was outside the mutex lock. Move it inside the lock scope so the counter accurately reflects queued work before notify_one() fires.
3. vectorized_operator.hpp: Extend resumed-bucket-scan and NULL-key branches to handle JoinType::Full (not just JoinType::Left), so unmatched left rows are tracked for FULL joins too.
4. query_executor.cpp: (a) Infer expression result type from ConstantExpr value instead of hardcoding TYPE_TEXT for projections; (b) check if ColumnarTable already has rows before migrating from HeapTable to avoid re-migrating and duplicating data.
5. columnar_table.cpp: Add read_batch branches for TYPE_FLOAT32, TYPE_DECIMAL, and TYPE_BOOL that symmetrically match append_batch serialization (promote float32/decimal to float64; store bool as byte).
6. query_executor.hpp: Add TODO comment for unused thread_pool_ member pending future parallel execution support.
Actionable comments posted: 2
🧹 Nitpick comments (2)
src/executor/query_executor.cpp (1)
1492-1500: ⚖️ Poor tradeoff

Partial improvement: Binary/Function expressions still default to TYPE_TEXT.

The constant expression type inference is a good step. However, for expressions like `a + b` or non-aggregate functions, the type still defaults to `TYPE_TEXT`. Consider extending type inference to handle `Binary` expressions by examining operand types.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/executor/query_executor.cpp` around lines 1492 - 1500: The code currently only infers expression types for parser::ExprType::Constant and falls back to TYPE_TEXT; extend this to handle parser::ExprType::Binary (and optionally parser::ExprType::Function) by inspecting operand/result types and using the project's type-promotion rules to derive a proper common::ValueType before calling proj_schema.add_column("expr", expr_type, true). Locate the block that checks col->type() and add branches for parser::ExprType::Binary where you cast to parser::BinaryExpr (or the appropriate node), retrieve left/right operand expression types (recursively if needed or via an existing helper), apply numeric/string/date promotion rules to choose the resulting common::ValueType, and similarly handle parser::FunctionExpr by resolving the function signature/result type; fall back to TYPE_TEXT only if no rule applies.

docs/VECTORIZED_EXECUTION.md (1)
145-162: 💤 Low value

Add language specifier to fenced code blocks.

The code blocks at lines 145-153 and 156-162 are missing language specifiers, flagged by markdownlint (MD040). Since these are ASCII diagrams, use `text` for consistency.

📝 Suggested fix

Volcano Path (`build_plan()`):

````diff
-```
+```text
 SeqScanOperator (HeapTable)
 ...
````

Vectorized Path (`build_vectorized_plan()`):

````diff
-```
+```text
 VectorizedSeqScanOperator (ColumnarTable)
 ...
````

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/VECTORIZED_EXECUTION.md` around lines 145 - 162, Add the missing Markdown fenced code block language specifiers by changing the two ASCII diagram fences to use ```text; specifically update the block showing SeqScanOperator (HeapTable) and the block under "Vectorized Path (`build_vectorized_plan()`)" (which contains VectorizedSeqScanOperator (ColumnarTable)) so their opening fences include the language token "text" to satisfy markdownlint MD040.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/executor/query_executor.cpp`:
- Around line 1314-1332: The join-table migration unconditionally moves rows
from storage::HeapTable join_heap_table into join_col_table when join_count > 0,
causing duplicates on repeated migrations; add the same guard used for base
table migration by checking join_col_table->row_count() and skipping the
migration if it's > 0 before creating/appending, i.e., around
join_col_table->create()/append_batch calls ensure you short-circuit when
join_col_table already has rows so the tuple append loop (iter.next(tuple) /
batch->append_tuple / join_col_table->append_batch) only runs for an empty
target columnar table.
In `@src/storage/columnar_table.cpp`:
- Around line 201-218: The code is dynamically casting target_col to
NumericVector<float> when type == common::ValueType::TYPE_FLOAT32, but
init_from_schema actually creates NumericVector<double> for
TYPE_FLOAT32/TYPE_FLOAT64 so the dynamic_cast will throw std::bad_cast; fix by
casting to NumericVector<double>& instead (e.g., replace the
dynamic_cast<executor::NumericVector<float>&>(target_col) with
dynamic_cast<executor::NumericVector<double>&>(target_col)) and continue to
convert the double data to float when appending (or construct the appropriate
common::Value from the double) so the runtime types match init_from_schema;
update all code paths that assume NumericVector<float> for TYPE_FLOAT32 to use
NumericVector<double> or add a branch handling both numeric vector types
consistently (symbols: type, common::ValueType::TYPE_FLOAT32, target_col,
NumericVector<float>, NumericVector<double>, init_from_schema).
📒 Files selected for processing (8)

- docs/VECTORIZED_EXECUTION.md
- include/executor/query_executor.hpp
- include/executor/thread_pool.hpp
- include/executor/vectorized_operator.hpp
- include/storage/storage_manager.hpp
- src/executor/query_executor.cpp
- src/executor/thread_pool.cpp
- src/storage/columnar_table.cpp
✅ Files skipped from review due to trivial changes (4)
- include/storage/storage_manager.hpp
- src/executor/thread_pool.cpp
- include/executor/thread_pool.hpp
- include/executor/vectorized_operator.hpp
🚧 Files skipped from review as they are similar to previous changes (1)
- include/executor/query_executor.hpp
```cpp
/* Migrate HeapTable data to ColumnarTable for join table */
storage::HeapTable join_heap_table(join_table_name, bpm_, join_schema);
uint64_t join_count = join_heap_table.tuple_count();
if (join_count > 0) {
    join_col_table->create();
    auto batch = executor::VectorBatch::create(join_schema);
    auto iter = join_heap_table.scan();
    executor::Tuple tuple;
    while (iter.next(tuple)) {
        batch->append_tuple(tuple);
        if (batch->row_count() >= 1024) {
            join_col_table->append_batch(*batch);
            batch->clear();
        }
    }
    if (batch->row_count() > 0) {
        join_col_table->append_batch(*batch);
    }
}
```
Join table migration lacks duplicate-data guard present in base table migration.
The base table migration (lines 1264-1271) checks if existing_col_table->row_count() > 0 to skip re-migration, but the join table migration unconditionally migrates when join_count > 0. This causes duplicate rows in join_col_table on repeated vectorized queries with joins.
🐛 Proposed fix to add the same guard

```diff
 /* Migrate HeapTable data to ColumnarTable for join table */
 storage::HeapTable join_heap_table(join_table_name, bpm_, join_schema);
 uint64_t join_count = join_heap_table.tuple_count();
-if (join_count > 0) {
+bool join_needs_migration = (join_count > 0);
+if (join_needs_migration) {
+    auto existing_join_col = std::make_shared<storage::ColumnarTable>(
+        join_table_name, *storage_manager_, join_schema);
+    if (existing_join_col->open() && existing_join_col->row_count() > 0) {
+        join_needs_migration = false;
+    }
+}
+if (join_needs_migration) {
     join_col_table->create();
     auto batch = executor::VectorBatch::create(join_schema);
     auto iter = join_heap_table.scan();
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/executor/query_executor.cpp` around lines 1314-1332: the join-table
migration unconditionally moves rows from storage::HeapTable join_heap_table
into join_col_table whenever join_count > 0, causing duplicates on repeated
migrations. Add the same guard used for the base-table migration: check
join_col_table->row_count() and skip the migration if it is already > 0 before
creating/appending. Around the join_col_table->create()/append_batch calls,
short-circuit when join_col_table already has rows, so the tuple-append loop
(iter.next(tuple) / batch->append_tuple / join_col_table->append_batch) only
runs against an empty target columnar table.
```cpp
} else if (type == common::ValueType::TYPE_FLOAT32) {
    auto& num_vec = dynamic_cast<executor::NumericVector<float>&>(target_col);

    n_in.seekg(static_cast<std::streamoff>(start_row), std::ios::beg);
    std::vector<uint8_t> nulls(actual_rows);
    n_in.read(reinterpret_cast<char*>(nulls.data()), actual_rows);

    d_in.seekg(static_cast<std::streamoff>(start_row * 8), std::ios::beg);
    std::vector<double> data(actual_rows);
    d_in.read(reinterpret_cast<char*>(data.data()), actual_rows * 8);

    for (uint32_t r = 0; r < actual_rows; ++r) {
        if (nulls[r] != 0U) {
            num_vec.append(common::Value::make_null());
        } else {
            num_vec.append(common::Value(static_cast<float>(data[r])));
        }
    }
}
```
NumericVector<float> cast will fail at runtime - init_from_schema creates NumericVector<double> for TYPE_FLOAT32.
According to types.hpp (context snippet 2, lines 432-435), VectorBatch::init_from_schema creates NumericVector<double> for both TYPE_FLOAT32 and TYPE_FLOAT64. The dynamic_cast to NumericVector<float>& on line 202 will throw std::bad_cast.
🐛 Proposed fix

```diff
 } else if (type == common::ValueType::TYPE_FLOAT32) {
-    auto& num_vec = dynamic_cast<executor::NumericVector<float>&>(target_col);
+    auto& num_vec = dynamic_cast<executor::NumericVector<double>&>(target_col);
     n_in.seekg(static_cast<std::streamoff>(start_row), std::ios::beg);
     std::vector<uint8_t> nulls(actual_rows);
     n_in.read(reinterpret_cast<char*>(nulls.data()), actual_rows);
     d_in.seekg(static_cast<std::streamoff>(start_row * 8), std::ios::beg);
     std::vector<double> data(actual_rows);
     d_in.read(reinterpret_cast<char*>(data.data()), actual_rows * 8);
     for (uint32_t r = 0; r < actual_rows; ++r) {
         if (nulls[r] != 0U) {
             num_vec.append(common::Value::make_null());
         } else {
-            num_vec.append(common::Value(static_cast<float>(data[r])));
+            num_vec.append(common::Value::make_float64(data[r]));
         }
     }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/storage/columnar_table.cpp` around lines 201-218: the code dynamically
casts target_col to NumericVector<float> when type ==
common::ValueType::TYPE_FLOAT32, but init_from_schema actually creates
NumericVector<double> for both TYPE_FLOAT32 and TYPE_FLOAT64, so the dynamic_cast
will throw std::bad_cast. Fix by casting to NumericVector<double>& instead
(replace dynamic_cast<executor::NumericVector<float>&>(target_col) with
dynamic_cast<executor::NumericVector<double>&>(target_col)) and construct the
appropriate common::Value from the double so the runtime types match
init_from_schema. Update all code paths that assume NumericVector<float> for
TYPE_FLOAT32 to use NumericVector<double>, or add a branch handling both numeric
vector types consistently (symbols: type, common::ValueType::TYPE_FLOAT32,
target_col, NumericVector<float>, NumericVector<double>, init_from_schema).
Summary
Corrects the emission logic for RIGHT and FULL outer joins in VectorizedHashJoinOperator.
Changes
- include/executor/vectorized_operator.hpp: Fixed join type conditions and return value logic
- tests/vectorized_operator_tests.cpp: Updated test expectations to match actual RIGHT/FULL join semantics
Test plan