
fix(vectorized): correct RIGHT and FULL outer join emission logic #72

Open
poyrazK wants to merge 16 commits into main from feature/phase5-vectorized-execution

Conversation


@poyrazK poyrazK commented May 2, 2026

Summary

  • Fix LEFT/FULL condition for tracking unmatched left rows during probe phase in VectorizedHashJoinOperator
  • Fix Done phase return value to properly return true when unmatched right rows are emitted
  • Correct test expectations for RIGHT join (3 rows) and FULL join (4 rows)

Changes

  • include/executor/vectorized_operator.hpp: Fixed join type conditions and return value logic
  • tests/vectorized_operator_tests.cpp: Updated test expectations to match actual RIGHT/FULL join semantics

Test plan

  • All 20 VectorizedGroupByTests pass
  • All 42 cloudSQL_tests pass

Summary by CodeRabbit

  • New Features

    • Parallel vectorized batch execution path for SELECT queries
    • Thread pool for asynchronous task management
    • RIGHT and FULL outer join support in vectorized execution
    • Wider numeric/boolean support for columnar storage
  • Documentation

    • Added comprehensive vectorized execution architecture and phase updates
    • Expanded analytics/phase notes covering joins and parallel execution
  • Tests

    • Added vectorized join tests for RIGHT and FULL; fixed test artifacts
  • Chores

    • Clarified StorageManager thread-safety in docs

poyrazK added 10 commits April 29, 2026 21:10
Provides a simple thread pool for executing tasks in parallel.
Used by ParallelVectorizedSeqScanOperator and VectorizedGroupByOperator
for parallel batch processing.
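A thread pool of this shape can be sketched as follows. This is an illustrative standalone example, not the repository's implementation; the class layout and member names are assumptions. It folds in two details that come up later in the review: the pending-task counter is incremented while the queue mutex is held (so a fast worker cannot finish before the counter is bumped), and completion uses notify_all() so a thread blocked in wait() cannot miss its wakeup on the shared condition variable.

```cpp
#include <atomic>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class ThreadPool {
public:
    explicit ThreadPool(size_t n) {
        for (size_t i = 0; i < n; ++i)
            workers_.emplace_back([this] { worker_loop(); });
    }
    ~ThreadPool() { shutdown(); }

    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            // Bump the counter inside the lock so wait() never observes an
            // empty queue with an already-consumed task still uncounted.
            pending_tasks_.fetch_add(1, std::memory_order_acq_rel);
            tasks_.push(std::move(task));
        }
        cv_.notify_all();  // workers and wait() share this condition variable
    }

    // Block until the queue is drained and all in-flight tasks have finished.
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] {
            return tasks_.empty() &&
                   pending_tasks_.load(std::memory_order_acquire) == 0;
        });
    }

    void shutdown() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        cv_.notify_all();
        for (auto& w : workers_)
            if (w.joinable()) w.join();
    }

private:
    void worker_loop() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stop_ || !tasks_.empty(); });
                if (stop_ && tasks_.empty()) return;
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();
            {
                // Decrement under the mutex so wait()'s predicate check and
                // this update cannot interleave into a lost wakeup.
                std::lock_guard<std::mutex> lock(mutex_);
                pending_tasks_.fetch_sub(1, std::memory_order_acq_rel);
            }
            cv_.notify_all();  // wakes wait(), not just another idle worker
        }
    }

    std::vector<std::thread> workers_;
    std::queue<std::function<void()>> tasks_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::atomic<size_t> pending_tasks_{0};
    bool stop_ = false;
};
```

With notify_one() on a shared condition variable, a completion signal can be consumed by an idle worker instead of the waiter in wait(); notify_all() sidesteps that at the cost of some spurious wakeups.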
VectorizedOperator now inherits from Operator (using OperatorType::Result
as base type) to enable polymorphism between Volcano and vectorized operator
hierarchies. This allows dynamic_cast to VectorizedOperator* and calling
next_batch() on vectorized execution roots.
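The inheritance change can be illustrated with a reduced sketch. The types here are simplified stand-ins (the real classes carry schemas, children, and lifecycle methods), but the mechanism is the same: because VectorizedOperator derives from a polymorphic Operator base, the executor can hold the plan root as Operator* and probe for the batch API with dynamic_cast.

```cpp
#include <memory>

// Simplified stand-ins for the PR's operator hierarchy.
enum class OperatorType { Result, SeqScan };

struct Operator {
    explicit Operator(OperatorType t) : type_(t) {}
    virtual ~Operator() = default;  // polymorphic base enables dynamic_cast
    OperatorType type() const { return type_; }
private:
    OperatorType type_;
};

// VectorizedOperator reuses OperatorType::Result as its base type so it can
// live in the same plan tree as Volcano operators.
struct VectorizedOperator : Operator {
    VectorizedOperator() : Operator(OperatorType::Result) {}
    virtual bool next_batch() { return false; }  // real version fills a VectorBatch
};

// The executor probes the root: batch iteration if vectorized, else the
// Volcano next()-per-tuple loop.
inline bool run_vectorized_root(Operator* root) {
    if (auto* vec = dynamic_cast<VectorizedOperator*>(root)) {
        return vec->next_batch();
    }
    return false;  // caller falls back to the Volcano path
}
```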
- Add set_parallel() / is_parallel() / set_storage_manager() APIs
- Add build_vectorized_plan() method for vectorized operator tree construction
- Modify execute_select() to branch on use_vectorized flag
- Add has_sort_or_limit guard to route Sort/Limit queries through Volcano path
- Vectorized path: Scan -> Filter -> HashJoin -> GroupBy -> Project via batch iteration

Sort/Limit queries fall back to Volcano path since SortOperator/LimitOperator
do not inherit from VectorizedOperator.
…tion

- Document set_parallel(true) mode and build_vectorized_plan() in Phase 5
- Document ParallelVectorizedSeqScanOperator, ThreadPool, and batch iteration in Phase 8
New doc covering:
- Volcano vs Vectorized execution models
- Class hierarchy and VectorBatch structure
- QueryExecutor set_parallel(true) integration
- ThreadPool and parallel operators
- Build plan comparison
- Performance characteristics
…pport issues

- Fix CMakeLists.txt to include src/executor/thread_pool.cpp (linker error fix)
- Fix execute_select() to use static_cast<VectorizedOperator*>(root.get())
  instead of stale vec_root pointer after move
- Add HeapTable→ColumnarTable migration in build_vectorized_plan()
  for both base table and join tables (INSERT writes to HeapTable,
  vectorized path reads from ColumnarTable - data is now migrated)
- Add try-catch for std::out_of_range (vector access errors) in
  vectorized batch iteration loop
- Fix variable shadowing bug in VectorizedProjectOperator construction
  (projection renamed to proj_exprs to avoid shadowing)
- Add INT32/INT16/INT8/FLOAT32/DECIMAL/BOOL type support to
  ColumnarTable::append_batch() and read_batch() (types promoted for
  storage, demoted when reading)
- Add ThreadPoolTests fixture with 5 tests to vectorized_operator_tests
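The promote-for-storage/demote-on-read scheme from the last bullet group can be sketched in isolation. This is a hypothetical illustration of the idea (narrow integers widened to int64 on disk, narrowed back when a batch is read), not the ColumnarTable API; the function names are made up for the example.

```cpp
#include <cstdint>
#include <vector>

// Widen an INT32 column to int64 for on-disk storage.
std::vector<int64_t> promote_for_storage(const std::vector<int32_t>& col) {
    std::vector<int64_t> out;
    out.reserve(col.size());
    for (int32_t v : col)
        out.push_back(static_cast<int64_t>(v));  // lossless widening
    return out;
}

// Narrow back to INT32 when reading a batch; the column's declared type
// guarantees the stored values fit.
std::vector<int32_t> demote_on_read(const std::vector<int64_t>& stored) {
    std::vector<int32_t> out;
    out.reserve(stored.size());
    for (int64_t v : stored)
        out.push_back(static_cast<int32_t>(v));
    return out;
}
```

The same pattern covers FLOAT32/DECIMAL (promoted to double) and BOOL (one byte per value); the key invariant is that append and read apply symmetric conversions.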
Replace exec.execute("ROLLBACK") with tm.abort(txn) + EXPECT_EQ assertion
in 4 fault-injection tests that previously had no verification:

- UndoPhysicalRemoveFailure (FAULT_PHYSICAL_REMOVE)
- UndoIndexInsertFailure (FAULT_INDEX_INSERT)
- UndoIndexRemoveFailure (FAULT_INDEX_REMOVE)
- UpdateUndoWithIndexInsertFault (FAULT_INDEX_INSERT)

All 4 tests now verify the transaction reaches ABORTED state after
rollback with fault injection, matching the pattern used by
CommitWithLogFailure and AbortWithLogFailure.
- Add right_matched_ bitmap to track matched right rows during probe
- Add right_bucket_rows_ global storage for right row payloads
- Add unmatched_right_rows_ to collect unmatched right global indices
- Add emit_unmatched_right_rows() to emit unmatched right rows with NULL left cols
- In next_batch() Done phase: emit unmatched right rows for RIGHT/FULL joins
- When match found during probe: mark right_matched_[global_idx] = true
- Support combined LEFT+FULL: emit unmatched left during probe, unmatched right at end
- PHASE_8_ANALYTICS.md: Update VectorizedHashJoinOperator description to cover INNER/LEFT/RIGHT/FULL join types, right_matched_ bitmap, emit_unmatched_right_rows()
- PHASE_5_OPTIMIZATION.md: Document join type support in QueryExecutor integration section
- README.md: Add outer join support bullet to Phase 8 description
- Add LEFT/FULL condition for tracking unmatched left rows during probe
- Fix Done phase return value to return true when rows are emitted
- Correct test expectations for RIGHT join (3 rows) and FULL join (4 rows)

Fixes emission of unmatched right rows in VectorizedHashJoinOperator
for RIGHT and FULL outer join types.
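The matched-bitmap technique described above can be sketched as a standalone example. This is an illustration of the probe/Done-phase pattern, not the operator's actual code: names like right_outer_join are hypothetical, batching is omitted, and NULL-key handling is left out for brevity.

```cpp
#include <optional>
#include <unordered_map>
#include <vector>

struct JoinedRow {
    std::optional<int> left;   // nullopt models a NULL left column
    std::optional<int> right;
};

std::vector<JoinedRow> right_outer_join(const std::vector<int>& left_keys,
                                        const std::vector<int>& right_keys) {
    // Build phase: hash the right side and track which rows get matched.
    std::unordered_multimap<int, size_t> table;
    for (size_t i = 0; i < right_keys.size(); ++i)
        table.emplace(right_keys[i], i);
    std::vector<bool> right_matched(right_keys.size(), false);

    std::vector<JoinedRow> out;
    // Probe phase: emit matches and flag each matched right row.
    for (int lk : left_keys) {
        auto [begin, end] = table.equal_range(lk);
        for (auto it = begin; it != end; ++it) {
            right_matched[it->second] = true;
            out.push_back({lk, right_keys[it->second]});
        }
    }
    // Done phase: emit unmatched right rows with NULL left columns.
    for (size_t i = 0; i < right_keys.size(); ++i) {
        if (!right_matched[i])
            out.push_back({std::nullopt, right_keys[i]});
    }
    return out;
}
```

A FULL join uses the same Done-phase pass plus the unmatched-left tracking done during the probe; the bug class fixed here is forgetting to report that the Done phase actually emitted rows.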

coderabbitai Bot commented May 2, 2026

📝 Walkthrough


Adds a parallel, vectorized execution path: a ThreadPool and ThreadPool-backed vectorized operators (scans, joins, group-by), QueryExecutor branching to build and run vectorized plans, an extended VectorizedHashJoinOperator that emits unmatched rows for RIGHT/FULL joins, expanded ColumnarTable type persistence, plus tests and documentation; CMake now includes the new thread_pool.cpp.

Changes

Parallel Vectorized Execution Pipeline

  • Concurrency Primitive (include/executor/thread_pool.hpp, src/executor/thread_pool.cpp): Introduces ThreadPool with worker threads, a FIFO task queue, submit()/shutdown()/wait(), and pending-task tracking; the implementation spawns and joins threads and coordinates task execution.
  • Vectorized Operator Base & Integration (include/executor/vectorized_operator.hpp): VectorizedOperator now derives from Operator, marks lifecycle methods override, and adds state()/error() accessors. VectorizedHashJoinOperator is extended to RIGHT/FULL joins with unmatched tracking (unmatched_left_indices_, right_matched_, unmatched_right_rows_), right_bucket_rows_/right_row_indices, and emit_unmatched_right_rows() logic.
  • QueryExecutor Parallel API & Plan Builder (include/executor/query_executor.hpp, src/executor/query_executor.cpp): Adds set_parallel(bool), is_parallel(), set_storage_manager(...), parallel_ state, and thread_pool_; execute_select() takes the vectorized path when parallel_ && storage_manager_ && !has_sort_or_limit; adds build_vectorized_plan() producing ColumnarTable-backed scans, equi-joins, filters, group-by/aggregates, and projection (no sort/limit).
  • Columnar Storage Type Persistence (src/storage/columnar_table.cpp): Extends on-disk column serialization and readback to support INT32/INT16/INT8 (stored as int64), FLOAT32/DECIMAL (stored as double), and BOOL (1-byte storage), with symmetric demotion on read.
  • Build Configuration (CMakeLists.txt): Adds src/executor/thread_pool.cpp to CORE_SOURCES.
  • Tests (tests/vectorized_operator_tests.cpp, tests/transaction_manager_tests.cpp): Adds the VectorizedHashJoinRight and VectorizedHashJoinFull tests and fixes stray lines in an existing test. TransactionManager undo tests switch to explicit tm.begin()/tm.abort() flows.
  • Documentation / Phase Notes (docs/VECTORIZED_EXECUTION.md, docs/phases/PHASE_5_OPTIMIZATION.md, docs/phases/PHASE_8_ANALYTICS.md, docs/phases/README.md): Adds a vectorized execution design doc, QueryExecutor integration notes, parallel execution and ThreadPool descriptions, and documents the extended join capabilities and execution constraints (ORDER BY/LIMIT fallback).
  • Storage Manager Note (include/storage/storage_manager.hpp): Documents that StorageManager is not thread-safe for concurrent file operations and clarifies the purpose of the Stats atomics.

Sequence Diagram

sequenceDiagram
    participant Client
    participant QueryExecutor
    participant StorageManager
    participant ThreadPool
    participant VectorizedOps
    participant ColumnarTable
    participant Result

    Client->>QueryExecutor: execute_select(stmt, txn)
    QueryExecutor->>QueryExecutor: detect has_sort_or_limit

    alt parallel_ && storage_manager_ set && no sort/limit
        QueryExecutor->>QueryExecutor: build_vectorized_plan(stmt, txn)
        QueryExecutor->>ColumnarTable: open / prepare columnar data
        ColumnarTable->>ThreadPool: submit parallel scan tasks
        ThreadPool-->>ColumnarTable: async scan results (futures)

        loop per VectorBatch
            QueryExecutor->>VectorizedOps: next_batch(batch)
            VectorizedOps->>VectorizedOps: filter / join / group-by / project
            VectorizedOps-->>QueryExecutor: VectorBatch
            QueryExecutor->>Result: convert batch rows to Tuples
        end
    else
        QueryExecutor->>QueryExecutor: build_plan(stmt, txn) (Volcano)
        loop per Tuple
            QueryExecutor->>VectorizedOps: next(tuple)
            VectorizedOps-->>QueryExecutor: Tuple
            QueryExecutor->>Result: append tuple
        end
    end

    QueryExecutor-->>Client: QueryResult

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 A pool of threads begins to hum,
Batches hop through joins until they're done,
Right and full joins now find their match,
Columns store types without a scratch,
QueryExecutor chooses paths that run.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): Docstring coverage is 8.51%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (4 passed)

  • Description Check (✅ Passed): Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): The title 'fix(vectorized): correct RIGHT and FULL outer join emission logic' clearly summarizes the main change: fixing the emission logic for RIGHT and FULL outer joins in the vectorized execution path.
  • Linked Issues Check (✅ Passed): Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes Check (✅ Passed): Check skipped because no linked issues were found for this pull request.



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
include/executor/vectorized_operator.hpp (1)

810-817: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

NULL key handling also misses FULL join type.

When left key is NULL, the row should be tracked as unmatched for FULL joins as well as LEFT joins.

🐛 Proposed fix
                 if (key_val.is_null()) {
                     // NULL keys never match - mark as unmatched for LEFT/FULL join
-                    if (join_type_ == JoinType::Left) {
+                    if (join_type_ == JoinType::Left || join_type_ == JoinType::Full) {
                         unmatched_left_indices_.push_back(left_row_idx_);
                     }
                     left_row_idx_++;
                     continue;
                 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/executor/vectorized_operator.hpp` around lines 810 - 817, The
NULL-key branch currently only treats LEFT joins as unmatched; update the logic
in the block handling key_val.is_null() so that rows are also marked unmatched
when join_type_ is JoinType::Full (in addition to JoinType::Left). Concretely,
modify the condition that pushes left_row_idx_ into unmatched_left_indices_ to
check for (join_type_ == JoinType::Left || join_type_ == JoinType::Full),
leaving left_row_idx_ increment and continue behavior unchanged.
tests/transaction_manager_tests.cpp (1)

727-737: ⚠️ Potential issue | 🔴 Critical

Fault injection tests do not exercise the intended code paths — txn created by tm.begin() has no undo logs from the DML

All four changed tests follow a broken pattern:

  1. Call tm.begin() to create an external txn
  2. Call exec.execute(DML) without a preceding SQL BEGIN
  3. Arm fault injection and call tm.abort(txn)
  4. Assert txn->get_state() == ABORTED

The QueryExecutor::execute() method checks:

const bool is_auto_commit = (current_txn_ == nullptr);
transaction::Transaction* txn = current_txn_;

if (is_auto_commit && (stmt.type() == Insert || stmt.type() == Delete || ...)) {
    txn = transaction_manager_.begin();  // Creates a NEW auto-commit transaction
}

Since no SQL BEGIN is issued, current_txn_ remains nullptr, and the executor creates a separate internal auto-commit transaction for the DML. The external txn obtained from tm.begin() is never associated with that DML, so txn->get_undo_logs() is empty when tm.abort(txn) is called.

Result:

  1. undo_transaction(txn) iterates an empty log and returns immediately.
  2. FAULT_PHYSICAL_REMOVE, FAULT_INDEX_INSERT, FAULT_INDEX_REMOVE, FAULT_UNDO_REMOVE are never triggered.
  3. txn->set_state(ABORTED) is reached unconditionally, so the assertion trivially passes.

These tests create a false sense of fault injection coverage. Compare with working tests like InsertThenAbort, which correctly use exec.execute("BEGIN") followed by DML, ensuring the DML executes within the executor's managed transaction.

Fix: Wrap the DML with SQL BEGIN and ROLLBACK:

exec.execute("BEGIN");
exec.execute("INSERT INTO ...");
exec.execute("ROLLBACK");

Or provide an API to pass an external Transaction* to the executor. This applies to all four test cases: lines 727–737, 772–779, 812–819, and 1101–1109.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/transaction_manager_tests.cpp` around lines 727 - 737, The tests call
tm.begin() but then execute DML via exec.execute(...) without a preceding SQL
BEGIN, so QueryExecutor::execute() creates an internal auto-commit txn
(current_txn_ == nullptr) and the external txn has no undo logs, meaning
undo_transaction never hits the fault-injection paths; fix each failing test by
ensuring the DML runs inside the executor-managed transaction — call
exec.execute("BEGIN") before the INSERT/DELETE and then exec.execute("ROLLBACK")
(or let tm.abort operate on the same executor transaction) so the undo logs are
populated and FaultInjection (e.g., FAULT_PHYSICAL_REMOVE, FAULT_INDEX_INSERT,
FAULT_INDEX_REMOVE, FAULT_UNDO_REMOVE) is exercised instead of using an external
tm.begin() detached from QueryExecutor::execute().
🧹 Nitpick comments (3)
include/executor/query_executor.hpp (1)

146-149: 💤 Low value

thread_pool_ member is declared but appears unused.

The thread_pool_ shared pointer is declared but based on the provided context, it's never initialized in the constructor (line 116-126 in query_executor.cpp) and doesn't appear to be used in the vectorized execution path. If parallel execution via ThreadPool is planned for future use, consider removing this member until it's actually needed to avoid confusion, or add a TODO comment clarifying the intent.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/executor/query_executor.hpp` around lines 146 - 149, The member
thread_pool_ in QueryExecutor is declared but unused; either remove the
declaration "std::shared_ptr<ThreadPool> thread_pool_;" from the class to avoid
dead code, or if you intend to support parallel execution later, add a clear
TODO comment above it and initialize it (e.g., set to nullptr or accept a
ThreadPool parameter) in the QueryExecutor constructor so intent is explicit;
reference the member name thread_pool_ and the constructor in query_executor.cpp
when making the change.
src/executor/query_executor.cpp (1)

1232-1233: ⚡ Quick win

Remove unused has_sort_or_limit parameter.

The has_sort_or_limit parameter is passed to build_vectorized_plan() but is only consumed by a (void) cast at line 1496. Since use_vectorized is already false when has_sort_or_limit is true (line 392), this function is never called with has_sort_or_limit == true.

♻️ Remove unused parameter

In header (include/executor/query_executor.hpp):

-    std::unique_ptr<VectorizedOperator> build_vectorized_plan(const parser::SelectStatement& stmt,
-                                                              transaction::Transaction* txn,
-                                                              bool has_sort_or_limit);
+    std::unique_ptr<VectorizedOperator> build_vectorized_plan(const parser::SelectStatement& stmt,
+                                                              transaction::Transaction* txn);

In implementation:

-std::unique_ptr<VectorizedOperator> QueryExecutor::build_vectorized_plan(
-    const parser::SelectStatement& stmt, transaction::Transaction* txn, bool has_sort_or_limit) {
+std::unique_ptr<VectorizedOperator> QueryExecutor::build_vectorized_plan(
+    const parser::SelectStatement& stmt, transaction::Transaction* txn) {
     (void)txn;  // Vectorized path doesn't use txn yet
-
...
-    /* Sort and Limit are NOT created here in the vectorized path.
-       When has_sort_or_limit is true, use_vectorized is false so this function
-       is only called for pure vectorized queries (no ORDER BY, no LIMIT).
-       The Volcano path handles Sort/Limit via build_plan(). */
-    (void)has_sort_or_limit;  // Suppress unused warning

Update call site (line 395):

-        vec_root = build_vectorized_plan(stmt, txn, has_sort_or_limit);
+        vec_root = build_vectorized_plan(stmt, txn);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/executor/query_executor.cpp` around lines 1232 - 1233, The
has_sort_or_limit parameter on QueryExecutor::build_vectorized_plan is unused;
remove it from the function signature in both the declaration
(query_executor.hpp) and definition (query_executor.cpp), delete the
(void)has_sort_or_limit cast inside the function body, and update all call sites
that pass has_sort_or_limit to call build_vectorized_plan(...) with the reduced
parameter list (ensure the function name QueryExecutor::build_vectorized_plan
and its callers are updated consistently).
docs/VECTORIZED_EXECUTION.md (1)

48-59: 💤 Low value

Consider adding language identifiers to ASCII diagram code blocks.

Static analysis flagged missing language specifiers on fenced code blocks at lines 48, 145, and 156. While these are ASCII diagrams rather than code, you can use text or plaintext as the language identifier to satisfy the linter.

Also applies to: 145-153, 156-162

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/VECTORIZED_EXECUTION.md` around lines 48 - 59, The fenced ASCII diagram
blocks (e.g., the tree showing "Operator (base)" and its children like
"VectorizedOperator", "VectorizedSeqScanOperator", "VectorizedFilterOperator",
etc., and the other similar diagrams later in the file) are missing a language
specifier; update each fenced code block that contains these ASCII diagrams to
include a language identifier such as `text` or `plaintext` (i.e., change ``` to
```text) so the linter/static analysis stops flagging missing language
specifiers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/VECTORIZED_EXECUTION.md`:
- Line 72: The docs mention a "BoolVector" type that doesn't match the
implementation; update the documentation for VectorBatch/ColumnVector to
reference NumericVector<bool> (or state that boolean columns use
NumericVector<bool>) instead of BoolVector so the terminology aligns with the
types used (e.g., NumericVector<T>, StringVector, and NumericVector<bool> for
booleans).

In `@include/executor/thread_pool.hpp`:
- Around line 54-59: The increment of pending_tasks_ happens outside the
critical section causing a race where a worker may complete a task before
pending_tasks_.fetch_add(1) runs; move the pending_tasks_.fetch_add(1,
std::memory_order_acq_rel) so it executes while holding mutex_ (i.e., inside the
lock scope that pushes into tasks_) and perform cv_.notify_one() after releasing
the lock; ensure wait() and worker code that does pending_tasks_.fetch_sub(1,
...) remain unchanged so the counter cannot underflow.

In `@include/executor/vectorized_operator.hpp`:
- Around line 800-802: The resumed bucket-scan path only records unmatched left
rows for JoinType::Left; update it to also handle JoinType::Full by checking if
(join_type_ == JoinType::Left || join_type_ == JoinType::Full) before pushing
left_row_idx_ into unmatched_left_indices_, so
unmatched_left_indices_.push_back(left_row_idx_) is executed for both LEFT and
FULL join types (mirror the logic used at lines 852-856).

In `@src/executor/query_executor.cpp`:
- Around line 1482-1484: The code currently adds a projected column with a
hardcoded ValueType::TYPE_TEXT in proj_schema.add_column("expr",
common::ValueType::TYPE_TEXT, true) for non-column expressions; instead, infer
the expression's result type and use that when adding the column. Locate where
expressions are processed (the projection handling that calls
proj_schema.add_column) and replace the fixed TYPE_TEXT with a type determined
by the expression node (e.g., call the expression's type-inference or evaluation
metadata function, or consult the input schema/column types if the expression is
a binary op or function), falling back to TYPE_TEXT only if inference fails;
ensure the chosen symbol names include the expression variable and
proj_schema.add_column invocation so the change targets the correct spot.
- Around line 1257-1277: The migration unconditionally copies from
storage::HeapTable to ColumnarTable on every vectorized SELECT, causing
duplicates and extra latency; change the logic in the block that constructs
storage::HeapTable heap_table(...) so that migration only runs when
ColumnarTable is empty or not yet initialized: check col_table's existing row
count or an "migrated" flag before calling col_table->create() and before
iterating heap_table.scan(); skip the append_batch loop if col_table already has
data (or set and persist a migration marker after the first successful
append_batch pass). Ensure you use heap_table.tuple_count(),
col_table->row_count() (or equivalent open()/exists() API) and the
append_batch/clear loop only when migration is needed.

In `@src/executor/thread_pool.cpp`:
- Around line 49-54: ThreadPool::wait deadlocks because workers decrement
pending_tasks_ but never wake the waiting thread; update the worker completion
path (where pending_tasks_.fetch_sub / decrement occurs) to call
cv_.notify_all() after decrement (or notify_one if appropriate) so the condition
in ThreadPool::wait() (tasks_.empty() && pending_tasks_.load(...) == 0) can be
re-evaluated; ensure notify is called after the decrement (holding the mutex_ is
optional but acceptable) to reliably unblock ThreadPool::wait().

In `@src/storage/columnar_table.cpp`:
- Around line 82-97: append_batch writes TYPE_FLOAT32, TYPE_DECIMAL (promoted to
float64) and TYPE_BOOL, but read_batch lacks matching branches so
deserialization throws "Symmetric serialization failure"; add branches in
read_batch (after the TYPE_FLOAT64 branch) to: for TYPE_FLOAT32 and TYPE_DECIMAL
read batch.row_count() * 8 bytes into a std::vector<double>, then set the target
executor::NumericVector<float/destination type> elements using .set(r,
value.is_null()? null : casted value) or appropriate setter converting double
back to the original type; for TYPE_BOOL read batch.row_count() bytes into
std::vector<uint8_t> and set the executor::NumericVector<bool> elements using
.set(r, byte != 0) preserving nulls; ensure types referenced match
common::ValueType::TYPE_FLOAT32, TYPE_DECIMAL, TYPE_BOOL and functions/classes
like read_batch and executor::NumericVector to keep serialization symmetric with
append_batch.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6b224391-5a3c-43e0-a823-8a8117751a58

📥 Commits

Reviewing files that changed from the base of the PR and between 2e88938 and e7140a7.

📒 Files selected for processing (13)
  • CMakeLists.txt
  • docs/VECTORIZED_EXECUTION.md
  • docs/phases/PHASE_5_OPTIMIZATION.md
  • docs/phases/PHASE_8_ANALYTICS.md
  • docs/phases/README.md
  • include/executor/query_executor.hpp
  • include/executor/thread_pool.hpp
  • include/executor/vectorized_operator.hpp
  • src/executor/query_executor.cpp
  • src/executor/thread_pool.cpp
  • src/storage/columnar_table.cpp
  • tests/transaction_manager_tests.cpp
  • tests/vectorized_operator_tests.cpp

poyrazK and others added 5 commits May 2, 2026 16:08
Replace (void) casts for unused parameters with C++17 [[maybe_unused]]
attribute in build_vectorized_plan function signature. Also removed
redundant (void)has_sort_or_limit line since the parameter is now
marked [[maybe_unused]].

Part of PR #72 code review follow-up.
- Use dynamic_cast with assertion instead of static_cast for
  VectorizedOperator* cast in execute_select (safer)
- Update build_vectorized_plan comment to clarify parameters
  are currently unused
- Add thread-safety note to StorageManager class documentation

Part of PR #72 code review follow-up.
ThreadPool::wait() was hanging indefinitely because cv_.notify_one()
after task completion could be consumed by an idle worker instead of
the main thread in wait(). This is the classic "missed wakeup"
problem with condition variables.

With notify_one(), a worker finishing a task and calling notify_one()
might wake another idle worker (who then immediately goes back to sleep
on the empty queue) instead of the main thread waiting on wait().

Using notify_all() ensures all waiting threads—including the main thread
in wait()—get notified and re-check their predicates. Workers whose
predicates return false (empty queue, still tasks pending) go back to
sleep immediately.

Fixes the ThreadPoolTests.SubmitAndWait and ThreadPoolTests.MultipleWait
tests hanging in CI.
1. docs: Fix "BoolVector" terminology to "NumericVector<T>/NumericVector<bool>"
   and add 'text' language specifier to ASCII diagram fences.

2. thread_pool.hpp: Fix race condition where pending_tasks_.fetch_add()
   was outside the mutex lock. Move it inside the lock scope so the
   counter accurately reflects queued work before notify_one() fires.

3. vectorized_operator.hpp: Extend resumed-bucket-scan and NULL-key
   branches to handle JoinType::Full (not just JoinType::Left), so
   unmatched left rows are tracked for FULL joins too.

4. query_executor.cpp: (a) Infer expression result type from
   ConstantExpr value instead of hardcoding TYPE_TEXT for projections;
   (b) Check if ColumnarTable already has rows before migrating from
   HeapTable to avoid re-migrating and duplicating data.

5. columnar_table.cpp: Add read_batch branches for TYPE_FLOAT32,
   TYPE_DECIMAL, and TYPE_BOOL that symmetrically match append_batch
   serialization (promote float32/decimal to float64; store bool as byte).

6. query_executor.hpp: Add TODO comment for unused thread_pool_ member
   pending future parallel execution support.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (2)
src/executor/query_executor.cpp (1)

1492-1500: ⚖️ Poor tradeoff

Partial improvement: Binary/Function expressions still default to TYPE_TEXT.

The constant expression type inference is a good step. However, for expressions like a + b or non-aggregate functions, the type still defaults to TYPE_TEXT. Consider extending type inference to handle Binary expressions by examining operand types.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/executor/query_executor.cpp` around lines 1492 - 1500, The code currently
only infers expression types for parser::ExprType::Constant and falls back to
TYPE_TEXT; extend this to handle parser::ExprType::Binary (and optionally
parser::ExprType::Function) by inspecting operand/result types and using the
project's type-promotion rules to derive a proper common::ValueType before
calling proj_schema.add_column("expr", expr_type, true). Locate the block that
checks col->type() and add branches for parser::ExprType::Binary where you cast
to parser::BinaryExpr (or the appropriate node), retrieve left/right operand
expression types (recursively if needed or via an existing helper), apply
numeric/string/date promotion rules to choose the resulting common::ValueType,
and similarly handle parser::FunctionExpr by resolving the function
signature/result type; fallback to TYPE_TEXT only if no rule applies.
docs/VECTORIZED_EXECUTION.md (1)

145-162: 💤 Low value

Add language specifier to fenced code blocks.

The code blocks at lines 145-153 and 156-162 are missing language specifiers, flagged by markdownlint (MD040). Since these are ASCII diagrams, use ```text for consistency.

📝 Suggested fix
 ### Volcano Path (`build_plan()`)
-```
+```text
 SeqScanOperator (HeapTable)
 ...

 ### Vectorized Path (`build_vectorized_plan()`)
-```
+```text
 VectorizedSeqScanOperator (ColumnarTable)
 ...
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/VECTORIZED_EXECUTION.md` around lines 145 - 162, Add the missing
Markdown fenced code block language specifiers by changing the two ASCII diagram
fences to use ```text; specifically update the block showing SeqScanOperator
(HeapTable) and the block under "Vectorized Path (`build_vectorized_plan()`)"
(which contains VectorizedSeqScanOperator (ColumnarTable)) so their opening
fences include the language token "text" to satisfy markdownlint MD040.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e6bffadd-1504-4020-9c1b-11c00ee295c0

📥 Commits

Reviewing files that changed from the base of the PR and between e7140a7 and 9d84b6b.

📒 Files selected for processing (8)
  • docs/VECTORIZED_EXECUTION.md
  • include/executor/query_executor.hpp
  • include/executor/thread_pool.hpp
  • include/executor/vectorized_operator.hpp
  • include/storage/storage_manager.hpp
  • src/executor/query_executor.cpp
  • src/executor/thread_pool.cpp
  • src/storage/columnar_table.cpp
✅ Files skipped from review due to trivial changes (4)
  • include/storage/storage_manager.hpp
  • src/executor/thread_pool.cpp
  • include/executor/thread_pool.hpp
  • include/executor/vectorized_operator.hpp
🚧 Files skipped from review as they are similar to previous changes (1)
  • include/executor/query_executor.hpp

Comment on lines +1314 to +1332
/* Migrate HeapTable data to ColumnarTable for join table */
storage::HeapTable join_heap_table(join_table_name, bpm_, join_schema);
uint64_t join_count = join_heap_table.tuple_count();
if (join_count > 0) {
    join_col_table->create();
    auto batch = executor::VectorBatch::create(join_schema);
    auto iter = join_heap_table.scan();
    executor::Tuple tuple;
    while (iter.next(tuple)) {
        batch->append_tuple(tuple);
        if (batch->row_count() >= 1024) {
            join_col_table->append_batch(*batch);
            batch->clear();
        }
    }
    if (batch->row_count() > 0) {
        join_col_table->append_batch(*batch);
    }
}

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Join table migration lacks duplicate-data guard present in base table migration.

The base table migration (lines 1264-1271) checks if existing_col_table->row_count() > 0 to skip re-migration, but the join table migration unconditionally migrates when join_count > 0. This causes duplicate rows in join_col_table on repeated vectorized queries with joins.

🐛 Proposed fix to add the same guard
         /* Migrate HeapTable data to ColumnarTable for join table */
         storage::HeapTable join_heap_table(join_table_name, bpm_, join_schema);
         uint64_t join_count = join_heap_table.tuple_count();
-        if (join_count > 0) {
+        bool join_needs_migration = (join_count > 0);
+        if (join_needs_migration) {
+            auto existing_join_col = std::make_shared<storage::ColumnarTable>(
+                join_table_name, *storage_manager_, join_schema);
+            if (existing_join_col->open() && existing_join_col->row_count() > 0) {
+                join_needs_migration = false;
+            }
+        }
+        if (join_needs_migration) {
             join_col_table->create();
             auto batch = executor::VectorBatch::create(join_schema);
             auto iter = join_heap_table.scan();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/executor/query_executor.cpp` around lines 1314 - 1332, The join-table
migration unconditionally moves rows from storage::HeapTable join_heap_table
into join_col_table when join_count > 0, causing duplicates on repeated
migrations; add the same guard used for base table migration by checking
join_col_table->row_count() and skipping the migration if it's > 0 before
creating/appending, i.e., around join_col_table->create()/append_batch calls
ensure you short-circuit when join_col_table already has rows so the tuple
append loop (iter.next(tuple) / batch->append_tuple /
join_col_table->append_batch) only runs for an empty target columnar table.

Comment on lines +201 to +218
} else if (type == common::ValueType::TYPE_FLOAT32) {
    auto& num_vec = dynamic_cast<executor::NumericVector<float>&>(target_col);

    n_in.seekg(static_cast<std::streamoff>(start_row), std::ios::beg);
    std::vector<uint8_t> nulls(actual_rows);
    n_in.read(reinterpret_cast<char*>(nulls.data()), actual_rows);

    d_in.seekg(static_cast<std::streamoff>(start_row * 8), std::ios::beg);
    std::vector<double> data(actual_rows);
    d_in.read(reinterpret_cast<char*>(data.data()), actual_rows * 8);

    for (uint32_t r = 0; r < actual_rows; ++r) {
        if (nulls[r] != 0U) {
            num_vec.append(common::Value::make_null());
        } else {
            num_vec.append(common::Value(static_cast<float>(data[r])));
        }
    }

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

NumericVector<float> cast will fail at runtime - init_from_schema creates NumericVector<double> for TYPE_FLOAT32.

According to types.hpp (context snippet 2, lines 432-435), VectorBatch::init_from_schema creates NumericVector<double> for both TYPE_FLOAT32 and TYPE_FLOAT64. The dynamic_cast to NumericVector<float>& on line 202 will throw std::bad_cast.

🐛 Proposed fix
     } else if (type == common::ValueType::TYPE_FLOAT32) {
-        auto& num_vec = dynamic_cast<executor::NumericVector<float>&>(target_col);
+        auto& num_vec = dynamic_cast<executor::NumericVector<double>&>(target_col);

         n_in.seekg(static_cast<std::streamoff>(start_row), std::ios::beg);
         std::vector<uint8_t> nulls(actual_rows);
         n_in.read(reinterpret_cast<char*>(nulls.data()), actual_rows);

         d_in.seekg(static_cast<std::streamoff>(start_row * 8), std::ios::beg);
         std::vector<double> data(actual_rows);
         d_in.read(reinterpret_cast<char*>(data.data()), actual_rows * 8);

         for (uint32_t r = 0; r < actual_rows; ++r) {
             if (nulls[r] != 0U) {
                 num_vec.append(common::Value::make_null());
             } else {
-                num_vec.append(common::Value(static_cast<float>(data[r])));
+                num_vec.append(common::Value::make_float64(data[r]));
             }
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/storage/columnar_table.cpp` around lines 201 - 218, The code is
dynamically casting target_col to NumericVector<float> when type ==
common::ValueType::TYPE_FLOAT32, but init_from_schema actually creates
NumericVector<double> for TYPE_FLOAT32/TYPE_FLOAT64 so the dynamic_cast will
throw std::bad_cast; fix by casting to NumericVector<double>& instead (e.g.,
replace the dynamic_cast<executor::NumericVector<float>&>(target_col) with
dynamic_cast<executor::NumericVector<double>&>(target_col)) and continue to
convert the double data to float when appending (or construct the appropriate
common::Value from the double) so the runtime types match init_from_schema;
update all code paths that assume NumericVector<float> for TYPE_FLOAT32 to use
NumericVector<double> or add a branch handling both numeric vector types
consistently (symbols: type, common::ValueType::TYPE_FLOAT32, target_col,
NumericVector<float>, NumericVector<double>, init_from_schema).
