Fix lazy materialize retain #193
Merged
singaraiona merged 2 commits into master on May 5, 2026
`(select {s: (sum a) from: t})` was returning N copies of the same
value instead of a single row. The projection-only path lowered
aggregates as ordinary column expressions, so OP_SELECT saw a scalar
atom and broadcast it to the input row count (exec.c: vec->type<0 ->
broadcast_scalar).
Route the all-aggregate / no-by case through ray_group(n_keys=0),
which already has a 1-row scalar-aggregate fast path. WHERE is
pre-executed (same pattern as the by-with-where fuse path) so the
lazy g->selection bitmap reaches the reduction.
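
A minimal sketch of the intended behaviour, not the actual `ray_group` code: with no group keys, an aggregate collapses the input to one value, and the WHERE result reaches the reduction as a selection bitmap. The name `sum_selected` and the one-bit-per-row layout are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the Rayforce API: reduce a column to a single
 * value, honouring an optional selection bitmap (one bit per row, LSB
 * first; this layout is an assumption). */
int64_t sum_selected(const int64_t *col, const uint8_t *sel_bits, size_t n)
{
    assert(col != NULL || n == 0);
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        /* sel_bits == NULL means no WHERE clause: every row is selected. */
        if (sel_bits == NULL || ((sel_bits[i >> 3] >> (i & 7)) & 1))
            acc += col[i];
    }
    return acc; /* one output value, regardless of n */
}
```

The buggy path corresponds to computing this value and then copying it into n output rows; the fix keeps it as a single row.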
The n_keys==0 parallel scalar path was effectively dead code before
this change, and its FIRST/LAST merge silently relied on worker-id order
matching row-index order, which does not hold under work-stealing dispatch.
Force serial execution when FIRST/LAST is in play; the DA path stays
parallel and tracks per-slot first_row/last_row already.
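
For illustration, an order-independent FIRST/LAST merge could look like the sketch below. It tracks row indices per partial, in the spirit of the per-slot first_row/last_row mentioned above. The struct and function names are hypothetical, and this commit itself forces serial execution for the scalar path rather than adopting such a merge.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-worker partial; field names are illustrative only. */
typedef struct {
    int     has_rows;   /* 0 if this worker saw no selected rows */
    size_t  first_row;  /* smallest row index seen by this worker */
    size_t  last_row;   /* largest row index seen by this worker */
    int64_t first_val;  /* value at first_row */
    int64_t last_val;   /* value at last_row */
} first_last_partial;

/* Merge n partials supplied in any order. Comparing row indices instead of
 * trusting worker order gives the same answer as a serial scan. */
first_last_partial merge_first_last(const first_last_partial *parts, size_t n)
{
    assert(parts != NULL || n == 0);
    first_last_partial out = {0, 0, 0, 0, 0};
    for (size_t i = 0; i < n; i++) {
        const first_last_partial *p = &parts[i];
        if (!p->has_rows)
            continue;
        if (!out.has_rows) {
            out = *p;
            continue;
        }
        if (p->first_row < out.first_row) {
            out.first_row = p->first_row;
            out.first_val = p->first_val;
        }
        if (p->last_row > out.last_row) {
            out.last_row = p->last_row;
            out.last_val = p->last_val;
        }
    }
    return out;
}
```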
Two existing tests asserted the buggy broadcast row count
(groupby_aggregators.rfl:64, group_coverage.rfl:417); updated to the
correct 1-row expectation.
…t), LIKE on dict SYM
Lands the four findings + bonus from RAYFORCE_BOTTLENECKS.md, taking
ClickBench hot-run total from ~1.6 M ms to ~14 K ms across 40
measurable queries (≈99% reduction).
* Fused `select { … asc/desc: c take: K }` lowers to bounded-heap
top-K when k << nrows and keys resolve to plain column refs.
Single-key uses the radix-encoded fast path; multi-key falls back
to the comparator-based heap. Q26 SearchPhrase: 5 186 → 72 ms.
* Grouped `count(distinct)` no longer routed through per-group
eval-fallback — the fused OP_COUNT_DISTINCT runs per group-slice.
Scaling moves from 94×/decade to ≈4.6×/decade between 100 K and
1 M rows (essentially linear).
* LIKE on dict-encoded SYM scans the dictionary once and lifts the
result through the codes vector instead of re-evaluating per row.
Low-card SYM (54-unique BrowserCountry): 52 → 3.65 ms (14×).
High-card SYM (1.73 M-unique URL): 498 → 220 ms (2.3×).
* Unifies the previously-divergent glob matchers (eval used `*?[abc]`,
DAG used SQL `%_`; one variant blew up exponentially on
`a*a*…a*b` against an a-only string) behind a single iterative
two-pointer implementation in src/ops/glob.{c,h}. Both call sites
delegate.
* Bonus: `(at table (iasc table.col))` no longer crashes on tables —
re-indexes each column to return a TABLE.
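
To make the bounded-heap top-K item concrete, here is a hedged sketch for a single int64 key in ascending order. It is neither the radix-encoded fast path nor the comparator-based multi-key heap from this PR; `topk_asc` and `sift_down` are illustrative names. The idea is to keep a max-heap of at most K values, so each incoming row costs O(log K) at worst and most rows are rejected with one comparison against the root.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Restore the max-heap property below index i in h[0..n). */
static void sift_down(int64_t *h, size_t n, size_t i)
{
    for (;;) {
        size_t l = 2 * i + 1, r = l + 1, m = i;
        if (l < n && h[l] > h[m]) m = l;
        if (r < n && h[r] > h[m]) m = r;
        if (m == i) return;
        int64_t t = h[i]; h[i] = h[m]; h[m] = t;
        i = m;
    }
}

/* Write the min(k, n) smallest values of col into out, ascending.
 * out must have room for k values. Returns the number written. */
size_t topk_asc(const int64_t *col, size_t n, size_t k, int64_t *out)
{
    assert(out != NULL || k == 0);
    size_t m = 0;
    if (k == 0) return 0;
    for (size_t i = 0; i < n; i++) {
        if (m < k) {
            /* Fill phase: append, then sift up. */
            size_t c = m++;
            out[c] = col[i];
            while (c > 0) {
                size_t p = (c - 1) / 2;
                if (out[p] >= out[c]) break;
                int64_t t = out[p]; out[p] = out[c]; out[c] = t;
                c = p;
            }
        } else if (col[i] < out[0]) {
            /* Smaller than the current K-th smallest: replace the root. */
            out[0] = col[i];
            sift_down(out, m, 0);
        }
    }
    /* In-place heap sort: move the max to the end repeatedly. */
    for (size_t end = m; end > 1; end--) {
        int64_t t = out[0]; out[0] = out[end - 1]; out[end - 1] = t;
        sift_down(out, end - 1, 0);
    }
    return m;
}
```

Memory stays at K values instead of a full sorted copy of the column, which is why the approach pays off when k << nrows.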
Tests: query_coverage / read_csv / reserved_namespace updated for the
new dispatch paths; cross_type_workout / collection/at extended.
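
The glob and dictionary-LIKE items in the list above can be sketched together. This is an assumption-laden illustration, not the contents of src/ops/glob.{c,h}: it handles only `*` and `?` (character classes and the SQL `%`/`_` forms are omitted), and the function names, `const char *` dictionary, and `uint32_t` codes are hypothetical. The matcher remembers only the most recent `*`, so its worst case is O(len(pat) × len(str)) rather than exponential.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Iterative two-pointer glob match supporting '*' and '?' only. */
bool glob_match_sketch(const char *pat, const char *str)
{
    assert(pat != NULL && str != NULL);
    const char *star = NULL;   /* pattern position just after the last '*' */
    const char *resume = NULL; /* string position where that '*' stops absorbing */
    while (*str) {
        if (*pat == '*') {
            star = ++pat;
            resume = str;
        } else if (*pat == '?' || *pat == *str) {
            pat++;
            str++;
        } else if (star) {
            /* Mismatch: let the last '*' absorb one more character. */
            pat = star;
            str = ++resume;
        } else {
            return false;
        }
    }
    while (*pat == '*') pat++; /* trailing stars match the empty string */
    return *pat == '\0';
}

/* Evaluate the pattern once per dictionary entry, then map each row's
 * code to the cached verdict. dict_hit is scratch space of dict_n bools. */
void like_dict_sketch(const char *pat, const char *const *dict, size_t dict_n,
                      const uint32_t *codes, size_t rows,
                      bool *dict_hit, bool *out)
{
    for (size_t d = 0; d < dict_n; d++)
        dict_hit[d] = glob_match_sketch(pat, dict[d]);
    for (size_t r = 0; r < rows; r++) {
        assert(codes[r] < dict_n);
        out[r] = dict_hit[codes[r]];
    }
}
```

With this shape the pattern is evaluated once per distinct value rather than once per row, which is consistent with the larger speedup reported for the low-cardinality column.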