Skip to content
This repository was archived by the owner on Apr 30, 2026. It is now read-only.

fix(8 bugs) + test(coverage +1.3pp): probes, type-preservation, broad workout#8

Merged
singaraiona merged 21 commits intoRayforceDB:masterfrom
ser-vasilich:fixes/seven-bug-probes
Apr 28, 2026
Merged

fix(8 bugs) + test(coverage +1.3pp): probes, type-preservation, broad workout#8
singaraiona merged 21 commits intoRayforceDB:masterfrom
ser-vasilich:fixes/seven-bug-probes

Conversation

@ser-vasilich
Copy link
Copy Markdown
Contributor

@ser-vasilich ser-vasilich commented Apr 27, 2026

Summary

Closes 8 known bugs (the 7 from spec/_probes/ plus one uncovered while writing the cross-type workout test) and lifts overall coverage from 64.2% → 65.5% lines (+542 lines). Each fix is a self-contained commit with regression tests in test/rfl/.

Bug fixes

Commit Bug Severity
153e9e9 (+ 1 2 3) silently dropped third arg correctness — silent data loss
072e040 (> 'b 'a) raised type instead of comparing SYM feature gap
9ff6e80 (neg 5h) raised type for narrow ints feature gap
ea281ef (abs 5h) widened to i64 unlike neg inconsistency
3df78bb (modify t 'a neg) rejected builtin fn inconsistency vs map
ab2074c (like ...) catastrophic backtracking + dual semantics DoS vector + correctness
d20284c (set-splayed) failed on nested paths; get-parted errored usability
3872960 first/last widened DATE/TIME/TIMESTAMP/BOOL to i64 type preservation

Plus ac0e7ca documents the file-mode error-exit contract (the corresponding bug was already fixed upstream).

Coverage uplift (test-only)

  • 38e7932 — consolidate non-duplicate coverage from spec migration
  • 89f37a9 — radix-boundary + null sort coverage salvaged from scratch files
  • 6a7ffa4 — fold-right/scan-right/retract-fact/scan-eav (+139 lines: each was 0%)
  • 6f32db0 — per-type and list-form coverage for reverse/union/except/alter
  • c4961bb — pivot avg/min/max + multi-key + f64; new union-all coverage
  • e368634 — DAG executor binary ops via select-with-derived-cols
  • 6e0da6f — single 200-row × 11-column workout fixture exercising aggregation/sort/filter/group-by/join/pivot/cast/csv/splayed across all primitive types
  • 52cf5df — .gitignore for IDE/coverage artifacts

Test plan

  • make test — 973 of 974 passed (1 explicit skip, 0 failed) on Linux x86_64 with ASan + UBSan
  • Each fix has at least one regression test under test/rfl/
  • like catastrophic-backtracking guard: 20-star pattern finishes in <1ms (was timeout)
  • set-splayed nested path + get-parted round-trip
  • first/last preserve type for DATE/TIME/TIMESTAMP/BOOL/U8

Known follow-ups (not in this PR)

  • (at p 'col) on a PARTED column doesn't materialize — only the select-where path exercises the lazy reader (commit d20284c)
  • Win32 ray_mkdir_p mirrors POSIX logic but I have no Windows CI to test it
  • DAG narrow-int arithmetic widens (I16+I16 → I64) while eval-path preserves type — pinned in test/rfl/integration/dag_binary_ops.rfl
  • sum on I16/I32 widens to i64 — pinned as documented inconsistency in test/rfl/arith/type_preservation.rfl

🤖 Generated with Claude Code

ser-vasilich and others added 17 commits April 27, 2026 20:58
Adds 624 unique test assertions on top of Anton's test/rfl/ corpus,
deduped against his parallel rewrite (commit 536fef4 et al).

Stats:
* 35 modified files — our backup was a strict superset of upstream;
  replace upstream wholesale.  Whitespace-only diffs in arith/abs,
  arith/ceil, arith/floor, arith/neg, arith/round, cmp/ge, cmp/gt,
  cmp/ne were also false-positive "Anton-only" results from sorting
  with extra spaces.
* 12 new files — coverage Anton didn't write at all:
    test/rfl/datalog/rule.rfl       — full EAV / rule / TC / negation
    test/rfl/embedding/hnsw.rfl     — HNSW index build/query/info
    test/rfl/integration/arena.rfl  — 100k-element churn + .sys.gc
    test/rfl/integration/cow.rfl    — copy-on-write aliasing
    test/rfl/integration/morsel.rfl — 1023/1024/1025 + 2047/2048/2049
    test/rfl/integration/optimizer.rfl — filter reorder, pushdown,
                                         selection-bitmap edges
    test/rfl/integration/str_pool.rfl — 12-byte SSO boundary
    test/rfl/system/csv_roundtrip.rfl — schema-less .csv.read
    test/rfl/system/splayed.rfl     — set/get-splayed round-trip
    test/rfl/table/modify.rfl       — functional column update
    test/rfl/table/pivot.rfl        — wide reshape, sum/count aggr
    test/rfl/table/select.rfl       — 50 select-clause assertions

Coverage delta vs origin/master:
  Tests:     921/922  →  933/934   (+12 unique tests)
  Lines:     63.4%    →  63.9%     (+217 lines)
  Functions: 77.5%    →  78.0%     (+8 functions)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously +/-/* and other RAY_BINARY verbs silently truncated extras:

  (+ 1 2 3)     -> 3        (third arg dropped)
  (+ 1 2 3 4 5) -> 3
  (- 10 1 2)    -> 9
  (* 2 3 4)     -> 6

Same hole in RAY_UNARY: extras after the first argument were released
without raising. `(+ 1)` and `(+)` were already domain errors but the
3+-arg case fell through to the binary kernel against the first two
args and threw the rest away.

Now both dispatch sites — the bytecode VM (eval.c:1656) and the tree-
walking eval (eval.c:2378) — raise:
  arity: expected N arg(s), got M

`and` and `or` were registered as RAY_BINARY but Anton's tests expected
variadic fold semantics ((and a b c) -> fold AND).  Switch them to
RAY_VARY via ray_and_vary_fn / ray_or_vary_fn — left-fold over the
existing binary kernel.  The (and X Y) DAG path through select where:
clauses still emits OP_AND / OP_OR via the expression compiler, so
fused execution is unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously (> 'b 'a) raised "error: type: cannot compare sym and sym"
even for the reflexive case (>= 'a 'a).  Equality comparisons (==/!=)
worked because they could just check interned IDs, but ordering had no
implementation at all and fell through to the is_numeric() guard.

Add sym_atom_cmp(a, b):

  - Fast path: equal interned IDs => identical text (interning gives
    one ID per text), return 0 without touching the global sym table.
  - Slow path: ray_sym_str(id) returns a RAY_STR atom; ray_str_cmp
    delegates to ray_str_t_cmp which uses the 12-byte SSO inline path
    for short symbols and prefix-then-fullcompare for pooled ones.

Wire it into ray_gt_fn / ray_lt_fn / ray_gte_fn / ray_lte_fn at the
same dispatch site as the existing -RAY_GUID branch.  Vector and
broadcast paths inherit the fix automatically — atomic_map_binary
unboxes SYM elements via collection_elem (-> ray_sym(id) atom), so
each pairwise call lands in the new branch.

The pre-fix probe (test/rfl/cmp/gt.rfl:41-44, "(> 'b 'a) !- type") is
replaced with positive assertions covering atoms, broadcast, vec/vec,
and the SSO inline/pooled boundary.  Mirror coverage added to lt.rfl,
ge.rfl, le.rfl.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ray_neg_fn only handled -RAY_I64 and -RAY_F64 atoms; (neg 5h) and
(neg 5i) raised "error: type" even though abs already accepted them.

Add the i16/i32 branches and preserve type — same convention as
binary +, -, *, %, /:

    (neg 5h)       -> -5h     (i16)
    (neg 5i)       -> -5i     (i32)
    (neg [1h 2h])  -> [-1h -2h]   (I16)

(Vector path inherits the fix via RAY_FN_ATOMIC.)  Type preservation
is the right call here: in Rayforce typed nulls live in a separate
nullmap bit, so INT_MIN is just a regular value and there's no
overflow concern that would justify widening like abs does.

Add test/rfl/arith/type_preservation.rfl pinning the full type-
promotion matrix so a regression in any op surfaces loudly:

  - same-width +/-/*//%  preserve type (i16, i32, i64, f64)
  - cross-width +/-/*//%  follow "wider wins" (i16+i32→i32, etc.)
  - all comparisons return bool
  - unary neg/floor/ceil  preserve type
  - unary abs widens narrow ints to i64 (documented inconsistency)
  - math fns (round/sqrt/log/exp)  always return f64

Replaces the stale `(neg 5h) !- type` probes in test/rfl/arith/neg.rfl
with positive assertions including type assertions and null
propagation across i16/i32.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
abs was the lone outlier in the unary-numeric family — neg, floor,
ceil all preserve i16/i32, but abs widened them to i64.  No good
reason for the inconsistency: in Rayforce typed nulls live in a
separate nullmap bit, so INT_MIN is just a regular value and the
"overflow protection" rationale that justifies widening in some
languages doesn't apply here.

Switch the i16/i32 branches to make_i16/make_i32:

    (abs -5h)     -> 5h    (i16, was i64)
    (abs -5i)     -> 5i    (i32, was i64)
    (abs [-1h 2h]) -> [1h 2h] (I16, was I64)

Update test/rfl/arith/abs.rfl and the type_preservation.rfl regression
table: abs now joins neg/floor/ceil under the "preserve narrow ints"
rule.

Coverage of remaining type-promotion oddities (left intentional):
  - sum widens I16/I32 → i64 (overflow guard)
  - round of int → f64 (mathematically noop, but produces float anyway)
Both pinned in type_preservation.rfl so future changes surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nary

(modify t 'a neg) raised "error: type" while (map neg [1 2 3]) worked.
modify calls call_fn1(fn, column_vec) internally — call_fn1 saw a
RAY_UNARY function and invoked it directly on the whole vector.  But
ray_neg_fn (and every other atomic unary kernel: abs, floor, ceil,
round, sqrt, log, exp, not) is written for a single atom and rejects
positive-typed (vector) input with "type".

call_fn2 already had the parallel routing for binary atomic builtins:

    if ((fn->attrs & RAY_FN_ATOMIC) && (is_collection(a) || ...))
        return atomic_map_binary(f, a, b);

call_fn1 simply forgot the unary mirror.  Add it.

Affects every code path that uses call_fn1 — modify, fold (1-arg form),
the apply 1-arg case — so any future caller now gets vector
auto-mapping for free, matching the user's intuition that "if it
works in map, it should work everywhere".

Tests: extend test/rfl/table/modify.rfl with parity assertions
showing (modify t 'a neg) ≡ (modify t 'a (fn [x] (neg x))), plus
chained modify with abs/neg and floor/ceil over a float column.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Document what file-mode and stdin-mode mean for error handling:

  - ray_repl_run_file (batch / script):  rc=1 stops execution
  - ray_repl_run / stdin pipe (REPL):    error printed, loop continues

The probe spec/_probes/error_exit_code_inconsistent.rfl filed back
when file-mode silently returned 0 on errors; that's been fixed
since (every verb in the probe — asof-join, inner/left-join, filter,
+, /, rand, til, alter — now correctly returns rc=1 from a script
file).  Drop the probe.

The stdin-pipe-doesn't-abort behaviour is the REPL contract, not a
bug; users wanting batch semantics should pass the script as a
positional file arg.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three implementations existed and they disagreed on syntax:

  - eval path: strop.c::str_glob — recursive, glob */?/[abc],
    EXPONENTIAL on patterns like "a*a*a*…a*b" against "aaa…a"
    (>5s on 16 stars, >timeout on 20).
  - DAG path: string.c::like_match — iterative, but SQL %/_ syntax.
  - DAG path: string.c::ilike_match — iterative SQL %/_ , case-
    insensitive.

So `(like "hello" "h*")` returned true via eval but matched 0 rows
in `select where: (like s "h*")` — `*` was literal under the SQL
matcher.  And the eval path could be DoS'd with a 20-star pattern.

Replace all three with one shared implementation in src/ops/glob.[ch]:

  - Iterative two-pointer with last-star backtrack (glibc fnmatch
    style).  O(n*m) worst case; 32-star pattern that pre-fix took
    >5s now finishes in microseconds.
  - Glob syntax matching the documented contract: * (any), ? (one),
    [abc] / [a-z] / [!abc] (character class).
  - ray_glob_match (case-sensitive) and ray_glob_match_ci (folds
    ASCII letters on both sides).

eval path (strop.c::ray_like_fn) and DAG path (string.c::exec_like /
exec_ilike) both call the same matcher — semantics now identical.

Tests:
  * test/rfl/strop/like.rfl extended with character classes, ranges,
    negated classes, an adversarial 20-star catastrophic-backtracking
    guard, and explicit eval-path-≡-DAG-path parity assertions.
  * test/test_exec.c: stale C-test using "bar%" (SQL syntax) updated
    to "bar*" (glob).

Docs: website/docs/rayfall-functions.html updated to mention [abc] /
[a-z] / [!abc] (already implemented; previously undocumented).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…t-parted

(set-splayed "/tmp/db/2024.01.01/t/" t)         -> error: io
(get-parted  "/tmp/db/" 't)                     -> error: io

Two compounding bugs broke partitioned-table workflows:

1. ray_mkdir was single-level — could not create
   "/tmp/db/2024.01.01/t/" when the parent directories didn't pre-
   exist.  v1 quietly handled this inside fs_fopen (walk path,
   mkdir each parent before opening the file); v2 lost that.

   Add ray_mkdir_p (POSIX + Win32, src/store/fileio.c) with
   mkdir -p semantics.  Use it from ray_splay_save instead of the
   single-level ray_mkdir.

2. ray_read_parted (get-parted) unconditionally called
   ray_sym_load("<db_root>/sym") and propagated its failure as
   "io" — but set-splayed only writes per-table sym files inside
   the leaf splayed dir, never a root-level one for symbol-less
   tables.  Stat the file and skip the load if absent.

After both: (set-splayed "/tmp/db/2024.01.01/t/" t0) writes the
partition correctly; (get-parted "/tmp/db/" 't) returns a 10-row /
3-column lazy table (1 MAPCOMMON partition-key + 2 data columns),
parity with the v1 behaviour.

Tests: extend test/rfl/system/splayed.rfl with the nested-mkdir case
and a 2-partition get-parted round-trip (count = 10, columns = 3).

Known limitation (separate task): direct (at p 'col) on a PARTED
column doesn't materialize values yet — only the select-where path
exercises the lazy reader.  Filed for follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds:
  .idea/, .vscode/        — IDE working state
  *.gcda, *.gcno, *.gcov  — gcc coverage instrumentation outputs
  coverage*.info          — lcov tracefiles
  coverage_html/          — genhtml output directory
  rayforce.cov            — clang/llvm coverage runtime output

Keeps the working tree clean during a coverage build so `git status`
isn't drowned in untracked binary artifacts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three legacy script-style test files (test_4097.rfl, test_null_ops.rfl,
test_comprehensive.rfl) sat untracked in the repo root.  Reviewed each
against the existing test/rfl/ corpus; salvaged what wasn't covered:

* test/rfl/integration/sort_radix_boundary.rfl (new) — pin every
  type's sort exit values at N=4097, just over the 2^12 radix-strategy
  threshold.  Covers asc/desc/iasc/idesc/xasc/xdesc/select-orderby/
  group-by-then-sort/distinct/rank across i64, f64, SYM, STR, BOOL,
  DATE, TIMESTAMP, plus null-bearing variants.

* test/rfl/null/sort.rfl (extended) — add xasc-on-null-keyed-table
  (null sorts first, count preserved) and take-with-nulls slicing
  semantics.  Anton's existing file covered asc/desc/iasc/idesc with
  nulls but not xasc or take.

The other two files duplicated Anton's null/* and arith/div coverage
(Float div-by-zero already produces 0Nf in his test/rfl/arith/div.rfl,
INT64 boundary already in test/rfl/integration/null.rfl), so dropped.

Plus removed:
  rayforce.cov                — clang/llvm runtime artifact
  test/bugs/                  — 6 of 7 already migrated as upstream
                                tests; legacy duplicates
  extract_v1_tests.py + v1_tests*.rfl — Python interim, generation
                                output; we don't ship Python in the
                                test pipeline
  test_null_full / 100k / parallel / bugs.rfl — used renamed verbs
                                (write-csv, antijoin) that don't
                                exist in current Rayforce, and what
                                they tested is already covered

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…av + radix groupby

Four register_vary builtins had 0% coverage despite existing in
src/lang/eval.c:
  - collection.c::ray_fold_right_fn  (0% → 58%, +21 lines)
  - collection.c::ray_scan_right_fn  (0% → 73%, +24 lines)
  - datalog.c::ray_retract_fact_fn   (0% → 88%, +46 lines)
  - datalog.c::ray_scan_eav_fn       (0% → 89%, +48 lines)

Net +139 source lines newly covered from 3 small tests, no kernel
changes.

* test/rfl/hof/right.rfl — right-fold semantics for + - and a digit-
  building lambda f(a,b)=a+10b that distinguishes left vs right fold
  shape; suffix-sum invariants for scan-right.
* test/rfl/datalog/eav_ops.rfl — assert/retract round-trips, retract
  no-op on missing triple, retract leaves sibling attributes intact;
  scan-eav 2-arg (filter by attr) and 3-arg (entity+attr lookup).
* test/rfl/integration/radix_groupby.rfl — 100k and 200k row group-by
  to push the executor past RAY_PARALLEL_THRESHOLD (= 64*1024) and
  exercise the parallel radix path in group.c.  Also pins multi-key
  group-by + multi-aggregator results at 100k.

Overall lines coverage: 64.2% → 65.0% (+0.8pp), functions 78.5% → 79.0%.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ray_alter_fn (40% → 83%, +66 lines), ray_reverse_fn (40% → 62%, +11),
ray_union_fn / ray_except_fn — the previous tests only hit i64 (and
SYM for except) and never the boxed-list path.  Each function has
distinct branches per element type and a separate heterogeneous-list
branch that vec-of-X tests skip entirely.

* reverse — exercise type switch across F64, I16, I32, U8, BOOL, SYM,
  STR, DATE, TIME, plus null-bearing input.
* union/except — vec switch for f64/i16/i32/STR/DATE/TIME/BOOL plus the
  (list ...) variants which route through ray_union_fn's boxed-list
  fallback at line 793.
* alter — add LIST-only forms (set/concat/remove with atom or vec
  index), plus alter-set across F64/I16/SYM/BOOL vec types to
  exercise store_typed_elem dispatch.

No kernel changes; tests only.  +1 test file count not changed (all
edits are in-place to existing files).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ray_pivot_fn previously exercised only sum/count over one row-key on a
3-column SYM/SYM/I64 fixture.  Add:
* avg / min / max aggregator hits (separate combine paths for each)
* multi-row-key form (pivot t [r] c v sum) — vector first arg
* f64 value column

ray_union_all_fn was registered but had no .rfl tests at all (0%
coverage despite being exposed as `union-all`).  Add row-count, sum
invariants, and empty-table edge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
expr.c::expr_exec_binary handles all arithmetic/comparison ops in DAG
(select) context.  Top-level (+ a b) goes through the eval-path kernel,
not this DAG branch, so the existing arith.rfl tests miss it entirely.

The new file pins:
  - int + - * / % across various scalar/vec broadcast patterns
  - float arithmetic
  - narrow-int (I16/I32) DAG behaviour
  - all six comparison ops returning bool vector
  - compound where: (and/or) in select
  - filter+arithmetic fusion (where + derived col)
  - F64 NaN-aware DAG comparison

DOCUMENTED INCONSISTENCY found during this session: the DAG path
widens narrow-int arithmetic (I16+I16 → I64) while the eval-path
kernel preserves type ((+ 5h 3h) → 8h, I16).  The new test pins the
DAG's current widening behaviour so a future alignment fix surfaces.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ture

A magisterial integration test that builds one 200-row table with 11
columns spanning every primitive type (I64, I16, I32, F64, SYM, STR,
B8, DATE, TIME, plus derived F64 price and signed I64 qty) and runs:

* 12 atomic aggregations (sum/count/avg/min/max/first/last) per column
* 7 comparison-operator selects across SYM/F64/I64/B8 columns
* 7 sort variants (asc/desc/iasc/idesc/rank/xasc/xdesc)
* distinct/take/reverse/concat/in across multiple types
* group-by per key type (SYM, STR, B8, DATE, I16) plus 2 multi-key
* DAG-derived columns via select+arithmetic
* inner/left/anti joins with partial-coverage lookup tables
* pivot with sum / count / avg aggregators
* cast across i16/i32/i64/f64/b8
* update / modify / insert / upsert
* csv round-trip (.csv.write / .csv.read)
* splayed round-trip (set-splayed / get-splayed)

ΔLine coverage from this single file: +49 lines.

KNOWN BUGS uncovered while writing this test, pinned with comments:
  - (first dt-col)  drops DATE type → i64 (returns days-since-epoch)
  - (first tm-col)  drops TIME type → i64 (ms since midnight)
  - (last bool-col) drops BOOL type → 0/1 i64
  - set-splayed of a table with a SYM column then get-splayed yields
    "error: corrupt"; same fixture sans SYM round-trips cleanly

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ray_first_fn and ray_last_fn whitelisted SYM/I16/I32/GUID/STR for the
type-preserving collection_elem path; everything else fell through to
AGG_VEC_VIA_DAG which produced an i64 result for these types:

  (first [2024.01.01 2024.01.02])   → 8766     (was: 'date 2024.01.01)
  (first [09:30:00.000 ...])        → 34200000 (was: 'time 09:30:00.000)
  (last  [true false])              → 0        (was: 'b8 false)

Add DATE / TIME / TIMESTAMP / BOOL / U8 to the whitelist so they
follow the same type-preserving path.  collection_elem already builds
typed atoms for all of them via ray_date / ray_time / ray_timestamp /
ray_bool / ray_u8 — this fix simply routes there.

Discovered while writing test/rfl/integration/cross_type_workout.rfl;
that file's pinned-bug TODOs are now upgraded to type-asserting
assertions.  Plus per-type regression coverage in
test/rfl/agg/{first,last}.rfl.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ser-vasilich ser-vasilich changed the title fix: 7 bug probes (arity, sym ordering, type preservation, glob, splayed/parted) fix(8 bugs) + test(coverage +1.3pp): probes, type-preservation, broad workout Apr 28, 2026
@singaraiona singaraiona requested a review from Copilot April 28, 2026 15:00
ser-vasilich and others added 3 commits April 28, 2026 18:28
A 1000-row two-column fixture with 50 distinct keys driving group.c::
exec_group through every aggregator combination + every primitive key
type.

Aggregators covered in group-by context: count, sum, avg, min, max,
first, last, dev, var, stddev, dev_pop, var_pop, stddev_pop.  Plus
multi-aggregator (7-agg) select, group-by + filter pushdown, and
no-by-clause aggregation over the whole table.

Key types: I64, SYM, BOOL, DATE, F64, I16, I32 — each routes through
a separate hash-key path in the parallel radix groupby.

KNOWN BUGS pinned with `!- length`:
  - (med v) inside (select … by:) raises "length: non-agg expression
    referencing a column produced a non-row-aligned result"
  - (diverse v) — actually a bool predicate ("all unique?"), not the
    count-distinct one might guess from the name; works standalone but
    isn't usable inside group-by either

ΔLine coverage: +67 lines (mostly variant aggregator paths in group.c
and per-key-type dispatch in exec_group).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends pivot.rfl with paths previously uncovered:
* multi-row-key form (pivot t [a b] c v sum) — 4-column output
* I64 row key (was only SYM-keyed)
* DATE row key
* missing-cell semantics: sum and count both yield 0 for empty groups

These exercise the row-key hash + cross-product paths in
tblop.c::ray_pivot_fn that the existing fixtures (one SYM row key)
didn't reach.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends datalog/rule.rfl with paths missed by the original 13 cases:
* recursive (anc ?x ?y) — depth-3 chain plus a disjoint pair (5→6),
  exercising the rule fixed-point loop in dl_compile_rule
* two-clause derivation (cofriend) — duplicate triggers de-dupe
* multi-constant body (dept 10 ∧ level 'senior) — separate filter
  branches in dl_parse_body_clause

Should hit some of the 247 uncovered lines in datalog.c::dl_compile_rule.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@singaraiona
Copy link
Copy Markdown
Contributor

Code review

Strengths

  • Each fix has regression coverage in test/rfl/test every fix rule met.
  • Comments explain why (glob consolidation rationale, DAG vs. eval whitelist, missing-symfile contract).
  • glob.[ch] consolidation is the right call: kills three diverged matchers and the catastrophic-backtracking blowup.
  • op_callf arity-error path now releases all n args, not just the consumed prefix.

Issues to address before merge

src/ops/cmp.c:187sym_atom_cmp swallows lookup failure

int r = (sa && sb) ? ray_str_cmp(sa, sb) : 0;

If one ray_sym_str returns NULL, you return 0 (equal). That makes unequal symbols compare equal and lets a corrupted intern table slip past. Either propagate an error or fall back to comparing raw i64 ids — never silently say "equal".

src/ops/arith.c:332 — narrow-int neg/abs and INT_MIN UB

if (x->type == -RAY_I32) return make_i32(-x->i32);
if (x->type == -RAY_I16) return make_i16(-x->i16);

-INT32_MIN / -INT16_MIN are signed overflow (UB). The previous code widened to i64 specifically to dodge this. Either guard x->i32 == INT32_MIN (return typed null or saturate) or cast through unsigned for defined wraparound. UBSan should catch it if test/rfl/arith/neg.rfl includes an INT32_MIN row — please add one.

src/store/fileio.c:122ray_mkdir_p fixed buf[1024]
POSIX PATH_MAX is 4096 on Linux. Deep splayed paths could overflow. Suggest PATH_MAX / MAX_PATH.

Worth flagging (not blockers)

OP_AND/OP_OR registration removed but DAG still uses them. register_binary_op("and", …, OP_AND) is gone; eval-time and/or are now RAY_VARY. The DAG path still works for binary (and a b) because src/ops/query.c::resolve_binary_dag resolves "and"ray_and independently. But (and a b c) with >2 args inside (select t (where ...)) will now fall off the DAG path (binary-only resolver) and run row-by-row in eval. Silent perf cliff, not correctness. Worth a pin test.

and/or don't short-circuit. Args are pre-evaluated (no RAY_FN_SPECIAL_FORM). Same as before for binary, but extending to variadic makes the lack of short-circuit more user-visible. Consider promoting to special form in a follow-up.

OP_LIKE syntax silently changed from SQL %/_ to glob */?. Documented in PR body, and test_exec.c was updated. Worth one explicit test that asserts (like x "%") matches a literal % — confirms the consolidation in both directions.

src/ops/glob.c::match_class accepts unterminated [. If pattern is "abc[def" the loop exits at pn without seeing ] and the match continues using whatever matched flag accumulated. Lenient behavior; worth a one-liner in glob.h documenting the policy so it's not later filed as a bug.

Convention checks

  • No malloc/free, no third-party names, one-word filenames — all good.
  • ray_sym_str retain/release pairing in cmp.c and strop.c looks right.

Verdict

Approve once the three blockers above (sym NULL fallback, INT_MIN UB, PATH_MAX) are addressed. The fixes themselves are tight, and the coverage uplift is genuine value.

Address Anton's PR RayforceDB#8 review:

[blocker] sym_atom_cmp: NULL ray_sym_str → fall back to comparing raw
interned ids instead of returning 0 (which would silently collapse
distinct symbols if the intern table is corrupted or uninitialised).
Stable, total ordering preserved.

[blocker] neg/abs INT_MIN UB: -INT16_MIN / -INT32_MIN / -INT64_MIN
were signed-overflow UB.  Negate via unsigned cast — wraparound is
defined for unsigned types and the result wraps back to INT_MIN
consistently with binary `(- 0 INT_MIN)`.  Added regression rows in
arith/neg.rfl and arith/abs.rfl that pin INT16_MIN / INT32_MIN
behaviour (via cast since `-32768h` literal is unrepresentable).

[blocker] ray_mkdir_p path buffer: replace hardcoded `char buf[1024]`
with `RAY_PATH_MAX` (PATH_MAX on POSIX, 4096 on Windows).  Deep
splayed paths like /db/yyyy.mm.dd/leaf/ now fit.

[bonus] glob `%`/`_` literal-match assertions in strop/like.rfl —
makes the SQL→glob consolidation explicit in tests, not just docs.

[bonus] glob.h documents the lenient unterminated-class policy
(matches glibc fnmatch semantics; never produces parse error).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ser-vasilich
Copy link
Copy Markdown
Contributor Author

Thanks for the review. All three blockers addressed in 82fae9d:

1. sym_atom_cmp NULL fallback — now compares raw interned ids when ray_sym_str returns NULL. Stable, total, never collapses distinct symbols.

2. neg/abs INT_MIN UB — switched to unsigned-wrap negation ((int32_t)(-(uint32_t)x) etc.). Wraparound defined for unsigned, so (neg INT16_MIN) → INT16_MIN instead of UB. Added INT16_MIN / INT32_MIN regression rows in test/rfl/arith/neg.rfl and abs.rfl (via cast since -32768h literal is unrepresentable — parser tokenises positive then negates).

3. ray_mkdir_p bufferchar buf[RAY_PATH_MAX] (PATH_MAX on POSIX, 4096 on Win32).

Bonus:

  • test/rfl/strop/like.rfl adds explicit (like "100%" "100%") -- true and three companions making the SQL→glob consolidation visible at the test level.
  • src/ops/glob.h documents the lenient unterminated-class policy (matches glibc fnmatch).

The (and a b c) in select where: falling off the DAG is a real silent perf cliff but not a correctness issue — left for a follow-up since the surrounding refactor (DAG resolver supporting variadic logical ops) is larger than this PR.

Tests still 974 of 975 passing under ASan + UBSan.

@singaraiona singaraiona merged commit bb8884e into RayforceDB:master Apr 28, 2026
4 checks passed
ser-vasilich added a commit to ser-vasilich/rayforce2 that referenced this pull request Apr 29, 2026
Previous implementation used unsigned-wrap negation, so:

  (abs -32768h)           -> -32768h   ❌ negative result from abs!
  (abs (- (neg 32767h) 1h)) -> -32768h ❌ same path
  (neg INT_MIN)           -> INT_MIN   ❌ wrap, silent

Math: -INT_MIN doesn't fit in the same signed width.  Per k/q
convention, surface the overflow as a typed null of the same width:

  (neg INT16_MIN) -> 0Nh
  (neg INT32_MIN) -> 0Ni
  (neg INT64_MIN) -> 0Nl
  (abs INT_MIN)   -> typed null   (same per width)

Properties:
  ✓ Type preserved (no widening to i64)
  ✓ No UB (overflow detected before negation)
  ✓ Detectable via `nil?`
  ✓ Consistent with existing null propagation: (neg 0Ni) → 0Ni

The check is wrapped in RAY_UNLIKELY since INT_MIN is extremely rare
in real data; modern branch predictors will treat it as a
near-zero-cost test on the hot path.

DAG inline loop in expr.c:811-812 still uses unsigned wrap (separate
hot path for OP_NEG/OP_ABS over i64 vectors in select-with-derived-
columns).  Inconsistency between atom and DAG paths to be addressed
in a follow-up.

Tests in test/rfl/arith/{neg,abs}.rfl now pin:
  - (neg/abs INT16_MIN/INT32_MIN/INT64_MIN) → typed null of same type
  - adjacent values (INT_MAX, -INT_MAX) still negate normally

Addresses Anton's PR RayforceDB#8 review blocker 2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ser-vasilich added a commit to ser-vasilich/rayforce2 that referenced this pull request Apr 29, 2026
PR RayforceDB#8 dropped `register_binary_op("and", ..., OP_AND)` and switched
`and`/`or` to eval-time variadic via `register_vary`.  `compile_expr_dag`
still recognised the 2-arg form via `resolve_binary_dag` (binary
shortcut path) but bailed out with NULL on n>=4 — and the WHERE-clause
caller upgraded that to "WHERE predicate not supported by DAG compiler".
Anton flagged this as a silent perf cliff; in practice it is a
correctness regression (queries that worked in v1 / pre-PR fail
outright).

Fold variadic AND/OR into a balanced binary tree of OP_AND/OP_OR nodes
before lowering: `(and a b c d)` → `(and (and a b) (and c d))`,
depth log2(N).  The fused-expr executor evaluates this as a flat
sequence of binary AND/OR instructions sharing scratch registers, so
no extra column allocations vs the hand-nested form.

Iterative pairwise reduction inside `compile_expr_dag`; bail at >64
inputs (avoids unbounded stack array, no realistic query needs more).
Balanced shape (rather than left-fold) keeps the tree symmetric and
minimises dependency-chain depth, leaving room for a future
predicate-parallel executor without rewriting this pass.

Tests in dag_binary_ops.rfl pin: 3/4-arg AND in WHERE matches the
nested-binary form; 3/5-arg OR in WHERE; mixed `(and (or...) (or...))`;
fold semantics agrees with composing two filtered selects.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ser-vasilich added a commit to ser-vasilich/rayforce2 that referenced this pull request Apr 29, 2026
PR RayforceDB#8 changed `and`/`or` from `register_binary_op` (with OP_AND/OP_OR
opcodes) to `register_vary` — but registered them with `RAY_FN_NONE`,
losing the `RAY_FN_SPECIAL_FORM` flag that v1 carried.  Result: every
arg is eagerly evaluated even when the result is already determined
by a prior scalar — `(and false (heavy-fn))` runs `heavy-fn`,
`(or true (slow-query))` runs `slow-query`.  v1 short-circuited; v2
silently regressed.

Promote both to special form so the kernel receives unevaluated AST
nodes and calls `ray_eval` itself — same dispatch shape as `if`, `do`,
`let`, `select` etc.  Short-circuit triggers only when the running
accumulator is a *scalar* with the determining truth value: scalar
falsy for AND, scalar truthy for OR.  When the accumulator is a
vector we fall through to `ray_and_fn` / `ray_or_fn` for proper
element-wise broadcast (preserves `(or [t f t] true) → [t t t]` shape
semantics that the existing test asserts).

Tests in cmp/and.rfl and cmp/or.rfl pin: short-circuit prevents
evaluation of subsequent args (probe via `undefined-name`), and
short-circuit does *not* trigger when the result is undetermined.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
singaraiona pushed a commit that referenced this pull request Apr 29, 2026
Previous implementation used unsigned-wrap negation, so:

  (abs -32768h)           -> -32768h   ❌ negative result from abs!
  (abs (- (neg 32767h) 1h)) -> -32768h ❌ same path
  (neg INT_MIN)           -> INT_MIN   ❌ wrap, silent

Math: -INT_MIN doesn't fit in the same signed width.  Per k/q
convention, surface the overflow as a typed null of the same width:

  (neg INT16_MIN) -> 0Nh
  (neg INT32_MIN) -> 0Ni
  (neg INT64_MIN) -> 0Nl
  (abs INT_MIN)   -> typed null   (same per width)

Properties:
  ✓ Type preserved (no widening to i64)
  ✓ No UB (overflow detected before negation)
  ✓ Detectable via `nil?`
  ✓ Consistent with existing null propagation: (neg 0Ni) → 0Ni

The check is wrapped in RAY_UNLIKELY since INT_MIN is extremely rare
in real data; modern branch predictors will treat it as a
near-zero-cost test on the hot path.

DAG inline loop in expr.c:811-812 still uses unsigned wrap (separate
hot path for OP_NEG/OP_ABS over i64 vectors in select-with-derived-
columns).  Inconsistency between atom and DAG paths to be addressed
in a follow-up.

Tests in test/rfl/arith/{neg,abs}.rfl now pin:
  - (neg/abs INT16_MIN/INT32_MIN/INT64_MIN) → typed null of same type
  - adjacent values (INT_MAX, -INT_MAX) still negate normally

Addresses Anton's PR #8 review blocker 2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
singaraiona pushed a commit that referenced this pull request Apr 29, 2026
PR #8 dropped `register_binary_op("and", ..., OP_AND)` and switched
`and`/`or` to eval-time variadic via `register_vary`.  `compile_expr_dag`
still recognised the 2-arg form via `resolve_binary_dag` (binary
shortcut path) but bailed out with NULL on n>=4 — and the WHERE-clause
caller upgraded that to "WHERE predicate not supported by DAG compiler".
Anton flagged this as a silent perf cliff; in practice it is a
correctness regression (queries that worked in v1 / pre-PR fail
outright).

Fold variadic AND/OR into a balanced binary tree of OP_AND/OP_OR nodes
before lowering: `(and a b c d)` → `(and (and a b) (and c d))`,
depth log2(N).  The fused-expr executor evaluates this as a flat
sequence of binary AND/OR instructions sharing scratch registers, so
no extra column allocations vs the hand-nested form.

Iterative pairwise reduction inside `compile_expr_dag`; bail at >64
inputs (avoids unbounded stack array, no realistic query needs more).
Balanced shape (rather than left-fold) keeps the tree symmetric and
minimises dependency-chain depth, leaving room for a future
predicate-parallel executor without rewriting this pass.

Tests in dag_binary_ops.rfl pin: 3/4-arg AND in WHERE matches the
nested-binary form; 3/5-arg OR in WHERE; mixed `(and (or...) (or...))`;
fold semantics agrees with composing two filtered selects.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
singaraiona pushed a commit that referenced this pull request Apr 29, 2026
PR #8 changed `and`/`or` from `register_binary_op` (with OP_AND/OP_OR
opcodes) to `register_vary` — but registered them with `RAY_FN_NONE`,
losing the `RAY_FN_SPECIAL_FORM` flag that v1 carried.  Result: every
arg is eagerly evaluated even when the result is already determined
by a prior scalar — `(and false (heavy-fn))` runs `heavy-fn`,
`(or true (slow-query))` runs `slow-query`.  v1 short-circuited; v2
silently regressed.

Promote both to special form so the kernel receives unevaluated AST
nodes and calls `ray_eval` itself — same dispatch shape as `if`, `do`,
`let`, `select` etc.  Short-circuit triggers only when the running
accumulator is a *scalar* with the determining truth value: scalar
falsy for AND, scalar truthy for OR.  When the accumulator is a
vector we fall through to `ray_and_fn` / `ray_or_fn` for proper
element-wise broadcast (preserves `(or [t f t] true) → [t t t]` shape
semantics that the existing test asserts).

Tests in cmp/and.rfl and cmp/or.rfl pin: short-circuit prevents
evaluation of subsequent args (probe via `undefined-name`), and
short-circuit does *not* trigger when the result is undetermined.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
singaraiona added a commit that referenced this pull request Apr 29, 2026
Follow-ups to #8: variadic and/or in WHERE, sym_atom_cmp invariant, INT_MIN→null DAG, glob escape docs
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants