perf(search): master speed — keep #2/#4 wins, gate #1, beat Everything across the matrix by githubrobbi · Pull Request #375 · skyllc-ai/UltraFastFileSearch

githubrobbi · 2026-06-09T18:43:25Z

Summary

The synthesis build from the performance regression root-cause analysis: keep the two genuine winners (#2 trigram-prefix, #4 parallel-resolve), gate the loser (#1 unlimited value-sort), and drop the un-gated #3. Verified on a fresh-daemon full-matrix Windows benchmark.

Commits

#1 skip value-sort for unlimited match-all — sort_and_localise early-returns (MFT-locality sort only) when limit >= candidates.len(), eliminating a redundant full sort of millions of tuples on * full-scans.
#2+#4 trigram prefix fast-path + size-gated parallel resolve — restores search_compact_drive_prefix (trigram-accelerated win*), wires is_prefix through both MultiDriveBackend::search and search_index, and gates indices_to_rows parallelism at PARALLEL_RESOLVE_THRESHOLD (50K) so tiny exact sets stay sequential (no rayon p95 jitter). Adds prefix-parity, limit, and parallel-resolve regression tests.
backend.rs decomposition — extracts DisplayRow into display_row.rs, drops the file-size exception.
docs(benchmarks) — public v0.5.120 cross-tool snapshot vs Everything.
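
The size gate on indices_to_rows described in the #2+#4 commit can be sketched roughly as follows. This is a minimal illustration, not the PR's code: it uses std::thread::scope in place of rayon, and `resolve_one`, the `String` row type, and the `threshold` / `chunk_size` parameters are assumptions made for the example. Chunk results are joined in spawn order, so the output order matches the sequential path.

```rust
use std::thread;

/// Hypothetical stand-in for resolving one record index into a display row.
fn resolve_one(index: u32) -> String {
    format!("row-{index}")
}

/// Resolve indices sequentially below `threshold`, in parallel chunks at or above it.
/// Both parameters are assumptions for this sketch; the PR uses a fixed constant.
fn indices_to_rows(indices: &[u32], threshold: usize, chunk_size: usize) -> Vec<String> {
    // Small result sets stay on the calling thread: no scheduling overhead, no tail jitter.
    if indices.len() < threshold {
        return indices.iter().map(|&i| resolve_one(i)).collect();
    }

    // Large result sets fan out, one scoped thread per chunk (rayon would pool these).
    thread::scope(|scope| {
        let handles: Vec<_> = indices
            .chunks(chunk_size.max(1))
            .map(|chunk| {
                scope.spawn(move || chunk.iter().map(|&i| resolve_one(i)).collect::<Vec<_>>())
            })
            .collect();

        // Join in spawn order so the concatenated rows keep the input order.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("resolve worker panicked"))
            .collect()
    })
}
```

Comparing the two paths on the same input is the kind of parity check the PR's regression test performs: the sequential and chunked results should be identical, including order.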

Benchmark result (verified-fresh daemon, C: + D:, 7.97M records)

Acceptance gate MET: best-or-tied vs both 0.5.66 and Everything on every row; beats Everything on all 16 comparable rows (C: prefix is a 1ms tie). Sets six new bests: D:/C,D: full_scan, D: prefix, D:/C,D: substring, C,D: ext_dll. Median UFFS/ES ratio ~0.52x (~1.9x faster).

Verification

cargo clippy -D warnings clean, cargo test -p uffs-core green (829 lib tests + new parity/regression tests).
Full lint-pre-push gate green (incl. windows lint, doc-tests, smoke).
Rebased onto main @ 0.5.119; 3 signed code commits + 1 signed docs commit.

Note: published artifact will be v0.5.120 after the post-merge CI version bump.

sort_and_localise ran a full O(N log N) value-sort even when limit admits every candidate (e.g. `*` full-scan, limit=usize::MAX). The downstream backend::sort_rows re-sorts the materialised rows by the user's column anyway and truncate is a no-op, so the value-sort is wasted work over millions of tuples. Add an early return for limit >= candidates.len() that does only the cheap MFT-locality sort (keeps DirCache warm for path resolution) and skips the value-sort entirely. Recovers the full_scan C,D regression (4.2s top-5 -> <=3.5s) without touching the limited-query path.
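
A minimal sketch of the early return described above, under stated assumptions: the `Candidate` struct and its fields are hypothetical, and the limited path (value-sort, truncate, then locality sort) is a guess at the surrounding logic rather than the actual implementation.

```rust
/// Hypothetical candidate tuple: a value to rank by and an MFT record position.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Candidate {
    sort_key: u64,
    mft_index: u64,
}

/// Sketch of the gate: skip the value-sort when the limit admits every candidate.
fn sort_and_localise(mut candidates: Vec<Candidate>, limit: usize) -> Vec<Candidate> {
    if limit >= candidates.len() {
        // Nothing will be truncated and the caller re-sorts rows by column anyway,
        // so only the locality sort (sequential MFT access) is worth doing.
        candidates.sort_unstable_by_key(|c| c.mft_index);
        return candidates;
    }

    // Limited path (assumed): rank by value, keep the top `limit`, then localise.
    candidates.sort_unstable_by_key(|c| c.sort_key);
    candidates.truncate(limit);
    candidates.sort_unstable_by_key(|c| c.mft_index);
    candidates
}
```

With `limit = usize::MAX` the function performs a single sort by `mft_index`; with a smaller limit it behaves as before.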

#4)

#2 Trigram prefix fast-path: prefix queries (e.g. `win*`) now narrow candidates via the first-3-char trigram lookup then filter by full prefix, instead of scanning every record. Adds is_prefix_pattern() in tree.rs, a new prefix_search.rs module, and is_prefix dispatch arms in backend.rs (both search sites) + dispatch.rs (+ pick_mode_label). Expected: prefix C 91->~72ms, C,D 95->~82ms (beats ES).

#4 Size-gated parallel path resolution: indices_to_rows dispatches sequential below RESOLVE_CHUNK_SIZE (4096) and par_chunks at/above it. 4096 keeps tiny exact queries (3-37 rows) off rayon (no p95 tail jitter) while letting prefix/substring (12K-34K rows) fan out. Expected: substring C 57->~38ms, C,D 58->~47ms.

Decompose: extract the indices_to_rows family into the new sibling module row_resolve.rs so query/mod.rs stays under the 800-LOC policy (809 -> 694), no file_size_exceptions entry added.

Tests: is_prefix_pattern acceptance matrix (tree.rs), prefix/glob parity + limit (query_tests), and a 9000-row parallel-resolve parity test guarding the chunk-reduce ordering.
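
The trigram prefix fast-path can be illustrated with a small in-memory sketch. Everything here is an assumption made for the example: `TrigramIndex`, `literal_prefix`, the byte-trigram posting lists, and the ASCII case folding are hypothetical and do not reflect the real layout of prefix_search.rs or the signature of is_prefix_pattern().

```rust
use std::collections::HashMap;

/// Hypothetical helper: returns the literal prefix of patterns shaped like `win*`
/// (one trailing `*`, no other wildcards), otherwise None.
fn literal_prefix(pattern: &str) -> Option<&str> {
    let prefix = pattern.strip_suffix('*')?;
    if prefix.is_empty() || prefix.contains(|c: char| c == '*' || c == '?') {
        return None;
    }
    Some(prefix)
}

/// Toy index: lowercased names plus, per byte trigram, the ids of names containing it.
struct TrigramIndex {
    names: Vec<String>,
    postings: HashMap<[u8; 3], Vec<usize>>,
}

impl TrigramIndex {
    fn build(names: &[&str]) -> Self {
        let names: Vec<String> = names.iter().map(|n| n.to_ascii_lowercase()).collect();
        let mut postings: HashMap<[u8; 3], Vec<usize>> = HashMap::new();
        for (id, name) in names.iter().enumerate() {
            for w in name.as_bytes().windows(3) {
                let list = postings.entry([w[0], w[1], w[2]]).or_default();
                // Ids arrive in ascending order, so checking the tail avoids duplicates.
                if list.last() != Some(&id) {
                    list.push(id);
                }
            }
        }
        Self { names, postings }
    }

    /// Narrow by the prefix's first trigram, then confirm the full prefix.
    fn prefix_search(&self, prefix: &str) -> Vec<usize> {
        let prefix = prefix.to_ascii_lowercase();
        let bytes = prefix.as_bytes();
        if bytes.len() < 3 {
            // Too short for a trigram lookup: fall back to scanning every name.
            return (0..self.names.len())
                .filter(|&id| self.names[id].starts_with(prefix.as_str()))
                .collect();
        }
        let key = [bytes[0], bytes[1], bytes[2]];
        match self.postings.get(&key) {
            Some(ids) => ids
                .iter()
                .copied()
                .filter(|&id| self.names[id].starts_with(prefix.as_str()))
                .collect(),
            None => Vec::new(),
        }
    }
}
```

The posting list for `win` also contains names such as `twin.txt`, which is why the full-prefix filter is still required after the lookup. The gain comes from filtering a short posting list instead of every record.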

…d.rs size exception

backend.rs was 1067 LOC and carried a PERMANENT file_size_exceptions entry. Per workspace policy (decompose, don't suppress), move the self-contained DisplayRow type (struct + inherent impl + Default + uffs_format::FormatRow impl) into a new sibling module display_row.rs (289 LOC). backend.rs drops to 784 LOC, under the 800 ceiling. DisplayRow is re-exported (`pub use super::display_row::DisplayRow;`) so the single-import convention downstream relies on (uffs_core::search::backend::DisplayRow) is unchanged: public API and behavior are preserved. Removes the backend.rs entry from scripts/ci/file_size_exceptions.txt.

Public-facing, fact-only benchmark snapshot of the verified-fresh cross-tool run (UFFS v0.5.120 vs Everything 1.4.1.1032) on C: + D: (7.97M records, Ryzen 9 3900XT / Win11 24H2). States results only, not methodology, linking docs/benchmarks/methodology.md for the fairness doctrine. Headline: UFFS wins 17/18 targeted head-to-head cells at p50 (median ~0.52x, ~1.9x faster); the 18th (C: prefix) is a 1ms tie. Mirrors the structure of the 2026-04 v0.5.66 report. REUSE: covered by the repo-wide ** -> MPL-2.0 annotation in REUSE.toml.

The golden cpp_*.txt baseline is immutable across reruns. Hashing a multi-GB file on every invocation wastes seconds for no benefit. Add compute_streaming_stats_cached: writes a .parityhash sidecar keyed on (size_bytes, mtime_nanos); subsequent runs skip the SHA256 pass entirely if the file hasn't changed. Falls back to a full recompute if the sidecar is absent, stale, or unreadable. Also annotates the baseline hash line with ', golden cached' so the operator can confirm the fast-path engaged.
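
The sidecar caching idea can be sketched as below. This is a hedged illustration only: `digest_cached`, `file_key`, `sidecar_path`, and the two-line sidecar format are hypothetical, and the digest itself is supplied by the caller (the real code runs a streaming SHA256 pass, which the standard library does not provide).

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::UNIX_EPOCH;

/// Cache key: (size in bytes, mtime in nanoseconds since the Unix epoch).
fn file_key(path: &Path) -> io::Result<(u64, u128)> {
    let meta = fs::metadata(path)?;
    let mtime_nanos = meta
        .modified()?
        .duration_since(UNIX_EPOCH)
        .map_err(|e| io::Error::new(io::ErrorKind::Other, e))?
        .as_nanos();
    Ok((meta.len(), mtime_nanos))
}

/// Hypothetical sidecar location: the original file name with `.parityhash` appended.
fn sidecar_path(path: &Path) -> PathBuf {
    let mut name = path.as_os_str().to_owned();
    name.push(".parityhash");
    PathBuf::from(name)
}

/// Returns (digest, served_from_cache). `compute` stands in for the expensive hash pass.
fn digest_cached<F>(path: &Path, compute: F) -> io::Result<(String, bool)>
where
    F: FnOnce(&Path) -> io::Result<String>,
{
    let (size, mtime) = file_key(path)?;
    let header = format!("{size} {mtime}");
    let sidecar = sidecar_path(path);

    // Fast path: the sidecar exists, is readable, and its key matches the file.
    if let Ok(text) = fs::read_to_string(&sidecar) {
        let mut lines = text.lines();
        if lines.next() == Some(header.as_str()) {
            if let Some(digest) = lines.next() {
                return Ok((digest.to_owned(), true));
            }
        }
    }

    // Absent, stale, or unreadable sidecar: recompute and refresh it (best effort).
    let digest = compute(path)?;
    let _ = fs::write(&sidecar, format!("{header}\n{digest}\n"));
    Ok((digest, false))
}
```

The returned boolean plays the role of the ', golden cached' annotation: it lets the caller report whether the fast path engaged.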

githubrobbi added 4 commits June 9, 2026 09:50

githubrobbi enabled auto-merge (squash) June 9, 2026 18:44

githubrobbi merged commit bb0bd94 into main Jun 9, 2026
27 checks passed

githubrobbi deleted the perf/master-speed-20260609 branch June 9, 2026 19:04

githubrobbi merged 5 commits into main from perf/master-speed-20260609
