Conversation
…king when knowledge_base_url set (#150)

Adds a `result_grouping` parameter to the search and preflight actions:

- `'merged'` (default when KB unset): pure BM25 score order (no behavior change)
- `'overlay_first'` (default when KB set): overlay docs ranked above baseline
- `'grouped'`: separate `overlay_hits`/`baseline_hits` arrays in the response

Changes:

- orchestrate.ts: `ResultGrouping` type, `partitionBySource` utility, `runSearch` and `runPreflight` accept `resolvedGrouping`, conditional default in the dispatcher
- index.ts: `result_grouping` in the unified oddkit + oddkit_search + oddkit_preflight tool schemas, threaded through handler args
- telemetry.ts: `parseToolCall` extracts `result_grouping`; blob9 repurposed
- CHANGELOG.md: entry under [Unreleased]
- workers/test/result-grouping.test.mjs: 24 tests covering partition logic, conditional defaults, grouped shape, preflight partition, and telemetry recording
- workers/test/telemetry-integration.test.mjs: updated blob count assertion

No version bump (orchestrator decides packaging cadence). No changes to workers/src/bm25.ts. No cache-key change.
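The conditional default described above can be sketched as follows. This is illustrative, not the exact source: `resolveGrouping` is a hypothetical helper name for the inline expression the dispatcher uses.

```typescript
type ResultGrouping = "merged" | "overlay_first" | "grouped";

// Hypothetical helper mirroring the inline default in handleUnifiedAction:
// an explicit caller value always wins; otherwise the default depends on
// whether a knowledge base URL is present.
function resolveGrouping(
  requested: ResultGrouping | undefined,
  knowledgeBaseUrl: string | undefined,
): ResultGrouping {
  return requested ?? (knowledgeBaseUrl ? "overlay_first" : "merged");
}

console.log(resolveGrouping(undefined, undefined));                // "merged"
console.log(resolveGrouping(undefined, "https://github.com/x/y")); // "overlay_first"
console.log(resolveGrouping("merged", "https://github.com/x/y"));  // "merged"
```

An explicit `"merged"` therefore still gets pure score order even when a knowledge base is configured.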
Deploying with

| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
|---|---|---|---|---|
| ✅ Deployment successful! View logs | oddkit | 885fcc9 | Commit Preview URL · Branch Preview URL | Apr 28 2026, 08:18 PM |
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Grouped search NO_MATCH response omits grouped arrays
- Updated the NO_MATCH early return in runSearch to include empty overlay_hits and baseline_hits arrays when resolvedGrouping is "grouped", matching the FOUND path's response shape.
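The shape alignment the fix makes can be sketched as follows. This is illustrative: `buildNoMatch` is a hypothetical name, but the field names (`status`, `docs_considered`, `hits`, `overlay_hits`, `baseline_hits`) come from the diff.

```typescript
type ResultGrouping = "merged" | "overlay_first" | "grouped";

// Illustrative: build the NO_MATCH result, adding empty grouped arrays only
// when the resolved grouping is "grouped", so the response shape matches
// what the FOUND path returns for the same grouping.
function buildNoMatch(docsConsidered: number, grouping: ResultGrouping) {
  const result: Record<string, unknown> = {
    status: "NO_MATCH",
    docs_considered: docsConsidered,
    hits: [],
  };
  if (grouping === "grouped") {
    result.overlay_hits = [];
    result.baseline_hits = [];
  }
  return result;
}
```

Without this, a caller requesting `"grouped"` would see `overlay_hits`/`baseline_hits` only on hits, never on misses.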
Preview (63eb8e01fa)
diff --git a/CHANGELOG.md b/CHANGELOG.md
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,10 @@
## [Unreleased]
+### Added
+
+- **`result_grouping` parameter for search and preflight** — when `knowledge_base_url` is set, overlay (knowledge-base) docs are ranked above baseline docs by default (`"overlay_first"`). Callers can explicitly choose `"merged"` (pure BM25 score order, the previous default), `"overlay_first"` (overlay before baseline, preserving score order within each partition), or `"grouped"` (separate `overlay_hits`/`baseline_hits` arrays in search, `start_here_overlay`/`start_here_baseline` in preflight). Conditional default: `knowledge_base_url` unset → `"merged"` (no behavior change); `knowledge_base_url` set → `"overlay_first"`. Telemetry records the caller-specified value in blob9 (`result_grouping`). Closes #150.
+
## [0.26.0] - 2026-04-26
### Added
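The three orderings the changelog entry describes can be sketched with a stable partition. The `partitionBySource` body mirrors the utility added in orchestrate.ts below; the hit data is made up for illustration.

```typescript
type Hit = { path: string; source: "canon" | "baseline"; score: number };

// Stable partition: one forward pass, BM25 score order preserved within
// each side (no re-sort).
function partitionBySource(hits: Hit[]): { overlay: Hit[]; baseline: Hit[] } {
  const overlay: Hit[] = [];
  const baseline: Hit[] = [];
  for (const h of hits) (h.source === "canon" ? overlay : baseline).push(h);
  return { overlay, baseline };
}

// Hits in pure BM25 score order, canon and baseline interleaved:
const merged: Hit[] = [
  { path: "canon/a.md", source: "canon", score: 10 },
  { path: "docs/b.md", source: "baseline", score: 8 },
  { path: "canon/c.md", source: "canon", score: 6 },
];

const { overlay, baseline } = partitionBySource(merged);
const overlayFirst = [...overlay, ...baseline];
// overlayFirst paths: canon/a.md, canon/c.md, docs/b.md
```

`"merged"` returns `merged` untouched, `"overlay_first"` returns `overlayFirst`, and `"grouped"` exposes `overlay` and `baseline` as separate arrays.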
diff --git a/workers/src/index.ts b/workers/src/index.ts
--- a/workers/src/index.ts
+++ b/workers/src/index.ts
@@ -220,6 +220,7 @@
"canon-tier-2", "canon-tier-1", "published-essay",
]).optional().describe("Optional mode hint. Epistemic modes (exploration/planning/execution) or writing-lifecycle modes (voice-dump/drafting/peer-review-ready/canon-tier-2/canon-tier-1/published-essay). Sourced from odd/challenge/stakes-calibration."),
knowledge_base_url: z.string().optional().describe("Optional GitHub repo URL for your knowledge base. When set, strict mode is automatic: missing files fall through to the bundled governance tier rather than silently substituting from the default knowledge base."),
+ result_grouping: z.enum(["merged", "overlay_first", "grouped"]).optional().describe("For action='search' or 'preflight': controls how overlay (knowledge_base) and baseline results are ordered. 'merged' = pure score order (default when knowledge_base_url unset). 'overlay_first' = overlay docs ranked above baseline (default when knowledge_base_url set). 'grouped' = separate overlay_hits/baseline_hits arrays in response."),
include_metadata: z.boolean().optional().describe("When true, search/get responses include a metadata object with full parsed frontmatter. Default: false."),
section: z.string().optional().describe("For action='get': extract only the named ## section from the document. Returns section content or available sections if not found."),
sort_by: z.enum(["date", "path"]).optional().describe("For action='catalog': sort articles. 'date' returns newest first (requires frontmatter). 'path' returns all docs alphabetically, including undated."),
@@ -241,6 +242,7 @@
context: args.context,
mode: args.mode,
knowledge_base_url: args.knowledge_base_url,
+ result_grouping: args.result_grouping,
include_metadata: args.include_metadata,
section: args.section,
sort_by: args.sort_by,
@@ -321,6 +323,7 @@
schema: {
input: z.string().describe("Natural language query or tags to search for."),
knowledge_base_url: z.string().optional().describe("Optional: GitHub repo URL for your knowledge base. When set, strict mode is automatic: missing files fall through to the bundled governance tier."),
+ result_grouping: z.enum(["merged", "overlay_first", "grouped"]).optional().describe("Controls how overlay (knowledge_base) and baseline results are ordered. 'merged' = pure score order (default when knowledge_base_url unset). 'overlay_first' = overlay docs ranked above baseline (default when knowledge_base_url set). 'grouped' = separate overlay_hits/baseline_hits arrays in response."),
include_metadata: z.boolean().optional().describe("When true, each hit includes a metadata object with full parsed frontmatter. Default: false."),
},
annotations: { readOnlyHint: true, destructiveHint: false, idempotentHint: true, openWorldHint: true },
@@ -386,6 +389,7 @@
schema: {
input: z.string().describe("Description of what you're about to implement."),
knowledge_base_url: z.string().optional().describe("Optional: GitHub repo URL for your knowledge base. When set, strict mode is automatic: missing files fall through to the bundled governance tier."),
+ result_grouping: z.enum(["merged", "overlay_first", "grouped"]).optional().describe("Controls how overlay (knowledge_base) and baseline start_here results are ordered. 'merged' = pure score order (default when knowledge_base_url unset). 'overlay_first' = overlay docs ranked above baseline (default when knowledge_base_url set). 'grouped' = separate start_here_overlay/start_here_baseline arrays."),
},
annotations: { readOnlyHint: true, destructiveHint: false, idempotentHint: true, openWorldHint: true },
},
@@ -432,6 +436,7 @@
context: args.context as string | undefined,
mode: args.mode as string | undefined,
knowledge_base_url: args.knowledge_base_url as string | undefined,
+ result_grouping: args.result_grouping as "merged" | "overlay_first" | "grouped" | undefined,
include_metadata: args.include_metadata as boolean | undefined,
section: args.section as string | undefined,
sort_by: args.sort_by as string | undefined,
diff --git a/workers/src/orchestrate.ts b/workers/src/orchestrate.ts
--- a/workers/src/orchestrate.ts
+++ b/workers/src/orchestrate.ts
@@ -225,12 +225,15 @@
let cachedGatePrerequisitesKnowledgeBaseUrl: string | undefined = undefined;
let cachedGatePrerequisitesSource: "knowledge_base" | "minimal" = "minimal";
+export type ResultGrouping = "merged" | "overlay_first" | "grouped";
+
export interface UnifiedParams {
action: string;
input: string;
context?: string;
mode?: string;
knowledge_base_url?: string;
+ result_grouping?: ResultGrouping;
include_metadata?: boolean;
section?: string;
sort_by?: string;
@@ -1321,6 +1324,20 @@
}
// ──────────────────────────────────────────────────────────────────────────────
+// Result grouping — stable partition by source (overlay=canon vs baseline)
+// Preserves BM25 score order within each partition. Single forward pass — no re-sort.
+// ──────────────────────────────────────────────────────────────────────────────
+
+export function partitionBySource<T extends { source: "canon" | "baseline" }>(
+ arr: T[],
+): { overlay: T[]; baseline: T[] } {
+ const overlay: T[] = [];
+ const baseline: T[] = [];
+ for (const h of arr) (h.source === "canon" ? overlay : baseline).push(h);
+ return { overlay, baseline };
+}
+
+// ──────────────────────────────────────────────────────────────────────────────
// Individual action handlers
// ──────────────────────────────────────────────────────────────────────────────
@@ -1330,6 +1347,7 @@
knowledgeBaseUrl?: string,
state?: OddkitState,
includeMetadata?: boolean,
+ resolvedGrouping: ResultGrouping = "merged",
): Promise<ActionResult> {
const startMs = Date.now();
const index = await fetcher.getIndex(knowledgeBaseUrl);
@@ -1354,13 +1372,18 @@
: undefined;
if (hits.length === 0) {
+ const noMatchResult: Record<string, unknown> = {
+ status: "NO_MATCH",
+ docs_considered: index.entries.length,
+ hits: [],
+ };
+ if (resolvedGrouping === "grouped") {
+ noMatchResult.overlay_hits = [];
+ noMatchResult.baseline_hits = [];
+ }
return {
action: "search",
- result: {
- status: "NO_MATCH",
- docs_considered: index.entries.length,
- hits: [],
- },
+ result: noMatchResult,
state: updatedState,
assistant_text: `Searched ${index.stats.total} documents but found no matches for "${input}". Try rephrasing or ask with action "catalog" to see available documentation.`,
debug: {
@@ -1373,12 +1396,22 @@
};
}
+ // Apply result_grouping partition (overlay_first / grouped re-order;
+ // merged preserves BM25 score order). Single forward pass — no re-sort.
+ let orderedHits = hits;
+ let isGrouped = false;
+ if (resolvedGrouping === "overlay_first" || resolvedGrouping === "grouped") {
+ const { overlay, baseline } = partitionBySource(hits);
+ orderedHits = [...overlay, ...baseline];
+ isGrouped = resolvedGrouping === "grouped";
+ }
+
// Cache for fetched content to avoid redundant fetches when include_metadata is enabled
const contentCache = new Map<string, string>();
- // Fetch excerpts for top results
+ // Fetch excerpts for top results (uses partitioned order)
const evidence: Array<{ quote: string; citation: string; source: string }> = [];
- for (const entry of hits.slice(0, 3)) {
+ for (const entry of orderedHits.slice(0, 3)) {
const content = await fetcher.getFile(entry.path, knowledgeBaseUrl);
if (content) {
contentCache.set(entry.path, content);
@@ -1394,17 +1427,18 @@
}
const assistantLines = [
- `Found ${hits.length} result(s) for: "${input}"`,
+ `Found ${orderedHits.length} result(s) for: "${input}"`,
"",
...evidence.map((e) => `> ${e.quote}\n— ${e.citation} (${e.source})`),
"",
"Results:",
- ...hits.map((r) => `- \`${r.path}\` — ${r.title} (score: ${r.score.toFixed(2)}, ${r.source})`),
+ ...orderedHits.map((r) => `- \`${r.path}\` — ${r.title} (score: ${r.score.toFixed(2)}, ${r.source})`),
];
- // When include_metadata is requested, fetch and parse frontmatter for each hit
+ // When include_metadata is requested, fetch and parse frontmatter for each hit.
+ // Iterates orderedHits so metadata-enriched array preserves the partitioned order.
const hitsWithMetadata: Array<Record<string, unknown>> = [];
- for (const h of hits) {
+ for (const h of orderedHits) {
const hit: Record<string, unknown> = {
uri: h.uri,
path: h.path,
@@ -1425,14 +1459,27 @@
hitsWithMetadata.push(hit);
}
+ // Build result object — add overlay_hits / baseline_hits for "grouped" mode
+ const resultObj: Record<string, unknown> = {
+ status: "FOUND",
+ hits: hitsWithMetadata,
+ evidence,
+ docs_considered: index.entries.length,
+ };
+
+ if (isGrouped) {
+ const overlayHits: Record<string, unknown>[] = [];
+ const baselineHits: Record<string, unknown>[] = [];
+ for (const h of hitsWithMetadata) {
+ (h.source === "canon" ? overlayHits : baselineHits).push(h);
+ }
+ resultObj.overlay_hits = overlayHits;
+ resultObj.baseline_hits = baselineHits;
+ }
+
return {
action: "search",
- result: {
- status: "FOUND",
- hits: hitsWithMetadata,
- evidence,
- docs_considered: index.entries.length,
- },
+ result: resultObj,
state: updatedState,
assistant_text: assistantLines.join("\n").trim(),
debug: {
@@ -2327,22 +2374,32 @@
fetcher: KnowledgeBaseFetcher,
knowledgeBaseUrl?: string,
state?: OddkitState,
+ resolvedGrouping: ResultGrouping = "merged",
): Promise<ActionResult> {
const startMs = Date.now();
const index = await fetcher.getIndex(knowledgeBaseUrl);
const topic = message.replace(/^preflight:\s*/i, "").trim();
- const results = scoreEntries(index.entries, topic).slice(0, 5);
+ // Score all entries, then apply partition before slicing
+ const allScored = scoreEntries(index.entries, topic);
+ let orderedScored = allScored;
+ if (resolvedGrouping === "overlay_first" || resolvedGrouping === "grouped") {
+ const { overlay, baseline } = partitionBySource(allScored);
+ orderedScored = [...overlay, ...baseline];
+ }
+ const results = orderedScored.slice(0, 5);
+
const dodEntry = index.entries.find((e) => e.path.toLowerCase().includes("definition-of-done"));
const constraints = index.entries
.filter((e) => e.path.includes("constraint") || e.authority_band === "governing")
.slice(0, 3);
+ const startHere = results.slice(0, 3);
const assistantText = [
`Preflight: ${topic}`,
``,
`Start here:`,
- ...results.slice(0, 3).map((r) => `- \`${r.path}\` — ${r.title}`),
+ ...startHere.map((r) => `- \`${r.path}\` — ${r.title}`),
``,
`Definition of Done:`,
dodEntry ? `- \`${dodEntry.path}\`` : "- Check canon/definition-of-done.md",
@@ -2358,15 +2415,25 @@
.join("\n")
.trim();
+ // Build result object
+ const resultObj: Record<string, unknown> = {
+ topic,
+ start_here: startHere.map((r) => r.path),
+ dod: dodEntry?.path,
+ constraints: constraints.map((c) => c.path),
+ docs_available: index.stats.total,
+ };
+
+ // For "grouped" mode, split start_here into overlay and baseline arrays (each capped at 3)
+ if (resolvedGrouping === "grouped") {
+ const { overlay, baseline } = partitionBySource(allScored);
+ resultObj.start_here_overlay = overlay.slice(0, 3).map((r) => r.path);
+ resultObj.start_here_baseline = baseline.slice(0, 3).map((r) => r.path);
+ }
+
return {
action: "preflight",
- result: {
- topic,
- start_here: results.slice(0, 3).map((r) => r.path),
- dod: dodEntry?.path,
- constraints: constraints.map((c) => c.path),
- docs_available: index.stats.total,
- },
+ result: resultObj,
state: state ? initState(state) : undefined,
assistant_text: assistantText,
debug: {
@@ -3251,8 +3318,14 @@
] as const;
export async function handleUnifiedAction(params: UnifiedParams): Promise<OddkitEnvelope> {
- const { action, input, context, mode, knowledge_base_url, include_metadata, section, sort_by, limit, offset, filter_epoch, state, env, tracer } = params;
+ const { action, input, context, mode, knowledge_base_url, result_grouping, include_metadata, section, sort_by, limit, offset, filter_epoch, state, env, tracer } = params;
+ // Conditional default: when knowledge_base_url is set and caller didn't
+ // specify result_grouping, default to "overlay_first" (the fix for #150).
+ // When KB is unset, default to "merged" (no behavior change).
+ const resolvedGrouping: ResultGrouping =
+ result_grouping ?? (knowledge_base_url ? "overlay_first" : "merged");
+
if (!VALID_ACTIONS.includes(action as (typeof VALID_ACTIONS)[number])) {
return {
action: "error",
@@ -3283,7 +3356,7 @@
result = await runEncodeAction(input, context, fetcher, knowledge_base_url, state);
break;
case "search":
- result = await runSearch(input, fetcher, knowledge_base_url, state, include_metadata);
+ result = await runSearch(input, fetcher, knowledge_base_url, state, include_metadata, resolvedGrouping);
break;
case "get":
result = await runGet(input, fetcher, knowledge_base_url, state, include_metadata, section);
@@ -3301,7 +3374,7 @@
result = await runValidate(input, state);
break;
case "preflight":
- result = await runPreflight(input, fetcher, knowledge_base_url, state);
+ result = await runPreflight(input, fetcher, knowledge_base_url, state, resolvedGrouping);
break;
case "version":
result = runVersion(env);
@@ -3310,7 +3383,7 @@
result = await runCleanupStorage(fetcher, knowledge_base_url);
break;
default:
- result = await runSearch(input, fetcher, knowledge_base_url, state);
+ result = await runSearch(input, fetcher, knowledge_base_url, state, undefined, resolvedGrouping);
}
// Inject trace into debug envelope (E0008.1)
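The preflight change above can be summarized with a small sketch. This is illustrative: `groupedStartHere` is a hypothetical name, but the `start_here_overlay`/`start_here_baseline` fields and the cap of 3 per side come from the diff.

```typescript
type Scored = { path: string; source: "canon" | "baseline"; score: number };

// Stable one-pass partition, as in orchestrate.ts.
function partitionBySource(arr: Scored[]): { overlay: Scored[]; baseline: Scored[] } {
  const overlay: Scored[] = [];
  const baseline: Scored[] = [];
  for (const h of arr) (h.source === "canon" ? overlay : baseline).push(h);
  return { overlay, baseline };
}

// For "grouped" preflight: partition the full scored list (not the
// already-sliced top 5), then cap each side at 3.
function groupedStartHere(allScored: Scored[]) {
  const { overlay, baseline } = partitionBySource(allScored);
  return {
    start_here_overlay: overlay.slice(0, 3).map((r) => r.path),
    start_here_baseline: baseline.slice(0, 3).map((r) => r.path),
  };
}
```

Partitioning before slicing matters: slicing first could drop every overlay doc when baseline docs dominate the top 5.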
diff --git a/workers/src/telemetry.ts b/workers/src/telemetry.ts
--- a/workers/src/telemetry.ts
+++ b/workers/src/telemetry.ts
@@ -185,6 +185,7 @@
toolName: string;
documentUri: string;
knowledgeBaseUrl: string;
+ resultGrouping: string;
} | null {
if (typeof payload !== "object" || payload === null || !("method" in payload)) {
return null;
@@ -197,7 +198,7 @@
const params = msg.params;
if (typeof params !== "object" || params === null) {
- return { method, toolName: "", documentUri: "", knowledgeBaseUrl: "" };
+ return { method, toolName: "", documentUri: "", knowledgeBaseUrl: "", resultGrouping: "" };
}
const p = params as Record<string, unknown>;
@@ -206,6 +207,7 @@
// Extract details from tool arguments
let documentUri = "";
let knowledgeBaseUrl = "";
+ let resultGrouping = "";
const args = p.arguments;
if (typeof args === "object" && args !== null) {
const a = args as Record<string, unknown>;
@@ -217,9 +219,13 @@
if (typeof a.knowledge_base_url === "string" && a.knowledge_base_url) {
knowledgeBaseUrl = a.knowledge_base_url;
}
+ // Extract result_grouping from tool arguments (#150)
+ if (typeof a.result_grouping === "string" && a.result_grouping) {
+ resultGrouping = a.result_grouping;
+ }
}
- return { method, toolName, documentUri, knowledgeBaseUrl };
+ return { method, toolName, documentUri, knowledgeBaseUrl, resultGrouping };
}
// ──────────────────────────────────────────────────────────────────────────────
@@ -297,9 +303,10 @@
toolCall?.knowledgeBaseUrl || env.DEFAULT_KNOWLEDGE_BASE_URL || "",
documentUri,
env.ODDKIT_VERSION || BUILD_VERSION,
- // blob9 retired (was cache_tier). Slot stays free per the
- // "no deprecation, nobody uses them yet" rule. Cache effectiveness
- // moved to double7/double8.
+ // blob9: result_grouping (#150). Was retired (cache_tier).
+ // Repurposed for the caller-specified grouping value; empty string
+ // when not applicable (non-search/preflight actions).
+ toolCall?.resultGrouping ?? "",
],
doubles: [
1, // double1: count
@@ -354,9 +361,7 @@
"knowledge_base_url", // blob6
"document_uri", // blob7
"worker_version", // blob8
- // blob9 retired (was cache_tier). Slot stays free per the
- // "no deprecation, nobody uses them yet" rule. Hit-rate moved to
- // double7/double8.
+ "result_grouping", // blob9 — repurposed from retired cache_tier (#150)
] as const;
/**
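The blob9 extraction in the telemetry diff above can be sketched as follows. This is illustrative: `extractResultGrouping` is a hypothetical name for the guard logic `parseToolCall` applies to tool-call arguments.

```typescript
// Illustrative: pull result_grouping out of tool-call arguments, falling
// back to "" when the arguments are missing, not an object, or the field
// is absent or empty (e.g. non-search/preflight actions).
function extractResultGrouping(args: unknown): string {
  if (typeof args !== "object" || args === null) return "";
  const a = args as Record<string, unknown>;
  return typeof a.result_grouping === "string" && a.result_grouping
    ? a.result_grouping
    : "";
}
```

The empty-string fallback keeps blob9 well-defined for every recorded call rather than only for search/preflight.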
diff --git a/workers/test/result-grouping.test.mjs b/workers/test/result-grouping.test.mjs
new file mode 100644
--- /dev/null
+++ b/workers/test/result-grouping.test.mjs
@@ -1,0 +1,533 @@
+#!/usr/bin/env node
+/**
+ * Unit + integration tests for the result_grouping feature (#150).
+ *
+ * Tests:
+ * - partitionBySource: stable partition, edge cases, ordering guarantees
+ * - Conditional default: KB set → overlay_first, KB unset → merged
+ * - Grouped shape construction: overlay_hits / baseline_hits arrays
+ * - Preflight partition: start_here_overlay / start_here_baseline
+ * - Telemetry: blob9 carries result_grouping value
+ *
+ * Compiles orchestrate.ts + telemetry.ts via tsc into a temp dir, then
+ * dynamic-imports the compiled .js. Same pattern as tokenize.test.mjs
+ * and telemetry-integration.test.mjs.
+ */
+
+import assert from "node:assert/strict";
+import { spawnSync } from "node:child_process";
+import { mkdtempSync, writeFileSync, symlinkSync, existsSync, readdirSync, readFileSync } from "node:fs";
+import { tmpdir } from "node:os";
+import { join, dirname } from "node:path";
+import { fileURLToPath } from "node:url";
+
+const __dirname = dirname(fileURLToPath(import.meta.url));
+const WORKERS_ROOT = join(__dirname, "..");
+
+// ─── Compile orchestrate.ts + telemetry.ts to temp dir ────────────────────
+
+const tmp = mkdtempSync(join(tmpdir(), "oddkit-result-grouping-test-"));
+const tsconfig = {
+ compilerOptions: {
+ target: "ES2022",
+ module: "ES2022",
+ moduleResolution: "bundler",
+ lib: ["ES2022", "DOM"],
+ types: ["@cloudflare/workers-types"],
+ noEmitOnError: false,
+ strict: false,
+ skipLibCheck: true,
+ resolveJsonModule: true,
+ allowSyntheticDefaultImports: true,
+ esModuleInterop: true,
+ rootDir: join(WORKERS_ROOT, "src"),
+ outDir: join(tmp, "build"),
+ },
+ include: [
+ join(WORKERS_ROOT, "src", "orchestrate.ts"),
+ join(WORKERS_ROOT, "src", "telemetry.ts"),
+ join(WORKERS_ROOT, "src", "tracing.ts"),
+ join(WORKERS_ROOT, "src", "zip-baseline-fetcher.ts"),
+ join(WORKERS_ROOT, "src", "bm25.ts"),
+ join(WORKERS_ROOT, "src", "markdown-utils.ts"),
+ ],
+};
+const tsconfigPath = join(tmp, "tsconfig.json");
+writeFileSync(tsconfigPath, JSON.stringify(tsconfig, null, 2));
+
+const tmpNodeModules = join(tmp, "node_modules");
+if (!existsSync(tmpNodeModules)) {
+ symlinkSync(join(WORKERS_ROOT, "node_modules"), tmpNodeModules);
+}
+// orchestrate.ts imports ../package.json
+if (!existsSync(join(tmp, "package.json"))) {
+ symlinkSync(join(WORKERS_ROOT, "package.json"), join(tmp, "package.json"));
+}
+
+const compile = spawnSync("npx", ["--yes", "tsc", "-p", tsconfigPath], {
+ encoding: "utf8",
+});
+
+// With noEmitOnError: false, tsc may exit non-zero on type errors in the dep
+// graph (zip-baseline-fetcher.ts has workers-types friction) while still
+// producing the .js files we need. Only bail if target files weren't emitted.
+const buildDir = join(tmp, "build");
+const orchestrateJs = join(buildDir, "orchestrate.js");
+const telemetryJs = join(buildDir, "telemetry.js");
+if (!existsSync(orchestrateJs) || !existsSync(telemetryJs)) {
+ console.error("TypeScript compile failed (target files not emitted):");
+ console.error(compile.stdout);
+ console.error(compile.stderr);
+ process.exit(1);
+}
+if (compile.status !== 0 && process.env.DEBUG) {
+ console.error("Note: tsc reported errors but target .js files were emitted:");
+ console.error(compile.stdout);
+}
+
+// Patch compiled files: JSON import assertions + extensionless local imports
+for (const f of readdirSync(buildDir).filter((n) => n.endsWith(".js"))) {
+ const fpath = join(buildDir, f);
+ let src = readFileSync(fpath, "utf8");
+ src = src.replace(
+ /from ["']\.\.\/package\.json["'];/g,
+ 'from "../package.json" with { type: "json" };',
+ );
+ src = src.replace(
+ /from ["'](\.\/[^"'.]+)["'];/g,
+ 'from "$1.js";',
+ );
+ writeFileSync(fpath, src);
+}
+
+// Import the compiled module
+const { partitionBySource } = await import(orchestrateJs);
+const { recordTelemetry, parseToolCall } = await import(telemetryJs);
+
+// Also import tokenize for telemetry shape tests
+const tokenizeJs = join(buildDir, "tokenize.js");
+let measurePayloadShape = null;
+if (existsSync(tokenizeJs)) {
+ const tok = await import(tokenizeJs);
+ measurePayloadShape = tok.measurePayloadShape;
+}
+
+// ─── Test harness ─────────────────────────────────────────────────────────
+
+let pass = 0;
+let fail = 0;
+
+async function test(name, fn) {
+ try {
+ await fn();
+ console.log(` ✓ ${name}`);
+ pass++;
+ } catch (err) {
+ console.log(` ✗ ${name}`);
+ console.log(` ${err.message}`);
+ if (err.stack && process.env.DEBUG) console.log(err.stack);
+ fail++;
+ }
+}
+
+console.log("result-grouping tests (#150)\n");
+
+// ─── Test fixtures ────────────────────────────────────────────────────────
+
+// Fixture: mixed entries with interleaving scores
+// canon entries have scores 10, 6 (ranked 1st, 3rd in BM25 order)
+// baseline entries have scores 8, 4 (ranked 2nd, 4th in BM25 order)
+// This ensures partition actually reorders — a fixture where canon always
+// outscores baseline would prove nothing.
+const mixedHits = [
+ { path: "canon/a.md", title: "Canon A", source: "canon", score: 10 },
+ { path: "docs/b.md", title: "Baseline B", source: "baseline", score: 8 },
+ { path: "canon/c.md", title: "Canon C", source: "canon", score: 6 },
+ { path: "docs/d.md", title: "Baseline D", source: "baseline", score: 4 },
+];
+
+const canonOnly = [
+ { path: "canon/x.md", title: "Canon X", source: "canon", score: 10 },
+ { path: "canon/y.md", title: "Canon Y", source: "canon", score: 5 },
+];
+
+const baselineOnly = [
+ { path: "docs/m.md", title: "Baseline M", source: "baseline", score: 9 },
+ { path: "docs/n.md", title: "Baseline N", source: "baseline", score: 3 },
+];
+
+// ─── partitionBySource tests ──────────────────────────────────────────────
+
+console.log("partitionBySource:");
+
+await test("splits mixed entries into overlay (canon) and baseline", () => {
+ const { overlay, baseline } = partitionBySource(mixedHits);
+ assert.equal(overlay.length, 2, "should have 2 overlay entries");
+ assert.equal(baseline.length, 2, "should have 2 baseline entries");
+ assert.ok(overlay.every((h) => h.source === "canon"), "all overlay should be canon");
+ assert.ok(baseline.every((h) => h.source === "baseline"), "all baseline should be baseline");
+});
+
+await test("preserves BM25 score order within each partition (stability)", () => {
+ const { overlay, baseline } = partitionBySource(mixedHits);
+ // overlay: canon/a (10) then canon/c (6)
+ assert.equal(overlay[0].path, "canon/a.md");
+ assert.equal(overlay[1].path, "canon/c.md");
+ assert.ok(overlay[0].score >= overlay[1].score, "overlay should be descending score");
+ // baseline: docs/b (8) then docs/d (4)
+ assert.equal(baseline[0].path, "docs/b.md");
+ assert.equal(baseline[1].path, "docs/d.md");
+ assert.ok(baseline[0].score >= baseline[1].score, "baseline should be descending score");
+});
+
+await test("overlay_first reorder: all canon before all baseline", () => {
+ const { overlay, baseline } = partitionBySource(mixedHits);
+ const ordered = [...overlay, ...baseline];
+ // Expected: canon/a(10), canon/c(6), docs/b(8), docs/d(4)
+ assert.equal(ordered[0].source, "canon");
+ assert.equal(ordered[1].source, "canon");
+ assert.equal(ordered[2].source, "baseline");
+ assert.equal(ordered[3].source, "baseline");
+ // Scores within tiers are descending
+ assert.ok(ordered[0].score >= ordered[1].score);
+ assert.ok(ordered[2].score >= ordered[3].score);
+});
+
+await test("canon-only input: overlay = all, baseline = empty", () => {
+ const { overlay, baseline } = partitionBySource(canonOnly);
+ assert.equal(overlay.length, 2);
+ assert.equal(baseline.length, 0);
+});
+
+await test("baseline-only input: overlay = empty, baseline = all", () => {
+ const { overlay, baseline } = partitionBySource(baselineOnly);
+ assert.equal(overlay.length, 0);
+ assert.equal(baseline.length, 2);
+});
+
+await test("empty array: both partitions empty", () => {
+ const { overlay, baseline } = partitionBySource([]);
+ assert.equal(overlay.length, 0);
+ assert.equal(baseline.length, 0);
+});
+
+await test("stability: entries with identical scores retain pre-partition relative order", () => {
+ const sameScore = [
+ { path: "canon/first.md", source: "canon", score: 5 },
+ { path: "docs/between.md", source: "baseline", score: 5 },
+ { path: "canon/second.md", source: "canon", score: 5 },
+ { path: "docs/last.md", source: "baseline", score: 5 },
+ ];
+ const { overlay, baseline } = partitionBySource(sameScore);
+ // Within canon: first then second (insertion order preserved)
+ assert.equal(overlay[0].path, "canon/first.md");
+ assert.equal(overlay[1].path, "canon/second.md");
+ // Within baseline: between then last (insertion order preserved)
+ assert.equal(baseline[0].path, "docs/between.md");
+ assert.equal(baseline[1].path, "docs/last.md");
+});
+
+// ─── Conditional default logic tests ──────────────────────────────────────
+
+console.log("\nconditional default:");
+
+await test("KB unset → default is merged", () => {
+ const knowledge_base_url = undefined;
+ const result_grouping = undefined;
+ const resolved = result_grouping ?? (knowledge_base_url ? "overlay_first" : "merged");
+ assert.equal(resolved, "merged");
+});
+
+await test("KB set → default is overlay_first", () => {
+ const knowledge_base_url = "https://github.com/klappy/klappy.dev";
+ const result_grouping = undefined;
+ const resolved = result_grouping ?? (knowledge_base_url ? "overlay_first" : "merged");
+ assert.equal(resolved, "overlay_first");
+});
+
+await test("explicit merged overrides KB-set default", () => {
+ const knowledge_base_url = "https://github.com/klappy/klappy.dev";
+ const result_grouping = "merged";
+ const resolved = result_grouping ?? (knowledge_base_url ? "overlay_first" : "merged");
+ assert.equal(resolved, "merged");
+});
+
+await test("explicit overlay_first works with KB unset", () => {
+ const knowledge_base_url = undefined;
+ const result_grouping = "overlay_first";
+ const resolved = result_grouping ?? (knowledge_base_url ? "overlay_first" : "merged");
+ assert.equal(resolved, "overlay_first");
+});
+
+await test("explicit grouped works regardless of KB", () => {
+ for (const kb of [undefined, "https://github.com/klappy/klappy.dev"]) {
+ const result_grouping = "grouped";
+ const resolved = result_grouping ?? (kb ? "overlay_first" : "merged");
+ assert.equal(resolved, "grouped", `should be grouped when kb=${kb}`);
+ }
+});
+
+// ─── Grouped shape construction tests ─────────────────────────────────────
+
+console.log("\ngrouped shape construction:");
+
+await test("grouped search: overlay_hits and baseline_hits arrays present and correct", () => {
+ // Simulate the grouped shape construction from runSearch
+ const orderedHits = (() => {
+ const { overlay, baseline } = partitionBySource(mixedHits);
+ return [...overlay, ...baseline];
+ })();
+
+ // Simulate metadata enrichment (adds uri field)
+ const hitsWithMetadata = orderedHits.map((h) => ({
+ uri: `klappy://${h.path.replace(".md", "")}`,
+ path: h.path,
+ title: h.title,
+ score: h.score,
+ source: h.source,
+ }));
+
+ // Build grouped shape
+ const overlayHits = [];
+ const baselineHits = [];
+ for (const h of hitsWithMetadata) {
+ (h.source === "canon" ? overlayHits : baselineHits).push(h);
+ }
+
+ // Assertions
+ assert.equal(overlayHits.length, 2, "overlay_hits should have 2 items");
+ assert.equal(baselineHits.length, 2, "baseline_hits should have 2 items");
+ assert.ok(overlayHits.every((h) => h.source === "canon"));
+ assert.ok(baselineHits.every((h) => h.source === "baseline"));
+
+ // hits (back-compat) is overlay-then-baseline
+ assert.equal(hitsWithMetadata[0].source, "canon");
+ assert.equal(hitsWithMetadata[1].source, "canon");
+ assert.equal(hitsWithMetadata[2].source, "baseline");
+ assert.equal(hitsWithMetadata[3].source, "baseline");
+});
+
+await test("grouped with empty overlay: overlay_hits=[], baseline_hits=[...]", () => {
+ const { overlay, baseline } = partitionBySource(baselineOnly);
+ assert.equal(overlay.length, 0);
+ assert.equal(baseline.length, 2);
+
+ const orderedHits = [...overlay, ...baseline];
+ assert.equal(orderedHits.length, 2);
+ assert.ok(orderedHits.every((h) => h.source === "baseline"));
+});
+
+await test("grouped with empty baseline: overlay_hits=[...], baseline_hits=[]", () => {
+ const { overlay, baseline } = partitionBySource(canonOnly);
+ assert.equal(overlay.length, 2);
+ assert.equal(baseline.length, 0);
+
+ const orderedHits = [...overlay, ...baseline];
+ assert.equal(orderedHits.length, 2);
+ assert.ok(orderedHits.every((h) => h.source === "canon"));
+});
+
+// ─── Preflight partition tests ────────────────────────────────────────────
+
+console.log("\npreflight partition:");
+
+await test("preflight overlay_first: partition applied before slice", () => {
+ // Simulate scoreEntries output with interleaving scores
+ const allScored = [
+ { path: "docs/high.md", source: "baseline", score: 20 },
+ { path: "canon/mid-high.md", source: "canon", score: 18 },
+ { path: "docs/mid.md", source: "baseline", score: 15 },
+ { path: "canon/mid-low.md", source: "canon", score: 12 },
+ { path: "docs/low.md", source: "baseline", score: 8 },
+ { path: "canon/lowest.md", source: "canon", score: 3 },
+ ];
+
+ // overlay_first: partition then slice(0, 5)
+ const { overlay, baseline } = partitionBySource(allScored);
+ const ordered = [...overlay, ...baseline];
+ const results = ordered.slice(0, 5);
+ const startHere = results.slice(0, 3).map((r) => r.path);
+
+ // First 3 results should be all canon (3 canon entries exist)
+ assert.equal(startHere[0], "canon/mid-high.md");
+ assert.equal(startHere[1], "canon/mid-low.md");
+ assert.equal(startHere[2], "canon/lowest.md");
+});
+
+await test("preflight grouped: start_here_overlay and start_here_baseline", () => {
+ const allScored = [
+ { path: "docs/high.md", source: "baseline", score: 20 },
+ { path: "canon/mid-high.md", source: "canon", score: 18 },
+ { path: "docs/mid.md", source: "baseline", score: 15 },
+ { path: "canon/mid-low.md", source: "canon", score: 12 },
+ ];
+
+ const { overlay, baseline } = partitionBySource(allScored);
+ const startHereOverlay = overlay.slice(0, 3).map((r) => r.path);
+ const startHereBaseline = baseline.slice(0, 3).map((r) => r.path);
+
+ assert.deepEqual(startHereOverlay, ["canon/mid-high.md", "canon/mid-low.md"]);
+ assert.deepEqual(startHereBaseline, ["docs/high.md", "docs/mid.md"]);
+});
+
+await test("preflight merged: no partition applied (pure score order)", () => {
+ const allScored = [
+ { path: "docs/high.md", source: "baseline", score: 20 },
+ { path: "canon/mid-high.md", source: "canon", score: 18 },
+ { path: "docs/mid.md", source: "baseline", score: 15 },
+ ];
+
+ // merged = just use allScored directly
+ const startHere = allScored.slice(0, 3).map((r) => r.path);
+ assert.deepEqual(startHere, ["docs/high.md", "canon/mid-high.md", "docs/mid.md"]);
+});
+
+// ─── Telemetry: parseToolCall extracts result_grouping ────────────────────
+
+console.log("\ntelemetry:");
+
+await test("parseToolCall extracts result_grouping from oddkit_search arguments", () => {
+ const payload = {
+ jsonrpc: "2.0",
+ id: 1,
+ method: "tools/call",
+ params: {
+ name: "oddkit_search",
+ arguments: {
+ input: "test query",
+ knowledge_base_url: "https://github.com/klappy/klappy.dev",
+ result_grouping: "overlay_first",
+ },
+ },
... diff truncated: showing 800 of 956 lines
…CLI (#150 fix-forward)

- Worker (workers/src/orchestrate.ts): runSearch now pulls 50 BM25 candidates when resolvedGrouping !== "merged", partitions, then slices to FINAL_LIMIT (5). The original implementation truncated to 5 before the partition, making overlay docs at BM25 position 6+ invisible to the partition logic.
- CLI (src/core/actions.js): mirrored — partitionByOrigin helper, conditional default on result_grouping (KB set → overlay_first, else merged), candidate-pool widening, overlay_hits/baseline_hits arrays for grouped mode, NO_MATCH branch includes empty grouped arrays when applicable.
- Tests: 2 new regression cases for partitionBySource against widened pools.
- CHANGELOG: separate Fixed entry for the candidate-pool widening, plus a CLI-parity Added entry.

All worker tests (5 files, 26 tests in result-grouping) and CLI smoke tests pass; tsc --noEmit clean.
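The widening this commit describes can be sketched as follows. This is a hedged illustration, not the actual orchestrate.ts code: the `FINAL_LIMIT` name and the widen-partition-slice order follow the commit message, while `bm25Search`, `WIDE_LIMIT`, and `rankHits` are stand-in names introduced here.

```javascript
// Illustrative sketch of the candidate-pool widening (assumed wiring,
// not the real workers/src/orchestrate.ts). bm25Search stands in for
// the worker's searchBM25(bm25, input, limit) call.
const FINAL_LIMIT = 5;
const WIDE_LIMIT = 50;

function rankHits(bm25Search, input, resolvedGrouping) {
  if (resolvedGrouping === "merged") {
    // pure BM25 score order — the pre-existing path, no widening needed
    return bm25Search(input, FINAL_LIMIT);
  }
  // widen first so overlay docs at BM25 rank 6+ survive the partition
  const candidates = bm25Search(input, WIDE_LIMIT);
  const overlay = candidates.filter((h) => h.source === "canon");
  const baseline = candidates.filter((h) => h.source === "baseline");
  return [...overlay, ...baseline].slice(0, FINAL_LIMIT);
}
```

With a pool of only 5, an overlay doc sitting at BM25 rank 6 never reaches the partition; widening to 50 before slicing is what makes it visible.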
Fix-forward landed in the commit above.

Cursor Bugbot finding addressed. Tests: 26/26 in result-grouping.test.mjs. Awaiting Bugbot re-review on the new commit, then a fresh-context Sonnet 4.6 validator dispatch (canon: klappy://canon/constraints/release-validation-gate).
- runSearch: compute updatedState/canon_refs from orderedHits (the truncated returned set), not the wider 50-candidate BM25 pool. Mirrors src/core/actions.js ordering: partition and truncate first, then derive state.
- runPreflight: build start_here_overlay / start_here_baseline by partitioning startHere (the top 3 of results) instead of the full allScored list, so the grouped arrays are subsets of start_here.
The CLI preflight case in handleAction delegated to runOrchestrate without forwarding resolvedGrouping, so runPreflight always received the default merged ordering. This broke parity with the worker, which already threads resolvedGrouping into runPreflight.

- src/core/actions.js: pass resolvedGrouping into runOrchestrate for the preflight case.
- src/mcp/orchestrate.js: accept a result_grouping option and forward it to runPreflight.
- src/tasks/preflight.js: partition start_here by origin for overlay_first/grouped, and emit start_here_overlay / start_here_baseline for grouped, mirroring workers/src/orchestrate.ts.
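The preflight shaping this commit describes can be sketched roughly as below. Hedged: the entry shape, `partitionByOrigin`, and `shapeStartHere` are assumptions for illustration; the real logic lives in src/tasks/preflight.js, which this sketch mirrors behaviorally rather than reproduces.

```javascript
// Hedged sketch (assumed shapes; not the actual src/tasks/preflight.js).
// CLI entries carry origin: "local" | "baseline"; "local" is the overlay side.
function partitionByOrigin(entries) {
  const overlay = [];
  const baseline = [];
  for (const e of entries) {
    (e.origin === "baseline" ? baseline : overlay).push(e);
  }
  return { overlay, baseline };
}

function shapeStartHere(startHere, resolvedGrouping) {
  if (resolvedGrouping === "merged") return { start_here: startHere };
  const { overlay, baseline } = partitionByOrigin(startHere);
  const ordered = [...overlay, ...baseline];
  if (resolvedGrouping === "overlay_first") return { start_here: ordered };
  // "grouped": back-compat merged list plus the grouped arrays
  return {
    start_here: ordered,
    start_here_overlay: overlay,
    start_here_baseline: baseline,
  };
}
```

The key property is that the grouped arrays are built from the already-truncated start_here list, so they are always subsets of it.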
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 3c8f77f.
…F-1)

The Sonnet 4.6 read-only validator dispatch flagged that runPreflight used 'result_grouping ?? "merged"', which is dead code on every public path (src/core/actions.js pre-resolves before delegating) but would silently return "merged" for any direct importer with a baseline set and no explicit result_grouping. Mirror the worker's conditional default for parity.

Validator: dispatched per klappy://canon/constraints/release-validation-gate Rule 2; verdict PASS with this finding. Validator report archived in the session ledger.
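The dead-code pattern and its fix can be contrasted in a small sketch. The function names here are illustrative, not from the codebase; only the `?? "merged"` expression and the conditional-default rule come from the commit message.

```javascript
// Before (validator finding F-1): nullish-coalescing straight to "merged".
// Dead code on public paths, but wrong for a direct importer that has a
// baseline set and passes no explicit result_grouping.
function resolveGroupingBefore(resultGrouping) {
  return resultGrouping ?? "merged";
}

// After: mirror the dispatcher's conditional default.
function resolveGroupingAfter(resultGrouping, baselineSet) {
  if (resultGrouping) return resultGrouping;
  return baselineSet ? "overlay_first" : "merged";
}
```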
Validator dispatch complete — Sonnet 4.6, fresh context, read-only. Verdict: PASS. Three findings.

Per Rule 1, merge waits for Bugbot to complete. Bugbot already ran on the previous commit.
…pes (Bugbot)

Cursor Bugbot finding (Low): worker debug envelopes for runSearch and runPreflight omitted result_grouping while the CLI counterparts in src/core/actions.js and src/tasks/preflight.js included it. Closes the parity gap so debug observability is consistent across worker and CLI.

The other six Bugbot findings on this PR were point-in-time reviews already addressed by intermediate commits (60fe9b8, ed3dd14, 8cb0b32, 3c8f77f, 96ef432) — verified by inspecting HEAD bytes against each finding's locator.
Bugbot disposition map — checked all 7 inline findings against HEAD bytes.

Six of seven were point-in-time review echoes from earlier commits — Cursor Bugbot reviews each push, and findings on a stale commit don't auto-close when the issue is fixed in a later commit. Only #7 was actually live at HEAD; now closed.

Awaiting Bugbot re-review on the latest commit.
Documents the result_grouping parameter shipped in klappy/oddkit#152 (closes klappy/oddkit#150).

- docs/oddkit/tools/oddkit_search.md: added knowledge_base_url and result_grouping parameters to the input schema; expanded the response shape with source: "canon" | "baseline" and conditional overlay_hits/baseline_hits; new Result Grouping section explaining the three values and the conditional default.
- docs/oddkit/tools/oddkit_preflight.md: parallel treatment scoped to preflight; expanded the response shape with start_here_overlay/start_here_baseline; canon ref to klappy://canon/principles/scoped-truth.

Summary
Implements the Option D1+D2 hybrid from #150: adds a `result_grouping` parameter to the search and preflight actions that controls how overlay (knowledge-base) and baseline results are ordered.

**Conditional default**

- `knowledge_base_url` unset → default `"merged"` (pure BM25 score order, no behavior change)
- `knowledge_base_url` set → default `"overlay_first"` (overlay docs ranked above baseline — the fix for "Feature Request: Isolate or re-rank knowledge base content vs baseline in search corpus" #150)

**Values**

- `"merged"` — existing behavior unchanged (BM25 score order)
- `"overlay_first"` — stable partition: all `source === "canon"` hits precede all `source === "baseline"` hits; BM25 score order preserved within each partition
- `"grouped"` — response carries `overlay_hits`/`baseline_hits` arrays (search) or `start_here_overlay`/`start_here_baseline` (preflight); merged `hits`/`start_here` still emitted for back-compat

**Changes**

- workers/src/orchestrate.ts: `ResultGrouping` type, `partitionBySource` utility (exported for testing), `runSearch` and `runPreflight` accept `resolvedGrouping`, conditional default computed in the `handleUnifiedAction` dispatcher
- workers/src/index.ts: `result_grouping` schema added to the unified `oddkit` tool, `oddkit_search`, and `oddkit_preflight`; threaded through handler args
- workers/src/telemetry.ts: `parseToolCall` extracts `result_grouping` from tool arguments; blob9 repurposed from the retired `cache_tier` slot
- workers/test/result-grouping.test.mjs
- workers/test/telemetry-integration.test.mjs
- CHANGELOG.md: entry under [Unreleased]

**Test results**

All worker tests pass.

**Notes**

- No changes to workers/src/bm25.ts. Re-ranking is post-BM25 in orchestrate.ts.
- `partitionBySource` uses a single forward pass that pushes to `overlay` or `baseline` arrays — it never calls `.sort()` again.
- Grouped output derives from `orderedHits` (the partitioned order), so `overlay_hits`/`baseline_hits` in grouped mode contain the metadata-enriched objects.
- Docs (docs/oddkit/tools/oddkit_search.md, oddkit_preflight.md) will be updated in a separate PR after this one merges.

Closes #150
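The single-pass stable partition described in the notes above can be sketched like this — a behavioral mirror of what the PR describes, not the exact workers/src/orchestrate.ts source:

```javascript
// One forward pass, no re-sort: BM25 order is preserved inside each bucket.
function partitionBySource(hits) {
  const overlay = [];
  const baseline = [];
  for (const hit of hits) {
    (hit.source === "canon" ? overlay : baseline).push(hit);
  }
  return { overlay, baseline };
}

// overlay_first ordering: overlay block first, score order intact within each
const hits = [
  { path: "docs/a.md", source: "baseline", score: 9 },
  { path: "canon/b.md", source: "canon", score: 7 },
  { path: "docs/c.md", source: "baseline", score: 5 },
];
const { overlay, baseline } = partitionBySource(hits);
console.log([...overlay, ...baseline].map((h) => h.path));
// → ["canon/b.md", "docs/a.md", "docs/c.md"]
```

Because nothing is re-sorted, a lower-scoring overlay doc still outranks every baseline doc, while baseline docs keep their relative BM25 order.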
Note
Medium Risk
Changes search and preflight result ordering and response shape (including a new grouped mode) across both the Cloudflare Worker and Node CLI, plus updates telemetry schema (blob9 repurposed), so regressions could affect ranking, state-threading, or downstream consumers expecting the old shape.
Overview
Adds a `result_grouping` parameter to search and preflight to control overlay vs baseline ordering, with a conditional default of `overlay_first` when the `knowledge_base_url`/baseline override is set and `merged` otherwise.

Updates both Worker and Node CLI implementations to (a) stable-partition results without re-sorting, (b) support `grouped` responses exposing `overlay_hits`/`baseline_hits` (search) and `start_here_overlay`/`start_here_baseline` (preflight), and (c) widen the BM25 candidate pool to 50 when grouping is enabled before truncating to the final 5.

Extends observability by threading `result_grouping` through tool schemas and debug envelopes, and repurposes telemetry `blob9` to record the caller-specified grouping value; adds/updates regression tests covering partitioning, defaults, candidate widening, and telemetry blob shape.

Reviewed by Cursor Bugbot for commit 885fcc9. Bugbot is set up for automated code reviews on this repo.
Fix-forward (post-Cursor-comparison) — 2026-04-28T17:30Z
After comparing this PR against #151 (Cursor's parallel implementation), two real gaps were closed in commit 60fe9b8:

**Gap 1 — Candidate-pool widening (worker + CLI)**

`runSearch` previously called `searchBM25(bm25, input, 5)`, truncating to 5 candidates before the partition. Any overlay doc ranked at BM25 position 6+ would be invisible to the partition. PR #151 caught this by widening to 50 candidates. This fix-forward applies the same approach: when `resolvedGrouping !== "merged"`, pull 50 candidates, partition, then `.slice(0, FINAL_LIMIT)` (5).

**Gap 2 — Node CLI mirror (src/core/actions.js)**

The original PR only changed `workers/src/`. PR #151 also updated the Node CLI's `search` action so the CLI behaves consistently. This fix-forward mirrors the worker logic to `src/core/actions.js`:

- `partitionByOrigin` helper (the CLI uses `origin: "local" | "baseline"` where the worker uses `source: "canon" | "baseline"`)
- conditional default on `result_grouping` (`baseline` set → `overlay_first`, else `merged`)
- candidate-pool widening
- `overlay_hits`/`baseline_hits` arrays for grouped mode
- NO_MATCH branch includes empty grouped arrays when applicable (matching worker commit 63eb8e0)
- `result_grouping` field in the CLI's debug envelope

**Bugbot disposition**
Cursor Bugbot left one Low-severity finding on commit 20b5e41: "When `result_grouping=grouped` and no hits, the NO_MATCH branch omits `overlay_hits`/`baseline_hits`." This was already fixed in commit 63eb8e0 (worker side) before Bugbot's review settled. The CLI now matches this behavior in commit 60fe9b8. Disposition: already-fixed.

**Tests**
Two new regression cases in workers/test/result-grouping.test.mjs validate the candidate-pool widening: "partition surfaces overlay even when overlay is mostly low-score" and "widened pool: 50 candidates partition correctly without losing overlay". All 26 cases in result-grouping.test.mjs pass; full worker test suite (5 files) green; tsc --noEmit clean; CLI smoke (bash tests/smoke.sh) green.

**Independent validator**
A Sonnet 4.6 read-only validator dispatch is queued. The orchestrator will run the canon-mandated independent corroboration (per `klappy://canon/constraints/release-validation-gate` Rule 2) once API credits are available; the validator brief is in the original implementation plan §8.
Per
klappy://canon/constraints/release-validation-gateRule 2 (load-bearing changes toorchestrate.tsrequire an independent validator), an isolated Sonnet 4.6 read-only validator was dispatched against SHA3c8f77f3with no access to the implementation plan.Verdict: PASS. Live preview testing at
feat-result-grouping-overlay-first-oddkit.klappy.workers.devconfirmed all 5 scenarios. The decisive evidence: withknowledge_base_urlset and no explicitresult_grouping, a baseline doc scoring 7.02 ranks below an overlay doc scoring 4.21 — the conditional-defaultoverlay_firstpartition is live. Three consecutive identical curls confirmed zero flakiness.Validator findings
- Fixed in 96ef432: `runPreflight` now applies the conditional default directly instead of `?? "merged"`, so direct importers (not just `src/core/actions.js`-mediated callers) get the correct behavior.
- `src/tasks/preflight.js` is also part of this PR; the full file list as of HEAD is in the validator preview test results — see the actual PR diff for ground truth.
- The gaps closed in 60fe9b8 are documented in the fix-forward section above.

**Final file list (HEAD)**
- CHANGELOG.md
- src/core/actions.js
- src/mcp/orchestrate.js (8cb0b32a)
- src/tasks/preflight.js (8cb0b32a / 3c8f77f3 + validator F-1 fix 96ef432)
- workers/src/index.ts
- workers/src/orchestrate.ts
- workers/src/telemetry.ts
- workers/test/result-grouping.test.mjs
- workers/test/telemetry-integration.test.mjs

**Final test counts**
- result-grouping.test.mjs: 26 / 26 pass (24 original + 2 regression cases for candidate-pool widening)
- tsc --noEmit clean
- CLI smoke green (bash tests/smoke.sh)
- CI checks: success (Workers Builds, Test CF Preview, Version Sync, Creed Freshness; Cursor Bugbot neutral)