Skip to content

feat(search,preflight): result_grouping parameter — overlay-first ranking when knowledge_base_url set (#150)#152

Merged
klappy merged 8 commits intomainfrom
feat/result-grouping-overlay-first
Apr 28, 2026
Merged

feat(search,preflight): result_grouping parameter — overlay-first ranking when knowledge_base_url set (#150)#152
klappy merged 8 commits intomainfrom
feat/result-grouping-overlay-first

Conversation

@klappy
Copy link
Copy Markdown
Owner

@klappy klappy commented Apr 28, 2026

Summary

Implements Option D1+D2 hybrid from #150: adds a result_grouping parameter to search and preflight actions that controls how overlay (knowledge-base) and baseline results are ordered.

Conditional default

Values

  • "merged" — existing behavior unchanged (BM25 score order)
  • "overlay_first" — stable partition: all source==="canon" hits precede all source==="baseline" hits; BM25 score order preserved within each partition
  • "grouped" — response carries overlay_hits / baseline_hits arrays (search) or start_here_overlay / start_here_baseline (preflight); merged hits / start_here still emitted for back-compat

Changes

File Change
workers/src/orchestrate.ts ResultGrouping type, partitionBySource utility (exported for testing), runSearch and runPreflight accept resolvedGrouping, conditional default computed in handleUnifiedAction dispatcher
workers/src/index.ts result_grouping schema added to unified oddkit tool, oddkit_search, and oddkit_preflight; threaded through handler args
workers/src/telemetry.ts parseToolCall extracts result_grouping from tool arguments; blob9 repurposed from retired cache_tier slot
workers/test/result-grouping.test.mjs 24 new tests
workers/test/telemetry-integration.test.mjs Updated blob count assertion (8→9) and schema map
CHANGELOG.md Entry under [Unreleased]

Test results

result-grouping.test.mjs:        24 passed, 0 failed
tokenize.test.mjs:                7 passed, 0 failed
governance-parser.test.mjs:     105 passed, 0 failed
telemetry-integration.test.mjs:  19 passed, 0 failed
TypeScript typecheck:            0 errors

Notes

  • No version bump in this PR. Orchestrator decides packaging cadence.
  • No changes to workers/src/bm25.ts. Re-ranking is post-BM25 in orchestrate.ts.
  • No cache-key change. The index is unchanged; only post-retrieval ordering differs.
  • Stable partition, not re-sort. partitionBySource uses a single forward pass that pushes to overlay or baseline arrays — never calls .sort() again.
  • Metadata enrichment iterates orderedHits (the partitioned order) so overlay_hits/baseline_hits in grouped mode contain the metadata-enriched objects.
  • Cursor Bugbot review awaited; orchestrator-side validator dispatch is in flight in parallel.
  • klappy.dev docs (docs/oddkit/tools/oddkit_search.md, oddkit_preflight.md) will be updated in a separate PR after this one merges.

Closes #150


Note

Medium Risk
Changes search and preflight result ordering and response shape (including a new grouped mode) across both the Cloudflare Worker and Node CLI, plus updates telemetry schema (blob9 repurposed), so regressions could affect ranking, state-threading, or downstream consumers expecting the old shape.

Overview
Adds a result_grouping parameter to search and preflight to control overlay vs baseline ordering, with a conditional default of overlay_first when knowledge_base_url/baseline override is set and merged otherwise.

Updates both Worker and Node CLI implementations to (a) stable-partition results without re-sorting, (b) support grouped responses exposing overlay_hits/baseline_hits (search) and start_here_overlay/start_here_baseline (preflight), and (c) widen the BM25 candidate pool to 50 when grouping is enabled before truncating to the final 5.

Extends observability by threading result_grouping through tool schemas and debug envelopes, and repurposes telemetry blob9 to record the caller-specified grouping value; adds/updates regression tests covering partitioning, defaults, candidate widening, and telemetry blob shape.

Reviewed by Cursor Bugbot for commit 885fcc9. Bugbot is set up for automated code reviews on this repo. Configure here.


Fix-forward (post-Cursor-comparison) — 2026-04-28T17:30Z

After comparing this PR against #151 (Cursor's parallel implementation), two real gaps were closed in commit 60fe9b8:

Gap 1 — Candidate-pool widening (worker + CLI)

runSearch previously called searchBM25(bm25, input, 5), truncating to 5 candidates before the partition. Any overlay doc ranked at BM25 position 6+ would be invisible to the partition. PR #151 caught this by widening to 50 candidates. This fix-forward applies the same approach: when resolvedGrouping !== "merged", pull 50 candidates, partition, then .slice(0, FINAL_LIMIT) (5).

Gap 2 — Node CLI mirror (src/core/actions.js)

The original PR only changed workers/src/. PR #151 also updated the Node CLI's search action so the CLI behaves consistently. This fix-forward mirrors the worker logic to src/core/actions.js:

  • New partitionByOrigin helper (CLI uses origin: "local" | "baseline" where the worker uses source: "canon" | "baseline").
  • Same conditional default (baseline set → overlay_first, else merged).
  • Candidate-pool widening on the CLI search path.
  • overlay_hits / baseline_hits arrays for grouped mode.
  • NO_MATCH branch includes empty grouped arrays when grouped (matches the existing worker fix in 63eb8e0).
  • New result_grouping field in the CLI's debug envelope.

Bugbot disposition

Cursor Bugbot left one Low-severity finding on commit 20b5e41: "When result_grouping=grouped and no hits, NO_MATCH branch omits overlay_hits/baseline_hits." This was already fixed in commit 63eb8e0 (worker side) before Bugbot's review settled. The CLI now matches this behavior in commit 60fe9b8. Disposition: already-fixed.

Tests

  • 2 new regression tests in workers/test/result-grouping.test.mjs validate the candidate-pool widening: partition surfaces overlay even when overlay is mostly low-score and widened pool: 50 candidates partition correctly without losing overlay.
  • All 26 tests in result-grouping.test.mjs pass; full worker test suite (5 files) green; tsc --noEmit clean; CLI smoke (bash tests/smoke.sh) green.

Independent validator

A Sonnet 4.6 read-only validator dispatch is queued. The orchestrator will run the canon-mandated independent corroboration (per klappy://canon/constraints/release-validation-gate Rule 2) once API credits are available; the validator brief is in the original implementation plan §8.


Validator dispatch — 2026-04-28T19:55Z

Per klappy://canon/constraints/release-validation-gate Rule 2 (load-bearing changes to orchestrate.ts require an independent validator), an isolated Sonnet 4.6 read-only validator was dispatched against SHA 3c8f77f3 with no access to the implementation plan.

Verdict: PASS. Live preview testing at feat-result-grouping-overlay-first-oddkit.klappy.workers.dev confirmed all 5 scenarios. The decisive evidence: with knowledge_base_url set and no explicit result_grouping, a baseline doc scoring 7.02 ranks below an overlay doc scoring 4.21 — the conditional-default overlay_first partition is live. Three consecutive identical curls confirmed zero flakiness.

Validator findings

ID Severity Status
F-1 Low (non-blocking) Fixed in 96ef432runPreflight now applies the conditional default directly instead of ?? "merged", so direct importers (not just src/core/actions.js-mediated callers) get the correct behavior.
F-2 Cosmetic The original PR file-changes table omitted src/tasks/preflight.js. The full file list as of HEAD is in the validator preview test results — see actual PR diff for ground truth.
F-3 Cosmetic Pre-fix-forward test count (24) vs. post-fix-forward count (26) — the 2 regression tests added in 60fe9b8 are documented in the fix-forward section above.

Final file list (HEAD)

File + / - Note
CHANGELOG.md +9 / -0 Added + Fixed entries
src/core/actions.js +99 / -38 CLI mirror, partitionByOrigin, grouped response
src/mcp/orchestrate.js +3 / -0 Forward result_grouping to runPreflight (Cursor 8cb0b32a)
src/tasks/preflight.js +35 / -2 CLI preflight grouped logic (Cursor 8cb0b32a/3c8f77f3 + validator F-1 fix 96ef432)
workers/src/index.ts +5 / -0 Tool schema
workers/src/orchestrate.ts +118 / -33 Worker partition, candidate-pool widening
workers/src/telemetry.ts +13 / -8 result_grouping in blob9
workers/test/result-grouping.test.mjs +598 / -0 26 tests including 2 regression cases for candidate-pool widening
workers/test/telemetry-integration.test.mjs +3 / -3 blob9 length check

Final test counts

  • result-grouping.test.mjs: 26 / 26 pass (24 original + 2 regression cases for candidate-pool widening)
  • All 5 worker test files green
  • tsc --noEmit clean
  • CLI smoke green (bash tests/smoke.sh)
  • All 5 GitHub Actions checks: success (Workers Builds, Test CF Preview, Version Sync, Creed Freshness, Cursor Bugbot neutral)

…king when knowledge_base_url set (#150)

Adds result_grouping parameter to search and preflight actions:
- 'merged' (default when KB unset): pure BM25 score order (no behavior change)
- 'overlay_first' (default when KB set): overlay docs ranked above baseline
- 'grouped': separate overlay_hits/baseline_hits arrays in response

Changes:
- orchestrate.ts: ResultGrouping type, partitionBySource utility, runSearch
  and runPreflight accept resolvedGrouping, conditional default in dispatcher
- index.ts: result_grouping in unified oddkit + oddkit_search + oddkit_preflight
  tool schemas, threaded through handler args
- telemetry.ts: parseToolCall extracts result_grouping, blob9 repurposed
- CHANGELOG.md: entry under [Unreleased]
- workers/test/result-grouping.test.mjs: 24 tests covering partition logic,
  conditional defaults, grouped shape, preflight partition, telemetry recording
- workers/test/telemetry-integration.test.mjs: updated blob count assertion

No version bump (orchestrator decides packaging cadence).
No changes to workers/src/bm25.ts.
No cache-key change.
@cloudflare-workers-and-pages
Copy link
Copy Markdown

cloudflare-workers-and-pages Bot commented Apr 28, 2026

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Preview URL Updated (UTC)
✅ Deployment successful!
View logs
oddkit 885fcc9 Commit Preview URL

Branch Preview URL
Apr 28 2026, 08:18 PM

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Grouped search NO_MATCH response omits grouped arrays
    • Updated the NO_MATCH early return in runSearch to include empty overlay_hits and baseline_hits arrays when resolvedGrouping is "grouped", matching the FOUND path's response shape.
Preview (63eb8e01fa)
diff --git a/CHANGELOG.md b/CHANGELOG.md
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,10 @@
 
 ## [Unreleased]
 
+### Added
+
+- **`result_grouping` parameter for search and preflight** — when `knowledge_base_url` is set, overlay (knowledge-base) docs are ranked above baseline docs by default (`"overlay_first"`). Callers can explicitly choose `"merged"` (pure BM25 score order, the previous default), `"overlay_first"` (overlay before baseline, preserving score order within each partition), or `"grouped"` (separate `overlay_hits`/`baseline_hits` arrays in search, `start_here_overlay`/`start_here_baseline` in preflight). Conditional default: `knowledge_base_url` unset → `"merged"` (no behavior change); `knowledge_base_url` set → `"overlay_first"`. Telemetry records the caller-specified value in blob9 (`result_grouping`). Closes #150.
+
 ## [0.26.0] - 2026-04-26
 
 ### Added

diff --git a/workers/src/index.ts b/workers/src/index.ts
--- a/workers/src/index.ts
+++ b/workers/src/index.ts
@@ -220,6 +220,7 @@
         "canon-tier-2", "canon-tier-1", "published-essay",
       ]).optional().describe("Optional mode hint. Epistemic modes (exploration/planning/execution) or writing-lifecycle modes (voice-dump/drafting/peer-review-ready/canon-tier-2/canon-tier-1/published-essay). Sourced from odd/challenge/stakes-calibration."),
       knowledge_base_url: z.string().optional().describe("Optional GitHub repo URL for your knowledge base. When set, strict mode is automatic: missing files fall through to the bundled governance tier rather than silently substituting from the default knowledge base."),
+      result_grouping: z.enum(["merged", "overlay_first", "grouped"]).optional().describe("For action='search' or 'preflight': controls how overlay (knowledge_base) and baseline results are ordered. 'merged' = pure score order (default when knowledge_base_url unset). 'overlay_first' = overlay docs ranked above baseline (default when knowledge_base_url set). 'grouped' = separate overlay_hits/baseline_hits arrays in response."),
       include_metadata: z.boolean().optional().describe("When true, search/get responses include a metadata object with full parsed frontmatter. Default: false."),
       section: z.string().optional().describe("For action='get': extract only the named ## section from the document. Returns section content or available sections if not found."),
       sort_by: z.enum(["date", "path"]).optional().describe("For action='catalog': sort articles. 'date' returns newest first (requires frontmatter). 'path' returns all docs alphabetically, including undated."),
@@ -241,6 +242,7 @@
         context: args.context,
         mode: args.mode,
         knowledge_base_url: args.knowledge_base_url,
+        result_grouping: args.result_grouping,
         include_metadata: args.include_metadata,
         section: args.section,
         sort_by: args.sort_by,
@@ -321,6 +323,7 @@
       schema: {
         input: z.string().describe("Natural language query or tags to search for."),
         knowledge_base_url: z.string().optional().describe("Optional: GitHub repo URL for your knowledge base. When set, strict mode is automatic: missing files fall through to the bundled governance tier."),
+        result_grouping: z.enum(["merged", "overlay_first", "grouped"]).optional().describe("Controls how overlay (knowledge_base) and baseline results are ordered. 'merged' = pure score order (default when knowledge_base_url unset). 'overlay_first' = overlay docs ranked above baseline (default when knowledge_base_url set). 'grouped' = separate overlay_hits/baseline_hits arrays in response."),
         include_metadata: z.boolean().optional().describe("When true, each hit includes a metadata object with full parsed frontmatter. Default: false."),
       },
       annotations: { readOnlyHint: true, destructiveHint: false, idempotentHint: true, openWorldHint: true },
@@ -386,6 +389,7 @@
       schema: {
         input: z.string().describe("Description of what you're about to implement."),
         knowledge_base_url: z.string().optional().describe("Optional: GitHub repo URL for your knowledge base. When set, strict mode is automatic: missing files fall through to the bundled governance tier."),
+        result_grouping: z.enum(["merged", "overlay_first", "grouped"]).optional().describe("Controls how overlay (knowledge_base) and baseline start_here results are ordered. 'merged' = pure score order (default when knowledge_base_url unset). 'overlay_first' = overlay docs ranked above baseline (default when knowledge_base_url set). 'grouped' = separate start_here_overlay/start_here_baseline arrays."),
       },
       annotations: { readOnlyHint: true, destructiveHint: false, idempotentHint: true, openWorldHint: true },
     },
@@ -432,6 +436,7 @@
           context: args.context as string | undefined,
           mode: args.mode as string | undefined,
           knowledge_base_url: args.knowledge_base_url as string | undefined,
+          result_grouping: args.result_grouping as "merged" | "overlay_first" | "grouped" | undefined,
           include_metadata: args.include_metadata as boolean | undefined,
           section: args.section as string | undefined,
           sort_by: args.sort_by as string | undefined,

diff --git a/workers/src/orchestrate.ts b/workers/src/orchestrate.ts
--- a/workers/src/orchestrate.ts
+++ b/workers/src/orchestrate.ts
@@ -225,12 +225,15 @@
 let cachedGatePrerequisitesKnowledgeBaseUrl: string | undefined = undefined;
 let cachedGatePrerequisitesSource: "knowledge_base" | "minimal" = "minimal";
 
+export type ResultGrouping = "merged" | "overlay_first" | "grouped";
+
 export interface UnifiedParams {
   action: string;
   input: string;
   context?: string;
   mode?: string;
   knowledge_base_url?: string;
+  result_grouping?: ResultGrouping;
   include_metadata?: boolean;
   section?: string;
   sort_by?: string;
@@ -1321,6 +1324,20 @@
 }
 
 // ──────────────────────────────────────────────────────────────────────────────
+// Result grouping — stable partition by source (overlay=canon vs baseline)
+// Preserves BM25 score order within each partition. Single forward pass — no re-sort.
+// ──────────────────────────────────────────────────────────────────────────────
+
+export function partitionBySource<T extends { source: "canon" | "baseline" }>(
+  arr: T[],
+): { overlay: T[]; baseline: T[] } {
+  const overlay: T[] = [];
+  const baseline: T[] = [];
+  for (const h of arr) (h.source === "canon" ? overlay : baseline).push(h);
+  return { overlay, baseline };
+}
+
+// ──────────────────────────────────────────────────────────────────────────────
 // Individual action handlers
 // ──────────────────────────────────────────────────────────────────────────────
 
@@ -1330,6 +1347,7 @@
   knowledgeBaseUrl?: string,
   state?: OddkitState,
   includeMetadata?: boolean,
+  resolvedGrouping: ResultGrouping = "merged",
 ): Promise<ActionResult> {
   const startMs = Date.now();
   const index = await fetcher.getIndex(knowledgeBaseUrl);
@@ -1354,13 +1372,18 @@
     : undefined;
 
   if (hits.length === 0) {
+    const noMatchResult: Record<string, unknown> = {
+      status: "NO_MATCH",
+      docs_considered: index.entries.length,
+      hits: [],
+    };
+    if (resolvedGrouping === "grouped") {
+      noMatchResult.overlay_hits = [];
+      noMatchResult.baseline_hits = [];
+    }
     return {
       action: "search",
-      result: {
-        status: "NO_MATCH",
-        docs_considered: index.entries.length,
-        hits: [],
-      },
+      result: noMatchResult,
       state: updatedState,
       assistant_text: `Searched ${index.stats.total} documents but found no matches for "${input}". Try rephrasing or ask with action "catalog" to see available documentation.`,
       debug: {
@@ -1373,12 +1396,22 @@
     };
   }
 
+  // Apply result_grouping partition (overlay_first / grouped re-order;
+  // merged preserves BM25 score order). Single forward pass — no re-sort.
+  let orderedHits = hits;
+  let isGrouped = false;
+  if (resolvedGrouping === "overlay_first" || resolvedGrouping === "grouped") {
+    const { overlay, baseline } = partitionBySource(hits);
+    orderedHits = [...overlay, ...baseline];
+    isGrouped = resolvedGrouping === "grouped";
+  }
+
   // Cache for fetched content to avoid redundant fetches when include_metadata is enabled
   const contentCache = new Map<string, string>();
 
-  // Fetch excerpts for top results
+  // Fetch excerpts for top results (uses partitioned order)
   const evidence: Array<{ quote: string; citation: string; source: string }> = [];
-  for (const entry of hits.slice(0, 3)) {
+  for (const entry of orderedHits.slice(0, 3)) {
     const content = await fetcher.getFile(entry.path, knowledgeBaseUrl);
     if (content) {
       contentCache.set(entry.path, content);
@@ -1394,17 +1427,18 @@
   }
 
   const assistantLines = [
-    `Found ${hits.length} result(s) for: "${input}"`,
+    `Found ${orderedHits.length} result(s) for: "${input}"`,
     "",
     ...evidence.map((e) => `> ${e.quote}\n— ${e.citation} (${e.source})`),
     "",
     "Results:",
-    ...hits.map((r) => `- \`${r.path}\` — ${r.title} (score: ${r.score.toFixed(2)}, ${r.source})`),
+    ...orderedHits.map((r) => `- \`${r.path}\` — ${r.title} (score: ${r.score.toFixed(2)}, ${r.source})`),
   ];
 
-  // When include_metadata is requested, fetch and parse frontmatter for each hit
+  // When include_metadata is requested, fetch and parse frontmatter for each hit.
+  // Iterates orderedHits so metadata-enriched array preserves the partitioned order.
   const hitsWithMetadata: Array<Record<string, unknown>> = [];
-  for (const h of hits) {
+  for (const h of orderedHits) {
     const hit: Record<string, unknown> = {
       uri: h.uri,
       path: h.path,
@@ -1425,14 +1459,27 @@
     hitsWithMetadata.push(hit);
   }
 
+  // Build result object — add overlay_hits / baseline_hits for "grouped" mode
+  const resultObj: Record<string, unknown> = {
+    status: "FOUND",
+    hits: hitsWithMetadata,
+    evidence,
+    docs_considered: index.entries.length,
+  };
+
+  if (isGrouped) {
+    const overlayHits: Record<string, unknown>[] = [];
+    const baselineHits: Record<string, unknown>[] = [];
+    for (const h of hitsWithMetadata) {
+      (h.source === "canon" ? overlayHits : baselineHits).push(h);
+    }
+    resultObj.overlay_hits = overlayHits;
+    resultObj.baseline_hits = baselineHits;
+  }
+
   return {
     action: "search",
-    result: {
-      status: "FOUND",
-      hits: hitsWithMetadata,
-      evidence,
-      docs_considered: index.entries.length,
-    },
+    result: resultObj,
     state: updatedState,
     assistant_text: assistantLines.join("\n").trim(),
     debug: {
@@ -2327,22 +2374,32 @@
   fetcher: KnowledgeBaseFetcher,
   knowledgeBaseUrl?: string,
   state?: OddkitState,
+  resolvedGrouping: ResultGrouping = "merged",
 ): Promise<ActionResult> {
   const startMs = Date.now();
   const index = await fetcher.getIndex(knowledgeBaseUrl);
   const topic = message.replace(/^preflight:\s*/i, "").trim();
-  const results = scoreEntries(index.entries, topic).slice(0, 5);
 
+  // Score all entries, then apply partition before slicing
+  const allScored = scoreEntries(index.entries, topic);
+  let orderedScored = allScored;
+  if (resolvedGrouping === "overlay_first" || resolvedGrouping === "grouped") {
+    const { overlay, baseline } = partitionBySource(allScored);
+    orderedScored = [...overlay, ...baseline];
+  }
+  const results = orderedScored.slice(0, 5);
+
   const dodEntry = index.entries.find((e) => e.path.toLowerCase().includes("definition-of-done"));
   const constraints = index.entries
     .filter((e) => e.path.includes("constraint") || e.authority_band === "governing")
     .slice(0, 3);
 
+  const startHere = results.slice(0, 3);
   const assistantText = [
     `Preflight: ${topic}`,
     ``,
     `Start here:`,
-    ...results.slice(0, 3).map((r) => `- \`${r.path}\` — ${r.title}`),
+    ...startHere.map((r) => `- \`${r.path}\` — ${r.title}`),
     ``,
     `Definition of Done:`,
     dodEntry ? `- \`${dodEntry.path}\`` : "- Check canon/definition-of-done.md",
@@ -2358,15 +2415,25 @@
     .join("\n")
     .trim();
 
+  // Build result object
+  const resultObj: Record<string, unknown> = {
+    topic,
+    start_here: startHere.map((r) => r.path),
+    dod: dodEntry?.path,
+    constraints: constraints.map((c) => c.path),
+    docs_available: index.stats.total,
+  };
+
+  // For "grouped" mode, split start_here into overlay and baseline arrays (each capped at 3)
+  if (resolvedGrouping === "grouped") {
+    const { overlay, baseline } = partitionBySource(allScored);
+    resultObj.start_here_overlay = overlay.slice(0, 3).map((r) => r.path);
+    resultObj.start_here_baseline = baseline.slice(0, 3).map((r) => r.path);
+  }
+
   return {
     action: "preflight",
-    result: {
-      topic,
-      start_here: results.slice(0, 3).map((r) => r.path),
-      dod: dodEntry?.path,
-      constraints: constraints.map((c) => c.path),
-      docs_available: index.stats.total,
-    },
+    result: resultObj,
     state: state ? initState(state) : undefined,
     assistant_text: assistantText,
     debug: {
@@ -3251,8 +3318,14 @@
 ] as const;
 
 export async function handleUnifiedAction(params: UnifiedParams): Promise<OddkitEnvelope> {
-  const { action, input, context, mode, knowledge_base_url, include_metadata, section, sort_by, limit, offset, filter_epoch, state, env, tracer } = params;
+  const { action, input, context, mode, knowledge_base_url, result_grouping, include_metadata, section, sort_by, limit, offset, filter_epoch, state, env, tracer } = params;
 
+  // Conditional default: when knowledge_base_url is set and caller didn't
+  // specify result_grouping, default to "overlay_first" (the fix for #150).
+  // When KB is unset, default to "merged" (no behavior change).
+  const resolvedGrouping: ResultGrouping =
+    result_grouping ?? (knowledge_base_url ? "overlay_first" : "merged");
+
   if (!VALID_ACTIONS.includes(action as (typeof VALID_ACTIONS)[number])) {
     return {
       action: "error",
@@ -3283,7 +3356,7 @@
         result = await runEncodeAction(input, context, fetcher, knowledge_base_url, state);
         break;
       case "search":
-        result = await runSearch(input, fetcher, knowledge_base_url, state, include_metadata);
+        result = await runSearch(input, fetcher, knowledge_base_url, state, include_metadata, resolvedGrouping);
         break;
       case "get":
         result = await runGet(input, fetcher, knowledge_base_url, state, include_metadata, section);
@@ -3301,7 +3374,7 @@
         result = await runValidate(input, state);
         break;
       case "preflight":
-        result = await runPreflight(input, fetcher, knowledge_base_url, state);
+        result = await runPreflight(input, fetcher, knowledge_base_url, state, resolvedGrouping);
         break;
       case "version":
         result = runVersion(env);
@@ -3310,7 +3383,7 @@
         result = await runCleanupStorage(fetcher, knowledge_base_url);
         break;
       default:
-        result = await runSearch(input, fetcher, knowledge_base_url, state);
+        result = await runSearch(input, fetcher, knowledge_base_url, state, undefined, resolvedGrouping);
     }
 
     // Inject trace into debug envelope (E0008.1)

diff --git a/workers/src/telemetry.ts b/workers/src/telemetry.ts
--- a/workers/src/telemetry.ts
+++ b/workers/src/telemetry.ts
@@ -185,6 +185,7 @@
   toolName: string;
   documentUri: string;
   knowledgeBaseUrl: string;
+  resultGrouping: string;
 } | null {
   if (typeof payload !== "object" || payload === null || !("method" in payload)) {
     return null;
@@ -197,7 +198,7 @@
 
   const params = msg.params;
   if (typeof params !== "object" || params === null) {
-    return { method, toolName: "", documentUri: "", knowledgeBaseUrl: "" };
+    return { method, toolName: "", documentUri: "", knowledgeBaseUrl: "", resultGrouping: "" };
   }
 
   const p = params as Record<string, unknown>;
@@ -206,6 +207,7 @@
   // Extract details from tool arguments
   let documentUri = "";
   let knowledgeBaseUrl = "";
+  let resultGrouping = "";
   const args = p.arguments;
   if (typeof args === "object" && args !== null) {
     const a = args as Record<string, unknown>;
@@ -217,9 +219,13 @@
     if (typeof a.knowledge_base_url === "string" && a.knowledge_base_url) {
       knowledgeBaseUrl = a.knowledge_base_url;
     }
+    // Extract result_grouping from tool arguments (#150)
+    if (typeof a.result_grouping === "string" && a.result_grouping) {
+      resultGrouping = a.result_grouping;
+    }
   }
 
-  return { method, toolName, documentUri, knowledgeBaseUrl };
+  return { method, toolName, documentUri, knowledgeBaseUrl, resultGrouping };
 }
 
 // ──────────────────────────────────────────────────────────────────────────────
@@ -297,9 +303,10 @@
         toolCall?.knowledgeBaseUrl || env.DEFAULT_KNOWLEDGE_BASE_URL || "",
         documentUri,
         env.ODDKIT_VERSION || BUILD_VERSION,
-        // blob9 retired (was cache_tier). Slot stays free per the
-        // "no deprecation, nobody uses them yet" rule. Cache effectiveness
-        // moved to double7/double8.
+        // blob9: result_grouping (#150). Was retired (cache_tier).
+        // Repurposed for the caller-specified grouping value; empty string
+        // when not applicable (non-search/preflight actions).
+        toolCall?.resultGrouping ?? "",
       ],
       doubles: [
         1,                // double1: count
@@ -354,9 +361,7 @@
   "knowledge_base_url", // blob6
   "document_uri",     // blob7
   "worker_version",   // blob8
-  // blob9 retired (was cache_tier). Slot stays free per the
-  // "no deprecation, nobody uses them yet" rule. Hit-rate moved to
-  // double7/double8.
+  "result_grouping",  // blob9 — repurposed from retired cache_tier (#150)
 ] as const;
 
 /**

diff --git a/workers/test/result-grouping.test.mjs b/workers/test/result-grouping.test.mjs
new file mode 100644
--- /dev/null
+++ b/workers/test/result-grouping.test.mjs
@@ -1,0 +1,533 @@
+#!/usr/bin/env node
+/**
+ * Unit + integration tests for the result_grouping feature (#150).
+ *
+ * Tests:
+ *   - partitionBySource: stable partition, edge cases, ordering guarantees
+ *   - Conditional default: KB set → overlay_first, KB unset → merged
+ *   - Grouped shape construction: overlay_hits / baseline_hits arrays
+ *   - Preflight partition: start_here_overlay / start_here_baseline
+ *   - Telemetry: blob9 carries result_grouping value
+ *
+ * Compiles orchestrate.ts + telemetry.ts via tsc into a temp dir, then
+ * dynamic-imports the compiled .js. Same pattern as tokenize.test.mjs
+ * and telemetry-integration.test.mjs.
+ */
+
+import assert from "node:assert/strict";
+import { spawnSync } from "node:child_process";
+import { mkdtempSync, writeFileSync, symlinkSync, existsSync, readdirSync, readFileSync } from "node:fs";
+import { tmpdir } from "node:os";
+import { join, dirname } from "node:path";
+import { fileURLToPath } from "node:url";
+
+const __dirname = dirname(fileURLToPath(import.meta.url));
+const WORKERS_ROOT = join(__dirname, "..");
+
+// ─── Compile orchestrate.ts + telemetry.ts to temp dir ────────────────────
+
+const tmp = mkdtempSync(join(tmpdir(), "oddkit-result-grouping-test-"));
+const tsconfig = {
+  compilerOptions: {
+    target: "ES2022",
+    module: "ES2022",
+    moduleResolution: "bundler",
+    lib: ["ES2022", "DOM"],
+    types: ["@cloudflare/workers-types"],
+    noEmitOnError: false,
+    strict: false,
+    skipLibCheck: true,
+    resolveJsonModule: true,
+    allowSyntheticDefaultImports: true,
+    esModuleInterop: true,
+    rootDir: join(WORKERS_ROOT, "src"),
+    outDir: join(tmp, "build"),
+  },
+  include: [
+    join(WORKERS_ROOT, "src", "orchestrate.ts"),
+    join(WORKERS_ROOT, "src", "telemetry.ts"),
+    join(WORKERS_ROOT, "src", "tracing.ts"),
+    join(WORKERS_ROOT, "src", "zip-baseline-fetcher.ts"),
+    join(WORKERS_ROOT, "src", "bm25.ts"),
+    join(WORKERS_ROOT, "src", "markdown-utils.ts"),
+  ],
+};
+const tsconfigPath = join(tmp, "tsconfig.json");
+writeFileSync(tsconfigPath, JSON.stringify(tsconfig, null, 2));
+
+const tmpNodeModules = join(tmp, "node_modules");
+if (!existsSync(tmpNodeModules)) {
+  symlinkSync(join(WORKERS_ROOT, "node_modules"), tmpNodeModules);
+}
+// orchestrate.ts imports ../package.json
+if (!existsSync(join(tmp, "package.json"))) {
+  symlinkSync(join(WORKERS_ROOT, "package.json"), join(tmp, "package.json"));
+}
+
+const compile = spawnSync("npx", ["--yes", "tsc", "-p", tsconfigPath], {
+  encoding: "utf8",
+});
+
+// With noEmitOnError: false, tsc may exit non-zero on type errors in the dep
+// graph (zip-baseline-fetcher.ts has workers-types friction) while still
+// producing the .js files we need. Only bail if target files weren't emitted.
+const buildDir = join(tmp, "build");
+const orchestrateJs = join(buildDir, "orchestrate.js");
+const telemetryJs = join(buildDir, "telemetry.js");
+if (!existsSync(orchestrateJs) || !existsSync(telemetryJs)) {
+  console.error("TypeScript compile failed (target files not emitted):");
+  console.error(compile.stdout);
+  console.error(compile.stderr);
+  process.exit(1);
+}
+if (compile.status !== 0 && process.env.DEBUG) {
+  console.error("Note: tsc reported errors but target .js files were emitted:");
+  console.error(compile.stdout);
+}
+
+// Patch compiled files: JSON import assertions + extensionless local imports
+for (const f of readdirSync(buildDir).filter((n) => n.endsWith(".js"))) {
+  const fpath = join(buildDir, f);
+  let src = readFileSync(fpath, "utf8");
+  src = src.replace(
+    /from ["']\.\.\/package\.json["'];/g,
+    'from "../package.json" with { type: "json" };',
+  );
+  src = src.replace(
+    /from ["'](\.\/[^"'.]+)["'];/g,
+    'from "$1.js";',
+  );
+  writeFileSync(fpath, src);
+}
+
+// Import the compiled module
+const { partitionBySource } = await import(orchestrateJs);
+const { recordTelemetry, parseToolCall } = await import(telemetryJs);
+
+// Also import tokenize for telemetry shape tests
+const tokenizeJs = join(buildDir, "tokenize.js");
+let measurePayloadShape = null;
+if (existsSync(tokenizeJs)) {
+  const tok = await import(tokenizeJs);
+  measurePayloadShape = tok.measurePayloadShape;
+}
+
+// ─── Test harness ─────────────────────────────────────────────────────────
+
+let pass = 0;
+let fail = 0;
+
+async function test(name, fn) {
+  try {
+    await fn();
+    console.log(`  ✓ ${name}`);
+    pass++;
+  } catch (err) {
+    console.log(`  ✗ ${name}`);
+    console.log(`    ${err.message}`);
+    if (err.stack && process.env.DEBUG) console.log(err.stack);
+    fail++;
+  }
+}
+
+console.log("result-grouping tests (#150)\n");
+
+// ─── Test fixtures ────────────────────────────────────────────────────────
+
+// Fixture: mixed entries with interleaving scores
+// canon entries have scores 10, 6 (ranked 1st, 3rd in BM25 order)
+// baseline entries have scores 8, 4 (ranked 2nd, 4th in BM25 order)
+// This ensures partition actually reorders — a fixture where canon always
+// outscores baseline would prove nothing.
+const mixedHits = [
+  { path: "canon/a.md", title: "Canon A", source: "canon", score: 10 },
+  { path: "docs/b.md", title: "Baseline B", source: "baseline", score: 8 },
+  { path: "canon/c.md", title: "Canon C", source: "canon", score: 6 },
+  { path: "docs/d.md", title: "Baseline D", source: "baseline", score: 4 },
+];
+
+const canonOnly = [
+  { path: "canon/x.md", title: "Canon X", source: "canon", score: 10 },
+  { path: "canon/y.md", title: "Canon Y", source: "canon", score: 5 },
+];
+
+const baselineOnly = [
+  { path: "docs/m.md", title: "Baseline M", source: "baseline", score: 9 },
+  { path: "docs/n.md", title: "Baseline N", source: "baseline", score: 3 },
+];
+
+// ─── partitionBySource tests ──────────────────────────────────────────────
+
+console.log("partitionBySource:");
+
+await test("splits mixed entries into overlay (canon) and baseline", () => {
+  const { overlay, baseline } = partitionBySource(mixedHits);
+  assert.equal(overlay.length, 2, "should have 2 overlay entries");
+  assert.equal(baseline.length, 2, "should have 2 baseline entries");
+  assert.ok(overlay.every((h) => h.source === "canon"), "all overlay should be canon");
+  assert.ok(baseline.every((h) => h.source === "baseline"), "all baseline should be baseline");
+});
+
+await test("preserves BM25 score order within each partition (stability)", () => {
+  const { overlay, baseline } = partitionBySource(mixedHits);
+  // overlay: canon/a (10) then canon/c (6)
+  assert.equal(overlay[0].path, "canon/a.md");
+  assert.equal(overlay[1].path, "canon/c.md");
+  assert.ok(overlay[0].score >= overlay[1].score, "overlay should be descending score");
+  // baseline: docs/b (8) then docs/d (4)
+  assert.equal(baseline[0].path, "docs/b.md");
+  assert.equal(baseline[1].path, "docs/d.md");
+  assert.ok(baseline[0].score >= baseline[1].score, "baseline should be descending score");
+});
+
+await test("overlay_first reorder: all canon before all baseline", () => {
+  const { overlay, baseline } = partitionBySource(mixedHits);
+  const ordered = [...overlay, ...baseline];
+  // Expected: canon/a(10), canon/c(6), docs/b(8), docs/d(4)
+  assert.equal(ordered[0].source, "canon");
+  assert.equal(ordered[1].source, "canon");
+  assert.equal(ordered[2].source, "baseline");
+  assert.equal(ordered[3].source, "baseline");
+  // Scores within tiers are descending
+  assert.ok(ordered[0].score >= ordered[1].score);
+  assert.ok(ordered[2].score >= ordered[3].score);
+});
+
+await test("canon-only input: overlay = all, baseline = empty", () => {
+  const { overlay, baseline } = partitionBySource(canonOnly);
+  assert.equal(overlay.length, 2);
+  assert.equal(baseline.length, 0);
+});
+
+await test("baseline-only input: overlay = empty, baseline = all", () => {
+  const { overlay, baseline } = partitionBySource(baselineOnly);
+  assert.equal(overlay.length, 0);
+  assert.equal(baseline.length, 2);
+});
+
+await test("empty array: both partitions empty", () => {
+  const { overlay, baseline } = partitionBySource([]);
+  assert.equal(overlay.length, 0);
+  assert.equal(baseline.length, 0);
+});
+
+await test("stability: entries with identical scores retain pre-partition relative order", () => {
+  const sameScore = [
+    { path: "canon/first.md", source: "canon", score: 5 },
+    { path: "docs/between.md", source: "baseline", score: 5 },
+    { path: "canon/second.md", source: "canon", score: 5 },
+    { path: "docs/last.md", source: "baseline", score: 5 },
+  ];
+  const { overlay, baseline } = partitionBySource(sameScore);
+  // Within canon: first then second (insertion order preserved)
+  assert.equal(overlay[0].path, "canon/first.md");
+  assert.equal(overlay[1].path, "canon/second.md");
+  // Within baseline: between then last (insertion order preserved)
+  assert.equal(baseline[0].path, "docs/between.md");
+  assert.equal(baseline[1].path, "docs/last.md");
+});
+
+// ─── Conditional default logic tests ──────────────────────────────────────
+
+console.log("\nconditional default:");
+
+await test("KB unset → default is merged", () => {
+  const knowledge_base_url = undefined;
+  const result_grouping = undefined;
+  const resolved = result_grouping ?? (knowledge_base_url ? "overlay_first" : "merged");
+  assert.equal(resolved, "merged");
+});
+
+await test("KB set → default is overlay_first", () => {
+  const knowledge_base_url = "https://github.com/klappy/klappy.dev";
+  const result_grouping = undefined;
+  const resolved = result_grouping ?? (knowledge_base_url ? "overlay_first" : "merged");
+  assert.equal(resolved, "overlay_first");
+});
+
+await test("explicit merged overrides KB-set default", () => {
+  const knowledge_base_url = "https://github.com/klappy/klappy.dev";
+  const result_grouping = "merged";
+  const resolved = result_grouping ?? (knowledge_base_url ? "overlay_first" : "merged");
+  assert.equal(resolved, "merged");
+});
+
+await test("explicit overlay_first works with KB unset", () => {
+  const knowledge_base_url = undefined;
+  const result_grouping = "overlay_first";
+  const resolved = result_grouping ?? (knowledge_base_url ? "overlay_first" : "merged");
+  assert.equal(resolved, "overlay_first");
+});
+
+await test("explicit grouped works regardless of KB", () => {
+  for (const kb of [undefined, "https://github.com/klappy/klappy.dev"]) {
+    const result_grouping = "grouped";
+    const resolved = result_grouping ?? (kb ? "overlay_first" : "merged");
+    assert.equal(resolved, "grouped", `should be grouped when kb=${kb}`);
+  }
+});
+
+// ─── Grouped shape construction tests ─────────────────────────────────────
+
+console.log("\ngrouped shape construction:");
+
+await test("grouped search: overlay_hits and baseline_hits arrays present and correct", () => {
+  // Simulate the grouped shape construction from runSearch
+  const orderedHits = (() => {
+    const { overlay, baseline } = partitionBySource(mixedHits);
+    return [...overlay, ...baseline];
+  })();
+
+  // Simulate metadata enrichment (adds uri field)
+  const hitsWithMetadata = orderedHits.map((h) => ({
+    uri: `klappy://${h.path.replace(".md", "")}`,
+    path: h.path,
+    title: h.title,
+    score: h.score,
+    source: h.source,
+  }));
+
+  // Build grouped shape
+  const overlayHits = [];
+  const baselineHits = [];
+  for (const h of hitsWithMetadata) {
+    (h.source === "canon" ? overlayHits : baselineHits).push(h);
+  }
+
+  // Assertions
+  assert.equal(overlayHits.length, 2, "overlay_hits should have 2 items");
+  assert.equal(baselineHits.length, 2, "baseline_hits should have 2 items");
+  assert.ok(overlayHits.every((h) => h.source === "canon"));
+  assert.ok(baselineHits.every((h) => h.source === "baseline"));
+
+  // hits (back-compat) is overlay-then-baseline
+  assert.equal(hitsWithMetadata[0].source, "canon");
+  assert.equal(hitsWithMetadata[1].source, "canon");
+  assert.equal(hitsWithMetadata[2].source, "baseline");
+  assert.equal(hitsWithMetadata[3].source, "baseline");
+});
+
+await test("grouped with empty overlay: overlay_hits=[], baseline_hits=[...]", () => {
+  const { overlay, baseline } = partitionBySource(baselineOnly);
+  assert.equal(overlay.length, 0);
+  assert.equal(baseline.length, 2);
+
+  const orderedHits = [...overlay, ...baseline];
+  assert.equal(orderedHits.length, 2);
+  assert.ok(orderedHits.every((h) => h.source === "baseline"));
+});
+
+await test("grouped with empty baseline: overlay_hits=[...], baseline_hits=[]", () => {
+  const { overlay, baseline } = partitionBySource(canonOnly);
+  assert.equal(overlay.length, 2);
+  assert.equal(baseline.length, 0);
+
+  const orderedHits = [...overlay, ...baseline];
+  assert.equal(orderedHits.length, 2);
+  assert.ok(orderedHits.every((h) => h.source === "canon"));
+});
+
+// ─── Preflight partition tests ────────────────────────────────────────────
+
+console.log("\npreflight partition:");
+
+await test("preflight overlay_first: partition applied before slice", () => {
+  // Simulate scoreEntries output with interleaving scores
+  const allScored = [
+    { path: "docs/high.md", source: "baseline", score: 20 },
+    { path: "canon/mid-high.md", source: "canon", score: 18 },
+    { path: "docs/mid.md", source: "baseline", score: 15 },
+    { path: "canon/mid-low.md", source: "canon", score: 12 },
+    { path: "docs/low.md", source: "baseline", score: 8 },
+    { path: "canon/lowest.md", source: "canon", score: 3 },
+  ];
+
+  // overlay_first: partition then slice(0, 5)
+  const { overlay, baseline } = partitionBySource(allScored);
+  const ordered = [...overlay, ...baseline];
+  const results = ordered.slice(0, 5);
+  const startHere = results.slice(0, 3).map((r) => r.path);
+
+  // First 3 results should be all canon (3 canon entries exist)
+  assert.equal(startHere[0], "canon/mid-high.md");
+  assert.equal(startHere[1], "canon/mid-low.md");
+  assert.equal(startHere[2], "canon/lowest.md");
+});
+
+await test("preflight grouped: start_here_overlay and start_here_baseline", () => {
+  const allScored = [
+    { path: "docs/high.md", source: "baseline", score: 20 },
+    { path: "canon/mid-high.md", source: "canon", score: 18 },
+    { path: "docs/mid.md", source: "baseline", score: 15 },
+    { path: "canon/mid-low.md", source: "canon", score: 12 },
+  ];
+
+  const { overlay, baseline } = partitionBySource(allScored);
+  const startHereOverlay = overlay.slice(0, 3).map((r) => r.path);
+  const startHereBaseline = baseline.slice(0, 3).map((r) => r.path);
+
+  assert.deepEqual(startHereOverlay, ["canon/mid-high.md", "canon/mid-low.md"]);
+  assert.deepEqual(startHereBaseline, ["docs/high.md", "docs/mid.md"]);
+});
+
+await test("preflight merged: no partition applied (pure score order)", () => {
+  const allScored = [
+    { path: "docs/high.md", source: "baseline", score: 20 },
+    { path: "canon/mid-high.md", source: "canon", score: 18 },
+    { path: "docs/mid.md", source: "baseline", score: 15 },
+  ];
+
+  // merged = just use allScored directly
+  const startHere = allScored.slice(0, 3).map((r) => r.path);
+  assert.deepEqual(startHere, ["docs/high.md", "canon/mid-high.md", "docs/mid.md"]);
+});
+
+// ─── Telemetry: parseToolCall extracts result_grouping ────────────────────
+
+console.log("\ntelemetry:");
+
+await test("parseToolCall extracts result_grouping from oddkit_search arguments", () => {
+  const payload = {
+    jsonrpc: "2.0",
+    id: 1,
+    method: "tools/call",
+    params: {
+      name: "oddkit_search",
+      arguments: {
+        input: "test query",
+        knowledge_base_url: "https://github.com/klappy/klappy.dev",
+        result_grouping: "overlay_first",
+      },
+    },
... diff truncated: showing 800 of 956 lines

You can send follow-ups to the cloud agent here.

Comment thread workers/src/orchestrate.ts
cursoragent and others added 2 commits April 28, 2026 14:18
…CLI (#150 fix-forward)

- Worker (workers/src/orchestrate.ts): runSearch now pulls 50 BM25 candidates
  when resolvedGrouping !== "merged", partitions, then slices to FINAL_LIMIT
  (5). The original implementation truncated to 5 before partition, making
  overlay docs at BM25 position 6+ invisible to the partition logic.
- CLI (src/core/actions.js): mirrored — partitionByOrigin helper, conditional
  default on result_grouping (KB set → overlay_first, else merged), candidate-
  pool widening, overlay_hits/baseline_hits arrays for grouped mode, NO_MATCH
  branch includes empty grouped arrays when applicable.
- Tests: 2 new regression cases for partitionBySource against widened pools.
- CHANGELOG: separate Fixed entry for the candidate-pool widening, plus a
  CLI parity Added entry.

All worker tests (5 files, 26 tests in result-grouping) and CLI smoke tests
pass; tsc --noEmit clean.
@klappy
Copy link
Copy Markdown
Owner Author

klappy commented Apr 28, 2026

Fix-forward landed in 60fe9b8 to close two gaps surfaced by comparison with #151:

  1. Candidate-pool wideningrunSearch and CLI search now pull 50 BM25 candidates when result_grouping !== "merged" and slice to 5 after partitioning. Without this, overlay docs at BM25 position 6+ were invisible to the partition.
  2. Node CLI mirrorsrc/core/actions.js now has the full result_grouping parameter, conditional default, partition, and grouped response shape for parity with the worker.

Cursor Bugbot finding (20b5e41 review): "Grouped NO_MATCH omits overlay_hits/baseline_hits." Already addressed in commit 63eb8e0 for the worker; CLI now matches in 60fe9b8. Disposition: already-fixed.

Tests: 26/26 in result-grouping.test.mjs (2 new regression cases), all worker + CLI smoke tests green, tsc --noEmit clean.

Awaiting Bugbot re-review on the new commit, then a fresh-context Sonnet 4.6 validator dispatch (canon: release-validation-gate Rule 2). Validator is queued behind API credit top-up.

Comment thread workers/src/orchestrate.ts
Comment thread workers/src/orchestrate.ts
- runSearch: compute updatedState/canon_refs from orderedHits (the
  truncated returned set), not the wider 50-candidate BM25 pool.
  Mirrors src/core/actions.js ordering: partition+truncate first,
  then derive state.
- runPreflight: build start_here_overlay / start_here_baseline by
  partitioning startHere (top 3 of results) instead of the full
  allScored list, so grouped arrays are subsets of start_here.
Comment thread src/core/actions.js
The CLI preflight case in handleAction delegated to runOrchestrate
without forwarding resolvedGrouping, so runPreflight always received
the default merged ordering. This broke parity with the worker, which
already threads resolvedGrouping into runPreflight.

- src/core/actions.js: pass resolvedGrouping into runOrchestrate for
  the preflight case.
- src/mcp/orchestrate.js: accept result_grouping option and forward
  it to runPreflight.
- src/tasks/preflight.js: partition start_here by origin for
  overlay_first/grouped, and emit start_here_overlay /
  start_here_baseline for grouped, mirroring workers/src/orchestrate.ts.
Comment thread src/tasks/preflight.js Outdated
Comment thread src/tasks/preflight.js Outdated
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

Reviewed by Cursor Bugbot for commit 3c8f77f. Configure here.

Comment thread workers/src/orchestrate.ts
…F-1)

The Sonnet 4.6 read-only validator dispatch flagged that runPreflight had
'result_grouping ?? "merged"' which is dead code on every public path
(src/core/actions.js pre-resolves before delegating) but would silently
return "merged" for any direct importer with baseline set and no explicit
result_grouping. Mirror the worker's conditional default for parity.

Validator: dispatched per klappy://canon/constraints/release-validation-gate
Rule 2; verdict PASS with this finding. Validator report archived in the
session ledger.
@klappy
Copy link
Copy Markdown
Owner Author

klappy commented Apr 28, 2026

Validator dispatch complete — Sonnet 4.6, fresh context, read-only.

Per klappy://canon/constraints/release-validation-gate Rule 2.

Verdict: PASS. SHA 3c8f77f3 validated; preview live testing confirmed all 5 scenarios including the critical proof point — overlay doc scoring 4.21 ranks above baseline doc scoring 7.02 when knowledge_base_url is set and result_grouping is unspecified, demonstrating the conditional default fires correctly.

Three findings:

  • F-1 (Low, non-blocking)src/tasks/preflight.js had ?? "merged" instead of the conditional default. Dead code on every public path (CLI pre-resolves), but mattered for direct runPreflight importers. Fixed in 96ef432 — one-line correction to mirror the worker.
  • F-2 (Cosmetic) — PR file-list table predates src/tasks/preflight.js. PR body now updated with the full HEAD file list.
  • F-3 (Cosmetic) — Test count discrepancy already explained in the fix-forward section.

Per Rule 1, merge waits for Bugbot to complete. Bugbot already ran on 3c8f77f3 (neutral); the new commit 96ef432 will trigger a fresh review.

…pes (Bugbot)

Cursor Bugbot finding (Low): worker debug envelopes for runSearch and
runPreflight omitted result_grouping while CLI counterparts in
src/core/actions.js and src/tasks/preflight.js included it. Closes the
parity gap so debug observability is consistent across worker and CLI.

The other six Bugbot findings on this PR were point-in-time reviews
already addressed by intermediate commits (60fe9b8, ed3dd14, 8cb0b32,
3c8f77f, 96ef432) — verified by inspecting HEAD bytes against each
finding's locator.
@klappy
Copy link
Copy Markdown
Owner Author

klappy commented Apr 28, 2026

Bugbot disposition map — checked all 7 inline findings against HEAD bytes (885fcc9):

# Finding Severity Status Evidence
1 Grouped NO_MATCH omits arrays Low Stale Fixed in 63eb8e0. NO_MATCH branch sets overlay_hits=[]/baseline_hits=[] when resolvedGrouping === "grouped".
2 State tracks unshown candidates Med Stale Fixed in ed3dd14e (Cursor autofix). updatedState now derived from orderedHits.map(h => h.path) after slice — only tracks the 5 returned to the user.
3 Preflight grouped from wrong source Med Stale Fixed in ed3dd14e. partitionBySource(startHere) not partitionBySource(allScored).
4 CLI preflight ignores resolvedGrouping Med Stale Fixed in 8cb0b32a. src/core/actions.js:487 now passes result_grouping: resolvedGrouping to runOrchestrate.
5 Redundant originByPath built twice Low Stale Fixed in 3c8f77f3. overlayEntries/baselineEntries hoisted out of conditional, reused for grouped arrays.
6 || vs ?? in preflight default Low Stale Fixed in 3c8f77f3 (??) and again in 96ef432 (validator F-1: full conditional default ?? (baselineOverride ? "overlay_first" : "merged")).
7 Worker debug envelopes omit result_grouping Low Fixed in 885fcc9 Added result_grouping: resolvedGrouping to runSearch NO_MATCH debug, runSearch FOUND debug, and runPreflight debug. CLI/worker debug parity restored.

Six of seven were point-in-time review echoes from earlier commits — Cursor Bugbot reviews each push, and findings on a stale commit don't auto-close when the issue's fixed in a later commit. Only #7 was actually live at HEAD; now closed.

tsc --noEmit clean; 26/26 result-grouping tests pass; CLI smoke green.

Awaiting Bugbot re-review on 885fcc9. Validator's PASS verdict at 3c8f77f3 stands; the F-1 fix-forward (96ef432) and this Bugbot F-7 (885fcc9) are both pure observability/parity tightenings — no behavioral change to the validated paths.

@klappy klappy merged commit 77856e7 into main Apr 28, 2026
5 checks passed
@klappy klappy deleted the feat/result-grouping-overlay-first branch April 28, 2026 20:37
klappy added a commit to klappy/klappy.dev that referenced this pull request Apr 28, 2026
Documents the result_grouping parameter shipped in klappy/oddkit#152 (closes klappy/oddkit#150).

- docs/oddkit/tools/oddkit_search.md: added knowledge_base_url and result_grouping parameters in input schema; expanded response shape with source: "canon" | "baseline" and conditional overlay_hits/baseline_hits; new Result Grouping section explaining the three values and conditional default.
- docs/oddkit/tools/oddkit_preflight.md: parallel treatment scoped to preflight; expanded response shape with start_here_overlay/start_here_baseline; canon ref to klappy://canon/principles/scoped-truth.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature Request: Isolate or re-rank knowledge base content vs baseline in search corpus

2 participants