
fix: search index corruption on incremental updates and compile indexing #57

Merged
keeganthomp merged 1 commit into main from
fix/search-index-and-type-errors
Apr 10, 2026


@keeganthomp
Owner

Summary

  • Fix critical search index corruption: serialize() and recomputeIdf() both silently broke after 2+ sequential ingests, making most terms unsearchable. The root cause was that loaded documents have tokens: [] (an intentional optimization), but two code paths incorrectly read tokens instead of tokenCount/termFreqs.
  • Compile now updates the search index with wiki articles, so kib search --wiki returns results immediately after compile without needing a manual cache clear.
  • --wiki/--raw scope flags now work on cached indexes — previously they were only applied during fresh builds, not when loading from cache.
  • Fix all strict TypeScript errors across core and CLI packages (folder-watcher, recovery, search engine, skills registry, watch test).

Test plan

  • All 497 unit tests pass
  • bun run check (biome lint) clean
  • tsc --noEmit zero errors for both core and CLI
  • Manual e2e: init → ingest 3 docs → search all 3 by distinct terms → compile → search --wiki returns wiki articles → search --raw returns raw sources
  • Manual e2e: daemon start/stop/status, HTTP POST /ingest, inbox file drop, background mode
  • Manual e2e: query, skills, lint, export (markdown + HTML), MCP tools

🤖 Generated with Claude Code

The incremental search index (used by ingest → addDocument → save) had two
compounding bugs that caused search to silently degrade after multiple ingests:

1. serialize() used d.tokens.length instead of d.tokenCount — loaded docs have
   tokens: [] (by design), so every save overwrote real token counts with 0.
2. recomputeIdf() iterated doc.tokens (empty for loaded docs) instead of
   doc.termFreqs.keys(), so IDF was only computed from the most recently added
   document, making all prior terms unsearchable.
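
The two fixes above can be pictured with a minimal sketch. Only `tokens`, `tokenCount`, `termFreqs`, `serialize()` and `recomputeIdf()` are named in this PR; the interfaces, the helpers `makeDoc`, `serializeDoc` and `loadDoc`, and the smoothed IDF formula are assumptions made for illustration and may not match the actual kib code.

```typescript
// Hypothetical document shape: only tokens, tokenCount and termFreqs come from the PR text.
interface IndexedDoc {
  id: string;
  tokens: string[];               // intentionally [] for documents loaded from cache
  tokenCount: number;             // persisted length, valid even when tokens is []
  termFreqs: Map<string, number>; // term -> occurrences in this document
}

interface SerializedDoc {
  id: string;
  tokenCount: number;
  termFreqs: [string, number][];
}

// Hypothetical helper: build a freshly ingested document from raw text.
function makeDoc(id: string, text: string): IndexedDoc {
  const tokens = text.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  const termFreqs = new Map<string, number>();
  for (const t of tokens) termFreqs.set(t, (termFreqs.get(t) ?? 0) + 1);
  return { id, tokens, tokenCount: tokens.length, termFreqs };
}

function serializeDoc(d: IndexedDoc): SerializedDoc {
  // Fix 1: persist d.tokenCount. d.tokens.length would be 0 for loaded docs.
  return { id: d.id, tokenCount: d.tokenCount, termFreqs: Array.from(d.termFreqs) };
}

function loadDoc(s: SerializedDoc): IndexedDoc {
  // Loaded documents do not keep their token list.
  return { id: s.id, tokens: [], tokenCount: s.tokenCount, termFreqs: new Map(s.termFreqs) };
}

function recomputeIdf(docs: IndexedDoc[]): Map<string, number> {
  const docFreq = new Map<string, number>();
  for (const doc of docs) {
    // Fix 2: walk termFreqs.keys(), which is populated for fresh and loaded docs alike.
    for (const term of Array.from(doc.termFreqs.keys())) {
      docFreq.set(term, (docFreq.get(term) ?? 0) + 1);
    }
  }
  const idf = new Map<string, number>();
  docFreq.forEach((df, term) => {
    // Smoothed BM25-style IDF; an assumption, the PR does not state the formula.
    idf.set(term, Math.log(1 + (docs.length - df + 0.5) / (df + 0.5)));
  });
  return idf;
}
```

With this shape, a document that has been saved and reloaded keeps its token count across further saves, and its terms still contribute to IDF when a new document is added alongside it.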

Additionally:
- Compile now updates the search index with wiki articles (previously only
  ingest updated it, so --wiki searches returned nothing after compile).
- --wiki/--raw scope flags now filter results from cached indexes (previously
  scope was only applied during build, not when loading from cache).
- Fix all strict TypeScript errors across core and CLI packages.
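
As a rough illustration of the scope change for cached indexes, the filter can be applied to results at query time rather than only while the index is built. The `Scope` type, the `SearchHit` shape, its `source` field, and `applyScope` are all hypothetical names; the PR does not show how kib distinguishes wiki articles from raw sources.

```typescript
// Hypothetical types: the PR names only the --wiki and --raw flags.
type Scope = "wiki" | "raw" | "all";

interface SearchHit {
  id: string;
  source: "wiki" | "raw"; // assumed per-document marker stored in the index
  score: number;
}

// Filter hits after the search, so the same logic covers fresh and cached indexes.
function applyScope(hits: SearchHit[], scope: Scope): SearchHit[] {
  if (scope === "all") return hits;
  return hits.filter((h) => h.source === scope);
}

// Example hits as they might come back from an index loaded from cache.
const cachedHits: SearchHit[] = [
  { id: "wiki/search.md", source: "wiki", score: 2.1 },
  { id: "raw/notes.txt", source: "raw", score: 1.7 },
];

const wikiOnly = applyScope(cachedHits, "wiki");
const rawOnly = applyScope(cachedHits, "raw");
```

Filtering at query time keeps one cached index usable for all three scopes, at the cost of scoring documents that are later discarded.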

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@vercel

vercel Bot commented Apr 10, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
kib | Ready | Preview, Comment | Apr 10, 2026 3:17pm


@keeganthomp keeganthomp merged commit 02f1bb4 into main Apr 10, 2026
3 checks passed
@keeganthomp keeganthomp deleted the fix/search-index-and-type-errors branch April 10, 2026 15:18