feat: compile-on-ingest pipeline with SQLite status tracking and token optimizations #58
Merged
keeganthomp merged 2 commits into main · Apr 11, 2026
Conversation
Implements caveman-style text compression (inspired by JuliusBrussee/caveman and wilpel/caveman-compression) to minimize LLM token usage during compilation.
Key changes:
Caveman compression module (compile/caveman.ts): strips articles, filler words, hedging phrases, weak verbs, verbose connectors, and redundant expressions from text while preserving code blocks, URLs, file paths, YAML frontmatter, wikilinks, and all technical terms. Zero dependencies, pure regex/string ops.
Compressed system prompt: rewritten in telegraphic style — 807 chars (202 tokens), down from ~1300 chars (325 tokens), a 38% reduction. Brevity constraints improve LLM accuracy per 2026 research.
Source content compression: raw sources are caveman-compressed before the LLM call. 16-27% savings depending on prose density. Technical content untouched.
Article context compression: existing articles sent as context are also compressed, saving ~20% on context tokens.
Compressed prompt structure: section headers shortened (CURRENT WIKI INDEX → INDEX, EXISTING ARTICLES THAT MAY NEED UPDATES → EXISTING, etc.).
Enrichment prompt compressed similarly.
Combined savings stack (all optimizations from both commits): old ~3575 tokens/compilation → new ~2582 tokens/compilation = 28% reduction, plus Anthropic prompt caching (system prompt reused server-side), fast model routing (short sources use the cheaper model), and relevant-only article context (fewer articles sent).
546 tests pass (22 new caveman tests), all lint checks pass.
https://claude.ai/code/session_01Ta23zoCERDxSnhCjqvzuQ1
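The compression pass described above can be sketched roughly as follows. This is a minimal illustration, not the actual `compile/caveman.ts` implementation: the filler-word list here is a tiny assumed sample, and only fenced code blocks are shown being preserved (the real module also protects URLs, file paths, frontmatter, and wikilinks).

```typescript
// Minimal sketch of caveman-style compression (assumed word list):
// drop filler words from prose while leaving fenced code blocks untouched.
const FILLER = /\b(?:actually|basically|simply|just|very|really)\b/gi;

function cavemanCompress(text: string): string {
  // Split on fenced code blocks (captured) so they pass through unmodified.
  return text
    .split(/(```[\s\S]*?```)/)
    .map(part =>
      part.startsWith("```")
        ? part
        : part.replace(FILLER, "").replace(/ {2,}/g, " "),
    )
    .join("");
}
```

Because the transform is pure regex/string work with no dependencies, it adds effectively zero latency compared to the LLM call it shrinks.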
Replaces the batch "ingest N sources, then compile" model with an inline
pipeline that processes each source end-to-end (extract → ingest → compile)
immediately upon arrival.
Key changes:
SQLite pipeline DB (.kb/pipeline.db) tracks source lifecycle in real-time:
queued → extracting → ingested → compiling → compiled → enriched
Uses bun:sqlite with WAL mode for concurrent reads during compilation.
Compile-on-ingest: watch daemon now compiles each source inline instead of
accumulating a batch. Cross-reference enrichment is batched separately.
Anthropic prompt caching: system prompts marked with cache_control: ephemeral
so repeated compilations reuse cached system prompts server-side.
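In the Anthropic Messages API, caching is requested by sending the system prompt as a content block with `cache_control: { type: "ephemeral" }`. A sketch of a helper that builds such a block (the helper name is hypothetical; the block shape follows the API):

```typescript
// System prompt block shape for Anthropic prompt caching.
type SystemBlock = {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
};

// Hypothetical helper: mark the system prompt so the server caches it
// and reuses the cached prefix on subsequent compilations.
function cachedSystemPrompt(text: string): SystemBlock[] {
  return [{ type: "text", text, cache_control: { type: "ephemeral" } }];
}
```

The payoff compounds with compile-on-ingest: every source triggers its own compilation, so the same system prompt is resent far more often than under batch compilation.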
Compact topic map: replaces full INDEX.md context with a dense slug[tags]
representation, saving ~40-70% input tokens per compilation.
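The slug[tags] encoding might look like the following sketch (field names assumed): one line per article instead of the full INDEX.md body, which is where the ~40-70% input-token saving comes from.

```typescript
// Hypothetical serializer: render each indexed article as slug[tag1,tag2]
// on one line, replacing the full INDEX.md text as compilation context.
function compactTopicMap(index: { slug: string; tags: string[] }[]): string {
  return index.map(a => `${a.slug}[${a.tags.join(",")}]`).join("\n");
}
```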
Relevant-only article context: instead of loading all articles a source
previously produced, scores articles by tag overlap and sends only relevant
ones, reducing context waste.
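A plausible shape for the tag-overlap scoring (the threshold and helper names here are assumptions, not the PR's actual values): count shared tags between the incoming source and each existing article, and send only articles above a minimum overlap.

```typescript
// Hypothetical relevance filter: score articles by tag overlap with the
// incoming source and keep only those sharing at least minOverlap tags.
function tagOverlap(sourceTags: string[], articleTags: string[]): number {
  const s = new Set(sourceTags);
  return articleTags.filter(t => s.has(t)).length;
}

function relevantArticles<T extends { tags: string[] }>(
  sourceTags: string[],
  articles: T[],
  minOverlap = 1, // assumed default
): T[] {
  return articles.filter(a => tagOverlap(sourceTags, a.tags) >= minOverlap);
}
```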
Fast model routing: short/simple sources (< 2000 words, no complex structure)
automatically use the fast model for lower cost and latency.
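The routing decision can be sketched as below. The 2000-word threshold is from the PR text; the "complex structure" heuristic here (deeply nested headings or tables) is an assumption standing in for whatever checks the daemon actually runs.

```typescript
// Hypothetical router: short sources with no complex structure go to the
// fast model; everything else uses the default model.
function pickModel(text: string): "fast" | "default" {
  const words = text.trim().split(/\s+/).length;
  // Crude structure check (assumption): deep headings or markdown tables.
  const complex = /^#{3,}\s|^\|.*\|/m.test(text);
  return words < 2000 && !complex ? "fast" : "default";
}
```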
New `kib sources` CLI command shows pipeline status with color-coded lifecycle indicators, token usage, and article counts.
Source status field added to manifest schema for portability.
524 tests pass (27 new), all lint checks pass.
https://claude.ai/code/session_01Ta23zoCERDxSnhCjqvzuQ1