feat(gateway): add Anthropic prompt caching for conversation turns and workers #127
Merged
…d workers

Enable cache_control annotations on requests forwarded to Anthropic, targeting two high-value cache slots:

1. System prompt caching: explicit breakpoint with 5m TTL for conversation turns (frequent enough for refresh) and 1h TTL for worker calls (bursts separated by minutes of thinking).
2. Conversation prefix caching: breakpoint on the last message block so Anthropic caches the byte-stable prefix between turns. At layer 0 (~100% prefix stability) this yields ~78% savings; at layer 1 (~85-93% stability) ~67% savings over 10-turn stretches.

Title/summary passthrough requests are explicitly excluded: their unique-per-call content would produce 1.25x write cost with zero reads.

Projected savings: ~50-1200/month on current Lore API spend.
## Summary

- **Moves knowledge entries** from the lore-managed section in AGENTS.md to a dedicated `.lore.md` file
- **AGENTS.md now contains only a pointer** to `.lore.md`, reducing system prompt bloat (~16K tokens of entries no longer in the system prompt injection path) and merge conflicts
- **Automatic migration**: first idle export after update creates `.lore.md` and rewrites the AGENTS.md section to a pointer; no user action needed
- **Backward compatible**: all adapters (OpenCode, Gateway, Pi) prefer `.lore.md` on startup import, falling back to AGENTS.md for older repos

## Changes

### Core (`packages/core/src/agents-file.ts`)

- New functions: `exportLoreFile`, `importLoreFile`, `shouldImportLoreFile`, `loreFileExists`
- `exportToFile` now writes a pointer section in AGENTS.md and delegates entries to `exportLoreFile`
- Extracted shared `_importEntries` helper from `importFromFile`

### Adapters

- **OpenCode/Gateway/Pi**: startup import prefers `.lore.md`; idle export always writes `.lore.md`, plus the agents file pointer when enabled
- **Commit reminder**: gated on `knowledge.enabled` (not `agentsFile.enabled`), mentions both `.lore.md` and agents file
- **Gateway config**: project path regex updated to match `.lore.md` references in system prompts

### Tests

- Updated existing tests to verify pointer in AGENTS.md and entries in `.lore.md`
- Added 4 new test suites: `exportLoreFile`, `loreFileExists`, `shouldImportLoreFile`, `importLoreFile`
- Added `migration from AGENTS.md to .lore.md` suite with full lifecycle tests
- All 744 tests pass
Summary
- Adds `cache_control` annotations to Anthropic API requests at two levels: system prompt (explicit breakpoint) and conversation prefix (breakpoint on last message block)

Caching strategy by request type

- `handleConversationTurn` → `forwardToUpstream(cache)`
- `handlePassthrough` → raw forward
- `createGatewayLLMClient` → direct fetch

Cost analysis
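The savings figures quoted in the commit message can be approximated with a simple cost model. The sketch below is not taken from the PR: the function name `estimatePrefixSavings`, the 0.1x cache-read multiplier, and the constant-prefix-size simplification are all assumptions made for illustration. Only the 1.25x write cost appears in the PR text.

```typescript
// Price multipliers relative to the base input-token price.
// 1.25x (cache write) is quoted in the PR; 0.1x (cache read) is an assumption.
const CACHE_WRITE_MULTIPLIER = 1.25;
const CACHE_READ_MULTIPLIER = 0.1;

/**
 * Estimated fraction of prefix input cost saved over `turns` turns.
 *
 * Simplifications (assumptions of this sketch, not the PR's model):
 * - the prefix has a constant size on every turn;
 * - the first turn is a full cache write;
 * - on each later turn, a fraction `stability` of the prefix is read from
 *   cache and the remainder is written again.
 */
function estimatePrefixSavings(turns: number, stability: number): number {
  if (turns < 1 || stability < 0 || stability > 1) {
    throw new RangeError("turns must be >= 1 and stability within [0, 1]");
  }
  // Without caching, every turn pays the full prefix price once.
  const uncached = turns;
  // Cost of one later turn: cached part read cheaply, changed part rewritten.
  const laterTurn =
    stability * CACHE_READ_MULTIPLIER +
    (1 - stability) * CACHE_WRITE_MULTIPLIER;
  const cached = CACHE_WRITE_MULTIPLIER + (turns - 1) * laterTurn;
  return 1 - cached / uncached;
}
```

Under these assumptions, 10 turns at full stability give about 78.5% savings, and stabilities of 0.85 to 0.93 give roughly 63% to 71%, which brackets the ~67% quoted for layer 1. A single uncached-then-never-read call (`turns = 1`) comes out at -25%, the "1.25x write cost with zero reads" case that motivates excluding title/summary passthrough requests.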
Changes
- `packages/gateway/src/translate/anthropic.ts`: `AnthropicCacheOptions` type + `buildAnthropicRequest()` accepts cache parameter
- `packages/gateway/src/pipeline.ts`: `forwardToUpstream()` and `handleConversationTurn()` wire up caching options
- `packages/gateway/src/llm-adapter.ts`: Worker calls send system as block array with 1h TTL
- `packages/gateway/test/anthropic-caching.test.ts`: 17 tests covering all caching paths and edge cases
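As a rough illustration of the annotation scheme described in this PR, the sketch below places `cache_control` breakpoints on a request's system prompt and last message block. It is not the PR's code: `annotateForCaching`, `RequestKind`, and the local types are hypothetical names introduced here, and the real `AnthropicCacheOptions` and `buildAnthropicRequest()` signatures may differ. It also assumes the conversation-prefix breakpoint uses the default TTL, which the PR text does not specify.

```typescript
type CacheTtl = "5m" | "1h";

interface CacheControl {
  type: "ephemeral";
  ttl?: CacheTtl;
}

interface TextBlock {
  type: "text";
  text: string;
  cache_control?: CacheControl;
}

interface Message {
  role: "user" | "assistant";
  content: TextBlock[];
}

// Hypothetical request kinds mirroring the three paths listed in the PR.
type RequestKind = "conversation" | "worker" | "passthrough";

function annotateForCaching(
  system: string,
  messages: Message[],
  kind: RequestKind,
): { system: TextBlock[]; messages: Message[] } {
  // Passthrough (title/summary) requests get no breakpoints at all.
  if (kind === "passthrough") {
    return { system: [{ type: "text", text: system }], messages };
  }

  // System prompt breakpoint: 5m for conversation turns, 1h for workers.
  const ttl: CacheTtl = kind === "worker" ? "1h" : "5m";
  const systemBlocks: TextBlock[] = [
    { type: "text", text: system, cache_control: { type: "ephemeral", ttl } },
  ];

  // Copy messages so the caller's objects are not mutated.
  const annotated: Message[] = messages.map((m) => ({
    ...m,
    content: m.content.map((b) => ({ ...b })),
  }));

  // Conversation prefix breakpoint: mark the last block of the last message.
  if (kind === "conversation" && annotated.length > 0) {
    const lastMessage = annotated[annotated.length - 1];
    const lastBlock = lastMessage.content[lastMessage.content.length - 1];
    if (lastBlock) {
      lastBlock.cache_control = { type: "ephemeral" };
    }
  }

  return { system: systemBlocks, messages: annotated };
}
```

Sending the system prompt as a block array rather than a plain string is what makes a per-block `cache_control` annotation possible, which matches the note above that worker calls "send system as block array".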