fix: chunk enrich transcript so complete works on large sessions (0.22.1) #40
Merged
The enrich pass fed an entire session transcript to the model in one
call; a large multi-session task exceeded the ~200k-token context limit
and claude -p returned HTTP 400 ("Prompt is too long · ~220310 tokens").
Split the transcript into line-aligned chunks under a safe byte budget
and merge the recovered events (run_dream dedups), so finalize works on
any session size. --quick was unaffected (it skips enrich).
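The split-and-merge step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `chunk_by_lines` and `merge_events` are hypothetical names, events are modelled as plain strings, and the local dedup only stands in for the deduplication that run_dream already performs.

```rust
use std::collections::HashSet;

/// Hypothetical helper: split `text` into chunks that end on line boundaries
/// and stay at or under `max_bytes`. A single line longer than the budget is
/// emitted as its own oversized chunk rather than being cut mid-line.
fn chunk_by_lines(text: &str, max_bytes: usize) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();
    // split_inclusive keeps each '\n', so concatenating the chunks
    // reproduces the input exactly (no content is dropped).
    for line in text.split_inclusive('\n') {
        if !current.is_empty() && current.len() + line.len() > max_bytes {
            chunks.push(std::mem::take(&mut current));
        }
        current.push_str(line);
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}

/// Hypothetical helper: run `call` once per chunk and merge the results,
/// keeping the first occurrence of each event. The first failing call
/// aborts the whole merge.
fn merge_events<F>(chunks: &[String], mut call: F) -> Result<Vec<String>, String>
where
    F: FnMut(&str) -> Result<Vec<String>, String>,
{
    let mut seen = HashSet::new();
    let mut merged = Vec::new();
    for chunk in chunks {
        for event in call(chunk.as_str())? {
            if seen.insert(event.clone()) {
                merged.push(event);
            }
        }
    }
    Ok(merged)
}
```

Splitting on line boundaries keeps every transcript entry whole, and a byte budget can be checked without a tokenizer. The budget has to be chosen conservatively, because bytes only approximate tokens.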
Also surface the JSON error claude -p prints on stdout under
--output-format json (the real cause goes there, not stderr), so a
failure is legible instead of a bare "exit status 1" — which is exactly
what let us diagnose the context overflow.
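A rough sketch of how a runner might fold captured stdout into its error message, as described above. The names `describe_failure` and `truncate_on_char_boundary` are hypothetical and do not come from this PR. The stdout text is treated as an opaque string, not parsed as JSON, and the cap is an arbitrary parameter.

```rust
/// Hypothetical helper: cut `s` to at most `cap` bytes without splitting a
/// multi-byte character (the error text here contains '·', which is two
/// bytes in UTF-8).
fn truncate_on_char_boundary(s: &str, cap: usize) -> &str {
    if s.len() <= cap {
        return s;
    }
    let mut end = cap;
    // Index 0 is always a char boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}

/// Hypothetical helper: describe a failed child process, appending each
/// non-empty output stream (capped) so the real cause is visible even when
/// it was printed on stdout instead of stderr.
fn describe_failure(status: &str, stdout: &str, stderr: &str, cap: usize) -> String {
    let mut msg = format!("claude -p failed ({status})");
    for (label, stream) in [("stdout", stdout), ("stderr", stderr)] {
        let trimmed = stream.trim();
        if !trimmed.is_empty() {
            msg.push_str(&format!(
                "; {label}: {}",
                truncate_on_char_boundary(trimmed, cap)
            ));
        }
    }
    msg
}
```

Capping the appended output keeps a large payload from flooding the log while still showing the start of the error text.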
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Problem
Found in production on a real task:
`task-journal complete <id>` failed with a bare `exit status 1`, while `complete <id> --quick` worked fine. The failure was in the enrich pass, which fed an entire session transcript to the model in one call. A large multi-session task produced a ~220k-token prompt, over the ~200k-token context limit, so `claude -p` returned HTTP 400. The JSON error was on stdout (where `--output-format json` puts it), but the runner only showed stderr, which was empty.
Fix
- `LlmDreamBackend::backfill` splits the transcript into line-aligned chunks under a safe byte budget (360k) and runs one call per chunk, merging the recovered events (`run_dream` already dedups). No content is dropped; finalize now works on any session size. `--quick` was never affected (it skips enrich).
- `claude -p` stdout on failure: `claude_exit_error` now includes the stdout JSON (capped), so a failure reads as `Prompt is too long · ~220310 tokens (limit 200000)` instead of `exit status 1`. This is what made the diagnosis instant.
Verified
- `complete tj-yh4bv32b8j --quick` already retitled `#: 5` → `Finalize FIN-1034-docs branch merge and push PR #625` and closed it; the full path now chunks instead of returning HTTP 400.
- `fmt --check`, `clippy --workspace --all-targets -D warnings`, `test --workspace`, lean `--no-default-features` build.
🤖 Generated with Claude Code