feat: complete judge-only by default, --enrich opt-in + enrich hardening (0.23.0) by Shahinyanm · Pull Request #41 · Digital-Threads/Task-Journal

Shahinyanm · 2026-06-13T16:24:13Z

Problem

After the chunking fix (0.22.1), complete hit a different failure in production:

Error: dream JSON parse failed; got: Контекст в норме. 566.5k/1M (57%) использовано... Что дальше?

(The model's reply is in Russian: "Context is fine. 566.5k/1M (57%) used... What's next?")

The backfill model replied with prose instead of the JSON array — it continued the transcript's own dialogue. The parse error propagated up and aborted the entire complete, throwing away the retitle and close too.

Fix

Backfill is best-effort and must never sink the finalize:

Skip unparseable chunks. A chunk whose reply isn't a JSON array is logged (tracing::warn) and contributes no events, instead of erroring out. The surrounding retitle/close still run.
Extract array from prose. parse_backfill_json slices to the outer [...], tolerating a JSON array wrapped in surrounding text.
Re-assert the contract. The prompt now repeats "output ONLY the JSON array … do not continue the transcript" after the transcript (recency), reducing the chance the model chats back.
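The "extract array from prose" step can be sketched as below. This is a minimal illustration, not the actual parse_backfill_json: the helper name extract_outer_array is hypothetical, and it only performs the slicing to the outer [...]. The real function presumably hands the slice to a JSON parser afterwards, which is what rejects bracketed text that is not valid JSON.

```rust
/// Hypothetical helper: return the text between the first '[' and the
/// last ']' (inclusive), or None when the reply contains no such span.
/// The result is only a candidate; it still has to be parsed as JSON.
fn extract_outer_array(reply: &str) -> Option<&str> {
    let start = reply.find('[')?;
    let end = reply.rfind(']')?;
    if end < start {
        // A ']' that appears before the first '[' is not a wrapped array.
        return None;
    }
    // '[' and ']' are single-byte ASCII, so both indices fall on UTF-8
    // character boundaries and the slice cannot panic on non-ASCII prose.
    Some(&reply[start..=end])
}
```

A reply that is pure prose, such as the production reply quoted above, contains no brackets at all, so the helper returns None and the chunk is skipped rather than failing the run.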

Tests

parse_extracts_array_wrapped_in_prose, parse_errors_on_pure_prose, backfill_skips_unparseable_chunk_reply (the exact prod reply), plus the existing chunking tests.
Full local gate green: fmt --check, clippy --workspace --all-targets -D warnings, test --workspace, lean build.

🤖 Generated with Claude Code

The backfill model sometimes answers with prose instead of the JSON array, e.g. continuing the transcript's own dialogue ('Контекст в норме... Что дальше?', Russian for "Context is fine... What's next?"). The parse error aborted the whole `complete`, losing the retitle and close. Backfill is best-effort: skip an unparseable chunk reply (warn), extract a JSON array even when wrapped in prose, and re-assert 'output ONLY the JSON array, do not continue the transcript' after the transcript. Retitle/close run regardless of what enrich recovers.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
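The best-effort behaviour described in this commit can be sketched as a loop that collects events from the chunks that succeed and skips the rest. Everything here is an assumption made for illustration: the function name, the closure standing in for the model call, events modelled as plain strings, and eprintln! standing in for tracing::warn.

```rust
/// Hypothetical sketch: ask the model about each chunk and keep whatever
/// comes back. A failing chunk is reported and skipped, so the function
/// itself can never fail and the caller's retitle/close always run.
fn backfill_best_effort<F>(chunks: &[&str], mut ask_model: F) -> Vec<String>
where
    F: FnMut(&str) -> Result<Vec<String>, String>,
{
    let mut events = Vec::new();
    for (index, chunk) in chunks.iter().enumerate() {
        match ask_model(chunk) {
            Ok(mut found) => events.append(&mut found),
            // Stand-in for tracing::warn in the real code.
            Err(reason) => eprintln!("warn: skipping chunk {}: {}", index, reason),
        }
    }
    events
}
```

Returning a plain Vec rather than a Result makes the "never sinks the finalize" contract visible in the signature: the caller has no error to propagate.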

claude -p is a full Claude Code instance: its system prompt and tool definitions cost ~113k tokens before our content, so a 360k-char chunk (~91k tokens) still got a 400 at ~204k tokens total. Drop TRANSCRIPT_CHAR_BUDGET to 150k chars (~37k tokens) and make backfill swallow ANY per-chunk error (over-budget 400, transient failure, non-JSON reply): enrich is strictly best-effort and never sinks the retitle/close. A truly broken backend still surfaces at the judge step.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
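The sizing arithmetic in this commit can be reproduced with a rough estimate. The sketch below assumes about 4 characters per token, which matches the commit's own figures (360k chars ≈ 91k tokens, 150k chars ≈ 37k tokens). Only TRANSCRIPT_CHAR_BUDGET is a name from the source; the other names are hypothetical, and whether the real budget counts characters or bytes is not stated, so this version counts characters.

```rust
/// Fixed cost of a `claude -p` call before any transcript content (from the commit).
const CLAUDE_OVERHEAD_TOKENS: usize = 113_000;
/// Per-chunk transcript budget after this change (from the commit).
const TRANSCRIPT_CHAR_BUDGET: usize = 150_000;
/// Assumption: rough average, consistent with the commit's own numbers.
const CHARS_PER_TOKEN: usize = 4;

/// Estimated total request size for a chunk of `chunk_chars` characters.
fn estimated_request_tokens(chunk_chars: usize) -> usize {
    CLAUDE_OVERHEAD_TOKENS + chunk_chars / CHARS_PER_TOKEN
}

/// Split `text` into pieces of at most `budget_chars` characters each,
/// cutting only on character boundaries so multi-byte text stays valid.
fn chunk_transcript(text: &str, budget_chars: usize) -> Vec<&str> {
    assert!(budget_chars > 0, "budget must be positive");
    let mut chunks = Vec::new();
    let mut start = 0;
    let mut count = 0;
    for (idx, _) in text.char_indices() {
        if count == budget_chars {
            chunks.push(&text[start..idx]);
            start = idx;
            count = 0;
        }
        count += 1;
    }
    if start < text.len() {
        chunks.push(&text[start..]);
    }
    chunks
}
```

Under these assumptions the old 360k-char chunk comes to roughly 203k tokens per request, while a chunk at the new budget comes to roughly 150k.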

A big task makes many sequential claude -p calls (one or more per session). Without a timeout, one wedged call hung the whole complete and produced no output, so it looked dead. Add a per-call wall-clock timeout (90s, configurable via TJ_CLAUDE_TIMEOUT_SECS) that kills a stuck claude, and drain its pipes in threads to avoid buffer deadlock; a timed-out chunk is skipped (enrich is best-effort). Print an 'enriching N session(s)…' progress line so a multi-minute run is legible, and point at --quick.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
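One way to implement a per-call wall-clock timeout with drained pipes, using only the Rust standard library, is sketched below. The function names, the 50 ms polling interval, and the fallback to 90 s for missing, zero, or unparseable values are assumptions made for illustration; only the 90 s default and the TJ_CLAUDE_TIMEOUT_SECS variable come from the commit message.

```rust
use std::io::Read;
use std::process::{Command, Stdio};
use std::thread;
use std::time::{Duration, Instant};

const DEFAULT_TIMEOUT_SECS: u64 = 90;

/// Resolve the timeout from the raw value of TJ_CLAUDE_TIMEOUT_SECS.
/// Assumption: missing, zero, or unparseable values fall back to the default.
fn timeout_from(raw: Option<&str>) -> Duration {
    let secs = raw
        .and_then(|s| s.trim().parse::<u64>().ok())
        .filter(|&s| s > 0)
        .unwrap_or(DEFAULT_TIMEOUT_SECS);
    Duration::from_secs(secs)
}

/// Run `cmd`, returning Ok(Some(stdout)) if it exits before the deadline
/// and Ok(None) if it had to be killed.
fn run_with_timeout(mut cmd: Command, timeout: Duration) -> std::io::Result<Option<String>> {
    let mut child = cmd
        .stdin(Stdio::null())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()?;

    // Drain both pipes on their own threads so a chatty child can never
    // block on a full pipe buffer while we are polling for its exit.
    let mut stdout = child.stdout.take().expect("stdout was piped");
    let mut stderr = child.stderr.take().expect("stderr was piped");
    let out_reader = thread::spawn(move || {
        let mut buf = String::new();
        let _ = stdout.read_to_string(&mut buf);
        buf
    });
    let err_reader = thread::spawn(move || {
        let mut buf = Vec::new();
        let _ = stderr.read_to_end(&mut buf);
    });

    let deadline = Instant::now() + timeout;
    let finished = loop {
        if child.try_wait()?.is_some() {
            break true;
        }
        if Instant::now() >= deadline {
            let _ = child.kill(); // stuck call: kill it
            let _ = child.wait(); // and reap it
            break false;
        }
        thread::sleep(Duration::from_millis(50));
    };

    // The readers finish once the child's ends of the pipes are closed.
    let output = out_reader.join().unwrap_or_default();
    let _ = err_reader.join();
    Ok(if finished { Some(output) } else { None })
}
```

A caller would read the variable with std::env::var("TJ_CLAUDE_TIMEOUT_SECS").ok() and pass it through timeout_from, then treat Ok(None) as a skipped chunk. Note that if the killed process left grandchildren holding the pipes open, the reader threads would wait on them; handling that case is outside this sketch.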

Running complete on real 12-session tasks showed that the session-backfill pass takes 10-15 min (dozens of sequential claude -p spawns, ~113k-token overhead each), which is too slow to be the default. The judge-only path (retitle + close + outcome) takes seconds and delivers ~90% of the value. Flip the default: complete <id> now finalizes via judgment only; add --enrich to also backfill missed events from sessions. The old --quick flag is removed (its behaviour is the default). Behaviour change → 0.23.0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
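The flipped default can be illustrated with a hand-rolled argument parser. This is a hypothetical sketch, not the project's CLI code: the struct, the function, and the error strings are invented for illustration, and rejecting the removed --quick as an unknown flag is an assumption about how removal shows up to the user.

```rust
/// Hypothetical parsed form of `complete <id> [--enrich]`.
#[derive(Debug, PartialEq)]
struct CompleteArgs {
    task_id: String,
    /// false = judge-only (the new default); true = also backfill from sessions.
    enrich: bool,
}

fn parse_complete_args(args: &[&str]) -> Result<CompleteArgs, String> {
    let mut task_id: Option<String> = None;
    let mut enrich = false;
    for arg in args {
        match *arg {
            "--enrich" => enrich = true,
            // Assumption: any other flag, including the removed --quick, is rejected.
            flag if flag.starts_with("--") => {
                return Err(format!("unknown flag: {}", flag));
            }
            id => {
                if task_id.replace(id.to_string()).is_some() {
                    return Err("expected exactly one task id".to_string());
                }
            }
        }
    }
    let task_id = task_id.ok_or_else(|| "missing task id".to_string())?;
    Ok(CompleteArgs { task_id, enrich })
}
```

Making enrich a plain boolean that starts as false means the fast judge-only path needs no flag at all, which is the behaviour change this commit describes.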

Shahinyanm · 2026-06-13T16:58:16Z

Updated: this PR now also flips the default — complete is judge-only (retitle + close + outcome, seconds) by default; the slow session-enrich is opt-in via --enrich. Found by running it on a real 12-session task where full enrich took 10-15 min. Bumped to 0.23.0 (behaviour change). The hardening commits (chunk sizing, per-call timeout, best-effort skip, progress, legible errors) remain.

Shahinyanm and others added 5 commits June 13, 2026 20:23

docs(changelog): expand 0.22.2 with enrich hardening (sizing, timeout…

8a8bc38

…, progress) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Shahinyanm changed the title ~~fix: complete tolerates non-JSON enrich replies (0.22.2)~~ feat: complete judge-only by default, --enrich opt-in + enrich hardening (0.23.0) Jun 13, 2026

Shahinyanm merged commit 2d7171e into main Jun 13, 2026
7 checks passed

Shahinyanm merged 5 commits into main from fix/enrich-nonjson-resilience