#274 Slice A: skip harness-injected pseudo-turns in the consolidation sweep #359
Merged
… sweep

Claude Code injects non-conversational blocks as user-role turns: a background-agent <task-notification>, and the /compact continuation summary. They reach the ledger as turn-delta records and are already recall-excluded by kind (#332), but the consolidation sweep read the raw ledger unfiltered, so the in-context AI consolidated them as if the operator had said them.

Per #333 these are NOT dropped pre-ledger (that deletes recoverable content against the durability law). Instead capture TAGS an injected message (records.INJECTED_TAG) on every chunk (recognised on the whole message before chunking, so a >4 KB continuation summary is fully tagged, not just its first chunk), and the sweep skips a tagged/injected record as fuel. The record stays physically resident and recoverable; recall already excludes it.

The marker set is the two distinctive, ground-truthed standalone sentinels (<task-notification>, the continuation summary); each is the whole injected message and never fuses with a real prompt. <system-reminder> is deliberately excluded: it fuses with a human prompt in the same turn, so a start-anchored drop would lose real content (confirmed against live transcripts).

detect_unconsolidated / _scan_sessions apply the same predicate, so a session whose turn-deltas are all injected is never flagged pending forever (read and detection must agree, or the sweep loops).

- records.py: INJECTED_TAG + is_injected_pseudo_turn_text / is_injected_record
- capture.py: _make_record(injected=) + message-level tagging in _capture
- consolidate.py: read_deltas + _scan_sessions skip injected; demo PART 6
- tests: test_records / test_capture (incl. multi-chunk) / test_consolidate

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… "fuel" coinage

Wrap two over-long PART 6 output lines (108/114 chars) toward the demo's ~80 convention, and rename the internal `fuel` variable / operator-facing phrasing so the demo no longer leans on the undefined "fuel" metaphor; "what the AI reads to tidy this session" carries the meaning plainly. Behaviour and the demo's pass/fail self-check are unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Part of #274 (the memory anchor decision). Slice A of three — the small, independent piece; #274 closes with the consolidated-raw erasure slice, not here. Folds in the #333 sub-concern.
Purpose
Stop the consolidation sweep from tidying the engine's own injected notifications as if the operator had written them — while keeping every byte resident and recoverable.
Claude Code injects non-conversational blocks as user-role turns: a background-agent <task-notification>, and the /compact continuation summary (This session is being continued from a previous conversation…). They land in the ledger as turn-delta records and are already excluded from recall by kind (#332, "Memory recall is dominated by raw turn-notes; curated summaries never surface (locked-spec tension)"), but consolidate.read_deltas read the raw ledger unfiltered, so the in-context AI consolidated them into the operator's episodic record as junk.
Impact: consolidation now reads only genuine conversation as its fuel; the injected notes stay in the ledger, recoverable, and recall is unchanged.
Scope
Recognise an injected pseudo-turn at capture and tag it (resident, never dropped); skip a tagged/injected record in the consolidation sweep.
- records.py: INJECTED_TAG plus is_injected_pseudo_turn_text(text) (start-anchored on the two ground-truthed standalone sentinels) and is_injected_record(record) (the durable tag path, with a text-prefix fallback for records captured before tagging).
- capture.py: _make_record(..., injected=) appends INJECTED_TAG; _capture decides injectedness on the whole message before chunking, so every chunk of a >4 KB continuation summary is tagged, not just the first.
- consolidate.py: read_deltas and _scan_sessions skip injected records. Applying the predicate in detection as well as in the read keeps a session whose turn-deltas are all injected from being flagged pending forever (read and detection must agree, or the sweep loops). A demo part (PART 6) shows the notice + banner skipped as fuel while all three notes stay in the cabinet.
Impact: the fix is read-side for consolidation and a tag at capture; no record is ever deleted, and recall (already turn-delta-excluded) is untouched.
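The predicates and the capture-time tagging described above could look roughly like the following. This is a minimal sketch, not the code in this PR: the tag value, the record shape (a dict with `kind`, `text`, `tags`), the `capture` helper, the 4096-byte chunk size and the leading-whitespace strip are all assumptions made for illustration. Only the two sentinel strings and the function names `is_injected_pseudo_turn_text` / `is_injected_record` come from the description.

```python
# Hypothetical tag value; the real records.INJECTED_TAG string is not shown here.
INJECTED_TAG = "injected"

# The two standalone sentinels named in the description.
_SENTINELS = (
    "<task-notification>",
    "This session is being continued from a previous conversation",
)

CHUNK_SIZE = 4096  # assumed from the ">4 KB" wording


def is_injected_pseudo_turn_text(text: str) -> bool:
    """Start-anchored check: the message must begin with a sentinel."""
    return text.lstrip().startswith(_SENTINELS)


def is_injected_record(record: dict) -> bool:
    """Prefer the durable tag; fall back to the text prefix for older records."""
    if INJECTED_TAG in record.get("tags", ()):
        return True
    return is_injected_pseudo_turn_text(record.get("text", ""))


def capture(message: str) -> list[dict]:
    """Decide injectedness once on the whole message, then tag every chunk."""
    tags = [INJECTED_TAG] if is_injected_pseudo_turn_text(message) else []
    return [
        {"kind": "turn-delta", "text": message[i:i + CHUNK_SIZE], "tags": list(tags)}
        for i in range(0, len(message), CHUNK_SIZE)
    ]


summary = "This session is being continued from a previous conversation" + "x" * 9000
chunks = capture(summary)
# Later chunks no longer start with the sentinel, so only the tag identifies them.
print(len(chunks), all(is_injected_record(c) for c in chunks))
```

Deciding on the whole message matters because a prefix check run per chunk would miss every chunk after the first; in this sketch the tag carries the decision forward.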
Out of scope
The rest of #274, <system-reminder>, and any pre-ledger drop.
<system-reminder> is deliberately not a marker: ground-truth over live transcripts shows it fuses with a human prompt in the same captured turn (8 confirmed cases), so a start-anchored drop would lose real content. [SYSTEM NOTIFICATION…] and <user-prompt-submit-hook> are omitted as inert: they anchor 0 stored records.
Impact: the change is bounded to the two distinctive standalone sentinels and never touches the recall path or the erasure machinery.
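To illustrate why a start-anchored check on <system-reminder> would be unsafe, the sketch below compares the two-sentinel marker set with a hypothetical over-broad one. The `is_injected` helper and both sample messages are invented for illustration; only the marker strings come from the description.

```python
SENTINELS = (
    "<task-notification>",
    "This session is being continued from a previous conversation",
)


def is_injected(text: str, markers: tuple = SENTINELS) -> bool:
    """Start-anchored check against a set of markers."""
    return text.lstrip().startswith(markers)


# Invented examples: a standalone notification, and a turn where a reminder
# is fused with a genuine human prompt.
standalone = "<task-notification>Background agent finished.</task-notification>"
fused_turn = (
    "<system-reminder>Reminder body.</system-reminder>\n"
    "Please rename the variable."
)

# Hypothetical over-broad marker set that this PR deliberately avoids.
over_broad = SENTINELS + ("<system-reminder>",)

print(is_injected(standalone))              # True: the whole message is injected
print(is_injected(fused_turn))              # False: the human prompt is kept
print(is_injected(fused_turn, over_broad))  # True: the prompt would be skipped too
```

With the over-broad set, the operator's "Please rename the variable." would be skipped along with the reminder, which is the false drop described above.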
Risk
Low — a read-side consolidation filter plus a capture-time tag, both behaviour-preserving for genuine conversation; the failure mode is at most slightly noisier consolidation fuel, never a lost or hidden turn.
… <system-reminder>) is excluded. .engine/tools/memory/*.py edits trip engine-guard, so this carries a guardrail-ack; no guardrail is weakened.
Impact: the worst case is an injected note that escapes the filter and is summarised as before; each tool's own test suite plus the demo is the regression catch.
Validation
Full suite and the CI validator green; the demo exercises the real filter.
- python -m unittest discover -s tools -p 'test_*.py' → 2755 tests OK (2 documented offline skips), run from the worktree. New legs: the two predicates (start-anchor, tag/text, <system-reminder> excluded); capture tagging incl. a multi-chunk continuation summary where every chunk is tagged; read_deltas skip-but-keep; the all-injected session never flagged pending.
- validate.py --suite CI → the sole hard finding is the known local no-token state of disposition-issue-resolution (it needs a token to bite; CI is its real witness); the change does not touch it.
- graph.json regenerated and in sync; self-map.md unchanged (declaration-derived).
- Demo (consolidate.py demo, PART 6) shows the notification + banner skipped as fuel while all three notes remain resident.
Impact: the behaviour-preserving claim rests on the existing memory suites staying green plus the new injected-turn legs and the falsifiable demo.
Review
A four-lens cold plan gate and a four-lens cold deliverable gate both ran; findings were ground-truthed against source and folded. No blocking or serious findings remain.
… _scan_sessions; a blocking false-drop (the <system-reminder> fusion) and the inert/over-broad markers, fixed by paring the list to the two ground-truthed sentinels (verified against the live ledger: 206 task-notifications and 20 continuation summaries, 0 fused); and the chunking gap, fixed by tagging at capture before chunking.
Impact: this is the engine's own account of the review; the maintainer's merge is the binding gate.
Files of interest
The predicate, the capture-time tag, and the sweep skip.
- .engine/tools/memory/records.py: is_injected_pseudo_turn_text / is_injected_record and INJECTED_TAG, the shared cycle-free vocabulary both writers import.
- .engine/tools/memory/capture.py: message-level tagging in _capture (covers every chunk).
- .engine/tools/memory/consolidate.py: read_deltas + _scan_sessions skip, and the read↔detection agreement.
Impact: these three carry the mechanism; the test files pin each leg.
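The read↔detection agreement can be sketched as below, assuming a simplified in-memory ledger. The record fields (`session`, `kind`, `text`, `tags`), the tag value, and the function signatures are hypothetical stand-ins for the real read_deltas and _scan_sessions, and the text-prefix fallback is left out for brevity.

```python
INJECTED_TAG = "injected"  # hypothetical value


def is_injected_record(record: dict) -> bool:
    """Tag-only check; the real predicate also has a text-prefix fallback."""
    return INJECTED_TAG in record.get("tags", ())


def _is_genuine_delta(record: dict) -> bool:
    """The single predicate shared by the read and the detection."""
    return record["kind"] == "turn-delta" and not is_injected_record(record)


def read_deltas(ledger: list[dict], session: str) -> list[dict]:
    """What the sweep reads for one session; injected records stay in the ledger."""
    return [r for r in ledger if r["session"] == session and _is_genuine_delta(r)]


def scan_sessions(ledger: list[dict], consolidated: set[str]) -> list[str]:
    """Sessions that still have something for read_deltas to return."""
    pending: list[str] = []
    for r in ledger:
        s = r["session"]
        if s not in consolidated and s not in pending and _is_genuine_delta(r):
            pending.append(s)
    return pending


ledger = [
    {"session": "s1", "kind": "turn-delta", "text": "<task-notification>...", "tags": [INJECTED_TAG]},
    {"session": "s2", "kind": "turn-delta", "text": "Please rename the variable.", "tags": []},
    {"session": "s2", "kind": "turn-delta", "text": "<task-notification>...", "tags": [INJECTED_TAG]},
]

print(scan_sessions(ledger, consolidated=set()))  # s1 is all-injected, so not pending
print(len(read_deltas(ledger, "s2")), len(ledger))  # one genuine delta read; nothing deleted
```

If detection flagged s1 while the read returned nothing for it, the session could never be marked done and the sweep would revisit it indefinitely; routing both through one predicate avoids that.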
Claude involvement
Claude (Opus 4.8) built the slice under the maintainer's direction; the maintainer chose the complete (capture-time) scope and holds the merge.
… <system-reminder> exclusion were settled by ground-truth against the live ledger and transcripts.
Impact: AI judgment is load-bearing on which shapes are injected vs conversation; the ground-truth tally, the tests, and the demo are the correlate.