Description
Context
PR #160 introduced incremental JSONL parsing for Codex and Claude
sessions to prevent SyncPaths from blocking SSE live updates. The
incremental path works well for the common case (linear appends), but
several edge cases were identified during review that require schema
changes or larger refactors beyond that PR's scope.
Follow-up items
1. DAG fork detection at sync boundaries
Problem: hasDAGFork() validates UUID linearity only within the
newly appended entries. The first appended entry's parentUuid is not
checked against the last UUID from the previous sync. If a user forks a
Claude session from an older node (not the tip), the incremental parser
treats it as linear and appends it incorrectly.
Fix: Add a last_entry_uuid column to the sessions table. Populate
it during full parses with the UUID of the last user/assistant entry.
Pass it into ParseClaudeSessionFrom to initialize lastUUID in
hasDAGFork, or fall back to a full parse when the first appended
entry's parentUuid doesn't match.
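
A minimal sketch of the boundary check, assuming a simplified `entry` type and a hypothetical `hasForkFrom` helper (neither is the project's actual code). It seeds the linearity check with the UUID stored by the previous sync, so a first appended entry whose parent is not the stored tip is reported as a fork:

```go
package main

// entry is a hypothetical, simplified view of one parsed JSONL line.
type entry struct {
	UUID       string
	ParentUUID string
}

// hasForkFrom reports whether entries deviate from a linear chain that
// continues from lastUUID (the value a last_entry_uuid column would hold).
// An empty lastUUID means no prior state is known, so the first entry is
// accepted as the head of the chain.
func hasForkFrom(lastUUID string, entries []entry) bool {
	for _, e := range entries {
		if lastUUID != "" && e.ParentUUID != lastUUID {
			return true
		}
		lastUUID = e.UUID
	}
	return false
}
```

When this returns true for an appended chunk, the caller would presumably discard the incremental result and fall back to a full parse.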
2. Ordinal drift when trailing messages are filtered
Problem: The next incremental parse starts from MaxOrdinal()+1,
but toDBMessages / pairAndFilter can drop messages (e.g. user
entries containing only tool_result blocks). If an incremental append
ends with only filtered messages, file_size advances but the DB's max
ordinal does not. The subsequent sync then reuses ordinals that the raw
parse had already consumed, so stored ordinals drift from those a full
parse would assign.
Fix: Store next_ordinal in session metadata (separate from the
DB's max stored ordinal). Update it after each incremental sync to
reflect the raw parsed ordinal count, not the filtered count.
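
The idea can be sketched as follows, assuming hypothetical `rawMessage` and `storedMessage` types and an `assignOrdinals` helper that are not part of the codebase. Every parsed message consumes an ordinal whether or not filtering keeps it, and the returned counter is what would be persisted as next_ordinal:

```go
package main

// rawMessage is a hypothetical parsed entry; Keep is false when the
// filtering step would drop it (e.g. a user entry with only tool_result
// blocks).
type rawMessage struct {
	Text string
	Keep bool
}

// storedMessage is a hypothetical row written to the messages table.
type storedMessage struct {
	Ordinal int
	Text    string
}

// assignOrdinals numbers every raw message starting at nextOrdinal and
// returns only the kept ones, plus the next ordinal to persist. The
// counter advances for filtered messages too, matching a full parse.
func assignOrdinals(nextOrdinal int, raw []rawMessage) ([]storedMessage, int) {
	var kept []storedMessage
	for _, m := range raw {
		if m.Keep {
			kept = append(kept, storedMessage{Ordinal: nextOrdinal, Text: m.Text})
		}
		nextOrdinal++
	}
	return kept, nextOrdinal
}
```

With a chunk that ends in a filtered message, the returned counter is one past the raw count, whereas MaxOrdinal()+1 would hand the filtered message's ordinal to the next append.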
3. Non-atomic message + metadata writes
Problem: writeIncremental executes message insertion and
UpdateSessionIncremental in separate statements. If messages are
inserted but the metadata update fails, file_size stays stale while
MaxOrdinal has advanced, risking duplicate messages on retry.
Fix: Wrap both operations in a single database transaction so
message ordinals and file offsets advance atomically.
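
A rough sketch of the all-or-nothing pattern, assuming a hypothetical `txn` interface and `withTx` helper. `*sql.Tx` from `database/sql` has `Commit` and `Rollback` methods with these signatures; `fakeTx` and `errUpdateFailed` exist only so the sketch can be exercised without a database:

```go
package main

import "errors"

// txn is a hypothetical minimal transaction interface.
type txn interface {
	Commit() error
	Rollback() error
}

// withTx runs each step in order inside one transaction. It rolls back
// and returns the step's error on the first failure, and commits only
// if every step succeeds.
func withTx(tx txn, steps ...func() error) error {
	for _, step := range steps {
		if err := step(); err != nil {
			_ = tx.Rollback() // the step error is the one worth reporting
			return err
		}
	}
	return tx.Commit()
}

// fakeTx records the outcome in memory; it stands in for a real
// database transaction in this sketch.
type fakeTx struct{ committed, rolledBack bool }

func (f *fakeTx) Commit() error   { f.committed = true; return nil }
func (f *fakeTx) Rollback() error { f.rolledBack = true; return nil }

// errUpdateFailed simulates a failing metadata update.
var errUpdateFailed = errors.New("metadata update failed")
```

In writeIncremental, the two steps would be the message insertion and the UpdateSessionIncremental call, both issued on the same transaction handle.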
4. Codex exec events dropped in incremental path
Problem: processCodex hardcodes includeExec=false for the
incremental parse function. If a session was originally synced via
SyncSingleSession with includeExec=true (exec-originated), a
subsequent incremental sync silently drops newly appended exec events
while advancing the offset.
Fix: Store whether the session is exec-originated and pass it
through to the incremental parser, or skip incremental for exec sessions.
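
One way to picture the pass-through, using hypothetical `codexEvent` and `sessionMeta` types and helper names that are assumptions rather than the project's API. The stored flag replaces the hardcoded false:

```go
package main

// codexEvent is a hypothetical, simplified parsed event.
type codexEvent struct {
	Kind string // e.g. "message" or "exec"
}

// sessionMeta is hypothetical; ExecOriginated stands in for whatever
// column or field would record how the session was first synced.
type sessionMeta struct {
	ExecOriginated bool
}

// parseAppended keeps exec events only when includeExec is set.
func parseAppended(events []codexEvent, includeExec bool) []codexEvent {
	var out []codexEvent
	for _, ev := range events {
		if ev.Kind == "exec" && !includeExec {
			continue
		}
		out = append(out, ev)
	}
	return out
}

// syncIncremental forwards the stored flag instead of hardcoding false.
func syncIncremental(meta sessionMeta, appended []codexEvent) []codexEvent {
	return parseAppended(appended, meta.ExecOriginated)
}
```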
5. Cross-sync tool result pairing
Problem: The incremental parser pairs tool calls and results only
within the appended chunk. If an assistant tool_use was stored in a
previous sync and the matching user tool_result arrives in a later
append, pairToolResults can't see the earlier call, so the result
content is never linked to it.
Fix: Look up unmatched tool calls by tool_use_id from the DB
during incremental sync, or fall back to a full reparse when unmatched
tool results are detected.
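
A sketch of the lookup variant, assuming a hypothetical `toolResult` type and a `storedCallExists` callback that would be backed by a DB query on tool_use_id. Results are matched against calls in the current chunk first, then against previously stored calls; anything left over is reported so the caller can decide on a full reparse:

```go
package main

// toolResult is a hypothetical, simplified tool_result block.
type toolResult struct {
	ToolUseID string
	Content   string
}

// pairAcrossSyncs maps each result's content to its tool_use_id when a
// matching call exists either in the current chunk (chunkCalls) or in
// storage (storedCallExists). IDs with no matching call anywhere are
// returned in unmatched.
func pairAcrossSyncs(
	chunkCalls map[string]bool,
	results []toolResult,
	storedCallExists func(toolUseID string) bool,
) (paired map[string]string, unmatched []string) {
	paired = make(map[string]string)
	for _, r := range results {
		if chunkCalls[r.ToolUseID] || storedCallExists(r.ToolUseID) {
			paired[r.ToolUseID] = r.Content
			continue
		}
		unmatched = append(unmatched, r.ToolUseID)
	}
	return paired, unmatched
}
```

A non-empty unmatched slice is the signal for the fallback described above.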
6. Subagent mapping lost in incremental Claude parsing
Problem: Full Claude parsing collects queue-operation and
progress events to populate subagentMap, which annotates Task/Agent
tool calls with SubagentSessionID. The incremental parser skips all
non-user/assistant lines, so newly appended Task/Agent tool calls lose
subagent linkage.
Fix: Mirror the subagent-mapping collection in
ParseClaudeSessionFrom and annotate incremental messages before
returning.
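
A minimal sketch of the two-pass approach, assuming simplified `jsonlLine` and `toolCall` types. The field names `ToolUseID` and `AgentSessionID` on the event lines are illustrative assumptions, not the actual event schema:

```go
package main

// jsonlLine is a hypothetical, simplified view of one appended line.
type jsonlLine struct {
	Type           string // "user", "assistant", "queue-operation", "progress"
	ToolUseID      string // assumed link back to the Task/Agent tool call
	AgentSessionID string // assumed subagent session identifier
}

// toolCall is a hypothetical parsed Task/Agent tool call.
type toolCall struct {
	ToolUseID         string
	SubagentSessionID string
}

// annotateSubagents builds a subagent map from queue-operation and
// progress lines in the appended chunk, then returns a copy of calls
// with SubagentSessionID filled in where a mapping exists.
func annotateSubagents(lines []jsonlLine, calls []toolCall) []toolCall {
	subagentMap := make(map[string]string)
	for _, l := range lines {
		isEvent := l.Type == "queue-operation" || l.Type == "progress"
		if isEvent && l.AgentSessionID != "" {
			subagentMap[l.ToolUseID] = l.AgentSessionID
		}
	}
	out := make([]toolCall, len(calls))
	copy(out, calls)
	for i := range out {
		if id, ok := subagentMap[out[i].ToolUseID]; ok {
			out[i].SubagentSessionID = id
		}
	}
	return out
}
```

This only sees events inside the appended chunk; events written in an earlier sync for a call that appears later would still need a stored mapping or a full reparse.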