release: v0.11.3 — cursor scanner fix + critical follow-ups#46

Merged: subinium merged 1 commit into main from release/0.11.3 on May 29, 2026

@subinium

Owner

Builds on @rooty0's #45 (already merged to main, commits cf66a4d..8f3688f, authorship preserved as Stan <github@rooty.name>) and adds the follow-ups that surfaced in a multi-angle adversarial review (correctness + regression + claims + test gaps + code quality).

Verified from #45 (correct, kept)

  • depth-4 .jsonl + legacy depth-3 .txt walk — confirmed against live data (~/.cursor/projects has 2 JSONL at depth 4, 0 TXT; previous scanner returned No sessions found.)
  • SELECT value FROM meta WHERE key = '0' — confirmed via sqlite3 on a real store.db: tables are blobs + meta(key TEXT, value TEXT), value is a 506-byte hex-encoded JSON with agentId/name/createdAt/mode
  • orphan skip — confirmed: 100% of my JSONL transcripts have no matching ~/.cursor/chats/<workspace>/<session_id>/store.db (the chat session UUIDs don't overlap with the transcript UUIDs at all), and cursor-agent --resume would refuse them
  • var-folders-* rejection — confirmed: ~/.cursor/projects had 17 contaminated dirs, all matched
  • 13 well-targeted unit tests

Closes #35.

Critical follow-ups in this PR

These were confirmed bugs in #45's new code, validated by reproducers:

  • extract_first_prompt panic on inverted <user_query> tags: str::find returns the first occurrence of each substring independently, so a text where </user_query> precedes <user_query> (pasted log, AI-generated code) gave s > e, and text[s+12..e] panicked with begin > end. Confirmed via a standalone rustc reproducer. The panic was caught by the rayon thread join, so the binary stayed up, but the entire Cursor scan silently returned 0 sessions. The fix searches for the closing tag after the opening one. (extract_first_prompt_does_not_panic_on_inverted_tags)
  • extract_first_prompt aborts on first bad line — both the per-line IO read (let line = line.ok()?;) and serde_json::from_str(line).ok()? propagated None out of the whole function on the first malformed/non-UTF-8 line, silently disabling the blank-summary fallback the PR introduced. Replaced with let Ok(...) else { continue; };, matching scanner/pi.rs. (extract_first_prompt_skips_malformed_json_lines, extract_first_prompt_skips_invalid_utf8_lines)
  • Legacy .txt deletion silently broken: delete_cursor_agent_session called remove_dirs_matching_name, which filters on path.is_dir(). Legacy sessions live at agent-transcripts/<uuid>.txt (a file), so it never matched: delete returned Ok(()), the orphan persisted, and the next scan resurrected it. New sibling helper remove_files_matching_name handles the file form. (delete_cursor_agent_removes_legacy_txt_transcript)
  • CACHE_VERSION 5 → 6 — the new orphan-skip rule fires only on fresh scans; cached 0.11.x cursor entries would persist until each transcript's mtime changes, so upgraders wouldn't see the orphan fix until then. Bumped to force a one-time rescan, so the PR's "35 orphans → 0" effect actually lands for upgraders.
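
The first two fixes above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the names `extract_user_query` and `first_prompt` are hypothetical, and the real `extract_first_prompt` also parses each line as JSON with serde_json, which is left out here to keep the sketch dependency-free. It shows the closing tag being searched only in the text after the opening tag, and a failed line read skipping that line instead of aborting the whole scan.

```rust
use std::io::BufRead;

const OPEN: &str = "<user_query>";
const CLOSE: &str = "</user_query>";

/// Returns the trimmed text between the first `<user_query>` and the first
/// `</user_query>` that follows it, or `None` if no such pair exists.
/// (Hypothetical helper, for illustration only.)
fn extract_user_query(text: &str) -> Option<&str> {
    let start = text.find(OPEN)? + OPEN.len();
    // Search only the remainder, so a stray closing tag that appears before
    // the opening tag can never produce an inverted slice range.
    let len = text[start..].find(CLOSE)?;
    Some(text[start..start + len].trim())
}

/// Returns the first non-empty prompt found in `reader`.
/// (Hypothetical helper, for illustration only.)
fn first_prompt<R: BufRead>(reader: R) -> Option<String> {
    for line in reader.lines() {
        // A failed read (for example invalid UTF-8) skips this line only;
        // `line.ok()?` here would return `None` for the whole input.
        let Ok(line) = line else { continue; };
        if let Some(prompt) = extract_user_query(&line) {
            if !prompt.is_empty() {
                return Some(prompt.to_string());
            }
        }
    }
    None
}
```

With this shape, an input whose only closing tag precedes the opening tag yields `None` rather than a panic, and a transcript whose first line is not valid UTF-8 still yields the prompt from a later line.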

Quality follow-ups

  • 512 KiB byte budget on extract_first_prompt — matches the cap pi.rs got in v0.11.2 after Claude logs stalled the TUI. Cursor transcripts can carry multi-MB tool-result blobs, and the CACHE_VERSION bump forces a cold rescan for everyone, so the same precaution applies.
  • Stem == parent UUID invariant on the .jsonl arm. A stray agent-transcripts/<uuidA>/<uuidB>.jsonl produces a session_id that mismatches both the store.db key and what cursor-agent --resume expects. Real Cursor always writes them equal, but the invariant is now explicit. (scan_from_rejects_jsonl_with_stem_mismatched_to_parent)
  • Hyphenated-path backtracker test: agent-tui-finder is this very repo's name, so Users-...-Desktop-github-agent-tui-finder with agent/agent-tui/agent-tui-finder as siblings is the load-bearing case to lock down. (decode_dash_path_resolves_hyphenated_segments)
  • SUMMARY_MAX_CHARS = 100 lifted from inline magic numbers.
  • cargo fmt over the new code.
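
One way a byte budget like the 512 KiB cap above could be enforced is sketched below. This is an assumption about the approach, not the implementation in this PR: `PROMPT_SCAN_BUDGET` and `find_line_within_budget` are hypothetical names, and the substring match stands in for the real prompt extraction. It relies on the standard library's `Read::take`, which makes the reader report end of input after a fixed number of bytes.

```rust
use std::io::BufRead;

/// Hypothetical constant mirroring the 512 KiB figure in the description.
const PROMPT_SCAN_BUDGET: u64 = 512 * 1024;

/// Returns the first line containing `needle`, reading at most `budget`
/// bytes from `reader`. (Hypothetical helper, for illustration only.)
fn find_line_within_budget<R: BufRead>(reader: R, needle: &str, budget: u64) -> Option<String> {
    // `Read::take` caps the total bytes read, so a transcript carrying
    // multi-megabyte blobs costs a bounded amount of IO.
    let limited = std::io::Read::take(reader, budget);
    for line in limited.lines() {
        // The cap may cut the final line short, possibly mid-character;
        // an unreadable line is skipped rather than treated as fatal.
        let Ok(line) = line else { continue; };
        if line.contains(needle) {
            return Some(line);
        }
    }
    None
}
```

A caller would pass `PROMPT_SCAN_BUDGET` as the budget. The trade-off is that a prompt located past the cap is not found, and the summary would then have to come from a fallback.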

Test plan

  • cargo fmt --all --check
  • cargo clippy --all-targets -- -D warnings
  • cargo test: 50 passed (44 from main + 6 new regression tests)
  • cargo build --release, installed locally, agf --version → 0.11.3
  • agf list works across all agents (Claude/Codex/Hermes/etc. unchanged)
  • agf list --agent cursor-agent → "No sessions found." on my local data (every JSONL is an orphan, matches cursor-agent's own /resume behavior, no crash)
  • Diff scoped to 6 intended files; no shared code paths affecting other agents

Closes #35.

Builds on @rooty0's #45 (cursor-agent scanner fix, commits cf66a4d..8f3688f
on main, authorship preserved) and adds the follow-ups that surfaced in
adversarial review.

Verified from #45 (correct):
- depth-4 .jsonl + legacy depth-3 .txt walk
- meta(key='0') store.db read with hex-encoded JSON value
- orphan skip when no matching ~/.cursor/chats/<ws>/<id>/store.db
- decode_dash_path with var-folders rejection
- 13 well-targeted unit tests

Follow-ups in this commit:

Critical fixes (verified bugs in the new code):
- extract_first_prompt: do not panic on inverted <user_query> tags.
  str::find returns the FIRST occurrence of each substring independently,
  so a text with </user_query> byte-preceding <user_query> gave s>e and
  text[s+12..e] panicked. Confirmed via rustc reproducer. Search for the
  closing tag AFTER the opening one. Regression test added.
- extract_first_prompt: skip-not-abort on bad lines. .ok()? on both the
  IO read and serde_json::from_str killed extraction for the whole file
  on the first error, defeating the PR's own blank-summary fallback.
  Replaced with let Ok(...) else { continue; }, matching scanner/pi.rs.
  Two regression tests added (malformed JSON, invalid UTF-8).
- delete_cursor_agent_session: legacy .txt transcripts are now actually
  removed. remove_dirs_matching_name filters on path.is_dir() so .txt
  files never matched; delete returned Ok(()) and the next scan
  resurrected the orphan. New sibling helper remove_files_matching_name
  handles the file form. Regression test added.
- cache::CACHE_VERSION bumped 5 -> 6 so the new orphan-skip rule fires
  for upgraders on first launch instead of waiting for each transcript's
  mtime to change.

Quality follow-ups:
- 512 KiB byte budget on extract_first_prompt (matches the cap pi.rs got
  in v0.11.2 after Claude logs stalled the TUI; cursor transcripts can
  carry multi-MB tool-result blobs).
- Stem == parent UUID invariant on the .jsonl arm. A stray
  agent-transcripts/<uuidA>/<uuidB>.jsonl produces a session_id that
  mismatches both the store.db key and cursor-agent --resume. Regression
  test added.
- SUMMARY_MAX_CHARS constant lifted from inline 100s.
- Hyphenated-path backtracker test (agent-tui-finder case).
- cargo fmt over the new code.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@subinium subinium merged commit 904022d into main May 29, 2026
5 checks passed
@subinium subinium deleted the release/0.11.3 branch May 29, 2026 01:03

Development

Successfully merging this pull request may close these issues.

scanner/cursor: agent-transcripts now uses .jsonl in nested <id>/ subdir, 0 sessions surfaced
