feat: add Cursor vscdb session support #168
Cursor changed its agent transcript storage format. Old format stored
transcripts as flat files:
agent-transcripts/<uuid>.{txt,jsonl}
New format nests the transcript inside a UUID-named subdirectory:
agent-transcripts/<uuid>/<uuid>.{jsonl,txt}
Discovery, FindSourceFile, classifyPaths (file-watcher path
classification), and project-name extraction all updated to handle
both formats. Dedup and .jsonl-over-.txt preference rules apply to
the new format too.
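A minimal sketch of how both layouts could be recognized, with the .jsonl-over-.txt preference applied during dedup. The helper names transcriptID and pickTranscripts are hypothetical and are not taken from the agentsview code; paths are assumed to be relative to agent-transcripts/ and slash-separated, and the UUID itself is not validated here.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// transcriptID extracts the session ID from a path relative to the
// agent-transcripts directory. It accepts the flat layout
// (<uuid>.jsonl) and the nested layout (<uuid>/<uuid>.jsonl).
// Hypothetical helper for illustration only.
func transcriptID(rel string) (string, bool) {
	parts := strings.Split(rel, "/")
	name := parts[len(parts)-1]
	ext := path.Ext(name)
	if ext != ".jsonl" && ext != ".txt" {
		return "", false
	}
	stem := strings.TrimSuffix(name, ext)
	if stem == "" {
		return "", false
	}
	switch len(parts) {
	case 1: // old format: <uuid>.{txt,jsonl}
		return stem, true
	case 2: // new format: <uuid>/<uuid>.{jsonl,txt}
		if parts[0] == stem {
			return stem, true
		}
	}
	return "", false
}

// pickTranscripts keeps one path per session, preferring .jsonl over .txt.
func pickTranscripts(rels []string) map[string]string {
	best := make(map[string]string)
	for _, rel := range rels {
		id, ok := transcriptID(rel)
		if !ok {
			continue
		}
		cur, seen := best[id]
		if !seen || (path.Ext(cur) == ".txt" && path.Ext(rel) == ".jsonl") {
			best[id] = rel
		}
	}
	return best
}

func main() {
	best := pickTranscripts([]string{
		"1111/1111.txt",
		"1111/1111.jsonl",
		"2222.txt",
		"notes/readme.md",
	})
	fmt.Println(best["1111"], best["2222"], len(best))
}
```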
Made-with: Cursor
I'm going to cut 0.14.0 before getting to this, seems like it may need more work?
feat: add Cursor vscdb session support
Problem
Cursor's JSONL transcript files (written to ~/.cursor/projects/) are a lossy export. Every
assistant message is reduced to {"type": "text"} blocks only: tool calls, MCP invocations,
and subagent relationships are stripped before the file is written. As a result, agentsview
showed Cursor sessions as plain text conversations with no tool blocks, even for sessions
with heavy tool usage.
The full session data is stored locally in Cursor's internal SQLite database at
~/.config/Cursor/User/globalStorage/state.vscdb. This is where Cursor itself reads from
when opening a session, which is why tool calls appear instantly in the Cursor UI but are
absent from the JSONL files.
Solution
Read Cursor sessions from state.vscdb as the primary data source, falling back to JSONL
only for sessions not present in the database.
Data model
state.vscdb has a single cursorDiskKV table (key/value store). Two key patterns are relevant:
- composerData:<sessionId> holds session metadata, including an ordered list of bubble IDs, timestamps, session name, and subComposerIds (child session IDs for subagent sessions).
- bubbleId:<sessionId>:<bubbleId> holds an individual message. Each bubble has type (1=user, 2=assistant), text, createdAt (ISO 8601), and optionally toolFormerData with the tool name, call ID, JSON params, and result.
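As a rough illustration of the bubble key pattern and the bubble fields named above, the sketch below splits a bubbleId key and decodes a bubble value. The helper names parseBubbleKey and decodeBubble and the bubble struct are hypothetical; the internal layout of toolFormerData is not described here, so it is kept as raw JSON rather than guessed at.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// bubble mirrors only the fields named in the description.
type bubble struct {
	Type      int    `json:"type"`      // 1 = user, 2 = assistant
	Text      string `json:"text"`
	CreatedAt string `json:"createdAt"` // ISO 8601 timestamp
	// toolFormerData is left undecoded; its inner field names are unknown here.
	ToolFormerData json.RawMessage `json:"toolFormerData,omitempty"`
}

// parseBubbleKey splits "bubbleId:<sessionId>:<bubbleId>" into its parts.
func parseBubbleKey(key string) (sessionID, bubbleID string, ok bool) {
	const prefix = "bubbleId:"
	if !strings.HasPrefix(key, prefix) {
		return "", "", false
	}
	sessionID, bubbleID, ok = strings.Cut(key[len(prefix):], ":")
	if !ok || sessionID == "" || bubbleID == "" {
		return "", "", false
	}
	return sessionID, bubbleID, true
}

// decodeBubble unmarshals one value from the key/value table.
func decodeBubble(value []byte) (bubble, error) {
	var b bubble
	err := json.Unmarshal(value, &b)
	return b, err
}

func main() {
	sid, bid, ok := parseBubbleKey("bubbleId:session-1:bubble-7")
	fmt.Println(sid, bid, ok)

	b, err := decodeBubble([]byte(`{"type":2,"text":"done","createdAt":"2025-01-02T03:04:05.000Z"}`))
	fmt.Println(b.Type, b.Text, b.CreatedAt, err)
}
```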
Project names are resolved by scanning
~/.config/Cursor/User/workspaceStorage/*/workspace.json (which maps each workspace hash
to a file:// project path) and the companion state.vscdb in each workspace directory
(which lists which composer IDs belong to that workspace).
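One way the workspace.json half of this lookup might work is sketched below. Two things are assumptions rather than facts from this PR: that the path is stored under a "folder" field, and that the project name is the last segment of that path. The function name projectFromWorkspaceJSON is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"path"
)

// workspaceFile models the one field this sketch needs.
// The "folder" field name is an assumption.
type workspaceFile struct {
	Folder string `json:"folder"`
}

// projectFromWorkspaceJSON returns the last path segment of the
// file:// URI stored in a workspace.json document.
func projectFromWorkspaceJSON(data []byte) (string, error) {
	var ws workspaceFile
	if err := json.Unmarshal(data, &ws); err != nil {
		return "", err
	}
	u, err := url.Parse(ws.Folder)
	if err != nil {
		return "", err
	}
	if u.Scheme != "file" || u.Path == "" {
		return "", fmt.Errorf("unsupported workspace folder %q", ws.Folder)
	}
	// u.Path is already percent-decoded by net/url.
	return path.Base(u.Path), nil
}

func main() {
	name, err := projectFromWorkspaceJSON([]byte(`{"folder":"file:///home/me/src/agentsview"}`))
	fmt.Println(name, err)
}
```

Non-file URIs are rejected outright in this sketch; a real implementation may want to handle them differently.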
Message reconstruction
Cursor stores each tool call as its own separate bubble (type 2), rather than grouping
them into a single assistant message as Claude Code does. The parser merges consecutive
type-2 bubbles, accumulating tool calls and concatenating text content, into a single
ParsedMessage. This matches the structure the rest of agentsview expects.
Changes
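The merge described under Message reconstruction can be sketched roughly as follows. This is not the code from cursor_vscdb.go: the bubble and message types are simplified stand-ins for the real ones (ParsedMessage is not reproduced), the ToolName field is hypothetical, and joining text with a newline and skipping unknown bubble types are both assumptions.

```go
package main

import "fmt"

// bubble is a simplified stand-in for a decoded Cursor bubble.
type bubble struct {
	Type     int    // 1 = user, 2 = assistant
	Text     string
	ToolName string // hypothetical: empty when the bubble carries no tool call
}

// message is a simplified stand-in for the parser's output type.
type message struct {
	Role      string
	Text      string
	ToolCalls []string
}

// mergeBubbles folds each run of consecutive assistant bubbles into one
// message, accumulating tool calls and concatenating text.
func mergeBubbles(bubbles []bubble) []message {
	var out []message
	prevAssistant := false
	for _, b := range bubbles {
		if b.Type != 1 && b.Type != 2 {
			continue // unknown bubble types are skipped in this sketch
		}
		isAssistant := b.Type == 2
		if !(isAssistant && prevAssistant) {
			role := "user"
			if isAssistant {
				role = "assistant"
			}
			out = append(out, message{Role: role})
		}
		m := &out[len(out)-1]
		if b.Text != "" {
			if m.Text != "" {
				m.Text += "\n"
			}
			m.Text += b.Text
		}
		if b.ToolName != "" {
			m.ToolCalls = append(m.ToolCalls, b.ToolName)
		}
		prevAssistant = isAssistant
	}
	return out
}

func main() {
	msgs := mergeBubbles([]bubble{
		{Type: 1, Text: "fix the test"},
		{Type: 2, ToolName: "read_file_v2"},
		{Type: 2, ToolName: "edit_file_v2"},
		{Type: 2, Text: "Fixed."},
		{Type: 1, Text: "thanks"},
	})
	for _, m := range msgs {
		fmt.Printf("%s %q %v\n", m.Role, m.Text, m.ToolCalls)
	}
}
```

Here the three assistant bubbles collapse into one assistant message carrying two tool calls, while each user bubble stays its own message.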
internal/parser/cursor_vscdb.go (new)
- ListCursorVscdbSessions returns lightweight metadata for all sessions (used for change detection). Resolves project names via the workspace storage scan.
- ParseCursorVscdbSession parses a single session: loads bubble order from composerData, fetches all bubbleId entries, merges consecutive assistant bubbles into messages.
- parseCursorBubbleTime handles Cursor's ISO 8601 bubble timestamps.
- normalizeCursorParamsJSON handles params stored as either a JSON object or a JSON-encoded string.
internal/parser/cursor_vscdb_test.go (new): 13 unit tests covering list/parse functions, tool call extraction, message grouping, and params normalization.
internal/parser/taxonomy.go
- run_terminal_command_v2 → Bash, read_file_v2 → Read, edit_file_v2/search_replace → Edit, ripgrep_raw_search/rg → Grep, glob_file_search → Glob, task_v2 → Task, delete_file → Write, list_dir_v2 → Read, and several Tool-category names (todo_write, create_plan, etc.).
- mcp-prefix catch-all for MCP tool invocations.
internal/config/config.go
- CursorStateDB string field, defaulting to ~/.config/Cursor/User/globalStorage/state.vscdb.
- CURSOR_STATE_DB env var to override.
internal/sync/engine.go
- Adds CursorStateDB to EngineConfig and Engine.
- syncCursorVscdb follows the OpenCode pattern: uses lastUpdatedAt (unix millis) for per-session change detection via the existing GetFileInfoByPath virtual-path mechanism, calls writeSessionFull for in-place replacement. Builds a child→parent map from subComposerIds and sets ParentSessionID/RelSubagent on child sessions.
- The IDs of sessions handled by vscdb are kept in a set on the engine for the duration of the sync. processCursor checks this set and skips any session already handled by vscdb, preventing the text-only JSONL from overwriting the richer data.
- SyncStats.Synced counter.
internal/sync/engine_integration_test.go: 4 integration tests covering basic sync, change detection (unchanged sessions are not re-parsed), dedup (vscdb wins over file-based JSONL), and subagent parent-child linking.
cmd/agentsview/main.go, cmd/agentsview/sync.go: Pass CursorStateDB through to EngineConfig.
Validation
Tested against a real state.vscdb with 702 sessions: 563 sessions with actual
conversation content synced, producing 25,253 tool calls that were previously invisible.