
feat: add Cursor vscdb session support #168

Draft
BadLiveware wants to merge 2 commits into wesm:main from
BadLiveware:cursor-compatability

@BadLiveware BadLiveware commented Mar 13, 2026

feat: add Cursor vscdb session support

Problem

Cursor's JSONL transcript files (written to ~/.cursor/projects/) are a lossy export.
Every assistant message is reduced to {"type": "text"} blocks only — tool calls, MCP
invocations, and subagent relationships are stripped before the file is written. As a
result, agentsview showed Cursor sessions as plain text conversations with no tool blocks,
even for sessions with heavy tool usage.

The full session data is stored locally in Cursor's internal SQLite database at
~/.config/Cursor/User/globalStorage/state.vscdb. This is where Cursor itself reads from
when opening a session — which is why tool calls appear instantly in the Cursor UI but are
absent from the JSONL files.

Solution

Read Cursor sessions from state.vscdb as the primary data source, falling back to JSONL
only for sessions not present in the database.

Data model

state.vscdb has a single cursorDiskKV table (key/value store). Two key patterns are
relevant:

composerData:<sessionId> — session metadata including an ordered list of bubble IDs,
timestamps, session name, and subComposerIds (child session IDs for subagent sessions).

bubbleId:<sessionId>:<bubbleId> — individual messages. Each bubble has type
(1=user, 2=assistant), text, createdAt (ISO 8601), and optionally toolFormerData
with the tool name, call ID, JSON params, and result.

Project names are resolved by scanning
~/.config/Cursor/User/workspaceStorage/*/workspace.json (which maps each workspace hash
to a file:// project path) and the companion state.vscdb in each workspace directory
(which lists which composer IDs belong to that workspace).
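
As a rough illustration of the data model above, the following sketch splits a bubbleId key and decodes one bubble value. The top-level field names (type, text, createdAt, toolFormerData) come from the description; the nested JSON names "name" and "toolCallId", and the helper names splitBubbleKey and decodeBubble, are assumptions made for the example and may not match the PR's code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// toolFormerData holds the tool-call payload attached to a bubble.
// The JSON names "name" and "toolCallId" are assumptions for this sketch.
type toolFormerData struct {
	Name   string          `json:"name"`
	CallID string          `json:"toolCallId"`
	Params json.RawMessage `json:"params"`
	Result json.RawMessage `json:"result"`
}

// cursorBubble mirrors the bubble fields named in the description.
type cursorBubble struct {
	Type      int             `json:"type"` // 1 = user, 2 = assistant
	Text      string          `json:"text"`
	CreatedAt string          `json:"createdAt"` // ISO 8601
	Tool      *toolFormerData `json:"toolFormerData"`
}

// splitBubbleKey splits "bubbleId:<sessionId>:<bubbleId>" into its two IDs.
func splitBubbleKey(key string) (sessionID, bubbleID string, ok bool) {
	parts := strings.SplitN(key, ":", 3)
	if len(parts) != 3 || parts[0] != "bubbleId" || parts[1] == "" || parts[2] == "" {
		return "", "", false
	}
	return parts[1], parts[2], true
}

// decodeBubble unmarshals one cursorDiskKV value into a cursorBubble.
func decodeBubble(value []byte) (cursorBubble, error) {
	var b cursorBubble
	err := json.Unmarshal(value, &b)
	return b, err
}

func main() {
	sessionID, bubbleID, ok := splitBubbleKey("bubbleId:sess-1:bub-9")
	fmt.Println(sessionID, bubbleID, ok)

	raw := []byte(`{"type":2,"text":"","toolFormerData":{"name":"read_file_v2","params":"{}"}}`)
	b, err := decodeBubble(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(b.Type, b.Tool.Name)
}
```

Keeping params and result as json.RawMessage defers the decision about their shape, which seems useful given that params may arrive as either an object or an encoded string.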

Message reconstruction

Cursor stores each tool call as its own separate bubble (type 2), rather than grouping
them into a single assistant message as Claude Code does. The parser merges consecutive
type-2 bubbles — accumulating tool calls and concatenating text content — into a single
ParsedMessage. This matches the structure the rest of agentsview expects.
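
The merge step described above can be sketched roughly as follows. The bubble and message types here are simplified stand-ins (not the PR's ParsedMessage), and joining text with a newline is an assumption; the description only says text content is concatenated.

```go
package main

import "fmt"

// bubble is a simplified stand-in for a decoded Cursor bubble.
type bubble struct {
	Type     int    // 1 = user, 2 = assistant
	Text     string // may be empty for pure tool-call bubbles
	ToolName string // empty when the bubble carries no tool call
}

// message is a hypothetical, reduced form of a parsed message.
type message struct {
	Role      string
	Text      string
	ToolCalls []string
}

// mergeBubbles folds each run of consecutive assistant bubbles into one
// message, accumulating tool calls and concatenating text.
func mergeBubbles(bubbles []bubble) []message {
	var out []message
	for _, b := range bubbles {
		var role string
		switch b.Type {
		case 1:
			role = "user"
		case 2:
			role = "assistant"
		default:
			continue // unknown bubble types are skipped in this sketch
		}
		extend := role == "assistant" && len(out) > 0 && out[len(out)-1].Role == "assistant"
		if !extend {
			out = append(out, message{Role: role})
		}
		m := &out[len(out)-1]
		if b.Text != "" {
			if m.Text != "" {
				m.Text += "\n" // separator is an assumption
			}
			m.Text += b.Text
		}
		if b.ToolName != "" {
			m.ToolCalls = append(m.ToolCalls, b.ToolName)
		}
	}
	return out
}

func main() {
	msgs := mergeBubbles([]bubble{
		{Type: 1, Text: "fix the bug"},
		{Type: 2, Text: "Looking."},
		{Type: 2, ToolName: "read_file_v2"},
		{Type: 2, Text: "Done.", ToolName: "edit_file_v2"},
		{Type: 1, Text: "thanks"},
	})
	for _, m := range msgs {
		fmt.Printf("%s %q %v\n", m.Role, m.Text, m.ToolCalls)
	}
}
```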

Changes

internal/parser/cursor_vscdb.go (new)

  • ListCursorVscdbSessions — returns lightweight metadata for all sessions (used for
    change detection). Resolves project names via the workspace storage scan.
  • ParseCursorVscdbSession — parses a single session: loads bubble order from
    composerData, fetches all bubbleId entries, merges consecutive assistant bubbles
    into messages.
  • parseCursorBubbleTime — handles Cursor's ISO 8601 bubble timestamps.
  • normalizeCursorParamsJSON — handles params stored as either a JSON object or a
    JSON-encoded string.
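
To illustrate the params handling in the last bullet, here is a minimal sketch of one way to accept both shapes. The function name normalizeParams and its exact behavior (empty input yields an empty string, invalid JSON yields an error) are assumptions for the example, not the PR's normalizeCursorParamsJSON.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// normalizeParams returns params as a plain JSON document. If the input is
// a JSON string (for example "{\"path\":\"a.go\"}"), it is unwrapped once.
func normalizeParams(raw []byte) (string, error) {
	trimmed := bytes.TrimSpace(raw)
	if len(trimmed) == 0 {
		return "", nil
	}
	if trimmed[0] == '"' {
		var inner string
		if err := json.Unmarshal(trimmed, &inner); err != nil {
			return "", err
		}
		trimmed = bytes.TrimSpace([]byte(inner))
		if len(trimmed) == 0 {
			return "", nil
		}
	}
	if !json.Valid(trimmed) {
		return "", errors.New("params is not valid JSON")
	}
	return string(trimmed), nil
}

func main() {
	asObject, _ := normalizeParams([]byte(`{"path":"a.go"}`))
	asString, _ := normalizeParams([]byte(`"{\"path\":\"a.go\"}"`))
	fmt.Println(asObject == asString)
}
```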

internal/parser/cursor_vscdb_test.go (new): 13 unit tests covering list/parse
functions, tool call extraction, message grouping, and params normalization.

internal/parser/taxonomy.go

  • Added vscdb tool name mappings: run_terminal_command_v2 → Bash, read_file_v2 →
    Read, edit_file_v2/search_replace → Edit, ripgrep_raw_search/rg → Grep,
    glob_file_search → Glob, task_v2 → Task, delete_file → Write, list_dir_v2 →
    Read, and several Tool-category names (todo_write, create_plan, etc.).
  • Added mcp- prefix catch-all for MCP tool invocations.
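
The mapping above could be expressed as a lookup table with a prefix fallback, as in the sketch below. The table entries are taken from the list above; the category returned for mcp- names ("Tool") and the "Other" fallback are assumptions, since the description does not state them, and categorize is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"strings"
)

// vscdbToolCategories lists the vscdb tool names from the description.
var vscdbToolCategories = map[string]string{
	"run_terminal_command_v2": "Bash",
	"read_file_v2":            "Read",
	"list_dir_v2":             "Read",
	"edit_file_v2":            "Edit",
	"search_replace":          "Edit",
	"ripgrep_raw_search":      "Grep",
	"rg":                      "Grep",
	"glob_file_search":        "Glob",
	"task_v2":                 "Task",
	"delete_file":             "Write",
	"todo_write":              "Tool",
	"create_plan":             "Tool",
}

// categorize resolves a tool name to a category, checking exact names
// first and the mcp- prefix second.
func categorize(name string) string {
	if c, ok := vscdbToolCategories[name]; ok {
		return c
	}
	if strings.HasPrefix(name, "mcp-") {
		return "Tool" // assumed category for MCP invocations
	}
	return "Other" // assumed fallback
}

func main() {
	for _, n := range []string{"rg", "list_dir_v2", "mcp-github-search", "unknown_tool"} {
		fmt.Println(n, "->", categorize(n))
	}
}
```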

internal/config/config.go

  • Added CursorStateDB string field, defaulting to
    ~/.config/Cursor/User/globalStorage/state.vscdb.
  • Reads CURSOR_STATE_DB env var to override.

internal/sync/engine.go

  • Added CursorStateDB to EngineConfig and Engine.
  • syncCursorVscdb — follows the OpenCode pattern: uses lastUpdatedAt (unix millis)
    for per-session change detection via the existing GetFileInfoByPath virtual-path
    mechanism, calls writeSessionFull for in-place replacement. Builds a child→parent map
    from subComposerIds and sets ParentSessionID/RelSubagent on child sessions.
  • vscdb sync runs before file workers. The set of synced session IDs is stored on the
    engine for the duration of the sync. processCursor checks this set and skips any
    session already handled by vscdb, preventing the text-only JSONL from overwriting the
    richer data.
  • vscdb session counts fold into the overall SyncStats.Synced counter.
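
Two of the engine steps above, the child→parent map and the dedup check, might look roughly like this. The composerMeta type and both function names are hypothetical; the real engine works with its own session types and stores the synced set on the Engine.

```go
package main

import "fmt"

// composerMeta is a hypothetical, reduced view of composerData.
type composerMeta struct {
	ID             string
	SubComposerIDs []string
}

// buildParentMap inverts subComposerIds into a child -> parent lookup.
// The first parent seen for a child wins in this sketch.
func buildParentMap(sessions []composerMeta) map[string]string {
	parents := make(map[string]string)
	for _, s := range sessions {
		for _, child := range s.SubComposerIDs {
			if child == "" || child == s.ID {
				continue
			}
			if _, seen := parents[child]; !seen {
				parents[child] = s.ID
			}
		}
	}
	return parents
}

// shouldSkipJSONL reports whether a file-based session was already
// written from vscdb. A nil set skips nothing.
func shouldSkipJSONL(synced map[string]struct{}, sessionID string) bool {
	_, ok := synced[sessionID]
	return ok
}

func main() {
	parents := buildParentMap([]composerMeta{
		{ID: "root", SubComposerIDs: []string{"child-a", "child-b"}},
		{ID: "child-a"},
	})
	fmt.Println(parents["child-a"], parents["child-b"])

	synced := map[string]struct{}{"root": {}}
	fmt.Println(shouldSkipJSONL(synced, "root"), shouldSkipJSONL(synced, "jsonl-only"))
}
```

Because a nil set skips nothing, any code path that runs the file workers without first populating the set would fall back to the JSONL data.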

internal/sync/engine_integration_test.go: 4 integration tests — basic sync, change
detection (unchanged sessions are not re-parsed), dedup (vscdb wins over file-based
JSONL), and subagent parent-child linking.

cmd/agentsview/main.go, cmd/agentsview/sync.go: Pass CursorStateDB through
to EngineConfig.

Validation

Tested against a real state.vscdb with 702 sessions — 563 sessions with actual
conversation content synced, producing 25,253 tool calls that were previously invisible.

Cursor changed its agent transcript storage format. Old format stored
transcripts as flat files:
  agent-transcripts/<uuid>.{txt,jsonl}

New format nests the transcript inside a UUID-named subdirectory:
  agent-transcripts/<uuid>/<uuid>.{jsonl,txt}

Discovery, FindSourceFile, classifyPaths (file-watcher path
classification), and project-name extraction all updated to handle
both formats. Dedup and .jsonl-over-.txt preference rules apply to
the new format too.
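
As an illustration of handling both layouts, the sketch below extracts a session ID from either path shape. The function name transcriptSessionID is hypothetical, and requiring the nested directory name to equal the file's base name is an assumption drawn from the <uuid>/<uuid> pattern above.

```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
	"strings"
)

// transcriptSessionID returns the session ID for a transcript path in
// either the flat or the nested agent-transcripts layout.
func transcriptSessionID(p string) (string, bool) {
	p = filepath.ToSlash(p)
	ext := path.Ext(p)
	if ext != ".jsonl" && ext != ".txt" {
		return "", false
	}
	dir, file := path.Split(p)
	id := strings.TrimSuffix(file, ext)
	dir = strings.TrimSuffix(dir, "/")
	if id == "" {
		return "", false
	}
	switch {
	case path.Base(dir) == "agent-transcripts":
		// Old layout: agent-transcripts/<uuid>.{txt,jsonl}
		return id, true
	case path.Base(dir) == id && path.Base(path.Dir(dir)) == "agent-transcripts":
		// New layout: agent-transcripts/<uuid>/<uuid>.{jsonl,txt}
		return id, true
	}
	return "", false
}

func main() {
	for _, p := range []string{
		"proj/agent-transcripts/abc.jsonl",
		"proj/agent-transcripts/abc/abc.txt",
		"proj/agent-transcripts/abc/other.jsonl",
	} {
		id, ok := transcriptSessionID(p)
		fmt.Printf("%s -> %q %v\n", p, id, ok)
	}
}
```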

Made-with: Cursor

roborev-ci bot commented Mar 13, 2026

roborev: Combined Review (2695c8e)

Verdict: The PR successfully adds Cursor state.vscdb support but introduces several medium-severity issues regarding platform compatibility, data loss for tool results, and synchronization edge cases.

Medium

Platform-Specific Path Hardcoding
internal/config/config.go:146
The CursorStateDB default path hardcodes the Linux .config/Cursor/... structure. This breaks out-of-the-box discovery on macOS (Library/Application Support/Cursor/...) and Windows (AppData\Roaming\Cursor\...).
Suggested fix: Use os.UserConfigDir() to dynamically construct the correct base path for the active platform, and add tests for the platform-specific branches.
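
A minimal sketch of that suggestion, assuming the CURSOR_STATE_DB override is kept: os.UserConfigDir returns $XDG_CONFIG_HOME or ~/.config on Linux, ~/Library/Application Support on macOS, and %AppData% on Windows. The function name resolveCursorStateDB and the injected configDir parameter are hypothetical, added so the logic can be tested without touching the environment.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveCursorStateDB returns the override when set, otherwise the
// platform's user config directory joined with Cursor's state path.
func resolveCursorStateDB(override string, configDir func() (string, error)) (string, error) {
	if override != "" {
		return override, nil
	}
	base, err := configDir()
	if err != nil {
		return "", err
	}
	return filepath.Join(base, "Cursor", "User", "globalStorage", "state.vscdb"), nil
}

func main() {
	p, err := resolveCursorStateDB(os.Getenv("CURSOR_STATE_DB"), os.UserConfigDir)
	if err != nil {
		fmt.Println("cannot resolve config dir:", err)
		return
	}
	fmt.Println(p)
}
```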

Missing Tool Result Data in VSCDB Parser
internal/parser/cursor_vscdb.go:248, internal/parser/cursor_vscdb.go:467, internal/parser/cursor_vscdb.go:508
The new vscdb parser reads toolFormerData.result into the struct but never converts it into ParsedToolResults or persists it. Because syncCursorVscdb() marks vscdb sessions as authoritative, this drops tool output data that the existing Cursor JSONL path preserves, losing command output/read results in the UI and analytics.
Suggested fix: Convert non-empty toolFormerData.result into paired tool results (or populate result_content directly), and add an integration test for a vscdb tool bubble with a result payload.

Single-Session Sync Fails for VSCDB Sessions
internal/sync/engine.go:1907, internal/sync/engine.go:1936
FindSourceFile() and SyncSingleSession() are transcript-file only. A Cursor session that exists only in state.vscdb has no discoverable source path, so SyncSingleSession("cursor:<id>") returns a not found error. Per-session watch/poll flows relying on FindSourceFile() will never refresh it until the next whole-engine SyncAll.
Suggested fix: Add a Cursor vscdb branch here, treating CursorStateDB#<sessionID> as a virtual source and re-parsing from state.vscdb. Add a test covering single-session re-sync/watch behavior for a vscdb-only Cursor session.

VSCDB Session Overwrite on File Watcher Update
internal/sync/engine.go:1584
e.cursorVscdbSynced is only populated during a full SyncAll(). If the file watcher detects an update to a Cursor .jsonl file and triggers SyncPaths(), e.cursorVscdbSynced will be nil. This causes the engine to parse the text-only .jsonl and mistakenly overwrite the rich vscdb session (which contains tool calls) in the database.
Suggested fix: Ensure targeted syncs respect vscdb precedence. Either spot-check state.vscdb for the targeted session ID in processCursor before falling back to file parsing, or bypass .jsonl file watcher updates for Cursor sessions entirely.


Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

@BadLiveware BadLiveware force-pushed the cursor-compatability branch from 2695c8e to 29252de Compare March 13, 2026 08:53

roborev-ci bot commented Mar 13, 2026

roborev: Combined Review (29252de)

Verdict: The PR adds support for Cursor's state.vscdb sync and nested transcript layout, but introduces medium-severity issues regarding error handling, data parsing regressions, and cross-platform configuration.

Medium

Unhandled cancellation and write errors in Cursor vscdb sync
File: internal/sync/engine.go
The new Cursor vscdb write loop ignores both cancellation and write errors. In syncAllLocked, every cvPending session is written with e.writeSessionFull(pw) before the later ctx.Err() abort check, and stats.RecordSynced(cvCount) is called unconditionally. This means a canceled SyncAll can still mutate the DB through vscdb writes, and excluded/failed writes are incorrectly reported as synced. The OpenCode path handles this correctly by checking ctx.Err() and counting only successful writes; the Cursor vscdb path should mirror that logic. Add tests for canceled SyncAll with CursorStateDB enabled and for excluded vscdb-backed sessions.

Missing tool result parsing causes data loss
Files: internal/parser/cursor_vscdb.go, internal/parser/cursor.go
The cursorToolFormerData struct includes a Result field, but the new vscdb parser never converts it into ParsedToolResults. buildCursorVscdbMessages only emits tool calls and assistant text. This is a behavioral regression because the existing Cursor JSONL parser preserves tool results, and the new sync suppresses file-based Cursor ingestion when a vscdb copy exists. Consequently, Cursor sessions synced from vscdb lose tool-result pairing and content, breaking filters and analytics that depend on them.
Suggested fix: Parse toolFormerData.result into ParsedToolResults and add a test verifying result pairing/filtering for vscdb-backed Cursor sessions.
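
One possible shape for that fix is sketched here. The toolCall and toolResult types are hypothetical simplifications (the real ParsedToolResults fields are not shown in this thread), and treating a JSON-string result as text while keeping any other JSON value verbatim is an assumption about how Cursor stores results.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// toolCall is a hypothetical view of one tool bubble.
type toolCall struct {
	ID     string
	Name   string
	Result []byte // raw toolFormerData.result, possibly empty
}

// toolResult pairs result content with the call that produced it.
type toolResult struct {
	ToolUseID string
	Content   string
}

// resultContent unwraps a JSON string result; other JSON values are kept
// as their raw text. Empty and null results yield "".
func resultContent(raw []byte) string {
	trimmed := bytes.TrimSpace(raw)
	if len(trimmed) == 0 || bytes.Equal(trimmed, []byte("null")) {
		return ""
	}
	var s string
	if trimmed[0] == '"' && json.Unmarshal(trimmed, &s) == nil {
		return s
	}
	return string(trimmed)
}

// pairResults emits one result per call that has an ID and non-empty output.
func pairResults(calls []toolCall) []toolResult {
	var out []toolResult
	for _, c := range calls {
		content := resultContent(c.Result)
		if c.ID == "" || content == "" {
			continue
		}
		out = append(out, toolResult{ToolUseID: c.ID, Content: content})
	}
	return out
}

func main() {
	results := pairResults([]toolCall{
		{ID: "call-1", Name: "run_terminal_command_v2", Result: []byte(`"ok\n"`)},
		{ID: "call-2", Name: "read_file_v2"},
		{ID: "call-3", Name: "rg", Result: []byte(`{"matches":3}`)},
	})
	for _, r := range results {
		fmt.Printf("%s %q\n", r.ToolUseID, r.Content)
	}
}
```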

Hardcoded Linux-specific default path for CursorStateDB
File: internal/config/config.go
The default CursorStateDB path is hardcoded to ~/.config/Cursor/User/globalStorage/state.vscdb, which is Linux-specific. On macOS and Windows, the new vscdb sync will be silently disabled unless the user manually sets CURSOR_STATE_DB.
Suggested fix: Choose the default path dynamically based on the platform, similar to other cross-platform agent path defaults, and add configuration tests for non-Linux platforms.


Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

Owner

wesm commented Mar 15, 2026

I'm going to cut 0.14.0 before getting to this, seems like it may need more work?
