
MCP writes silently reverted when multiple ohno-mcp processes run concurrently (full rows + task_activity wiped) #36

@JanJetze

Description


Summary

When two or more ohno-mcp processes are running against the same .ohno/tasks.db, writes from one process can be silently reverted by another. Every MCP write returns {"success": true} and subsequent get_task calls (via the same MCP) confirm the change, but direct sqlite inspection shows the row was never updated — or was updated and then fully restored to a pre-write state, including stale updated_at timestamps and disappearance of freshly-inserted task_activity rows.

Environment

  • @stevestomp/ohno-cli 0.19.0
  • @stevestomp/ohno-mcp 0.19.0
  • macOS Darwin 25.4.0 (arm64)
  • Two simultaneous Claude Code sessions, each spawning its own ohno-mcp via npx. Both point at the same project's .ohno/tasks.db. A third process (ohno serve) is running for the kanban viewer.

ps snapshot showing three long-lived processes at the time of the bug:

PID 4429/4449    npm exec ohno-mcp        started Mon 5 PM (~48h old)
PID 97179/97214  npm exec ohno-mcp        started today 11:26 AM
PID 12827/12842  npm exec ohno-cli serve  started today 12:28 PM

Reproduction

  1. Start two ohno-mcp processes against the same db (e.g., open two Claude Code sessions in the same project, both of which will spawn their own MCP via npx).
  2. From session A, call archive_task on a task. It returns {"success": true}.
  3. From session A, call get_task on the same id. It returns status: "archived".
  4. Run sqlite3 .ohno/tasks.db "SELECT status FROM tasks WHERE id='<id>'" directly. The status is still todo. The archive never hit disk.

Two distinct symptoms observed:

Symptom A — Silent write failure

archive_task × 25 and update_story × 4 all returned {"success": true} over ~10 minutes. None persisted. The MCP held the modifications in memory and served them back on reads, but none reached the db file.

$ sqlite3 .ohno/tasks.db \
    "SELECT COUNT(*) FROM tasks WHERE id IN (<25 ids>) AND status='archived'"
0

Symptom B — Direct sqlite write gets rolled back

Working around Symptom A, I bypassed the MCP with a direct transaction:

BEGIN IMMEDIATE;
UPDATE tasks SET status='archived', updated_at=strftime(...) WHERE id IN (<25 ids>);
INSERT INTO task_activity (...) SELECT 'activity-ymyl-...', id, ... FROM tasks WHERE ...;
UPDATE stories SET status='done', description=... WHERE id IN (<3 ids>);
UPDATE stories SET description=... WHERE id='story-bfab7c86';
COMMIT;

Transaction committed successfully. Immediate post-commit SELECT confirmed all 25 rows at status='archived' with new updated_at, the 25 task_activity rows present, 3 stories at status='done', and the description of story-bfab7c86 updated.

~2 minutes later, after session B's agent called update_task_status on two tasks in a completely unrelated story (task-a14981ab, task-7620bd11), a re-SELECT showed:

  • All 25 tasks reverted to status='todo'
  • Their updated_at reverted to values from 2026-04-14 (pre-archive, ~18 hours old)
  • All 25 task_activity rows I inserted were gone from the table (not hidden — the rows do not exist)
  • All 3 stories reverted to status='todo' with their original descriptions
  • story-bfab7c86's description was restored to the pre-edit content

A plain write-wins conflict would explain the status flip, but cannot explain:

  • updated_at reverting to a ~18h-old timestamp (a write would leave updated_at at the wall-clock time of the stomping write)
  • task_activity rows disappearing completely
  • Rows in tables the concurrent MCP had no reason to touch (stories) also rolling back

All of these point to a full restore from a cached snapshot, not just concurrent writes on the same row.

Observed timeline

10:07Z  session A: archive_task x25         → all return success, 0 persisted
10:17Z  session A: update_story x4          → all return success, 0 persisted
10:31Z  session A: bypass MCP, sqlite transaction commits and verifies:
        - 25 tasks archived (new updated_at)
        - 25 task_activity rows inserted
        - 3 stories status=done + description rewrite
        - 1 story description rewrite
10:33Z  session B agent: update_task_status(task-a14981ab → done)
10:35Z  session B agent: update_task_status(task-7620bd11 → done)
10:37Z  session A re-SELECT: all 25 archives gone, 25 activity rows gone,
        updated_at values back to 2026-04-14, 3 stories back to status=todo,
        story-bfab7c86 description reverted

Every concurrent write acts as a "stomp event" that wipes changes made outside that MCP's in-memory model. Changes made via the MCP that performs the stomp survive; changes made via any other channel (other MCP, direct sqlite, CLI) do not.

Hypothesis (not source-verified)

The MCP appears to hold an in-memory model of the tasks, stories, and task_activity tables loaded at startup (or at first read). Write operations mutate that in-memory model and trigger a full flush of the cached state back to the db on commit, overwriting any rows touched in the meantime by other actors. The 2-day-old PID 4429 process has the oldest snapshot and the most destructive flushes: any write from it rolls back ~2 days of activity to its Monday view.

This would also explain Symptom A if that MCP's flush path is somehow no-oping (e.g., a transaction started but never committed, or a write to an in-memory-only buffer). I haven't reproduced Symptom A cleanly enough to be sure.
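If the hypothesis is right, the rollback mechanics would look roughly like this minimal simulation. To be clear, this is pure speculation about ohno-mcp's internals; the table shape and timestamps just mimic the ones above:

```python
import sqlite3, tempfile, os

# Hypothetical simulation of the suspected bug: a long-lived MCP-like process
# loads a snapshot of the tasks table at startup, another actor writes to the
# db directly, and the first process then flushes its entire stale snapshot
# back on its next write, reverting the other actor's committed change.
db = os.path.join(tempfile.mkdtemp(), "tasks.db")

conn = sqlite3.connect(db)
conn.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT, updated_at TEXT)")
conn.execute("INSERT INTO tasks VALUES ('task-1', 'todo', '2026-04-14T12:00:00Z')")
conn.commit()

# "Process A" (the stale MCP): load an in-memory snapshot at startup.
snapshot = conn.execute("SELECT id, status, updated_at FROM tasks").fetchall()

# "Process B" (direct sqlite): archive the task with a fresh timestamp.
other = sqlite3.connect(db)
other.execute(
    "UPDATE tasks SET status='archived', updated_at='2026-04-16T10:31:00Z' "
    "WHERE id='task-1'"
)
other.commit()

# Process A handles an unrelated write by flushing its whole cached state.
conn.executemany("REPLACE INTO tasks VALUES (?, ?, ?)", snapshot)
conn.commit()

status, ts = other.execute(
    "SELECT status, updated_at FROM tasks WHERE id='task-1'"
).fetchone()
print(status, ts)  # the archive is gone and updated_at is back to the stale value
```

Note how the flush restores the old updated_at exactly, which is the signature seen in Symptom B; a genuine concurrent write could not do that.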

Impact

  • Any multi-session workflow using concurrent MCPs against one db is unsafe.
  • Cleanup/audit scripts using direct sqlite writes are also unsafe if any MCP is running.
  • Since the kanban viewer (ohno serve) reads the db file directly, users see their changes appear and then disappear. This matches issue #23 ("Kanban board loses filter and modal state on database updates") symptomatically and may share a root cause.

Workarounds used

  • ps aux | grep ohno-mcp and kill stale MCPs before any write. After that, direct sqlite writes persist until the next ohno-mcp process starts.
  • Verify with sqlite3 after every write and accept the ~2 minute half-life.

Neither is viable for real use.
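For completeness, the "verify after every write" workaround amounts to this (a sketch; table and column names are from the dumps above, the id list is whatever you just wrote):

```python
import sqlite3

# Count how many of the given ids actually persisted on disk. Opening a fresh
# connection per check guarantees the answer comes from the db file, not from
# any process's cached state. Ids passed in are placeholders.
def persisted_count(db_path, ids, status="archived"):
    conn = sqlite3.connect(db_path)  # fresh connection: reads the file itself
    placeholders = ",".join("?" * len(ids))
    (n,) = conn.execute(
        f"SELECT COUNT(*) FROM tasks WHERE id IN ({placeholders}) AND status=?",
        (*ids, status),
    ).fetchone()
    conn.close()
    return n
```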

Suggested fixes

  • Don't hold cached state across writes; read from the db at write time and use sqlite transactions.
  • Optimistic locking: compare updated_at before writing and refuse to flush if the row on disk has drifted from the cached value.
  • Make the MCP stateless — load on every tool call, transact, release.
  • Detect multiple ohno-mcp processes pointing at the same db file and warn or refuse to start.
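The optimistic-locking option could be as small as a guarded UPDATE: only apply the write if the row's updated_at still matches the value the writer last read, and refuse the flush when it doesn't. A sketch (function and column names are illustrative, not ohno's API):

```python
import sqlite3

# Optimistic locking via updated_at: the UPDATE matches zero rows if the row
# drifted on disk since this writer last read it, and we refuse to stomp.
def archive_task(conn, task_id, expected_updated_at, new_updated_at):
    cur = conn.execute(
        "UPDATE tasks SET status='archived', updated_at=? "
        "WHERE id=? AND updated_at=?",
        (new_updated_at, task_id, expected_updated_at),
    )
    conn.commit()
    if cur.rowcount == 0:
        raise RuntimeError(f"{task_id}: row changed on disk since last read")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT, updated_at TEXT)")
conn.execute("INSERT INTO tasks VALUES ('task-1', 'todo', 't0')")
conn.commit()

archive_task(conn, "task-1", "t0", "t1")      # succeeds: row is unchanged
try:
    archive_task(conn, "task-1", "t0", "t2")  # refused: updated_at is now t1
except RuntimeError as e:
    print("refused:", e)
```

This also degrades gracefully for the stale 2-day-old process: its flushes would all be refused instead of rolling back two days of work.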

Happy to provide full sqlite dumps, strace of the MCP write path, or a minimal repro script. Thanks for ohno — great tool and this is the first real issue I've hit.
