fix: interrupt cleanup doesn't propagate to LangGraph checkpoint

## Summary

Session state cleanup on interrupt only affects in-memory state, not the LangGraph checkpoint, causing orphaned tool calls to persist and corrupt subsequent turns.

## Environment

- **OS**: Linux 7.0.2-7-pve
- **Node.js**: v25.8.1
- **madz version**: 1.22.0
- **LLM provider**: Unknown — user to confirm

## Reproduction

1. Start a conversation in the TUI
2. Trigger a tool call that requires user input (e.g., file read, web search)
3. Interrupt the tool execution (e.g., via command or cancel)
4. Send a new message to resume the conversation
5. Observe orphaned tool calls in the checkpoint corrupting the resumed turn

## Expected Behavior

Interrupt cleanup should propagate to the LangGraph checkpoint, removing orphaned tool calls and ensuring the checkpoint state matches the in-memory conversation state.

## Actual Behavior

Only in-memory state is cleaned up. The LangGraph checkpoint retains partial AIMessages with tool_calls, causing duplicate tool calls, dangling tool references in LLM API requests, and corrupted conversation history on resume.

## Additional Context

- **Affected files**: `./src/tui/app.js`, `./src/agent/react.js`, `./src/session/stateManager.js`
- `handleChat()` (app.js:922-924) calls both `removeLastAssistantToolCallMessage()` and `popExchange()`, but `handleCommand()` (app.js:524-526) only calls `popExchange()` — tool call messages are not cleaned up in the command path
- `removeLastAssistantToolCallMessage()` (stateManager.js:80-88) only removes from `sessionState.#state.conversation`. The LangGraph checkpoint — written at superstep boundaries by `createReactAgentGraph` — retains the partial AIMessage with tool_calls
- **No resumption reconciliation**: when the user sends a new message after an interrupt, a fresh `dispatchProvider` call uses the cleaned in-memory conversation, but the LangGraph checkpoint still contains the orphaned messages. The `isNewThread` flag only controls system prompt injection, not checkpoint state.
- **Orphaned tool calls corrupt resume**: partial AIMessages with tool_calls persist in the checkpoint. On resume, `streamEvents` replays from the checkpoint, potentially sending dangling tool references to the LLM API.
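
To make "orphaned tool call" concrete, here is a minimal sketch of how one could be detected in a list of messages read back from the checkpoint. The message shape (`type`, `tool_calls`, `tool_call_id`) and the helper name `findOrphanedToolCalls` are assumptions for illustration, not madz's actual types or code.

```javascript
// Hypothetical message shape (an assumption, not madz's actual types):
//   { id, type: "human" | "ai" | "tool", tool_calls?: [{ id, name }], tool_call_id? }

/**
 * Return every tool call that has no matching tool-result message
 * anywhere in the list.
 */
function findOrphanedToolCalls(messages) {
  // Ids that some tool-result message responds to.
  const answered = new Set(
    messages.filter((m) => m.type === "tool").map((m) => m.tool_call_id)
  );

  const orphans = [];
  for (const message of messages) {
    if (message.type !== "ai") continue;
    for (const call of message.tool_calls ?? []) {
      if (!answered.has(call.id)) {
        orphans.push({ messageId: message.id, toolCallId: call.id, name: call.name });
      }
    }
  }
  return orphans;
}

// Example: the last AI message was interrupted before its tool returned.
const checkpointMessages = [
  { id: "m1", type: "human" },
  { id: "m2", type: "ai", tool_calls: [{ id: "call_1", name: "read_file" }] },
  { id: "m3", type: "tool", tool_call_id: "call_1" },
  { id: "m4", type: "human" },
  { id: "m5", type: "ai", tool_calls: [{ id: "call_2", name: "web_search" }] },
];

const orphans = findOrphanedToolCalls(checkpointMessages);
console.log(orphans);
```

A check like this could run against the checkpoint after an interrupt to confirm whether the bug has been triggered for a given thread.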

## Audit Checklist

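The following areas were audited during investigation. A checked box means the check passed; unchecked items failed or were inconsistent (see the Audit Log).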

- [x] 1. Verify abort signal propagation through the entire streaming pipeline (app.js → react.js → LangGraph `streamEvents`)
- [ ] 2. Verify session state cleanup on interrupt — ensure all partial messages are removed from both in-memory state and LangGraph checkpoint
- [ ] 3. Verify checkpoint state consistency after interrupt — no orphaned AIMessages with incomplete tool_calls persist in checkpoint
- [ ] 4. Verify resumption behavior — confirm the graph can resume from checkpoint after interrupt without duplicating or losing messages
- [ ] 5. Verify that interrupted tool calls are properly cleaned up in both session state and checkpoint before resume
- [ ] 6. Verify that the abort signal doesn't leave partial state in the LangGraph checkpoint that would corrupt subsequent turns

## Audit Log

### 1. Abort signal propagation → ✅ PASS (Risk: Low)
Signal is properly propagated and checked on each event. `react.js:225-228` passes signal to `streamOptions`, `react.js:293-306` checks `signal.aborted` per event, clears turn hashes, emits pending `tool_end`, and returns cleanly.
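
The per-event abort check described above can be sketched roughly as follows. This is an illustration of the pattern, not the code in `react.js`: `fakeEventStream` and `consumeEvents` are hypothetical names, and the clearing of turn hashes is omitted.

```javascript
// Hypothetical event source standing in for the output of streamEvents.
async function* fakeEventStream() {
  yield { event: "on_tool_start", name: "read_file" };
  yield { event: "on_chat_model_stream", chunk: "partial" };
  yield { event: "on_tool_end", name: "read_file" };
}

/**
 * Forward events to onEvent until the stream ends or the signal is aborted.
 * On abort, emit a closing tool_end for any tool that started but never finished.
 */
async function consumeEvents(events, signal, onEvent) {
  let pendingTool = null;
  for await (const event of events) {
    if (signal.aborted) {
      if (pendingTool !== null) {
        onEvent({ event: "tool_end", name: pendingTool, interrupted: true });
      }
      return "aborted";
    }
    if (event.event === "on_tool_start") pendingTool = event.name;
    if (event.event === "on_tool_end") pendingTool = null;
    onEvent(event);
  }
  return "completed";
}

const controller = new AbortController();
const received = [];
const resultPromise = consumeEvents(fakeEventStream(), controller.signal, (event) => {
  received.push(event);
  // Simulate the user interrupting right after the tool starts.
  if (event.event === "on_tool_start") controller.abort();
});

resultPromise.then((result) => console.log(result, received));
```

Note that returning cleanly from this loop says nothing about what has already been written to the checkpoint, which is the gap the remaining audit items cover.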

### 2. Session state cleanup on interrupt → ⚠️ INCONSISTENT (Risk: Medium)
Inconsistent cleanup between code paths. `handleChat()` (app.js:922-924) calls both `removeLastAssistantToolCallMessage()` and `popExchange()`, but `handleCommand()` (app.js:524-526) only calls `popExchange()` — tool call messages are not cleaned up in the command path.
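
One way to keep the two paths from drifting apart is a single shared helper. The sketch below is hypothetical: `cleanupInterruptedTurn`, `onChatInterrupted`, and `onCommandInterrupted` are illustrative names, and the session state is replaced by a recording stub because only the two method names are known from the source.

```javascript
// Recording stub: only the two method names come from the issue.
// Their real behaviour lives in stateManager.js and is not modelled here.
function createRecordingSessionState() {
  const calls = [];
  return {
    calls,
    removeLastAssistantToolCallMessage() {
      calls.push("removeLastAssistantToolCallMessage");
    },
    popExchange() {
      calls.push("popExchange");
    },
  };
}

// Hypothetical shared helper: one place that defines interrupt cleanup.
function cleanupInterruptedTurn(sessionState) {
  sessionState.removeLastAssistantToolCallMessage();
  sessionState.popExchange();
}

// Both interrupt paths delegate to the same helper, so they cannot diverge.
function onChatInterrupted(sessionState) {
  cleanupInterruptedTurn(sessionState);
}
function onCommandInterrupted(sessionState) {
  cleanupInterruptedTurn(sessionState);
}

const chatState = createRecordingSessionState();
const commandState = createRecordingSessionState();
onChatInterrupted(chatState);
onCommandInterrupted(commandState);
console.log(chatState.calls, commandState.calls);
```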

### 3. Checkpoint state consistency after interrupt → ❌ FAIL (Risk: High)
`removeLastAssistantToolCallMessage()` only modifies `sessionState.#state.conversation` (in-memory array). It does NOT update the LangGraph checkpoint. Orphaned AIMessages with tool_calls persist in the checkpoint and will be replayed on resume.

### 4. Resumption behavior → ❌ FAIL (Risk: High)
No explicit resumption mechanism exists. When the user sends a new message after interrupt, a new `dispatchProvider` call is made with the cleaned in-memory conversation. However, the LangGraph checkpoint retains the old state with orphaned messages. The `isNewThread` flag (react.js:164) only controls system prompt injection, not checkpoint state reconciliation.

### 5. Interrupted tool call cleanup in checkpoint → ❌ FAIL (Risk: High)
Tool call cleanup is in-memory only. `stateManager.js:80-88` removes the last assistant message with tool_calls from the conversation array, but the LangGraph checkpoint (written by `createReactAgentGraph` at superstep boundaries) still contains the partial AIMessage. On resume, the checkpoint will replay the orphaned tool call.

### 6. Partial state in checkpoint corrupting subsequent turns → ❌ FAIL (Risk: High)
Partial AIMessages from interrupted streams persist in the checkpoint. When the graph resumes via `streamEvents`, it reads from the checkpoint which may include orphaned AIMessages with incomplete tool_calls. This can cause the LLM API to receive corrupted conversation history with dangling tool references.

## Recommendations

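1. **Propagate checkpoint cleanup on interrupt**: After cleaning in-memory state, explicitly update or reset the LangGraph checkpoint for the current `thread_id` to remove orphaned messages. The exact mechanism still needs to be confirmed against the installed LangGraph version; candidates include updating the thread's graph state (for example via the compiled graph's `updateState()` with `RemoveMessage` entries for the orphaned messages) or writing the cleaned state through the checkpointer with `checkpointer.put()`.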

2. **Add cleanup to `handleCommand()`**: Mirror the `handleChat()` interrupt cleanup by calling `sessionState.removeLastAssistantToolCallMessage()` in the command path (app.js:524-526).

3. **Implement checkpoint reconciliation on resume**: Before the next `dispatchProvider` call after an interrupt, verify that the checkpoint state matches the in-memory conversation. If they diverge, reconcile by writing the cleaned state to the checkpoint.

4. **Add integration test**: Create a test that simulates interrupt during tool execution, then resumes with a new message, and verifies the checkpoint contains no orphaned tool calls.
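
The divergence check in recommendation 3 could look roughly like the sketch below. It assumes that messages in the in-memory conversation and in the checkpoint share stable `id` values, which has not been verified for madz; `diffCheckpoint` is a hypothetical helper name.

```javascript
/**
 * Compare checkpoint messages against the cleaned in-memory conversation.
 * Both inputs are arrays of objects with an `id`; matching is by id only.
 */
function diffCheckpoint(inMemory, checkpoint) {
  const liveIds = new Set(inMemory.map((m) => m.id));
  const savedIds = new Set(checkpoint.map((m) => m.id));

  // Present in the checkpoint but already removed from memory (e.g. orphans).
  const staleIds = checkpoint.filter((m) => !liveIds.has(m.id)).map((m) => m.id);
  // Present in memory but never persisted.
  const missingIds = inMemory.filter((m) => !savedIds.has(m.id)).map((m) => m.id);

  return {
    diverged: staleIds.length > 0 || missingIds.length > 0,
    staleIds,
    missingIds,
  };
}

// Example: the interrupted AI message "m3" was cleaned from memory only.
const inMemoryConversation = [{ id: "m1" }, { id: "m2" }];
const checkpointConversation = [{ id: "m1" }, { id: "m2" }, { id: "m3" }];

const report = diffCheckpoint(inMemoryConversation, checkpointConversation);
console.log(report);
```

The `staleIds` list is what a reconciliation step would then remove from the checkpoint before the next `dispatchProvider` call, using whichever update mechanism recommendation 1 settles on. The same comparison could serve as the final assertion in the integration test from recommendation 4.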

## References

- https://reference.langchain.com/python/langgraph.checkpoint.md
- https://reference.langchain.com/python/langgraph.store.md
