From 3d3dcad19506ae556cbf34255e0077ef22c042e2 Mon Sep 17 00:00:00 2001 From: Jason Mulligan Date: Sun, 28 Jun 2026 11:30:03 -0400 Subject: [PATCH 01/11] feat: add OpenSpec proposal for interrupt checkpoint cleanup --- .../.openspec.yaml | 2 + .../interrupt-checkpoint-cleanup/design.md | 69 +++++++++++++++++++ .../interrupt-checkpoint-cleanup/proposal.md | 25 +++++++ .../specs/interrupt-cleanup/spec.md | 57 +++++++++++++++ .../interrupt-checkpoint-cleanup/tasks.md | 46 +++++++++++++ 5 files changed, 199 insertions(+) create mode 100644 openspec/changes/interrupt-checkpoint-cleanup/.openspec.yaml create mode 100644 openspec/changes/interrupt-checkpoint-cleanup/design.md create mode 100644 openspec/changes/interrupt-checkpoint-cleanup/proposal.md create mode 100644 openspec/changes/interrupt-checkpoint-cleanup/specs/interrupt-cleanup/spec.md create mode 100644 openspec/changes/interrupt-checkpoint-cleanup/tasks.md diff --git a/openspec/changes/interrupt-checkpoint-cleanup/.openspec.yaml b/openspec/changes/interrupt-checkpoint-cleanup/.openspec.yaml new file mode 100644 index 0000000..c0d3374 --- /dev/null +++ b/openspec/changes/interrupt-checkpoint-cleanup/.openspec.yaml @@ -0,0 +1,2 @@ +schema: spec-driven +created: 2026-06-28 diff --git a/openspec/changes/interrupt-checkpoint-cleanup/design.md b/openspec/changes/interrupt-checkpoint-cleanup/design.md new file mode 100644 index 0000000..62f2618 --- /dev/null +++ b/openspec/changes/interrupt-checkpoint-cleanup/design.md @@ -0,0 +1,69 @@ +## Context + +The TUI uses LangGraph for conversation state management. When a user interrupts a tool call (via command or cancel), the cleanup only removes messages from the in-memory conversation array. The LangGraph checkpoint — which persists state at superstep boundaries — retains the partial AIMessage with tool_calls. On resume, `streamEvents` replays from the checkpoint, sending orphaned tool references to the LLM API. 
+ +Current cleanup paths: +- `handleChat()` (app.js:922-924): calls `removeLastAssistantToolCallMessage()` + `popExchange()` +- `handleCommand()` (app.js:524-526): only calls `popExchange()` — missing tool call cleanup + +The `removeLastAssistantToolCallMessage()` function (stateManager.js:80-88) only modifies `sessionState.#state.conversation`. + +## Goals / Non-Goals + +**Goals:** +- Ensure interrupt cleanup propagates to the LangGraph checkpoint +- Add missing cleanup to `handleCommand()` interrupt path +- Implement checkpoint reconciliation on resume as a safety net +- Add integration test covering interrupt/resume scenario + +**Non-Goals:** +- Refactoring the overall interrupt handling architecture +- Changing the `isNewThread` flag behavior +- Modifying the LangGraph checkpoint schema or persistence layer +- Handling interrupts during system prompt generation + +## Decisions + +1. **Checkpoint update via checkpointer.put()** + - Use `checkpointer.put()` with the cleaned conversation state rather than trying to delete individual messages + - Rationale: LangGraph's JavaScript implementation doesn't expose a `delete` API; `put()` with cleaned state is the canonical way to update checkpoint state + - Alternatives considered: `checkpointer.update()` (less explicit), manual message filtering (fragile) + +2. **Pass checkpointer to stateManager** + - The checkpointer instance needs to be accessible from `stateManager.js` to update the checkpoint + - Pass it through the session state constructor or as a parameter to `removeLastAssistantToolCallMessage()` + - Rationale: Keeps stateManager focused on state operations while giving it checkpoint access + - Alternatives considered: Global singleton (harder to test), callback pattern (more complex) + +3. 
**Reconciliation only after interrupt** + - Checkpoint reconciliation runs only when an interrupt occurred, not on every resume + - Rationale: Adds minimal overhead to normal resume path; interrupt is the only scenario where divergence is expected + - Implementation: Track interrupt state in session, check before `dispatchProvider` in react.js + +4. **Compare conversation arrays for divergence** + - Reconciliation compares the in-memory conversation array with the checkpoint's stored messages + - If lengths differ or last messages don't match, write cleaned state to checkpoint + - Rationale: Simple comparison is sufficient for this bug; complex diffing is unnecessary + +## Risks / Trade-offs + +1. **[Risk] Checkpointer API differences between Python and JS** → [Mitigation] Verify the JavaScript LangGraph checkpointer API; use `put()` with full state if `update()` is unavailable +2. **[Risk] Race condition if interrupt and resume happen rapidly** → [Mitigation] Reconciliation on resume catches any divergence; interrupt flag prevents concurrent cleanup +3. **[Risk] Performance impact of checkpoint write on interrupt** → [Mitigation] Checkpoint writes are infrequent (only on interrupt); negligible compared to LLM API calls +4. **[Risk] Test complexity for interrupt simulation** → [Mitigation] Use mocked checkpointer and simulate interrupt via controlled state changes rather than real async timing + +## Migration Plan + +This is a bug fix with no migration required. The changes are: +1. Modify `removeLastAssistantToolCallMessage()` to accept optional checkpointer parameter +2. Update `handleCommand()` to call cleanup before `popExchange()` +3. Add reconciliation check in `react.js` before `dispatchProvider` +4. Add integration test + +No rollback strategy needed — the fix is additive and defensive. If issues arise, revert the feature branch. + +## Open Questions + +1. Does the JavaScript LangGraph checkpointer expose the same `put()`/`update()` API as Python? +2. 
What is the exact structure of the checkpoint state that needs to be updated? +3. Should reconciliation also verify tool result messages, or only assistant messages with tool_calls? \ No newline at end of file diff --git a/openspec/changes/interrupt-checkpoint-cleanup/proposal.md b/openspec/changes/interrupt-checkpoint-cleanup/proposal.md new file mode 100644 index 0000000..3c9ad70 --- /dev/null +++ b/openspec/changes/interrupt-checkpoint-cleanup/proposal.md @@ -0,0 +1,25 @@ +## Why + +Interrupt cleanup in the TUI only affects in-memory state, not the LangGraph checkpoint. When a user interrupts a tool call and resumes the conversation, orphaned AIMessages with incomplete tool_calls persist in the checkpoint. This causes duplicate tool calls, dangling tool references in LLM API requests, and corrupted conversation history on resume. + +## What Changes + +- Modify `removeLastAssistantToolCallMessage()` to update both in-memory state and the LangGraph checkpoint +- Add cleanup call to `handleCommand()` interrupt path to match `handleChat()` behavior +- Implement checkpoint reconciliation before resume to detect and fix state divergence +- Add integration test covering interrupt/resume scenario for both chat and command paths + +## Capabilities + +### New Capabilities +- `interrupt-cleanup`: Consistent cleanup of interrupted tool calls across in-memory state and LangGraph checkpoint, with reconciliation on resume + +### Modified Capabilities + + +## Impact + +- `src/tui/app.js` — `handleChat()` and `handleCommand()` interrupt cleanup paths +- `src/session/stateManager.js` — `removeLastAssistantToolCallMessage()` checkpoint propagation +- `src/agent/react.js` — Resume reconciliation before `dispatchProvider` +- New integration test for interrupt/resume scenario \ No newline at end of file diff --git a/openspec/changes/interrupt-checkpoint-cleanup/specs/interrupt-cleanup/spec.md b/openspec/changes/interrupt-checkpoint-cleanup/specs/interrupt-cleanup/spec.md new file 
mode 100644 index 0000000..c290404 --- /dev/null +++ b/openspec/changes/interrupt-checkpoint-cleanup/specs/interrupt-cleanup/spec.md @@ -0,0 +1,57 @@ +## ADDED Requirements + +### Requirement: Interrupt cleanup propagates to LangGraph checkpoint +The system SHALL update the LangGraph checkpoint when removing orphaned tool call messages from in-memory state during interrupt handling. + +#### Scenario: Checkpoint updated on chat interrupt +- **WHEN** a user interrupts a tool call via the chat interface +- **THEN** `removeLastAssistantToolCallMessage()` removes the message from in-memory state AND updates the LangGraph checkpoint with the cleaned state + +#### Scenario: Checkpoint updated on command interrupt +- **WHEN** a user interrupts a tool call via a command +- **THEN** `handleCommand()` calls `removeLastAssistantToolCallMessage()` before `popExchange()`, updating both in-memory state and the checkpoint + +#### Scenario: No orphaned tool calls persist after interrupt +- **WHEN** an interrupt has occurred and cleanup is complete +- **THEN** the LangGraph checkpoint contains no AIMessages with incomplete or orphaned tool_calls + +### Requirement: Command path mirrors chat path cleanup +The system SHALL apply identical cleanup logic in both the chat and command interrupt paths. + +#### Scenario: handleCommand calls removeLastAssistantToolCallMessage +- **WHEN** a tool call is interrupted via command +- **THEN** `handleCommand()` calls `sessionState.removeLastAssistantToolCallMessage()` before `popExchange()` + +#### Scenario: Consistent cleanup across paths +- **WHEN** interrupt occurs via either chat or command path +- **THEN** the resulting in-memory state and checkpoint state are identical + +### Requirement: Checkpoint reconciliation on resume +The system SHALL verify checkpoint consistency before resuming a conversation after an interrupt. 
+ +#### Scenario: Reconciliation runs after interrupt +- **WHEN** a new message is sent after an interrupt occurred +- **THEN** the system compares checkpoint state with in-memory conversation before calling `dispatchProvider` + +#### Scenario: Divergent state is reconciled +- **WHEN** checkpoint state differs from in-memory conversation after an interrupt +- **THEN** the system writes the cleaned in-memory state to the checkpoint before resuming + +#### Scenario: Normal resume unaffected +- **WHEN** a message is sent without a prior interrupt +- **THEN** no reconciliation check is performed and resume behavior is unchanged + +### Requirement: Integration test covers interrupt/resume scenario +The system SHALL include an integration test that verifies checkpoint consistency after interrupt and resume. + +#### Scenario: Test simulates interrupt during tool execution +- **WHEN** the test triggers a tool call and then simulates an interrupt +- **THEN** the checkpoint contains no orphaned tool call messages + +#### Scenario: Test verifies resume after interrupt +- **WHEN** the test sends a new message after the interrupt +- **THEN** the resumed conversation contains no duplicate tool calls or dangling references + +#### Scenario: Test covers both interrupt paths +- **WHEN** the test runs for both chat and command interrupt scenarios +- **THEN** both paths produce clean checkpoint state with no orphaned messages \ No newline at end of file diff --git a/openspec/changes/interrupt-checkpoint-cleanup/tasks.md b/openspec/changes/interrupt-checkpoint-cleanup/tasks.md new file mode 100644 index 0000000..ec76426 --- /dev/null +++ b/openspec/changes/interrupt-checkpoint-cleanup/tasks.md @@ -0,0 +1,46 @@ +## 1. 
Examine existing code and checkpointer API + +- [ ] 1.1 Read `src/session/stateManager.js` to understand `removeLastAssistantToolCallMessage()` implementation and its access to session state +- [ ] 1.2 Read `src/agent/react.js` to understand how the LangGraph graph is created, how the checkpointer is passed, and where `dispatchProvider` is called +- [ ] 1.3 Read `src/tui/app.js` to understand `handleChat()` and `handleCommand()` interrupt cleanup paths (lines 524-526 and 922-924) +- [ ] 1.4 Verify the JavaScript LangGraph checkpointer API — confirm `put()` or `update()` method signature and expected state format + +## 2. Propagate checkpoint cleanup in stateManager + +- [ ] 2.1 Modify `removeLastAssistantToolCallMessage()` to accept an optional checkpointer parameter +- [ ] 2.2 After removing the last assistant message with tool_calls from in-memory state, call `checkpointer.put()` with the cleaned conversation state +- [ ] 2.3 Handle edge case where checkpointer is not provided (graceful no-op for backward compatibility) +- [ ] 2.4 Ensure the checkpoint state update uses the correct thread_id from session state + +## 3. Add cleanup to handleCommand interrupt path + +- [ ] 3.1 In `handleCommand()` interrupt path (app.js:524-526), add call to `sessionState.removeLastAssistantToolCallMessage()` before `popExchange()` +- [ ] 3.2 Ensure the call is guarded — only execute if there are assistant messages with tool_calls to remove +- [ ] 3.3 Verify both interrupt paths (chat and command) produce identical cleanup behavior + +## 4. 
Implement checkpoint reconciliation on resume + +- [ ] 4.1 Add an `#interruptOccurred` flag to session state to track when an interrupt has happened +- [ ] 4.2 Set the flag in both `handleChat()` and `handleCommand()` interrupt paths +- [ ] 4.3 In `react.js`, before `dispatchProvider` call, check if `#interruptOccurred` is true +- [ ] 4.4 If flag is true, compare checkpoint messages with in-memory conversation messages +- [ ] 4.5 If they diverge (different length or last messages don't match), write cleaned in-memory state to checkpoint via `checkpointer.put()` +- [ ] 4.6 Clear the `#interruptOccurred` flag after reconciliation + +## 5. Add integration test for interrupt/resume scenario + +- [ ] 5.1 Create `tests/integration/interrupt-checkpoint.test.js` (or add to existing integration test file) +- [ ] 5.2 Mock the LangGraph checkpointer to capture state writes +- [ ] 5.3 Simulate a tool call interrupt via the chat path and verify checkpoint is cleaned +- [ ] 5.4 Simulate a tool call interrupt via the command path and verify checkpoint is cleaned +- [ ] 5.5 Verify that resuming after interrupt does not replay orphaned tool calls +- [ ] 5.6 Verify that normal resume (no interrupt) is unaffected by reconciliation logic + +## 6. 
Verify and commit + +- [ ] 6.1 Run `npm run test` to ensure all tests pass +- [ ] 6.2 Run `npm run lint` to ensure lint passes +- [ ] 6.3 Run `npm run coverage` to ensure 100% coverage is maintained +- [ ] 6.4 Run `timeout 10 npm start` to verify application starts without crashing +- [ ] 6.5 Commit changes with conventional commit format +- [ ] 6.6 Push branch and create PR \ No newline at end of file From ccf8d3fe75d72a3111515e9fe651b22bb4033361 Mon Sep 17 00:00:00 2001 From: Jason Mulligan Date: Mon, 29 Jun 2026 09:14:36 -0400 Subject: [PATCH 02/11] fix: use promisify(execFile) in searchFiles to fix trim error (#472) --- src/tools/filesystem.js | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/src/tools/filesystem.js b/src/tools/filesystem.js index 3caef39..d793421 100644 --- a/src/tools/filesystem.js +++ b/src/tools/filesystem.js @@ -2,8 +2,12 @@ import { tool } from "@langchain/core/tools"; import { z } from "zod"; import { access, readFile, writeFile, mkdir, readdir, stat } from "node:fs/promises"; import { dirname, basename, join } from "node:path"; +import { promisify } from "node:util"; +import { execFile } from "node:child_process"; import { validatePath, checkFileLimit } from "./common.js"; +const execFileAsync = promisify(execFile); + const MAX_CONTENT_SIZE = 500 * 1024; // 500KB for write operations // --- Helpers --- @@ -422,7 +426,6 @@ export async function searchFilesImpl(input, options) { } try { - const { execFile } = await import("node:child_process"); const limit = input.maxResults || 20; const rgArgs = [ "--line-number", @@ -432,7 +435,7 @@ export async function searchFilesImpl(input, options) { input.pattern, resolved.path, ].filter(Boolean); - const { stdout } = await execFile("rg", rgArgs, { timeout: 10000, encoding: "utf-8" }); + const { stdout } = await execFileAsync("rg", rgArgs, { timeout: 10000, encoding: "utf-8" }); const output = (stdout ?? 
"").trim(); if (!output) { From b7d2dca1442b07951f03017e58224c4afaf1fab4 Mon Sep 17 00:00:00 2001 From: Jason Mulligan Date: Mon, 29 Jun 2026 09:25:42 -0400 Subject: [PATCH 03/11] chore: bump version to 1.22.1 (#474) --- CHANGELOG.md | 7 +++++++ package-lock.json | 4 ++-- package.json | 2 +- 3 files changed, 10 insertions(+), 3 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 24489b2..69b10af 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -4,8 +4,15 @@ All notable changes to this project will be documented in this file. Dates are d Generated by [`auto-changelog`](https://github.com/CookPete/auto-changelog). +#### [1.22.1](https://github.com/avoidwork/madz/compare/1.22.0...1.22.1) + +- fix: use promisify(execFile) in searchFiles to fix trim error [`#472`](https://github.com/avoidwork/madz/pull/472) + #### [1.22.0](https://github.com/avoidwork/madz/compare/1.21.0...1.22.0) +> 28 June 2026 + +- chore: release v1.22.0 [`#466`](https://github.com/avoidwork/madz/pull/466) - feat: React agent regex false positives, dead code, and test gaps [`#464`](https://github.com/avoidwork/madz/pull/464) #### [1.21.0](https://github.com/avoidwork/madz/compare/1.20.0...1.21.0) diff --git a/package-lock.json b/package-lock.json index d207702..1b4ed4d 100644 --- a/package-lock.json +++ b/package-lock.json @@ -1,12 +1,12 @@ { "name": "@avoidwork/madz", - "version": "1.22.0", + "version": "1.22.1", "lockfileVersion": 3, "requires": true, "packages": { "": { "name": "@avoidwork/madz", - "version": "1.22.0", + "version": "1.22.1", "license": "BSD-3-Clause", "dependencies": { "@langchain/langgraph": "^1.4.5", diff --git a/package.json b/package.json index 475a377..d723be1 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "@avoidwork/madz", - "version": "1.22.0", + "version": "1.22.1", "description": "A personality-driven AI harness channeling Mads Mikkelsen's cinematic soul.", "keywords": [ "cli", From c9525f2dc6e35f9725d981bda8835911d2fedf34 Mon Sep 17 
00:00:00 2001 From: Jason Mulligan Date: Mon, 29 Jun 2026 15:43:52 -0400 Subject: [PATCH 04/11] docs: update project documentation and PR template (#476) --- .github/PULL_REQUEST_TEMPLATE.md | 4 ++-- AGENTS.md | 4 ++-- README.md | 5 ++--- 3 files changed, 6 insertions(+), 7 deletions(-) diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md index 5422fab..0be902f 100644 --- a/.github/PULL_REQUEST_TEMPLATE.md +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -18,11 +18,11 @@ Describe the tests you wrote/run and their coverage. Use "N/A" if not applicable ## Coverage -- [ ] 100% line coverage maintained +- [ ] Line coverage maintained ## Checklist - [ ] `npm run lint` passes -- [ ] Tests pass with 100% line coverage +- [ ] Tests pass with maintained line coverage - [ ] No forbidden patterns used - [ ] Conventional Commit style applied diff --git a/AGENTS.md b/AGENTS.md index dab3df8..5190202 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -308,7 +308,7 @@ Session learnings — critical gotchas that affect how code must be written and ### 6.1 Coverage -The pre-commit hook enforces **100% code coverage**. Every new function or class needs test coverage. No exceptions. +The pre-commit hook enforces **maintained code coverage**. Every new function or class needs test coverage. ```bash npm run coverage @@ -390,5 +390,5 @@ The `README.md` may show a more up-to-date project structure (e.g., additional m - [ ] `oxlint` and `oxfmt` pass via pre-commit hooks. - [ ] No hardcoded secrets or credentials introduced. - [ ] Environment variable configuration used (no config file logic). -- [ ] 100% code coverage maintained (pre-commit will enforce this). +- [ ] Code coverage maintained (pre-commit will enforce this). - [ ] Threat model considerations addressed in PR description. 
diff --git a/README.md b/README.md index dfb7419..8263194 100644 --- a/README.md +++ b/README.md @@ -5,7 +5,6 @@ [![License: BSD-3-Clause](https://img.shields.io/badge/License-BSD--3--Clause-blue.svg)](LICENSE) [![Node.js >= 24](https://img.shields.io/badge/node-%3E%3D24-brightgreen)](https://nodejs.org) [![Tests](https://img.shields.io/badge/tests-passing-brightgreen)](#testing) -[![Coverage](https://img.shields.io/badge/coverage-93.37%25-brightgreen)](#testing) `madz` is a Node.js AI harness that combines a terminal-based UI with structured skill execution and a distinctive personality. Drawn from Mads Mikkelsen's most iconic roles, it speaks with calm, precision, and quiet intensity — solving problems with style, remembering your context, safely running your skills, and automating the mundane. Everything is persisted as version-controllable Markdown files, making it easy to audit with `git log` and re-load across sessions. Built on LangGraph, OpenTelemetry, and Ink — with persistent memory, sandboxed skill execution, cron scheduling, and a React-powered TUI. @@ -620,7 +619,7 @@ npm run fix npm run lint ``` -The pre-commit hook runs linting, formatting, and tests (targeting 100% code coverage). A commit will fail if any gate does not pass. +The pre-commit hook runs linting, formatting, and tests (targeting maintained code coverage). A commit will fail if any gate does not pass. 
## Development @@ -628,7 +627,7 @@ The pre-commit hook runs linting, formatting, and tests (targeting 100% code cov npm install npm run fix # Format and lint-fix all files npm run test # Verify changes -npm run coverage # Generate and verify 100% coverage +npm run coverage # Generate and verify coverage ``` ### Extending Skills From 74a3928728f7109960902f994c5dbc052445ae20 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 29 Jun 2026 22:37:10 +0000 Subject: [PATCH 05/11] chore(deps): bump cron-parser from 5.6.0 to 5.6.1 (#479) Bumps [cron-parser](https://github.com/harrisiirak/cron-parser) from 5.6.0 to 5.6.1. - [Release notes](https://github.com/harrisiirak/cron-parser/releases) - [Commits](https://github.com/harrisiirak/cron-parser/compare/v5.6.0...v5.6.1) --- updated-dependencies: - dependency-name: cron-parser dependency-version: 5.6.1 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- package-lock.json | 8 ++++---- package.json | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/package-lock.json b/package-lock.json index 1b4ed4d..901dd56 100644 --- a/package-lock.json +++ b/package-lock.json @@ -14,7 +14,7 @@ "@langchain/openai": "^1.5.2", "@opentelemetry/api": "^1.9.0", "@opentelemetry/sdk-node": "^0.219.0", - "cron-parser": "^5.6.0", + "cron-parser": "^5.6.1", "ink": "^7.1.0", "ink-scroll-view": "^0.3.7", "js-yaml": "^4.2.0", @@ -2275,9 +2275,9 @@ } }, "node_modules/cron-parser": { - "version": "5.6.0", - "resolved": "https://registry.npmjs.org/cron-parser/-/cron-parser-5.6.0.tgz", - "integrity": "sha512-6159NDv4eDOjXYDmMTkvUtaVcIrNZI779ydTMOfDdi9fSjhPqmySx0icCHX2+nOyXgNSw6sFCsgLKxYs/2KPuQ==", + "version": "5.6.1", + "resolved": "https://registry.npmjs.org/cron-parser/-/cron-parser-5.6.1.tgz", + "integrity": 
"sha512-QBm4o1PwZiuY7KFbVvW7FLC8bozy7YWzv+Fz6KRS7sQghzcbDZCGxr/Bc5b6TQreAoSwuWVP491dIcK0THCX6A==", "license": "MIT", "dependencies": { "luxon": "^3.7.2" diff --git a/package.json b/package.json index d723be1..7ffcb85 100644 --- a/package.json +++ b/package.json @@ -67,7 +67,7 @@ "@langchain/openai": "^1.5.2", "@opentelemetry/api": "^1.9.0", "@opentelemetry/sdk-node": "^0.219.0", - "cron-parser": "^5.6.0", + "cron-parser": "^5.6.1", "ink": "^7.1.0", "ink-scroll-view": "^0.3.7", "js-yaml": "^4.2.0", From a1363d5c9ee39f19361f03dcf0eb8009d008f574 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 29 Jun 2026 22:38:05 +0000 Subject: [PATCH 06/11] chore(deps): bump the langgraph group with 2 updates (#477) Bumps the langgraph group with 2 updates: [@langchain/langgraph](https://github.com/langchain-ai/langgraphjs/tree/HEAD/libs/langgraph-core) and [@langchain/openai](https://github.com/langchain-ai/langchainjs). Updates `@langchain/langgraph` from 1.4.5 to 1.4.7 - [Release notes](https://github.com/langchain-ai/langgraphjs/releases) - [Changelog](https://github.com/langchain-ai/langgraphjs/blob/main/libs/langgraph-core/CHANGELOG.md) - [Commits](https://github.com/langchain-ai/langgraphjs/commits/@langchain/langgraph@1.4.7/libs/langgraph-core) Updates `@langchain/openai` from 1.5.2 to 1.5.3 - [Release notes](https://github.com/langchain-ai/langchainjs/releases) - [Commits](https://github.com/langchain-ai/langchainjs/compare/@langchain/openai@1.5.2...@langchain/openai@1.5.3) --- updated-dependencies: - dependency-name: "@langchain/langgraph" dependency-version: 1.4.7 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: langgraph - dependency-name: "@langchain/openai" dependency-version: 1.5.3 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: langgraph ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- package-lock.json | 40 +++++++++++++++++----------------------- package.json | 4 ++-- 2 files changed, 19 insertions(+), 25 deletions(-) diff --git a/package-lock.json b/package-lock.json index 901dd56..6123a5d 100644 --- a/package-lock.json +++ b/package-lock.json @@ -9,9 +9,9 @@ "version": "1.22.1", "license": "BSD-3-Clause", "dependencies": { - "@langchain/langgraph": "^1.4.5", + "@langchain/langgraph": "^1.4.7", "@langchain/langgraph-checkpoint-sqlite": "^1.0.3", - "@langchain/openai": "^1.5.2", + "@langchain/openai": "^1.5.3", "@opentelemetry/api": "^1.9.0", "@opentelemetry/sdk-node": "^0.219.0", "cron-parser": "^5.6.1", @@ -248,13 +248,13 @@ } }, "node_modules/@langchain/langgraph": { - "version": "1.4.5", - "resolved": "https://registry.npmjs.org/@langchain/langgraph/-/langgraph-1.4.5.tgz", - "integrity": "sha512-V+o29JPBaMoK/e+8R/m81XaC8h5iNuwWymvgLFhXfJbf7E2xt2mQUkcVXTi4cudGRHbRd14kidCpfaQbfPoYCw==", + "version": "1.4.7", + "resolved": "https://registry.npmjs.org/@langchain/langgraph/-/langgraph-1.4.7.tgz", + "integrity": "sha512-2tcyf3QGC7v89kqSxMCtRvzg/3L/4yHtOaWC49A8KieCciWJs7LGaxHoPB6QRxXyUgyR+Zg9Q1ss/XJIE+JuSQ==", "license": "MIT", "dependencies": { - "@langchain/langgraph-checkpoint": "^1.1.2", - "@langchain/langgraph-sdk": "~1.9.24", + "@langchain/langgraph-checkpoint": "^1.1.3", + "@langchain/langgraph-sdk": "~1.9.25", "@langchain/protocol": "^0.0.18", "@standard-schema/spec": "1.1.0" }, @@ -263,19 +263,13 @@ }, "peerDependencies": { "@langchain/core": "^1.1.48", - "zod": "^3.25.32 || ^4.2.0", - "zod-to-json-schema": "^3.x" - }, - "peerDependenciesMeta": { - "zod-to-json-schema": { - "optional": true - } + "zod": "^3.25.32 || ^4.2.0" } }, "node_modules/@langchain/langgraph-checkpoint": { - "version": "1.1.2", - "resolved": "https://registry.npmjs.org/@langchain/langgraph-checkpoint/-/langgraph-checkpoint-1.1.2.tgz", - "integrity": 
"sha512-m5Xd7W3G9JrlEhFZ5WAcqZPgE46R9gr1gFDFaVqEKeuwin3tgEp0jlPbru+iFXCug338DcQjFS/Kuuci21ydvw==", + "version": "1.1.3", + "resolved": "https://registry.npmjs.org/@langchain/langgraph-checkpoint/-/langgraph-checkpoint-1.1.3.tgz", + "integrity": "sha512-wgzdQNeEsdw1e+4lvlj0tdq/RYR/k1vPin10g0ymGoehZDDgd9nvIllGXSXN4TFgF9sf5qQP/KTkOcLfeseIhA==", "license": "MIT", "engines": { "node": ">=18" @@ -301,9 +295,9 @@ } }, "node_modules/@langchain/langgraph-sdk": { - "version": "1.9.24", - "resolved": "https://registry.npmjs.org/@langchain/langgraph-sdk/-/langgraph-sdk-1.9.24.tgz", - "integrity": "sha512-WhM6QdxNipndQjl5nkvqnBt9Wl16oO2p0KiVhndAFLJMwO3bZLEx++lwtbqUFQu1sHyNxiWixgRGm8qZsuHCeA==", + "version": "1.9.25", + "resolved": "https://registry.npmjs.org/@langchain/langgraph-sdk/-/langgraph-sdk-1.9.25.tgz", + "integrity": "sha512-mRKW8zyQUaHox+HirRFMRrPqOvNbQI3xeXDt6kkk4PbBg77V92bsO1WzUVNrmJ81zCkvxyOrWSK8D6ioCj0a8A==", "license": "MIT", "dependencies": { "@langchain/protocol": "^0.0.18", @@ -368,9 +362,9 @@ } }, "node_modules/@langchain/openai": { - "version": "1.5.2", - "resolved": "https://registry.npmjs.org/@langchain/openai/-/openai-1.5.2.tgz", - "integrity": "sha512-En/QzXO3YFuaaZWQiGx0ZBNJMK3ipL/tz8F/PReG/63oV3wk2nz906QA8drYnd8r2/3NtSkbf3x/8qms5o6qTg==", + "version": "1.5.3", + "resolved": "https://registry.npmjs.org/@langchain/openai/-/openai-1.5.3.tgz", + "integrity": "sha512-OStS2AUvy9oe/hEf/3ndBOFztUDOfuJYLNXh89m3iiJAI2Cp5Dp0n/pvpO27MO0b+VgENd+xSHVyQZ7fe+ulxg==", "license": "MIT", "dependencies": { "js-tiktoken": "^1.0.12", diff --git a/package.json b/package.json index 7ffcb85..d7f01a1 100644 --- a/package.json +++ b/package.json @@ -62,9 +62,9 @@ "oxlint": "^1.71.0" }, "dependencies": { - "@langchain/langgraph": "^1.4.5", + "@langchain/langgraph": "^1.4.7", "@langchain/langgraph-checkpoint-sqlite": "^1.0.3", - "@langchain/openai": "^1.5.2", + "@langchain/openai": "^1.5.3", "@opentelemetry/api": "^1.9.0", "@opentelemetry/sdk-node": "^0.219.0", "cron-parser": 
"^5.6.1", From 9eb228122a262bcd76423252c7496c3833ee7801 Mon Sep 17 00:00:00 2001 From: Jason Mulligan Date: Mon, 29 Jun 2026 20:15:27 -0400 Subject: [PATCH 07/11] fix: restore corrupted JSDoc and function signature in subAgent.js (#481) --- src/tools/subAgent.js | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/src/tools/subAgent.js b/src/tools/subAgent.js index 8b01af9..4f02cb0 100644 --- a/src/tools/subAgent.js +++ b/src/tools/subAgent.js @@ -174,8 +174,11 @@ export function spawnSubAgentProcess(prompt, timeout, targetCwd = defaultCwd, te * @param {"continue" | "fail-fast"} onError - Error handling strategy * @param {number} timeout - Timeout in milliseconds * @param {string} targetCwd - Working directory for the sub-agent - * @returns {Promise<{ ok: boolean, result: string, error?: const prompt = task.context ? `${task.context}\n\n${task.delegation}` : task.delegation; - const result = await spawnSubAgentProcess(prompt, timeout, targetCwd, temperature);e; + * @returns {Promise<{ ok: boolean, result: string, error?: string }>} + */ +async function executeFanOut(tasks, strategy, maxConcurrent, onError, timeout, targetCwd) { + const results = []; + let failed = false; if (strategy === "sequential") { for (const task of tasks) { From e1585075b306f3d608e1d229067f96e26534f30b Mon Sep 17 00:00:00 2001 From: Jason Mulligan Date: Mon, 29 Jun 2026 20:19:07 -0400 Subject: [PATCH 08/11] chore: release v1.22.2 (#482) --- CHANGELOG.md | 10 ++++++++++ package-lock.json | 4 ++-- package.json | 2 +- 3 files changed, 13 insertions(+), 3 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 69b10af..502fbe8 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -4,8 +4,18 @@ All notable changes to this project will be documented in this file. Dates are d Generated by [`auto-changelog`](https://github.com/CookPete/auto-changelog). 
+#### [1.22.2](https://github.com/avoidwork/madz/compare/1.22.1...1.22.2) + +- fix: restore corrupted JSDoc and function signature in subAgent.js [`#481`](https://github.com/avoidwork/madz/pull/481) +- chore(deps): bump the langgraph group with 2 updates [`#477`](https://github.com/avoidwork/madz/pull/477) +- chore(deps): bump cron-parser from 5.6.0 to 5.6.1 [`#479`](https://github.com/avoidwork/madz/pull/479) +- docs: update project documentation and PR template [`#476`](https://github.com/avoidwork/madz/pull/476) + #### [1.22.1](https://github.com/avoidwork/madz/compare/1.22.0...1.22.1) +> 29 June 2026 + +- chore: bump version to 1.22.1 [`#474`](https://github.com/avoidwork/madz/pull/474) - fix: use promisify(execFile) in searchFiles to fix trim error [`#472`](https://github.com/avoidwork/madz/pull/472) #### [1.22.0](https://github.com/avoidwork/madz/compare/1.21.0...1.22.0) diff --git a/package-lock.json b/package-lock.json index 6123a5d..4c526da 100644 --- a/package-lock.json +++ b/package-lock.json @@ -1,12 +1,12 @@ { "name": "@avoidwork/madz", - "version": "1.22.1", + "version": "1.22.2", "lockfileVersion": 3, "requires": true, "packages": { "": { "name": "@avoidwork/madz", - "version": "1.22.1", + "version": "1.22.2", "license": "BSD-3-Clause", "dependencies": { "@langchain/langgraph": "^1.4.7", diff --git a/package.json b/package.json index d7f01a1..5a1e17c 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "@avoidwork/madz", - "version": "1.22.1", + "version": "1.22.2", "description": "A personality-driven AI harness channeling Mads Mikkelsen's cinematic soul.", "keywords": [ "cli", From e49ad8571d657a20be7896901670452da05e0cfe Mon Sep 17 00:00:00 2001 From: Jason Mulligan Date: Mon, 29 Jun 2026 21:15:49 -0400 Subject: [PATCH 09/11] refactor: flatten providers config to use type as active provider selector (#485) * refactor: flatten providers config to use type as active provider selector Move the 'type' field from providers.openai.type 
to providers.type, making it the single source of truth for the active provider. Provider-specific configs remain as sibling keys under providers. - config.yaml: add providers.type, remove type from providers.openai - schemas.js: remove type from _OpenaiProviderConfigSchema - schemas.js: add type enum to ProvidersSchema * refactor: restrict ProvidersSchema to 'openai' only for now Keep the enum array form for easy future expansion, but only allow 'openai' until other providers are implemented. --- config.yaml | 2 +- src/config/schemas.js | 5 +++-- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/config.yaml b/config.yaml index 72029c3..222de36 100644 --- a/config.yaml +++ b/config.yaml @@ -1,6 +1,6 @@ providers: + type: openai openai: - type: openai base_url: https://api.openai.com/v1 model: gpt-4o encoding: cl100k_base diff --git a/src/config/schemas.js b/src/config/schemas.js index c3456a1..78e7032 100644 --- a/src/config/schemas.js +++ b/src/config/schemas.js @@ -65,7 +65,6 @@ const SearchConfigSchema = z.object({ }); const _OpenaiProviderConfigSchema = z.object({ - type: z.literal("openai").default("openai"), base_url: z.string().url().default("https://api.openai.com/v1"), model: z.string().min(1), encoding: z.string().optional(), @@ -85,7 +84,9 @@ const _FalProviderConfigSchema = z.object({ credentials: FalCredentialsSchema, }); -export const ProvidersSchema = z.object({}).passthrough(); +export const ProvidersSchema = z.object({ + type: z.enum(["openai"]).default("openai"), +}).passthrough(); // --- Sandbox schemas --- From 5fcd0566a91bd71ec387b4e5be19f84855f82e3f Mon Sep 17 00:00:00 2001 From: Jason Mulligan Date: Tue, 30 Jun 2026 07:37:03 -0400 Subject: [PATCH 10/11] Revert "refactor: flatten providers config to use type as active provider selector (#485)" (#487) This reverts commit e49ad8571d657a20be7896901670452da05e0cfe. 
--- config.yaml | 2 +- src/config/schemas.js | 5 ++--- 2 files changed, 3 insertions(+), 4 deletions(-) diff --git a/config.yaml b/config.yaml index 222de36..72029c3 100644 --- a/config.yaml +++ b/config.yaml @@ -1,6 +1,6 @@ providers: - type: openai openai: + type: openai base_url: https://api.openai.com/v1 model: gpt-4o encoding: cl100k_base diff --git a/src/config/schemas.js b/src/config/schemas.js index 78e7032..c3456a1 100644 --- a/src/config/schemas.js +++ b/src/config/schemas.js @@ -65,6 +65,7 @@ const SearchConfigSchema = z.object({ }); const _OpenaiProviderConfigSchema = z.object({ + type: z.literal("openai").default("openai"), base_url: z.string().url().default("https://api.openai.com/v1"), model: z.string().min(1), encoding: z.string().optional(), @@ -84,9 +85,7 @@ const _FalProviderConfigSchema = z.object({ credentials: FalCredentialsSchema, }); -export const ProvidersSchema = z.object({ - type: z.enum(["openai"]).default("openai"), -}).passthrough(); +export const ProvidersSchema = z.object({}).passthrough(); // --- Sandbox schemas --- From 39d3414edf3fe0e6df065791aae01feceee4c0f3 Mon Sep 17 00:00:00 2001 From: Jason Mulligan Date: Tue, 30 Jun 2026 07:49:42 -0400 Subject: [PATCH 11/11] docs: add OpenSpec proposal for interrupt checkpoint cleanup --- .../.openspec.yaml | 2 + .../interrupt-checkpoint-cleanup/design.md | 85 +++++++++++++++++++ .../interrupt-checkpoint-cleanup/proposal.md | 25 ++++++ .../specs/interrupt-cleanup/spec.md | 59 +++++++++++++ .../interrupt-checkpoint-cleanup/tasks.md | 40 +++++++++ 5 files changed, 211 insertions(+) create mode 100644 openspec/changes/interrupt-checkpoint-cleanup/.openspec.yaml create mode 100644 openspec/changes/interrupt-checkpoint-cleanup/design.md create mode 100644 openspec/changes/interrupt-checkpoint-cleanup/proposal.md create mode 100644 openspec/changes/interrupt-checkpoint-cleanup/specs/interrupt-cleanup/spec.md create mode 100644 openspec/changes/interrupt-checkpoint-cleanup/tasks.md diff --git 
a/openspec/changes/interrupt-checkpoint-cleanup/.openspec.yaml b/openspec/changes/interrupt-checkpoint-cleanup/.openspec.yaml new file mode 100644 index 0000000..d6b53de --- /dev/null +++ b/openspec/changes/interrupt-checkpoint-cleanup/.openspec.yaml @@ -0,0 +1,2 @@ +schema: spec-driven +created: 2026-06-30 diff --git a/openspec/changes/interrupt-checkpoint-cleanup/design.md b/openspec/changes/interrupt-checkpoint-cleanup/design.md new file mode 100644 index 0000000..6b5cf65 --- /dev/null +++ b/openspec/changes/interrupt-checkpoint-cleanup/design.md @@ -0,0 +1,85 @@ +## Context + +The madz application uses LangGraph for state machine management of AI conversations. When a user interrupts a tool call (e.g., via command or cancel), the cleanup process only removes messages from in-memory state (`sessionState.#state.conversation`) but does not update the LangGraph checkpoint. The checkpoint, written at superstep boundaries by `createReactAgentGraph`, retains partial AIMessages with tool_calls. On resume, `streamEvents` replays from the checkpoint, causing orphaned tool calls to corrupt the resumed turn. + +The current codebase has two interrupt paths: +- `handleChat()` in `app.js` (lines 922-924): calls both `removeLastAssistantToolCallMessage()` and `popExchange()` +- `handleCommand()` in `app.js` (lines 524-526): only calls `popExchange()` — missing tool call cleanup + +The `removeLastAssistantToolCallMessage()` function in `stateManager.js` (lines 80-88) only modifies the in-memory conversation array. 
+ +## Goals / Non-Goals + +**Goals:** +- Propagate checkpoint cleanup when in-memory state is cleaned on interrupt +- Ensure consistent cleanup across both `handleChat()` and `handleCommand()` interrupt paths +- Implement checkpoint reconciliation before resume to ensure state consistency +- Add integration test to verify the fix + +**Non-Goals:** +- Refactoring the entire checkpoint management system +- Changing the LangGraph graph structure or state machine logic +- Adding new interrupt mechanisms or signals +- Handling checkpoint corruption beyond the specific orphaned tool call case + +## Decisions + +### Decision 1: Extend `removeLastAssistantToolCallMessage()` to accept an optional checkpointer + +**Choice:** Modify `removeLastAssistantToolCallMessage()` to accept an optional `checkpointer` parameter. When provided, after removing the message from in-memory conversation, also update the checkpoint. + +**Rationale:** This keeps the cleanup logic centralized in `stateManager.js` where it already lives. The checkpointer parameter is optional to maintain backward compatibility with existing callers that don't need checkpoint cleanup. + +**Alternatives considered:** +- Create a separate `cleanupCheckpoint()` function: More explicit but duplicates the message identification logic +- Pass checkpointer to all stateManager methods: Over-engineered, most methods don't need it + +### Decision 2: Use `checkpointer.put()` with cleaned state for checkpoint update + +**Choice:** After removing the orphaned message from in-memory state, call `checkpointer.put()` with the cleaned conversation state to update the checkpoint. + +**Rationale:** The LangGraph JS checkpointer API provides `put()` for writing state tuples. By putting a cleaned state tuple, we effectively replace the checkpoint state with the cleaned version. 
+ +**Alternatives considered:** +- Use `checkpointer.update()`: May not be available in the JS variant, or may have different semantics +- Delete specific checkpoint entries: More complex, requires knowing the exact checkpoint entry IDs + +### Decision 3: Add reconciliation step before `dispatchProvider` after interrupt + +**Choice:** Before calling `dispatchProvider` after an interrupt, add a reconciliation step that compares the checkpoint state with in-memory conversation. If they diverge, write the cleaned state to the checkpoint. + +**Rationale:** This provides a safety net for cases where checkpoint cleanup might fail or be incomplete. It ensures the graph always resumes from a consistent state. + +**Alternatives considered:** +- Rely solely on checkpoint cleanup during interrupt: Less robust, no fallback if cleanup fails +- Always reset the checkpoint on resume: Too aggressive, might lose valid state + +### Decision 4: Graceful degradation if checkpoint cleanup fails + +**Choice:** If checkpoint cleanup fails (e.g., checkpointer not available, API error), log a warning but continue with in-memory cleanup. The checkpoint inconsistency is a bug but not a crash. + +**Rationale:** The in-memory cleanup still prevents the most obvious user-facing issues (duplicate messages in conversation). The checkpoint inconsistency can be addressed by the reconciliation step on resume. 
+ +**Alternatives considered:** +- Fail the interrupt handling if checkpoint cleanup fails: Too strict, leaves the user in a broken state +- Silently ignore checkpoint cleanup failures: Doesn't help with debugging + +## Risks / Trade-offs + +| Risk | Mitigation | +|------|------------| +| LangGraph JS checkpointer API may differ from Python | Test against actual LangGraph JS version, use public API only | +| Checkpoint reconciliation adds complexity to resume flow | Keep reconciliation simple: compare conversation arrays, write cleaned state if different | +| Thread ID management for checkpoint operations | Scope all checkpoint operations to current `thread_id`, verify thread ID is available | +| Performance impact of state comparison during reconciliation | Compare only message IDs and types, not full message content | +| Checkpointer may not be available in all contexts | Make checkpointer parameter optional, degrade gracefully if unavailable | + +## Migration Plan + +This is a bug fix with no migration required. The changes are internal to the cleanup and resume logic. No user-facing behavior changes except the fix itself (no more orphaned tool calls on resume). + +## Open Questions + +1. **LangGraph JS checkpointer API specifics**: Need to verify the exact API for updating checkpoint state in the Node.js variant. The Python docs reference `checkpointer.put()` and `checkpointer.update()`, but the JS API may differ. +2. **Thread ID availability**: Need to verify that `thread_id` is available in the interrupt handling context to scope checkpoint operations correctly. +3. **Checkpoint state format**: Need to understand the exact format of the state tuple that `checkpointer.put()` expects to ensure we're writing the correct structure. 
\ No newline at end of file diff --git a/openspec/changes/interrupt-checkpoint-cleanup/proposal.md b/openspec/changes/interrupt-checkpoint-cleanup/proposal.md new file mode 100644 index 0000000..4581a8c --- /dev/null +++ b/openspec/changes/interrupt-checkpoint-cleanup/proposal.md @@ -0,0 +1,25 @@ +## Why + +When a user interrupts a tool call during a conversation, the cleanup only removes messages from in-memory state but leaves orphaned AIMessages with tool_calls in the LangGraph checkpoint. On resume, the checkpoint replays these orphaned messages, causing duplicate tool calls, dangling tool references in LLM API requests, and corrupted conversation history. This breaks the user experience and can cause confusing or broken conversation flows. + +## What Changes + +- Extend `removeLastAssistantToolCallMessage()` in `stateManager.js` to accept an optional checkpointer and propagate cleanup to the LangGraph checkpoint +- Add missing `removeLastAssistantToolCallMessage()` call to `handleCommand()` interrupt path in `app.js` to match `handleChat()` behavior +- Implement checkpoint reconciliation before `dispatchProvider` after an interrupt to ensure checkpoint and in-memory state are consistent +- Add integration test verifying checkpoint contains no orphaned tool calls after interrupt + resume + +## Capabilities + +### New Capabilities +- `interrupt-cleanup`: Propagate session state cleanup to LangGraph checkpoint on interrupt, ensuring no orphaned tool calls persist across resume + +### Modified Capabilities + + +## Impact + +- **Affected code**: `./src/tui/app.js` (handleChat, handleCommand), `./src/session/stateManager.js` (removeLastAssistantToolCallMessage), `./src/agent/react.js` (dispatchProvider, graph setup) +- **Dependencies**: LangGraph checkpointer API (JS/Node.js variant) +- **Breaking changes**: None — this is a bug fix that changes internal cleanup behavior without altering public APIs +- **Systems**: LangGraph checkpoint store, conversation state 
management \ No newline at end of file diff --git a/openspec/changes/interrupt-checkpoint-cleanup/specs/interrupt-cleanup/spec.md b/openspec/changes/interrupt-checkpoint-cleanup/specs/interrupt-cleanup/spec.md new file mode 100644 index 0000000..4ac7668 --- /dev/null +++ b/openspec/changes/interrupt-checkpoint-cleanup/specs/interrupt-cleanup/spec.md @@ -0,0 +1,59 @@ +## ADDED Requirements + +### Requirement: Interrupt cleanup propagates to LangGraph checkpoint +The system SHALL propagate session state cleanup to the LangGraph checkpoint when an interrupt occurs during tool execution, ensuring no orphaned AIMessages with tool_calls persist in the checkpoint. + +#### Scenario: Interrupt during tool call cleans checkpoint +- **WHEN** a tool call is interrupted (via command or cancel) +- **THEN** the LangGraph checkpoint is updated to remove the partial AIMessage containing the tool_call +- **AND** the in-memory conversation state is also cleaned of the partial message + +#### Scenario: No orphaned tool calls after interrupt and resume +- **WHEN** a user interrupts a tool call and then sends a new message to resume +- **THEN** the resumed conversation contains no duplicate tool calls +- **AND** the LLM API request does not include dangling tool references + +#### Scenario: Checkpoint cleanup with checkpointer provided +- **WHEN** `removeLastAssistantToolCallMessage()` is called with a checkpointer parameter +- **THEN** the function removes the message from in-memory conversation +- **AND** the function updates the LangGraph checkpoint to remove the corresponding message + +### Requirement: Consistent cleanup across interrupt paths +The system SHALL perform identical cleanup operations in both `handleChat()` and `handleCommand()` interrupt paths, ensuring tool call messages are removed from in-memory state regardless of how the interrupt was triggered. 
+ +#### Scenario: handleCommand() cleans tool call messages +- **WHEN** an interrupt occurs via a command (not chat message) +- **THEN** `removeLastAssistantToolCallMessage()` is called to clean tool call messages +- **AND** `popExchange()` is called to clean the exchange state + +#### Scenario: handleChat() cleans tool call messages +- **WHEN** an interrupt occurs during chat message processing +- **THEN** `removeLastAssistantToolCallMessage()` is called to clean tool call messages +- **AND** `popExchange()` is called to clean the exchange state + +### Requirement: Checkpoint reconciliation on resume +The system SHALL reconcile the LangGraph checkpoint with in-memory conversation state before resuming after an interrupt, ensuring the graph starts from a consistent state. + +#### Scenario: Reconciliation before dispatchProvider +- **WHEN** a new message is sent after an interrupt +- **THEN** the system compares checkpoint state with in-memory conversation before calling `dispatchProvider` +- **AND** if the states diverge, the cleaned state is written to the checkpoint + +#### Scenario: Normal resume without reconciliation needed +- **WHEN** a conversation resumes without a prior interrupt +- **THEN** no reconciliation is performed (checkpoint and in-memory state are already consistent) +- **AND** `dispatchProvider` proceeds normally + +### Requirement: Graceful degradation on checkpoint cleanup failure +The system SHALL gracefully handle failures during checkpoint cleanup, ensuring in-memory cleanup still proceeds even if checkpoint update fails. 
+ +#### Scenario: Checkpointer unavailable +- **WHEN** the checkpointer is not available during interrupt cleanup +- **THEN** in-memory cleanup proceeds normally +- **AND** a warning is logged indicating checkpoint cleanup was skipped + +#### Scenario: Checkpoint update error +- **WHEN** the checkpoint update fails with an error +- **THEN** in-memory cleanup proceeds normally +- **AND** the error is logged for debugging +- **AND** the reconciliation step on resume will address the inconsistency \ No newline at end of file diff --git a/openspec/changes/interrupt-checkpoint-cleanup/tasks.md b/openspec/changes/interrupt-checkpoint-cleanup/tasks.md new file mode 100644 index 0000000..ffc5443 --- /dev/null +++ b/openspec/changes/interrupt-checkpoint-cleanup/tasks.md @@ -0,0 +1,40 @@ +## 1. Investigate LangGraph Checkpointer API + +- [ ] 1.1 Review LangGraph JS checkpointer API in src/agent/react.js to understand available methods for checkpoint updates +- [ ] 1.2 Verify checkpointer.put() or checkpointer.update() signature and expected state tuple format +- [ ] 1.3 Confirm thread_id is available in interrupt handling context for scoping checkpoint operations + +## 2. Extend removeLastAssistantToolCallMessage for checkpoint cleanup + +- [ ] 2.1 Modify removeLastAssistantToolCallMessage() in src/session/stateManager.js to accept optional checkpointer parameter +- [ ] 2.2 Implement checkpoint update logic: after removing message from in-memory conversation, call checkpointer.put() with cleaned state +- [ ] 2.3 Add graceful degradation: if checkpointer is unavailable or update fails, log warning and continue with in-memory cleanup only + +## 3. 
Add cleanup to handleCommand() interrupt path + +- [ ] 3.1 Identify the interrupt handling code in handleCommand() in src/tui/app.js (around lines 524-526) +- [ ] 3.2 Add removeLastAssistantToolCallMessage() call alongside existing popExchange() to match handleChat() behavior +- [ ] 3.3 Pass checkpointer to removeLastAssistantToolCallMessage() if available + +## 4. Implement checkpoint reconciliation on resume + +- [ ] 4.1 Add reconciliation step before dispatchProvider in src/agent/react.js that compares checkpoint state with in-memory conversation +- [ ] 4.2 Implement state comparison logic: compare message IDs and types between checkpoint and in-memory state +- [ ] 4.3 If states diverge, write cleaned state to checkpoint before proceeding with dispatchProvider +- [ ] 4.4 Ensure normal (non-interrupt) resume is not affected — reconciliation only triggers when states diverge + +## 5. Add integration test for interrupt cleanup + +- [ ] 5.1 Create test file tests/unit/interrupt-checkpoint-cleanup.test.js +- [ ] 5.2 Mock a tool call that requires user input and simulate interrupt during execution +- [ ] 5.3 Verify checkpoint contains no orphaned AIMessages with tool_calls after interrupt +- [ ] 5.4 Verify no duplicate tool calls in resumed conversation after interrupt + new message +- [ ] 5.5 Test both handleChat() and handleCommand() interrupt paths +- [ ] 5.6 Test graceful degradation when checkpointer is unavailable + +## 6. Verify and test + +- [ ] 6.1 Run npm run test to verify all tests pass +- [ ] 6.2 Run npm run lint to verify no lint errors +- [ ] 6.3 Run npm run coverage to verify coverage is maintained +- [ ] 6.4 Run timeout 10 npm start to verify application starts without crashing \ No newline at end of file
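Reviewer note: the cleanup, graceful-degradation, and reconciliation behavior described in design.md and spec.md might look roughly like the sketch below. This is a minimal illustration under stated assumptions, not the madz implementation. The message shape (`id`, `role`, `tool_calls`), the options object, the `statesDiverge` helper, and the checkpointer's `saveState(threadId, messages)` method are all hypothetical stand-ins. The real LangGraph JS checkpoint-write call and its expected state format remain open questions in the design.

```javascript
// Hypothetical sketch: none of these names come from the madz source.
// `saveState` is a placeholder for whatever checkpoint-write call the
// LangGraph JS checkpointer actually exposes (unverified in the design).

// True when a message is an assistant message that carries tool calls.
function isToolCallMessage(msg) {
  return (
    msg.role === "assistant" &&
    Array.isArray(msg.tool_calls) &&
    msg.tool_calls.length > 0
  );
}

// Removes a trailing assistant tool-call message from the in-memory
// conversation, then tries to mirror the cleaned state into the checkpoint.
// A missing or failing checkpointer only produces a warning.
async function removeLastAssistantToolCallMessage(
  conversation,
  { checkpointer, threadId, log = console } = {},
) {
  // In-memory cleanup always runs first and never depends on the checkpoint.
  const last = conversation[conversation.length - 1];
  const removed = last !== undefined && isToolCallMessage(last);
  if (removed) conversation.pop();

  if (!checkpointer) {
    log.warn("checkpoint cleanup skipped: no checkpointer available");
    return { removed, checkpointUpdated: false };
  }

  try {
    // Write a copy so later in-memory mutations do not leak into the store.
    await checkpointer.saveState(threadId, [...conversation]);
    return { removed, checkpointUpdated: true };
  } catch (err) {
    log.warn(`checkpoint cleanup failed: ${err.message}`);
    return { removed, checkpointUpdated: false };
  }
}

// Reconciliation check: compares only message IDs and roles, not content,
// as suggested in the Risks / Trade-offs table.
function statesDiverge(checkpointMessages, conversation) {
  if (checkpointMessages.length !== conversation.length) return true;
  return checkpointMessages.some(
    (msg, i) => msg.id !== conversation[i].id || msg.role !== conversation[i].role,
  );
}
```

Returning a small result object instead of throwing keeps the interrupt path alive when the checkpoint write fails, which leaves the inconsistency for the resume-time reconciliation step to repair when `statesDiverge` reports a mismatch.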