fix: recover incomplete resumed tool calls #771

Open
ktwu01 wants to merge 1 commit into MoonshotAI:main from ktwu01:codex/recover-incomplete-tool-results

ktwu01 commented Jun 15, 2026

fix: recover incomplete resumed tool calls

Summary

On a hard crash (process killed, power loss, OOM), the persistence layer can be left with a tool.call record whose matching tool.result record was never written. When the session is later resumed, the transcript ends with a dangling tool call: pendingToolResultIds is non-empty, and the next prompt is sent to the model on top of an open, never-answered tool exchange.

This PR repairs that state on resume by synthesizing an error tool.result for each orphaned tool.call, so the model sees the call was interrupted and continues from the next user instruction. The repair is flushed to disk, so it is durable across further resumes.
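For reviewers who want the idea in isolation, here is a minimal sketch of the repair over a flat record list. The `SessionRecord` shape and the `repairOrphanedToolCalls` name are hypothetical and only illustrate the technique; the real types and function in agent-core may look different.

```typescript
// Hypothetical, simplified record shapes; the real agent-core types may differ.
type SessionRecord =
  | { kind: "user"; text: string }
  | { kind: "tool.call"; id: string; name: string }
  | { kind: "tool.result"; id: string; isError: boolean; output: string };

type ToolCallRecord = Extract<SessionRecord, { kind: "tool.call" }>;

const INTERRUPTED_MESSAGE =
  "Tool result missing because the previous process exited before it was recorded. " +
  "Treat this tool call as interrupted and continue from the next user instruction.";

// Appends one error result per unanswered tool call.
// Returns true if at least one result was synthesized.
function repairOrphanedToolCalls(records: SessionRecord[]): boolean {
  const answered = new Set<string>();
  for (const record of records) {
    if (record.kind === "tool.result") answered.add(record.id);
  }

  // Collect the orphans first so the appends below do not affect the scan.
  const orphaned: ToolCallRecord[] = [];
  for (const record of records) {
    if (record.kind === "tool.call" && !answered.has(record.id)) orphaned.push(record);
  }

  for (const call of orphaned) {
    records.push({
      kind: "tool.result",
      id: call.id,
      isError: true,
      output: INTERRUPTED_MESSAGE,
    });
  }
  return orphaned.length > 0;
}
```

Because the synthesized result is an ordinary record, a second pass over the same list finds nothing to repair, which is the property that makes the fix safe across repeated resumes once it has been flushed.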

This is a single, self-contained fix with a focused regression test. It is not part of a bulk/batch submission.

Reproduction

  1. Start a session; let the agent issue a tool call (e.g. Bash { command: "pwd" }).
  2. Kill the process after the tool.call record is persisted but before the tool.result is written.
  3. Resume the session and send a new prompt.

Before: the resumed history contains an assistant turn with a tool call that has no result. The next request is built on top of an open tool exchange.

After: resume appends an error tool result:

Tool result missing because the previous process exited before it was recorded. Treat this tool call as interrupted and continue from the next user instruction.

and the conversation continues cleanly.
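The crashed state from the reproduction can also be modelled without killing a process. The sketch below assumes pending ids are derived by replaying the persisted log (an assumption about how restore works, not something this PR states), and simulates step 2 by dropping the final record. `LogRecord` and `replayPendingToolResultIds` are hypothetical names.

```typescript
// Hypothetical record shape for illustration only.
type LogRecord =
  | { kind: "user"; text: string }
  | { kind: "tool.call"; id: string }
  | { kind: "tool.result"; id: string };

// Replays a persisted log and returns the ids of tool calls still awaiting a result.
function replayPendingToolResultIds(log: readonly LogRecord[]): Set<string> {
  const pending = new Set<string>();
  for (const record of log) {
    if (record.kind === "tool.call") pending.add(record.id);
    else if (record.kind === "tool.result") pending.delete(record.id);
  }
  return pending;
}

// A complete run: the call is followed by its result.
const completeLog: LogRecord[] = [
  { kind: "user", text: "where am I?" },
  { kind: "tool.call", id: "call-1" },
  { kind: "tool.result", id: "call-1" },
];

// Step 2 of the reproduction: the process dies before the last record is written.
const crashedLog: LogRecord[] = completeLog.slice(0, -1);

console.log(replayPendingToolResultIds(completeLog).size); // no pending ids
console.log(replayPendingToolResultIds(crashedLog).has("call-1")); // call-1 is still pending
```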

Design note (the one thing worth a reviewer's attention)

There is already a recovery path for trailing open tool exchanges: trimTrailingOpenToolExchange in packages/agent-core/src/agent/context/projector.ts. Its strategy is to drop the incomplete exchange from projected history.

This PR deliberately takes the opposite strategy for the crash-recovery case — it keeps the tool call and synthesizes an error result, rather than dropping it. Rationale:

  • Trim hides that any work was attempted; the model resumes as if the call never happened.
  • Repair preserves the record that a tool was invoked and surfaces, in-band, that it was interrupted — which is more faithful to what actually occurred and lets the model reason about the interruption.
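To make the trade-off concrete, here is a sketch of both strategies applied to the same projected history. `trimOpenExchange` is a simplified stand-in and not the actual trimTrailingOpenToolExchange implementation; the `Message` shape and all function names are hypothetical.

```typescript
// Hypothetical message shape; not the real projector types.
type Message =
  | { role: "user"; text: string }
  | { role: "assistant"; toolCallIds: string[] }
  | { role: "tool"; toolCallId: string; isError: boolean; text: string };

// Ids of tool calls that have no matching tool message anywhere in the history.
function openCallIds(history: readonly Message[]): string[] {
  const answered = new Set<string>();
  for (const m of history) {
    if (m.role === "tool") answered.add(m.toolCallId);
  }
  const open: string[] = [];
  for (const m of history) {
    if (m.role === "assistant") {
      for (const id of m.toolCallIds) {
        if (!answered.has(id)) open.push(id);
      }
    }
  }
  return open;
}

// Trim: drop everything from the first assistant turn with an open call onward.
function trimOpenExchange(history: readonly Message[]): Message[] {
  const open = openCallIds(history);
  if (open.length === 0) return history.slice();
  const start = history.findIndex(
    (m) => m.role === "assistant" && m.toolCallIds.some((id) => open.indexOf(id) !== -1),
  );
  return history.slice(0, start);
}

// Repair: keep the call and append an error result for each open id.
function repairOpenExchange(history: readonly Message[]): Message[] {
  const repaired = history.slice();
  for (const id of openCallIds(history)) {
    repaired.push({ role: "tool", toolCallId: id, isError: true, text: "interrupted" });
  }
  return repaired;
}
```

For a history of `[user, assistant(call c1)]`, trim yields `[user]` while repair yields `[user, assistant, tool]`, so only the repaired history tells the model that a call was attempted.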

I am not attached to repair-over-trim. If the maintainers prefer to route crash recovery through the existing trimTrailingOpenToolExchange instead of adding a second path, I'm happy to rework it that way. Flagging it explicitly so the design choice is a decision, not an accident — and so the two mechanisms don't silently diverge.

Changes

  • packages/agent-core/src/agent/context/index.ts: recoverIncompleteToolResultsAfterRestore() appends an error tool.result for each pending tool-call id, clears open steps, and returns whether a repair happened.
  • packages/agent-core/src/agent/records/index.ts: calls the recovery after restore and flushes persistence if a repair occurred.
  • packages/agent-core/test/agent/resume.test.ts: regression test. A restored session whose last record is a tool.call with no result is repaired, the projected history is [user, assistant, tool], and a follow-up prompt continues correctly (verified via inline snapshot).
  • .changeset/recover-incomplete-tool-results.md: patch bump for @moonshot-ai/agent-core and @moonshot-ai/kimi-code.
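The control flow between the first two files can be sketched roughly as follows. `SketchContext`, `SketchPersistence`, and `finishRestore` are hypothetical stand-ins, and `flush` is modelled as synchronous purely to keep the example short; only the method name recoverIncompleteToolResultsAfterRestore and its described behavior come from this PR.

```typescript
// Hypothetical stand-in for the agent context; not the real class.
class SketchContext {
  readonly pendingToolResultIds = new Set<string>();
  readonly appended: { toolCallId: string; isError: true }[] = [];

  // Appends an error result per pending id, clears the pending set,
  // and reports whether anything was repaired.
  recoverIncompleteToolResultsAfterRestore(): boolean {
    if (this.pendingToolResultIds.size === 0) return false;
    this.pendingToolResultIds.forEach((id) => {
      this.appended.push({ toolCallId: id, isError: true });
    });
    this.pendingToolResultIds.clear();
    return true;
  }
}

// Hypothetical persistence interface; the real flush is likely asynchronous.
interface SketchPersistence {
  flush(): void;
}

// Runs recovery after restore and flushes only when a repair actually happened.
function finishRestore(ctx: SketchContext, persistence: SketchPersistence): boolean {
  const repaired = ctx.recoverIncompleteToolResultsAfterRestore();
  if (repaired) persistence.flush();
  return repaired;
}
```

Gating the flush on the return value keeps a clean resume free of extra writes, which matches the claim that the new path only fires when pendingToolResultIds is non-empty after restore.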

Test evidence

$ npx vitest run packages/agent-core/test/agent/resume.test.ts -t "repairs restored tool calls"

 Test Files  1 passed (1)
      Tests  1 passed | 11 skipped (12)

+125 / -0. No existing behavior changed; the new path only fires when pendingToolResultIds is non-empty after restore.

changeset-bot Bot commented Jun 15, 2026

🦋 Changeset detected

Latest commit: f7f69a4

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages:

| Name | Type |
| --- | --- |
| @moonshot-ai/agent-core | Patch |
| @moonshot-ai/kimi-code | Patch |
