fix(todo): refactor SetTodoList to persist state and prevent tool call storms#1742
fix(todo): refactor SetTodoList to persist state and prevent tool call storms#1742
Conversation
…l storms Refactored the SetTodoList tool to address issue #1710 where the model would repeatedly call SetTodoList in a loop ("storm") especially when Shell was disabled. Key changes: - Persist todos to SessionState (root agent) and independent state.json (sub-agents) for proper state management - Add query mode: omit todos param to read current state, pass [] to clear - Add anti-storm guidance in tool description preventing repeated calls without progress - Protect against race conditions in Session.save_state() for todos field - Return non-empty output ("Todo list updated") so models get confirmation Closes #1710
There was a problem hiding this comment.
Pull request overview
Refactors the SetTodoList tool to persist todo state (root session vs. subagent isolation), adds a query/clear mode to reduce repeated tool calls, and updates session persistence to avoid overwriting externally-written todos.
Changes:
- Make
SetTodoListsupport update/query/clear via optionaltodosparam, with non-empty tool output and updated tool description guidance. - Persist root todos into
SessionState(state.json) and persist subagent todos into per-subagentstate.json. - Add/adjust unit + e2e tests and update schema/description snapshots; refresh
Session.save_state()to preserve externally-updated todos.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/tools/test_tool_schemas.py | Updates expected JSON schema to allow todos: null / omitted. |
| tests/tools/test_tool_descriptions.py | Updates SetTodoList description snapshot with usage modes + anti-storm guidance. |
| tests/tools/test_todo.py | Adds new unit tests for query/update/clear and root/subagent persistence behavior. |
| tests/core/test_session.py | Adds regression test ensuring save_state() doesn’t overwrite externally-written todos. |
| tests/conftest.py | Updates set_todo_list_tool fixture to inject Runtime. |
| tests_e2e/test_wire_prompt.py | Updates e2e snapshot to expect non-empty SetTodoList output. |
| tests_e2e/test_wire_approvals_tools.py | Updates e2e snapshots to expect non-empty SetTodoList output. |
| src/kimi_cli/tools/todo/set_todo_list.md | Documents the new usage modes and anti-storm guidance. |
| src/kimi_cli/tools/todo/init.py | Rewrites tool to support modes + persistence for root/subagents. |
| src/kimi_cli/session.py | Reloads todos in save_state() to avoid race-condition overwrites. |
| src/kimi_cli/session_state.py | Adds TodoItemState and SessionState.todos persisted field. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
src/kimi_cli/tools/todo/__init__.py
Outdated
| todos = self._runtime.session.state.todos | ||
| return [Todo(title=t.title, status=t.status) for t in todos] | ||
|
|
There was a problem hiding this comment.
In root Query mode, _load_root_todos() reads from self._runtime.session.state.todos only. This can return stale data if another component/process writes updated todos to state.json (the Session.save_state() change here explicitly acknowledges externally-mutated todos). Consider loading from disk via load_session_state(session.dir) (and optionally refreshing session.state.todos) so Query mode reflects the persisted source of truth.
| todos = self._runtime.session.state.todos | |
| return [Todo(title=t.title, status=t.status) for t in todos] | |
| from kimi_cli.session_state import load_session_state | |
| session = self._runtime.session | |
| fresh = load_session_state(session.dir) | |
| session.state.todos = fresh.todos | |
| return [Todo(title=t.title, status=t.status) for t in fresh.todos] |
src/kimi_cli/tools/todo/__init__.py
Outdated
| return json.loads(path.read_text(encoding="utf-8")) | ||
| except (json.JSONDecodeError, OSError): | ||
| logger.warning("Corrupted subagent todo state, using defaults: {path}", path=path) | ||
| return {} |
There was a problem hiding this comment.
_read_subagent_state() assumes the decoded JSON is a dict and only catches JSONDecodeError/OSError. If state.json is valid JSON but not an object (e.g., []), or if it contains invalid UTF-8 (UnicodeDecodeError), subsequent .get()/indexing will raise and fail the tool. Consider catching UnicodeDecodeError and validating isinstance(data, dict) (fallback to {}) to keep reads/writes robust against corrupted-but-parseable files.
| return json.loads(path.read_text(encoding="utf-8")) | |
| except (json.JSONDecodeError, OSError): | |
| logger.warning("Corrupted subagent todo state, using defaults: {path}", path=path) | |
| return {} | |
| data = json.loads(path.read_text(encoding="utf-8")) | |
| except (json.JSONDecodeError, OSError, UnicodeDecodeError): | |
| logger.warning("Corrupted subagent todo state, using defaults: {path}", path=path) | |
| return {} | |
| if not isinstance(data, dict): | |
| logger.warning("Invalid subagent todo state type, using defaults: {path}", path=path) | |
| return {} | |
| return data |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3987feed55
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
src/kimi_cli/tools/todo/__init__.py
Outdated
| session.state.todos = items | ||
|
|
||
| def _load_root_todos(self) -> list[Todo]: | ||
| todos = self._runtime.session.state.todos |
There was a problem hiding this comment.
Reload root todos from disk before serving query mode
Query mode is intended to return the current persisted todo list, but this path reads only runtime.session.state.todos, which can be stale if another process/component has updated state.json directly (the same cross-writer scenario this commit already accounts for in Session.save_state). In that case the model sees outdated todos and may overwrite newer items on its next update.
Useful? React with 👍 / 👎.
src/kimi_cli/tools/todo/__init__.py
Outdated
| if not path.exists(): | ||
| return {} | ||
| try: | ||
| return json.loads(path.read_text(encoding="utf-8")) |
There was a problem hiding this comment.
Validate loaded subagent state is an object
The corruption fallback only handles JSON parse/IO failures, but json.loads may successfully return a non-dict (for example []), and later callers assume mapping methods (data.get(...) / key assignment). That causes a runtime exception instead of graceful recovery, so a malformed-but-valid state.json can break both read and write modes.
Useful? React with 👍 / 👎.
- Load root todos from disk in query mode to ensure freshness - Catch UnicodeDecodeError in subagent state file reading - Validate that parsed JSON is a dict before using it - Add cast for pyright type narrowing after isinstance check Addresses PR review feedback from #1742
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c2e201553e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
src/kimi_cli/tools/todo/__init__.py
Outdated
| raw_todos = data.get("todos", []) | ||
| result: list[Todo] = [] | ||
| for item in raw_todos: |
There was a problem hiding this comment.
Validate
todos value is a list before iteration
_load_subagent_todos assumes data["todos"] is iterable, but valid JSON like {"todos": null} or {"todos": 1} will make for item in raw_todos raise TypeError and surface as a tool runtime error. This commit already treats malformed subagent state as recoverable, so this case should also fall back safely (for example to []) instead of failing query mode for subagents.
Useful? React with 👍 / 👎.
Guard against non-list values in subagent state.json (e.g. {"todos": null})
that would cause TypeError during iteration.
Addresses PR review feedback from #1742
Replace standalone read-modify-write in _save_root_todos with session.save_state() to reuse existing merge semantics and avoid overwriting worker-owned fields. Remove todos from externally-mutable field refresh in save_state() since todos are worker-owned, not externally-mutable. Addresses PR review feedback from #1742
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6bd46e7a31
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| class TodoItemState(BaseModel): | ||
| """A single todo item stored in session or subagent state.""" | ||
|
|
||
| title: str |
There was a problem hiding this comment.
Enforce non-empty titles in persisted todo state
TodoItemState allows empty title values, but root query mode later reconstructs each item as Todo(title=...) (which requires min_length=1). If state.json contains an empty title (e.g., from a manual edit or another writer), SetTodoList query mode will raise a runtime error instead of returning the list or recovering, so a malformed-but-parseable root state can break the tool. Adding the same title constraint to TodoItemState (or handling invalid entries during _load_root_todos) avoids this crash path.
Useful? React with 👍 / 👎.
Signed-off-by: Kai <me@kaiyi.cool>
Add per-item try/except in _load_root_todos to skip invalid items (e.g. empty titles) instead of crashing query mode, consistent with the subagent loading path. Addresses PR review feedback from #1742
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cfe8c25f0a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
src/kimi_cli/session_state.py
Outdated
| archived_at: float | None = None | ||
| auto_archive_exempt: bool = False | ||
| # Todo list state | ||
| todos: list[TodoItemState] = Field(default_factory=lambda: list[TodoItemState]()) |
There was a problem hiding this comment.
Merge persisted todos before saving session state
Adding SessionState.todos makes todos vulnerable to cross-process clobbering because Session.save_state() still reloads only title/archive fields before writing self.state back. In a multi-writer flow (e.g., process A calls SetTodoList, process B later calls save_state() for approvals or plan mode), process B can overwrite the newer on-disk todos with its stale in-memory list. This is a data-loss regression introduced with the new todos field and should be handled the same way as other externally mutable state (or otherwise protected with field-level merge semantics).
Useful? React with 👍 / 👎.
Replace misleading lambda: list[TodoItemState]() with default_factory=list for consistency with other list fields in SessionState.
|
To use Codex here, create an environment for this repo. |
|
Codex Review: Didn't find any major issues. What shall we delve into next? ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
Summary
SetTodoListto persist todos toSessionState(root agent) and independentstate.json(sub-agents), enabling proper state management and isolationtodosparam to read current state; pass[]to clear all todosSession.save_state()for the new todos fieldBackground
After updating to v1.29.0, users with Shell disabled experienced a "SetTodoList storm" where the model would repeatedly call SetTodoList without making real progress (up to 100 times per turn). Root causes:
SetTodoListreturnedoutput=""— the model received no meaningful confirmationChanges
Core (
src/kimi_cli/)session_state.py: AddedTodoItemStatemodel andSessionState.todosfieldtools/todo/__init__.py: Full rewrite — DI viaRuntime, read/write/clear modes, root/subagent storage isolationtools/todo/set_todo_list.md: Updated description with usage modes and anti-storm guidancesession.py: Added todos refresh insave_state()to prevent race condition overwritesTests
tests/tools/test_todo.pywith 12 test cases covering all modes, persistence, isolation, and edge casestests/core/test_session.py::test_save_state_preserves_external_todosTest plan
make check(ruff + pyright): 0 errorsCloses #1710