fix: self-healing agent, tool timeout, message ordering by Blankll · Pull Request #7 · geek-fun/data-studio-agent

Blankll · 2026-06-20T13:42:24Z

Summary

session_store: ORDER BY rowid ASC fixes UUID race within 1ms writes
build_llm_messages: strip orphan tool_calls at end, include unparseable tool JSON
loop: continue on empty prepared (don't exit on tool error)
loop: prepare_for_llm cancelable via tokio::select with cancel_rx
loop: 400 'insufficient tool messages' auto-repair + retry up to 3x
loop: Phase 3 tool storage isolated (one failure doesn't cascade)
loop: tool execution wrapped in tokio::time::timeout(30s)
loop: tool retry up to 3 attempts with 2s/5s backoff
loop: retry events emitted for UI progress indicators
loop: runaway-loop guard threshold 3→5, progress-aware reset
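The orphan-stripping item for build_llm_messages can be illustrated with a minimal sketch. The `Message` type, its field names, and the helper name `strip_trailing_orphan_tool_calls` are hypothetical, and the policy shown (drop the trailing assistant message and any partial tool answers when at least one of its tool calls is unanswered) is an assumption about what "strip orphan tool_calls at end" means, not the project's actual code.

```rust
use std::collections::HashSet;

/// Simplified chat message. Field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: String,
    /// IDs of tool calls requested by an assistant message (empty otherwise).
    pub tool_call_ids: Vec<String>,
    /// For role == "tool": the ID of the call this message answers.
    pub tool_call_id: Option<String>,
}

/// If the list ends with an assistant tool-call message whose calls are not
/// all answered by the tool messages after it, drop that message and the
/// partial answers so the LLM never sees an unanswered tool call.
pub fn strip_trailing_orphan_tool_calls(mut messages: Vec<Message>) -> Vec<Message> {
    // Locate the last assistant message that requested tool calls.
    let Some(idx) = messages
        .iter()
        .rposition(|m| m.role == "assistant" && !m.tool_call_ids.is_empty())
    else {
        return messages;
    };

    // Only treat it as trailing if nothing but tool messages follows it.
    let tail = &messages[idx + 1..];
    if !tail.iter().all(|m| m.role == "tool") {
        return messages;
    }

    // Collect the call IDs that did receive an answer.
    let answered: HashSet<&str> = tail
        .iter()
        .filter_map(|m| m.tool_call_id.as_deref())
        .collect();
    let fully_answered = messages[idx]
        .tool_call_ids
        .iter()
        .all(|id| answered.contains(id.as_str()));

    if !fully_answered {
        messages.truncate(idx);
    }
    messages
}
```

A fully answered tool-call group is returned unchanged; only an incomplete group at the very end of the list is removed.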

- session_store: ORDER BY rowid ASC fixes UUID race within 1ms writes
- build_llm_messages: strip orphan tool_calls at end of message list
- build_llm_messages: include unparseable tool messages as-is (no silent drops)
- loop: continue on empty prepared instead of returning (tool retry)
- loop: prepare_for_llm cancelable via tokio::select with cancel_rx
- loop: 400 'insufficient tool messages' auto-repair + retry up to 3x
- loop: Phase 3 tool storage isolated (one failure doesn't cascade)
- loop: tool execution wrapped in tokio::time::timeout(30s)
- loop: tool retry up to 3 attempts with 2s/5s backoff
- loop: retry events emitted for UI progress indicators
- loop: runaway-loop guard threshold 3→5, progress-aware reset

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
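The runaway-loop guard with a progress-aware reset might look roughly like the sketch below. `MAX_RUNAWAY_ITERATIONS` is the constant named in this PR; the `RunawayGuard` type, its `record` method, and the definition of "progress" as a caller-supplied boolean are assumptions made for illustration.

```rust
/// Threshold raised from 3 to 5 in this PR.
pub const MAX_RUNAWAY_ITERATIONS: u32 = 5;

/// Hypothetical guard that counts consecutive iterations without progress.
#[derive(Debug, Default)]
pub struct RunawayGuard {
    stalled_iterations: u32,
}

impl RunawayGuard {
    /// Records one loop iteration and returns true when the loop should stop.
    /// Any iteration that made progress resets the counter, so a long but
    /// productive run is not mistaken for a runaway loop.
    pub fn record(&mut self, made_progress: bool) -> bool {
        if made_progress {
            self.stalled_iterations = 0;
        } else {
            self.stalled_iterations += 1;
        }
        self.stalled_iterations >= MAX_RUNAWAY_ITERATIONS
    }
}
```

With this shape, four stalled iterations followed by one productive iteration start the count again from zero, and only five stalled iterations in a row trip the guard.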

Sets up cargo-llvm-cov for lcov generation on Linux, uploads to Codecov. Adds codecov.yml with 2% project / 5% patch thresholds and Rust flag mapping.

…tfmt.toml) The rustfmt.toml uses nightly-only options (imports_granularity, group_imports, trailing_comma, etc.). Stable cargo fmt ignores them and produces different formatting. Install both stable (clippy) and nightly (rustfmt) toolchains.

dtolnay/rust-toolchain@nightly sets default to nightly, causing 'cargo clippy' to fail. Use 'cargo +stable clippy' to stay on stable.

The project's rustfmt.toml uses nightly-only options. Bulk-format the entire codebase with nightly rustfmt so CI (now using cargo +nightly fmt) passes.

Remove all nightly-only rustfmt options (imports_granularity, group_imports, trailing_comma, trailing_semicolon, struct_field_align_threshold, enum_discrim_align_threshold, format_macro_matchers, normalize_comments, wrap_comments, comment_width). CI now uses a single stable toolchain for both clippy and fmt.

cargo llvm-cov --output-path coverage/lcov.info fails when coverage/ doesn't exist. Add mkdir -p before running it.

The else if role == "assistant" branch was accidentally removed in bd34c5a, merging assistant-message handling into the tool block. This broke all tool-calling conversations: assistant messages with tool_calls were pushed as plain text (tool_calls not extracted, pending_tool_call_ids never populated) and tool responses were silently dropped.
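To show why the missing branch mattered, here is a minimal sketch of role-based branching with a pending-ID set. The names `StoredMessage`, `LlmMessage`, and `build_llm_messages_sketch` are hypothetical, and the structure is an assumption based on the description above, not the project's code. Without the assistant branch, `pending_tool_call_ids` stays empty and every tool response fails the lookup.

```rust
use std::collections::HashSet;

/// Hypothetical row loaded from the session store.
#[derive(Debug, Clone)]
pub struct StoredMessage {
    pub role: String,
    pub content: String,
    pub tool_call_ids: Vec<String>,
    pub tool_call_id: Option<String>,
}

/// Hypothetical message shape sent to the LLM.
#[derive(Debug, PartialEq)]
pub enum LlmMessage {
    Text { role: String, content: String },
    AssistantToolCalls { content: String, tool_call_ids: Vec<String> },
    ToolResult { tool_call_id: String, content: String },
}

pub fn build_llm_messages_sketch(stored: &[StoredMessage]) -> Vec<LlmMessage> {
    let mut out = Vec::new();
    let mut pending_tool_call_ids: HashSet<String> = HashSet::new();

    for m in stored {
        if m.role == "tool" {
            // A tool response is kept only if it answers a pending call.
            if let Some(id) = &m.tool_call_id {
                if pending_tool_call_ids.remove(id) {
                    out.push(LlmMessage::ToolResult {
                        tool_call_id: id.clone(),
                        content: m.content.clone(),
                    });
                }
            }
        } else if m.role == "assistant" && !m.tool_call_ids.is_empty() {
            // The restored branch: extract tool calls and mark them pending.
            pending_tool_call_ids.extend(m.tool_call_ids.iter().cloned());
            out.push(LlmMessage::AssistantToolCalls {
                content: m.content.clone(),
                tool_call_ids: m.tool_call_ids.clone(),
            });
        } else {
            // User, system, and plain assistant messages pass through as text.
            out.push(LlmMessage::Text {
                role: m.role.clone(),
                content: m.content.clone(),
            });
        }
    }
    out
}
```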

The parallel tool execution path called tool_executor.execute() directly with no timeout and no retry, while the sequential path had 30s timeout with 3 retry attempts. A hanging tool in a parallel batch would block indefinitely. Now both paths have consistent timeout and retry behavior, emitting agent-loop-tool-retry events for UI visibility.
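The timeout-plus-retry behavior can be sketched with the standard library alone. This is not the PR's implementation: the PR uses tokio::time::timeout, while the sketch swaps in a worker thread and `recv_timeout` so it runs without an async runtime. The `Tool` alias, `ToolError`, `run_with_timeout`, and `execute_with_retry` are hypothetical names, and the `on_retry` callback stands in for emitting an agent-loop-tool-retry event.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

#[derive(Debug, PartialEq)]
pub enum ToolError {
    Timeout,
    Failed(String),
}

/// Hypothetical tool handle: a shareable closure returning output or an error.
pub type Tool = Arc<dyn Fn() -> Result<String, String> + Send + Sync>;

/// Runs the tool on a worker thread and waits at most `timeout` for a result.
/// A tool that hangs is abandoned; its thread is not forcibly stopped.
fn run_with_timeout(tool: Tool, timeout: Duration) -> Result<String, ToolError> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone after a timeout; ignore that.
        let _ = tx.send((*tool)());
    });
    match rx.recv_timeout(timeout) {
        Ok(Ok(output)) => Ok(output),
        Ok(Err(reason)) => Err(ToolError::Failed(reason)),
        Err(RecvTimeoutError::Timeout) => Err(ToolError::Timeout),
        Err(RecvTimeoutError::Disconnected) => {
            Err(ToolError::Failed("tool panicked".to_string()))
        }
    }
}

/// Tries the tool up to `backoffs.len() + 1` times, sleeping between attempts.
/// `on_retry` receives the 1-based number of the attempt that just failed.
pub fn execute_with_retry(
    tool: Tool,
    timeout: Duration,
    backoffs: &[Duration],
    mut on_retry: impl FnMut(usize, &ToolError),
) -> Result<String, ToolError> {
    let max_attempts = backoffs.len() + 1;
    let mut attempt = 1;
    loop {
        match run_with_timeout(Arc::clone(&tool), timeout) {
            Ok(output) => return Ok(output),
            Err(err) if attempt == max_attempts => return Err(err),
            Err(err) => {
                on_retry(attempt, &err);
                thread::sleep(backoffs[attempt - 1]);
                attempt += 1;
            }
        }
    }
}
```

To mirror the numbers in this PR, a caller would pass a 30-second timeout and backoffs of 2 and 5 seconds, which yields three attempts. Calling the same helper from both the sequential and the parallel path is one way to keep their behavior consistent.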

Three minor fixes from code review:
- Update stale runaway guard comment and message from 3 to 5 iterations to match MAX_RUNAWAY_ITERATIONS.
- Persist 400 'insufficient tool messages' errors via inline_append before retry, giving the UI visibility and recording the failure in DB history.
- Emit agent-loop-tool-storage-error event when Phase 3 storage fails, so the UI can detect lost tool results instead of only logging to stderr.
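The isolated Phase 3 storage with an error event could be sketched as follows. The event name agent-loop-tool-storage-error comes from this PR; the `ToolResult` and `UiEvent` types, the payload fields, and the `store_tool_results` helper with its closure parameters are assumptions for illustration.

```rust
/// Event name taken from the PR; the payload shape below is assumed.
pub const TOOL_STORAGE_ERROR_EVENT: &str = "agent-loop-tool-storage-error";

#[derive(Debug, Clone, PartialEq)]
pub struct ToolResult {
    pub tool_call_id: String,
    pub output: String,
}

#[derive(Debug, PartialEq)]
pub struct UiEvent {
    pub name: &'static str,
    pub tool_call_id: String,
    pub reason: String,
}

/// Stores each tool result independently. A failure for one result is
/// reported through `emit` and does not stop the remaining results from
/// being stored. Returns how many results were stored successfully.
pub fn store_tool_results<S, E>(results: &[ToolResult], mut store: S, mut emit: E) -> usize
where
    S: FnMut(&ToolResult) -> Result<(), String>,
    E: FnMut(UiEvent),
{
    let mut stored = 0;
    for result in results {
        match store(result) {
            Ok(()) => stored += 1,
            Err(reason) => emit(UiEvent {
                name: TOOL_STORAGE_ERROR_EVENT,
                tool_call_id: result.tool_call_id.clone(),
                reason,
            }),
        }
    }
    stored
}
```

Passing the storage and emit steps as closures keeps the sketch independent of any particular database or UI bridge.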

Blankll and others added 11 commits June 20, 2026 21:40

fix: resolve never_loop and manual_next_back clippy lints

f79c1e9


Blankll wants to merge 11 commits into master from fix/agent-self-healing-and-timeout
