Releases: docker/docker-agent
v1.34.0
This release improves tool call handling and evaluation functionality with several technical fixes and optimizations.
Improvements
- Optimizes partial tool call streaming by sending only delta arguments instead of accumulated arguments
- Reduces evaluation summary display width for better terminal formatting
- Includes tool definition only on the first partial tool call to reduce redundancy
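As a rough illustration of what delta-only streaming means for a consumer, the sketch below rebuilds the full argument string from per-chunk fragments. The `PartialToolCall` and `Accumulator` types are hypothetical and are not the project's actual event types; the sketch also assumes that the tool name accompanies only the first chunk.

```go
package main

import (
	"fmt"
	"strings"
)

// PartialToolCall is a hypothetical event shape used for illustration only.
type PartialToolCall struct {
	ID        string
	Name      string // assumption: present only on the first chunk
	ArgsDelta string // only the newly streamed fragment of the JSON arguments
}

// Accumulator rebuilds the full argument string for each tool call ID.
type Accumulator struct {
	names map[string]string
	args  map[string]*strings.Builder
}

func NewAccumulator() *Accumulator {
	return &Accumulator{
		names: map[string]string{},
		args:  map[string]*strings.Builder{},
	}
}

// Add appends one delta and returns the arguments accumulated so far.
func (a *Accumulator) Add(ev PartialToolCall) string {
	if ev.Name != "" {
		a.names[ev.ID] = ev.Name
	}
	b, ok := a.args[ev.ID]
	if !ok {
		b = &strings.Builder{}
		a.args[ev.ID] = b
	}
	b.WriteString(ev.ArgsDelta)
	return b.String()
}

// Name returns the tool name remembered from the first chunk.
func (a *Accumulator) Name(id string) string { return a.names[id] }

func main() {
	acc := NewAccumulator()
	acc.Add(PartialToolCall{ID: "call_1", Name: "read_file", ArgsDelta: `{"path":`})
	full := acc.Add(PartialToolCall{ID: "call_1", ArgsDelta: `"main.go"}`})
	fmt.Println(acc.Name("call_1"), full)
}
```

A client that previously replaced its buffer with each event would, under this model, need to append instead.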
Bug Fixes
- Fixes schema conversion for OpenAI Responses API strict mode, resolving issues with gpt-4.1-nano
- Removes duplicate tool call data from tool call response events to reduce payload size
Technical Changes
- Updates evaluation system to not provide all API keys when using models gateway
- Removes redundant tool call information from response events while preserving tool call IDs for client reference
What's Changed
- docs: update CHANGELOG.md for v1.33.0 by @docker-read-write[bot] in #2159
- Fix (reduce) evals summary width by @gtardif in #2160
- Evals: don't provide all API keys when using models gateway by @gtardif in #2162
- Only send the delta on the partial tool call by @rumpl in #2105
- build(deps): bump google.golang.org/grpc from 1.79.2 to 1.79.3 in the go_modules group across 1 directory by @dependabot[bot] in #2164
- Remove the tool call from the tool call response event by @rumpl in #2163
- Fix schema conversion for OpenAI Responses API strict mode - Fixes tool calls with gpt-4.1-nano by @gtardif in #2168
Full Changelog: v1.33.0...v1.34.0
v1.33.0
This release improves file editing reliability, adds session exit keywords, and fixes several issues with sub-sessions and evaluation handling.
What's New
- Adds support for "exit", "quit", and ":q" keywords to quit sessions immediately
- Adds per-eval Docker image override via evals.image property in evaluation configurations
- Adds run instructions to creator agent prompt for proper agent execution guidance
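A minimal sketch of how such keyword matching could look. The function name `isExitCommand` is hypothetical, and trimming surrounding whitespace and matching case-sensitively are assumptions, not confirmed behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// isExitCommand reports whether the input is one of the quit keywords
// listed in the release notes. Whitespace trimming is an assumption.
func isExitCommand(input string) bool {
	switch strings.TrimSpace(input) {
	case "exit", "quit", ":q":
		return true
	}
	return false
}

func main() {
	for _, in := range []string{"exit", " :q ", "quit now"} {
		fmt.Printf("%q -> %v\n", in, isExitCommand(in))
	}
}
```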
Bug Fixes
- Fixes handling of double-serialized edits argument in edit_file tool when LLMs send JSON strings instead of arrays
- Fixes sub-session thinking state being incorrectly derived from parent session instead of child agent
- Fixes --sandbox flag when running in CLI plugin mode
- Fixes cross-model Gemini function calls by using dummy thought_signature
- Fixes event timestamps for user messages in SessionFromEvents to prevent duration calculation issues
Improvements
- Displays breakdown of failure types in evaluation summary for better debugging
- Declines elicitations in run --exec --json mode
- Validates path field consistently in edit file operations
Technical Changes
- Removes unused fileWriteTracker from creator package
- Simplifies UnmarshalJSON implementation for better path validation
- Updates evaluation image build cache to handle different images per working directory
What's Changed
- docs: update CHANGELOG.md for v1.32.5 by @docker-read-write[bot] in #2147
- Better rendering in tmux and ghostty by @dgageot in #2146
- Fix --sandbox when running cli plugin mode by @gtardif in #2151
- Display breakdown of types of failures in eval summary by @gtardif in #2150
- feat: support "exit" as a keyword to quit the session by @trungutt in #2152
- Add per-eval Docker image override via evals.image property by @dgageot in #2153
- Add run instructions to creator agent prompt by @dgageot in #2154
- Decline elicitations in run --exec --json mode by @dgageot in #2156
- Remove unused fileWriteTracker from creator package by @dgageot in #2157
- fix: use dummy thought_signature for cross-model Gemini function calls by @dgageot in #2155
- fix: sub-session thinking state derived from child agent, not parent session by @dgageot in #2149
- fix: handle double-serialized edits argument in edit_file tool by @trungutt in #2144
- fix: use event timestamps for user messages in SessionFromEvents by @dgageot in #2158
Full Changelog: v1.32.5...v1.33.0
v1.32.5
This release improves agent reliability and performance with better tool loop detection, enhanced MCP handling, and various bug fixes.
What's New
- Adds framework-level tool loop detection to prevent degenerate agent loops when the same tool is called repeatedly
- Adds support for dynamic command expansion in skills using !`command` syntax
- Adds support for running skills as isolated sub-agents via context: fork frontmatter
- Adds CLI flags (--hook-pre-tool-use, --hook-post-tool-use, etc.) to override agent hooks from command line
- Adds stop and notification hooks with session lifecycle integration
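One simple way to picture tool loop detection is a counter of consecutive identical calls. The sketch below is not the framework's implementation: the `LoopDetector` type, the threshold, and the rule that a loop means the same tool with the same arguments are all assumptions.

```go
package main

import "fmt"

// LoopDetector flags when the same tool is called with identical arguments
// several times in a row. Threshold and matching rule are assumptions.
type LoopDetector struct {
	threshold int
	lastKey   string
	count     int
}

func NewLoopDetector(threshold int) *LoopDetector {
	return &LoopDetector{threshold: threshold}
}

// Record notes a tool call and reports whether the number of consecutive
// identical calls has reached the threshold.
func (d *LoopDetector) Record(tool, args string) bool {
	key := tool + "\x00" + args
	if key == d.lastKey {
		d.count++
	} else {
		d.lastKey = key
		d.count = 1
	}
	return d.count >= d.threshold
}

func main() {
	d := NewLoopDetector(3)
	for i := 1; i <= 3; i++ {
		looping := d.Record("read_file", `{"path":"a.txt"}`)
		fmt.Printf("call %d looping=%v\n", i, looping)
	}
}
```

A different call in between resets the counter, so only uninterrupted repetition is reported.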
Improvements
- Reworks thinking budget system to be opt-in by default with adaptive thinking and effort levels
- Caches syntax highlighting results for code blocks to improve markdown rendering performance
- Optimizes MCP catalog loading with single fetch per run and ETag caching
- Derives meaningful names for external sub-agents instead of using generic 'root' name
- Optimizes filesystem tool performance by avoiding duplicate string allocations
- Speeds up history loading with ReadFile and strconv.Unquote optimizations
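For readers unfamiliar with ETag caching, the sketch below shows a conditional GET against a local test server. The `catalogCache` type and the test server are hypothetical and only illustrate the HTTP mechanism (If-None-Match and 304 Not Modified); they do not reflect the actual catalog loader.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// catalogCache remembers the last body and its ETag (hypothetical type).
type catalogCache struct {
	etag string
	body []byte
}

// Fetch performs a conditional GET and reports whether the cached copy was reused.
func (c *catalogCache) Fetch(url string) ([]byte, bool, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, false, err
	}
	if c.etag != "" {
		req.Header.Set("If-None-Match", c.etag)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, false, err
	}
	defer resp.Body.Close()

	switch resp.StatusCode {
	case http.StatusNotModified:
		// The server confirmed the cached copy is still current.
		return c.body, true, nil
	case http.StatusOK:
		data, readErr := io.ReadAll(resp.Body)
		if readErr != nil {
			return nil, false, readErr
		}
		c.etag, c.body = resp.Header.Get("ETag"), data
		return data, false, nil
	default:
		return nil, false, fmt.Errorf("unexpected status: %s", resp.Status)
	}
}

// newTestServer serves a fixed payload and honors If-None-Match.
func newTestServer() *httptest.Server {
	const etag = `"v1"`
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("If-None-Match") == etag {
			w.WriteHeader(http.StatusNotModified)
			return
		}
		w.Header().Set("ETag", etag)
		fmt.Fprint(w, `{"servers":[]}`)
	}))
}

func main() {
	srv := newTestServer()
	defer srv.Close()

	cache := &catalogCache{}
	for i := 0; i < 2; i++ {
		body, cached, err := cache.Fetch(srv.URL)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("fromCache=%v body=%s\n", cached, body)
	}
}
```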
Bug Fixes
- Fixes context cancellation during RAG initialization and query operations
- Fixes frozen spinner during MCP tool loading
- Fixes model name display in TUI sidebar for all model types
- Fixes two data races in shell tool execution
- Fixes character handling issues in tmux integration
- Fixes binary download URLs in documentation to match release artifact naming
- Validates thinking_budget effort levels at parse time and rejects unknown values
Technical Changes
- Removes unused methods from codebase
- Hardens and simplifies MCP gateway code
- Adds logging for selected model in Agent.Model() for better observability
- Fixes pool_size reporting to reflect actual selection pool
- Reverts timeout changes for remote MCP initialization and tool calls
What's Changed
- Fix rag init context cancel by @gtardif in #2114
- docs: update CHANGELOG.md for v1.32.4 by @docker-read-write[bot] in #2112
- Bump dependencies by @dgageot in #2113
- Fix frozen spinner during MCP tool loading by @dgageot in #2115
- Support dynamic command expansion in skills (!`command` syntax) by @dgageot in #2116
- Fix model name display in TUI sidebar for all model types by @dgageot in #2118
- perf(markdown): cache syntax highlighting results for code blocks by @dgageot in #2119
- feat: framework-level tool loop detection by @derekmisler in #2123
- Fix issues on builtin filesystem tools by @dgageot in #2125
- Fix two data races in shell tool by @dgageot in #2127
- Fix a few characters for tmux by @dgageot in #2128
- Simplify MCP catalog loading: single fetch per run with ETag caching by @dgageot in #2124
- docs: fix binary download URLs to match release artifact naming by @dgageot in #2129
- More doc fixing with "agent serve mcp" by @gtardif in #2130
- Rework thinking budget: opt-in by default, adaptive thinking, effort levels by @dgageot in #2121
- Add timeouts to remote MCP initialization and tool calls by @dgageot in #2131
- Optimize start time by @dgageot in #2138
- gateway: harden and simplify MCP gateway code by @dgageot in #2133
- Derive meaningful names for external sub-agents instead of using 'root' by @dgageot in #2132
- Add --hook-* CLI flags to override agent hooks from the command line by @dgageot in #2135
- feat: support running skills as isolated sub-agents via context: fork by @dgageot in #2137
- Revert "Add timeouts to remote MCP initialization and tool calls" by @dgageot in #2141
- Add stop and notification hooks, wire up session lifecycle hooks by @dgageot in #2136
- Reject unknown thinking_budget effort levels at parse time by @dgageot in #2142
- Log selected model in Agent.Model() for alloy observability by @derekmisler in #2134
Full Changelog: v1.32.4...v1.32.5
v1.32.4
This release optimizes tool instructions, removes unused session metadata, and includes several bug fixes and improvements.
Improvements
- Optimizes builtin tool instructions for conciseness by applying Claude 4 prompt engineering best practices
- Removes unused branch metadata and split_diff_view from sessions to clean up data storage
Bug Fixes
- Fixes emoji rendering issues in iTerm2
- Reverts keyboard enhancement changes that caused incorrect behavior in VSCode with AZERTY layout
Technical Changes
- Extracts compaction logic into dedicated pkg/compaction package for better code organization
- Updates skill configuration
- Improves evaluation system by validating LLM judge, disabling thinking for LLM as judge, and removing handoffs scoring
- Disallows unknown fields in configuration validation
What's Changed
- Bump dependencies by @dgageot in #2094
- Optimize builtin tool instructions for conciseness by @dgageot in #2091
- docs: update CHANGELOG.md for v1.32.3 by @docker-read-write[bot] in #2097
- Remove unused branch metadata and split_diff_view from sessions by @rumpl in #2078
- Revert "tui: improve tmux experience and simplify keyboard enhancements" by @gtardif in #2098
- Fix 2089 - emoji rendering in iTerm2 by @dgageot in #2099
- Extract compaction into a dedicated pkg/compaction package by @dgageot in #2101
- Improve evals by @dgageot in #2100
Full Changelog: v1.32.3...v1.32.4
v1.32.3
This release removes an experimental feature and improves error handling for rate-limited API requests.
Improvements
- Makes HTTP 429 (Too Many Requests) errors retryable when no fallback model is available, respecting the Retry-After header
Bug Fixes
- Gates 429 retry behavior behind WithRetryOnRateLimit() opt-in option to prevent unexpected retry behavior
Technical Changes
- Removes experimental feature from the codebase
- Adds optional gateway usage for LLM evaluation as a judge
- Refactors to use typed StatusError for retry metadata, with providers wrapping errors at Recv()
What's Changed
- Remove experimental feature by @dgageot in #2087
- docs: update CHANGELOG.md for v1.32.2 by @docker-read-write[bot] in #2090
- This can be retried by @dgageot in #2093
- [eval] Optionally use the gateway for the llm as a judge by @dgageot in #2092
- fix: make HTTP 429 retryable when no fallback model, respect Retry-After header by @simon-agent-go-expert in #2096
New Contributors
- @simon-agent-go-expert made their first contribution in #2096
Full Changelog: v1.32.2...v1.32.3
v1.32.2
This release focuses on security improvements and bug fixes, including prevention of PATH hijacking vulnerabilities and fixes to environment file support.
Bug Fixes
- Prevents PATH hijacking and TOCTOU (time-of-check to time-of-use) vulnerabilities in shell/binary resolution (CWE-426)
- Fixes --env-file support for the gateway
Technical Changes
- Removes debug code from codebase
- Reverts user prompt options feature that was previously added
What's Changed
- docs: update CHANGELOG.md for v1.32.1 by @docker-read-write[bot] in #2084
- fix: prevent PATH hijacking and TOCTOU in shell/binary resolution by @dgageot in #2083
- Remove debug code by @dgageot in #2086
- Fix --env-file support for the gateway by @dgageot in #2085
- Revert "Add options-based selection to user_prompt tool" by @trungutt in #2088
Full Changelog: v1.32.1...v1.32.2
v1.32.1
This release fixes several issues with session handling, tool elicitation, and MCP environment variable validation.
Bug Fixes
- Fixes corrupted session history by filtering sub-agent streaming events from parent session persistence
- Fixes elicitation requests failing in sessions with ToolsApproved=true by decoupling elicitation channel from ToolsApproved flag
- Fixes MCP environment variable validation being skipped when any gateway preflight errors occur
Improvements
- Prevents sidebar from scrolling to top when clicking navigation links in documentation
Technical Changes
- Adds end-to-end test for tool result block validation
- Updates CHANGELOG.md for v1.32.0 release
What's Changed
- docs: update CHANGELOG.md for v1.32.0 by @docker-read-write[bot] in #2072
- Don't scroll sidebar to the top by @dgageot in #2076
- Fix corrupted session history by @dgageot in #2077
- Fix MCP env var check skipped when any gateway preflight errors by @dgageot in #2081
- fix: decouple elicitation channel from ToolsApproved flag by @dgageot in #2080
Full Changelog: v1.32.0...v1.32.1
v1.32.0
This release adds support for newer Gemini models, improves toolset documentation, and enhances user interaction capabilities.
What's New
- Adds options-based selection to user_prompt tool, allowing the agent to present users with labeled choices instead of free-form input
- Documents {ORIGINAL_INSTRUCTIONS} placeholder for enriching toolset instructions rather than replacing them
Bug Fixes
- Fixes support for Gemini 3.x versioned models (e.g., gemini-3.1-pro-preview) to ensure proper model recognition and thinking configuration
- Fixes gateway handling when using docker agent without a command
- Fixes broken links in documentation
Technical Changes
- Adds check for broken links in CI
- Updates .gitignore to exclude cagent-* binaries from being committed
What's Changed
- docs: update CHANGELOG.md for v1.31.0 by @docker-read-write[bot] in #2063
- doc: document {ORIGINAL_INSTRUCTIONS} placeholder for toolset instructions by @dgageot in #2062
- fix: support Gemini 3.x versioned models (e.g., gemini-3.1-pro-preview) by @dgageot in #2054
- Fix gateway handling with docker agent without command by @dgageot in #2064
- Fix broken links by @dgageot in #2067
- gitignore cagent-* binaries by @derekmisler in #2069
- Check for broken links by @dgageot in #2068
- Add options-based selection to user_prompt tool by @trungutt in #2071
Full Changelog: v1.31.0...v1.32.0
v1.31.0
This release enhances the cost dialog with detailed session statistics and improves todo tool reliability for better task completion tracking.
What's New
- Adds total token count, session duration, and message count to cost dialog
- Adds reasoning tokens display for supported models (e.g. o1)
- Adds average cost per 1K tokens and per message metrics to cost analysis
- Adds cost percentage breakdown per model and per message
- Adds cache hit rate and per-entry cached token count display
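The metrics listed above could be computed along the following lines. This is a sketch under stated assumptions: the `Usage` struct is hypothetical, "per 1K tokens" is taken over input plus output tokens, and cache hit rate is assumed to be cached input tokens divided by total input tokens. The cost dialog's actual definitions may differ.

```go
package main

import "fmt"

// Usage is a hypothetical summary of a session's token and cost counters.
type Usage struct {
	CostUSD           float64
	InputTokens       int
	CachedInputTokens int
	OutputTokens      int
	Messages          int
}

// costPer1K returns the average cost per 1,000 tokens (input + output).
func costPer1K(u Usage) float64 {
	total := u.InputTokens + u.OutputTokens
	if total == 0 {
		return 0
	}
	return u.CostUSD * 1000 / float64(total)
}

// costPerMessage returns the average cost per message.
func costPerMessage(u Usage) float64 {
	if u.Messages == 0 {
		return 0
	}
	return u.CostUSD / float64(u.Messages)
}

// cacheHitRate returns the assumed ratio of cached to total input tokens.
func cacheHitRate(u Usage) float64 {
	if u.InputTokens == 0 {
		return 0
	}
	return float64(u.CachedInputTokens) / float64(u.InputTokens)
}

func main() {
	u := Usage{CostUSD: 0.5, InputTokens: 1200, CachedInputTokens: 300, OutputTokens: 800, Messages: 4}
	fmt.Printf("per 1K tokens: $%.3f, per message: $%.3f, cache hit rate: %.0f%%\n",
		costPer1K(u), costPerMessage(u), cacheHitRate(u)*100)
}
```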
Improvements
- Improves todo tool reliability by reminding LLM of incomplete items and including full state in all responses
Bug Fixes
- Fixes Sonnet model name
- Fixes various edge-case bugs in cost dialog formatting
Technical Changes
- Adds cache to building hub image in CI
- Optimizes CI by building and testing Go on the same runner to avoid duplicate compilation
- Freezes config to v6
- Deduplicates tool documentation into individual pages
- Adds docs-serve task for local Jekyll preview via Docker
What's Changed
- docs: update CHANGELOG.md for v1.30.1 by @docker-read-write[bot] in #2050
- Add cache to building hub image in CI by @gtardif in #2037
- cost dialog: enrich with session stats, per-model percentages, and formatting fixes by @dgageot in #2046
- Fix sonnet model name by @dgageot in #2052
- fix: improve todo completion reliability by @trungutt in #2048
- Freeze config v6 by @dgageot in #2059
- Improve the toolsets documentation by @dgageot in #2056
Full Changelog: v1.30.1...v1.31.0
v1.30.1
This release improves command history handling, adds sound notifications, and includes various bug fixes and performance optimizations.
What's New
- Adds sound notifications for long-running tasks and errors (opt-in feature, disabled by default)
- Adds LSP multiplexer to support multiple LSP toolsets simultaneously
- Adds per-toolset model routing via model field on toolsets configuration
- Adds click-to-copy functionality for working directory in TUI sidebar
- Makes background_agents a standalone toolset that can be enabled independently
Improvements
- Improves tmux experience with better keyboard enhancements and focus handling
- Optimizes BM25 scoring strategy for better performance
- Reduces redundant work during evaluation runs
- Fixes animated spinners inside terminal multiplexers
- Repaints terminal on focus to fix broken display after tab switch in Docker Desktop
Bug Fixes
- Fixes loading very long lines in command history that previously caused crashes
- Fixes LSP server being killed by context cancellation and restart failures
- Fixes RunStream to use the session-pinned agent instead of the shared currentAgent
- Fixes sidebar context percentage flickering during sub-agent transfers
- Fixes concurrent map writes by moving registerDefaultTools to constructor
- Returns clear error when OPENAI_API_KEY is missing for speech-to-text
Technical Changes
- Splits monolithic runtime.go into focused files by concern
- Refactors code to use slices and maps stdlib functions instead of manual implementations
- Enables modernize and perfsprint linters with all findings resolved
- Migrates tool output to structured JSON schemas for todo tools
- Replaces json.MarshalIndent with json.Marshal in builtin tools
- Uses errors.AsType consistently instead of errors.As with pre-declared variables
What's Changed
- Don't ignore GITHUB_TOKEN by @dgageot in #2002
- history: Fix loading very long lines by @vvoland in #1940
- docs: update CHANGELOG.md for v1.30.0 by @docker-read-write[bot] in #2003
- Fix broken links to pages subsections by @gtardif in #2005
- feat: add sound notifications for task completion and errors by @Mostamhd in #1870
- Add LSP multiplexer to support multiple LSP toolsets by @dgageot in #1970
- fix(#2012): Return clear error when OPENAI_API_KEY is missing for speech-to-text by @aheritier in #2013
- Bump direct Go dependencies by @dgageot in #2011
- Fix LSP server killed by context cancellation and restart failures by @dgageot in #2008
- Replace duplicated mockEnvProvider test types with shared environment providers by @dgageot in #2014
- fix: use session-pinned agent in RunStream instead of shared currentAgent by @dgageot in #2009
- docs: fix hallucinated CLI flags, commands, and config formats by @dgageot in #2020
- refactor: use slices and maps stdlib functions instead of manual implementations by @dgageot in #2021
- refactor(anthropic): deduplicate sequencing, media-type, and test helpers by @dgageot in #2019
- refactor: split runtime.go and extract pkg/modelerrors by @dgageot in #2010
- tui: improve tmux experience and simplify keyboard enhancements by @dgageot in #2017
- codemode: fix Start() fail-fast and use tools.As for wrapper unwrapping by @dgageot in #2007
- Simplify rulebased router: remove redundant types and score aggregation by @dgageot in #2016
- Unify streamAdapter/betaStreamAdapter retry logic into generic retryableStream by @dgageot in #2018
- Fix task deploy-local by @dgageot in #2024
- fix: default sound notifications to off (opt-in) by @dgageot in #2025
- feat: add per-toolset model routing via model field on toolsets by @dgageot in #2015
- refactor: use errors.AsType consistently instead of errors.As with pre-declared variables by @dgageot in #2028
- tui: repaint terminal on focus to fix broken display after tab switch by @dgageot in #2026
- Enable modernize and perfsprint linters, fix all findings by @dgageot in #2027
- update Slack link in readme by @derekmisler in #2032
- Replace json.MarshalIndent with json.Marshal in builtin tools by @dgageot in #2031
- refactor(dmr): split client.go into focused files by concern by @dgageot in #2029
- Fix last brew install cagent mention by @gtardif in #2034
- tui: fix animated spinners inside terminal multiplexers by @dgageot in #2035
- refactor(runtime): split monolithic runtime.go into focused files by @dgageot in #2030
- feat: click to copy working directory in TUI sidebar by @dgageot in #2036
- Use ResultSuccess/ResultError helpers in tasks and user_prompt tools by @dgageot in #2040
- fix: move registerDefaultTools to constructor to prevent concurrent map writes by @dgageot in #2041
- Fix sidebar context % flickering during sub-agent transfers by @dgageot in #2042
- eval: reduce redundant work during evaluation runs by @dgageot in #2047
- perf: optimize BM25 scoring strategy by @dgageot in #2043
- refactor: remove duplication in model resolution, thinking budget, and message construction by @dgageot in #2038
- feat: make background_agents a standalone toolset by @dgageot in #2033
- todo: migrate tool output to structured JSON schemas by @dgageot in #2045
Full Changelog: v1.30.0...v1.30.1