Support streaming tool output and deduplication#343
Support streaming tool output and deduplication#343timvisher-dd wants to merge 33 commits intoxenodium:mainfrom
Conversation
|
I appreciate the contribution, but please be mindful there's a lot being sent my way here, from long issue descriptions to chunky PRs. When I ask folks to file a feature request before sending a PR, I don't mean it to be just a checkbox item so folks send the PR soon after that. From CONTRIBUTING.org
There's a span of minutes between the feature request/issue and the PRs submitted for review. According to commit logs, I see this feature was all built today. Being a larger contribution, please use it for some time before sending it my way. Let it settle for some time. Use it. It takes a lot of concentration for me, energy, and time to parse of all this. I am maintaining this package, so I need to understand the contributions sent my way and that takes a significant effort. Please be mindful. |
36ba031 to
5496a0d
Compare
So my intent was to be respectful of your wish to use an Issue to discuss the feature and the PR is opened in Draft to indicate that you shouldn't review it yet unless you wish. It's meant to be a both/and offering rather than to indicate that you should feel the need to read both. If you'd prefer I can keep the draft changeset entirely in my local dev integration branches until you tell me explicitly that I may open a PR. But from my perspective the 'this code might answer this issue if we both agree that the issue is real and make sense' is a useful thing especially when I'm mostly opening the issue because I've fixed it for myself.
The commit logs show that just because I rebase and push every time I work to be sure that I'm not in conflict with anything happening on trunk. I assure you that I'm not opening Issues or PRs without using the changes I'm making for at least a few days.
I wonder if there's some way to set a 'max in flight Issues/PRs' setting on the repo? To me I'm opening issues as I see and solve them for myself and I have zero expectation that you'll spend any more time than is reasonable for you on it. If that means my Issues/PRs sit unwatched for months then so be it. That's the nature of open source. Glob knows my own open source projects often will go that long with zero attention paid by me. That said I'm unsure how better to indicate to you that something's available for you to look at. Like if we had a discord somewhere I'd still be sending you a message 'Hey this thing maybe could use your attention' and then you'd have to decide whether you have any to spare at that time or not. Again it's just kind of the nature of an open source community. I guess finally if I'm pushing too much into your attention we could go where I just work on my fork and integrate your work and you can check my fork every now and then to see if anything interesting has landed. That's a reasonable stance for you to take as well. LMK how you'd like to proceed! Really appreciate all the effort you've put into this so far. :) |
16a7ad0 to
ae7e165
Compare
Tests for agent-shell-usage.el covering notification updates, context usage indicator rendering, number formatting, and a compaction replay driven by real observed ACP traffic. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the global `agent-shell--state` function redefinition with a `cl-letf` wrapper macro scoped to each test, preventing order-dependent side effects across the test suite. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Port 6 tests from the add-terminal-caps branch adapted for non-terminal streaming/dedup: - meta-response-text: extract toolResponse from _meta namespaces - normalize-output trailing newline and persisted-output stripping - tool-call-update writes output to buffer - meta-response stdout no-duplication (simplified, no terminal traffic) - initialize request includes terminal_output meta capability agent-shell-meta.el is a stub (functions return nil) so the tests load but fail at assertion time, giving a clean set of failures to drive the implementation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace stubs with real implementations of meta-lookup, meta-find-tool-response, and tool-call-meta-response-text. Handles three toolResponse shapes: stdout alist, content alist, and text vector. Omits terminal-specific tool-call-terminal-output-data. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Streaming tool call handler that accumulates incremental output from _meta.*.toolResponse and renders it once on the final update, avoiding duplicate output. Includes: normalize-output, content-text extraction, chunk accumulation (append/output-text/clear/markers), streaming handler with two cond branches (accumulate non-final, render final), and mark-tool-calls-cancelled. Uses declare-function for agent-shell.el functions that this module calls but does not define. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Require agent-shell-streaming alongside other modules - Replace inline tool_call_update rendering with agent-shell--handle-tool-call-update-streaming (preserves rich metadata save and emit-event, delegates rendering) - Add agent-shell--mark-tool-calls-cancelled in agent-shell-interrupt - Advertise terminal_output meta capability in initialize request so agents send toolResponse.stdout streaming updates - Fix shell-maker-define-major-mode call to pass quoted symbol instead of keymap value (prevents void-function keymap error) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Port agent-shell--tool-call-terminal-output-data to agent-shell-meta.el for extracting _meta.terminal_output.data from updates. Add terminal-data cond branch to the streaming handler so codex-acp's incremental output streams live to the buffer. Verified with perf demo: codex streaming shows terminal_chunks=10007, matching the source branch. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add test mirroring codex-acp's notification pattern: tool_call with terminal content, incremental terminal_output.data chunks, then completed update. Asserts both chunks render and no duplication. Fix: remove already-has-output guard from terminal-data branch. Without the full terminal protocol, all terminal chunks arrive via the streaming handler (not via a separate terminal process), so each chunk is new incremental data, not a duplicate of stdout. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- meta-find-tool-response: remove dead cond branch where (listp (cdr entry)) is redundant with (consp (cdr entry)) - tool-call-terminal-output-data: remove redundant string-key fallback (meta-lookup already handles symbol-to-string) - tool-call-clear-output: remove dead :output-last field - Move codex test after file header so lexical-binding cookie is on line 1 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The variable was renamed during conflict resolution but one reference in the session-resume display code was missed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The previous fix commits were tailored to the old rebased code. On clean upstream/main, acp-session (not acp-selected-session) is the correct variable, and agent-shell--select-session-to-load doesn't exist — tests should use agent-shell--prompt-select-session. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Codex-acp sends cumulative re-deliveries of thought text after incremental tokens, causing doubled output in the buffer. agent-shell--thought-chunk-delta detects prefix/suffix overlap and emits only the genuinely new tail. Accumulator state resets at each new thought group. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The rebase onto upstream/main dropped (require 'agent-shell-streaming) from agent-shell.el and unquoted the mode-map symbol passed to shell-maker-define-major-mode. The missing require caused void-function errors for streaming handlers; the unquoted symbol caused (void-function keymap) when agent-shell-mode tried to evaluate the keymap data structure as code. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
agent-shell-ui-update-fragment previously deleted and reinserted the entire fragment on every :append t call. This destroyed save-excursion markers, causing point to jump to the fragment start on each streaming token. When appending to a fragment that already has a non-empty body, the new path inserts directly at the body end using insert-and-inherit (insert alone does not respect text-property stickiness, so agent-shell-ui-section would not extend to the new text). Collapsed fragments inherit invisible via stickiness; expanded fragments un-hide old trailing whitespace before inserting and re-hide new trailing whitespace after. Falls back to the full rebuild for first body content, label changes, or non-append replacements. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Upstream's xenodium#345 fix renamed :notification to :acp-notification and added (shell-maker-busy) guards that treat all notifications as stale when the shell is idle. Update every test call site and mock shell-maker-busy to return t so notifications are processed. Add agent-shell-ui-update-fragment-append-preserves-point-test to assert that the append-in-place path keeps point stable. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two bugs prevented tool call status labels from updating: 1. agent-shell--tool-call-update-overrides called make-diff-info with :tool-call instead of :acp-tool-call. cl-defun rejects unknown keywords, so handle-tool-call-final errored before reaching the label update — every completed tool call stayed at [wait]. 2. The streaming handler only updated labels on final notifications. Upstream updates labels on every tool_call_update, so intermediate transitions like pending→in_progress ([wait]→[busy]) were missed. The non-final label rebuild invalidates the output marker used by live terminal streaming, so reset it afterwards to let ensure-output-marker recreate it at the correct body-end position. Also harden test stubs: replace (lambda (&rest _args) nil) with cl-function stubs that validate the :acp-tool-call keyword, and add three tests for label transitions (keyword, done, busy). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
markdown-overlays-put moves point to (point-min) of the narrowed region — its internal parsers call goto-char without saving point. Every streaming chunk triggered a full overlay re-process that silently reset point to the start of the body, causing the cursor to fight during message streaming. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
markdown-overlays-put re-parses the entire body on every chunk (remove all overlays, find source blocks, inline codes, headers, bold, italic, etc., then recreate). As the body grows each chunk takes progressively longer — O(n) per chunk, O(n²) overall — causing visible jerkiness during streaming. For append updates, schedule overlay processing via an idle timer (0.15s) that resets on each chunk. Overlays only run once when streaming pauses. Non-append updates (new blocks, label changes) still process synchronously. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The append-in-place optimization (e4b07bc) only checked for append + new-body + existing-body before taking the fast path, which inserts body text without touching labels. When finalize-session-init passed both :append t and a new :label-left (in_progress → completed), the label change was silently dropped and Starting agent stayed stuck at [busy]. Guard the in-place path so it only fires when no new labels are provided; otherwise fall through to the full rebuild that handles label, body, and state together. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two issues prevented context text (gathered from the source buffer via agent-shell--context) from being included when submitting: 1. save-excursion in agent-shell--insert-to-shell-buffer restored point to before the inserted text. Add (goto-char insert-start) after save-excursion so point lands at the prompt where the user types. 2. shell-maker-with-auto-scroll-edit in agent-shell--update-fragment and agent-shell--update-text advanced process-mark past the context during subsequent fragment updates. Fragment updates only touch the output area above the prompt, so save and restore process-mark around the macro to prevent it from being dragged past pending input. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Test that context inserted at the prompt by agent-shell--insert-to-shell-buffer satisfies three invariants: - Point lands at the prompt (not after the context) - Context sits between process-mark and point-max - A subsequent fragment update does not drag process-mark past it Covers representative text from each context source: line (magit), region (file path + code), files (bare path), error (diagnostic), and markdown (fenced code blocks). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Upstream commit 75cc736 introduced prepending a ```console code block with the saved shell command to the tool call body. The streaming extraction into handle-tool-call-final dropped this feature. Restore it so execute-kind tool calls display the command that was run. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Verifies that completed execute-kind tool calls with a saved :command display a ```console fenced block in the rendered body. Uses :active-request instead of shell-maker-busy mock to match the upstream stale-notification guard. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…mock Upstream commit 4ad3dc9 replaced the (shell-maker-busy) stale-notification guard with (map-elt state :active-request). Update all streaming tests to set :active-request t in state setup and remove the now-ineffective shell-maker-busy mocks. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The regex used a literal backtick instead of \` (start-of-string anchor), so it matched four-backtick sequences but missed standard ``` and ```lang fence lines. Add test covering plain fences, language-tagged fences, indented fences, and inline backtick preservation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The cond branch for meta-response only matched non-final updates, so stdout sent exclusively on the completed tool_call_update was never accumulated and handle-tool-call-final rendered empty output. Move meta-response accumulation before the cond so it runs regardless of final status. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Silences byte-compile warning about free variable reference. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace cl-loop with seq-find in agent-shell-meta.el (CONTRIBUTING limits cl-lib to cl-defun) - Extract store-output lambda to named helper agent-shell--store-tool-call-output - Replace nreverse+copy-sequence with reverse - Replace save-mark-and-excursion with save-excursion where mark is not modified - Convert agent-shell-show-session-id from defcustom to defvar per "avoid premature customization" guideline - Extract agent-shell--with-preserved-process-mark macro to DRY the process-mark save/restore pattern in update-fragment and update-text - Extract markdown overlay debounce delay to defvar Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix completely flat indentation in codex terminal output dedup test - Move assertions inside cl-letf in meta-response-stdout dedup test so unwind-protect cleanup runs unconditionally - Avoid shadowing agent-shell--state in let* then setq-local (removes "Making buffer-local while locally let-bound" warnings) - Suppress message output in copy-session-id-test Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
67bce45 to
0757e2a
Compare
…flag Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The in-place append path (introduced in 9d991c6) copied the leading-newline strip from the initial-insert path. The initial insert correctly strips them (label-to-body padding already adds \n\n), but appended chunks carry content newlines — list-item separators, paragraph breaks — that must be preserved. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Verify that single newlines (list-item separators) and double newlines (paragraph breaks) at the start of appended chunks survive the in-place append path. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fixes #342
See that issue for the full problem description, root cause analysis, and perf measurements.
Depends on xenodium/acp.el#15 (adds
:terminal-capabilityand:meta-capabilitiestoacp-make-initialize-request).Checklist
M-x checkdocandM-x byte-compile-file.Implementation
New files
agent-shell-meta.el— extractors for ACP_metapayloads:agent-shell--meta-lookup— key lookup handling both symbol and string keys in alists.agent-shell--meta-find-tool-response— walks any_metanamespace to find atoolResponsevalue (used by claude-agent-acp).agent-shell--tool-call-meta-response-text— extracts stdout text from_meta.*.toolResponsein its various shapes (string, alist withstdoutkey, vector of content blocks).agent-shell--tool-call-terminal-output-data— extracts_meta.terminal_output.data(used by codex-acp for incremental streaming chunks).agent-shell-streaming.el— streaming tool call update handler:agent-shell--tool-call-normalize-output— strips markdown fences, strips<persisted-output>XML tags (rendering the preview withfont-lock-comment-face), and ensures trailing newlines.agent-shell--append-tool-call-output— accumulates streamed output in the state's:tool-callshash under an:accumulatedkey per tool call ID.agent-shell--handle-tool-call-update-streaming— the main handler, replacing the inlinetool_call_updateblock inagent-shell.el. Three branches:_meta.terminal_output.data): normalize the chunk, accumulate it, and immediately append it to the fragment body for live streaming._meta.*.toolResponse): normalize and accumulate silently (rendered only on final update to avoid duplication)."completed"or"failed"): render accumulated output (or fall back tocontenttext), log to transcript, clean up permission dialogs, and apply title/label updates.agent-shell--mark-tool-calls-cancelled— marks all in-progress tool calls as cancelled (called fromagent-shell-interrupt).Changes to
agent-shell.el(require 'agent-shell-streaming)added.tool_call_updaterendering block (lines 1290-1346 on main) is replaced by a single call toagent-shell--handle-tool-call-update-streaming. The metadata save (title/description/command/raw-input/diff) remains inline before the handler call.initializerequest now passes:terminal-capability tand:meta-capabilities '((terminal_output . t))toacp-make-initialize-request.agent-shell-interruptcallsagent-shell--mark-tool-calls-cancelledafter sending the cancel notification.shell-maker-define-major-modecall passes'agent-shell-mode-map(quoted symbol) instead of the bare variable.Tests
7 new tests in
tests/agent-shell-streaming-tests.el:agent-shell--tool-call-meta-response-text-test— extracts text from_meta.claudeCode.toolResponse.stdout.agent-shell--tool-call-normalize-output-test— strips fences and ensures trailing newline.agent-shell--tool-call-normalize-output-persisted-output-test— strips<persisted-output>tags.agent-shell--tool-call-update-writes-output-test— verifies accumulated output is written to the fragment body.agent-shell--tool-call-meta-response-no-duplication-test— meta response text is rendered once, not duplicated with content.agent-shell-initialize-request-meta-capabilities-test— the initialize request includes_meta.terminal_output.agent-shell--tool-call-terminal-output-data-streaming-test— codex-style_meta.terminal_output.datachunks are accumulated and rendered incrementally.