Skip to content

Add verbose flow logging for agent tool calls and results#243

Merged
alexkroman merged 2 commits into
mainfrom
claude/relaxed-mccarthy-cmotu6
Jun 18, 2026
Merged

Add verbose flow logging for agent tool calls and results#243
alexkroman merged 2 commits into
mainfrom
claude/relaxed-mccarthy-cmotu6

Conversation

@alexkroman

Copy link
Copy Markdown
Collaborator

Summary

Adds structured logging of the agent's tool-call loop when running in verbose mode (-v), making it possible to debug stalled spoken turns by seeing which tools the agent reached for and what they returned.

Changes

  • brain.py:

    • Added _run_graph() to conditionally stream the graph's state snapshots in verbose mode instead of invoking it once, enabling incremental logging
    • Added _log_flow() to extract and log tool calls, tool results, and interim assistant messages from state snapshots using the coding agent's event vocabulary
    • Added _clip() to truncate long tool outputs (capped at 500 chars) so a single large result doesn't bury the rest of the flow log
    • Integrated verbose logging into build_completer() via _run_graph() instead of direct invoke()
    • Logs are emitted at INFO level to aai_cli.agent_cascade.brain logger so plain -v surfaces them
  • test_agent_cascade_brain.py:

    • Added _StreamingGraph test fixture that yields scripted state snapshots to verify streaming behavior
    • Added test_run_graph_streams_and_logs_flow_when_verbose() to verify the loop streams with stream_mode="values" and logs tool calls with args, tool results (clipped), and interim assistant lines
    • Added test_run_graph_invokes_when_not_verbose() to verify non-verbose mode uses plain invoke()
    • Added test_run_graph_invokes_when_graph_cannot_stream() to verify fallback to invoke() when graph lacks stream() method
    • Added test_log_flow_ignores_non_list_messages() for defensive handling
    • Added test_clip_passes_short_text_and_truncates_long_text() to verify truncation behavior
    • Moved ToolMessage import to module level (used in existing test)

Implementation Details

  • Streaming only activates when debuglog.active() returns True (verbose mode) AND the graph has a stream() method, ensuring backward compatibility with test fakes and non-streaming graphs
  • _log_flow() reuses the coding agent's message_events() vocabulary (AssistantText, ToolCall, ToolResult) so the flow log speaks the same language as the TUI
  • The high-water mark pattern (seen counter) ensures each snapshot only logs newly-added messages, avoiding duplicates across growing state snapshots
  • Result truncation preserves the original length in the log message so users can see when output was clipped

https://claude.ai/code/session_01T3YQmYYiZDEZkDvWqXYk6D

`assembly -v live` only showed the httpx request lines because the deepagents
brain answers each spoken turn with a single blocking `graph.invoke(...)` that
runs the whole tool loop internally — so a turn that stalled mid-tool ("Let me
search…" then nothing) was undebuggable.

When verbose logging is active (`debuglog.active()`) and the graph can stream,
`brain._run_graph` now drives it as incremental state snapshots and logs each
tool call (name + args), tool result (truncated), and interim assistant line as
it lands, reusing the coding agent's `events.message_events` vocabulary. The
non-verbose path and the invoke-only test fakes keep the plain `invoke` branch.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01T3YQmYYiZDEZkDvWqXYk6D
if isinstance(event, ToolCall):
_FLOW_LOG.info("tool call %s args=%s", event.name, event.args)
elif isinstance(event, ToolResult):
_FLOW_LOG.info("tool result %s -> %s", event.name, _clip(event.content))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tool result content is logged verbatim (event.content via _clip). Sanitize, escape, or redact sensitive fields before logging.

Details

✨ AI Reasoning
​The change logs tool results at INFO level: _FLOW_LOG.info("tool result %s -> %s", event.name, _clip(event.content)). Even though _clip truncates length, it does not sanitize or escape CR/LF, templates, or sensitive content. Logging raw external content risks personal-data leakage and log injection.

🔧 How do I fix it?
Keep sensitive data such as emails, passwords, and tokens out of logs. When logging values tied to a user, prefer a safe identifier like a user ID over the raw input, and strip line breaks from any user-provided text you do log.

Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.
Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed the actionable part in eb60cff: _clip now collapses all whitespace (CR/LF included) to single spaces before truncating, so untrusted tool output can't forge extra [aai_cli.…] log lines.

On the redaction part: this is opt-in -v debug output to the user's own stderr on a single-user local CLI, deliberately showing the tool content the user asked to see — and registered secrets (API key / session JWT) are already masked across every record by the debuglog redacting formatter. So beyond the newline-injection fix, redacting the content further would defeat the feature without closing a real leak.


Generated by Claude Code

Untrusted tool output (a fetched page, a search payload) is logged in the
verbose flow log; embedded CR/LF could forge fake `[aai_cli.…]` log lines. `_clip`
now collapses all whitespace to single spaces before truncating, so each result
stays on one line and can't inject log lines. (Secrets remain masked separately
by the debuglog formatter.) Addresses the Aikido review note on PR #243.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01T3YQmYYiZDEZkDvWqXYk6D
@alexkroman alexkroman enabled auto-merge June 18, 2026 17:55
@alexkroman alexkroman added this pull request to the merge queue Jun 18, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 18, 2026
@alexkroman alexkroman added this pull request to the merge queue Jun 18, 2026
Merged via the queue into main with commit 6b9fb5c Jun 18, 2026
38 of 42 checks passed
@alexkroman alexkroman deleted the claude/relaxed-mccarthy-cmotu6 branch June 18, 2026 18:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants