Add `assembly code` terminal coding agent backed by LLM Gateway #230
…ay only)

A terminal coding agent (a bespoke port of langchain-ai/deepagents' `code` agent) that talks only to the AssemblyAI LLM Gateway. Built on deepagents over a cwd-scoped LocalShellBackend, with: the `assembly` CLI exposed as a tool, the docs MCP server, Tavily web search, URL fetch, ask-user, installed-skills + long-term-memory middleware, persistent SQLite sessions, and human approval on mutating tools.

Front-ends: a Textual TUI modeled on deepagents-code (ASSEMBLY wordmark in the brand blue on a flat dark canvas, bordered prompt, status line with mode badge + cwd/branch, approval/ask modals, copy/paste via mouse-off, web-disabled toast), plus a Rich headless REPL fallback. `code` is its own "Coding Agent" help panel.

Gateway compatibility fix: the gateway 500s on OpenAI content-parts arrays, so the model subclass flattens list-content messages to plain strings before sending.

Tests drive the real deepagents graph with a fake chat model (no network/TTY) and the Textual app via pilot; 100% patch coverage. The full local gate (./scripts/check.sh) passes, including diff-cover and the diff-scoped mutation gate.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Mqx2vYy9FS5Lxpf3ekBGsr
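The gateway compatibility fix in the commit message above can be illustrated with a minimal sketch. This is not the PR's model subclass: `flatten_content` and `flatten_messages` are hypothetical helper names, messages are assumed to be plain dicts, and dropping non-text parts and joining text parts without a separator are assumptions made for the example.

```python
def flatten_content(content: object) -> str:
    """Collapse OpenAI-style content parts into one plain string."""
    if content is None:
        return ""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        pieces = []
        for part in content:
            if isinstance(part, str):
                pieces.append(part)
            elif isinstance(part, dict) and part.get("type") == "text":
                pieces.append(str(part.get("text", "")))
            # Assumption: non-text parts (images, audio) are dropped here.
        return "".join(pieces)
    return str(content)


def flatten_messages(messages: list[dict]) -> list[dict]:
    """Return copies of the messages whose content is always a string."""
    return [{**m, "content": flatten_content(m.get("content"))} for m in messages]
```

Flattening on the way out keeps the rest of the agent free to build list-content messages, at the cost of losing any non-text parts before they reach the gateway.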
def _confirm(name: str, args: dict[str, object]) -> bool:
    """Headless approval: print the pending tool call and read a y/N from stdin."""
    rendered = ", ".join(f"{key}={value!r}" for key, value in args.items())
    output.error_console.print(output.warn(f"Run {name}({rendered})? [y/N] "))
Printing the formatted tool arguments (rendered) via output.error_console.print can expose user-controlled or secret values. Avoid printing raw args; sanitize or redact sensitive fields before display.
✨ AI Reasoning
The code prints a rendering of tool call arguments (which can include user-supplied values) directly to the error console for confirmation. This may expose sensitive or untrusted strings (API keys, tokens, file contents, URLs) without sanitization. The printed value is built from args via repr() and interpolated into a message that is output to the console; this behavior was introduced in this change. Printing raw user-controlled arguments can leak secrets or allow log injection if those values contain control characters.
🔧 How do I fix it?
Keep sensitive data such as emails, passwords, and tokens out of logs. When logging values tied to a user, prefer a safe identifier like a user ID over the raw input, and strip line breaks from any user-provided text you do log.
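One way to act on this advice while still showing the user what they are approving is to mask values under sensitive-looking keys and bound everything else. The sketch below is an assumption-laden illustration, not the PR's fix: `render_args`, the key pattern, and the 80-character limit are all hypothetical.

```python
import re

# Hypothetical: key-name fragments treated as sensitive.
SENSITIVE_KEY = re.compile(r"key|token|secret|password|authorization", re.IGNORECASE)
# ASCII control characters, including CR/LF, that could forge extra lines.
CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")
MAX_VALUE_LEN = 80  # Hypothetical display limit per value.


def render_args(args: dict[str, object]) -> str:
    """Render tool arguments for an approval prompt without leaking secrets."""
    parts = []
    for key, value in args.items():
        if SENSITIVE_KEY.search(key):
            shown = "'***'"
        else:
            # repr() already escapes control characters in str values; the
            # substitution also covers objects with a custom __repr__.
            shown = CONTROL_CHARS.sub("?", repr(value))
            if len(shown) > MAX_VALUE_LEN:
                shown = shown[: MAX_VALUE_LEN - 3] + "..."
        parts.append(f"{key}={shown}")
    return ", ".join(parts)
```

Masking by key name is only a heuristic: a secret embedded in a free-form value such as a shell command string would still be shown, so a real fix may need tool-specific rules.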
def _submit(self, text: str) -> None:
    log = self.query_one("#log", RichLog)
    log.write(f"[b cyan]» {text}[/b cyan]")
User-submitted prompt (text) is written to the log raw. Avoid logging unsanitized user input; sanitize, truncate, or mask sensitive content before logging.
✨ AI Reasoning
The code writes the raw user-entered prompt string directly to the UI log without sanitization or encoding. This logs arbitrary user-controlled text, which can include secrets, CR/LF sequences for log forging, or other sensitive data. Logging unsanitized input increases risk of credential leakage and log injection attacks and should be avoided or sanitized before persisting/displaying in logs.
🔧 How do I fix it?
Keep sensitive data such as emails, passwords, and tokens out of logs. When logging values tied to a user, prefer a safe identifier like a user ID over the raw input, and strip line breaks from any user-provided text you do log.
from pathlib import Path
from typing import TYPE_CHECKING, ClassVar

from textual.app import App, ComposeResult
- tests: `_abbrev_home` assertion compares to the platform-native path string (was hardcoded POSIX), fixing the Windows test job; close the SqliteSaver connection in the checkpointer test so its sqlite3.Connection isn't GC'd mid-suite (the unclosed connection raised PytestUnraisableExceptionWarning → failure under filterwarnings=error on py3.13/Windows, in random later tests).
- Escape dynamic content (assistant text, tool name/args/results, user prompt, agent questions, approval prompt) before writing to Rich markup surfaces in the TUI and the Rich renderer/REPL. A model/tool string containing "[" would otherwise be parsed as Rich markup, injecting styling or raising MarkupError and crashing the turn. Matches the inline-escape convention in ui/output.py. (Addresses the Aikido review comments.)
- Remove the no-op `...` from the CompiledAgent Protocol method (docstring is the body).

The langchain GHSA (Aikido/pip-audit) is handled separately: bumping it is blocked by a websockets>=16 conflict; relaxing that floor is the next change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Mqx2vYy9FS5Lxpf3ekBGsr
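The markup-escaping change described above can be illustrated without installing Rich. Rich provides `rich.markup.escape` for this purpose; the function below is a simplified, dependency-free stand-in written for this example, and `escape_markup` and `format_user_line` are hypothetical names rather than the helpers in ui/output.py.

```python
import re

# An opening bracket that looks like the start of a markup tag, plus any
# backslashes directly in front of it.
_TAG_START = re.compile(r"(\\*)(\[[a-z#/@][^\[]*?\])")


def escape_markup(text: str) -> str:
    """Backslash-escape tag-like brackets so they render as literal text."""

    def _escape(match: re.Match) -> str:
        backslashes, tag = match.groups()
        # Double existing backslashes, then add one to escape the bracket.
        return f"{backslashes}{backslashes}\\{tag}"

    return _TAG_START.sub(_escape, text)


def format_user_line(text: str) -> str:
    """Wrap an escaped user prompt in the styling used for the transcript."""
    return f"[b cyan]» {escape_markup(text)}[/b cyan]"
```

Only the dynamic part is escaped; the surrounding `[b cyan]` tags stay live, so the styling still applies while a prompt such as `see [red]this[/red]` is shown verbatim.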
…5-jv2w-4656

Resolves the medium-severity LangChain path-confinement advisory (flagged by Aikido + pip-audit on PR #230). The patched langchain 1.3.9 requires deepagents 0.6.10, whose langgraph-sdk caps websockets <16, so relax the CLI's `websockets>=16` floor (a dependabot artifact, not a real requirement) to `>=14`; the resolver now picks 15.0.1.

Safe for our usage either way:

- websockets: the realtime STT/TTS code uses websockets.sync/asyncio.client (stable since 13.x) and the assemblyai SDK only needs >=11; the full suite passes on 15.0.1.
- the advisory's affected components (langchain file-search middleware / config loaders / path-prefix checks) aren't on our path: we use deepagents' LocalShellBackend with virtual_mode=True, which confines file/shell tools to the working directory.

Full local suite (3261 tests) + ruff + mypy + lock-check pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Mqx2vYy9FS5Lxpf3ekBGsr
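For readers unfamiliar with the class of bug the advisory covers, here is a generic sketch of working-directory confinement. It is not how deepagents' LocalShellBackend or the patched langchain code implements the check; `confine` is a hypothetical helper, and it assumes Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path


def confine(root: Path, requested: str) -> Path:
    """Resolve `requested` under `root`, rejecting anything that escapes it."""
    root = root.resolve()
    # Resolving first collapses ".." segments and follows symlinks, so the
    # check runs against the real target rather than the spelled-out path.
    candidate = (root / requested).resolve()
    if not candidate.is_relative_to(root):
        raise PermissionError(f"{requested!r} escapes {root}")
    return candidate
```

Comparing resolved paths avoids the weakness of plain string-prefix checks, where a sibling directory such as `/work-evil` would pass a `startswith("/work")` test.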
Introduces a new `assembly code` command that runs an autonomous coding agent in the terminal, built on the deepagents SDK and wired to only communicate with the AssemblyAI LLM Gateway.

Summary

This PR adds a complete coding agent feature to the CLI, enabling users to run an interactive agent that can read/write files, execute shell commands, search documentation, and invoke the `assembly` CLI itself, all within a working directory. The agent is available through two interfaces: a rich Textual TUI (primary) and a plain Rich REPL (fallback for headless/piped runs).

Key Changes

Core Agent Infrastructure (`aai_cli/code_agent/`)

- `agent.py`: Assembles the deepagents graph with the gateway model, filesystem/shell tools, custom CLI tool, and human-in-the-loop approval gating on mutating operations
- `session.py`: Framework-agnostic turn orchestration that drives the agent, resolves approval interrupts, and emits display events to injected sinks
- `events.py`: Converts langchain messages to a small vocabulary of display events (AssistantText, ToolCall, ToolResult, ErrorText) consumed by both front-ends
- `model.py`: Builds the chat model (always AssemblyAI LLM Gateway via langchain_openai), with content-flattening to work around gateway limitations
- `prompt.py`: System prompt template and model defaults (Claude Sonnet 4.6, 8K max tokens)

Tools & Integrations

- `cli_tool.py`: Exposes the `assembly` CLI as a tool; runs subcommands in a subprocess with API key injected via environment (never argv) to prevent secret leakage
- `fetch_tool.py`: URL-fetch tool (approval-gated for SSRF protection)
- `ask_tool.py`: Allows the agent to ask the user questions mid-task via an injected bridge (framework-agnostic)
- `docs_mcp.py`: Loads AssemblyAI docs MCP server tools for documentation search
- `web_search.py`: Optional Tavily web search (enabled when `TAVILY_API_KEY` is set)
- `skills.py`: Imports installed agent skills (e.g., `assemblyai` skill) via a separate filesystem backend
- `memory.py`: Long-term memory middleware with persistent storage across sessions
- `store.py`: SQLite checkpoint persistence for resumable sessions (in-memory fallback for ephemeral runs)
- `banner.py`: Startup splash with ASSEMBLY wordmark and intro copy

User Interfaces

- `tui.py`: Textual app with scrolling transcript, bottom input, and modal approval/ask screens; runs the agent on a worker thread with events streamed back to the UI thread
- `render.py`: Rich console renderer for headless/piped runs and as a fallback
- `session.py`: `run_repl()` function for interactive line-by-line input

Command Wiring (`aai_cli/commands/code/`)

- `__init__.py`: Command definition with all flags (--model, --dir, --auto, --docs, --skills, --web, --memory, --session, --persist, --tui)
- `_exec.py`: Run logic that assembles tools, middlewares, the agent, and dispatches to TUI (if TTY) or REPL (headless)

Tests

- `tests/test_code_agent.py`: 386-line end-to-end suite exercising the real deepagents graph with a fake chat model, covering file writes, approvals, auto-approve, REPL loop, tool invocation, and middleware
- `tests/test_code_tui.py`: Textual pilot tests (headless) for app composition, splash rendering, turn execution, event rendering, and approval/ask modals
- `tests/test_code_command.py`: Command wiring tests for flag parsing, TTY/headless dispatch, and tool assembly

Notable Implementation Details
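The `cli_tool.py` entry notes that the API key is passed through the environment and never through argv. A rough sketch of that pattern follows; `run_with_secret` is a hypothetical helper, the variable name `ASSEMBLYAI_API_KEY` is an assumption about what the child process reads, and a small Python child stands in for the real `assembly` subcommand.

```python
import os
import subprocess
import sys


def run_with_secret(
    argv: list[str], api_key: str, env_var: str = "ASSEMBLYAI_API_KEY"
) -> subprocess.CompletedProcess:
    """Run a child process with the secret in its environment, not its argv."""
    # Command-line arguments are visible in process listings; the environment
    # of a child process is much less exposed.
    env = {**os.environ, env_var: api_key}
    return subprocess.run(argv, env=env, capture_output=True, text=True, check=False)


# Stand-in child: reports the key length and how many argv entries it saw.
child = [
    sys.executable,
    "-c",
    "import os, sys; print(len(os.environ['ASSEMBLYAI_API_KEY']), len(sys.argv))",
]
result = run_with_secret(child, "not-a-real-key")
print(result.stdout.strip())
```

The same reasoning applies to logging: the argv list is safe to echo in an approval prompt precisely because the key never appears in it.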
https://claude.ai/code/session_01Mqx2vYy9FS5Lxpf3ekBGsr
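The display-event vocabulary named in the `events.py` entry (AssistantText, ToolCall, ToolResult, ErrorText) could be modeled roughly as below. Only the four class names come from the PR description; the fields, the `Event` alias, and the `to_line` renderer are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Union


@dataclass(frozen=True)
class AssistantText:
    text: str


@dataclass(frozen=True)
class ToolCall:
    name: str
    args: dict = field(default_factory=dict)


@dataclass(frozen=True)
class ToolResult:
    name: str
    content: str


@dataclass(frozen=True)
class ErrorText:
    text: str


Event = Union[AssistantText, ToolCall, ToolResult, ErrorText]


def to_line(event: Event) -> str:
    """Render one event as a plain line; a TUI would style it instead."""
    if isinstance(event, AssistantText):
        return event.text
    if isinstance(event, ToolCall):
        rendered = ", ".join(f"{k}={v!r}" for k, v in event.args.items())
        return f"-> {event.name}({rendered})"
    if isinstance(event, ToolResult):
        return f"<- {event.name}: {event.content}"
    return f"error: {event.text}"
```

Keeping the vocabulary this small is what lets two front-ends share one session layer: each only needs a renderer for four event types.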