Add assembly code terminal coding agent backed by LLM Gateway #230

Merged: alexkroman merged 3 commits into main from claude/youthful-cray-l8oqnp on Jun 17, 2026

@alexkroman

Collaborator

Introduces a new assembly code command that runs an autonomous coding agent in the terminal, built on the deepagents SDK and wired to only communicate with the AssemblyAI LLM Gateway.

Summary

This PR adds a complete coding agent feature to the CLI, enabling users to run an interactive agent that can read/write files, execute shell commands, search documentation, and invoke the assembly CLI itself — all within a working directory. The agent is available through two interfaces: a rich Textual TUI (primary) and a plain Rich REPL (fallback for headless/piped runs).

Key Changes

Core Agent Infrastructure (aai_cli/code_agent/)

  • agent.py: Assembles the deepagents graph with the gateway model, filesystem/shell tools, custom CLI tool, and human-in-the-loop approval gating on mutating operations
  • session.py: Framework-agnostic turn orchestration — drives the agent, resolves approval interrupts, and emits display events to injected sinks
  • events.py: Converts langchain messages to a small vocabulary of display events (AssistantText, ToolCall, ToolResult, ErrorText) consumed by both front-ends
  • model.py: Builds the chat model (always AssemblyAI LLM Gateway via langchain_openai), with content-flattening to work around gateway limitations
  • prompt.py: System prompt template and model defaults (Claude Sonnet 4.6, 8K max tokens)
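As a rough illustration of the display-event vocabulary that events.py is described as producing, the sketch below maps one message to zero or more events. The four class names come from the description above; their fields, the `to_events` helper, and the plain-dict message shape are assumptions made for this example and are not taken from the PR (the real code consumes langchain message objects).

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass(frozen=True)
class AssistantText:
    text: str


@dataclass(frozen=True)
class ToolCall:
    name: str
    args: dict[str, Any]


@dataclass(frozen=True)
class ToolResult:
    name: str
    output: str


@dataclass(frozen=True)
class ErrorText:
    text: str


DisplayEvent = Union[AssistantText, ToolCall, ToolResult, ErrorText]


def to_events(message: dict[str, Any]) -> list[DisplayEvent]:
    """Map one message (a plain dict here, by assumption) to display events."""
    role = message.get("role")
    if role == "assistant":
        events: list[DisplayEvent] = []
        if message.get("content"):
            events.append(AssistantText(message["content"]))
        # Each requested tool call becomes its own event, after any text.
        for call in message.get("tool_calls", []):
            events.append(ToolCall(call["name"], call.get("args", {})))
        return events
    if role == "tool":
        return [ToolResult(message.get("name", "?"), str(message.get("content", "")))]
    # Anything else is surfaced as an error rather than silently dropped.
    return [ErrorText(f"unexpected message role: {role!r}")]
```

Because both front-ends would only see these small immutable values, neither needs to know about the underlying message classes.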

Tools & Integrations

  • cli_tool.py: Exposes the assembly CLI as a tool; runs subcommands in a subprocess with API key injected via environment (never argv) to prevent secret leakage
  • fetch_tool.py: URL-fetch tool (approval-gated for SSRF protection)
  • ask_tool.py: Allows the agent to ask the user questions mid-task via an injected bridge (framework-agnostic)
  • docs_mcp.py: Loads AssemblyAI docs MCP server tools for documentation search
  • web_search.py: Optional Tavily web search (enabled when TAVILY_API_KEY is set)
  • skills.py: Imports installed agent skills (e.g., assemblyai skill) via a separate filesystem backend
  • memory.py: Long-term memory middleware with persistent storage across sessions
  • store.py: SQLite checkpoint persistence for resumable sessions (in-memory fallback for ephemeral runs)
  • banner.py: Startup splash with ASSEMBLY wordmark and intro copy
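The environment-versus-argv point made for cli_tool.py can be sketched as follows. This is a minimal illustration, not the PR's implementation: `run_cli` and the variable name `EXAMPLE_API_KEY` are hypothetical, and the real tool may use a different variable and different subprocess options.

```python
import os
import subprocess
import sys


def run_cli(argv: list[str], api_key: str,
            env_var: str = "EXAMPLE_API_KEY") -> "subprocess.CompletedProcess[str]":
    """Run a subcommand with the key passed via the environment, never argv."""
    # Copy the parent environment and add the secret; argv stays free of it,
    # so the key does not show up in process listings or shell history.
    env = {**os.environ, env_var: api_key}
    return subprocess.run(argv, env=env, capture_output=True, text=True, check=False)


# Usage: the child reads the key from its environment.
probe = "import os; print(os.environ['EXAMPLE_API_KEY'])"
result = run_cli([sys.executable, "-c", probe], api_key="sk-demo")
print(result.stdout.strip())
```

Command-line arguments are generally visible to other processes on the same machine, while a child's environment is not exposed as broadly, which is the usual motivation for this choice.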

User Interfaces

  • tui.py: Textual app with scrolling transcript, bottom input, and modal approval/ask screens; runs the agent on a worker thread with events streamed back to the UI thread
  • render.py: Rich console renderer for headless/piped runs and as a fallback
  • session.py: run_repl() function for interactive line-by-line input

Command Wiring (aai_cli/commands/code/)

  • __init__.py: Command definition with all flags (--model, --dir, --auto, --docs, --skills, --web, --memory, --session, --persist, --tui)
  • _exec.py: Run logic that assembles tools, middlewares, the agent, and dispatches to TUI (if TTY) or REPL (headless)
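The TTY/headless dispatch described for _exec.py might look roughly like the sketch below. `pick_frontend` is a hypothetical helper, and requiring both stdin and stdout to be terminals is an assumption; the PR does not say which streams it checks or how the --tui flag interacts with the check.

```python
import io
import sys
from typing import Optional, TextIO


def pick_frontend(stdin: Optional[TextIO] = None, stdout: Optional[TextIO] = None) -> str:
    """Return "tui" for an interactive terminal, "repl" for piped or headless runs."""
    stdin = sys.stdin if stdin is None else stdin
    stdout = sys.stdout if stdout is None else stdout
    # Assumption: both ends must be a terminal for the full-screen UI to work.
    interactive = stdin.isatty() and stdout.isatty()
    return "tui" if interactive else "repl"


print(pick_frontend(io.StringIO(), io.StringIO()))  # in-memory streams, prints "repl"
```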

Tests

  • tests/test_code_agent.py: 386-line end-to-end suite exercising the real deepagents graph with a fake chat model, covering file writes, approvals, auto-approve, REPL loop, tool invocation, and middleware
  • tests/test_code_tui.py: Textual pilot tests (headless) for app composition, splash rendering, turn execution, event rendering, and approval/ask modals
  • tests/test_code_command.py: Command wiring tests for flag parsing, TTY/headless dispatch, and tool assembly

Notable Implementation Details

  • Approval gating: Mutating tools (write_file, edit_file, execute, assembly, fetch_url) are gated behind an approver callback unless `

https://claude.ai/code/session_01Mqx2vYy9FS5Lxpf3ekBGsr

…ay only)

A terminal coding agent — a bespoke port of langchain-ai/deepagents' `code` agent —
that talks only to the AssemblyAI LLM Gateway. Built on deepagents over a cwd-scoped
LocalShellBackend, with: the `assembly` CLI exposed as a tool, the docs MCP server,
Tavily web search, URL fetch, ask-user, installed-skills + long-term-memory
middleware, persistent SQLite sessions, and human approval on mutating tools.
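The human approval on mutating tools mentioned above can be pictured with a small wrapper. This is a sketch under stated assumptions: the tool names are the ones listed in the PR description, but `gated_call`, the `auto_approve` parameter, and the rejection message are hypothetical, and the real agent gates calls through deepagents' interrupt mechanism rather than a plain function.

```python
from typing import Any, Callable

# Tool names taken from the PR description's list of gated operations.
MUTATING_TOOLS = frozenset({"write_file", "edit_file", "execute", "assembly", "fetch_url"})

Approver = Callable[[str, dict[str, Any]], bool]


def gated_call(name: str, args: dict[str, Any], tool: Callable[..., Any],
               approver: Approver, auto_approve: bool = False) -> Any:
    """Run `tool` only if it is read-only, auto-approved, or explicitly approved."""
    needs_approval = name in MUTATING_TOOLS and not auto_approve
    if needs_approval and not approver(name, args):
        # Return a message instead of raising so the agent can react to the refusal.
        return f"{name} was rejected by the user"
    return tool(**args)
```

A front-end would pass its own `approver`: a modal in the TUI, a y/N prompt in the headless REPL.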

Front-ends: a Textual TUI modeled on deepagents-code (ASSEMBLY wordmark in the brand
blue on a flat dark canvas, bordered prompt, status line with mode badge + cwd/branch,
approval/ask modals, copy/paste via mouse-off, web-disabled toast), plus a Rich
headless REPL fallback. `code` is its own "Coding Agent" help panel.

Gateway compatibility fix: the gateway 500s on OpenAI content-parts arrays, so the
model subclass flattens list-content messages to plain strings before sending.
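The flattening step can be sketched in isolation. The function below is illustrative only: `flatten_content` is a hypothetical name, and dropping non-text parts and joining without a separator are assumptions, since the commit message does not say how the model subclass treats them.

```python
from typing import Any, Union


def flatten_content(content: Union[str, list[Any]]) -> str:
    """Collapse an OpenAI-style content-parts list into one plain string."""
    if isinstance(content, str):
        return content  # already flat
    pieces: list[str] = []
    for part in content:
        if isinstance(part, str):
            pieces.append(part)
        elif isinstance(part, dict) and part.get("type") == "text":
            pieces.append(str(part.get("text", "")))
        # Assumption: non-text parts (images, etc.) are dropped.
    return "".join(pieces)


parts = [{"type": "text", "text": "Hello, "}, {"type": "text", "text": "world"}]
print(flatten_content(parts))  # prints "Hello, world"
```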

Tests drive the real deepagents graph with a fake chat model (no network/TTY) and the
Textual app via pilot; 100% patch coverage. The full local gate (./scripts/check.sh)
passes, including diff-cover and the diff-scoped mutation gate.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Mqx2vYy9FS5Lxpf3ekBGsr
Comment thread aai_cli/commands/code/_exec.py Outdated
def _confirm(name: str, args: dict[str, object]) -> bool:
    """Headless approval: print the pending tool call and read a y/N from stdin."""
    rendered = ", ".join(f"{key}={value!r}" for key, value in args.items())
    output.error_console.print(output.warn(f"Run {name}({rendered})? [y/N] "))

Printing the formatted tool arguments (rendered) via output.error_console.print can expose user-controlled or secret values. Avoid printing raw args; sanitize or redact sensitive fields before display.

Details

✨ AI Reasoning
The code prints a rendering of tool call arguments (which can include user-supplied values) directly to the error console for confirmation. This may expose sensitive or untrusted strings (API keys, tokens, file contents, URLs) without sanitization. The printed value is built from args via repr() and interpolated into a message that is output to the console; this behavior was introduced in this change. Printing raw user-controlled arguments can leak secrets or allow log injection if those values contain control characters.

🔧 How do I fix it?
Keep sensitive data such as emails, passwords, and tokens out of logs. When logging values tied to a user, prefer a safe identifier like a user ID over the raw input, and strip line breaks from any user-provided text you do log.

Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.
Reply @AikidoSec ignore: [REASON] to ignore this issue.
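One way to act on this suggestion is to redact and truncate arguments before showing them in the approval prompt. The sketch below is not what the PR did (the follow-up commit addresses the comment by escaping Rich markup); `render_args`, the list of sensitive key patterns, and the 80-character limit are all hypothetical.

```python
import re
from typing import Any

# Hypothetical heuristics: argument names that suggest a secret value.
_SENSITIVE = re.compile(r"key|token|secret|password", re.IGNORECASE)
_CONTROL = re.compile(r"[\x00-\x1f\x7f]")


def render_args(args: dict[str, Any], max_len: int = 80) -> str:
    """Render tool arguments for display, masking secrets and long values."""
    shown = []
    for key, value in args.items():
        if _SENSITIVE.search(key):
            text = "'***'"
        else:
            # repr() already escapes control characters in str values; the
            # substitution covers objects whose repr contains them raw.
            text = _CONTROL.sub(" ", repr(value))
            if len(text) > max_len:
                text = text[: max_len - 1] + "…"
        shown.append(f"{key}={text}")
    return ", ".join(shown)


print(render_args({"api_key": "abc", "path": "a.txt"}))  # prints api_key='***', path='a.txt'
```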

Comment thread aai_cli/code_agent/tui.py Outdated

def _submit(self, text: str) -> None:
    log = self.query_one("#log", RichLog)
    log.write(f"[b cyan]» {text}[/b cyan]")

User-submitted prompt (text) is written to the log raw. Avoid logging unsanitized user input; sanitize, truncate, or mask sensitive content before logging.

Details

✨ AI Reasoning
The code writes the raw user-entered prompt string directly to the UI log without sanitization or encoding. This logs arbitrary user-controlled text, which can include secrets, CR/LF sequences for log forging, or other sensitive data. Logging unsanitized input increases risk of credential leakage and log injection attacks and should be avoided or sanitized before persisting/displaying in logs.

🔧 How do I fix it?
Keep sensitive data such as emails, passwords, and tokens out of logs. When logging values tied to a user, prefer a safe identifier like a user ID over the raw input, and strip line breaks from any user-provided text you do log.

Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.
Reply @AikidoSec ignore: [REASON] to ignore this issue.

Comment thread aai_cli/code_agent/agent.py Fixed
Comment thread aai_cli/code_agent/tui.py
from pathlib import Path
from typing import TYPE_CHECKING, ClassVar

from textual.app import App, ComposeResult
claude added 2 commits June 17, 2026 22:57
- tests: `_abbrev_home` assertion compares to the platform-native path string (was
  hardcoded POSIX), fixing the Windows test job; close the SqliteSaver connection in
  the checkpointer test so its sqlite3.Connection isn't GC'd mid-suite (the unclosed
  connection raised PytestUnraisableExceptionWarning → failure under filterwarnings=error
  on py3.13/Windows, in random later tests).
- Escape dynamic content (assistant text, tool name/args/results, user prompt, agent
  questions, approval prompt) before writing to Rich markup surfaces in the TUI and the
  Rich renderer/REPL. A model/tool string containing "[" would otherwise be parsed as
  Rich markup — injecting styling or raising MarkupError and crashing the turn. Matches
  the inline-escape convention in ui/output.py. (Addresses the Aikido review comments.)
- Remove the no-op `...` from the CompiledAgent Protocol method (docstring is the body).
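The markup-escaping fix in the second bullet can be illustrated with Rich's `rich.markup.escape`. The sketch below assumes a helper named `format_prompt_line` (hypothetical) that mirrors the `[b cyan]» ...[/b cyan]` line from the reviewed `_submit` snippet. The `except ImportError` branch is a crude stand-in so the example runs without Rich installed; it is not equivalent to Rich's own escaping.

```python
try:
    from rich.markup import escape
except ImportError:
    # Rough fallback for environments without Rich: escape every "[".
    def escape(markup: str) -> str:
        return markup.replace("[", "\\[")


def format_prompt_line(text: str) -> str:
    """Wrap user text in styling tags while keeping the text itself literal."""
    # Only the dynamic part is escaped; the surrounding tags stay live markup.
    return f"[b cyan]» {escape(text)}[/b cyan]"


line = format_prompt_line("[red]boom[/red]")
print(line)
```

Without the `escape` call, the user's `[red]` would be interpreted as a style tag, and an unbalanced tag such as a stray `[/b]` could raise `MarkupError`.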

The langchain GHSA (Aikido/pip-audit) is handled separately — bumping it is blocked by
a websockets>=16 conflict; relaxing that floor is the next change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Mqx2vYy9FS5Lxpf3ekBGsr
…5-jv2w-4656

Resolves the medium-severity LangChain path-confinement advisory (flagged by Aikido +
pip-audit on PR #230). The patched langchain 1.3.9 requires deepagents 0.6.10, whose
langgraph-sdk caps websockets <16 — so relax the CLI's `websockets>=16` floor (a
dependabot artifact, not a real requirement) to `>=14`; the resolver now picks 15.0.1.

Safe for our usage either way:
- websockets: the realtime STT/TTS code uses websockets.sync/asyncio.client (stable
  since 13.x) and the assemblyai SDK only needs >=11; the full suite passes on 15.0.1.
- the advisory's affected components (langchain file-search middleware / config loaders /
  path-prefix checks) aren't on our path: we use deepagents' LocalShellBackend with
  virtual_mode=True, which confines file/shell tools to the working directory.

Full local suite (3261 tests) + ruff + mypy + lock-check pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Mqx2vYy9FS5Lxpf3ekBGsr
@alexkroman alexkroman added this pull request to the merge queue Jun 17, 2026
Merged via the queue into main with commit 5d08c6d Jun 17, 2026
19 checks passed
@alexkroman alexkroman deleted the claude/youthful-cray-l8oqnp branch June 17, 2026 23:17
2 participants