The contracts scripts and agents can rely on: exit codes, environment variables, configuration precedence, and machine-readable output shapes.
Exit codes are stable and deliberately split the way gh splits them (the
source of truth is the docstring in aai_cli/errors.py):
| Code | Meaning |
|---|---|
| 0 | Success. |
| 1 | Generic runtime failure: an API/network error, a missing dependency, or an unexpected internal error. |
| 2 | Usage/validation error: bad flags, a bad path, a malformed id, or an unusable config file. |
| 4 | Not authenticated: no usable credential, a rejected key, or a self-service command that needs a browser login. |
| 130 | Cancelled with Ctrl-C. |
A subprocess the CLI shells out to (assembly deploy, assembly dev,
assembly update) propagates that process's own exit code unchanged. Under
--json, every failure also emits one {"error": {"type": …, "message": …}}
object on stderr; the error.type pairs 1:1 with the exit code.
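A wrapper script can branch on these codes and, under --json, read the error object from stderr. The following is a minimal Python sketch, not part of the CLI: the helper name `describe_failure` and the label strings are hypothetical, and it assumes each stderr JSON object occupies one line. Only the exit codes and the {"error": {"type", "message"}} shape come from the contract above.

```python
import json

# Exit codes from the table above; the label strings are illustrative only.
EXIT_MEANINGS = {
    0: "success",
    1: "runtime failure",
    2: "usage or validation error",
    4: "not authenticated",
    130: "cancelled",
}


def describe_failure(returncode, stderr_text):
    """Summarize a finished `assembly ... --json` run (hypothetical helper)."""
    # A code outside the table may come from a subprocess the CLI shells out to.
    meaning = EXIT_MEANINGS.get(returncode, "propagated from a subprocess")
    error = None
    # Warnings also arrive on stderr as JSON, so scan line by line (assumed
    # one object per line) and keep the object that carries an "error" key.
    for line in stderr_text.splitlines():
        try:
            payload = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(payload, dict) and isinstance(payload.get("error"), dict):
            error = payload["error"]
    return {"code": returncode, "meaning": meaning, "error": error}
```

Feed it the `returncode` and `stderr` of a `subprocess.run(..., capture_output=True, text=True)` call, and branch on `meaning` rather than on the message text.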
Product-scoped variables are ASSEMBLYAI_*; CLI-behavior variables are
AAI_*. Keep new variables in that split.
| Variable | Effect |
|---|---|
| ASSEMBLYAI_API_KEY | API key for all API calls; beats the keyring, loses to nothing but a --api-key validation flag. |
| AAI_ENV | Backend environment (production, sandbox000); beats the profile's stored env, loses to --env/--sandbox. The non-production environments are internal: selecting one (here, via --env/--sandbox, or a profile binding) is rejected with exit 2 unless the active profile is signed in with an @assemblyai.com login, and --env/--sandbox and the sandbox-only commands are hidden from --help for everyone else. |
| AAI_AUTH_PORT | Loopback callback port for assembly login (dev/test only; default 8585). |
| AAI_NO_UPDATE_CHECK | Disables the "update available" notice, its interactive "update now?" prompt, and the background refresh. |
| AAI_TELEMETRY_DISABLED / DO_NOT_TRACK | Disables anonymous usage telemetry (always beats the persisted choice). |
| NO_COLOR / FORCE_COLOR | Standard color overrides; --color always / --color never sets them for child consoles too. |
| CI | Suppresses interactive affordances (spinners, the update notice); never changes output shape. |
Non-secret settings persist in config.toml (assembly config path prints
where; assembly config list/get/set reads and writes it). The API key lives
only in the OS keyring — never in a file.
Precedence, highest first:
- Command flags (--profile, --env/--sandbox).
- Environment variables (ASSEMBLYAI_API_KEY, AAI_ENV).
- Stored settings (config.toml + keyring): the active profile, its env binding, and its key.
- Built-in defaults (production, profile default).
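The precedence order above amounts to a first-match lookup across four layers. The sketch below illustrates that idea in Python; `resolve_setting` and `resolve_env` are hypothetical names, not functions from the CLI, and the sketch assumes that only a missing value (`None`) counts as unset.

```python
def resolve_setting(flag=None, env=None, stored=None, default=None):
    """Return the first value that is set, highest precedence first."""
    for value in (flag, env, stored, default):
        # Assumption: only None means "unset"; an empty string would still win.
        if value is not None:
            return value
    return None


def resolve_env(flag_env, environ, profile_env):
    """Pick the backend environment: flag, then AAI_ENV, then profile, then default."""
    # AAI_ENV and the "production" default are the documented names.
    return resolve_setting(flag_env, environ.get("AAI_ENV"), profile_env, "production")
```

For example, `resolve_env(None, {"AAI_ENV": "sandbox000"}, "production")` picks the environment variable over the profile binding, while passing an explicit flag value as the first argument overrides both.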
Pipe the key on stdin so it never reaches shell history or ps:

    printenv ASSEMBLYAI_API_KEY | assembly login --with-api-key

Or skip storage entirely and set ASSEMBLYAI_API_KEY per invocation. On a
remote/SSH machine the browser flow also works by forwarding the callback
port (ssh -L 8585:127.0.0.1:8585 <host>) and opening the printed URL in
your local browser.
--json (or -o json) is always an explicit opt-in — piping never switches
the output shape. One-shot commands emit a single JSON object on stdout;
errors and warnings are single JSON objects on stderr.
The list/account read commands — assembly transcripts list, assembly sessions list/get, assembly balance, assembly usage, assembly limits,
assembly keys list, and assembly audit — also accept -o FIELDS to project
columns straight out of the JSON, so a "grab one column" pipeline needs no
external jq. Pass a single field (-o id) or a comma-separated list (-o id,status); dotted paths (-o transform.model) reach nested objects. A list
result prints one tab-separated line per row, a single record one line; a
missing field (or null) is an empty column, and a nested object/list is
re-serialized as compact JSON. -o takes precedence over --json.
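To make the projection rules concrete, here is a small Python sketch that applies them to already-parsed JSON rows. It is an illustration of the described behavior, not the CLI's implementation: the function names are hypothetical, scalars are rendered with `str()`, and "compact JSON" is assumed to mean no spaces after separators.

```python
import json


def lookup(record, dotted_path):
    """Follow a dotted path such as "transform.model"; None when absent."""
    value = record
    for key in dotted_path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def render_cell(value):
    """Render one column value following the rules described above."""
    if value is None:
        return ""  # a missing field or null becomes an empty column
    if isinstance(value, (dict, list)):
        # Nested values are re-serialized as compact JSON (separators assumed).
        return json.dumps(value, separators=(",", ":"))
    return str(value)  # scalar formatting here is an assumption


def project(rows, fields):
    """Yield one tab-separated line per row for a comma-separated field list."""
    paths = [field.strip() for field in fields.split(",")]
    for row in rows:
        yield "\t".join(render_cell(lookup(row, path)) for path in paths)
```

Calling `project(rows, "id,status")` on a list of records yields one `id<TAB>status` line per record, with an empty second column wherever `status` is null or absent.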

    assembly transcripts list -o id | head -1   # newest transcript id
    assembly keys list -o id,name               # id<TAB>name per key
    assembly balance -o balance_in_cents        # the raw integer

Streaming commands emit newline-delimited JSON (NDJSON), one event per line,
each carrying a "type" field to dispatch on:
| Command | Event types |
|---|---|
| assembly stream --json | begin, turn, termination (with --from-stdin, a source event precedes each file's events) |
| assembly agent --json | session.ready, transcript.user.delta, transcript.user, reply.started, transcript.agent, reply.done |
| assembly live --json | session.ready, transcript.user.delta, transcript.user, tool.use, reply.started, transcript.agent, reply.done |
| assembly dictate --json | utterance |
| assembly llm --follow --json | answer |
| assembly transcribe <batch> --json | result (one per source), then reduce if --llm-reduce is set |
New event types may be added; existing fields are stable. Consumers should ignore types they don't recognize.
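A consumer that follows this advice can route events through a table of handlers keyed by "type" and skip anything it does not know. The sketch below is one way to do that in Python; `dispatch_events` is a hypothetical helper, and it relies only on the "type" field, since the other fields of each event are not described here.

```python
import json


def dispatch_events(lines, handlers):
    """Route each NDJSON event to handlers[event["type"]]; skip unknown types."""
    skipped = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between events
        event = json.loads(line)
        handler = handlers.get(event.get("type"))
        if handler is None:
            # Forward-compatible: record and ignore types we do not handle.
            skipped.append(event.get("type"))
            continue
        handler(event)
    return skipped
```

Any iterable of lines works as input, for example the `stdout` of a `subprocess.Popen(..., stdout=subprocess.PIPE, text=True)` running `assembly stream --json`. The returned list is only there to make ignored types visible during development.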
With --llm-reduce, batch mode emits one final
{"type":"reduce","model","prompts","output"} record after the per-source
result records — the aggregate prompt(s) run once over every result, with the
output printed to stdout (the progress table is routed to stderr so stdout stays
clean for piping; the global -q/--quiet drops that table entirely). --llm-reduce
is repeatable, each prompt running on the previous one's output; for a single
source it extends the --llm chain over that transcript.
assembly eval takes the same --llm/--llm-reduce flags but emits one JSON
object per dataset (not NDJSON; a single dataset is therefore one object):
--llm runs a chain over each transcript and attaches {"model","steps"} under
the row's llm key (the WER score still uses the raw transcript), and
--llm-reduce runs one prompt over every item's result and adds a top-level
reduce ({"model","prompts","output"}) to the object.
assembly stream --save-dir DIR auto-names a capture under DIR/YYYY-MM-DD/
with a timestamped stem (YYYY-MM-DD-HHMMSS[-slug]) shared across every file it
writes:
- <stem>.txt: the transcript, one finalized turn per line (flushed live).
- <stem>.wav: the recorded audio, 16-bit mono PCM. Suppress it with --no-save-audio to keep only the text. Under --system-audio the two channels can't share a file, so each gets its own <stem>-you.wav / <stem>-system.wav.
- <stem>.md: written when --llm "…" is also passed: the final answer of the live prompt chain, captured as a note next to the transcript.
- <stem>.aai.json: a metadata sidecar so a list/browse UI needs no transcript parsing, shaped as {"title", "date", "duration_seconds", "speakers", "turns", "transcript", "audio", "note"}. audio is the list of WAV file names (empty under --no-save-audio, two entries under --system-audio); note is null when no --llm note was written.
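Because every capture carries a sidecar, a browse view can be built from the .aai.json files alone. The following Python sketch shows the idea under stated assumptions: `list_captures` is a hypothetical helper, it assumes captures sit exactly one directory below the save dir (DIR/YYYY-MM-DD/), and it reads only keys named in the sidecar shape above without assuming their value types.

```python
import json
from pathlib import Path


def list_captures(save_dir):
    """Return one summary dict per capture, sorted by path."""
    captures = []
    # Captures live one level down: DIR/YYYY-MM-DD/<stem>.aai.json.
    for sidecar in sorted(Path(save_dir).glob("*/*.aai.json")):
        meta = json.loads(sidecar.read_text(encoding="utf-8"))
        captures.append(
            {
                "stem": sidecar.name[: -len(".aai.json")],
                "title": meta.get("title"),
                "date": meta.get("date"),
                "duration_seconds": meta.get("duration_seconds"),
                "audio": meta.get("audio", []),
                # note is null when no --llm note was written
                "has_note": meta.get("note") is not None,
            }
        )
    return captures
```

Since the date directory and the timestamped stem both sort lexicographically, sorting by path also orders captures chronologically.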
--name "Title" slugs an explicit title into the stem; --auto-name instead
derives that title from the transcript via the LLM Gateway once the stream ends,
renaming the files to match (the timestamp stem is kept if the title is empty).
The two are mutually exclusive.
assembly live answers each spoken turn with a tool-using agent, so it can reach
external tools mid-conversation. Its toolset is deliberately small — a low-latency
spoken turn does best with one obvious tool rather than a large menu to choose
among — so its one built-in tool is Firecrawl web search. It loads when a
FIRECRAWL_API_KEY is set; without it the session prints a one-line notice and
runs from the model's own knowledge (no web search).
--mcp-config FILE adds your own MCP servers (none load by default), from a
standard mcpServers JSON file — the same
{"mcpServers": {"name": {"command": "…", "args": […]}}} shape Claude Desktop and
Claude Code use. Repeat the flag to merge several files; a later file wins on a
name clash. Remote servers use {"url": "…"} instead of command/args.
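The merge rule can be pictured as a dictionary update applied in flag order. This Python sketch works on already-parsed config documents; `merge_mcp_configs` is a hypothetical name, and it assumes a clash replaces the whole server entry rather than merging its fields, which the text above does not spell out.

```python
def merge_mcp_configs(configs):
    """Merge parsed mcpServers documents in flag order; later entries win."""
    merged = {}
    for config in configs:
        # dict.update replaces the whole server entry when a name repeats
        # (assumed behavior: no field-by-field merge).
        merged.update(config.get("mcpServers", {}))
    return {"mcpServers": merged}
```

With two files that both define a server called `fs`, the entry from the file passed last is the one that survives, while servers that appear in only one file are kept as they are.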
Each server is launched independently and best-effort: one that won't start (a
missing npx/uvx, an offline host) drops only its own tools, so a single broken
tool never sinks the session. MCP tools are a live-run feature and are not
reflected in --show-code output.