25 changes: 25 additions & 0 deletions AGENTS.md
@@ -70,6 +70,31 @@ The post-edit hook (`.claude/settings.json`) runs `ruff check --fix --unfixable

The suite is hermetic by construction, enforced three ways (`tests/conftest.py` + `pyproject.toml` `[tool.pytest.ini_options]`): **pytest-randomly** shuffles order, an autouse `pin_timezone` fixture pins `TZ` to a fixed non-UTC zone (UTC-normalized rendering must be unaffected; use **time-machine** to freeze `now`), and **pytest-socket** (`--disable-socket`) blocks real network so an unmocked SDK/HTTP call fails loudly instead of hitting the API. A test that only binds a loopback server opts back in with the tight `@pytest.mark.allow_hosts(["127.0.0.1"])` (still blocks external hosts). The `e2e`/`install`/`install_script` marker suites legitimately reach the real network in-process (PyPI reachability probes, real-API runs), so a `pytest_collection_modifyitems` hook in `conftest.py` auto-grants them full sockets — adding a network marker is all that's needed, no per-test `enable_socket`.

### Writing tests that pass the diff gates

Lessons that cost iterations getting the patch-coverage and mutation tail gates green:

- **A boolean literal/default survives the mutation gate unless a test asserts the
difference between its two values**, not just that the line ran. `json_mode=False` passed
to `output.emit`, or `quiet=False` on `output.status`, get mutated to `True` — kill them by
asserting the *behavioral* split: the human branch prints bare text
(`result.output.strip() == "…"`, not a JSON object), or the spinner is actually entered
(monkeypatch `error_console.status` and assert it ran). A changed message / `prompter.note`
string is mutated whole, so one substring assert on the actionable keyword kills it.
- **Help text and docstrings are pinned by the syrupy snapshots, not unit asserts** — a
mutated help string is killed by the regenerated `.ambr`, so `--snapshot-update` and commit
rather than adding redundant `--help` substring asserts.
- **Typer's `CliRunner` merges stderr into `result.output`, and not in call order**, so don't
assume `splitlines()[-1]` is the command payload. In `--json` mode the env-mismatch warning
is its own `{"warning": …}` line, so filter parsed lines by a key the payload carries
(`next(o for o in objs if "env" in o)`). A monkeypatched fake must also mirror the real
signature — when a helper gains a kwarg (e.g. `output.status(…, quiet=…)`), doubles that
patch it must accept it or the call `TypeError`s.
- **`--json` / `-j` is a per-command flag, not a root flag**: `aai --json transcribe …` fails
with "No such option"; it's `aai transcribe … --json`. (The root callback still sniffs the
whole token list via `_command_line_requests_json`, so a callback-level failure like a bad
`--env` keeps the JSON error shape — but the flag itself lives on the subcommand.)
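
The `CliRunner` point above can be sketched as follows. This is a minimal illustration, not code from the test suite: `parse_json_lines` is a hypothetical helper, and the sample output string and its `env`/`profile` keys are assumptions made up for the example.

```python
import json


def parse_json_lines(merged_output: str) -> list[dict]:
    """Parse every line that is a JSON object; skip blank or non-JSON lines."""
    objs = []
    for line in merged_output.splitlines():
        try:
            parsed = json.loads(line)
        except json.JSONDecodeError:
            continue  # bare text or an empty line, not part of the JSON stream
        if isinstance(parsed, dict):
            objs.append(parsed)
    return objs


# Hypothetical merged output: the stderr warning happens to land AFTER the payload,
# so the last line is not the command payload.
merged = '{"env": "prod", "profile": "default"}\n{"warning": "env mismatch"}\n'

objs = parse_json_lines(merged)
# Select the payload by a key it carries instead of by position.
payload = next(o for o in objs if "env" in o)
```

Selecting by key keeps the assertion stable however the two streams end up interleaved.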

### Manual QA / running the CLI in sandboxed sessions

Lessons that cost time in agent sessions — read before exercising `uv run aai` by hand:
39 changes: 38 additions & 1 deletion README.md
@@ -58,6 +58,12 @@ Prefers [`pipx`](https://pipx.pypa.io), falling back to `pip --user`.

## Quick Start

```sh
aai onboard # guided setup: sign in, first transcription, start building
```

Prefer to do it by hand?

```sh
aai login # store your API key (browser-assisted)
aai transcribe --sample # transcribe the hosted wildfires.mp3 sample
@@ -91,7 +97,38 @@ Your key is written to a git-ignored `.env` (never sent to the browser). Use `--
| `aai setup install` | Set up your coding agent for AssemblyAI (docs MCP + skills). |
| `aai keys` / `balance` / `usage` / `limits` / `sessions` / `audit` | Account self-service (browser login). |

-Every command prints human-readable text by default — terminal, pipe, CI, or agent alike. Add `--json` for machine-readable output; it never switches on just because stdout is piped, so `aai transcribe call.mp3 | grep hello` still gets the transcript, not a JSON blob. Errors go to **stderr**, so stdout stays clean for pipelines.
+Every command prints human-readable text by default — terminal, pipe, CI, or agent alike. Add `--json` (or `-j`) for machine-readable output; it never switches on just because stdout is piped, so `aai transcribe call.mp3 | grep hello` still gets the transcript, not a JSON blob. Errors go to **stderr**, so stdout stays clean for pipelines.

Account data lives in **top-level** commands — `aai balance` / `usage` / `limits` / `keys` / `audit`, and `aai login` / `logout` / `whoami` — not under an `aai account` group.

### JSON output

`--json` is the scripting contract. The shapes are stable:

| Command | `--json` shape |
| --- | --- |
| `transcribe` / `transcripts get` | the full transcript payload (`id`, `status`, `text`, `words`, `utterances`, …) — identical for both, so a fetched transcript round-trips |
| `transcribe --llm` | `{id, status, text, transform: {model, steps: [{prompt, output}]}}` |
| `transcripts list` / `sessions list` / `keys list` | a JSON array of row objects (`[]` when empty) |
| `balance` / `usage` / `limits` / `audit` | the raw AMS payload (e.g. `balance.balance_in_cents`; `usage.usage_items[].line_items[].price` in cents) |
| `doctor` | `{ok, profile, environment, checks: [{name, status, affects, detail, fix}]}` |
| any error | `{"error": {"type", "message", "suggestion"?, "transcript_id"?}}` on **stderr** |

`stream`/`agent` with `--json` emit newline-delimited JSON (one object per event/turn).
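
A script consuming these shapes might look roughly like the sketch below. It assumes only what the table states (one JSON object per line for `stream`/`agent`, and an `error` object with `type`, `message`, and an optional `suggestion` on stderr). The helper names and the sample event and error values are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed object per non-empty line of newline-delimited JSON."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def describe_error(stderr_text: str) -> str:
    """Summarize the documented error shape; `suggestion` is optional."""
    err = json.loads(stderr_text)["error"]
    summary = f"{err['type']}: {err['message']}"
    if "suggestion" in err:
        summary += f" (hint: {err['suggestion']})"
    return summary


# Hypothetical samples; the real event fields and error types may differ.
sample_stdout = '{"text": "hello"}\n\n{"text": "world"}\n'
sample_stderr = '{"error": {"type": "not_authenticated", "message": "No API key found."}}'

events = list(iter_events(sample_stdout.splitlines()))
error_summary = describe_error(sample_stderr)
```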

### Exit codes

Scripts can branch on the exit code:

| Code | Meaning |
| --- | --- |
| `0` | success |
| `1` | API/network error, missing dependency, or unexpected internal error |
| `2` | usage/validation error (bad flag, bad path, malformed id, unusable config) |
| `4` | not authenticated (no usable key, rejected key, or a self-service command needing browser login) |
| `130` | cancelled with Ctrl-C |

`aai deploy` / `aai dev` shell out to other tools and propagate that tool's own exit code.
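
As a rough sketch of branching on these codes from Python: the code-to-meaning mapping comes from the table above, while `classify_exit` and `run_aai` are hypothetical helper names, and `run_aai` assumes `aai` is on `PATH`.

```python
import subprocess

# Meanings copied from the exit-code table; anything else is treated as unknown
# (for example, a code propagated by a tool that `aai deploy` / `aai dev` ran).
EXIT_MEANINGS = {
    0: "success",
    1: "API/network error, missing dependency, or unexpected internal error",
    2: "usage/validation error",
    4: "not authenticated",
    130: "cancelled with Ctrl-C",
}


def classify_exit(code: int) -> str:
    """Map an exit code to its documented meaning."""
    return EXIT_MEANINGS.get(code, "unknown (possibly propagated from another tool)")


def run_aai(*args: str) -> tuple[int, str]:
    """Run the CLI and return (exit code, meaning). Not invoked in this sketch."""
    proc = subprocess.run(["aai", *args], capture_output=True, text=True)
    return proc.returncode, classify_exit(proc.returncode)
```

A caller could then retry on `1`, prompt for `aai login` on `4`, and fail fast on `2`.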

> **Tip:** Quote URLs that contain `?` (most YouTube links do) — in zsh the `?` is a glob character: `aai transcribe "https://www.youtube.com/watch?v=VIDEO_ID"`.

10 changes: 5 additions & 5 deletions aai_cli/agent/session.py
@@ -10,7 +10,7 @@
 from typing import Any

 from aai_cli import environments
-from aai_cli.errors import APIError, CLIError, auth_failure, is_auth_failure
+from aai_cli.errors import APIError, CLIError, NotAuthenticated, auth_failure, is_auth_failure


 def ws_url() -> str:
@@ -29,7 +29,8 @@ def ws_url() -> str:
 )
 DEFAULT_GREETING = "Hey, what's on your mind?"

-# session.error codes that mean the connection is unauthorized -> exit 2.
+# session.error codes that mean the connection is unauthorized -> exit 4, the same
+# NotAuthenticated code every other rejected-credential path across the CLI uses.
 _AUTH_ERROR_CODES = {"UNAUTHORIZED", "FORBIDDEN"}

 # A pre-upgrade HTTP 403 on the WebSocket handshake (see _is_rejected_key).
@@ -154,10 +155,9 @@ def raise_error(self, event: dict[str, Any]) -> None:
         code = event.get("code", "")
         message = event.get("message") or code or "Voice agent error."
         if code in _AUTH_ERROR_CODES:
-            raise CLIError(
+            raise NotAuthenticated(
                 f"Voice agent rejected the connection: {message}",
-                error_type="unauthorized",
-                exit_code=2,
+                suggestion="Run 'aai login' with a valid key, or set ASSEMBLYAI_API_KEY.",
             )
         raise APIError(f"Voice agent error ({code}): {message}")

10 changes: 8 additions & 2 deletions aai_cli/auth/ams.py
@@ -24,9 +24,15 @@ def _detail(resp: httpx.Response) -> str:

 def _raise_for_error(resp: httpx.Response) -> None:
     if resp.status_code in (401, 403):
-        raise NotAuthenticated(f"AMS rejected the login ({resp.status_code}): {_detail(resp)}")
+        raise NotAuthenticated(
+            f"AMS rejected the login ({resp.status_code}): {_detail(resp)}",
+            suggestion="Your browser session may have expired — run 'aai login' again.",
+        )
     if resp.status_code >= _HTTP_ERROR_MIN_STATUS:
-        raise APIError(f"AMS request failed ({resp.status_code}): {_detail(resp)}")
+        raise APIError(
+            f"AMS request failed ({resp.status_code}): {_detail(resp)}",
+            suggestion="Check your network and try again; if it persists, contact support.",
+        )


 def _json_or_raise(resp: httpx.Response) -> object:
9 changes: 7 additions & 2 deletions aai_cli/commands/account.py
@@ -150,7 +150,12 @@ def usage(
     ),
     end: str | None = typer.Option(None, "--end", help="End date (YYYY-MM-DD). Default: today."),
     window: str | None = typer.Option(None, "--window", help="Window size, e.g. 'day' or 'month'."),
-    include_zero: bool = typer.Option(False, "--all", help="Include zero-usage windows."),
+    include_zero: bool = typer.Option(
+        False,
+        "--include-zero",
+        "--all",
+        help="Include zero-usage windows (matches --include-logins on `aai audit`).",
+    ),
     json_out: bool = options.json_option(),
 ) -> None:
     """Show usage over a date range (defaults to the last 30 days)."""
@@ -204,7 +209,7 @@ def render(d: dict[str, object]) -> object:
                 table.add_row(*row)
             hidden_note = (
                 output.muted(
-                    f"Hidden: {hidden_count} zero-usage window(s). Use --all to show them."
+                    f"Hidden: {hidden_count} zero-usage window(s). Use --include-zero to show them."
                 )
                 if hidden_count
                 else None
12 changes: 3 additions & 9 deletions aai_cli/commands/deploy.py
@@ -71,11 +71,7 @@ def command(self, *, prod: bool) -> list[str]:
 def _resolve_target(selected: list[Target]) -> Target:
     if len(selected) > 1:
         flags = " / ".join(t.flag for t in TARGETS)
-        raise CLIError(
-            f"Pass at most one deploy target ({flags}).",
-            error_type="usage_error",
-            exit_code=1,
-        )
+        raise UsageError(f"Pass at most one deploy target ({flags}).")
     return selected[0] if selected else VERCEL  # Vercel is the default


@@ -102,11 +98,9 @@ def _confirmed(target: Target, *, assume_yes: bool) -> bool:
     if assume_yes:
         return True
     if output.is_agentic():
-        raise CLIError(
+        raise UsageError(
             "Refusing to deploy without confirmation in a non-interactive session. "
-            "Pass --yes to deploy.",
-            error_type="usage_error",
-            exit_code=1,
+            "Pass --yes to deploy."
         )
     return typer.confirm(f"Deploy this project to {target.name}?")


Expand Down
15 changes: 15 additions & 0 deletions aai_cli/commands/doctor.py
Original file line number Diff line number Diff line change
Expand Up @@ -79,6 +79,16 @@ def _check_api_key(profile: str) -> Check:
try:
key = config.resolve_api_key(profile=profile)
except NotAuthenticated:
if not config.keyring_usable():
# On a box with no keyring, `aai login` can't persist a key either, so
# point at the env var that actually works here instead of a dead end.
return _check(
"api-key",
"fail",
"No API key found, and this machine has no usable OS keyring.",
fix="Set ASSEMBLYAI_API_KEY (browser login can't store a key without a keyring).",
affects=["everything"],
)
return _check(
"api-key",
"fail",
@@ -217,6 +227,11 @@ def render(data: DoctorResult) -> str:
             lines.append(" " + output.hint(f"fix: {escape(c['fix'])}"))
     if data["ok"]:
         lines.append(" " + output.success("Everything looks good."))
+        # Only the real `aai doctor` carries profile context; the onboarding wizard
+        # reuses render() for a partial check and has its own next-steps, so don't
+        # tack a "try transcribe" hint onto that one.
+        if data.get("profile") is not None:
+            lines.append(" " + output.hint("Try it: aai transcribe --sample"))
     else:
         failed = sum(1 for c in checks if c["status"] == "fail")
         noun = "problem" if failed == 1 else "problems"
10 changes: 7 additions & 3 deletions aai_cli/commands/keys.py
@@ -2,7 +2,6 @@

 import typer
 from rich.markup import escape
-from rich.table import Table

 from aai_cli import jsonshape, options, output
 from aai_cli.auth import ams
@@ -37,7 +36,10 @@ def _default_project_id(account_id: int, jwt: str) -> int:
     project = jsonshape.as_mapping(projects[0].get("project")) if projects else None
     pid = _project_id(project) if project is not None else None
     if pid is None:
-        raise APIError("Your account has no project to create a key in.")
+        raise APIError(
+            "Your account has no project to create a key in.",
+            suggestion="Create a project in the AssemblyAI dashboard, then try again.",
+        )
     return pid


@@ -75,7 +77,9 @@ def body(state: AppState, json_mode: bool) -> None:
             for token in jsonshape.mapping_list(entry.get("tokens"))
         )

-        def render(data: list[dict[str, object]]) -> Table:
+        def render(data: list[dict[str, object]]) -> object:
+            if not data:
+                return output.muted("No API keys found.")
             table = output.data_table("id", "name", "project", "key", "disabled")
             for row in data:
                 table.add_row(
27 changes: 21 additions & 6 deletions aai_cli/commands/login.py
@@ -58,13 +58,28 @@ def body(state: AppState, json_mode: bool) -> None:
             # api-key-only, so account self-service must report it needs a browser
             # login rather than silently reusing the old (possibly different) identity.
             config.clear_session(profile)
+        # An --api-key login stores no browser session, so the AMS self-service
+        # commands won't work for this profile — say so up front instead of letting
+        # the user hit "needs a browser login" later.
+        api_key_only = api_key is not None
+
+        def render(_d: object) -> str:
+            lines = [
+                output.success(f"Signed in as {escape(profile)} ({escape(env)})."),
+                output.hint("Run `aai onboard` to finish setup, or `aai transcribe <file>`."),
+            ]
+            if api_key_only:
+                lines.append(
+                    output.hint(
+                        "Account commands (keys/balance/usage/limits/audit) need "
+                        "`aai login` without --api-key."
+                    )
+                )
+            return "\n".join(lines)
+
         output.emit(
-            {"authenticated": True, "profile": profile, "env": env},
-            lambda _d: (
-                output.success(f"Signed in as {escape(profile)} ({escape(env)}).")
-                + "\n"
-                + output.hint("Run `aai onboard` to finish setup, or `aai transcribe <file>`.")
-            ),
+            {"authenticated": True, "profile": profile, "env": env, "api_key_only": api_key_only},
+            render,
             json_mode=json_mode,
         )

14 changes: 8 additions & 6 deletions aai_cli/commands/sessions.py
@@ -4,7 +4,7 @@
 from rich.markup import escape
 from rich.table import Table

-from aai_cli import jsonshape, options, output, theme
+from aai_cli import jsonshape, options, output, theme, timeparse
 from aai_cli.auth import ams
 from aai_cli.context import AppState, resolve_session, run_command
 from aai_cli.help_text import examples_epilog
@@ -62,19 +62,21 @@ def body(state: AppState, json_mode: bool) -> None:
         payload = ams.list_streaming(jwt, limit=limit, status=status)
         rows = _session_rows(payload.get("data"))

-        def render(data: list[dict[str, object]]) -> Table:
+        def render(data: list[dict[str, object]]) -> object:
+            if not data:
+                return output.muted("No streaming sessions yet.")
             table = output.data_table(
                 "session id",
                 "status",
-                "created",
+                "created (UTC)",
                 "audio (s)",
                 "model",
             )
             for s in data:
                 table.add_row(
                     escape(str(s["session_id"])),
                     theme.status_text(str(s["status"])),
-                    escape(str(s.get("created_at") or "")),
+                    escape(timeparse.format_utc_datetime(s.get("created_at"))),
                     escape(str(s.get("audio_duration_sec") or "")),
                     escape(str(s.get("speech_model") or "")),
                 )
@@ -88,8 +90,8 @@ def render(data: list[dict[str, object]]) -> Table:
 @app.command(
     epilog=examples_epilog(
         [
-            ("Show one session's details", "aai sessions get <session-id>"),
-            ("Raw JSON for one session", "aai sessions get <session-id> --json"),
+            ("Show one session's details", "aai sessions get sess_5551234"),
+            ("Raw JSON for one session", "aai sessions get sess_5551234 --json"),
             (
                 "Drill into the latest session",
                 "aai sessions get $(aai sessions list --json | jq -r '.[0].session_id')",
11 changes: 6 additions & 5 deletions aai_cli/commands/stream.py
@@ -155,7 +155,7 @@ def stream(
     prompt: str | None = typer.Option(
         None,
         "--prompt",
-        help="Prompt to bias the speech model (u3-pro).",
+        help="Prompt to bias the speech model (supported models only).",
         rich_help_panel=help_panels.OPT_MODEL,
     ),
     keyterms_prompt: list[str] | None = typer.Option(
@@ -341,10 +341,11 @@ def stream(
     """Transcribe live audio in real time — from your mic, a file, a URL, or a pipe.

     Pass - as the source to read raw PCM16/mono/16k audio on stdin, e.g.
-    ffmpeg -i input.mp4 -f s16le -ar 16000 -ac 1 - | aai stream -. --prompt biases the
-    speech model. --llm runs a prompt over the live transcript in-process, refreshing the
-    answer on every finalized turn; for a separate step instead, pipe the text out with
-    -o text | aai llm -f "…".
+    ffmpeg -i input.mp4 -f s16le -ar 16000 -ac 1 - | aai stream -.
+
+    --prompt biases the speech model. --llm runs a prompt over the live transcript
+    in-process, refreshing the answer on every finalized turn; for a separate step
+    instead, pipe the text out with -o text | aai llm -f "…".
     """

     def body(state: AppState, json_mode: bool) -> None:
15 changes: 9 additions & 6 deletions aai_cli/commands/transcribe.py
@@ -111,7 +111,7 @@ def transcribe(
     prompt: str | None = typer.Option(
         None,
         "--prompt",
-        help="Prompt to bias the speech model (u3-pro).",
+        help="Prompt to bias the speech model (supported models only).",
         rich_help_panel=help_panels.OPT_MODEL,
     ),
     # formatting
@@ -359,10 +359,13 @@ def transcribe(
 ) -> None:
     """Transcribe an audio file, URL, or YouTube link.

-    Quickest start: aai transcribe call.mp3 (or --sample for the hosted demo). Save with
-    --out FILE, or pipe one field with -o text. A YouTube URL is downloaded first, then
-    transcribed. Curated flags cover common features; --config KEY=VALUE and --config-file
-    reach every other field. Analysis (summary, chapters, ...) renders in human mode.
+    Quickest start: aai transcribe call.mp3 (or --sample for the hosted demo).
+
+    Save with --out FILE, or pipe one field with -o text. A YouTube URL is downloaded
+    first, then transcribed.
+
+    Curated flags cover common features; --config KEY=VALUE and --config-file reach
+    every other field. Analysis (summary, chapters, ...) renders in human mode.
     """

     def body(state: AppState, json_mode: bool) -> None:
@@ -447,7 +450,7 @@ def body(state: AppState, json_mode: bool) -> None:
         transcribe_exec.check_source_exists(source, sample=sample)

         api_key = config.resolve_api_key(profile=state.profile)
-        with output.status("Transcribing…", json_mode=json_mode):
+        with output.status("Transcribing…", json_mode=json_mode, quiet=state.quiet):
             transcript = transcribe_exec.run_transcription(
                 api_key, source, sample=sample, transcription_config=tc
             )