AssemblyAI · alexkroman · Jun 17, 2026
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -132,19 +132,25 @@ jobs:
       # (require_ffmpeg) before doing their work, so without it those tests fail at the
       # probe rather than exercising the mocked run. PortAudio needs no install — the
       # sounddevice wheel bundles it on Windows. choco ships on the runner but its download
-      # occasionally flakes (one matrix cell got ffmpeg, the other didn't), so retry and
-      # verify ffmpeg is callable here — a real miss fails this step instead of surfacing as
-      # confusing "ffmpeg not on PATH" test failures. The shim lands in choco's bin dir,
-      # already on the runner PATH, so later steps pick it up.
+      # from community.chocolatey.org doesn't just flake — it sometimes *hangs* for the whole
+      # job timeout, and a plain retry loop never gets to retry because the stuck attempt
+      # never returns. So bound each attempt with a hard timeout (Start-Job + Wait-Job): a
+      # hung download is killed and the next attempt retries, instead of wedging the cell
+      # until it's cancelled. The shim lands in choco's bin dir (machine-wide, already on the
+      # runner PATH), so the parent shell and later steps pick it up.
       - name: System deps (ffmpeg)
         shell: pwsh
         run: |
           $ErrorActionPreference = "Stop"
           $env:PATH = "C:\ProgramData\chocolatey\bin;$env:PATH"
           for ($i = 1; $i -le 3; $i++) {
-            choco install ffmpeg --no-progress -y
+            $job = Start-Job { choco install ffmpeg --no-progress -y }
+            if (Wait-Job $job -Timeout 240) { Receive-Job $job } else {
+              Stop-Job $job
+              Write-Host "choco install ffmpeg hung (attempt $i); killing and retrying…"
+            }
+            Remove-Job $job -Force
             if (Get-Command ffmpeg -ErrorAction SilentlyContinue) { break }
-            Write-Host "ffmpeg not yet on PATH (attempt $i); retrying…"
             Start-Sleep -Seconds 5
           }
           ffmpeg -version
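
Reviewer note: the bounded-attempt retry in the step above can be sketched in Python as follows. This is a minimal illustration, not the workflow's code. The helper name `run_with_retries` and its parameters are hypothetical, and it treats a zero exit code as success, whereas the workflow step checks whether `ffmpeg` is callable.

```python
import subprocess
import sys
import time


def run_with_retries(cmd, attempts=3, timeout=240.0, pause=5.0):
    """Run cmd up to `attempts` times, bounding each attempt by `timeout` seconds.

    Returns the 1-based number of the attempt that succeeded, or None if
    every attempt failed or hung. (Hypothetical helper for illustration.)
    """
    for attempt in range(1, attempts + 1):
        try:
            # subprocess.run kills the child and raises TimeoutExpired once the
            # timeout elapses, so a hung attempt cannot block the loop forever.
            result = subprocess.run(cmd, timeout=timeout, check=False)
        except subprocess.TimeoutExpired:
            print(f"attempt {attempt} hung; killed", file=sys.stderr)
        else:
            if result.returncode == 0:
                return attempt
        if attempt < attempts:
            time.sleep(pause)
    return None
```

The point mirrored from the workflow comment is that the timeout lives on each attempt rather than on the loop, so a stuck download costs at most one timeout window before the next attempt starts.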

diff --git a/README.md b/README.md
@@ -46,7 +46,7 @@ That's it. Run `assembly onboard` for a guided tour, or see [Installation](#-ins
 | :--- | :--- |
 | `assembly transcribe` | Transcribe files, URLs, YouTube/podcast pages, podcast RSS feeds, directories, globs, or bucket storage (`s3://`, `gs://`, `az://`) — with speaker labels, PII redaction, summarization, SRT/VTT captions, and resumable batch runs |
 | `assembly stream` | Real-time transcription from your microphone, a file, or a URL — on macOS it can capture system audio too |
-| `assembly dictate` | Push-to-talk dictation: recording starts immediately, press Enter for instant text (Sync STT API, up to 120 s per utterance) |
+| `assembly dictate` | Signal-driven dictation: records immediately, send SIGTERM for instant text — scriptable from hotkey tools like Hammerspoon (Sync STT API, up to 120 s per utterance) |
 | `assembly agent` | Full-duplex spoken conversation with a voice agent, right in your terminal |
 | `assembly agent-cascade` | Same live conversation, but wired client-side from Streaming STT + the LLM Gateway + streaming TTS, like the `agent-cascade` starter (sandbox-only) |
 | `assembly speak` | Synthesize text to speech over the streaming-TTS WebSocket (sandbox-only) |
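
Reviewer note: the new `assembly dictate` row describes a launch-then-SIGTERM flow. A rough Python sketch of that flow from the caller's side is below. The child process is a stand-in script, not the real CLI, and `dictate_once` is a hypothetical name. It assumes a POSIX host, where `Popen.terminate()` sends SIGTERM.

```python
import subprocess
import sys

# Stand-in for `assembly dictate` (hypothetical, not the real CLI): it "records"
# until SIGTERM flips a latch, then prints a fixed transcript and exits 0.
CHILD = """
import signal, time

stopped = False

def on_term(signum, frame):
    global stopped
    stopped = True

signal.signal(signal.SIGTERM, on_term)
print("recording", flush=True)
while not stopped:
    time.sleep(0.05)  # stands in for reading one short mic chunk
print("hello world")
"""


def dictate_once():
    """Launch the stand-in recorder, stop it with SIGTERM, return (exit code, text)."""
    with subprocess.Popen(
        [sys.executable, "-c", CHILD], stdout=subprocess.PIPE, text=True
    ) as proc:
        # Wait for the child's first line so its SIGTERM handler is installed.
        proc.stdout.readline()
        proc.terminate()  # SIGTERM on POSIX, equivalent to `kill -TERM <pid>`
        transcript = proc.stdout.read()  # blocks until the child exits
    return proc.returncode, transcript.strip()
```

A hotkey tool would split this in two: the key-down handler launches the process, and the key-up handler sends the signal and reads the transcript from stdout.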

diff --git a/aai_cli/AGENTS.md b/aai_cli/AGENTS.md
@@ -32,9 +32,9 @@ between layers is enforced — higher may import lower, never the reverse:
   `help_text`, `typer_patches`, `update_check`.
 - **`core/`** — the Rich-free library layer: `client`, `config`,
   `config_builder`, `keyring_store`, `environments`, `env`, `errors`, `llm`,
-  `telemetry`, `debuglog`, `remotefs`, `sync_stt`, `hotkey`, `ws`, `youtube`,
+  `telemetry`, `debuglog`, `remotefs`, `sync_stt`, `signals`, `ws`, `youtube`,
   `wer`, `argscan`, `jsonshape`, `timeparse`, `microphone`, `procs`, `stdio`,
-  `choices`, `locking`, `config_lock`. Contract 4 also forbids `rich` here, so
+  `choices`. Contract 4 also forbids `rich` here, so
   "no Rich below the UI layer" is structural.
 
 Three things sit *beside* the stack, intentionally unlisted in the layers
@@ -139,7 +139,7 @@ heavily-reworked commands with long bodies; small commands keep the inline
 ### Cross-cutting state (resolution order matters)
 
 - **`app/context.py`** — `AppState` (profile, env) is attached to the Typer context in the root `@app.callback()`. `run_command` is the standard command wrapper.
-- **`core/config.py`** — profiles persisted in `config.toml` (via `platformdirs`); the **API key lives only in the OS keyring**, never in a dotfile. The keyring access itself is factored into **`core/keyring_store.py`** (the single importer of `keyring`, holding `KEYRING_SERVICE = "assemblyai-cli"` + `set_secret`/`get_secret`/`restore_secret`/`delete_secret`/`usable`), so the "secrets never touch the dotfile" split is structural; `config` reads/writes secrets through it and only `config.keyring_usable` re-surfaces the probe on the auth facade. Key resolution order: `--api-key` flag (validation paths only) → `ASSEMBLYAI_API_KEY` env → keyring. **Run commands deliberately expose no `--api-key` flag** so keys can't leak into `ps`/shell history. Every `config.toml` write is a read-modify-write (`_load` → mutate → `_dump`): `_dump` is a temp-file + atomic `os.replace` (a reader never sees a torn file), and the whole RMW runs under a cross-process `filelock` (`config_lock.update`/`.locked`, built on `core/locking.py`) so two concurrent `assembly` processes can't lose each other's updates. Readers stay lock-free. The lock helpers live in `config_lock.py` (not `config.py`) only to keep the latter under the file-length gate; reuse one cached `FileLock` per path so nested writers (`persist_login`) stay reentrant.
+- **`core/config.py`** — profiles persisted in `config.toml` (via `platformdirs`); the **API key lives only in the OS keyring**, never in a dotfile. The keyring access itself is factored into **`core/keyring_store.py`** (the single importer of `keyring`, holding `KEYRING_SERVICE = "assemblyai-cli"` + `set_secret`/`get_secret`/`restore_secret`/`delete_secret`/`usable`), so the "secrets never touch the dotfile" split is structural; `config` reads/writes secrets through it and only `config.keyring_usable` re-surfaces the probe on the auth facade. Key resolution order: `--api-key` flag (validation paths only) → `ASSEMBLYAI_API_KEY` env → keyring. **Run commands deliberately expose no `--api-key` flag** so keys can't leak into `ps`/shell history. Every `config.toml` write is a read-modify-write (`_load` → mutate → `_dump`) via the `config._update` context manager: `_dump` is a temp-file + atomic `os.replace`, so a reader never sees a torn file. Writers and readers are otherwise unsynchronized — last write wins (there is **no** cross-process lock; an earlier `filelock`-based serialization was removed because it was a recurring Windows CI flake and the lost-update race it closed isn't worth the cost for a single-user CLI). On Windows the atomic replace has no replace-over-open guarantee, so both the lock-free read and the `os.replace` ride out the transient `PermissionError` through `config._retry_on_sharing_violation` (a no-op on POSIX).
 - **`core/environments.py`** — a frozen `Environment` (api_base, streaming_host, llm_gateway_base, ams_base, stytch_*). `DEFAULT_ENV` is **`production`**; use `--sandbox` (or `--env sandbox000` / `AAI_ENV`) to target the sandbox. The active environment is a process-global set once at startup; precedence: `--env` → `AAI_ENV` → profile's stored env → default. A credential is only valid against the environment that minted it.
 - **`core/client.py`** — thin wrappers over the `assemblyai` SDK (`transcribe`, `list_transcripts`, `stream_audio`, etc.). It normalizes SDK exceptions: auth failures become a single clean `auth_failure()` `CLIError`; everything else becomes `APIError`. New SDK calls should follow this try/except shape.
 - **`core/errors.py`** — the `CLIError` hierarchy (each with `error_type` + `exit_code`). `ui/output.py` emits errors to **stderr**; stdout stays clean for pipelines. `--json` switches to machine-readable output; it is never auto-enabled — `output.resolve_json()` deliberately keeps human text the default even when piped or agent-run.
@@ -149,7 +149,7 @@ heavily-reworked commands with long bodies; small commands keep the inline
 ### Feature subsystems
 
 - **`streaming/`** + `client.stream_audio` — v3 realtime API. Event callbacks run on the SDK reader thread and guard against `BrokenPipeError` (`stdio.silence_stdout()`) so a closed pipe never dumps a thread traceback.
-- **`core/sync_stt.py`** + **`core/hotkey.py`** + `commands/dictate/` — `assembly dictate`: push-to-talk dictation over the **Sync STT API** (`Environment.sync_base`, one POST `/transcribe` per utterance with the required `X-AAI-Model: u3-sync-pro` header; 80 ms–120 s of PCM/WAV). `hotkey.TerminalKeys` scopes stdin into cbreak (Ctrl-C still signals) and reads single keypresses; `dictate_exec._record` polls it with a zero timeout between ~100 ms mic chunks. All three boundaries (keys, mic, HTTP) are injectable, so the suite never needs a real terminal — `tests/test_hotkey.py` drives a pty pair for the termios behavior.
+- **`core/sync_stt.py`** + **`core/signals.py`** + `commands/dictate/` — `assembly dictate`: headless dictation over the **Sync STT API** (`Environment.sync_base`, one POST `/transcribe` per utterance with the required `X-AAI-Model: u3-sync-pro` header; 80 ms–120 s of PCM/WAV). It needs no terminal: recording starts immediately and `dictate_exec._record` polls `signals.stop_on_terminate` between ~100 ms mic chunks for a SIGTERM, which finishes the utterance (clean exit 0) — so a hotkey tool like Hammerspoon can launch it as a background task and `kill -TERM`/`task:terminate()` to transcribe. SIGINT (Ctrl-C) still cancels (exit 130). All three boundaries (the stop latch, mic, HTTP) are injectable, so the suite never needs a real signal or microphone (`tests/test_dictate_exec.py` scripts the SIGTERM latch). Contrast `signals.terminate_as_interrupt` (used by `stream`/`agent`/`speak`), which routes SIGTERM into the *cancel* path instead.
 - **`agent/`** — full-duplex voice agent (mic in, TTS out via `voices.py`).
 - **`agent_cascade/`** + `commands/agent_cascade/` — `assembly agent-cascade`: the same live terminal conversation as `assembly agent`, but **client-orchestrated** — `engine.run_cascade` wires Streaming STT → the LLM Gateway → streaming TTS itself instead of talking to the Voice Agent endpoint, mirroring what the `agent-cascade` `assembly init` template does server-side. **Sandbox-only** (streaming TTS has no prod host; guarded via `tts.session.require_available`). Reuses the agent slice's `DuplexAudio`/`AgentRenderer` and `core.client.stream_audio`/`core.llm.complete`/`tts.session.synthesize`; the three network legs are injected through `engine.CascadeDeps` (the `tts/session.py` seam) so the cascade — greeting, per-sentence TTS, barge-in, history window — is unit-tested against fakes with no sockets/mic/speaker.
 - **`tts/`** + `commands/speak.py` — `assembly speak` synthesizes text to speech over the sandbox streaming-TTS WebSocket (`streaming-tts.sandbox000.…`). **Sandbox-only:** `session.is_available()` is false in production (empty `Environment.streaming_tts_host`), so the command exits 2 with a `--sandbox` hint. `session.synthesize` drives a Begin→Generate→Flush→Audio→Terminate protocol with an injectable `connect` for hermetic tests (mirrors `agent/session.py`); `audio.py` plays the PCM (default) or writes a WAV (`--out`).
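
Reviewer note: the `config.toml` write path described in the `core/config.py` bullet of this file (temp file, atomic `os.replace`, retry on a transient Windows `PermissionError`) might look roughly like the sketch below. The names `atomic_write_text` and `retry_on_sharing_violation` are hypothetical stand-ins, not the project's `_dump` or `config._retry_on_sharing_violation`, and the attempt count and delay are assumed values.

```python
import os
import tempfile
import time


def retry_on_sharing_violation(fn, attempts=5, delay=0.05):
    """Call fn(), retrying PermissionError on Windows only (hypothetical helper).

    On POSIX the first PermissionError is re-raised at once, so this adds no
    retry behavior there.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except PermissionError:
            if os.name != "nt" or attempt == attempts - 1:
                raise
            time.sleep(delay)


def atomic_write_text(path, text):
    """Write text to path so a concurrent reader sees the old or new file, never a torn one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must sit in the target directory: os.replace is only
    # atomic when source and destination are on the same filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
        retry_on_sharing_violation(lambda: os.replace(tmp, path))
    except BaseException:
        # Leave no stray temp file behind if the write or replace failed.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
```

Without a cross-process lock, two writers that each load, mutate, and write can still overwrite one another's change. The atomic replace only guarantees that whichever write lands last is a complete file, which matches the "last write wins" trade-off stated in the bullet.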

diff --git a/aai_cli/commands/dictate/__init__.py b/aai_cli/commands/dictate/__init__.py
@@ -22,7 +22,11 @@
     rich_help_panel=help_panels.TRANSCRIPTION,
     epilog=examples_epilog(
         [
-            ("Dictate one utterance: recording starts, Enter transcribes it", "assembly dictate"),
+            ("Record until SIGTERM, then print the transcript", "assembly dictate"),
+            (
+                "Stop the recording and transcribe (e.g. from a hotkey tool)",
+                "kill -TERM $(pgrep -f 'assembly dictate')",
+            ),
             (
                 "Pipe the utterance into another command",
                 'assembly dictate | assembly llm "write a conventional commit"',
@@ -75,13 +79,15 @@ def dictate(
         help="Output mode: text (the bare transcript per utterance, pipe-friendly) or json",
     ),
 ) -> None:
-    """Push-to-talk dictation: record the mic, get the transcript back
+    """Signal-driven dictation: record the mic, get the transcript back
 
-    Recording starts immediately; press Enter (or Space) to stop and the
-    utterance is sent to the AssemblyAI Sync API — the transcript prints right
-    away (no polling) and dictate exits, so it flows straight to the next
-    command in a pipe. The recording can be up to 120 seconds long. Press
-    Ctrl-C to cancel without transcribing.
+    Recording starts immediately and runs headless — no terminal needed — so a
+    hotkey tool like Hammerspoon can launch it as a background task and send
+    SIGTERM (kill -TERM, task:terminate()) to stop. On SIGTERM the utterance is
+    sent to the AssemblyAI Sync API, the transcript prints right away (no
+    polling), and dictate exits, so it flows straight to the next command in a
+    pipe. The recording can be up to 120 seconds long. Ctrl-C (SIGINT) cancels
+    without transcribing.
     """
     opts = dictate_exec.DictateOptions(
         language=language,