Merged
2 changes: 1 addition & 1 deletion aai_cli/AGENTS.md
@@ -139,7 +139,7 @@ heavily-reworked commands with long bodies; small commands keep the inline
### Cross-cutting state (resolution order matters)

- **`app/context.py`** — `AppState` (profile, env) is attached to the Typer context in the root `@app.callback()`. `run_command` is the standard command wrapper.
- **`core/config.py`** — profiles persisted in `config.toml` (via `platformdirs`); the **API key lives only in the OS keyring**, never in a dotfile. The keyring access itself is factored into **`core/keyring_store.py`** (the single importer of `keyring`, holding `KEYRING_SERVICE = "assemblyai-cli"` + `set_secret`/`get_secret`/`restore_secret`/`delete_secret`/`usable`), so the "secrets never touch the dotfile" split is structural; `config` reads/writes secrets through it and only `config.keyring_usable` re-surfaces the probe on the auth facade. Key resolution order: `--api-key` flag (validation paths only) → `ASSEMBLYAI_API_KEY` env → keyring. **Run commands deliberately expose no `--api-key` flag** so keys can't leak into `ps`/shell history. Every `config.toml` write is a read-modify-write (`_load` → mutate → `_dump`) via the `config._update` context manager: `_dump` is a temp-file + atomic `os.replace`, so a reader never sees a torn file. Writers and readers are otherwise unsynchronized — last write wins (there is **no** cross-process lock; an earlier `filelock`-based serialization was removed because it was a recurring Windows CI flake and the lost-update race it closed isn't worth the cost for a single-user CLI). On Windows the atomic replace has no replace-over-open guarantee, so both the lock-free read and the `os.replace` ride out the transient `PermissionError` through `config._retry_on_sharing_violation` (a no-op on POSIX).
- **`core/config.py`** — profiles persisted in `config.toml` (via `platformdirs`); the **API key lives only in the OS keyring**, never in a dotfile. The keyring access itself is factored into **`core/keyring_store.py`** (the single importer of `keyring`, holding `KEYRING_SERVICE = "assemblyai-cli"` + `set_secret`/`get_secret`/`restore_secret`/`delete_secret`/`usable`), so the "secrets never touch the dotfile" split is structural; `config` reads/writes secrets through it and only `config.keyring_usable` re-surfaces the probe on the auth facade. Key resolution order: `--api-key` flag (validation paths only) → `ASSEMBLYAI_API_KEY` env → keyring. **Run commands deliberately expose no `--api-key` flag** so keys can't leak into `ps`/shell history. The `config.toml` document schema (`Profile`/`Config`/`StoredSession`) and its parse/cache/atomic-write machinery live one layer down in **`core/config_store.py`** (the same factoring as `keyring_store`): `config` is the auth/profile facade and reads as plain accessors over `config_store.load`/`dump`/`update`, so the store rules stay structural. Every `config.toml` write is a read-modify-write (`load` → mutate → `dump`) via the `config_store.update` context manager: `dump` is a temp-file + atomic `os.replace`, so a reader never sees a torn file. Writers and readers are otherwise unsynchronized — last write wins (there is **no** cross-process lock; an earlier `filelock`-based serialization was removed because it was a recurring Windows CI flake and the lost-update race it closed isn't worth the cost for a single-user CLI). On Windows the atomic replace has no replace-over-open guarantee, so both the lock-free read and the `os.replace` ride out the transient `PermissionError` through `config_store._retry_on_sharing_violation` (a no-op on POSIX). Tests isolate the config dir by patching `config_store.config_dir` (the autouse `tmp_config` fixture).
- **`core/environments.py`** — a frozen `Environment` (api_base, streaming_host, llm_gateway_base, ams_base, stytch_*). `DEFAULT_ENV` is **`production`**; use `--sandbox` (or `--env sandbox000` / `AAI_ENV`) to target the sandbox. The active environment is a process-global set once at startup; precedence: `--env` → `AAI_ENV` → profile's stored env → default. A credential is only valid against the environment that minted it.
- **`core/client.py`** — thin wrappers over the `assemblyai` SDK (`transcribe`, `list_transcripts`, `stream_audio`, etc.). It normalizes SDK exceptions: auth failures become a single clean `auth_failure()` `CLIError`; everything else becomes `APIError`. New SDK calls should follow this try/except shape.
- **`core/errors.py`** — the `CLIError` hierarchy (each with `error_type` + `exit_code`). `ui/output.py` emits errors to **stderr**; stdout stays clean for pipelines. `--json` switches to machine-readable output; it is never auto-enabled — `output.resolve_json()` deliberately keeps human text the default even when piped or agent-run.
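
The read-modify-write rule described for `config.toml` above can be sketched as follows. This is a minimal illustration, not the project's `config_store` code: the names `load`, `dump`, `update`, and `retry_on_sharing_violation` only mirror the ones mentioned in the bullet, their signatures are assumptions, and JSON stands in for TOML so the sketch needs nothing beyond the standard library.

```python
import contextlib
import json
import os
import sys
import tempfile
import time
from pathlib import Path
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


def retry_on_sharing_violation(action: Callable[[], T], attempts: int = 5, delay: float = 0.02) -> T:
    """Retry ``action`` on PermissionError on Windows; call it once elsewhere.

    Hypothetical helper: the retry count and delay are illustrative values.
    """
    if sys.platform != "win32":
        return action()
    for _ in range(attempts - 1):
        try:
            return action()
        except PermissionError:
            time.sleep(delay)
    return action()  # final attempt: let the error propagate


def load(path: Path) -> dict:
    """Lock-free read; a missing file reads as an empty document."""
    try:
        text = retry_on_sharing_violation(lambda: path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        return {}
    return json.loads(text)


def dump(path: Path, doc: dict) -> None:
    """Write a temp file in the same directory, then atomically replace."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(doc, handle)
        retry_on_sharing_violation(lambda: os.replace(tmp_name, path))
    except BaseException:
        # Never leave a stray temp file behind on failure.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise


@contextlib.contextmanager
def update(path: Path) -> Iterator[dict]:
    """Read-modify-write: load, let the caller mutate, dump on clean exit."""
    doc = load(path)
    yield doc
    dump(path, doc)  # skipped when the ``with`` body raises


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg = Path(tmp) / "config.json"
        with update(cfg) as doc:
            doc["active_profile"] = "default"
        print(load(cfg))
```

Because the temp file is created in the destination directory, `os.replace` stays on one filesystem, which is what makes the swap atomic. Two concurrent `update` calls can still lose one writer's change, matching the "last write wins" behaviour the bullet accepts.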
104 changes: 104 additions & 0 deletions aai_cli/commands/clip/_cut.py
@@ -0,0 +1,104 @@
"""ffmpeg cutting for `assembly clip`: silence detection + per-segment re-encode.

The pure selection logic (range parsing, utterance filtering, merging) lives in
``_select``; this module is the final stage — it turns the merged ``Segment``
windows into output files with ffmpeg: one ``silencedetect`` pass to snap cuts
into nearby pauses, then a frame-accurate re-encode per segment. The
orchestration that ties selection and cutting together stays in ``_exec``.
"""

from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path

from rich.markup import escape

from aai_cli.app import mediafile
from aai_cli.commands.clip import _select as clip_select
from aai_cli.commands.clip._select import Segment
from aai_cli.ui import output

# -30dB for at least 0.2s reads as a pause in normal speech recordings.
SILENCE_FILTER = "silencedetect=noise=-30dB:d=0.2"


def detect_silences(ffmpeg: str, media: Path) -> list[Segment]:
    """The silence intervals ffmpeg hears in ``media`` (one decode pass).

    Snapping is best-effort: a failed detection returns no silences (so the
    cut proceeds at the selected times) rather than failing the command.
    silencedetect logs at info level on stderr, so the usual ``-loglevel
    error`` would silence the very lines this parses.
    """
    result = mediafile.run_ffmpeg(
        [
            ffmpeg,
            "-hide_banner",
            "-nostats",
            "-i",
            str(media),
            "-af",
            SILENCE_FILTER,
            "-f",
            "null",
            "-",
        ]
    )
    if result.returncode != 0:
        return []
    return clip_select.parse_silences(result.stderr)


def cut_clip(ffmpeg: str, media: Path, segment: Segment, dest: Path) -> None:
    """Re-encode one segment of ``media`` into ``dest``.

    Re-encoding (no ``-c copy``) keeps cuts frame-accurate where stream copy
    would snap to the nearest keyframe; ``-y`` makes a re-run overwrite its own
    earlier output instead of stalling on ffmpeg's prompt.
    """
    result = mediafile.run_ffmpeg(
        [
            ffmpeg,
            "-hide_banner",
            "-loglevel",
            "error",
            "-y",
            "-i",
            str(media),
            "-ss",
            f"{segment.start:.3f}",
            "-to",
            f"{segment.end:.3f}",
            mediafile.path_arg(dest),
        ]
    )
    if result.returncode != 0:
        raise mediafile.ffmpeg_failure(result, "cut", dest, error_type="clip_failed")


def clip_dest(media: Path, out_dir: Path | None, index: int) -> Path:
    directory = out_dir if out_dir is not None else media.parent
    return directory / f"{media.stem}.clip{index:02d}{media.suffix}"


@dataclass(frozen=True)
class WrittenClip:
    """One output file and the source window it was cut from."""

    path: Path
    segment: Segment

    def payload(self) -> dict[str, object]:
        return {
            "path": str(self.path),
            "start": round(self.segment.start, 3),
            "end": round(self.segment.end, 3),
            "duration": round(self.segment.end - self.segment.start, 3),
        }

    def human_line(self) -> str:
        start = clip_select.format_clock(self.segment.start)
        end = clip_select.format_clock(self.segment.end)
        duration = round(self.segment.end - self.segment.start, 3)
        return output.success(f"{escape(str(self.path))} {start} - {end} ({duration}s)")
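
`detect_silences` hands ffmpeg's stderr to `clip_select.parse_silences`, which is not part of this diff. As a rough illustration of what that step has to do, the sketch below pairs the `silence_start` / `silence_end` lines that the `silencedetect` filter logs. The function name `parse_silences_sketch`, the regexes, and the `Segment` stand-in are assumptions made for this example, not the project's implementation.

```python
import re
from typing import NamedTuple


class Segment(NamedTuple):
    """Hypothetical stand-in for ``_select.Segment``: a window in seconds."""

    start: float
    end: float


# silencedetect logs lines such as:
#   [silencedetect @ 0x...] silence_start: 1.5
#   [silencedetect @ 0x...] silence_end: 2.25 | silence_duration: 0.75
_START = re.compile(r"silence_start:\s*(-?\d+(?:\.\d+)?)")
_END = re.compile(r"silence_end:\s*(-?\d+(?:\.\d+)?)")


def parse_silences_sketch(stderr: str) -> list[Segment]:
    """Pair each silence_start with the next silence_end, in log order."""
    silences: list[Segment] = []
    pending: float | None = None
    for line in stderr.splitlines():
        if match := _START.search(line):
            pending = float(match.group(1))
        elif (match := _END.search(line)) and pending is not None:
            silences.append(Segment(pending, float(match.group(1))))
            pending = None
    # A trailing silence_start with no matching end is dropped.
    return silences


sample = """\
[silencedetect @ 0x5581] silence_start: 1.5
[silencedetect @ 0x5581] silence_end: 2.25 | silence_duration: 0.75
[silencedetect @ 0x5581] silence_start: 7.0
"""
print(parse_silences_sketch(sample))
```

Returning an empty list for unparseable input fits the best-effort contract in the docstring above: with no silences, the cut simply proceeds at the selected times.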
105 changes: 10 additions & 95 deletions aai_cli/commands/clip/_exec.py
@@ -5,7 +5,8 @@
options/run split, see AGENTS.md), so tests drive transcript resolution and the
ffmpeg orchestration by constructing options directly. The pure selection logic
(range parsing, utterance filtering, LLM reply parsing, merging) lives in
``clip_select``.
``clip_select``; the ffmpeg cutting (silence detection + per-segment re-encode)
lives in ``_cut``. This module is the orchestration that ties them together.

Selection composes four sources: ``--speaker`` and ``--search`` filter the
diarized utterances of a transcript (made on the fly, reused via
@@ -25,10 +26,9 @@
from pathlib import Path
from types import SimpleNamespace

from rich.markup import escape

from aai_cli.app import batch, mediafile
from aai_cli.app.context import AppState
from aai_cli.commands.clip import _cut as clip_cut
from aai_cli.commands.clip import _select as clip_select
from aai_cli.commands.clip._select import Segment
from aai_cli.core import jsonshape, llm, stdio, youtube
@@ -230,91 +230,6 @@ def _validate_selection(opts: ClipOptions) -> None:
)


# -30dB for at least 0.2s reads as a pause in normal speech recordings.
_SILENCE_FILTER = "silencedetect=noise=-30dB:d=0.2"


def _detect_silences(ffmpeg: str, media: Path) -> list[Segment]:
    """The silence intervals ffmpeg hears in ``media`` (one decode pass).

    Snapping is best-effort: a failed detection returns no silences (so the
    cut proceeds at the selected times) rather than failing the command.
    silencedetect logs at info level on stderr, so the usual ``-loglevel
    error`` would silence the very lines this parses.
    """
    result = mediafile.run_ffmpeg(
        [
            ffmpeg,
            "-hide_banner",
            "-nostats",
            "-i",
            str(media),
            "-af",
            _SILENCE_FILTER,
            "-f",
            "null",
            "-",
        ]
    )
    if result.returncode != 0:
        return []
    return clip_select.parse_silences(result.stderr)


def _cut_clip(ffmpeg: str, media: Path, segment: Segment, dest: Path) -> None:
    """Re-encode one segment of ``media`` into ``dest``.

    Re-encoding (no ``-c copy``) keeps cuts frame-accurate where stream copy
    would snap to the nearest keyframe; ``-y`` makes a re-run overwrite its own
    earlier output instead of stalling on ffmpeg's prompt.
    """
    result = mediafile.run_ffmpeg(
        [
            ffmpeg,
            "-hide_banner",
            "-loglevel",
            "error",
            "-y",
            "-i",
            str(media),
            "-ss",
            f"{segment.start:.3f}",
            "-to",
            f"{segment.end:.3f}",
            mediafile.path_arg(dest),
        ]
    )
    if result.returncode != 0:
        raise mediafile.ffmpeg_failure(result, "cut", dest, error_type="clip_failed")


def _clip_dest(media: Path, out_dir: Path | None, index: int) -> Path:
    directory = out_dir if out_dir is not None else media.parent
    return directory / f"{media.stem}.clip{index:02d}{media.suffix}"


@dataclass(frozen=True)
class WrittenClip:
    """One output file and the source window it was cut from."""

    path: Path
    segment: Segment

    def payload(self) -> dict[str, object]:
        return {
            "path": str(self.path),
            "start": round(self.segment.start, 3),
            "end": round(self.segment.end, 3),
            "duration": round(self.segment.end - self.segment.start, 3),
        }

    def human_line(self) -> str:
        start = clip_select.format_clock(self.segment.start)
        end = clip_select.format_clock(self.segment.end)
        duration = round(self.segment.end - self.segment.start, 3)
        return output.success(f"{escape(str(self.path))} {start} - {end} ({duration}s)")


def run_clip(opts: ClipOptions, state: AppState, *, json_mode: bool) -> None:
    """Execute `assembly clip`: one source, or a stdin batch (`--from-stdin`)."""
    _validate_out_dir(opts.out_dir)
@@ -407,7 +322,7 @@ def _clip_one(
    state: AppState,
    *,
    json_mode: bool,
) -> tuple[dict[str, object], list[WrittenClip]]:
) -> tuple[dict[str, object], list[clip_cut.WrittenClip]]:
    """Resolve ``opts.media`` to a local file and cut its clips; the payload + clips.

    A media-page URL is downloaded once — the audio track by default, the full
@@ -443,21 +358,21 @@ def _cut(
    state: AppState,
    *,
    json_mode: bool,
) -> tuple[dict[str, object], list[WrittenClip]]:
) -> tuple[dict[str, object], list[clip_cut.WrittenClip]]:
    """Select and cut the clips for an already-local media file; the payload + clips."""
    matched, transcript_id = _transcript_segments(opts, media, state, json_mode=json_mode)
    segments = clip_select.merge_segments([*matched, *explicit], opts.padding)
    if opts.snap:
        with output.status("Detecting silence…", json_mode=json_mode, quiet=state.quiet):
            silences = _detect_silences(ffmpeg, media)
            silences = clip_cut.detect_silences(ffmpeg, media)
        segments = clip_select.snap_to_silences(segments, silences)
    written: list[WrittenClip] = []
    written: list[clip_cut.WrittenClip] = []
    cutting = f"Cutting {len(segments)} clip(s)…"
    with output.status(cutting, json_mode=json_mode, quiet=state.quiet):
        for index, segment in enumerate(segments, 1):
            dest = _clip_dest(media, out_dir, index)
            _cut_clip(ffmpeg, media, segment, dest)
            written.append(WrittenClip(path=dest, segment=segment))
            dest = clip_cut.clip_dest(media, out_dir, index)
            clip_cut.cut_clip(ffmpeg, media, segment, dest)
            written.append(clip_cut.WrittenClip(path=dest, segment=segment))
    payload: dict[str, object] = {
        "source": opts.media,
        "transcript_id": transcript_id,
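
The hunk above calls `clip_select.snap_to_silences(segments, silences)`, whose body is not shown in this diff. One plausible reading of "snap cuts into nearby pauses" is sketched below: each boundary moves to the midpoint of the closest silence within a tolerance, and a window is left alone if snapping would collapse it. The names `snap_point` and `snap_segments`, the midpoint rule, and the 0.5 s tolerance are all assumptions for illustration; the real function may behave differently.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """Hypothetical stand-in for ``_select.Segment``: a window in seconds."""

    start: float
    end: float


def snap_point(t: float, silences: list[Segment], tolerance: float) -> float:
    """Move ``t`` to the midpoint of the closest silence within ``tolerance``."""
    best = t
    best_distance = tolerance
    for silence in silences:
        # Distance from t to the silence interval (0.0 when t lies inside it).
        distance = max(silence.start - t, t - silence.end, 0.0)
        if distance <= best_distance:
            best_distance = distance
            best = (silence.start + silence.end) / 2
    return best


def snap_segments(
    segments: list[Segment], silences: list[Segment], tolerance: float = 0.5
) -> list[Segment]:
    """Snap both boundaries of every segment; keep windows that would collapse."""
    snapped: list[Segment] = []
    for segment in segments:
        start = snap_point(segment.start, silences, tolerance)
        end = snap_point(segment.end, silences, tolerance)
        # Fall back to the original window if snapping inverts or empties it.
        snapped.append(Segment(start, end) if start < end else segment)
    return snapped


silences = [Segment(0.75, 1.25), Segment(5.0, 5.5)]
print(snap_segments([Segment(1.5, 4.75), Segment(10.0, 12.0)], silences))
```

With an empty `silences` list every segment comes back unchanged, which is the behaviour the best-effort `detect_silences` relies on when detection fails.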
6 changes: 3 additions & 3 deletions aai_cli/commands/config.py
@@ -14,7 +14,7 @@

from aai_cli import command_registry, help_panels, options
from aai_cli.app.context import AppState, run_command
from aai_cli.core import config, environments
from aai_cli.core import config, config_store, environments
from aai_cli.core.choices import ConfigKey
from aai_cli.core.errors import UsageError
from aai_cli.ui import output
@@ -104,7 +104,7 @@ def path(
    """Print where config.toml lives"""

    def body(_state: AppState, json_mode: bool) -> None:
        file = config.config_file_path()
        file = config_store.config_file_path()
        if json_mode:
            output.emit({"path": str(file)}, str, json_mode=True)
        else:
@@ -134,7 +134,7 @@ def list_settings(

    def body(_state: AppState, json_mode: bool) -> None:
        data: dict[str, object] = {
            "path": str(config.config_file_path()),
            "path": str(config_store.config_file_path()),
            "active_profile": config.get_active_profile(),
            "profiles": config.list_profiles(),
            "telemetry_enabled": config.get_telemetry_enabled(),