Add `assembly clip` command to cut media by transcript content by alexkroman · Pull Request #129 · AssemblyAI/cli

alexkroman · 2026-06-12T21:06:12Z

Implements the assembly clip command, which cuts clips out of audio/video files based on speaker labels, text search, LLM-driven selection, or explicit time ranges.

Summary

This PR adds a complete new command (assembly clip) that orchestrates media cutting via ffmpeg, driven by transcript-based selection. The command supports multiple selection sources (speaker/search filters, LLM Gateway model picks, explicit ranges), handles YouTube/media-page downloads via yt-dlp, accepts piped transcripts on stdin, and outputs clips with optional padding and merging.

Key Changes

Core Implementation:

aai_cli/clip_exec.py (369 lines): Main orchestration logic for validation, transcript resolution, segment selection, and ffmpeg invocation. Handles local files, YouTube URLs, piped transcripts (-t -), and LLM-driven selection via the LLM Gateway.
aai_cli/clip_select.py (198 lines): Pure selection logic—range parsing (seconds and clock times like 1:30-2:45), utterance filtering by speaker/search, segment merging with padding, and LLM reply parsing.
aai_cli/commands/clip.py (128 lines): Typer CLI command definition with all flags (--speaker, --search, --llm, --range, --padding, --out-dir, --transcript-id, etc.) and help text.
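
For reviewers who want a concrete picture of the range parsing described for clip_select, here is a minimal sketch. It is not the PR's code: the function names `parse_clock` and `parse_range` are hypothetical, and the accepted formats (plain seconds, `M:SS`, `H:MM:SS`) are an assumption based on the `1:30-2:45` example above.

```python
def parse_clock(text: str) -> float:
    """Convert '90', '1:30', or '0:01:30' into seconds (hypothetical helper)."""
    parts = text.strip().split(":")
    if not 1 <= len(parts) <= 3 or any(p.strip() == "" for p in parts):
        raise ValueError(f"bad time: {text!r}")
    seconds = 0.0
    for part in parts:
        # Each colon shifts the running total up by a factor of 60.
        seconds = seconds * 60 + float(part)
    return seconds


def parse_range(spec: str) -> tuple[float, float]:
    """Parse 'START-END' into a (start, end) pair of seconds (hypothetical helper)."""
    start_text, sep, end_text = spec.partition("-")
    if not sep:
        raise ValueError(f"expected START-END, got {spec!r}")
    start, end = parse_clock(start_text), parse_clock(end_text)
    if end <= start:
        raise ValueError(f"range end must be after start: {spec!r}")
    return start, end


print(parse_range("1:30-2:45"))  # (90.0, 165.0)
print(parse_range("12.5-20"))    # (12.5, 20.0)
```

Keeping this kind of parsing free of I/O is what lets it be tested without faking ffmpeg or the network, which matches the pure/orchestration split the PR describes.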

Test Suite:

tests/test_clip_exec.py (362 lines): Tests validation, ffmpeg orchestration, range-only cutting, and transcript-backed selection (ffmpeg boundary faked).
tests/test_clip_select.py (209 lines): Tests pure selection logic—range parsing, segment merging, utterance filtering, LLM listing/reply contract, and clock formatting.
tests/test_clip_sources.py (294 lines): Tests YouTube/media-page downloads, stdin transcript piping (-t -), and LLM-driven selection (all boundaries faked).
tests/test_clip_command.py (158 lines): CLI-level tests for argv parsing, error rendering, and command placement in help.
tests/_clip_helpers.py (67 lines): Shared test builders (option defaults, transcript fakes, ffmpeg recording).

Integration:

Updated aai_cli/main.py to register the clip command sub-app.
Updated .importlinter to allow clip_exec and clip_select modules.
Updated AGENTS.md to document the command layer architecture.
Updated aai_cli/skills/aai-cli/references/transcription.md with clip command documentation.
Updated help snapshot tests in tests/__snapshots__/test_snapshots_help_run.ambr.
Updated tests/_snapshot_surface.py to include clip in the help group.
Updated README.md to list the new command.
Updated tests/test_smoke.py to verify command ordering.

Notable Implementation Details

Selection composition: --speaker and --search filter utterances first; --llm then picks windows from the filtered set (or the whole transcript if unfiltered). --range adds explicit segments. All sources merge and overlap-coalesce.
Transcript sources: Transcripts can be made fresh (with speaker labels), fetched by ID, or piped as JSON on stdin (-t -), avoiding re-transcription.
YouTube support: Media-page URLs are downloaded via youtube.download_audio() into a temp directory; clips land in --out-dir or the current directory.
LLM integration: The LLM Gateway receives a timestamped utterance listing and returns JSON segment picks; the reply is parsed robustly (handles markdown code blocks, surrounding text).
Padding & merging: Segments are padded (clamped at 0), sorted, and coalesced where they touch or overlap, so consecutive utterances don't shatter into per-sentence files.
ffmpeg orchestration: Each surviving segment is re-encoded into its own file (`.clip
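
The padding and merging step above can be sketched as follows. This is an illustration under stated assumptions, not the implementation in clip_select: the name `pad_and_merge` is hypothetical, segments are assumed to be `(start, end)` pairs in seconds, and only the lower bound is clamped (the description mentions clamping at 0 and nothing about the media duration).

```python
def pad_and_merge(
    segments: list[tuple[float, float]], padding: float = 0.0
) -> list[tuple[float, float]]:
    """Pad each segment, clamp starts at 0, then coalesce touching or overlapping ones."""
    padded = sorted((max(0.0, start - padding), end + padding) for start, end in segments)
    merged: list[tuple[float, float]] = []
    for start, end in padded:
        if merged and start <= merged[-1][1]:
            # Touches or overlaps the previous segment: extend it instead of starting a new clip.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


# Two back-to-back utterances collapse into one clip; the distant one stays separate.
print(pad_and_merge([(10.0, 12.0), (12.0, 15.0), (30.0, 31.0)]))
# [(10.0, 15.0), (30.0, 31.0)]

# Padding is clamped so a clip never starts before 0.
print(pad_and_merge([(0.2, 1.0)], padding=0.5))
# [(0.0, 1.5)]
```

Using `<=` in the comparison is what makes segments that merely touch merge, which is the behavior that keeps consecutive utterances from becoming per-sentence files.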

https://claude.ai/code/session_011SdBCjATahktayRZfjmwWk

A FunClip-style transcript-driven clipping command. assembly clip cuts a local audio/video file (or a YouTube/media-page URL, downloaded via yt-dlp) with ffmpeg, selecting segments four composable ways:

- --speaker A / --search "topic": filter diarized utterances (the file is transcribed with speaker labels on the fly, or reuse one with -t TRANSCRIPT_ID, or pipe `transcribe --json` output in with -t -)
- --llm "the best moments": the timestamped utterances go to LLM Gateway and the model picks the windows (composes with the filters)
- --range 1:30-2:45: explicit windows, no transcript needed

Selections are padded (--padding), merged where they touch, and each surviving segment is re-encoded to <name>.clipNN<ext> (next to the input, or --out-dir; downloads land in the cwd). --json emits the written clips with start/end/duration.

The pure selection logic (range parsing, utterance filtering, LLM reply parsing, merging) lives in clip_select; orchestration (transcript resolution, yt-dlp, ffmpeg) in clip_exec, following the options/run split with commands/clip.py as the thin argv surface.

https://claude.ai/code/session_011SdBCjATahktayRZfjmwWk
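
To illustrate the `<name>.clipNN<ext>` naming and the per-segment re-encode, here is a rough sketch. Everything in it is an assumption rather than the PR's code: the helper names `clip_path` and `ffmpeg_args` are hypothetical, NN is assumed to be a two-digit index starting at 1, and the ffmpeg options shown (`-y`, `-i`, `-ss`, `-to`) are one common way to cut a window with re-encoding, not necessarily the invocation clip_exec builds.

```python
from __future__ import annotations

from pathlib import Path


def clip_path(source: Path, index: int, out_dir: Path | None = None) -> Path:
    """Build <name>.clipNN<ext>, next to the source unless out_dir is given (hypothetical)."""
    directory = out_dir if out_dir is not None else source.parent
    return directory / f"{source.stem}.clip{index:02d}{source.suffix}"


def ffmpeg_args(source: Path, start: float, end: float, dest: Path) -> list[str]:
    """Assemble an ffmpeg argv that re-encodes the [start, end] window (assumed flags)."""
    return [
        "ffmpeg",
        "-y",                     # overwrite an existing output file
        "-i", str(source),        # input media
        "-ss", f"{start:.3f}",    # window start, in seconds
        "-to", f"{end:.3f}",      # window end, in seconds
        str(dest),                # no "-c copy", so ffmpeg re-encodes
    ]


source = Path("talk.mp4")
dest = clip_path(source, 1)
print(dest)  # talk.clip01.mp4
print(ffmpeg_args(source, 90.0, 165.0, dest))
```

Returning the argv as a list, rather than running it, mirrors how the tests are described: the ffmpeg boundary can be faked by recording the arguments instead of spawning a process.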

Combines the new clip command from #129 with dictate: both registered in the command order, help groups, import-linter contracts, and AGENTS.md (seven options/run-split exec modules now); run-group help snapshot regenerated.

Note: the full check.sh gate was deliberately skipped for this merge commit at the operator's request; the default pytest suite passed on the merged tree (2210 passed).

https://claude.ai/code/session_01FCXQLAyo8xpZiXrQ7hCMAf

alexkroman enabled auto-merge June 12, 2026 21:06

alexkroman added this pull request to the merge queue Jun 12, 2026

Merged via the queue into main with commit e53d28e Jun 12, 2026
16 checks passed

alexkroman deleted the claude/bold-volta-b0y5c2 branch June 12, 2026 21:13

alexkroman merged 1 commit into main from claude/bold-volta-b0y5c2
