Add visual-regression snapshot testing for the code/live TUIs by alexkroman · Pull Request #253 · AssemblyAI/cli

alexkroman · 2026-06-19T03:59:07Z

The two Textual apps (assembly code / live) are the most layout-fragile
surface in the repo: a one-line CSS edit silently reflows the whole frame,
and the pilot tests only ever assert one widget/region at a time. Add
pytest-textual-snapshot SVG goldens that pin the painted frame — splash,
prompt bar, status line, voice bar, transcript widgets, and the docked
approval/ask modals.

Building the harness surfaced a real bug: a docked container with
width: 1fr (the Horizontal/Vertical default) ignores horizontal margin
and overflows the screen, clipping its rounded right border off-edge.
This affected #promptbar (code TUI) and the #approvalbox / #askbox modals
— all three rendered as unclosed boxes on the right. Fixed by switching
them to width: 100%, which honors the side margins, and added precise
region assertions (box.region.right <= screen width) so a regression is
killed deterministically, not only by the SVG diff.

tests/_tui_snapshot.py freezes the four sources of non-determinism
(version string, voice-bar animation timer, the cascade worker, and the
cwd/branch status line); the strategy is documented in tests/AGENTS.md.

Co-Authored-By: Claude Opus 4.8 (1M context) noreply@anthropic.com
Claude-Session: https://claude.ai/code/session_01BCoAfjg7pCUxD3nxAhGcRL

The two Textual apps (assembly code / live) are the most layout-fragile surface in the repo: a one-line CSS edit silently reflows the whole frame, and the pilot tests only ever assert one widget/region at a time. Add pytest-textual-snapshot SVG goldens that pin the painted frame — splash, prompt bar, status line, voice bar, transcript widgets, and the docked approval/ask modals. Building the harness surfaced a real bug: a docked container with `width: 1fr` (the Horizontal/Vertical default) ignores horizontal margin and overflows the screen, clipping its rounded right border off-edge. This affected #promptbar (code TUI) and the #approvalbox / #askbox modals — all three rendered as unclosed boxes on the right. Fixed by switching them to `width: 100%`, which honors the side margins, and added precise region assertions (box.region.right <= screen width) so a regression is killed deterministically, not only by the SVG diff. tests/_tui_snapshot.py freezes the four sources of non-determinism (version string, voice-bar animation timer, the cascade worker, and the cwd/branch status line); the strategy is documented in tests/AGENTS.md. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01BCoAfjg7pCUxD3nxAhGcRL

The project-wide 90% coverage gate is an average, so a layout-fragile TUI module could rot while the rest of the suite carries it. Add a per-surface floor: after the main pytest run, `coverage report --fail-under=90` over just the Textual modules, reusing the .coverage data already written (no re-run, branches included). The module set is derived — every aai_cli file that imports `textual` — so a new TUI module is held to the floor automatically with no list to hand-maintain. Documented in the AGENTS.md gate-stage list and the tests/AGENTS.md Textual testing section. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01BCoAfjg7pCUxD3nxAhGcRL

Pin four more regression-prone painted states the initial suite missed: - approval modal with args expanded (`e`) — the multi-line content reveal - tool output collapsed — the clipped preview + "(Ctrl+O to expand)" hint - tool output expanded (Ctrl+O) — full content + collapse hint - live "thinking" phase — the amber voice bar between turns Deliberately did not add a snapshot for the `assembly code` voice-input mode: it was the other high-priority candidate, but voice is slated for removal, so pinning it would be counterproductive. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01BCoAfjg7pCUxD3nxAhGcRL

Pin the remaining medium-value painted states: assembly code: - working spinner ("✶ Working… (7s)") docked above the prompt - reply mid-stream (plain text, literal markdown) before the Markdown swap - approval prompt for a benign command (no warning label) - failed turn rendered as a red ✗ error line assembly live: - interim (partial) user transcript growing in place while listening - mid-turn tool-call progress note ("Searching the web…") - interrupted reply, finalized and tagged "(interrupted)" - cascade failure surfaced as a red ✗ error line The spinner frame is rendered through the real _spinner_text formatter with a fixed elapsed/frame so it doesn't depend on wall-clock timing. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01BCoAfjg7pCUxD3nxAhGcRL

Voice mode is staying, so pin its chrome: in voice mode the prompt bar is swapped for the animated listening bar (with the "(Ctrl-V to type)" hint) and the status line gains the green "● voice on" badge. The mic-capture leg is stubbed (_SnapshotCodeApp._begin_listening no-op) so the listening state renders synchronously without a background thread racing the screenshot. FakeVoice satisfies _VoiceIO and is covered by a small unit test since no render reaches it. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01BCoAfjg7pCUxD3nxAhGcRL

aikido-pr-checks · 2026-06-19T03:59:38Z

+# imports `textual` — so a new TUI module is picked up automatically with no list to
+# hand-maintain. Reuses the .coverage data the pytest step just wrote (no re-run), and
+# counts branches because that data was collected with --cov-branch.
+tui_modules="$(git grep -lP '^\s*(from|import) textual' -- 'aai_cli/**/*.py' | paste -sd, -)"


The if [[ -z "$tui_modules" ]] branch is effectively unreachable: with set -euo pipefail, git grep ... | paste ... exits early on no matches before that check runs.

Suggested change

tui_modules="$(git grep -lP '^\s*(from|import) textual' -- 'aai_cli/**/*.py' | paste -sd, -)"

tui_modules="$(git grep -lP '^\s*(from|import) textual' -- 'aai_cli/**/*.py' | paste -sd, - || true)"

Details

✨ AI Reasoning
1) The new block tries to derive module paths and then handle the "none found" case explicitly.
2) The pipeline used for derivation returns failure when no files match.
3) In the current shell mode, that failure aborts execution before the explicit fallback branch is evaluated.
4) This creates contradictory control flow: a branch intended to handle an empty result is effectively unreachable in the exact scenario it describes.
5) This is a definite logic issue in the changed control flow, not a style concern.

_{Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.}
_{Reply @AikidoSec ignore: [REASON] to ignore this issue.}
_{More info}

claude added 5 commits June 19, 2026 03:18

alexkroman enabled auto-merge June 19, 2026 03:59

aikido-pr-checks Bot reviewed Jun 19, 2026

View reviewed changes

alexkroman added this pull request to the merge queue Jun 19, 2026

Merged via the queue into main with commit 7c8af2c Jun 19, 2026
20 checks passed

alexkroman deleted the claude/textual-testing-strategy-9jt37q branch June 19, 2026 04:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add visual-regression snapshot testing for the code/live TUIs#253

Add visual-regression snapshot testing for the code/live TUIs#253
alexkroman merged 5 commits into
mainfrom
claude/textual-testing-strategy-9jt37q

alexkroman commented Jun 19, 2026

Uh oh!

aikido-pr-checks Bot Jun 19, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	tui_modules="$(git grep -lP '^\s(from\|import) textual' -- 'aai_cli//.py' \| paste -sd, -)"
	tui_modules="$(git grep -lP '^\s(from\|import) textual' -- 'aai_cli//.py' \| paste -sd, - \|\| true)"

Conversation

alexkroman commented Jun 19, 2026

Uh oh!

aikido-pr-checks Bot Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

aikido-pr-checks Bot Jun 19, 2026 •

edited

Loading