Add --save-dir finalization: auto-name, note, and sidecar #200

Merged
alexkroman merged 3 commits into main from claude/serene-heisenberg-7onpfp on Jun 17, 2026

Conversation

@alexkroman

Collaborator

Implements the post-streaming finalization for assembly stream --save-dir: auto-naming recordings from transcript content, writing LLM-generated notes, and creating metadata sidecars.

Summary

This PR completes the --save-dir feature by adding three finalization steps that run after streaming ends:

  1. Auto-naming (--auto-name): Derives a short title from the transcript via the LLM and renames the provisional timestamp-only files to include that slug (e.g., 2026-06-16-143005-quarterly-review.txt).
  2. Note writing (--llm + --save-dir): Writes the final LLM answer as a .md file alongside the transcript.
  3. Sidecar metadata (always): Creates a .aai.json file with title, date, duration, speaker list, turn count, and file references — enabling rich list/browse UIs without parsing transcripts.
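As a rough illustration of step 3, the sidecar could be assembled and written as below. This is a minimal sketch, not the code in savedir.py: the function names, the JSON key names, and the `.txt`/`.wav` extensions are assumptions; only the `.aai.json` suffix, the `.md` note, and the list of recorded fields come from this description.

```python
import json
import tempfile
from pathlib import Path

SIDECAR_SUFFIX = ".aai.json"  # suffix named in this PR


def build_sidecar(stem, title, date, duration_s, speakers, turns, has_audio, has_note):
    """Return the metadata dict. Every key name here is hypothetical."""
    files = {"transcript": f"{stem}.txt"}
    if has_audio:  # omitted when --no-save-audio is set
        files["audio"] = f"{stem}.wav"
    if has_note:  # present only when --llm produced a final answer
        files["note"] = f"{stem}.md"
    return {
        "title": title,
        "date": date,
        "duration_seconds": duration_s,
        "speakers": sorted(set(speakers)),  # unique speaker labels
        "turns": turns,
        "files": files,
    }


def write_sidecar(directory: Path, stem: str, meta: dict) -> Path:
    """Write the metadata next to the recording and return the sidecar path."""
    path = directory / f"{stem}{SIDECAR_SUFFIX}"
    path.write_text(json.dumps(meta, indent=2) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        meta = build_sidecar(
            "2026-06-16-143005-quarterly-review", "Quarterly review",
            "2026-06-16", 912.5, ["A", "B", "A"], 42,
            has_audio=True, has_note=False,
        )
        print(write_sidecar(Path(tmp), "2026-06-16-143005-quarterly-review", meta))
```

A browse UI can then list recordings by reading only the `*.aai.json` files, which is the point of keeping the file references in the sidecar.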

Key Changes

  • New module aai_cli/streaming/savedir.py: Core finalization logic

    • SaveDirPlan: Immutable dataclass capturing the resolved --save-dir intent
    • derive_title(): Calls the LLM to generate a short headline from the transcript
    • write_outputs(): Orchestrates the rename, note write, and sidecar creation
    • Error handling: an OSError is re-raised as a clean CLIError of type save_dir_path
  • New module aai_cli/streaming/batch.py: Extracted batch streaming logic

    • stream_batch_sources(): Drives sequential streaming of stdin sources
    • Moved from session.py to keep session focused on single-run state
    • Handles per-source failures and Ctrl-C/pipe cleanup
  • Updated aai_cli/streaming/session.py:

    • Added save_plan, _meta_lines, _meta_speakers, _capture_start, _last_answer fields to track metadata for finalization
    • _note_meta(): Records finalized turn text and speaker labels for the sidecar
    • _finalize_save_dir(): Calls derive_title() (when --auto-name and transcript non-empty) and write_outputs() with collected metadata
    • Graceful error handling: failed title derivation warns but still saves the recording
  • Updated aai_cli/streaming/naming.py:

    • SavePaths refactored from two fields (transcript, audio) to computed properties (transcript, audio, note, sidecar) derived from directory and stem
    • Added SIDECAR_SUFFIX constant (.aai.json)
  • Updated aai_cli/commands/stream/_exec.py:

    • Added auto_name and no_save_audio flags to StreamOptions
    • _resolve_save_targets() now returns a SaveDirPlan as the third element
    • Validation: --auto-name and --no-save-audio require --save-dir; --auto-name and --name are mutually exclusive
    • stream_batch_sources is now imported from aai_cli/streaming/batch.py
  • Test infrastructure:

    • New tests/_stream_helpers.py: Shared fakes (FakeMic, RecordingMic, FakeTurn, emit_turns, FixedDatetime, DEFAULTS) used by both test_stream_exec.py and new test files
    • New tests/test_streaming_savedir.py: 218 lines of unit tests for write_outputs, derive_title, and error handling (pure file I/O, LLM mocked)
    • New tests/test_stream_save_dir.py: End-to-end tests of --save-dir through run_stream (real session + savedir, LLM mocked)
    • Updated tests/test_stream_exec.py:

https://claude.ai/code/session_01KNx966tACLPYX4B5jkcfqp
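The SavePaths refactor listed under Key Changes (paths as computed properties derived from directory and stem) might look roughly like this. It is a sketch under assumptions: the frozen dataclass, the `.wav` extension, and the `with_slug` helper are hypothetical and not taken from naming.py.

```python
from dataclasses import dataclass
from pathlib import Path

SIDECAR_SUFFIX = ".aai.json"  # constant named in this PR


@dataclass(frozen=True)
class SavePaths:
    """Hypothetical stand-in: every output path derives from directory + stem."""

    directory: Path
    stem: str

    @property
    def transcript(self) -> Path:
        return self.directory / f"{self.stem}.txt"

    @property
    def audio(self) -> Path:
        # The ".wav" extension is an assumption based on the WAV mention in the PR.
        return self.directory / f"{self.stem}.wav"

    @property
    def note(self) -> Path:
        return self.directory / f"{self.stem}.md"

    @property
    def sidecar(self) -> Path:
        return self.directory / f"{self.stem}{SIDECAR_SUFFIX}"

    def with_slug(self, slug: str) -> "SavePaths":
        """Return new paths whose stem carries the auto-name slug (hypothetical helper)."""
        return SavePaths(self.directory, f"{self.stem}-{slug}")
```

Because all four paths share one stem, renaming after `--auto-name` only has to change the stem; the transcript, audio, note, and sidecar names cannot drift apart.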

…uto-name

Fold the post-capture index step into `assembly stream --save-dir` so a wrapper
script no longer needs an index loop:

- `--llm "…"` alongside `--save-dir` writes the final prompt-chain answer as a
  `.md` note next to the auto-named transcript (summarize-on-capture).
- a `.aai.json` sidecar (title, date, duration, speakers, turns, file names)
  lands beside every recording so a list/browse UI needs no transcript parsing,
  and `--no-save-audio` keeps the transcript without the WAV.
- `--auto-name` derives the filename slug from the transcript via the LLM and
  renames the files once the stream ends (mutually exclusive with `--name`).
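The flag constraints this PR describes (`--auto-name` and `--no-save-audio` require `--save-dir`; `--auto-name` and `--name` are mutually exclusive) can be checked along these lines. This is an illustrative sketch only: `validate_save_flags` and `UsageError` are hypothetical names, and the real check in `_exec.py` presumably raises the project's own CLIError.

```python
from typing import Optional


class UsageError(ValueError):
    """Hypothetical stand-in for the CLI's own error type."""


def validate_save_flags(
    save_dir: Optional[str],
    name: Optional[str],
    auto_name: bool,
    no_save_audio: bool,
) -> None:
    """Raise UsageError when the flag combination is not allowed."""
    if auto_name and save_dir is None:
        raise UsageError("--auto-name requires --save-dir")
    if no_save_audio and save_dir is None:
        raise UsageError("--no-save-audio requires --save-dir")
    if auto_name and name is not None:
        raise UsageError("--auto-name and --name are mutually exclusive")
```

Validating before the stream opens means a bad combination fails immediately instead of after a recording has already been captured.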

The --save-dir lifecycle lives in the new streaming/savedir.py (pure file I/O,
unit-tested without a gateway); the batch driver moves to streaming/batch.py to
keep session.py under the line limit. Docs updated in REFERENCE.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01KNx966tACLPYX4B5jkcfqp
@alexkroman alexkroman enabled auto-merge June 16, 2026 22:58
def _write(path: Path, text: str) -> None:
    """Write ``text`` to ``path`` (the note or sidecar), as a clean CLIError on failure."""
    try:
        path.write_text(text, encoding="utf-8")

Potential file inclusion attack via reading file - medium severity
If an attacker can control the input leading into the open function, they might be able to read sensitive files and launch further attacks with that information.

Remediation: Ignore this issue only after you've verified or sanitized the input going into this function.

Collaborator Author

@AikidoSec ignore: False positive for this local CLI. savedir.py only ever writes (write_text/rename) to paths it assembles itself — never reads a user-supplied path. The destination is built from the user's own --save-dir/--name arguments plus, under --auto-name, an LLM-derived title that is run through naming.slugify (lowercased, every non-[a-z0-9] run collapsed to -, length-capped), so /, . and .. cannot survive into the filename — no path traversal or sensitive-file read is reachable. This matches the existing # nosemgrep precedent for the same class in aai_cli/app/transcribe/batch.py.


Generated by Claude Code
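The slugify behavior described in the reply above (lowercase, collapse every non-[a-z0-9] run to `-`, cap the length) can be sketched as follows. This is not the code in naming.py: the `max_len` default of 40 and the trimming of leading and trailing hyphens are assumptions made for the example.

```python
import re


def slugify(title: str, max_len: int = 40) -> str:
    """Reduce arbitrary text to a filename-safe slug of [a-z0-9-] only."""
    # Lowercase, then collapse each run of other characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    # Cap the length, and avoid ending on a hyphen after the cut.
    return slug[:max_len].rstrip("-")
```

For example, `slugify("../../etc/passwd")` yields `etc-passwd`: since `/` and `.` fall outside `[a-z0-9]`, no path separator or parent-directory component can survive into the filename.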

✅ Based on your feedback, we ignored this issue for the reason given in the reply above.

Comment thread aai_cli/streaming/batch.py Outdated
except NotAuthenticated:
    raise
except CLIError as exc:
    output.emit_warning(f"{source}: {exc.message}", json_mode=json_mode)

Emits a warning containing the raw 'source' and exception message (f"{source}: {exc.message}") — avoid including unsanitized user input in logs/UI.

✨ AI Reasoning
A new warning emission includes the raw 'source' (user path/URL) and exception message in a formatted string passed to output.emit_warning. Both values are user-controllable or derived from user-controlled input and are emitted intact to logging/UI, which risks leaking sensitive data or enabling log-injection payloads. This was introduced in the batch streaming helper.

🔧 How do I fix it?
Keep sensitive data such as emails, passwords, and tokens out of logs. When logging values tied to a user, prefer a safe identifier like a user ID over the raw input, and strip line breaks from any user-provided text you do log.
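One way to follow the advice about stripping line breaks is to pass user-provided text through a small filter before it reaches the warning. The sketch below is an assumption about how that could be done, not the change made in this PR; `safe_for_log` and its `max_len` parameter are hypothetical.

```python
import re

# ASCII control characters, including CR and LF, which enable log injection.
_CONTROL = re.compile(r"[\x00-\x1f\x7f]+")


def safe_for_log(value: str, max_len: int = 200) -> str:
    """Return ``value`` on a single line, truncated to ``max_len`` characters."""
    cleaned = _CONTROL.sub(" ", value).strip()
    if len(cleaned) > max_len:
        cleaned = cleaned[: max_len - 3] + "..."
    return cleaned
```

A call such as `safe_for_log(source)` inside the f-string would keep a crafted path like `"a.wav\nERROR: fake"` from starting a forged second log line.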

…erg-7onpfp

# Conflicts:
#	aai_cli/commands/stream/_exec.py
#	aai_cli/streaming/session.py
#	tests/test_stream_exec.py
def _write(path: Path, text: str) -> None:
    """Write ``text`` to ``path`` (the note or sidecar), as a clean CLIError on failure."""
    try:
        path.write_text(text, encoding="utf-8")

Potential file inclusion attack via reading file - medium severity
If an attacker can control the input leading into the open function, they might be able to read sensitive files and launch further attacks with that information.

Remediation: Ignore this issue only after you've verified or sanitized the input going into this function.

Collaborator Author

@AikidoSec ignore: False positive (re-flagged on a new commit). savedir.py only writes (write_text/rename) to paths it assembles itself — it never opens a user-supplied path for reading. The destination comes from the user's own --save-dir/--name plus, under --auto-name, an LLM-derived title passed through naming.slugify (lowercased, every non-[a-z0-9] run collapsed to -, length-capped), so /, ., and .. cannot reach the filename — no path traversal or sensitive-file read is possible. Same class already suppressed via # nosemgrep in aai_cli/app/transcribe/batch.py.


Generated by Claude Code

✅ Based on your feedback, we ignored this issue for the reason given in the reply above.

…erg-7onpfp

# Conflicts:
#	aai_cli/streaming/session.py
@alexkroman alexkroman added this pull request to the merge queue Jun 17, 2026
Merged via the queue into main with commit 2bc6983 Jun 17, 2026
18 of 19 checks passed
@alexkroman alexkroman deleted the claude/serene-heisenberg-7onpfp branch June 17, 2026 00:21