6 changes: 3 additions & 3 deletions aai_cli/AGENTS.md
@@ -34,8 +34,8 @@ between layers is enforced — higher may import lower, never the reverse:
 `config_builder`, `keyring_store`, `environments`, `env`, `errors`, `llm`,
 `telemetry`, `debuglog`, `remotefs`, `sync_stt`, `hotkey`, `ws`, `youtube`,
 `wer`, `argscan`, `jsonshape`, `timeparse`, `microphone`, `procs`, `stdio`,
-`choices`. Contract 4 also forbids `rich` here, so "no Rich below the UI
-layer" is structural.
+`choices`, `locking`, `config_lock`. Contract 4 also forbids `rich` here, so
+"no Rich below the UI layer" is structural.

 Three things sit *beside* the stack, intentionally unlisted in the layers
 contract:
Expand Down Expand Up @@ -139,7 +139,7 @@ heavily-reworked commands with long bodies; small commands keep the inline
### Cross-cutting state (resolution order matters)

- **`app/context.py`** — `AppState` (profile, env) is attached to the Typer context in the root `@app.callback()`. `run_command` is the standard command wrapper.
- **`core/config.py`** — profiles persisted in `config.toml` (via `platformdirs`); the **API key lives only in the OS keyring**, never in a dotfile. The keyring access itself is factored into **`core/keyring_store.py`** (the single importer of `keyring`, holding `KEYRING_SERVICE = "assemblyai-cli"` + `set_secret`/`get_secret`/`restore_secret`/`delete_secret`/`usable`), so the "secrets never touch the dotfile" split is structural; `config` reads/writes secrets through it and only `config.keyring_usable` re-surfaces the probe on the auth facade. Key resolution order: `--api-key` flag (validation paths only) → `ASSEMBLYAI_API_KEY` env → keyring. **Run commands deliberately expose no `--api-key` flag** so keys can't leak into `ps`/shell history.
- **`core/config.py`** — profiles persisted in `config.toml` (via `platformdirs`); the **API key lives only in the OS keyring**, never in a dotfile. The keyring access itself is factored into **`core/keyring_store.py`** (the single importer of `keyring`, holding `KEYRING_SERVICE = "assemblyai-cli"` + `set_secret`/`get_secret`/`restore_secret`/`delete_secret`/`usable`), so the "secrets never touch the dotfile" split is structural; `config` reads/writes secrets through it and only `config.keyring_usable` re-surfaces the probe on the auth facade. Key resolution order: `--api-key` flag (validation paths only) → `ASSEMBLYAI_API_KEY` env → keyring. **Run commands deliberately expose no `--api-key` flag** so keys can't leak into `ps`/shell history. Every `config.toml` write is a read-modify-write (`_load` → mutate → `_dump`): `_dump` is a temp-file + atomic `os.replace` (a reader never sees a torn file), and the whole RMW runs under a cross-process `filelock` (`config_lock.update`/`.locked`, built on `core/locking.py`) so two concurrent `assembly` processes can't lose each other's updates. Readers stay lock-free. The lock helpers live in `config_lock.py` (not `config.py`) only to keep the latter under the file-length gate; reuse one cached `FileLock` per path so nested writers (`persist_login`) stay reentrant.
- **`core/environments.py`** — a frozen `Environment` (api_base, streaming_host, llm_gateway_base, ams_base, stytch_*). `DEFAULT_ENV` is **`production`**; use `--sandbox` (or `--env sandbox000` / `AAI_ENV`) to target the sandbox. The active environment is a process-global set once at startup; precedence: `--env` → `AAI_ENV` → profile's stored env → default. A credential is only valid against the environment that minted it.
- **`core/client.py`** — thin wrappers over the `assemblyai` SDK (`transcribe`, `list_transcripts`, `stream_audio`, etc.). It normalizes SDK exceptions: auth failures become a single clean `auth_failure()` `CLIError`; everything else becomes `APIError`. New SDK calls should follow this try/except shape.
- **`core/errors.py`** — the `CLIError` hierarchy (each with `error_type` + `exit_code`). `ui/output.py` emits errors to **stderr**; stdout stays clean for pipelines. `--json` switches to machine-readable output; it is never auto-enabled — `output.resolve_json()` deliberately keeps human text the default even when piped or agent-run.
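
The temp-file plus `os.replace` write described in the `core/config.py` bullet can be sketched as follows. This is an illustration under stated assumptions, not the project's `_dump`: the name `atomic_dump` is hypothetical, and it serializes with the standard-library `json` module instead of `tomli_w` so the snippet has no third-party dependencies.

```python
import contextlib
import json
import os
import tempfile
from pathlib import Path


def atomic_dump(path: Path, data: dict) -> None:
    """Write ``data`` so a concurrent reader sees the old file or the new one, never a mix."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file is created next to the target: os.replace cannot rename
    # across filesystems, and a same-directory rename is atomic on POSIX.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_name, path)
    except BaseException:
        # Leave no stray temp file behind if the write or the rename failed.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise


# Hypothetical usage against a throwaway directory.
demo_path = Path(tempfile.mkdtemp()) / "aai" / "config.json"
atomic_dump(demo_path, {"active_profile": "default"})
atomic_dump(demo_path, {"active_profile": "work"})
```

An atomic rename protects readers only. Two writers that each load, mutate, and dump can still overwrite one another, which is the gap the cross-process lock in this change is meant to close.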
129 changes: 64 additions & 65 deletions aai_cli/core/config.py
@@ -12,7 +12,7 @@
 import tomli_w
 from pydantic import BaseModel, ConfigDict, Field, ValidationError

-from aai_cli.core import debuglog, env, keyring_store
+from aai_cli.core import config_lock, debuglog, env, keyring_store
 from aai_cli.core.errors import CLIError, NotAuthenticated

 ENV_API_KEY = "ASSEMBLYAI_API_KEY"
@@ -187,27 +187,25 @@ def set_active_profile(name: str) -> None:
     with no hint why, so the typo is rejected here with the known names listed.
     """
     validate_profile(name)
-    cfg = _load()
-    if name not in cfg.profiles:
-        known = ", ".join(sorted(cfg.profiles)) or "none yet"
-        raise CLIError(
-            f"No profile named {name!r} (known: {known}).",
-            error_type="invalid_profile",
-            exit_code=2,
-            suggestion=f"Create it first: assembly --profile {name} login",
-        )
-    cfg.active_profile = name
-    _dump(cfg)
+    with config_lock.update(_load, _dump) as cfg:
+        if name not in cfg.profiles:
+            known = ", ".join(sorted(cfg.profiles)) or "none yet"
+            raise CLIError(
+                f"No profile named {name!r} (known: {known}).",
+                error_type="invalid_profile",
+                exit_code=2,
+                suggestion=f"Create it first: assembly --profile {name} login",
+            )
+        cfg.active_profile = name


 def set_api_key(profile: str, api_key: str) -> None:
     validate_profile(profile)
     keyring_store.set_secret(profile, api_key)
-    cfg = _load()
-    cfg.profiles.setdefault(profile, Profile())
-    if cfg.active_profile is None:
-        cfg.active_profile = profile
-    _dump(cfg)
+    with config_lock.update(_load, _dump) as cfg:
+        cfg.profiles.setdefault(profile, Profile())
+        if cfg.active_profile is None:
+            cfg.active_profile = profile


 def get_api_key(profile: str) -> str | None:
@@ -234,9 +232,8 @@ def get_profile_env(profile: str) -> str | None:
 def set_profile_env(profile: str, env: str) -> None:
     """Bind a backend environment to a profile so its key and hosts stay matched."""
     validate_profile(profile)
-    cfg = _load()
-    cfg.profiles.setdefault(profile, Profile()).env = env
-    _dump(cfg)
+    with config_lock.update(_load, _dump) as cfg:
+        cfg.profiles.setdefault(profile, Profile()).env = env


 def get_profile_email(profile: str) -> str | None:
@@ -248,9 +245,8 @@ def get_profile_email(profile: str) -> str | None:
 def set_profile_email(profile: str, email: str) -> None:
     """Persist the login email for a profile (gates internal-environment access)."""
     validate_profile(profile)
-    cfg = _load()
-    cfg.profiles.setdefault(profile, Profile()).email = email
-    _dump(cfg)
+    with config_lock.update(_load, _dump) as cfg:
+        cfg.profiles.setdefault(profile, Profile()).email = email


 def clear_api_key(profile: str) -> None:
@@ -275,9 +271,8 @@ def set_session(profile: str, *, session_jwt: str, session_token: str, account_i
         _session_username(profile),
         StoredSession(jwt=session_jwt, token=session_token).model_dump_json(),
     )
-    cfg = _load()
-    cfg.profiles.setdefault(profile, Profile()).account_id = account_id
-    _dump(cfg)
+    with config_lock.update(_load, _dump) as cfg:
+        cfg.profiles.setdefault(profile, Profile()).account_id = account_id


 def get_session(profile: str) -> dict[str, str] | None:
@@ -300,11 +295,12 @@ def get_account_id(profile: str) -> int | None:

 def clear_session(profile: str) -> None:
     keyring_store.delete_secret(_session_username(profile))
-    cfg = _load()
-    prof = cfg.profiles.get(profile)
-    if prof and prof.account_id is not None:
-        prof.account_id = None
-        _dump(cfg)
+    with config_lock.locked():
+        cfg = _load()
+        prof = cfg.profiles.get(profile)
+        if prof and prof.account_id is not None:
+            prof.account_id = None
+            _dump(cfg)


 def persist_login(
@@ -327,28 +323,32 @@ def persist_login(
     restored best-effort.
     """
     validate_profile(profile)
-    prior_api_key = keyring_store.get_secret(profile)
-    prior_session = keyring_store.get_secret(_session_username(profile))
-    prior_cfg = _load()
-    done = False
-    try:
-        set_api_key(profile, api_key)
-        set_profile_env(profile, env)
-        set_session(
-            profile,
-            session_jwt=session_jwt,
-            session_token=session_token,
-            account_id=account_id,
-        )
-        # Within the same atomic rollback so the sandbox gate can't read stale identity.
-        if email is not None:
-            set_profile_email(profile, email)
-        done = True
-    finally:
-        if not done:
-            keyring_store.restore_secret(profile, prior_api_key)
-            keyring_store.restore_secret(_session_username(profile), prior_session)
-            _dump(prior_cfg)
+    # Hold the write lock across the whole snapshot -> writes -> rollback so a concurrent
+    # writer can't slip a change between the snapshot and a rollback dump. The set_*
+    # helpers re-take the same (reentrant) lock, so the nesting is safe.
+    with config_lock.locked():
+        prior_api_key = keyring_store.get_secret(profile)
+        prior_session = keyring_store.get_secret(_session_username(profile))
+        prior_cfg = _load()
+        done = False
+        try:
+            set_api_key(profile, api_key)
+            set_profile_env(profile, env)
+            set_session(
+                profile,
+                session_jwt=session_jwt,
+                session_token=session_token,
+                account_id=account_id,
+            )
+            # Within the same atomic rollback so the sandbox gate can't read stale identity.
+            if email is not None:
+                set_profile_email(profile, email)
+            done = True
+        finally:
+            if not done:
+                keyring_store.restore_secret(profile, prior_api_key)
+                keyring_store.restore_secret(_session_username(profile), prior_session)
+                _dump(prior_cfg)


 def has_device_id() -> bool:
@@ -362,11 +362,12 @@ def get_device_id() -> str:
     """A stable anonymous install id for telemetry: a random UUID minted locally on
     first use and persisted in config.toml. Carries nothing derivable from the
     machine or account."""
-    cfg = _load()
-    if cfg.device_id is None:
-        cfg.device_id = str(uuid.uuid4())
-        _dump(cfg)
-    return cfg.device_id
+    with config_lock.locked():
+        cfg = _load()
+        if cfg.device_id is None:
+            cfg.device_id = str(uuid.uuid4())
+            _dump(cfg)
+        return cfg.device_id


 def get_telemetry_enabled() -> bool | None:
@@ -376,9 +377,8 @@ def get_telemetry_enabled() -> bool | None:


 def set_telemetry_enabled(*, enabled: bool) -> None:
-    cfg = _load()
-    cfg.telemetry_enabled = enabled
-    _dump(cfg)
+    with config_lock.update(_load, _dump) as cfg:
+        cfg.telemetry_enabled = enabled


 def get_update_cache() -> tuple[float | None, str | None]:
@@ -390,10 +390,9 @@ def get_update_cache() -> tuple[float | None, str | None]:
 def set_update_cache(*, last_check: float, latest_version: str | None) -> None:
     """Persist the update-notifier cache. ``latest_version`` is None when the last
     fetch failed — the timestamp is still recorded so we don't re-spawn every run."""
-    cfg = _load()
-    cfg.update_last_check = last_check
-    cfg.update_latest_version = latest_version
-    _dump(cfg)
+    with config_lock.update(_load, _dump) as cfg:
+        cfg.update_last_check = last_check
+        cfg.update_latest_version = latest_version


 def resolve_api_key(*, profile: str | None = None, api_key_flag: str | None = None) -> str:
51 changes: 51 additions & 0 deletions aai_cli/core/config_lock.py
@@ -0,0 +1,51 @@
+"""The cross-process write lock for config.toml's read-modify-write.
+
+config.toml's mutating helpers do ``_load -> mutate -> _dump``. The atomic ``os.replace``
+in ``_dump`` keeps a *reader* from ever seeing a torn file, but two writers racing would
+still lose an update — both read the same config, and the second dump clobbers the first's
+change. These helpers serialize the whole read-modify-write across processes via a sibling
+lock file; readers stay lock-free (an older-but-valid parse is fine).
+
+Kept out of ``config.py`` only to keep that module under the file-length gate; it reaches
+back into ``config`` (its ``_load``/``_dump``/``config_dir``) lazily, at call time.
+"""
+
+from __future__ import annotations
+
+import contextlib
+from collections.abc import Callable, Generator
+from pathlib import Path
+
+import filelock
+
+from aai_cli.core import config, locking
+
+
+def lock_path() -> Path:
+    return config.config_dir() / "config.toml.lock"
+
+
+def write_lock() -> filelock.FileLock:
+    """The shared cross-process write lock guarding config.toml."""
+    return locking.file_lock(lock_path())
+
+
+@contextlib.contextmanager
+def locked() -> Generator[None]:
+    """Hold the config write lock for the duration of the block."""
+    with locking.locked(lock_path()):
+        yield
+
+
+@contextlib.contextmanager
+def update(
+    load: Callable[[], config.Config], dump: Callable[[config.Config], None]
+) -> Generator[config.Config]:
+    """Run a load -> mutate -> dump under the write lock so a concurrent writer can't lose
+    the update. Yields the loaded config; dumps it on clean exit (an exception in the block
+    propagates and skips the dump). ``load``/``dump`` are injected by config.py so this
+    module stays clear of its private helpers."""
+    with locked():
+        cfg = load()
+        yield cfg
+        dump(cfg)
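
The "dump on clean exit, skip on exception" behaviour of `update` can be seen in a dependency-free sketch. The assumptions are loud ones: a `threading.RLock` stands in for the cross-process file lock (it only covers one process), the config is a plain `dict`, and `store`, `dumps`, `load`, and `dump` are hypothetical stand-ins for `config.toml` and the private `_load`/`_dump` helpers.

```python
import contextlib
import threading
from collections.abc import Callable, Generator

# Stand-in for the cross-process file lock; an RLock only covers this process.
_lock = threading.RLock()


@contextlib.contextmanager
def update(
    load: Callable[[], dict], dump: Callable[[dict], None]
) -> Generator[dict, None, None]:
    """Load, yield for mutation, then dump, all while holding the lock."""
    with _lock:
        cfg = load()
        yield cfg
        # Reached only if the body of the caller's with-block did not raise.
        dump(cfg)


# Hypothetical in-memory store used in place of config.toml.
store: dict = {"telemetry_enabled": None}
dumps: list[dict] = []


def load() -> dict:
    return dict(store)


def dump(cfg: dict) -> None:
    store.update(cfg)
    dumps.append(dict(cfg))


with update(load, dump) as cfg:
    cfg["telemetry_enabled"] = True  # persisted: the block exits cleanly

try:
    with update(load, dump) as cfg:
        cfg["telemetry_enabled"] = False
        raise ValueError("validation failed")  # propagates, so dump never runs
except ValueError:
    pass
```

Because nothing follows the `yield` inside a `try`/`finally`, a failed mutation leaves the stored config untouched. This is how `set_active_profile` can raise `CLIError` inside the block without writing anything.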
43 changes: 43 additions & 0 deletions aai_cli/core/locking.py
@@ -0,0 +1,43 @@
+"""Cross-process advisory file locking, backed by ``filelock``.
+
+Used to serialize a read-modify-write of a shared on-disk file (``config.toml``) across
+concurrent ``assembly`` processes, so two of them can't lose each other's updates
+(last-writer-wins). The atomic-rename write keeps a *reader* from ever seeing a torn
+file; this lock is what keeps two *writers* from clobbering each other.
+"""
+
+from __future__ import annotations
+
+import contextlib
+from collections.abc import Generator
+from pathlib import Path
+
+import filelock
+
+# One cached lock instance per lock-file path. filelock already serializes threads within
+# this process (and other processes via the lock file). Reusing a single instance per path
+# makes nested acquisitions reentrant, which a re-entrant caller (e.g. a snapshot+rollback
+# that calls smaller writers) needs; distinct instances on one path would deadlock.
+_lock_cache: dict[str, filelock.FileLock] = {}
+
+
+def file_lock(path: Path) -> filelock.FileLock:
+    """The cached cross-process lock for ``path`` (created on first use)."""
+    key = str(path)
+    lock = _lock_cache.get(key)
+    if lock is None:
+        lock = filelock.FileLock(path)
+        _lock_cache[key] = lock
+    return lock
+
+
+@contextlib.contextmanager
+def locked(path: Path) -> Generator[None]:
+    """Hold the cross-process lock at ``path`` for the duration of the block.
+
+    Creates the lock file's parent directory first — filelock won't, and the very first
+    write on a fresh machine targets a dir that may not exist yet.
+    """
+    path.parent.mkdir(parents=True, exist_ok=True)
+    with file_lock(path):
+        yield
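
Why one cached instance per path matters can be shown with a toy lock. The sketch below does not use `filelock`: `TinyLock` is a hypothetical class that takes the lock by exclusively creating the lock file and tracks nesting with a per-instance counter. It also returns `False` on contention, where a real `FileLock` would block by default, so the would-be deadlock shows up as a failed acquire instead.

```python
import os
import tempfile
from pathlib import Path


class TinyLock:
    """Hypothetical exclusive-create lock with a per-instance reentrancy counter."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self._depth = 0

    def try_acquire(self) -> bool:
        if self._depth == 0:
            try:
                # O_EXCL makes creation fail if any holder already made the file.
                os.close(os.open(self.path, os.O_CREAT | os.O_EXCL | os.O_WRONLY))
            except FileExistsError:
                return False
        self._depth += 1
        return True

    def release(self) -> None:
        self._depth -= 1
        if self._depth == 0:
            os.unlink(self.path)


_cache: dict[str, TinyLock] = {}


def file_lock(path: Path) -> TinyLock:
    """Return the one shared lock instance for ``path`` (created on first use)."""
    key = str(path)
    lock = _cache.get(key)
    if lock is None:
        lock = TinyLock(path)
        _cache[key] = lock
    return lock


lock_file = Path(tempfile.mkdtemp()) / "config.toml.lock"

outer = file_lock(lock_file)
outer_ok = outer.try_acquire()                  # creates the lock file
nested_ok = file_lock(lock_file).try_acquire()  # same instance: counter goes to 2
fresh_ok = TinyLock(lock_file).try_acquire()    # distinct instance: sees the file

outer.release()
outer.release()  # counter back to 0, lock file removed
```

The nested acquire succeeds only because `file_lock` hands back the instance that already holds the lock. This is the property `persist_login` relies on when it holds the lock and then calls the smaller `set_*` writers.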
5 changes: 5 additions & 0 deletions pyproject.toml
@@ -65,6 +65,11 @@ dependencies = [
     # lazily). Strips boilerplate down to the readable body; ships prebuilt wheels
     # (lxml included), so it adds no source-compile step to Homebrew bottling.
     "trafilatura>=2.1.0",
+    # Cross-process advisory lock around config.toml's read-modify-write (config.py),
+    # so two concurrent `assembly` processes can't lose each other's profile/telemetry
+    # updates (last-writer-wins). Pure-Python, no compiled deps; already arrived
+    # transitively via the dev toolchain, so the lock pins the same release.
+    "filelock>=3.16.0",
 ]

 [project.urls]