docs(config): config and settings design doc #162
- Add 'Current codebase baseline' gap map at top of doc
- Bake in single shipped compose + dev override pattern (reject dev/prod parallel files)
- Align service names and profile names with deploy/annotation/docker-compose.dev.yml (argilla/worker/postgres/elasticsearch/redis; all-bundled/external-pg/external-es)
- Clarify existing setup/teardown (Argilla workspace provisioning) vs proposed up/down (Docker lifecycle) and proposed setup (config wizard); add Q-setup-verb
- Flag [eval] and [all] extras as proposed; clarify per-provider extras already exist
- Flag PRAGMATA_* env var prefix as proposed (current state: only ARGILLA_API_URL hardcoded, no systematic prefix)
Inline open questions, expand argilla credential delegation rationale, clean up baseline table and extra install comment noise.
| - define how users install, configure, and first-run `pragmata`. | ||
| - holistically cover install model, bootstrap commands, config system, and error UX - across all three tools (`annotation`, `querygen`, `eval`) | ||
| **Top-level shape** - two buckets w/ orthogonal concerns: |
The §1/§2 split makes sense, but I would sharpen the boundary: §1 should cover shared settings/config resolution and entrypoint UX only, while operational bootstrap/setup concerns should remain annotation-specific unless a concrete need exists for querygen/eval.
| 1. **Modular install.** Users can install and use any single tool w/o the others. `pip install pragmata[annotation]` must not require querygen or eval deps | ||
| 2. **Install is side-effect free.** No prompts, no Docker, no network calls at `pip install` time -> configuration happens on first explicit invocation (if at all, see principle 4). Follows PyPI policy and `gh auth login` / `supabase` convention. | ||
| 3. **Fail clearly -> point to fix** - e.g. if optional extras not installed. Never raw traceback on a first-use error. Always: "X is not Y. Run `pragmata ... ` to fix." Follows `gh`, `supabase`, `vercel`, `dbt`, `fly`, `railway`. | ||
| 4. **Zero config OOTB.** Setup is optional; defaults work out of the box. Each tool must be usable immediately after install via opinionated defaults - no `setup` required for the happy path. `setup` exists for experienced users who need to override defaults (custom Argilla URL, alternate LLM provider, non-default workspace, etc.). First-run behaviour synthesises a working config from defaults + env vars; the wizard is a convenience, not a gate. Follows `ruff` / `black` (zero-config by design) and `supabase start` (sensible local defaults, no wizard required) |
“Zero config OOTB” is a good target, but I would avoid framing normal settings resolution as “first-run config synthesis.” Defaults + env + config + explicit args should be resolved consistently on every run.
| 5. **Flags suppress prompts.** Interactive wizard by default; any required flag present short-circuits the corresponding prompt. Same command, both use cases. Follows [`gh auth login`](https://cli.github.com/manual/gh_auth_login). |
“Flags suppress prompts” makes sense only within explicitly interactive commands such as annotation setup. I would not make interactivity the default for normal execution commands; run commands should be deterministic and scriptable unless the user explicitly enters a setup/init flow.
Edit: I noticed that setup has a different meaning here. See my comment on lines R95 to R96.
| 6. **Tool-scoped complexity.** Per-tool setup commands (`pragmata annotation setup`, `pragmata querygen setup`, `pragmata eval setup`). No global `pragmata setup`. Setup differs per tool (annotation needs Docker + Argilla creds; querygen needs LLM provider creds; eval TBC). |
I agree with avoiding a global setup command, but I would not infer that every tool needs its own setup command. annotation setup seems justified because it bootstraps infrastructure; querygen credentials and provider selection should probably stay in args/env/config unless we identify a concrete setup workflow.
Edit: I noticed that setup has a different meaning here. See my comment on lines R95 to R96.
| ### Lazy imports at the package boundary | ||
| - optional-extra packages (`argilla`, `langchain`, etc.) must not be imported at module scope on the root import path. |
I would not define extras purely by tool ownership. A dependency being used only by querygen (e.g., langchain) does not automatically mean it belongs in [querygen]; extras should mainly capture heavy, optional, provider-specific, or infrastructure/runtime-sensitive dependencies. Lightweight core dependencies can remain regular deps if they support a first-class package capability.
| not comprehensive - several are inline throughout above | ||
| ### §Q-prod-script: ship a per-tool bootstrap script for `annotation`? |
| ### §Q-wizard-lib: Questionary or minimal prompts? |
Defer unless we decide to ship an actual interactive flow.
At the moment, I would not add a general config overwrite wizard. Defaults, flags, env vars, config files, and possibly a commented config-template command should be enough for normal configuration. If we later add an interactive annotation-specific flow that validates external services or handles provisioning/auth, then Questionary is a reasonable annotation-extra dependency. It should not be core, and it should not be added for querygen/eval without a concrete wizard workflow.
| Research strongly prefers [Questionary](https://questionary.readthedocs.io/); 2.1 addresses the older prompt_toolkit clash. Alternative: a lighter library for the handful of prompts we actually need. | ||
| ### §Q-auth-split: `pragmata annotation auth` / `login` separate command, or part of `setup`? |
Neither for now.
I would not add a pragmata-managed auth/login command unless we decide to persist credentials ourselves, which I do not think we should do at this stage.
| Argilla API keys rotate independently of compose config. `gh` and `supabase` split auth from config; `aws` combines. Research leans split; current draft combines. | ||
| ### §Q-wizard-placement: where and how do wizards appear? |
Nowhere for v0.1.
I would not add a general interactive config overwrite wizard at this stage.
| Still open - noted in guiding principles but not resolved. | ||
| ### §Q-configured-check: include a "configured" pre-flight check? |
No, not under the zero-config OOTB principle.
Address PR #162 review (SG): zero-config means a single deterministic resolution chain on every run, not a first-run synthesis path. Normal run commands are non-interactive by default; interactivity is reserved for explicit setup flows that touch external state.
Address PR #162 review (SG): extras should capture heavy/optional/provider-specific deps, not just per-tool ownership. Lazy imports guard narrowly at the actual import site so we don't mask unrelated ImportErrors inside the optional dep itself.
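A minimal sketch of what a narrow guard could look like, combined with the "X is not Y. Run ... to fix" message shape from principle 3. The helper name `require_extra` and the exception class `MissingExtraError` are hypothetical and not taken from the codebase; the point is only that the error is translated when the optional package itself is missing, and re-raised untouched when the failure comes from somewhere inside it.

```python
import importlib
from types import ModuleType


class MissingExtraError(ImportError):
    """Hypothetical error type: an optional dependency is not installed."""


def require_extra(module_name: str, extra: str) -> ModuleType:
    """Import `module_name` at the call site, pointing at the extra if absent."""
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        # Guard narrowly: translate the error only when the module that could
        # not be found is the optional dependency itself (or its top-level
        # package). A ModuleNotFoundError raised from inside the dependency
        # names a different module and is re-raised unchanged.
        top_level = module_name.split(".")[0]
        if exc.name not in (module_name, top_level):
            raise
        raise MissingExtraError(
            f"'{module_name}' is not installed. "
            f"Run `pip install 'pragmata[{extra}]'` to fix."
        ) from exc
```

A call such as `argilla = require_extra("argilla", "annotation")` would then sit inside the function that needs Argilla rather than at module scope, so `import pragmata` stays cheap and never fails on a missing extra.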
Address PR #162 review (SG): the prior chain placed explicit config_path in the user-config slot, which let auto-discovered project config beat it. "Explicit beats implicit" now applies at every layer - explicit config_path skips both project and user auto-discovery.
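To make the revised chain concrete, here is a rough sketch of the layering under the assumption that each source has already been loaded into a plain mapping. The function name `resolve_settings` and its parameters are illustrative only and are not the existing `ResolveSettings.resolve` signature. When an explicit config is passed, both auto-discovered layers are dropped rather than merged underneath it.

```python
from typing import Any, Mapping, Optional


def resolve_settings(
    defaults: Mapping[str, Any],
    *,
    cli: Optional[Mapping[str, Any]] = None,
    env: Optional[Mapping[str, Any]] = None,
    explicit_config: Optional[Mapping[str, Any]] = None,
    project_config: Optional[Mapping[str, Any]] = None,
    user_config: Optional[Mapping[str, Any]] = None,
) -> dict:
    """Merge layers so that later (more explicit) layers win."""
    if explicit_config is not None:
        # Explicit config_path skips project and user auto-discovery entirely.
        file_layers = [explicit_config]
    else:
        # Lowest priority first: user config, then project config.
        file_layers = [user_config or {}, project_config or {}]

    layers = [defaults, *file_layers, env or {}, cli or {}]
    resolved: dict = {}
    for layer in layers:
        # None means "not provided at this layer", so it never overrides.
        resolved.update({k: v for k, v in layer.items() if v is not None})
    return resolved
```

The same function runs on every invocation, so there is no separate first-run path: a fresh install simply resolves with every optional layer empty.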
Address PR #162 review (SG): workspace dir is already an explicit kwarg/flag with cwd default; a global env override creates hidden state and is structurally inconsistent with the per-tool PRAGMATA_<TOOL>_<KEY> scheme. PRAGMATA_CONFIG_DIR retained as the sole shared escape hatch.
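As an illustration of the per-tool scheme, the sketch below collects `PRAGMATA_<TOOL>_<KEY>` variables for one tool into a lowercase-keyed mapping. The helper name `tool_env_overrides` is made up for this example, and lowercasing the key is an assumption about how env names map to setting names, not something the doc specifies.

```python
import os
from typing import Mapping, Optional


def tool_env_overrides(
    tool: str, environ: Optional[Mapping[str, str]] = None
) -> dict[str, str]:
    """Return {key: value} for every PRAGMATA_<TOOL>_<KEY> variable that is set."""
    source = os.environ if environ is None else environ
    prefix = f"PRAGMATA_{tool.upper()}_"
    return {
        # Strip the prefix and lowercase the remainder (assumed convention).
        name[len(prefix):].lower(): value
        for name, value in source.items()
        if name.startswith(prefix) and len(name) > len(prefix)
    }
```

Because the prefix includes the tool name, a shared variable such as `PRAGMATA_CONFIG_DIR` is never picked up as a `querygen` or `annotation` setting.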
…store Address PR #162 review (SG): a persistent local secret store is a much larger product/security surface than non-secret config discovery and is deferred. LLM providers stay env-only; Argilla delegates to ~/.cache/argilla/credentials after env, avoiding two parallel stores for the same service.
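The "env first, then Argilla's own file" order could look roughly like the sketch below. Several details are assumptions rather than facts from this PR: the function name `resolve_argilla_api_key` is hypothetical, the environment variable is taken to be `ARGILLA_API_KEY`, and the credentials file is assumed to be JSON with an `api_key` field. The real lookup would more likely go through Argilla's own client than parse the file directly.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional

# Path as named in the commit message above; the file format is an assumption.
DEFAULT_CREDENTIALS = Path.home() / ".cache" / "argilla" / "credentials"


def resolve_argilla_api_key(
    environ: Optional[Mapping[str, str]] = None,
    credentials_path: Path = DEFAULT_CREDENTIALS,
) -> Optional[str]:
    """Return the key from the environment, else from Argilla's file, else None."""
    source = os.environ if environ is None else environ
    key = source.get("ARGILLA_API_KEY")
    if key:
        return key
    try:
        data = json.loads(Path(credentials_path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing, unreadable, or non-JSON file: treat as "no stored key".
        return None
    value = data.get("api_key") if isinstance(data, dict) else None
    return value or None
```

Nothing is written by pragmata in this flow, which is the point of the decision: there is one store per service, and it is owned by that service's tooling.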
…ovisioning Address PR #162 review (SG): no general config wizard. Existing 'annotation setup' (Argilla provisioning) keeps its current semantics and stays headless/flag-driven. No 'querygen setup' or 'eval setup' - no per-tool setup proliferation. Remove §1.3.1 wizard-trigger logic and TTY branching; commands are non-interactive everywhere. §1.1.6 becomes a deferred 'configure --write-template' note. Drop the 'configured' pre-flight error from §1.4.
Address PR #162 review (SG): user-editable compose creates an ownership/drift problem on day one and makes the happy path Docker-centric. Customisation surface (--external-postgres etc.) is sufficient for v0.1, so keep the compose package-owned and resolve via importlib.resources at runtime. Drop drift-detection machinery from §2.4 (no drift to manage). Document option Z (eject) as a future escape hatch only - do not build it for v0.1.
Address PR #162 review (SG): commit explicitly to no shipped bootstrap script for v0.1 - documented two-line install is sufficient. Close §Q-prod-script, §Q-wizard-lib, §Q-auth-split, §Q-argilla-creds, §Q-setup-verb, §Q-wizard-placement, §Q-configured-check (all answered by the v0.1 framing). One item remains: whether to support both pragmata.yaml and pyproject.toml [tool.pragmata].
…on-bootstrap Address PR #162 review (SG, line 1): the previous doc covered two separable problems - shared API/config UX across all three tools, and annotation-only infra/bootstrap. Splitting per his suggestion.
- config-and-settings.md (§1 of the original): install model, lazy imports, config system, precedence chain, secrets, CLI surface, shared first-use error UX. Applies to all three tools.
- annotation-bootstrap.md (§2 of the original): stack composition, compose distribution (locked package-data, option Y), first-run / upgrade / uninstall, prod bootstrap, cross-platform, infra error taxonomy. Annotation-only.
Cross-references between the two replace the previous internal §1/§2 references. Both indexed in docs/design/README.md. Drops the 'TODO: review change' markers from the prior round - those were for PR #162 diff scanning and have served their purpose.
Clarify that the two project-config formats (pragmata.yaml vs pyproject.toml) are alternative conventions a repo picks between, not parallel sources - the walk stops at the first match so only one file is ever active per run. Also tighten questionary framing from "deferred" to "not planned" to match the decision that there is no general config wizard in scope.
Remove annotation-bootstrap.md and containerisation-and-deployment.md; these move to a separate PR. Update README index accordingly.
Goal
Add `docs/design/config-and-settings.md` - the design document for shared config resolution, settings precedence, secrets handling, CLI surface, and first-use error UX across all three tools (`annotation`, `querygen`, `eval`).
Annotation-specific stack lifecycle (Docker, compose, bootstrap) is in a companion PR #175.
Scope
- Extras (`pragmata[annotation]`, `[querygen]`, `[eval]`), lazy import guards scoped narrowly to the actual import site, clear `ImportError` messages naming the missing package and the extra that provides it
- `ResolveSettings.resolve` chain; adds auto-discovery of project-level config (`./pragmata.yaml` or `pyproject.toml [tool.pragmata]`, first-match-wins walk from cwd) and user-level config (`platformdirs.user_config_dir("pragmata")`). One new core dep: `platformdirs`
- CLI flag/kwarg > env var > explicit config_path (if passed) > auto-discovered project config > auto-discovered user config > defaults. Explicit always beats implicit
- `resolve_api_key()`; Argilla delegates to its own `~/.cache/argilla/credentials` after env
- `UNSET` defaults; resolution always internal (existing pattern from `api/querygen.py`)
- `annotation setup/teardown/import/export/iaa/up/down`, `querygen gen-queries`, `eval run`; all commands non-interactive by default
Key decisions
- No general config wizard and no `questionary`; `annotation setup` keeps its existing headless/flag-driven Argilla provisioning semantics
- `pragmata.yaml` and `pyproject.toml [tool.pragmata]` supported as project-config formats (first-match-wins, ruff pattern) - see open question below
- `PRAGMATA_<TOOL>_<KEY>` env prefix for non-secret tool settings; `PRAGMATA_CONFIG_DIR` as escape hatch for the platformdirs location
Open question
Two project-config formats? Supporting both `./pragmata.yaml` and `pyproject.toml [tool.pragmata]` (first-match-wins, one file ever active per run - no divergence risk) vs. picking one. `pragmata.yaml` matches user-level file shape for easy copy-paste; `pyproject.toml` keeps pragmata alongside ruff/mypy/pytest for repos that already centralise config there.
Implementation
Docs-only. No code changes.
References