feat(semantic-layer): add reference-data commands (Chart of Accounts in the metastore)#3
Closed
ottomansky wants to merge 21 commits into
…orage ergonomics (keboola#348)

* fix(sync push): writeback placeholder manifest entries in place + propagate KBC.* metadata on create

Fresh-CREATE pre-population (FIIA / scaffold emit pattern): downstream callers seed manifest entries with placeholder ids and (optionally) KBC.configuration.* metadata before the first `sync push`. Pre-fix, every create unconditionally appended a new ManifestConfiguration / ManifestConfigRow to the manifest, so N placeholders -> 2N entries after one push, every placeholder still looked "added" on re-push (spurious duplicates on remote), and KBC.configuration.folderName from local manifest was silently dropped.

Changes:
- Service: new `_writeback_create_config_in_manifest` finds the placeholder by (component_id, path) and updates id + branch_id + pull_hash / pull_config_hash in place; preserves all non-bookkeeping metadata (KBC.*). Append remains the fallback when no placeholder exists.
- Service: matching `_writeback_create_row_in_manifest` for rows under a parent config.
- Service: new `_propagate_kbc_metadata` POSTs any KBC.* keys from the manifest entry to `client.set_config_metadata` once, immediately after the create call. Bookkeeping keys (pull_hash, pull_config_hash) stay out of the metadata API.
- push_changes(): replaces the inline `manifest.configurations.append(...)` block (config create path) with the helper call + propagation.
- _push_create_row(): replaces `parent.rows.append(...)` with the row helper.

Idempotency on re-push falls out for free: after the first push, the placeholder entry holds the real ULID, so the diff engine finds it in remote_configs and reports no change.
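The writeback-in-place behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `ManifestEntry`, `writeback_created_config`, and `kbc_metadata` are hypothetical names, and storing `pull_hash` in the same metadata dict as the KBC.* keys is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ManifestEntry:
    """Hypothetical stand-in for a manifest configuration entry."""
    component_id: str
    path: str
    id: str
    metadata: dict[str, str] = field(default_factory=dict)


def writeback_created_config(entries, component_id, path, new_id, pull_hash):
    """Update a matching placeholder in place; append only when none exists."""
    for entry in entries:
        if entry.component_id == component_id and entry.path == path:
            entry.id = new_id                        # placeholder id -> real id
            entry.metadata["pull_hash"] = pull_hash  # bookkeeping key (assumed location)
            return entry
    # Fallback: no placeholder was seeded, so append a fresh entry.
    entry = ManifestEntry(component_id, path, new_id, {"pull_hash": pull_hash})
    entries.append(entry)
    return entry


def kbc_metadata(entry):
    """Select only KBC.* keys; bookkeeping keys are filtered out."""
    return {k: v for k, v in entry.metadata.items() if k.startswith("KBC.")}


manifest = [
    ManifestEntry(
        "keboola.snowflake-transformation",
        "main/transform/orders",
        "placeholder-1",
        {"KBC.configuration.folderName": "Finance"},
    )
]
created = writeback_created_config(
    manifest, "keboola.snowflake-transformation", "main/transform/orders", "01HXYZ", "abc123"
)
```

After the call the manifest still has one entry, its id is the newly assigned one, and `kbc_metadata(created)` contains only the folder name, which is the part that would be sent to the metadata API.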
Tests (TestFreshCreateWriteback, 7 cases): - writeback config in place (placeholder + KBC.* metadata preserved) - writeback config falls back to append when no placeholder - propagate_kbc_metadata filters bookkeeping keys, calls set_config_metadata - propagate_kbc_metadata no-op when there are no KBC.* keys - writeback row in place (no manifest growth) - writeback row falls back to append for untracked rows - end-to-end push: placeholder + folderName -> create + set_config_metadata, manifest length unchanged, re-push is a no-op (status=no_changes) Full sync test suite (77 cases in test_sync_service.py) green; full repo suite (3576 passed, 110 skipped) green; `ty check` clean. * feat(semantic-layer): add search-context + get-context for project-wide reads Two project-wide read subcommands that mirror the upstream `keboola-mcp-server` semantic-context tools (`search_semantic_context`, `get_semantic_context`). Lets downstream callers (FIIA, scheduled agents) drop their MCP dependency for the common "is the model populated?" and "what's at this id?" lookups. CLI: - `kbagent semantic-layer search-context --project P [--pattern G ...] [--type model|dataset|metric|relationship|constraint|glossary|all] [--limit N]` — project-wide glob search over entity names; default searches every child type (not the model itself); `*` matches all. Patterns are repeatable, taking the union. Case-sensitive fnmatch. `--limit` short-circuits both inner and outer loops. - `kbagent semantic-layer get-context --project P --context-id ID` — single fetch by id; probes semantic-model + every CHILD_TYPES entry until it hits, raises NOT_FOUND if no type matches. Non-404 errors (500, etc.) propagate immediately rather than being swallowed. Service (`SemanticLayerService.search_context` / `SemanticLayerService.get_context`): - Validation at the service boundary so CLI, REST router, and `--hint service` callers all share the same error shape. 
- `_strip_semantic_prefix` normalises the response type field from `"semantic-dataset"` to `"dataset"` for the CLI surface. - Lookup order in get_context is model-first so a model hit short- circuits the 6-type probe to a single call. - try/finally guarantees the metastore client is closed on success and on every error path. Sync surfaces touched: - `commands/semantic_layer.py` — two new Typer commands with `should_hint`/`emit_hint` short-circuits per the hint convention. - `services/semantic_layer_service.py` — new methods + new ClassVar `_ALL_TYPES_FOR_LOOKUP` tuple. - `server/routers/semantic_layer.py` — `GET /search-context` and `GET /get-context` (1:1 CLI->HTTP per CONTRIBUTING.md plugin-sync map); `Query` added to fastapi import. - `hints/definitions/semantic_layer.py` — two new `CommandHint` entries with `ClientCall` + `ServiceCall` for `--hint client` / `--hint service` code generation. - `permissions.py` — both registered as `read` operations. Tests: - `tests/test_semantic_layer_service.py::TestSearchContext` (12 cases) — default pattern, glob narrowing, case-sensitivity, multi-pattern union, type filter (singular + `all` + `model`), `--limit` short-circuit, invalid type / empty pattern / zero limit validation, client cleanup on API error. - `tests/test_semantic_layer_service.py::TestGetContext` (6 cases) — finds dataset by id, finds model (short-circuit on first probe), NOT_FOUND after exhausting all 6 types, 500 propagates without swallowing, empty-id validation, client cleanup on error. - `tests/test_semantic_layer_cli.py::TestSearchContext` (4 cases) and `::TestGetContext` (3 cases) — JSON envelope, kwarg propagation, human-mode table rendering, NOT_FOUND non-zero exit code. Live validation against project 1143 (99_Playground_Max): - `search-context --pattern "*"` returns 8 contexts spanning 4 types. - `search-context --pattern "rev_*" --type metric` narrows to 1 hit. 
- `get-context` with a UUID returned by the search resolves to its full attribute dict. - `get-context` with `00000000-0000-0000-0000-000000000000` returns NOT_FOUND envelope (exit 1) after probing all 6 types. Full test suite (3601 passed, 110 skipped) green; `ty check` clean. * feat(sync,storage): add --branch override, --if-not-exists, --no-name-drift-warnings Three ergonomic improvements that close downstream-tooling pain points encountered during the FIIA -> kbagent migration. `kbagent sync push --branch <id>` (also sync pull / sync diff): - Per-invocation dev-branch targeting. Beats manifest.branches[0], active_branch_id, and branch-mapping.json (priority 0 in the resolver). - Lets a downstream caller (or operator) target a freshly-created dev branch without first running `branch use` or `sync branch-link`. - Validated mutually exclusive with --all-projects at the CLI layer. - Symmetric on pull / diff for predictable UX. - Threaded through `SyncService._resolve_branch_id(..., branch_override=)`. `kbagent storage create-table --if-not-exists`: - Opt-in flag (defaults False so existing callers are unaffected). - When set, catches the specific `STORAGE_JOB_FAILED` + "already has the same display name" error, probes `get_table_detail(target_id)` to confirm the table truly exists at the expected id, and returns `{action: "skipped", skip_reason: "table already exists"}` instead of raising. A different table with the same display name still surfaces the original error (a real conflict to resolve). - Solves the FIIA `scaffold_storage.py` 8-worker spurious-error symptom documented in the original proposal. `kbagent sync push --no-name-drift-warnings`: - Opt-out flag to suppress the cosmetic `name_drift_warnings` array in the result envelope. The underlying detection still runs (so a future reviewer can re-enable it); only the report is dropped. 
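The `--if-not-exists` flow for `storage create-table` described above can be sketched as below. It is only an approximation under stated assumptions: `StorageJobError`, `FakeClient`, and a `get_table_detail` that returns `None` for a missing table are all hypothetical stand-ins, and the real service uses `ErrorCode.STORAGE_JOB_FAILED` rather than a string literal.

```python
class StorageJobError(Exception):
    """Hypothetical error carrying an error code and a message."""

    def __init__(self, code, message):
        super().__init__(message)
        self.code = code
        self.message = message


DUPLICATE_MARKER = "already has the same display name"


class FakeClient:
    """In-memory stand-in for the storage client (assumption for the demo)."""

    def __init__(self):
        self.tables = {}

    def create_table(self, table_id, columns):
        if table_id in self.tables:
            raise StorageJobError("STORAGE_JOB_FAILED", f"Table {DUPLICATE_MARKER}")
        self.tables[table_id] = list(columns)

    def get_table_detail(self, table_id):
        return self.tables.get(table_id)


def create_table(client, table_id, columns, if_not_exists=False):
    try:
        client.create_table(table_id, columns)
    except StorageJobError as exc:
        is_duplicate = exc.code == "STORAGE_JOB_FAILED" and DUPLICATE_MARKER in exc.message
        if not (if_not_exists and is_duplicate):
            raise  # flag unset, or a different failure: surface the original error
        if client.get_table_detail(table_id) is None:
            raise  # same display name but a different table: a real conflict
        return {"action": "skipped", "skip_reason": "table already exists"}
    return {"action": "created"}


client = FakeClient()
first = create_table(client, "in.c-demo.orders", ["id"], if_not_exists=True)
second = create_table(client, "in.c-demo.orders", ["id"], if_not_exists=True)
```

A third call without the flag would re-raise the original error, which matches the created -> skipped -> raises round-trip reported in the live validation.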
Sync surfaces touched: - `services/sync_service.py` -- `_resolve_branch_id` gains a `branch_override` parameter (priority 0); `push` / `pull` / `diff` thread it through. `push` adds `no_name_drift_warnings` flag with a single-line suppression at the result-envelope step. - `services/storage_service.py` -- `create_table` gains `if_not_exists=False` kwarg; the IF-NOT-EXISTS branch wraps the existing client.create_table call with a targeted try/except that uses `ErrorCode.STORAGE_JOB_FAILED` (no raw string literal). Response envelope now carries `action: "created" | "skipped"` so programmatic callers can branch on outcome. - `commands/sync.py` -- adds `--branch` to push / pull / diff; adds `--no-name-drift-warnings` to push; validates `--branch` is incompatible with `--all-projects`. - `commands/storage.py` -- adds `--if-not-exists` to create-table. - `server/routers/storage.py` -- `CreateTable` request model gains `if_not_exists: bool`; the router forwards it. Sync routes intentionally absent (sync is filesystem-local; documented exemption per CONTRIBUTING.md plugin-sync map). Tests: - `tests/test_storage_write.py::TestCreateTableIfNotExists` (5 cases): skip on existing when flag set, reraise when unset, reraise when target table missing despite flag, reraise on non-duplicate errors even with flag, success path unchanged with flag. - `tests/test_sync_service.py::TestBranchOverrideAndNameDriftFlag` (4 cases): resolver priority (override wins), push branch_override reaches client, diff branch_override reaches client, no_name_drift_warnings suppresses the field from the envelope (with a control-arm check that proves the warning surfaces by default). - One existing test in `test_storage_write.py` updated to include the new `if_not_exists=False` kwarg in its `assert_called_once_with`. 
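The "override is priority 0" rule for `--branch` could look something like the sketch below. The function and parameter names are hypothetical, and the relative order of the three fallbacks is an assumption; the commit message only states that the override beats all of them.

```python
def resolve_branch_id(branch_override=None, manifest_branch=None,
                      active_branch_id=None, mapped_branch=None):
    """Return the first candidate that is set; the override always wins."""
    candidates = (branch_override, manifest_branch, active_branch_id, mapped_branch)
    return next((c for c in candidates if c is not None), None)


def validate_flags(branch, all_projects):
    """Mirror the CLI-layer check: --branch and --all-projects are exclusive."""
    if branch is not None and all_projects:
        raise ValueError("--branch cannot be combined with --all-projects")


with_override = resolve_branch_id(branch_override=388072, manifest_branch=1)
without_override = resolve_branch_id(manifest_branch=1, active_branch_id=2)
```

Keeping the override as an explicit keyword argument means callers that never pass `--branch` see exactly the previous resolution behaviour.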
Live validation against project 1143 (99_Playground_Max), branch 388072: - `sync diff --branch 388072` reaches the dev branch and reports `remote_only: 31`; without `--branch`, same call reports no remote diff. - `storage create-table --if-not-exists` end-to-end: first call returns `action: "created"`; second call (same name) returns `action: "skipped", skip_reason: "table already exists"`; third call WITHOUT the flag returns the original `STORAGE_JOB_FAILED` error envelope. Full test suite (3610 passed, 110 skipped) green; `ty check` clean; `ruff check` + `ruff format --check` clean. * release: 0.47.0 — fresh-CREATE writeback, semantic-layer reads, sync/storage ergonomics Bumps version to 0.47.0 and walks the full silent-drift sync map mandated by CONTRIBUTING.md §17 + §322-425 for the three feature commits already on this branch: aaf83bc fix(sync push): writeback placeholder manifest entries in place + propagate KBC.* metadata on create c6ce7ad feat(semantic-layer): add search-context + get-context for project-wide reads 40df5fa feat(sync,storage): add --branch override, --if-not-exists, --no-name-drift-warnings Version + auto-regenerated artefacts (CI-checked): - pyproject.toml: 0.46.1 -> 0.47.0 - .claude-plugin/marketplace.json + plugins/kbagent/.claude-plugin/plugin.json re-synced via `make version-sync` - plugins/kbagent/skills/kbagent/SKILL.md decision table regenerated via `make skill-gen` (now lists search-context and get-context) - src/keboola_agent_cli/changelog.py: new 0.47.0 entry covering all three fixes + the no-sync-router exemption note - uv.lock: keboola-agent-cli pin advanced to 0.47.0 Hand-maintained surfaces (silent-drift risks; not CI-checked): - src/keboola_agent_cli/commands/context.py AGENT_CONTEXT -- sync push/pull/diff signatures updated for --branch / --no-name- drift-warnings; storage create-table gains --if-not-exists; semantic-layer search-context / get-context added. 
- CLAUDE.md `## All CLI Commands` -- storage create-table gains --if-not-exists; semantic-layer search-context + get-context added. (Sync commands are not currently listed in this file -- pre-existing gap, out of scope for this PR.) - plugins/kbagent/agents/keboola-expert.md Tool Selection Matrix -- the existing semantic-layer "list models / entities" row points at search-context / get-context for project-wide glob/id lookup. The 60 KB budget for the keboola-expert prompt is tight (closing at 59944 bytes); the addition was kept terse rather than expanding the matrix with a fresh row. - plugins/kbagent/skills/kbagent/references/commands-reference.md -- storage create-table gains --if-not-exists note; sync push/pull/diff gain --branch and --no-name-drift-warnings notes; new bullets for semantic-layer search-context and get-context. - plugins/kbagent/skills/kbagent/references/gotchas.md -- four new `(since v0.47.0)` sections: fresh-CREATE writeback contract change, --branch override semantics, storage --if-not-exists envelope, --no-name-drift-warnings opt-out, and the search-context / get- context MCP-parity note. - plugins/kbagent/skills/kbagent/references/sync-workflow.md -- new "Per-invocation dev-branch override" and "Fresh-CREATE writeback" sections with worked examples. `make check` passes clean (lint + format + skill + version + changelog + error-codes + 3610 tests). Pre-existing PR-body-relevant exemptions documented in changelog: - `kbagent sync push/pull/diff` (and their new --branch flag) remain filesystem-local and intentionally have no REST router in src/keboola_agent_cli/server/routers/. Permitted by the CONTRIBUTING Plugin Synchronization map ("terminal-only / filesystem-bound commands"). All other new surfaces (storage create-table, semantic- layer search-context / get-context) are exposed 1:1 over HTTP. 
* review: iteration-2 fixes (multi-branch writeback safety, metadata error accumulation, skip render, E2E coverage) Independent reviewer findings from iteration 2 (no BLOCKING; 4 NON-BLOCKING + 3 NIT in code; 2 NIT in security). All material items addressed: 1. `_writeback_create_config_in_manifest` now matches placeholders on `(branch_id, component_id, path)` instead of `(component_id, path)` alone. Without this, a multi-branch manifest with the same logical config path under two branches could update the wrong branch's entry. The placeholder branch_id is also no longer overwritten by the helper -- the match already proves it's correct. New regression test: `test_writeback_config_does_not_match_across_branches`. 2. `_propagate_kbc_metadata` now returns the API error message on a non-fatal write failure (the config IS already created and the manifest writeback is complete; aborting the rest of the push mid-loop was the worse failure mode). The push loop accumulates the message into the existing `errors[]` list under a new `change_type: "metadata_propagation"` entry so callers can see what went wrong without losing the rest of the push. Added a docstring "not a secret store" note about KBC.* keys. New unit test: `test_propagate_kbc_metadata_returns_error_message_on_api_failure`. 3. `kbagent storage create-table --if-not-exists` human-mode renderer now prints "Skipped (already exists): <table_id>" + the reason when `result["action"] == "skipped"`, instead of the misleading "Created table: ..." line. JSON envelope unchanged. New CLI test: `test_human_renders_skip_when_action_is_skipped`. 4. E2E coverage in `tests/test_e2e.py::TestE2E_0_47_0_NewSurfaces`: - `storage create-table --if-not-exists` round-trip: created -> skipped -> raises without flag (binds the action envelope shape and the STORAGE_JOB_FAILED reraise path). 
- `semantic-layer search-context` envelope shape + type filter narrowing; `get-context` NOT_FOUND on all-zero UUID; roundtrip search -> get when the project has at least one searchable entity. - `sync diff --branch <id>`: creates a throwaway dev branch on the fly, asserts diff reaches that branch (status=ok, changes != None, remote_only >= 0), cleans up the dev branch in teardown. Items kept but not changed: - Pyright lambda-parameter noise in tests (`url`/`token` "not accessed"): pre-existing across the test suite, idiomatic for the client_factory signature; ty is clean. - FastAPI router `type` parameter name in `semantic_layer.search-context`: cosmetic shadow of the builtin; ignoring. `make check` clean: 3613 passed, 7 skipped, 106 deselected. `make test-e2e-local CONFIG_DIR=/tmp/kbc-config-e2e ALIAS=e2e-1143 PYTEST_ARGS=...TestE2E_0_47_0_NewSurfaces` -- all 3 new e2e tests pass against project 1143 (529s; 67 passed, 6 skipped, 0 failed total when running the broader e2e suite). * review: iteration-3 convergence cleanup (changelog accuracy, e2e docstring off-by-one) Iteration 3 (independent convergence reviewer) returned "CONVERGED -- zero material findings." Two documentation NITs noted and fixed here: - changelog.py:12 -- the 0.47.0 entry's prose still said `_writeback_create_config_in_manifest` matches placeholders by `(component_id, path)`. Iteration 2 narrowed the key to `(branch_id, component_id, path)` for multi-branch safety; the changelog now reflects the final key and notes the why. - tests/test_e2e.py docstring on TestE2E_0_47_0_NewSurfaces -- said "All four touch a real Keboola project" but the class has three test methods. Fixed to "All three". Iteration 3 also flagged one pre-existing scope item that iteration 2 did NOT introduce: - `storage create-table --if-not-exists` skipped-envelope reports the USER-REQUESTED `primary_key` / `columns` (from the create call's args), not the EXISTING table's actual schema. 
A caller that relies on the envelope to discover the real schema would get the wrong shape. Out of scope for this PR; will be filed as a separate follow-up issue against keboola/cli before merge per the deferred- scope-orphan rule. `make check` clean: 3613 passed, 7 skipped, 106 deselected. Branch is convergence-clean. Next: pause for user authorization before opening the PR. * test(e2e): add Area B fresh-CREATE writeback + KBC.* propagation against real API Closes the most important e2e coverage gap that the user flagged: the Area B headline fix (writeback in place + KBC.configuration.* metadata propagation on CREATE) was only live-validated manually in the earlier session; nothing in the test suite would catch a regression. New test `TestE2E_0_47_0_NewSurfaces::test_sync_push_fresh_create_writeback_and_kbc_metadata`: - Creates a throwaway dev branch on the configured project. - `sync init` then `sync pull --branch <dev>` so the dev branch lands in the manifest. - Hand-authors a placeholder ManifestConfiguration with `KBC.configuration.folderName` declared (FIIA / scaffold pattern), writes a matching `_config.yml` locally. - `sync push --branch <dev>` -- asserts `created=1, errors=[]`. - Manifest invariants: length unchanged (writeback in place, NO duplicate), placeholder id replaced with the API-assigned ULID, `KBC.configuration.folderName` preserved on the entry under the right `branch_id`. - `config metadata-list` against the new id verifies the folderName landed on the remote via the metadata API. - `sync push` second invocation -- asserts `created=0` (idempotent). - Teardown deletes the dev branch (and with it every config created inside it) so re-runs don't accumulate residue. Also: `yaml` import added at the top of `tests/test_e2e.py` -- the file uses `yaml.dump` in the placeholder fixture builder. Live validated: - New test passes against project 1143 (e2e-1143 / 99_Playground_Max) in 9.44s (direct pytest invocation). 
- Full e2e suite still 67 passed, 6 skipped; the one failing test (`TestFullE2E::test_full_cli_e2e::_test_file_operations`) is an unrelated pre-existing flake against the Storage Files index lag and is not introduced by this change. * review: /kbagent:review iteration-4 fixes (VERSION GATE drift, follow-up issue, gotchas caveat, fnmatch import) The /kbagent:review subagent caught one BLOCKING and three lower- severity findings that the iteration-2 + iteration-3 independent reviewers had missed. All four addressed here: [B-1] keboola-expert.md §1 Rule 6 VERSION GATE not updated for 0.47.0. An agent on 0.46.x would attempt `semantic-layer search-context` / `get-context` / `storage create-table --if-not-exists` / `sync push|pull|diff --branch` and get "No such command", silently losing the stated MCP-parity benefit. Added a single 0.47.0+ row covering all four new surfaces + the fresh-CREATE writeback + KBC.* propagation behavior change. Stayed under the 60000-byte prompt budget by also tightening the verbose 0.41.0 `semantic-layer` build-heuristic note from a five-line wall to a one-line inline. [NB-1] PR body had `TBD` for the deferred-scope follow-up issue. Filed keboola#349 with a complete repro + suggested fix shape and updated the PR body to link it. Tracking the design-surface scope outside this PR per the deferred-scope-orphan-prevention rule. [NB-2] gotchas.md `--if-not-exists` entry documented the happy path but did not warn that the skipped envelope's `columns` / `primary_key` mirror the user's REQUEST, not the EXISTING table's actual schema. Added an explicit caveat referencing keboola#349 and pointing callers at `storage table-detail` if they need the real shape. [NIT] `import fnmatch` inline inside `_matches_any_pattern` static method body. Hoisted to the top-level imports in `semantic_layer_service.py` for consistency with `permissions.py` and the rest of the file. `make check` clean (3613 passed, 7 skipped). 
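For readers unfamiliar with the matching behind `search-context`, here is a small sketch of a pattern helper with `fnmatch` imported at module level, as the NIT above requests. The function names are illustrative, and the use of `fnmatch.fnmatchcase` is an assumption: it is the standard-library call that is case-sensitive on every platform, but the PR does not say which function the service calls.

```python
import fnmatch


def matches_any_pattern(name, patterns):
    """True when the name matches at least one glob (union of patterns)."""
    return any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)


def search_names(names, patterns=("*",), limit=None):
    """Collect matching names, stopping early once `limit` hits are found."""
    if limit is not None and limit <= 0:
        raise ValueError("limit must be a positive integer")
    hits = []
    for name in names:
        if matches_any_pattern(name, patterns):
            hits.append(name)
            if limit is not None and len(hits) >= limit:
                break  # short-circuit: no need to scan the remaining names
    return hits


names = ["rev_total", "Rev_net", "cost"]
narrowed = search_names(names, ["rev_*"])
union = search_names(names, ["rev_*", "cost"])
limited = search_names(names, limit=1)
```

Because matching is case-sensitive, `Rev_net` is not returned for `rev_*`.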
`tests/test_agent_prompt.py` budget check green (60000-byte ceiling respected). `ty check` clean. * review: /kbagent:review iteration-5 fixes (file-size ceiling, hint registry) The second /kbagent:review pass returned APPROVE with two new findings the prior reviewers hadn't surfaced. Both addressed here: [NB-1] services/semantic_layer_service.py crossed the CONTRIBUTING.md hard ceiling (1500 LOC for services/*.py): 1480 -> 1640 LOC during this PR. Extracted search_context + get_context into a new sibling helper `services/_semantic_layer_lookup.py` following the existing `_semantic_layer_crud.py` / `_semantic_layer_internals.py` / etc. pattern. The helpers now own the metastore client lifecycle (open + finally close) via an `open_client: Callable[[], MetastoreClient]` factory the service injects with a 1-line lambda; the service methods are pure 1-line delegators. semantic_layer_service.py: 1640 -> 1496 LOC (under the hard ceiling). _semantic_layer_lookup.py: new file, 187 LOC. Two minor banner-comment trims (`# Helpers (used by every subcommand)`, `# Phase 3 — Read commands`) collapsed to single inline comments to bring the count just under the 1500 ceiling without changing any behavior or structure. [NIT-1] hints/definitions/storage.py `create-table` `ServiceCall` args was missing `if_not_exists`. An AI agent following `--hint service` would generate non-idempotent code even when the caller wanted the IF-NOT-EXISTS path. Added the arg + a `notes[]` line documenting the 0.47.0+ flag. Verification: - `make check` clean: 3613 passed, 7 skipped, 107 deselected - `ty check` clean - The two existing e2e tests touched by this change (test_semantic_layer_search_and_get_context, test_sync_push_fresh_create_writeback_and_kbc_metadata) re-run against project 1143 and still pass in 13.67s - The Pyright "Import could not be resolved" diagnostics on the new `_semantic_layer_lookup` import are stale-cache artifacts; ty is the project's authoritative typechecker and it is green. 
Helper-design choice: the extraction inverts the client-lifecycle ownership (helpers open + close vs. service open + pass-in close). This matters because the service methods become genuinely 1-line and the orchestrator class stays well under budget for future growth. The cost is one extra import (`Callable` via TYPE_CHECKING) in the helper; gain is ~50 LOC saved in the orchestrator on top of the ~140 LOC moved out.

* refactor(semantic-layer): switch try/finally + close() to `with open_client() as client:`

Addresses the lone NIT from the /kbagent:review iteration-5 pass. CONTRIBUTING.md prefers the `with` form for resources that implement `__enter__`/`__exit__`; `MetastoreClient` has had both since v0.41.0 but every method in `services/semantic_layer_service.py` was using the older `try/finally + client.close()` idiom. The new `_semantic_layer_lookup.py` (added in iteration 5) inherited that pattern. The reviewer flagged the new helper but the right fix is to sweep the whole service for consistency, not patch just the new file.

20 single-client sites converted via a small one-shot Python rewrite (`/tmp/refactor_with.py`) that matched the canonical shape:

    client = self._new_metastore_client(project)
    try:
        ...
    finally:
        client.close()

    -->

    with self._new_metastore_client(project) as client:
        ...

Body indentation is unchanged (the try-block and the with-block use the same +4 indent).

1 cross-project (promote) site converted by hand to the new parenthesized multi-context-manager form:

    with (
        self._new_metastore_client(projects[from_project]) as src_client,
        self._new_metastore_client(projects[to_project]) as tgt_client,
    ):
        ...

`_semantic_layer_lookup.py::run_search_context` and `run_get_context` also switched to `with open_client() as client:`. The factory pattern the service injects still works: `lambda: self._new_metastore_client(self._resolve_one_project(alias))` returns a MetastoreClient configured per the resolved project.
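A runnable toy version of the `with open_client() as client:` pattern is shown below. `FakeMetastoreClient` and its `fetch` method are hypothetical stand-ins for the real `MetastoreClient`; the point is only that `__exit__` closes the client on both the success path and the error path.

```python
class FakeMetastoreClient:
    """Hypothetical client implementing the context-manager protocol."""

    def __init__(self, records):
        self.records = records
        self.closed = False

    def fetch(self, context_id):
        if context_id not in self.records:
            raise KeyError(context_id)
        return self.records[context_id]

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # never swallow the exception


def get_context(open_client, context_id):
    """Helper owns the lifecycle: open via the factory, close via __exit__."""
    with open_client() as client:
        return client.fetch(context_id)


client = FakeMetastoreClient({"abc": {"type": "dataset"}})
result = get_context(lambda: client, "abc")
```

Injecting a zero-argument factory (here a lambda) keeps the helper unaware of how the project alias is resolved, which is the ownership inversion the commit message describes.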
Test fixture updates: `_make_service` in `test_semantic_layer_service.py` now sets `mock.__enter__ = MagicMock(return_value=mock)` + `mock.__exit__ = MagicMock(return_value=False)` on the injected MagicMock so a `with` block over the mock yields the same body the test configures side-effects on. The `TestPromoteModel` cross-project fixture got the same treatment for both source and target mocks. `mock.close.assert_called_once()` -> `mock.__exit__.assert_called_once()` across 5 sites (the cleanup is now invoked via __exit__, not close). Verification: - `make check` clean: 3613 passed, 7 skipped, 107 deselected. - All 4 e2e tests against project 1143 still pass in 32.91s: storage create-table --if-not-exists round-trip, semantic-layer search-context + get-context, sync diff --branch, fresh-CREATE writeback + KBC.* propagation. - `ty check` clean. Net diff: -55 LOC in semantic_layer_service.py (now well under the 1500-LOC ceiling at ~1441) thanks to losing the explicit finally/close at every site. * fix(ci): add @skip_without_credentials to TestE2E_0_47_0_NewSurfaces GitHub CI's `make check` runs `pytest -m "not e2e"` and the `TestE2E_0_47_0_NewSurfaces` class was correctly tagged with `@pytest.mark.e2e`, but I missed the second decorator the other E2E classes in this file use: `@skip_without_credentials`. Without it, when CI somehow does collect the class (e.g. via `pytest tests/` without the `-m "not e2e"` filter, or via another wrapper), the fixture tries to read `os.environ[ENV_TOKEN]` and raises `KeyError` during setup rather than skipping cleanly. Reproducer: unset E2E_API_TOKEN; uv run pytest tests/test_e2e.py::TestE2E_0_47_0_NewSurfaces -> 4 ERROR ... 
KeyError: 'E2E_API_TOKEN'

After the fix:

    unset E2E_API_TOKEN; uv run pytest tests/test_e2e.py::TestE2E_0_47_0_NewSurfaces
    -> 4 SKIPPED in 0.20s

This matches the pattern every other E2E class in the file uses (see TestFullE2E, TestE2EErrorHandling, TestE2EJsonConsistency, TestE2ESyncWorkflow -- all stack `@skip_without_credentials` ABOVE `@pytest.mark.e2e`).

Verified GitHub Actions log for run 26441694904 showed exactly this failure mode: "ERROR ... KeyError: 'E2E_API_TOKEN'" on all four new tests, with 3610 non-e2e tests passing alongside.
…ema on skip (keboola#349) (keboola#350) The action:"skipped" envelope now returns the EXISTING table's columns / primary_key / name (sourced from the get_table_detail probe that confirms the table exists) instead of re-echoing the caller's request. The requested values are preserved under requested_columns / requested_primary_key, and a new schema_drift flag marks when the existing table diverges from the request. Human-mode output shows the actual schema on a skip and warns on drift. Bumps 0.47.0 -> 0.47.1.
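The shape of the revised skipped envelope might be assembled along these lines. The field names come from the commit message; the helper name, the `existing` dict layout, and the order-sensitive drift comparison are assumptions made for illustration.

```python
def build_skipped_envelope(existing, requested_columns, requested_primary_key):
    """Report the EXISTING table's schema and keep the request alongside it."""
    drift = (
        list(existing["columns"]) != list(requested_columns)
        or list(existing["primary_key"]) != list(requested_primary_key)
    )
    return {
        "action": "skipped",
        "skip_reason": "table already exists",
        "name": existing["name"],
        "columns": existing["columns"],
        "primary_key": existing["primary_key"],
        "requested_columns": requested_columns,
        "requested_primary_key": requested_primary_key,
        "schema_drift": drift,
    }


existing_table = {"name": "orders", "columns": ["id", "amount"], "primary_key": ["id"]}
envelope = build_skipped_envelope(existing_table, ["id", "total"], ["id"])
```

Here the requested `total` column differs from the existing `amount`, so `schema_drift` is set while `columns` still reflects what is actually in storage.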
* fix workspace login type for snowflake
* format workspace login type tests
* fix snowflake workspace keypair creation
* Address Snowflake workspace review feedback
…r-group (keboola#355)

* docs(contributing): deprecate --hint requirement, make tool-matrix per-group

Two policy fixes surfaced by the dev-portal review (PR keboola#354), where two of the repo's own rules collided with each other:

- --hint code generation is already deprecated in favour of the `kbagent serve` REST API (CLAUDE.md, gotchas.md), but the per-command checklist and the kbagent-pr-reviewer prompt still demanded a hints/definitions entry per command and flagged its absence BLOCKING. Drop the requirement; existing hint definitions stay for back-compat but are no longer extended, and reviewers must not flag a missing one.
- keboola-expert.md §2 Tool Selection Matrix is a static subagent system prompt loaded eagerly into every run, with a hard 60 KB budget. The rule demanding one matrix row PER COMMAND fought that budget directly. Make it one row per command GROUP; exhaustive per-command detail lives in AGENT_CONTEXT (loaded dynamically). A missing matrix row is no longer BLOCKING. Trim stale content rather than raising the cap.

* docs(contributing): clarify matrix is author-expected but NON-BLOCKING in review
…E; --branch promotes default tree (keboola#360) Fresh-CREATE variable binding (KFR-03/04/05): a transformation scaffolded with its keboola.variables config + values row is runnable after one push. --branch promotes the default tree when no per-branch subtree exists (KFR-07). Bumps 0.47.2 with full plugin/doc-sync.
… flags (0.48.0) (keboola#356)

* feat(feature): add `kbagent feature` command group for Manage API feature flags (0.48.0)

Adds a new `feature` command group for listing and managing Keboola feature flags via the Manage API, following the same 3-layer design and super-admin-token policy as `org` / `member`.

Commands (7):

    feature list           --project ALIAS   # stack catalogue (GET /manage/features)
    feature project-show   --project ALIAS   # features assigned to a project
    feature project-add    --project ALIAS --feature NAME [--dry-run] [--yes]
    feature project-remove --project ALIAS --feature NAME [--dry-run] [--yes]
    feature user-show      --project ALIAS --email EMAIL
    feature user-add       --project ALIAS --email EMAIL --feature NAME [--dry-run] [--yes]
    feature user-remove    --project ALIAS --email EMAIL --feature NAME [--dry-run] [--yes]

Token security: the super-admin Manage API token is resolved via the existing default-deny `resolve_manage_token()` (interactive hidden prompt; never persisted; never a CLI argument; `--allow-env-manage-token` opt-in for CI). `--project ALIAS` resolves the stack URL (and, for project ops, the numeric project_id) from config.

Layers:
- manage_client.py: list_features / add|remove_project_feature / get_user / add|remove_user_feature (email + feature URL-encoded in path; POST body {"feature": NAME} sent as application/json)
- models.py: Feature model (only `name` stable; extras pass through; bare-string features normalised to {"name": ...})
- services/feature_service.py: FeatureService (alias resolve, dry-run, feature normalisation)
- commands/feature.py: thin Typer layer, dual output, dry-run/confirm
- permissions.py: list/*-show=read, *-add=admin, *-remove=destructive

Notes:
- There is no dedicated per-project feature-list endpoint; project/user features are read from the project/user object's `features` array.
- Bumped PROMPT_BYTE_BUDGET 60k -> 62k to fit the keboola-expert matrix row (split into specialists if it keeps growing).
Tests: test_feature_service.py (19), test_feature_cli.py (21), test_manage_client.py + test_models.py extensions, read-only E2E test_feature_flags_read_e2e (opt-in `make test-e2e-feature`). Docs/sync surfaces: CLAUDE.md, AGENT_CONTEXT, keboola-expert.md, SKILL.md (+ regenerated table), commands-reference.md, gotchas.md, changelog.py; version 0.47.1 -> 0.48.0 (version-sync). * fix(feature): adaptive Rich table -- omit empty Title/Type/Description columns Project/user feature arrays come back from the Manage API as bare strings (name-only), so the optional columns were rendered uniformly empty. The table now shows a column only when at least one feature populates it: the stack catalogue (GET /manage/features) keeps Title/Type/Description, while project-show / user-show collapse to just Name. JSON output is unchanged. * fix(feature): address PR review -- serve REST parity, dataclass, arg order, 204 guard Resolves the kbagent-pr-reviewer findings on keboola#356: B-1 (blocking): add `server/routers/feature.py` -- a 1:1 `kbagent serve` REST router for all 7 feature commands, wired into app.py (import, OpenAPI tag, include_router) and ServiceRegistry.feature. Every endpoint requires the X-Manage-Token header, mirroring the `members` / `org` routers. Closes the serve parity gap that would have 404'd all /feature/* paths. NB-1: replace the bare `tuple[str, int]` returned by `FeatureService._resolve_alias` with a frozen `_ResolvedAlias` dataclass (stack_url, project_id), per CONTRIBUTING Code Quality Patterns. NIT-1: flip `formatter.error(...)` to error_code-first argument order in commands/feature.py. Open question: the POST add paths (`add_project_feature` / `add_user_feature`) now tolerate a 204 No Content body (`response.json() if response.content else {}`) instead of assuming a JSON body on every stack. Tests: 7 new server router tests in test_server_router_calls.py (token pass-through for all endpoints + 401 on missing X-Manage-Token). 
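The bare-string normalisation and the adaptive table columns described above can be illustrated together. This is a sketch only: `normalise_feature`, `visible_columns`, and the lowercase column keys are hypothetical, and the real renderer builds a Rich table rather than a list of column names.

```python
OPTIONAL_COLUMNS = ("title", "type", "description")


def normalise_feature(raw):
    """Turn a bare-string feature into {"name": ...}; pass dicts through."""
    return {"name": raw} if isinstance(raw, str) else dict(raw)


def visible_columns(features):
    """Always show the name; show an optional column only if some row fills it."""
    columns = ["name"]
    columns += [c for c in OPTIONAL_COLUMNS if any(f.get(c) for f in features)]
    return columns


project_features = [normalise_feature(f) for f in ["alpha-flag", "beta-flag"]]
catalogue = [
    normalise_feature(
        {"name": "alpha-flag", "title": "Alpha", "type": "project", "description": "Demo"}
    )
]
project_columns = visible_columns(project_features)
catalogue_columns = visible_columns(catalogue)
```

With name-only project features the table collapses to a single column, while the stack catalogue keeps all four.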
Full suite 3690 passed; changelog 0.48.0 entry updated. * fix(feature): close re-review nits -- serve auth prefix + APP_DESCRIPTION Follow-up to the re-review on keboola#356 (verdict APPROVE): - Add "/feature" to `_allow_static_through_auth`'s `api_prefixes` in app.py so feature GET endpoints are auth-gated in --ui mode like every other API group (the manage-token check already fired second, so no data leaked, but the prefix omission was a logic gap). - Mention "feature flags" in the APP_DESCRIPTION Project Management layout bullet for completeness (feature already has its own OpenAPI tag).
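A minimal sketch of the two rendering rules described for keboola#356 above: bare-string features are normalised to {"name": ...}, and an optional column is shown only when at least one feature populates it. The names `normalise_feature`, `visible_columns` and `OPTIONAL_COLUMNS` are hypothetical and not taken from the kbagent source; the lower-case column keys are an assumption too.

```python
from typing import Any, Dict, List

# Assumed keys for the optional columns (Title / Type / Description).
OPTIONAL_COLUMNS = ("title", "type", "description")


def normalise_feature(raw: Any) -> Dict[str, Any]:
    """Turn a bare-string feature into {"name": ...}; copy dicts unchanged."""
    if isinstance(raw, str):
        return {"name": raw}
    return dict(raw)


def visible_columns(features: List[Dict[str, Any]]) -> List[str]:
    """Always show `name`; show an optional column only if some row fills it."""
    columns = ["name"]
    for key in OPTIONAL_COLUMNS:
        if any(feature.get(key) for feature in features):
            columns.append(key)
    return columns
```

With name-only project/user features the result collapses to a single `name` column, while a catalogue entry that carries a title, type and description keeps all four.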
…te safety (keboola#354) * docs: brainstorm spec for kbagent dev-portal command group Adds the design document produced during /superpowers:brainstorming for wrapping the Keboola Developer Portal API (apps-api.keboola.com) in kbagent. Spec covers data model (multi-identity, mirrors KB project storage), client/service/command 3-layer split, the random-code TTY confirm safety bar with no env-var bypass, v1 op scope (list/get/create/patch/upload-icon/publish/deprecate plus peers lookup), permission-registry integration, testing layout, and the rule keboola#17 documentation-sync checklist. No implementation yet. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(dev-portal): drop peers from spec + add implementation plan Removes the `dev-portal peers` helper from v1 scope -- the agent can compose peer-config research from `list` + `get` directly. Adds the 16-task implementation plan produced by /superpowers:writing-plans. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(dev-portal): add ErrorCode entries for Developer Portal * refactor(safety): extract require_random_code_confirmation() to _helpers Move the load-bearing safety primitive from commands/permissions.py into commands/_helpers.py so upcoming Developer Portal write commands can reuse it without duplicating the guard logic. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(dev-portal): add DeveloperPortalIdentity model + AppConfig fields * feat(dev-portal): ConfigStore methods for identity CRUD * feat(dev-portal): extend CLAUDE_CONFIG_WARNING to mention DP credentials * feat(dev-portal): client skeleton + token-path login Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(dev-portal): MFA login path via /dev/tty Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(dev-portal): client reads + create/patch/publish/deprecate * feat(dev-portal): icon upload (two-hop, presigned S3 PUT) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(dev-portal): service with identity CRUD + verify-on-add DeveloperPortalService skeleton: add/list/remove/edit/rename/use/verify identity methods. add_identity runs the login probe BEFORE persisting so bad credentials never land in config.json. Tests cover happy path, verify-failure-no-persist, use_identity default, and remove. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(dev-portal): service reads + prepare/apply + diff + publish pre-flight * feat(dev-portal): permission registry entries + identity resolver Add 13 dev-portal.* entries to OPERATION_REGISTRY, resolve_identity_alias() helper, and get_dev_portal_service() factory to _helpers.py; cover with TestDevPortalPermissions test class. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(dev-portal): identity subcommands + list/get Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(dev-portal): reconcile permission registry keys + wire callbacks Task 12 added entries with hyphenated identity names (dev-portal.identity-add) that are unreachable -- check_cli_permission builds keys from the Typer tree as `{group}.{subcommand}`, so identity-sub-app leaves are `dev-portal.identity.add` (dotted). Task 13 added the correct dotted form and the original hyphenated entries were dead. 
This: - Drops the 6 unreachable hyphenated identity entries (Task 12 leftovers). - Adds `dev-portal.identity: read` so the parent-callback descent is allowed. - Realigns categories to data-app.secrets-* precedent: credential add/edit are `write` (admin is reserved for org-level ops). - Wires callbacks on dev_portal_app and identity_app so the engine actually fires on these commands. - Updates TestDevPortalPermissions to assert on the actual runtime keys. * feat(dev-portal): write commands gated by random-code confirm Add create / patch / upload-icon / publish / deprecate commands under `kbagent dev-portal`. Each write command calls `_assert_tty()` as its very first action (before any file I/O or API calls), refusing with exit 6 on non-TTY shells. `--dry-run` bypasses both the TTY check and the random-code prompt, prints a preview, and exits 0. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(dev-portal): version 0.48.0, E2E test, AGENT_CONTEXT, plugin docs Bump version to 0.48.0 for the `kbagent dev-portal` command group and update all silent-drift surfaces per CONTRIBUTING.md rule keboola#17. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(dev-portal): trim keboola-expert.md back to 60K prompt budget Task 15's additions to keboola-expert.md (version-gate, tool-matrix row, inline gotcha) pushed it 1,652 bytes over the CI-enforced 60,000-byte budget. Drops all three additions; dev-portal coverage in the agent surface lives in AGENT_CONTEXT, SKILL.md decision table, commands-reference.md, dev-portal-workflow.md, and gotchas.md (which have no equivalent size cap). File now back to 59,992 bytes. Matches the pattern of fe246e5 (which previously reverted a similar edit for the same reason). * docs(dev-portal): add command help text + regenerate SKILL.md The auto-generated decision table in SKILL.md is sourced from each Typer command's help text via scripts/generate_skill.py. 
Task 13/14 didn't add help= to the @dev_portal_app.command decorators, so the table generator skipped every dev-portal command and the prior manual entries got stripped on make skill-check. Adds concise help text to every dev-portal command (identity sub-app + list/get + create/patch/upload-icon/publish/deprecate), then regenerates SKILL.md. The auto-generated rows now correctly enumerate the full dev-portal surface. * fix(dev-portal): authenticate once across prepare/apply (no double MFA) patch and publish opened a fresh portal client in prepare_* (to read current state) and again in apply() (to write), each triggering its own /auth/login. On a personal MFA account that meant TWO MFA prompts for a single write; on a service account it was a redundant second login. The service now caches the bearer per alias (in-memory, rebuilt per CLI invocation) and seeds it into the apply() client via a new client.seed_bearer()/bearer API, so the human authenticates at most once per command. create/deprecate/upload-icon were already single-login (their prepare_* defers auth) and are unaffected. Also type BaseHttpClient.__enter__ as Self so `with <subclass>() as c` keeps the concrete type — this clears the pre-existing ty errors where DeveloperPortalClient methods were unresolved on BaseHttpClient inside `with` blocks (dev_portal_service + test_dev_portal_client). * feat(dev-portal): expose read endpoints via kbagent serve Add server/routers/dev_portal.py with GET /dev-portal/apps (list by vendor) and GET /dev-portal/apps/{app} (get one), wired into the FastAPI app + ServiceRegistry. Mirrors `kbagent dev-portal list|get` so external consumers (Web UI, scheduled agents) can do peer-config research over the REST surface. Writes (create/patch/upload-icon/publish/deprecate) and identity management stay CLI-only by design: writes require a human to type a random code on a TTY (meaningless over HTTP), and identity commands handle login credentials that must not travel over this API. 
The skip is documented in the router module docstring and the OpenAPI tag. * docs(dev-portal): add Tool Selection Matrix row for keboola-expert Give the dev-portal group one per-group row in §2 (reads agent-safe; writes TTY-confirmed, never raw apps-api). Stays under the 60 KB prompt budget by trimming now-deprecated `--hint client` fallbacks from five existing matrix rows (per the policy merged in keboola#355: --hint is superseded by `kbagent serve`). * fix(dev-portal): require bearer auth on GET /dev-portal/* in serve --ui `_is_ui_public` treats any GET not matching `api_prefixes` as an SPA route and serves it without bearer validation. The new `/dev-portal` router was not added to that allow-list, so `GET /dev-portal/apps` was reachable unauthenticated in `kbagent serve --ui` mode (script/curl callers; browser users with the session cookie were unaffected). Add `/dev-portal` to `api_prefixes` and cover it with a 401-without-auth test. * fix(dev-portal): evict stale cached bearer on auth error The per-alias bearer cache is harmless in the CLI (service rebuilt per invocation) but the `kbagent serve` ServiceRegistry is a long-lived singleton, so a cached bearer outlives its portal-side TTL. Once expired, every `_authed_client` call re-seeded the dead token and 401'd forever -- permanent lockout until restart. `_authed_client` now drops `_bearers[alias]` on INVALID_TOKEN / DP_LOGIN_FAILED so the next call re-authenticates. Regression test seeds a stale bearer, asserts the 401 propagates AND the entry is evicted, then that a follow-up call logs in fresh and succeeds. * docs(dev-portal): document dry-run portal GET, type previews, version gate Review polish (non-blocking): - patch/publish `--dry-run` help + dev-portal-workflow.md now state that the preview still logs in and GETs the app (needs connectivity; a personal/MFA identity prompts for MFA) -- use a service account for a fully non-interactive preview. 
- `_render_pending` / `_pending_as_json` drop their `# type: ignore` workarounds and annotate with `OutputFormatter` / `PendingWrite` (TYPE_CHECKING import). - keboola-expert.md §1 VERSION GATE lists `dev-portal = 0.48.0+`; offset by trimming two now-deprecated `--hint client` prose mentions (stays under the 60 KB prompt budget). * chore(dev-portal): bump to 0.49.0 (0.48.0 taken by keboola#356 feature flags) keboola#356 (`kbagent feature` command group) merged to main as 0.48.0 and that tag is cut, so dev-portal moves to its own 0.49.0 instead of colliding. - pyproject / plugin.json / marketplace.json / uv.lock -> 0.49.0 - changelog: dev-portal entries live under a new 0.49.0 key, above main's 0.48.0 (feature flags) block (done during the rebase) - flip dev-portal "since v0.48.0" doc tags -> v0.49.0: AGENT_CONTEXT, commands-reference, gotchas, keboola-expert version-gate + matrix row --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Petr <petr@keboola.com>
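The stale-bearer fix in keboola#354 above (drop the cached bearer on an auth error so the next call re-authenticates) can be sketched roughly as follows. This is an illustration under stated assumptions, not the DeveloperPortalService code: `BearerCache`, `AuthError`, `seed` and `call` are hypothetical names, and `AuthError` stands in for the INVALID_TOKEN / DP_LOGIN_FAILED cases.

```python
from typing import Callable, Dict, TypeVar

T = TypeVar("T")


class AuthError(Exception):
    """Hypothetical stand-in for INVALID_TOKEN / DP_LOGIN_FAILED."""


class BearerCache:
    """Per-alias in-memory bearer cache that evicts on auth failure."""

    def __init__(self, login: Callable[[str], str]) -> None:
        self._login = login
        self._bearers: Dict[str, str] = {}

    def seed(self, alias: str, token: str) -> None:
        """Pre-load a bearer, e.g. one obtained during a prepare step."""
        self._bearers[alias] = token

    def call(self, alias: str, request: Callable[[str], T]) -> T:
        token = self._bearers.get(alias)
        if token is None:
            # No cached bearer: authenticate once and remember the result.
            token = self._login(alias)
            self._bearers[alias] = token
        try:
            return request(token)
        except AuthError:
            # Evict the dead token so the NEXT call logs in fresh,
            # then let the original error propagate to the caller.
            self._bearers.pop(alias, None)
            raise
```

The failing call still surfaces its error; only the follow-up call pays for a new login, which matches the regression test described in the commit message.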
) (keboola#363) * feat(headless): token-only invocation via __env__ project (keboola#359) Let a daemon / container / CI run kbagent with only a token in the environment -- no `kbagent project add`, no config.json on disk. Setting KBAGENT_PROJECT_FROM_ENV=1 together with KBC_TOKEN + KBC_STORAGE_API_URL makes ConfigStore synthesize an in-memory project under the reserved alias `__env__`. Because both the CLI and `kbagent serve` resolve projects through the same ConfigStore.load() chokepoint, a single env-injection covers both consumption styles: kbagent --json storage file-upload --project __env__ --file X kbagent serve # POST endpoints take project=__env__ Security: - The `__env__` project is marked `ephemeral` and stripped by ConfigStore.save(), so the env token is never persisted, even when a write op triggers a config.json write. - Opt-in is the explicit flag, not the mere presence of KBC_TOKEN, to avoid a phantom project on a dev machine that exported KBC_TOKEN only for `project add`. - Flag set but credentials missing -> fail fast (exit 5), not a silent skip. Tests: 7 unit (test_config_store.py) + 3 E2E (test_e2e.py). Docs: changelog, keboola-expert.md, gotchas.md, commands-reference.md, context.py AGENT_CONTEXT, CLAUDE.md. Version 0.49.0 -> 0.50.0. * feat(project): normalize stack URLs (bare host, deep-link, trailing slash) UX follow-up on the headless mode. `KBC_STORAGE_API_URL` (and `project add --url` / `project edit --url`) previously rejected anything that was not already a clean `https://<host>` base -- a bare host like `connection.keboola.com` raised a pydantic ValidationError traceback. Add `normalize_stack_url()` as the single source of truth, used by the ProjectConfig field validator (safety net + clean stored value) and by ProjectService.add_project / edit_project (so token verification hits the right host). 
It accepts: - bare host connection.keboola.com - trailing slash https://connection.keboola.com/ - surrounding whitespace (paste artifact) - full project deep-link https://connection.keboola.com/admin/projects/10105/dashboard and reduces every form to https://<host>. Explicit non-https schemes (http://, file://, ftp://) are still rejected (SSRF / protocol-abuse guard). An unusable URL in the headless `__env__` injection now raises a clean ConfigError (exit 5) instead of a raw ValidationError traceback. Tests: 6 new model tests + 2 new env-injection tests; updated the old "reject no scheme" test to assert normalization. Full non-e2e suite: 3771 passed. * fix(headless): recover __env__ project_id from token, drop fake name `project list` showed `project_name="env (headless)"` and a null Project ID for the env-injected project -- the fake name was misleading and the ID was simply missing. ConfigStore.load() must stay offline (it runs many times per command and per serve request), so it cannot call verify_token to fetch the real project name. But Keboola Storage tokens are `{projectId}-{tokenId}- {secret}`, so the project_id is recovered offline from the token prefix. The project_name is left blank (honest) instead of a fake placeholder; `project status` / `project info` verify against the API and show the real name when a command actually needs it. Tests: assert project_id is parsed (901-...) and name is blank; a non-numeric token prefix leaves project_id unset without crashing. * review(keboola#363): version gate, guard __env__ mutations, service URL test Address the kbagent-pr-reviewer findings on keboola#363: - NB-1: add the 0.50.0 headless / URL-normalization entry to the Rule 6 VERSION GATE in keboola-expert.md (highest silent-drift surface). - NB-2: reject remove/edit/rename/set-branch on the env-synthesized __env__ project with a clear ConfigError instead of reporting a success that silently vanishes on the next load(). 
A real persisted project under the same alias (ephemeral=False) stays mutable. - NIT-1: add a service-layer test asserting add_project() normalizes a bare-host / deep-link URL through normalize_stack_url() before the verification client and before persisting. Tests: +5 (guard x2, service normalization x1, project_id parse x2 from earlier). Full non-e2e suite: 3775 passed. * fix(headless): skip org-info backfill for ephemeral __env__ project Devin review flagged that `project status` in headless mode could write a config.json to disk via `_backfill_org_info`: the __env__ project always has empty org_id/org_name, so the backfill kept trying to persist it. `save()` strips the ephemeral entry (so no token leaked), but the file was still created -- breaking the "no config.json on disk" promise -- and the futile backfill re-ran on every `project status`. Skip ephemeral projects when building the backfill update set. When __env__ is the only candidate, the update set stays empty and no file is written at all. Test: get_status() under env-injection leaves the config dir file-free. Full non-e2e suite: 3776 passed.
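For illustration, the URL normalization and offline project_id recovery described for keboola#363 above might look like the sketch below. It is written from the commit message alone: the real `normalize_stack_url()` may differ (for example in how it treats ports, which this sketch drops), and `project_id_from_token` is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import urlsplit


def normalize_stack_url(raw: str) -> str:
    """Reduce bare hosts, trailing slashes and deep links to https://<host>."""
    candidate = raw.strip()  # surrounding whitespace is a paste artifact
    if not candidate:
        raise ValueError("stack URL is empty")
    if "://" not in candidate:
        # Bare host such as connection.keboola.com: assume https.
        candidate = "https://" + candidate
    parts = urlsplit(candidate)
    if parts.scheme.lower() != "https":
        # Explicit non-https schemes stay rejected (SSRF / protocol abuse).
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError(f"no host in stack URL: {raw!r}")
    return f"https://{parts.hostname}"


def project_id_from_token(token: str) -> Optional[int]:
    """Read the numeric projectId prefix of {projectId}-{tokenId}-{secret}."""
    prefix, separator, _rest = token.partition("-")
    if separator and prefix.isascii() and prefix.isdigit():
        return int(prefix)
    return None  # non-numeric prefix: leave project_id unset, do not crash
```

Both helpers are pure functions with no network access, which is the property the commit relies on to keep ConfigStore.load() offline.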
…(0.50.0) (keboola#364) * feat(stream): `kbagent stream` command group for Data Streams (OTLP) (0.50.0) Add a `kbagent stream` command group so OpenTelemetry / OTLP Data Streams sources can be provisioned and introspected from the CLI instead of copy-pasting endpoints out of the Keboola UI (closes keboola#357). Commands: `stream list`, `stream create-source`, `stream detail`, `stream delete`. Architecture: - Stream control plane lives on a separate host derived from the project's Storage URL (connection.<region> -> stream.<region>, same scheme as ai./queue.) and authenticates with the per-project Storage token (X-StorageApi-Token) -- no manage token. - The OTLP ingestion endpoint (stream-in.<region>/otlp/.../<secret>) is returned by the API in source.otlp.url (never derived) with the secret in the URL path -- masked by default in every surface, --reveal to print it. - create-source --type otlp auto-provisions the logs/metrics/traces sinks (bucket in.c-otlp-<source>) so data actually lands in Storage, matching the UI; provisioning is idempotent and --no-sinks opts out. - create/delete/sink-create are async Tasks polled to completion. New layers: stream_client.py (StreamClient + create_sink + task polling), services/stream_service.py (alias resolution, secret masking, detail assembly, sink provisioning), commands/stream.py, server/routers/stream.py (1:1 serve REST). Wired into cli.py, permissions.py (read/write/destructive), constants.py, server dependencies/app. Tests: test_stream_client.py (14), test_stream_service.py (16), test_stream_cli.py (11); E2E test_stream_otlp_e2e (make test-e2e-stream). Live-validated against a real project: create source -> 3 sinks -> POST OTLP/HTTP logs -> 3 rows landed in in.c-otlp-<name>.logs -> read back via workspace query. Docs synced (CLAUDE.md, context.py AGENT_CONTEXT, keboola-expert.md, SKILL.md, commands-reference.md, gotchas.md, new stream-workflow.md); version 0.50.0 + version-sync. 
* docs(stream): align CLAUDE.md delete flags notation with context.py (--yes|--force) * docs(stream): trim keboola-expert OTLP matrix row to fit agent prompt budget after 0.50.0 merge
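A rough sketch of two ideas from the keboola#364 description above: deriving the stream control-plane host from the Storage URL (connection.<region> -> stream.<region>) and masking the secret carried in the OTLP URL path unless reveal is requested. The function names, the eight-asterisk mask and the assumption that the secret is the last path segment are illustrative choices, not the StreamService implementation.

```python
from urllib.parse import urlsplit

STORAGE_PREFIX = "connection."


def derive_stream_url(storage_url: str) -> str:
    """Map https://connection.<region> to https://stream.<region>."""
    parts = urlsplit(storage_url)
    host = parts.hostname or ""
    if not host.startswith(STORAGE_PREFIX):
        raise ValueError(f"not a Storage host: {host!r}")
    region = host[len(STORAGE_PREFIX):]
    return f"{parts.scheme}://stream.{region}"


def mask_otlp_url(url: str, reveal: bool = False) -> str:
    """Hide the trailing path segment (assumed to be the secret)."""
    if reveal:
        return url
    head, separator, secret = url.rstrip("/").rpartition("/")
    if not separator or not secret:
        return url  # nothing that looks like a trailing secret
    return head + "/" + "*" * 8
```

The OTLP endpoint itself is never derived this way; per the commit message it is taken from source.otlp.url as returned by the API, and only its display is masked.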
…eboola#365) * fix(serve): document `stream` in OpenAPI + add Data Streams web UI The `stream` router was wired via include_router and fully callable, but its tag was missing from OPENAPI_TAGS in server/app.py -- so /docs#/stream rendered as a bare, description-less section outside its logical group. Add the tag entry under the Data group (next to storage), mirroring the `kbagent stream *` CLI. Add a Data Streams page to the web UI (NERD UI) with parity to the CLI and backend: list sources, create an OTLP/HTTP source (sink auto-provisioning, if-not-exists), a detail drawer with a secret reveal toggle for the OTLP endpoint, and a destructive delete. Wire it into routing (App.tsx), PageId state, and the Browse sidebar section. Add a regression test asserting every operation tag has an OPENAPI_TAGS description block, and add the stream path to the router smoke check, so a newly added router can't ship invisible in /docs again. * fix(ui): stringify stream branch ref to match the str-typed API contract The stream control-plane API types its branch as a string, while the UI's global branchId is a numeric Storage branch ID. Numeric IDs are valid refs (test_branch_override in test_stream_service.py drives branch_id="1234"), so this is not a value bug -- it makes the contract explicit instead of relying on JSON/query coercion. Introduce a branchRef() helper used by all four stream calls (list, detail, create, delete). * chore(release): 0.51.0 -- Data Streams web UI + stream OpenAPI tag Minor bump: the release adds a new user-visible surface (the Data Streams web UI page) on top of the `stream` OpenAPI documentation fix. Add the 0.51.0 changelog entry and propagate the version to plugin.json and marketplace.json via scripts/sync_version.py. 
* fix(ui): address review -- surface delete errors, use buckets fallback Follow-up on the kbagent-pr-reviewer pass over the Data Streams page: - show deleteMu errors via ErrorBox -- delete failures were silent to the user, unlike the create mutation in the same file (in-file inconsistency) - render destination.buckets as a fallback when a single bucket isn't set, so the multi-bucket field returned by the backend is actually used - note that the sinks/source raw fields surface only via the Raw JSON tab
…--password-stdin (keboola#366) * feat(dev-portal): admin-role PATCH routing + MFA fixes + interactive --password-stdin Three independent fixes against the dev-portal surface that landed in keboola#354, discovered while integrating ABRA Flexi (a real component registration on production apps-api): 1. **Admin-role PATCH routing**. `complexity`, `categories`, `forwardToken`, `forwardTokenDetails`, `injectEnvironment`, `processTimeout`, `requiredMemory`, `features`, and `category` are `.forbidden()` on the apps-api vendor schema (`PATCH /vendors/{vendor}/apps/{app}`) -- but the server's error message is misleading: it says "must be one of: easy, medium, hard" because the enum-validation `.error()` annotation lives on the shared admin schema before `clientAppSchema()` overrides with `.forbidden()`. Source of truth: keboola/developer-portal:src/lib/ validation.js -> clientAppSchema(). Fix: - `DeveloperPortalIdentity.role_hint` becomes a real validator: only `vendor` (default) or `admin` accepted; case-folded; typos raise. The field is now load-bearing, not a free-text label. - `DeveloperPortalClient.patch_app` reads `self._identity.role_hint` and routes admin identities to `PATCH /admin/apps/{app}` (permissive adminAppSchema); vendor identities stay on the vendor endpoint. - `DeveloperPortalService.prepare_patch` preflights: vendor role + admin-only field => fail-fast `VALIDATION_ERROR` with a message that (a) names every offending field, (b) explains why the 422 is misleading, (c) tells the user the exact command to switch identity (`dev-portal identity add --role-hint admin ...`). Admin role bypasses the preflight entirely. - Reads, create, upload-icon, deprecate keep vendor-endpoint behaviour -- only PATCH has a meaningful admin variant on the server. Admin tokens still work on the vendor path for those (superset perms). 2. **MFA login: explicit `challenge` field + actual error surfaced**. 
User report from a Keboola-org TOTP account: MFA code: 521278 Error: Developer Portal MFA login failed (HTTP 404) Root cause: the apiary spec calls `challenge` optional with default `SOFTWARE_TOKEN_MFA`, but in practice the server 404s when it's omitted. Sending it explicitly fixes it. Single attempt only: an earlier experiment retried with `SMS_MFA` on the same session, but `/auth/login` consumes the session, so the retry always 404'd with "Invalid code or auth state for the user", masking the real first failure (most often a stale 30-second TOTP code from waiting too long to enter it). The error now includes the server response body (truncated to 500 chars) and a hint about TOTP code freshness, so users can tell whether the code was wrong, the session expired, or something else. 3. **`--password-stdin` no longer hangs interactively**. `sys.stdin.read()` waits for EOF, not Enter -- users who pasted a password and pressed Enter sat there until they Ctrl-C'd out. New `_read_password_stdin()` helper branches on `sys.stdin.isatty()`: TTY uses `getpass.getpass()` (hidden, line-based, Enter to confirm); pipe still does `read() -> strip()`. Both `identity add --password-stdin` and `identity edit --password-stdin` route through it. Help text updated to spell out the dual-mode behaviour. Tests (10 new): - TestReadPasswordStdin: TTY -> getpass, pipe -> read. - TestLoginMfaPath::test_mfa_prompt_completes_login: now matches body including `challenge: SOFTWARE_TOKEN_MFA`. - TestLoginMfaPath::test_mfa_failure_surfaces_server_body: real body bubbles up plus stale-TOTP hint. - TestPortalWrites::test_patch_app_vendor_role_hits_vendor_endpoint + test_patch_app_admin_role_hits_admin_endpoint: confirm dispatch. - TestDeveloperPortalIdentity::test_role_hint_accepts_admin + test_role_hint_normalises_case + test_role_hint_rejects_typo. - TestReadsAndPrepareApply::test_prepare_patch_vendor_role_rejects_admin_only_fields + test_prepare_patch_admin_role_allows_admin_only_fields. 
All 95 dev-portal tests pass; `make check` green (3827 / 8 skipped). * test(dev-portal): use DP_MFA_CHALLENGE_TYPE constant in client tests Replace the two hardcoded "SOFTWARE_TOKEN_MFA" match_json literals with the DP_MFA_CHALLENGE_TYPE constant from constants.py, following through on the NIT-1 constant extraction so the tests can't silently diverge from the client. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
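The dual-mode `--password-stdin` behaviour from keboola#366 above (TTY -> `getpass.getpass()`, pipe -> `read()` then `strip()`) can be sketched as below. The real helper is `_read_password_stdin()`; the `stream` and `prompt` parameters here are assumptions added so the sketch can be exercised without a terminal.

```python
import getpass
import sys
from typing import Optional, TextIO


def read_password_stdin(stream: Optional[TextIO] = None,
                        prompt: str = "Password: ") -> str:
    """Read a password interactively on a TTY, or whole-input from a pipe."""
    source = sys.stdin if stream is None else stream
    if source.isatty():
        # Interactive: hidden, line-based input; Enter confirms.
        return getpass.getpass(prompt)
    # Piped: consume until EOF and drop the trailing newline.
    return source.read().strip()
```

Usage with a pipe-like stream: `read_password_stdin(io.StringIO("s3cret\n"))` returns the password without its trailing newline, while a real terminal falls through to the hidden prompt instead of blocking until EOF.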
…oola#367) Bumps [vitest](https://github.com/vitest-dev/vitest/tree/HEAD/packages/vitest) from 2.1.9 to 4.1.0. - [Release notes](https://github.com/vitest-dev/vitest/releases) - [Changelog](https://github.com/vitest-dev/vitest/blob/main/docs/releases.md) - [Commits](https://github.com/vitest-dev/vitest/commits/v4.1.0/packages/vitest) --- updated-dependencies: - dependency-name: vitest dependency-version: 4.1.0 dependency-type: direct:development ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Transitive dependency bump (uv.lock only); kbagent has no direct urllib3 import. Pulled in dev-only via pip-audit -> requests. Addresses CVE-2026-44431 (header leak on cross-origin redirect) and CVE-2026-44432 (decompression DoS).
Transitive dependency bump (uv.lock only); kbagent has no direct idna import. In the runtime httpx/anyio path. No new CVEs in 3.11->3.15; routine maintenance + Unicode data update. Verified: full pytest suite green with urllib3+idna bumps combined.
…cs (0.52.0) (keboola#368) * feat(storage): `clone-table` -- pull a prod table into a dev branch (0.52.0) Adds `kbagent storage clone-table --project P --table-id ID --branch ID [--dry-run]`, wrapping the Storage API `POST /v2/storage/branch/{branch}/tables/{id}/pull` endpoint (operationName `devBranchTablePull`). Why: on `storage-branches` projects a dev branch reads production tables transparently (copy-on-write) until the first write, so an in-branch schema mutation -- `swap-tables`, dropping a column -- fails with a misleading "bucket not found" until the table is materialized branch-local. `clone-table` performs that materialization. It is the blocking prerequisite for the typify-via-branch workflow on storage-branches projects. Implementation mirrors `swap-tables` across all layers: - KeboolaClient.pull_table (async storage job, polled to completion) - StorageService.clone_table (branch mandatory; exit 5 / ConfigError before any HTTP when no branch is set) - commands/storage.py clone-table (permission class `write`; --dry-run; no --yes since it never deletes) - permissions, hint, serve REST route, AGENT_CONTEXT Live-validated against project 10539 (storage-branches ON): clone a prod table into a dev branch -> table materialized -> in-branch swap-tables then succeeds (it previously failed with "bucket not found") -> the production table is left untouched. Tests: tests/test_storage_clone.py (13: client/service/CLI) + tests/test_e2e.py::TestE2EStorageCloneTable (3). Docs synced per convention keboola#17 (auto-generated SKILL.md, commands-reference, gotchas, context, CLAUDE.md, storage-types + typify workflow). Deliberately not added to keboola-expert.md (already at its hard token budget); covered in the other surfaces. Addresses the clone-prod-table-into-branch request in keboola#362. 
* docs(typify): rewrite for dev-branch-rehearsal + prod-swap; fix false "rejects on production" (keboola#362) Dev-branch merge propagates only configurations, NOT storage table schema (confirmed by the storage-branches design + Keboola public docs, and reproduced live). Two things were documented wrong: 1. typify-table-workflow.md claimed merge promotes the swapped/typed schema to production. It does not. Reworked into a two-stage model: rehearse in a dev branch (profile, build, swap, validate downstream), then repeat the real build + swap in the production (default) branch. Removed the bogus Phase 8 "merge promotes to prod"; added the prod execution with its inconsistency-window + rollback cautions. 2. swap-tables docstrings / command help / hint / context / gotchas / storage-types-workflow all claimed "the Storage API rejects this on production". It does not -- a default-branch swap is verified to work (project 10539) and is the supported way to retype a prod table. Corrected the wording across all surfaces. No code-behavior change: branch_id is still mandatory and the swap is still branch-scoped -- only the documentation/docstrings were wrong. Added a 0.52.0 changelog entry for the correction (the historical 0.28.0 entry is left as-is). Completes the A+B half of keboola#362. * docs+test: address PR keboola#368 review (NB-1 swap semantics, NB-2 test name) NB-1: two keboola-expert.md matrix rows still described `swap-tables` as dev-branch-only -- corrected to "any branch (incl. prod)", consistent with the A+B semantics fix elsewhere in this PR. Net +4 bytes; the prompt stays under its 62000-byte budget (the clone-table version-gate line is still omitted for budget, as noted in the review). NB-2: renamed test_url_encoding_for_special_characters -> test_dotted_table_id_passed_verbatim_in_path (clone + swap twin). Dots and dashes are RFC 3986 unreserved, so quote(..., safe="") does not percent-encode them; the test verifies verbatim path pass-through, not encoding.
Docstring corrected to say so. * fix: address Devin review on PR keboola#368 (remove deprecated --hint, add VERSION GATE) 1. clone-table wrongly added --hint support. CONTRIBUTING.md (since v0.45.0) forbids new hints/definitions entries and should_hint(ctx) short-circuits for new commands -- swap-tables (0.28.0, pre-deprecation) was mirrored too literally. Removed the should_hint/emit_hint block from storage_clone_table and the HintRegistry.register entry for storage.clone-table, matching stream (0.50.0) / feature (0.48.0) which carry no hint support. 2. Added the VERSION GATE entry `storage clone-table = 0.52.0+` to keboola-expert.md (CONTRIBUTING.md mandates it for new min-version commands). Freed budget by tightening the keboola#245 line so the prompt stays under its 62000-byte cap. * docs: add (since v0.52.0) tag to new gotchas merge section (PR keboola#368 review B-1) The "Dev-branch merge carries only configurations" gotcha used a bare (verified 2026-06-01) stamp. CONTRIBUTING.md (convention keboola#17) requires the (since vX.Y.Z) tag on every gotcha so AI agents don't recommend behavior documentation that predates the install. Now reads "(since v0.52.0, verified 2026-06-01)". * docs(expert): add clone-table gotcha + trim stale content (PR keboola#368 NB-1/NB-2/NIT-1) Delta-review follow-up on keboola-expert.md (deferred from this PR for token budget): - NB-1: added a §3 inline gotcha for the storage-branches copy-on-write trap -- an in-branch swap-tables / column-drop fails "bucket not found" until `clone-table` materializes the prod table branch-local. - NIT-1: removed the dangling (§14.3) cross-reference (no such section) and the deprecated --hint alternative in the Retype matrix row. - NB-2: trimmed the verbose semantic-layer "short form" (full prose already lives in gotchas.md) and tightened the auto-materialize entry. Headroom against the 62000-byte budget went from 7 to 609 bytes.
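The NB-2 rename in keboola#368 rests on a property of `urllib.parse.quote` that is easy to check directly: dots and dashes are RFC 3986 unreserved characters, so `safe=""` leaves them alone and only reserved characters such as `/` get percent-encoded. The table IDs below are made-up examples, not IDs from any project.

```python
from urllib.parse import quote

# Hypothetical table ID in the usual <stage>.c-<bucket>.<table> shape.
table_id = "in.c-main.my-table"

# Unreserved characters pass through verbatim even with safe="".
encoded = quote(table_id, safe="")

# A reserved character, by contrast, is percent-encoded.
encoded_slash = quote("in.c-main/orders", safe="")
```

So a test that feeds a dotted ID through `quote(..., safe="")` can only observe pass-through, which is why the new test name talks about a verbatim path rather than encoding.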
Dev-tooling dependency via pip-audit chain; no runtime impact on the CLI.
Runtime dependency in the server extra. Version 0.0.27 adds multipart header limits (DoS hardening); this bump raises the floor to >=0.0.27.
…boola#373) * fix(storage): correct stale "dev branch only" swap-tables wording Follow-up to keboola#368. The swap-tables semantics correction (0.52.0) left four co-located surfaces still claiming the swap is dev-branch-only / rejected on production -- now false after that fix: - services/storage_service.py: the ConfigError raised on a missing --branch still said "The Storage API rejects this on production" (user-facing at exit code 5, directly contradicting the corrected docstring) - references/commands-reference.md, commands/storage.py docstring (-> SKILL.md via make skill-gen), commands/context.py (AGENT_CONTEXT): "in a dev branch" / "in a development branch" A swap on the default/production branch is supported -- it is how a typed rebuild is applied to prod. branch_id stays mandatory; this is wording only, no behavior change. The dependent test match ("dev branch" -> "requires a branch") and a misleading test docstring were updated accordingly. Found by the kbagent-pr-reviewer third pass on keboola#368 (NB-1/2/3 + NIT-1). * fix(storage): address keboola#373 review -- remaining stale "dev branch" swap text Two surfaces the first pass of this PR missed, flagged by kbagent-pr-reviewer: - B-1 (blocking): CLI test test_swap_missing_branch_fails_clearly mocked the OLD "swap-tables requires a dev branch" string as its side-effect and asserted on it. The mock short-circuits the real service, so the test was validating phantom text that no longer matches service output. Updated the mock + assertion to "requires a branch". - NB-1: the swap_tables Args docstring still read "branch_id: Dev branch ID"; corrected to "any branch accepted, including the default/production branch". clone-table wording left intact (clone legitimately targets a dev branch; its service message and tests correctly keep "requires a dev branch").
) Patch release over 0.52.0. Completes the swap-tables wording correction (keboola#373) and bundles the pip 26.1 / python-multipart 0.0.27 dependency bumps (keboola#371/keboola#372). No behaviour change.
…rruption (0.53.0) (keboola#376)

* fix(sync): conflict-aware `pull --force`, no more silent baseline corruption (0.53.0)

`sync pull --force` could silently strand un-pushed local edits. When a config had local edits and its remote was unchanged, `--force` bypassed the locally-modified guard, hit the `remote_unchanged` short-circuit, and re-stamped the manifest `pull_hash` from the *edited* on-disk file. Afterwards `sync diff` and `sync push` reported "in sync" and shipped nothing while the remote still held the old config -- data loss with no signal. The silent part was compounded by the diff `local_override_hashes` optimization, which then never even read the edited file's content.

Fix splits `--force` by 3-way diff state (config and row granularity):

- local edited, remote UNCHANGED -> preserve file + 3-way base (pending delta stays visible to `sync push`; no discard, no silent re-stamp)
- local edited, remote ALSO changed -> abort before writing anything (SyncConflictError, exit 1, code SYNC_CONFLICT) so the user resolves it
- local untouched, remote changed -> take remote (unchanged behavior)

Consequence: `--force` no longer discards non-conflicting local edits. To drop local edits on purpose, delete the file/dir and pull.

New: errors.SyncConflictError + ErrorCode.SYNC_CONFLICT; read-only conflict pre-pass SyncService._detect_force_pull_conflicts / _is_conflict (runs before any write); commands/sync.py prints a red per-config conflict block (human) or a SYNC_CONFLICT envelope with details.conflicts (--json). --all-projects surfaces a per-project conflict as that project's error without aborting the batch.

Tests: tests/test_sync_force_pull_baseline.py (config + row; preserve case b, abort case a, remote-only-changed takes remote) and tests/test_sync_cli.py (exit 1 + human/JSON envelope). Live-verified read-only against project 1183: force-pull preserves a locally-edited transformation ("Skipped (1) locally modified") and `sync diff` still reports modified:1.

Docs: changelog 0.53.0, version bump + make version-sync, CLAUDE.md sync group, context.py AGENT_CONTEXT, sync-workflow.md, gotchas.md (since v0.53.0), commands-reference.md, keboola-expert.md.

* fix(sync): structured conflict in `pull --all-projects` + E2E force-pull coverage

Addresses kbagent-pr-reviewer findings on keboola#376.

NB-1: `pull_all()` flattened `SyncConflictError` to `str(exc)`, dropping the `SYNC_CONFLICT` code and the conflicts list, so a `--all-projects --json` consumer (AI agent / script) could not tell a merge conflict from any other error. `pull_all._worker` now catches `SyncConflictError` and stores `{error, error_code, conflicts}`, mirroring the single-project envelope. Unit test in `tests/test_sync_force_pull_baseline.py`.

NB-2: adds `tests/test_e2e.py::TestE2ESyncWorkflow::test_sync_force_pull_conflict_aware` -- end-to-end against real Storage: create a config, edit it locally, force-pull with the remote unchanged (assert the edit is preserved and `sync diff` still reports it modified), then mutate the remote and force-pull again (assert exit 1 and `SYNC_CONFLICT`, with the conflict listed). Cleans up the config.

* docs(changelog): note --all-projects structured conflict + E2E coverage (0.53.0)

* refactor(sync): tighten conflict types + guard _format_conflict_list (NIT-1/NIT-2)
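
The three-way split that the 0.53.0 commit describes for `pull --force` can be sketched as a small decision function. This is only an illustration under stated assumptions: the `ForcePullAction` enum, the `decide_force_pull` helper, and the idea of comparing three content hashes are hypothetical names for this sketch, not the actual `SyncService._is_conflict` code.

```python
from enum import Enum


class ForcePullAction(Enum):
    """Possible outcomes for one config or row (hypothetical names)."""

    KEEP_LOCAL = "keep_local"    # local edited, remote unchanged
    TAKE_REMOTE = "take_remote"  # local untouched, remote changed
    CONFLICT = "conflict"        # both sides changed: abort before writing
    NOOP = "noop"                # nothing changed on either side


def decide_force_pull(base_hash: str, local_hash: str, remote_hash: str) -> ForcePullAction:
    """Classify one entry by comparing local and remote against the 3-way base."""
    local_edited = local_hash != base_hash
    remote_changed = remote_hash != base_hash
    if local_edited and remote_changed:
        return ForcePullAction.CONFLICT
    if local_edited:
        # Preserve the file and the base so the pending delta stays visible.
        return ForcePullAction.KEEP_LOCAL
    if remote_changed:
        return ForcePullAction.TAKE_REMOTE
    return ForcePullAction.NOOP


# Assumed example hashes: "a" is the base recorded at the last pull.
keep = decide_force_pull("a", "b", "a")
conflict = decide_force_pull("a", "b", "c")
take = decide_force_pull("a", "a", "c")
```

Running the classification as a read-only pass over every entry before any write is what lets a single conflict abort the pull without leaving a half-updated working tree.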
…s, etc.)
Teach kbagent the new `semantic-reference-data` metastore type — a
per-dimension member store (one record per dimension, members in a
`members[]` array). The driving use case: hold a Chart of Accounts (the
account list + all attributes) in the semantic layer instead of a hardcoded
Storage table.
New self-contained sub-app `kbagent semantic-layer reference-data`:
- `list` — dimension summaries (id, dimension, member_count) [read]
- `get` — one record + all members, by --id or --model+--dimension [read]
- `set` — create-or-replace by (model, dimension) from a JSON members
file; idempotent (existing record -> PUT/revision++, else POST) [write]
- `delete` — remove a record by UUID [destructive]
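
As a rough illustration of the record shape these four commands operate on, the sketch below models one reference-data record as a plain dict and derives the `list` summary from it. Only `members[]`, `id`, `dimension`, and `member_count` come from the description above; the `model` key, the member attribute names (`code`, `name`, `type`), and the `summarize` helper are hypothetical.

```python
from typing import Any

# Hypothetical record for one dimension. Only `members` is named in the
# description above; the attribute names inside each member are made up.
coa_record: dict[str, Any] = {
    "id": "00000000-0000-0000-0000-000000000001",  # placeholder UUID
    "model": "finance",  # assumed key for the owning model
    "dimension": "account",
    "members": [
        {"code": "1000", "name": "Cash", "type": "asset"},
        {"code": "2000", "name": "Accounts Payable", "type": "liability"},
        {"code": "4000", "name": "Revenue", "type": "income"},
    ],
}


def summarize(record: dict[str, Any]) -> dict[str, Any]:
    """Reduce a full record to the summary fields that `list` reports."""
    return {
        "id": record["id"],
        "dimension": record["dimension"],
        "member_count": len(record.get("members", [])),
    }


summary = summarize(coa_record)
```

In this sketch `get` would return the full `coa_record`, while `list` would return only `summary` for each dimension.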
Implementation:
- metastore_client: register `semantic-reference-data` in SemanticType /
SEMANTIC_TYPES; add a `put_item` verb so `set` uses the metastore's real
revisioned PUT (preserves history) instead of DELETE+POST.
- SemanticLayerService: list/get/set/delete via the generic client verbs.
- permissions + hints registered for all four leaves; SKILL.md regenerated.
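
The create-or-replace behaviour of `set` can be sketched against an in-memory stand-in for the metastore. This is a minimal sketch, not the service code: `FakeMetastore`, its method signatures, the `model` / `dimension` / `revision` field names, and `set_reference_data` as a free function are all assumptions made for illustration. Only the verb names and the "existing record -> PUT, else POST" rule come from the description above.

```python
from typing import Any


class FakeMetastore:
    """In-memory stand-in for the metastore client (hypothetical signatures)."""

    def __init__(self) -> None:
        self.items: dict[str, dict[str, Any]] = {}
        self._next_id = 1

    def list_items(self) -> list[dict[str, Any]]:
        return list(self.items.values())

    def post_item(self, body: dict[str, Any]) -> dict[str, Any]:
        # Create a new record at revision 1.
        item = {**body, "id": f"ref-{self._next_id}", "revision": 1}
        self._next_id += 1
        self.items[item["id"]] = item
        return item

    def put_item(self, item_id: str, body: dict[str, Any]) -> dict[str, Any]:
        # Replace in place and bump the revision, keeping the same id.
        revision = self.items[item_id]["revision"] + 1
        item = {**body, "id": item_id, "revision": revision}
        self.items[item_id] = item
        return item


def set_reference_data(
    client: FakeMetastore, model: str, dimension: str, members: list[dict[str, Any]]
) -> dict[str, Any]:
    """Create-or-replace keyed on (model, dimension)."""
    body = {"model": model, "dimension": dimension, "members": members}
    existing = next(
        (i for i in client.list_items()
         if i["model"] == model and i["dimension"] == dimension),
        None,
    )
    if existing is not None:
        return client.put_item(existing["id"], body)
    return client.post_item(body)


client = FakeMetastore()
first = set_reference_data(client, "finance", "account", [{"code": "1000"}])
second = set_reference_data(client, "finance", "account", [{"code": "1000"}, {"code": "2000"}])
```

Calling `set` twice for the same key leaves one record whose revision has advanced, which is the property a DELETE+POST pair would lose because the replacement would start over as a new record.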
Deliberately self-contained: reference-data is NOT AI-generated, so it is
kept out of `build` / `export` / `diff` / cascade / PUSH_ORDER — zero blast
radius on the existing model flows.
Tests: service (CRUD, create-vs-replace, validation, NOT_FOUND) + CLI
(list/get/set/delete, bad-JSON, --yes gate) + permission-registry asserts.
Deferred (flagged for follow-up): REST `serve` router parity, the
hand-written plugin-doc surfaces (commands-reference / gotchas /
keboola-expert / context.py / CLAUDE.md), and an E2E hop.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Re-targeting to the upstream repo keboola/cli (this fork base was wrong). Superseded by the keboola/cli PR.
494cc07 to 644a8cc
Summary
Teaches kbagent the new `semantic-reference-data` metastore type: a per-dimension member store (one record per dimension; members in a `members[]` array). Driving use case: hold a Chart of Accounts (the account list + all attributes) in the semantic layer instead of a hardcoded Storage table.

New self-contained sub-app, `kbagent semantic-layer reference-data` (alias `kbagent sl reference-data`):
- `list`: dimension summaries (id, dimension, member_count) [read]
- `get`: one record + all members, by `--id` or `--model` + `--dimension` [read]
- `set`: create-or-replace by `(model, dimension)` from a JSON members file (`--members-file`, `-` = stdin) [write]
- `delete`: remove a record by UUID (`--yes` gated) [destructive]

Why it's small / low-risk
The 6 existing types are hardcoded in ~8 places, but reference-data is not a model child that `build` generates: its members come from `DIM_COA`, not the AI. So it is deliberately kept out of `PUSH_ORDER` / `build` / `export` / `diff` / cascade. The new sub-app composes the generic metastore verbs (`list_items` / `get_item` / `post_item` / `put_item` / `delete_item`), so there is zero blast radius on existing model flows.

Implementation
- `metastore_client.py`: register `semantic-reference-data` in `SemanticType` / `SEMANTIC_TYPES`; add a `put_item` verb so `set` uses the metastore's real revisioned `PUT` (preserves history) rather than the DELETE+POST the `edit` ops use.
- `SemanticLayerService`: `list/get/set/delete_reference_data` via the generic verbs. `set` is idempotent on `(modelUUID, dimensionName)`: existing → `PUT` (revision++), else `POST`.
- `permissions.py`: registry entries for all four leaves (list/get read, set write, delete destructive).
- `--hint client` / `--hint service` definitions for all four leaves.
- `SKILL.md`: regenerated (`make skill-gen`).

Tests
- Service: CRUD, create-vs-replace, validation, `NOT_FOUND`, id-vs-dimension resolution, permission-registry asserts.
- CLI: list/get/set/delete, bad-JSON, `--yes` gate.
- Full suite green (3585 passed); `ruff check` / `ruff format` / `ty` clean; SKILL.md freshness + plugin.json version-sync pass.

Deferred (intentionally, flagged for a follow-up)
Per agreed scope this is core + mandatory companions only. Not in this PR:
- REST `serve` router parity (`server/routers/semantic_layer.py` 1:1 CLI→HTTP).
- Hand-written plugin-doc surfaces: `commands-reference.md`, `gotchas.md`, `keboola-expert.md`, `context.py` AGENT_CONTEXT, `CLAUDE.md` command list.
- An E2E hop in `tests/test_e2e.py`.

Opened as draft pending those + review.
🤖 Generated with Claude Code