Skip to content

feat(storage): clone-table + correct typify/swap dev-branch semantics (0.52.0)#368

Merged
padak merged 6 commits into
mainfrom
claude/zealous-torvalds-cc06ab
Jun 1, 2026
Merged

feat(storage): clone-table + correct typify/swap dev-branch semantics (0.52.0)#368
padak merged 6 commits into
mainfrom
claude/zealous-torvalds-cc06ab

Conversation

@padak
Copy link
Copy Markdown
Member

@padak padak commented Jun 1, 2026

Resolves both halves of #362.

1. clone-table -- pull a prod table into a dev branch (the feature ask, "bod C")

Adds kbagent storage clone-table --project P --table-id ID --branch ID [--dry-run], wrapping POST /v2/storage/branch/{branch}/tables/{id}/pull (operationName devBranchTablePull).

Why: on storage-branches projects a dev branch reads production tables transparently (copy-on-write) until the first write, so an in-branch schema mutation (swap-tables, dropping a column) fails with a misleading "bucket not found" until the table is materialized branch-local. clone-table does that. It is the blocking prerequisite for the typify-via-branch rehearsal.

Mirrors swap-tables across all layers: client.pull_table (async job, polled), StorageService.clone_table (branch mandatory; exit 5 before any HTTP when unset), storage clone-table command (write; --dry-run; no --yes), plus permissions, hint, serve REST route (POST /storage/tables/{project}/{table_id}/pull), and AGENT_CONTEXT.

2. Correct dev-branch merge / swap semantics (the conceptual question, "bod A+B")

The investigation behind #362 established -- confirmed by the storage-branches designer, Keboola public docs, and live testing on project 10539 -- that dev-branch merge propagates only configurations, NOT storage table schema. Two things were documented wrong:

  • typify-table-workflow.md claimed a dev-branch merge promotes the swapped/typed schema to production. It does not. Reworked into a two-stage model: rehearse in a dev branch (profile, build the typed sibling, swap, validate downstream), then discard the branch and repeat the real build + swap in the production (default) branch. The bogus "merge promotes to prod" Phase 8 is replaced with prod execution + its inconsistency-window and rollback cautions.
  • swap-tables docstrings (client/service), command help, hint, context, gotchas.md, and storage-types-workflow.md all claimed "the Storage API rejects this on production". It does not -- a default-branch swap is verified to work and is the supported way to retype a prod table. Corrected everywhere. No code-behavior change: branch_id is still mandatory and the swap is still branch-scoped; only the docs/docstrings were wrong.

Testing

  • Unit/CLI: tests/test_storage_clone.py (13). E2E: tests/test_e2e.py::TestE2EStorageCloneTable (3).
  • Live-validated against project 10539 (storage-branches ON): clone a prod table into a dev branch -> materialized -> in-branch swap-tables then succeeds (previously failed "bucket not found") -> production untouched. Also verified a swap directly on the default branch succeeds (the basis for fix Phase 2: Project management (client, config store, service, CLI) #2).
  • make check green (3831 passed).

Docs (convention #17)

changelog + version bump to 0.52.0 (one entry for clone-table, one for the typify/swap correction); auto-generated SKILL.md; commands-reference.md; gotchas.md (new clone-table + branch-merge-semantics sections, swap section corrected); context.py; CLAUDE.md; storage-types-workflow.md + typify-table-workflow.md.

Note: clone-table was deliberately not added to keboola-expert.md -- that agent prompt is already at its hard 62000-byte token budget (test_agent_prompt). It is covered in the other seven surfaces the agent loads; happy to trim the prompt and add it in a follow-up if preferred.

Closes #362.

Copy link
Copy Markdown

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 2 potential issues.

🐛 1 issue in files not directly in the diff

🐛 Missing VERSION GATE entry in keboola-expert.md for clone-table = 0.52.0+ (plugins/kbagent/agents/keboola-expert.md:117)

CONTRIBUTING.md mandates updating plugins/kbagent/agents/keboola-expert.md §1 Rule 6 VERSION GATE "when adding a command that introduces or relaxes a minimum-version requirement." The new storage clone-table command requires kbagent 0.52.0+, but no entry was added to the VERSION GATE list. The list currently ends at 0.50.0+ (plugins/kbagent/agents/keboola-expert.md:116-117). Without this entry, the keboola-expert subagent may recommend clone-table to users on older kbagent versions where the command doesn't exist. This file is described in CONTRIBUTING.md as the "highest silent-drift risk in the repo."

View 5 additional findings in Devin Review.

Open in Devin Review

Comment thread src/keboola_agent_cli/commands/storage.py Outdated
@padak padak changed the title feat(storage): clone-table -- pull a prod table into a dev branch (0.52.0) feat(storage): clone-table + correct typify/swap dev-branch semantics (0.52.0) Jun 1, 2026
padak added 2 commits June 1, 2026 21:44
…0.52.0)

Adds `kbagent storage clone-table --project P --table-id ID --branch ID
[--dry-run]`, wrapping the Storage API
`POST /v2/storage/branch/{branch}/tables/{id}/pull` endpoint
(operationName `devBranchTablePull`).

Why: on `storage-branches` projects a dev branch reads production tables
transparently (copy-on-write) until the first write, so an in-branch
schema mutation -- `swap-tables`, dropping a column -- fails with a
misleading "bucket not found" until the table is materialized
branch-local. `clone-table` performs that materialization. It is the
blocking prerequisite for the typify-via-branch workflow on
storage-branches projects.

Implementation mirrors `swap-tables` across all layers:
- KeboolaClient.pull_table (async storage job, polled to completion)
- StorageService.clone_table (branch mandatory; exit 5 / ConfigError
  before any HTTP when no branch is set)
- commands/storage.py clone-table (permission class `write`; --dry-run;
  no --yes since it never deletes)
- permissions, hint, serve REST route, AGENT_CONTEXT

Live-validated against project 10539 (storage-branches ON): clone a prod
table into a dev branch -> table materialized -> in-branch swap-tables
then succeeds (it previously failed with "bucket not found") -> the
production table is left untouched.

Tests: tests/test_storage_clone.py (13: client/service/CLI) +
tests/test_e2e.py::TestE2EStorageCloneTable (3). Docs synced per
convention #17 (auto-generated SKILL.md, commands-reference, gotchas,
context, CLAUDE.md, storage-types + typify workflow). Deliberately not
added to keboola-expert.md (already at its hard token budget); covered
in the other surfaces.

Addresses the clone-prod-table-into-branch request in #362.
… "rejects on production" (#362)

Dev-branch merge propagates only configurations, NOT storage table
schema (confirmed by the storage-branches design + Keboola public docs,
and reproduced live). Two things were documented wrong:

1. typify-table-workflow.md claimed merge promotes the swapped/typed
   schema to production. It does not. Reworked into a two-stage model:
   rehearse in a dev branch (profile, build, swap, validate downstream),
   then repeat the real build + swap in the production (default) branch.
   Removed the bogus Phase 8 "merge promotes to prod"; added the prod
   execution with its inconsistency-window + rollback cautions.

2. swap-tables docstrings / command help / hint / context / gotchas /
   storage-types-workflow all claimed "the Storage API rejects this on
   production". It does not -- a default-branch swap is verified to work
   (project 10539) and is the supported way to retype a prod table.
   Corrected the wording across all surfaces.

No code-behavior change: branch_id is still mandatory and the swap is
still branch-scoped -- only the documentation/docstrings were wrong.
Added a 0.52.0 changelog entry for the correction (the historical 0.28.0
entry is left as-is). Completes the A+B half of #362.
@padak padak force-pushed the claude/zealous-torvalds-cc06ab branch from c48dc3f to 94c01b5 Compare June 1, 2026 19:46
Copy link
Copy Markdown
Member Author

@padak padak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review of #368 — feat(storage): clone-table + correct typify/swap dev-branch semantics (0.52.0)

Generated by kbagent-pr-reviewer subagent. Verdict and findings below
are advisory; the human author retains every veto. CI-coverable issues
(lint, format, tests) are confirmed via make check, not duplicated here.

Summary

This PR ships two cohesive changes: (1) a new kbagent storage clone-table command that materializes a production table into a dev branch via POST /v2/storage/branch/{branch}/tables/{id}/pull, and (2) a documentation correctness fix across six files that corrects two long-standing wrong claims -- that swap-tables is dev-branch only, and that a dev-branch merge carries storage schema to production. Both changes are live-verified against project 10539. The implementation is thorough: all three layers are touched, permissions are registered, serve router exposes the endpoint, hint definition is included, and seven documentation surfaces are updated. make check passes with 3844 tests. The code is clean with no layer violations, no magic numbers, and no security issues.

One NON-BLOCKING finding: keboola-expert.md §1 Rule 6 VERSION GATE and the "Retype table columns" tool-selection-matrix row both describe the pre-PR (now-wrong) semantics and omit the clone-table version gate. The PR description acknowledges this and explains the token-budget constraint (17 bytes remaining at 61983 / 62000-byte budget). The finding is flagged so the follow-up trimming is not forgotten.

Verdict: APPROVE

Verdict

  • Verdict: APPROVE
  • Blocking findings: 0
  • Non-blocking findings: 2
  • Nits: 1

Blocking findings

(none)

Non-blocking findings

[NB-1] plugins/kbagent/agents/keboola-expert.md:118,150,152 — VERSION GATE and tool-selection-matrix rows describe pre-PR (now-wrong) swap semantics; clone-table version gate missing

The PR description explicitly acknowledges this: the file is at 61983 / 62000 bytes (17 bytes headroom), so adding storage clone-table = 0.52.0+, (30 bytes) and correcting the matrix rows would exceed the hard prompt budget enforced by tests/test_agent_prompt.py::test_agent_prompt_under_token_budget.

Three specific staleness issues survive after merge:

  1. §1 Rule 6 VERSION GATE (line 118): the list ends at stream = 0.50.0+ without a storage clone-table = 0.52.0+ entry. An AI agent on 0.51.x that tries the command will get "No such command" and the version gate won't catch it.
  2. §2 "Retype table columns" row (line 150): still says "flip the typed copy into the original name in a dev branch". This is now wrong -- the production retype is the default-branch swap documented in the PR. An AI agent following this row will rehearse-in-dev and then wonder why the typed schema never reaches production.
  3. §2 "Promote typed rebuild" row (line 152): says "Service refuses without a branch" without clarifying that ANY branch works, including default/production.

Fix: trim ~50 bytes of stale content from the prompt (e.g. the §14.3 cross-reference to a section that no longer exists in the document) to make room, then add the version gate entry and correct the two matrix rows. This is the right follow-up issue rather than a merge blocker -- gotchas.md, context.py, commands-reference.md, CLAUDE.md, and the two workflow files are all correctly updated in this PR.

[NB-2] tests/test_storage_clone.py:1166-1180test_url_encoding_for_special_characters tests the wrong property

The test doc-string says "Table IDs with dots/dashes are URL-encoded in the path." However, urllib.parse.quote(safe='') does NOT percent-encode RFC 3986 unreserved characters (letters, digits, -, ., _, ~), so in.c-bucket-with-dashes.tbl is transmitted verbatim and the mock URL expectation passes trivially. The test name implies it verifies encoding but it verifies the absence of encoding for this character class -- which is correct behavior, but the test assertion says nothing about a table ID that WOULD be encoded (e.g. one containing a space or slash). The comment "URL-encoded in the path" misleads future readers.

Fix: either rename to test_table_id_with_dashes_and_dots_passes_through_verbatim and update the docstring, or add a second case with "in.c-foo.table WITH spaces" asserting the URL contains %20. Low-risk -- the underlying pull_table implementation using quote(table_id, safe="") is correct.

Nits

  • [NIT-1] plugins/kbagent/agents/keboola-expert.md:150 -- The (§14.3) cross-reference in the "Retype table columns" matrix row (if the future storage retype composite (§14.3) is not yet present) points to a section that does not exist in the current document. The file has sections ## 1 through ## 10 only; there is no §14. The reference is a holdover from an older draft. Removing it would also free 9 bytes toward the budget fix in NB-1.

Verification log

  • gh pr view 368 --json title,body,files,additions,deletions,baseRefName,headRefName,labels,state → state=OPEN, 20 files, +1046/-99, conventional feat(storage): prefix ✓
  • git rev-parse --abbrev-ref HEAD in worktree → claude/zealous-torvalds-cc06ab matches PR branch ✓
  • CONTRIBUTING.md Plugin synchronization map walked: AGENT_CONTEXT ✓, CLAUDE.md ✓, permissions.py ✓, commands-reference.md ✓, gotchas.md ✓, serve router ✓, hint definition ✓; keboola-expert.md deliberately skipped by author (token budget, documented in PR description) → NB-1
  • Layer-violation grep (typer in services, httpx in commands, formatter in clients) → empty ✓
  • Convention grep (magic numbers, bare except, print() in src, raw error_code in production code) → clean ✓; raw error_code="NOT_FOUND" at tests/test_storage_clone.py:222 is a test fixture, not production code; make check-error-codes scans src/ only and passes ✓
  • make check3844 passed, 8 skipped
  • make typecheck → 3 diagnostics in tests/test_services.py:70 (pre-existing on main, not introduced by this PR; git diff main...HEAD -- tests/test_services.py shows no changes to that file) ✓
  • tests/test_agent_prompt.py::TestPilotAgentFile::test_agent_prompt_under_token_budget → PASSED; file is 61983 bytes vs 62000-byte budget ✓
  • File-size budgets: storage_service.py 2419 LOC, commands/storage.py 2741 LOC, client.py 3169 LOC -- all exceed hard ceilings per CONTRIBUTING.md table (1500 / 1200 / 2000); these pre-exist on main and are explicitly listed as "require splitting before adding more"; this PR adds 63 / 99 / 30 new lines respectively to already-over files. Not a new regression introduced here, noted for context.
  • URL encoding verification: quote('in.c-foo.data', safe='')in.c-foo.data (dots and dashes are RFC 3986 unreserved, not percent-encoded); pull_table behavior is correct; test name misleading → NB-2
  • keboola-expert.md §14.3 cross-reference: searched for ## 14, §14 -- no such section exists in document (sections run 1-10 only) → NIT-1
  • Behavior claim (live verification): PR description says clone was validated against project 10539 with storage-branches ON; dry-run, no-branch-rejection, and full-pull tests are all covered in tests/test_e2e.py::TestE2EStorageCloneTable; could not independently run E2E (requires E2E_API_TOKEN env var -- not injected in this review run)

Open questions for the author

(none)

padak added 2 commits June 1, 2026 22:06
NB-1: two keboola-expert.md matrix rows still described `swap-tables` as
dev-branch-only -- corrected to "any branch (incl. prod)", consistent
with the A+B semantics fix elsewhere in this PR. Net +4 bytes; the prompt
stays under its 62000-byte budget (the clone-table version-gate line is
still omitted for budget, as noted in the review).

NB-2: renamed test_url_encoding_for_special_characters ->
test_dotted_table_id_passed_verbatim_in_path (clone + swap dvojče). Dots
and dashes are RFC 3986 unreserved, so quote(..., safe="") does not
percent-encode them; the test verifies verbatim path pass-through, not
encoding. Docstring corrected to say so.
…ERSION GATE)

1. clone-table wrongly added --hint support. CONTRIBUTING.md (since
   v0.45.0) forbids new hints/definitions entries and should_hint(ctx)
   short-circuits for new commands -- swap-tables (0.28.0, pre-deprecation)
   was mirrored too literally. Removed the should_hint/emit_hint block from
   storage_clone_table and the HintRegistry.register entry for
   storage.clone-table, matching stream (0.50.0) / feature (0.48.0) which
   carry no hint support.

2. Added the VERSION GATE entry `storage clone-table = 0.52.0+` to
   keboola-expert.md (CONTRIBUTING.md mandates it for new min-version
   commands). Freed budget by tightening the #245 line so the prompt stays
   under its 62000-byte cap.
Copy link
Copy Markdown
Member Author

@padak padak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review of #368 — feat(storage): clone-table + correct typify/swap dev-branch semantics (0.52.0)

Generated by kbagent-pr-reviewer subagent (delta re-review). This is the second-pass review; prior findings NB-1 and NB-2 were addressed by the author between passes, and a Devin finding (remove deprecated --hint support for clone-table) was also addressed. The verdict and findings below are advisory; the human author retains every veto.

Summary

This PR ships kbagent storage clone-table (wrapping POST /v2/storage/branch/{branch}/tables/{id}/pull) and corrects the long-wrong claim that swap-tables is dev-branch-only. All three prior-round findings are resolved. The implementation is clean across all three layers, all plugin sync surfaces were updated, make check passes at 3844 (up from the PR description's 3831 due to the rebase onto #366). One BLOCKING finding remains: the new "Dev-branch merge carries only configurations" gotcha entry in gotchas.md uses (verified 2026-06-01) instead of the mandatory (since v0.52.0) tag. One NON-BLOCKING finding: keboola-expert.md §3 has no inline gotcha for the clone-table prerequisite behavior (copy-on-write "bucket not found" trap), and at 61993/62000 bytes the file is one accidental whitespace change away from breaking test_agent_prompt. Three pre-existing file-size ceiling overruns are noted as NITs.

Verdict: REQUEST CHANGES (one BLOCKING finding).

Verdict

  • Verdict: REQUEST CHANGES
  • Blocking findings: 1
  • Non-blocking findings: 2
  • Nits: 3

Blocking findings

[B-1] plugins/kbagent/skills/kbagent/references/gotchas.md:816 — new gotcha section missing (since vX.Y.Z) version tag

The section header ## Dev-branch merge carries only configurations, NOT storage schema (verified 2026-06-01) uses a date-stamp instead of a (since vX.Y.Z) tag. CONTRIBUTING.md §"Documentation changes" states: "The version tag is non-optional; gotchas without versions are how AI agents end up recommending behavior that does not exist on older kbagent installs." Every other section in gotchas.md uses the (since vX.Y.Z) form.

The documented behavior (merge does not carry storage schema) is a platform invariant, not introduced in 0.52.0 -- but the documentation correction ships in 0.52.0, so agents before this release were given the wrong guidance. The correct tag is (since v0.52.0) to indicate when the gotchas entry first became accurate.

Fix: change line 816 to:

## Dev-branch merge carries only configurations, NOT storage schema (since v0.52.0, verified 2026-06-01)

Non-blocking findings

[NB-1] plugins/kbagent/agents/keboola-expert.md — §3 Inline Gotchas missing the copy-on-write "bucket not found" trap for clone-table

The PR description explicitly deferred adding this gotcha to keboola-expert.md §3 because the file is at 61993 bytes against a 62000-byte hard budget (7 bytes of headroom). The trap is: on storage-branches projects, a dev branch reads production tables transparently until the first write, so swap-tables (a write) fails with a misleading "bucket not found" until clone-table is run first. This exact failure pattern is non-obvious and an agent without this context will send users into a confusing diagnostic loop.

The gotchas.md and AGENT_CONTEXT (context.py) do cover it, and CONTRIBUTING.md rates this as NON-BLOCKING ("missing gotcha row degrades subagent ergonomics without making a command undiscoverable"). The suggested path: trim one of the older verbose entries in §3 (e.g. the storage create-table in a dev branch auto-materializes the bucket section, which partially overlaps) to make room, then add a one-bullet entry for the clone-table prerequisite. This can be done in a follow-up PR as noted in the description.

[NB-2] plugins/kbagent/agents/keboola-expert.md — 7-byte budget headroom is fragile

At 61993 bytes, the file is 7 bytes below the test_agent_prompt_under_token_budget hard gate of 62000. Any future edit that adds even a short phrase or a line-ending difference will break the CI test. The PR description correctly notes that clone-table could not be added here without trimming; the NB is that no trimming was done to create room for organic future growth. Recommend targeting ~60000 bytes as a working ceiling (trim ~2000 bytes of stale content) before the next feature lands.

Nits

  • [NIT-1] src/keboola_agent_cli/client.py:3169 LOC (hard ceiling: 2000) — pre-existing violation, +32 LOC added. CONTRIBUTING.md says the next PR touching this file should split first. Worth a follow-up issue so the debt is tracked rather than silently compounding.
  • [NIT-2] src/keboola_agent_cli/commands/storage.py:2731 LOC (hard ceiling: 1200) — pre-existing violation, +90 LOC added. Same note as NIT-1.
  • [NIT-3] src/keboola_agent_cli/services/storage_service.py:2419 LOC (hard ceiling: 1500) — pre-existing violation, +68 LOC added. Same note as NIT-1.

Verification log

  • gh pr view 368 --json state,headRefName,additions,deletions,files → state=OPEN, branch=claude/zealous-torvalds-cc06ab, +1015/-106, 22 files ✓
  • git rev-parse --abbrev-ref HEADclaude/zealous-torvalds-cc06ab (working tree matches PR branch) ✓
  • CONTRIBUTING.md Plugin synchronization map read ✓
  • CLAUDE.md §17 silent-drift surfaces read ✓
  • plugins/kbagent/agents/keboola-expert.md §1, §3 read ✓
  • 3-layer compliance grep (typer in services, httpx in commands, formatter in clients) → empty ✓
  • grep -E '^\+.*(time\.sleep|retries|timeout)\s*=\s*[0-9]+' → empty ✓
  • grep -E '^\+\s*except\s*:' → empty ✓
  • grep -E '^\+\s*print\(' ... src/keboola_agent_cli/' → empty ✓
  • make check3844 passed, 8 skipped ✓ (exit 0)
  • permissions.py OPERATION_REGISTRY"storage.clone-table": "write" present (line 189) ✓
  • hints/definitions/storage.py → no new CommandHint for clone-table (expected: CONTRIBUTING.md §"Code changes" explicitly says --hint support is DEPRECATED, do not add new definitions — reviewers must NOT flag a missing hint definition) ✓
  • CLAUDE.md ## All CLI Commandsclone-table line present ✓
  • src/keboola_agent_cli/commands/context.py AGENT_CONTEXTclone-table entry present ✓
  • plugins/kbagent/skills/kbagent/references/commands-reference.mdclone-table bullet present ✓
  • plugins/kbagent/skills/kbagent/references/gotchas.mdstorage clone-table section at line 792 with (since v0.52.0) ✓; swap-tables section corrected ✓; new Dev-branch merge section at line 816 uses (verified 2026-06-01) WITHOUT (since vX.Y.Z)BLOCKING
  • plugins/kbagent/agents/keboola-expert.md §1 Rule 6 VERSION GATE → storage clone-table = 0.52.0+ present at line 73 ✓; swap-tables matrix row updated to "any branch incl. prod" ✓
  • plugins/kbagent/skills/kbagent/SKILL.mdclone-table row added ✓
  • plugins/kbagent/skills/kbagent/references/storage-types-workflow.md → workflow corrected for two-stage model ✓
  • plugins/kbagent/skills/kbagent/references/typify-table-workflow.md → Phase 8 reworked (no merge, prod swap instead) ✓
  • server/routers/storage.pyPOST /storage/tables/{project}/{table_id:path}/pull endpoint added ✓
  • tests/test_storage_clone.py → 13 tests (client/service/CLI layers) ✓
  • tests/test_e2e.py::TestE2EStorageCloneTable → 3 E2E tests added ✓
  • tests/test_storage_swap.pytest_dotted_table_id_passed_verbatim_in_path rename with corrected docstring ✓ (prior NB-2 resolved)
  • Prior NB-1 (keboola-expert.md matrix claiming swap-tables is dev-branch-only) → fixed: "any branch incl. prod" ✓
  • keboola-expert.md byte size → 61993 bytes vs 62000 budget (7 bytes headroom) — fragile but test passes ✓
  • Behavior reproduction: cannot run E2E without E2E_API_TOKEN/E2E_URL credentials in this reviewer context; author reports live-validated against project 10539 (storage-branches ON); unit + CLI tests exercise all three layers. Deferring E2E verification to author's stated live test results.

Open questions for the author

(none)

padak added 2 commits June 1, 2026 22:32
…eview B-1)

The "Dev-branch merge carries only configurations" gotcha used a bare
(verified 2026-06-01) stamp. CONTRIBUTING.md (convention #17) requires the
(since vX.Y.Z) tag on every gotcha so AI agents don't recommend behavior
documentation that predates the install. Now reads
"(since v0.52.0, verified 2026-06-01)".
…-1/NB-2/NIT-1)

Delta-review follow-up on keboola-expert.md (deferred from this PR for
token budget):
- NB-1: added a §3 inline gotcha for the storage-branches copy-on-write
  trap -- an in-branch swap-tables / column-drop fails "bucket not found"
  until `clone-table` materializes the prod table branch-local.
- NIT-1: removed the dangling (§14.3) cross-reference (no such section)
  and the deprecated --hint alternative in the Retype matrix row.
- NB-2: trimmed the verbose semantic-layer "short form" (full prose
  already lives in gotchas.md) and tightened the auto-materialize entry.
  Headroom against the 62000-byte budget went from 7 to 609 bytes.
@padak padak merged commit 75b672b into main Jun 1, 2026
2 checks passed
@padak padak deleted the claude/zealous-torvalds-cc06ab branch June 1, 2026 20:44
Copy link
Copy Markdown
Member Author

@padak padak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review of #368 — feat(storage): clone-table + correct typify/swap dev-branch semantics (0.52.0)

Generated by kbagent-pr-reviewer subagent (third pass). Verdict and findings below
are advisory; the human author retains every veto. CI-coverable issues
(lint, format, tests) are confirmed via make check, not duplicated here.

Summary

This PR ships two independently useful changes: the clone-table command (a new POST /v2/storage/branch/{branch}/tables/{id}/pull wrapper that materializes a production table branch-locally) and a documentation/docstring correction sweep that removes the false "Storage API rejects this on production" claim from swap-tables. The first half is well-structured across all layers (client, service, command, permissions, serve router, AGENT_CONTEXT, gotchas, workflow docs) with solid unit and E2E coverage. The second half — the correctness fix — is thorough in most places but missed three co-located surfaces that still carry the stale "dev branch only" language. Prior round findings NB-1 (swap semantics) and NB-2 (test name) are confirmed resolved. Verdict: COMMENT — no blocking issues, but three non-blocking documentation inconsistencies remain from the semantics-correction half.

Verdict

  • Verdict: COMMENT
  • Blocking findings: 0
  • Non-blocking findings: 3
  • Nits: 1

Blocking findings

(none)

Non-blocking findings

[NB-1] src/keboola_agent_cli/services/storage_service.py:1377-1381ConfigError message for swap-tables still says "The Storage API rejects this on production"

The PR's stated goal is to correct everywhere that claims "Storage API rejects this on production" (PR description: "Corrected the swap docstrings (client/service), command help, hint, context, gotchas.md, and storage-types-workflow.md"). The service method docstring was updated (lines 1344-1419 diff hunk), but the ConfigError text raised to the user when branch_id is None was not in any diff hunk and was left unchanged: "...The Storage API rejects this on production." This is now a false statement — a default-branch swap IS supported — and it surfaces in the error message users see at exit code 5. A user following the Phase 8 workflow who accidentally omits --branch will see an error that contradicts the documented behavior. Fix: change the tail of the ConfigError message to something like "The branch is mandatory but any branch works, including the default/production branch.".

[NB-2] plugins/kbagent/skills/kbagent/references/commands-reference.md:104 — swap-tables entry still says "swap two storage tables in a dev branch"

The PR corrected the same wording in gotchas.md, keboola-expert.md, storage-types-workflow.md, typify-table-workflow.md, and context.py, but the commands-reference.md entry (the AI agent's command catalogue) was not updated. It still reads "swap two storage tables in a dev branch (POST /tables/{id}/swap)". After this PR the correct statement is "in any branch (including production)". Since commands-reference.md is hand-maintained with no CI freshness check, this will silently mislead any agent that reads it. The fix is a one-word change: replace "in a dev branch" with "in any branch (including the default/production branch)".

[NB-3] src/keboola_agent_cli/commands/storage.py:1329 — swap-tables command docstring first line still says "in a development branch"; SKILL.md inherits the stale description

The Typer command function for swap-tables starts with """Swap two storage tables in a development branch.""". This first line is what make skill-gen uses for the SKILL.md auto-generated decision table; the table was regenerated in this PR (the clone-table row was added), so the stale swap-tables row was re-emitted. Both the --help output and SKILL.md therefore still say "in a development branch" even though the body of the same docstring was correctly updated to say "Any branch works, INCLUDING the default/production branch". Fix: update the first line of the docstring to e.g. """Swap two storage tables (any branch, including production).""" and re-run make skill-gen to propagate.

Nits

  • [NIT-1] tests/test_storage_swap.py:221-222test_no_branch_raises_config_error has docstring "Mandatory branch enforcement: production swap is rejected before any HTTP." This is the opposite of what the test actually validates (it tests that omitting --branch is rejected, not that a production swap is rejected). After the semantic fix, this docstring reads as contradictory. Suggested replacement: "Mandatory branch enforcement: swap-tables without --branch or active branch raises ConfigError before any HTTP." The test logic itself is correct; only the one-line description is misleading.

Prior-round findings status

  • NB-1 (prior round) — swap semantics correction: RESOLVED. gotchas.md, keboola-expert.md, context.py, storage-types-workflow.md, typify-table-workflow.md, client/service docstrings, command help text, hint notes — all updated. The only residual is the three surfaces identified above (NB-1/2/3 in this round).
  • NB-2 (prior round) — test name test_url_encoding_for_special_characters: RESOLVED. Renamed to test_dotted_table_id_passed_verbatim_in_path in tests/test_storage_swap.py:95 with an accurate docstring explaining RFC 3986 unreserved character semantics.

Verification log

  • gh pr view 368 --json title,body,files,additions,deletions → 22 files, +1015/-106, feat(storage): prefix, state OPEN, head claude/zealous-torvalds-cc06ab
  • git rev-parse --abbrev-ref HEADmain; PR branch claude/zealous-torvalds-cc06ab exists as a local ref; source files read via git show claude/zealous-torvalds-cc06ab:<path> throughout ✓
  • make check → 3831 passed, 8 skipped, 14 warnings, exit 0 ✓
  • Layer violations (typer in services, httpx in commands, formatter in clients) → none ✓
  • Convention compliance (magic numbers, bare except:, print() in production code, raw error_code= strings) → no new violations; raw error_code="NOT_FOUND" in test is an established codebase pattern confirmed in 7 existing test files ✓
  • OPERATION_REGISTRY (permissions.py) → "storage.clone-table": "write" present ✓
  • kbagent serve router (server/routers/storage.py) → POST /tables/{project}/{table_id:path}/pull with CloneTable(branch_id: int) body added ✓
  • AGENT_CONTEXT (commands/context.py) → clone-table block added ✓
  • CLAUDE.md ## All CLI Commandskbagent storage clone-table signature present ✓
  • gotchas.md version tags → (since v0.52.0) on both new entries ✓
  • commands-reference.mdclone-table bullet added with (since v0.52.0)
  • keboola-expert.md Rule 6 VERSION GATE → storage clone-table = 0.52.0+ added ✓
  • keboola-expert.md §3 Inline Gotchas → clone-table storage-branches materialization trap added (commit 168feac) ✓
  • hints/definitions/storage.pyswap-tables hint note updated to remove "rejects on production" ✓
  • NB-1 prior round → gotchas.md heading changed from "dev-branch only" to "branch-scoped" ✓
  • NB-2 prior round → test_storage_swap.py test renamed ✓
  • Stale ConfigError text in storage_service.py:1377-1381 → NOT updated (finding NB-1 this round) ✓ (verified via git show claude/zealous-torvalds-cc06ab:src/keboola_agent_cli/services/storage_service.py | sed -n '1376,1386p')
  • Stale commands-reference.md:104 → NOT updated (finding NB-2 this round) ✓ (confirmed as context line in diff)
  • Stale commands/storage.py:1329 docstring first line → NOT updated (finding NB-3 this round) ✓ (confirmed "Swap two storage tables in a development branch." still present)
  • Token discipline (mask_token, no raw tokens in new log/error output) → no issues ✓
  • Backward compatibility: clone-table is entirely new (no old payload consumers); swap-tables JSON shape unchanged ✓
  • E2E coverage: TestE2EStorageCloneTable (3 tests: live pull, dry-run, no-branch rejection) added ✓

Open questions for the author

(none)

padak added a commit that referenced this pull request Jun 1, 2026
* fix(storage): correct stale "dev branch only" swap-tables wording

Follow-up to #368. The swap-tables semantics correction (0.52.0) left four
co-located surfaces still claiming the swap is dev-branch-only / rejected on
production -- now false after that fix:

- services/storage_service.py: the ConfigError raised on a missing --branch
  still said "The Storage API rejects this on production" (user-facing at
  exit code 5, directly contradicting the corrected docstring)
- references/commands-reference.md, commands/storage.py docstring (-> SKILL.md
  via make skill-gen), commands/context.py (AGENT_CONTEXT): "in a dev branch"
  / "in a development branch"

A swap on the default/production branch is supported -- it is how a typed
rebuild is applied to prod. branch_id stays mandatory; this is wording only,
no behavior change. The dependent test match ("dev branch" -> "requires a
branch") and a misleading test docstring were updated accordingly.

Found by the kbagent-pr-reviewer third pass on #368 (NB-1/2/3 + NIT-1).

* fix(storage): address #373 review -- remaining stale "dev branch" swap text

Two surfaces the first pass of this PR missed, flagged by kbagent-pr-reviewer:

- B-1 (blocking): CLI test test_swap_missing_branch_fails_clearly mocked the
  OLD "swap-tables requires a dev branch" string as its side-effect and
  asserted on it. The mock short-circuits the real service, so the test was
  validating phantom text that no longer matches service output. Updated the
  mock + assertion to "requires a branch".
- NB-1: the swap_tables Args docstring still read "branch_id: Dev branch ID";
  corrected to "any branch accepted, including the default/production branch".

clone-table wording left intact (clone legitimately targets a dev branch; its
service message and tests correctly keep "requires a dev branch").
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Clarify dev-branch merge semantics for swap-tables (storage-branches) + feature: clone prod table into branch

1 participant