Skip to content

fix(sync): conflict-aware pull --force — no more silent baseline corruption (0.53.0)#376

Merged
padak merged 4 commits into
mainfrom
fix/sync-force-pull-baseline-corruption
Jun 2, 2026
Merged

fix(sync): conflict-aware pull --force — no more silent baseline corruption (0.53.0)#376
padak merged 4 commits into
mainfrom
fix/sync-force-pull-baseline-corruption

Conversation

@padak
Copy link
Copy Markdown
Member

@padak padak commented Jun 2, 2026

Problem

A user reported (against v0.51.1, project 5785) that sync pull --force silently dropped un-pushed local config edits. Repro:

  1. sync pull a config, edit its _config.ymlsync diff correctly shows 1 to update.
  2. Run sync pull --force (typically to resolve an unrelated config's conflict), with the edited config's remote unchanged.
  3. sync diff and sync push now report "in sync" and ship nothing — yet the local file still holds the edits and the live remote still holds the old config. The edits are stranded, with no signal.

Root cause (confirmed in source + reproduced)

--force skipped the "locally modified" guard in SyncService.pull(). For a config whose remote was unchanged, the remote_unchanged short-circuit then re-stamped the manifest pull_hash from the edited on-disk file — adopting the un-pushed edits as the new synced baseline. The silent part is compounded by the diff local_override_hashes optimization, which skips re-reading a file whose hash matches pull_hash, so the edited content is never even compared.

A failing test (tests/test_sync_force_pull_baseline.py) reproduced the silent modified: 0 deterministically; it is now the regression guard.

Fix

pull --force becomes conflict-aware, branching on 3-way diff state (config and row granularity), per the maintainer decision:

local remote vs base --force behavior
edited unchanged preserve file + 3-way base (pending delta stays pushable; no discard, no silent re-stamp)
edited also changed abort before any write — SyncConflictError, exit 1, code SYNC_CONFLICT, lists each conflict
untouched changed take remote (unchanged)

A read-only pre-pass (_detect_force_pull_conflicts / _is_conflict) detects conflicts before anything is written, so the abort is atomic.

Behavior change for reviewers

--force no longer discards non-conflicting local edits — that was the dangerous behavior. To drop local edits on purpose, delete the file (or config dir) and pull. --force's other powers (orphan cleanup, full refresh, taking remote when local is untouched) are unchanged. --all-projects surfaces a per-project conflict as that project's error without aborting the batch.

What's included

  • errors.SyncConflictError + ErrorCode.SYNC_CONFLICT
  • SyncService._detect_force_pull_conflicts / _is_conflict + the config/row preserve fix in pull()
  • commands/sync.py: red per-config conflict block (human) / SYNC_CONFLICT envelope with details.conflicts (--json); corrected --force help
  • Version bump to 0.53.0 (new ErrorCode + behavior change) + changelog + plugin sync
  • Doc surfaces: CLAUDE.md (sync group was entirely missing), context.py, sync-workflow.md, gotchas.md (since v0.53.0), commands-reference.md, keboola-expert.md

Test plan

  • tests/test_sync_force_pull_baseline.py — config + row: preserve (case b), abort (case a), remote-only-changed takes remote
  • tests/test_sync_cli.py — exit 1 + human/JSON conflict envelope
  • Full suite: 3851 passed, lint/format/ty clean
  • Live-verified read-only against the maintainer's sandbox project padak (id 10539): pulled 38 configs, added parameters._kbagent_live_marker to a config's _config.yml, then sync pull --force reported Files written: 0 + Skipped (1) -- locally modified, and sync diff afterward still reported modified: 1 (parameters._kbagent_live_marker added). Pre-fix this would have been modified: 0 (the silent loss). sync pull is download-only, so the remote project was never mutated.

Open in Devin Review

…ruption (0.53.0)

`sync pull --force` could silently strand un-pushed local edits. When a config
had local edits and its remote was unchanged, `--force` bypassed the
locally-modified guard, hit the `remote_unchanged` short-circuit, and re-stamped
the manifest `pull_hash` from the *edited* on-disk file. Afterwards `sync diff`
and `sync push` reported "in sync" and shipped nothing while the remote still
held the old config -- data loss with no signal. The silent part was compounded
by the diff `local_override_hashes` optimization, which then never even read the
edited file's content.

Fix splits `--force` by 3-way diff state (config and row granularity):
- local edited, remote UNCHANGED   -> preserve file + 3-way base (pending delta
  stays visible to `sync push`; no discard, no silent re-stamp)
- local edited, remote ALSO changed -> abort before writing anything
  (SyncConflictError, exit 1, code SYNC_CONFLICT) so the user resolves it
- local untouched, remote changed   -> take remote (unchanged behavior)

Consequence: `--force` no longer discards non-conflicting local edits. To drop
local edits on purpose, delete the file/dir and pull.

New: errors.SyncConflictError + ErrorCode.SYNC_CONFLICT; read-only conflict
pre-pass SyncService._detect_force_pull_conflicts / _is_conflict (runs before
any write); commands/sync.py prints a red per-config conflict block (human) or a
SYNC_CONFLICT envelope with details.conflicts (--json). --all-projects surfaces
a per-project conflict as that project's error without aborting the batch.

Tests: tests/test_sync_force_pull_baseline.py (config + row; preserve case b,
abort case a, remote-only-changed takes remote) and tests/test_sync_cli.py
(exit 1 + human/JSON envelope). Live-verified read-only against project 1183:
force-pull preserves a locally-edited transformation ("Skipped (1) locally
modified") and `sync diff` still reports modified:1.

Docs: changelog 0.53.0, version bump + make version-sync, CLAUDE.md sync group,
context.py AGENT_CONTEXT, sync-workflow.md, gotchas.md (since v0.53.0),
commands-reference.md, keboola-expert.md.
Copy link
Copy Markdown

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 potential issue.

View 3 additional findings in Devin Review.

Open in Devin Review

Comment thread src/keboola_agent_cli/services/sync_service.py
Copy link
Copy Markdown
Member Author

@padak padak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review of #376 — fix(sync): conflict-aware pull --force — no more silent baseline corruption (0.53.0)

Generated by kbagent-pr-reviewer subagent. Verdict and findings below
are advisory; the human author retains every veto. CI-coverable issues
(lint, format, tests) are confirmed via make check, not duplicated here.

Summary

This PR fixes a confirmed data-loss bug in sync pull --force: when a user had un-pushed local edits to a config whose remote was unchanged, a force-pull silently re-stamped the manifest pull_hash from the edited file, stranding the edits while sync diff and sync push reported "in sync". The fix introduces a read-only pre-pass (_detect_force_pull_conflicts / _is_conflict) that branches on the 3-way diff state before writing anything: locally-modified + remote-unchanged configs are preserved, and true merge conflicts (both sides changed) abort the pull with the new SyncConflictError / ErrorCode.SYNC_CONFLICT. Plugin sync, changelog, tests, and documentation surfaces have all been updated. The implementation is sound and the regression tests cover both config and row granularity. Verdict: COMMENT — one non-blocking gap in the --all-projects conflict serialization path and a missing E2E test (per-policy NON-BLOCKING), plus two nits.

Verdict

  • Verdict: COMMENT
  • Blocking findings: 0
  • Non-blocking findings: 2
  • Nits: 2

Blocking findings

(none)

Non-blocking findings

[NB-1] src/keboola_agent_cli/services/sync_service.py:2483--all-projects conflict surfaces as a plain string, not a structured SYNC_CONFLICT envelope

When sync pull --all-projects --force is used, each per-project pull() call runs in a ThreadPoolExecutor worker. The bare except Exception as exc handler at line 2483 catches SyncConflictError and serialises it as results[alias] = {"error": str(exc)} — a human-readable string — rather than the structured {error: {code: "SYNC_CONFLICT", details: {conflicts: [...]}}} envelope that the single-project path produces. The PR description claims "--all-projects surfaces a per-project conflict as that project's error without aborting the batch", which is functionally true, but AI agents and programmatic consumers parsing --json output have no way to distinguish a SYNC_CONFLICT from a generic error in this path; they cannot enumerate the conflicting configs, cannot determine that a retry-after-resolve strategy applies, and cannot correlate with the details.conflicts schema documented in gotchas.md. A targeted fix would be to detect SyncConflictError in the worker and emit a structured dict (e.g. {"error": str(exc), "error_code": "SYNC_CONFLICT", "conflicts": exc.conflicts}) or, alternatively, add a sentence to gotchas.md and the --all-projects help noting that the --json response for this path does not carry details.conflicts.

[NB-2] tests/test_e2e.py — no E2E test for the sync pull --force conflict-preserve behavior

Per CONTRIBUTING.md "Every new CLI command MUST have a corresponding E2E test in tests/test_e2e.py", and by extension material behavior changes to existing commands warrant E2E coverage. The regression test in tests/test_sync_force_pull_baseline.py is service-layer (no real API call), and the CLI test in tests/test_sync_cli.py uses a mock. A live E2E test that: (1) pulls a project, (2) edits a _config.yml, (3) runs sync pull --force while the remote is unchanged, (4) asserts modified: 1 survives, would guard against regressions in the hash-comparison path that are invisible in unit tests (e.g. a future manifest schema change that stops populating pull_config_hash). Environmental constraints (requires E2E_API_TOKEN + E2E_URL) make this non-blocking for the current PR cycle, consistent with the NON-BLOCKING convention in CONTRIBUTING.md. Recommend deferring one cycle with a tracking note.

Nits

  • [NIT-1] src/keboola_agent_cli/errors.py:290SyncConflictError.__init__ accepts list[dict] but the dict value type is unspecified (dict[str, Any] or a typed TypedDict would be self-documenting). Current callers always pass dict[str, str] (no row_id case) or dict[str, str | None] (row case has no optional fields in practice). Minor type-annotation improvement.

  • [NIT-2] src/keboola_agent_cli/commands/sync.py:252_format_conflict_list emits the conflict count as part of the Rich markup string but uses a raw n = len(conflicts) local with no early-return guard. If conflicts is empty (theoretically unreachable since SyncConflictError requires at least one conflict, but the function is called independently), it prints "0 config(s) changed..." which would be confusing. An if not conflicts: return guard at the top would make the function safer as a standalone helper.

Verification log

  • gh pr view 376 --json title,body,files,additions,deletions,baseRefName,headRefName,labels,state → 16 files, +611/-15, state OPEN, base main, head fix/sync-force-pull-baseline-corruption
  • git rev-parse --abbrev-ref HEADfix/sync-force-pull-baseline-corruption (worktree on correct branch) ✓
  • gh pr diff 376 → 868 lines, fetched to /tmp/kbagent-pr-376.diff
  • Layer compliance (typer/click in services, httpx in commands, formatter in clients) → all empty ✓ (no layer violations)
  • Convention checks (magic numbers, raw error_code="..." strings, bare except:, print() in src, token exposure) → all empty ✓
  • grep -n 'sync\.' src/.../permissions.pysync.init/pull/status/diff/push/branch-link/branch-unlink/branch-status all registered ✓ (no new sync commands added; existing registrations cover the fix)
  • Server router check: ls src/.../server/routers/ → no sync.py router. Confirmed intentional: CLAUDE.md L450 explicitly documents sync init/pull/push/diff are filesystem-local (no serve REST surface)
  • keboola-expert.md §3 gotcha for sync pull --force (0.53.0+) → present at L307-314 ✓
  • keboola-expert.md §1 Rule 6 VERSION GATE → no new 0.53.0 entry; the fix is a behavior correction, not a new command, so no version gate entry is strictly required (existing sync push|pull|diff --branch = 0.47.0+ remains accurate). The §3 gotcha carries the semantic-change marker. ✓
  • plugins/.../gotchas.md → new entry tagged (since v0.53.0) at L2430, prose covers preserve/abort/untouched cases and the details.conflicts JSON schema ✓
  • plugins/.../commands-reference.mdsync pull line updated with --force semantics (since 0.53.0) ✓
  • plugins/.../sync-workflow.md → "Key behaviors" section updated, new ## sync pull --force is conflict-aware (since 0.53.0) subsection added ✓
  • CLAUDE.md ## All CLI Commands → sync group added (was entirely missing before this PR) ✓
  • src/.../commands/context.py AGENT_CONTEXTsync pull --force semantics description added ✓
  • pyproject.toml → bumped to 0.53.0; plugin.json, marketplace.json, uv.lock synced ✓
  • src/.../changelog.py0.53.0 entry present (detailed root-cause + fix summary) ✓
  • make check3851 passed, 8 skipped, 117 deselected, 14 warnings in 80s (exit 0) ✓
  • make skill-check → SKILL.md is up-to-date ✓
  • Service-layer regression tests: tests/test_sync_force_pull_baseline.py added (243 lines), covering config-level preserve (case b), abort (case a), remote-only-changed, and row-level preserve/abort ✓
  • CLI-layer tests: tests/test_sync_cli.py extended with test_sync_pull_force_conflict_human (exit 1 + Rich conflict block) and test_sync_pull_force_conflict_json (SYNC_CONFLICT envelope + details.conflicts) ✓
  • E2E: no test for the new conflict-preserve behavior in tests/test_e2e.py → NON-BLOCKING [NB-2]
  • Behavior verification: PR description includes a live sandbox repro against project padak (id 10539) — force-pulled 38 configs, edited a _config.yml, force-pulled again, confirmed Files written: 0 + Skipped (1) -- locally modified, and sync diff still reported modified: 1. Cannot independently reproduce (no E2E credentials in this environment); unverified live behavior is treated as author-confirmed and consistent with service-layer tests ✓ (with that caveat noted)
  • --all-projects conflict serialization: pull_all worker catches SyncConflictError via bare except Exception as exc at L2483 → results[alias] = {"error": str(exc)} — structured error_code/conflicts fields not propagated → [NB-1]
  • Security: no realistic-looking token literals; SAMPLE_VERIFY_TOKEN in new test uses token_id="tok-001" (fake); _ERROR_CODE_TO_TYPE mapping for SYNC_CONFLICT: "conflict" present ✓

Open questions for the author

(none)

padak added 3 commits June 2, 2026 16:46
…ull coverage

Addresses kbagent-pr-reviewer findings on #376.

NB-1: `pull_all()` flattened `SyncConflictError` to `str(exc)`, dropping the
`SYNC_CONFLICT` code and the conflicts list, so a `--all-projects --json`
consumer (AI agent / script) could not tell a merge conflict from any other
error. `pull_all._worker` now catches `SyncConflictError` and stores
`{error, error_code, conflicts}`, mirroring the single-project envelope.
Unit test in `tests/test_sync_force_pull_baseline.py`.

NB-2: adds `tests/test_e2e.py::TestE2ESyncWorkflow::test_sync_force_pull_conflict_aware`
-- end-to-end against real Storage: create a config, edit it locally, force-pull
with the remote unchanged (assert the edit is preserved and `sync diff` still
reports it modified), then mutate the remote and force-pull again (assert exit 1
and `SYNC_CONFLICT`, with the conflict listed). Cleans up the config.
@padak
Copy link
Copy Markdown
Member Author

padak commented Jun 2, 2026

Thanks for the review — all four findings addressed:

  • NB-1 (--all-projects conflict was a flat string) — 464693f: pull_all._worker now catches SyncConflictError and stores {error, error_code: "SYNC_CONFLICT", conflicts}, matching the single-project envelope so a --json consumer can enumerate conflicts and apply a resolve-then-retry strategy. Unit test test_force_pull_all_projects_emits_structured_conflict.
  • NB-2 (no E2E) — 464693f: added tests/test_e2e.py::TestE2ESyncWorkflow::test_sync_force_pull_conflict_aware — creates a config, edits it locally, force-pulls with the remote unchanged (asserts the edit is preserved and sync diff still reports it modified), then mutates the remote and force-pulls again (asserts exit 1 + SYNC_CONFLICT). Cleans up the config.
  • NIT-1 (SyncConflictError untyped list[dict]) — fa8a962: tightened to list[dict[str, str]], matching the producer _detect_force_pull_conflicts.
  • NIT-2 (_format_conflict_list no empty guard) — fa8a962: added if not conflicts: return.

Changelog updated to note the structured --all-projects conflict + E2E coverage (e7dc119). Full suite green, lint/format/ty clean.

@padak padak merged commit 3c1aee1 into main Jun 2, 2026
2 checks passed
@padak padak deleted the fix/sync-force-pull-baseline-corruption branch June 2, 2026 15:14
@padak
Copy link
Copy Markdown
Member Author

padak commented Jun 2, 2026

Proof the fix works

Two independent demonstrations — a deterministic red→green regression test (anyone can reproduce it), and a live transcript on the shipped v0.53.0 binary.

1. Red → green regression test

tests/test_sync_force_pull_baseline.py::test_force_pull_preserves_unpushed_local_edits encodes the exact bug: pull a config, edit it locally, sync pull --force while its remote is unchanged, then assert the edit is still detected.

Revert just the one-line guard (if not is_new:if not is_new and not force:, i.e. pre-0.53.0 behavior) and the test fails — the silent data loss:

>       assert diff_after["summary"]["modified"] == 1, (
            "force-pull (remote unchanged) silently dropped the un-pushed local edit"
        )
E       assert 0 == 1
FAILED tests/test_sync_force_pull_baseline.py::test_force_pull_preserves_unpushed_local_edits

With the fix restored:

1 passed in 0.07s

assert 0 == 1 is the bug in one line: after the force-pull, sync diff reported 0 modified configs even though the edited file was untouched on disk — exactly the "reports in sync, ships nothing, edit stranded" symptom from the report.

2. Live transcript on shipped v0.53.0

Against a live project (read-only — sync pull is download-only, the remote is never mutated):

$ kbagent version
kbagent v0.53.0    up to date

# pull 38 configs, then add an un-pushed local edit to one config's parameters

$ kbagent sync diff            # baseline healthy
   sync diff -> modified: 1 | conflict: 0 | unchanged: 37

$ kbagent sync pull --force    # remote UNCHANGED -- the exact bug scenario
  Files written: 0
  Skipped (1) -- locally modified:
    ! <config>

$ kbagent sync diff            # AFTER force-pull
   sync diff -> modified: 1 | conflict: 0 | unchanged: 37   # edit survives

# the on-disk file still holds the edit:
  _proof_marker: unpushed-edit

Pre-0.53.0, step 4 would report modified: 0 (the baseline silently re-stamped from the edited file) and sync push would ship nothing. On v0.53.0 the force-pull preserves the edit (Skipped (1) -- locally modified) and the pending delta stays visible to sync push.

Released as v0.53.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant