Strip squash-merged ancestor commits during rebase --onto#193
Strip squash-merged ancestor commits during rebase --onto#193subsetpark merged 12 commits intomainfrom
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughCompute transitive ancestor patch IDs from the graph and propagate them into worktree rebase logic. Worktree base selection now filters git-log candidates by commit subjects that match ancestor patch IDs; main orchestration computes Changes
Sequence DiagramsequenceDiagram
participant Orchestrator as Orchestrator/Main
participant Graph as Graph
participant Worktree as Worktree
participant Git as Git
Orchestrator->>Graph: transitive_ancestors(patch_id)
Graph-->>Orchestrator: ancestor_ids
Orchestrator->>Worktree: execute_worktree_plan(..., ancestor_ids)
Worktree->>Git: git log --cherry-pick --right-only --format=%H %s
Git-->>Worktree: (sha, subject) list
Worktree->>Worktree: filter out subjects matching ancestor_ids
Worktree->>Git: resolve chosen_sha^ (parent)
Git-->>Worktree: parent SHA
Worktree->>Git: git rebase --onto parent target
Git-->>Worktree: rebase result
Worktree-->>Orchestrator: result / fail label
Estimated Code Review Effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
Review SummaryThe core logic — using
4 comments posted · Model: |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lib/graph.ml`:
- Around line 41-46: transitive_ancestors currently seeds seen as empty so
cycles can re-add the root patch_id; fix by seeding seen with the root to
prevent it being returned: in transitive_ancestors change the initial visit call
to start with Set.singleton (module Patch_id) patch_id (or equivalently Set.add
(Set.empty (module Patch_id)) patch_id) so visit will never include the original
patch_id in its result; keep deps and visit logic unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a0695352-09fe-4769-81f1-abc4385f0354
📒 Files selected for processing (7)
bin/main.mllib/graph.mllib/graph.mlilib/worktree.mllib/worktree.mlitest/test_properties.mltest/test_rebase_onto.ml
Review SummaryThe PR correctly implements squash-merge ancestor stripping. The core logic in Findings:
3 comments posted · Model: |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lib/worktree.ml`:
- Around line 386-390: The parsing in oldest_non_ancestor_commit loses a final
space because String.strip removes the trailing separator from a "<sha> " line,
breaking String.lsplit2; fix by preserving the final "%H %s" separator instead
of using String.strip on log_output — e.g., replace String.strip with only
trimming newlines (or rstrip just '\n'/'\r') or simply ensure a single trailing
space is present (append " " if missing) before splitting; update the code paths
around oldest_non_ancestor_commit, log_output, String.strip and the subsequent
String.lsplit2 usage so empty subjects don't drop the separator.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3a77eaeb-cd92-49bb-bfb7-5ac74ea3b61c
📒 Files selected for processing (3)
lib/worktree.mllib/worktree.mlitest/test_rebase_onto.ml
🚧 Files skipped from review as they are similar to previous changes (1)
- lib/worktree.mli
Review SummaryThe diff correctly implements the squash-merge ancestor stripping for Two new issues found:
2 comments posted · Model: |
Review SummaryThe PR correctly fixes the squash-merge drift bug. Previously-outstanding issues from prior turns are fully addressed: Three remaining observations:
3 comments posted · Model: |
Review SummaryThe core logic is correct and well-tested. The squash-merge ancestor-subject filter, Two actionable issues remain:
One note on non-atomic double
3 comments posted · Model: |
There was a problem hiding this comment.
1 issue found across 2 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="lib/worktree.ml">
<violation number="1" location="lib/worktree.ml:436">
P1: `git log` is called with an invalid flag (`--no-signature`), which can make rebase base-detection fail at runtime.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review, or fix all with cubic.
Review SummaryThe PR correctly fixes the squash-merge ancestor-replay bug and all previously outstanding review concerns (CRLF stripping, Four new findings:
4 comments posted · Model: |
Review SummaryThis PR correctly addresses the squash-merge trunk rebase drift bug. The core implementation is sound:
One
4 comments posted · Model: |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lib/worktree.ml`:
- Around line 411-413: The code currently treats "no kept commits" as an error
by returning Result.Error "no unique commits found" from the block that matches
List.last kept; change that to return a distinct success value signalling
"nothing left to transplant" (for example: change the function's Ok payload from
a raw sha string to an option string and return Result.Ok None when kept is
empty, or introduce a dedicated variant like `Nothing_to_transplant`), and
update callers (notably rebase_onto) to handle that new success case by moving
the branch to target without replaying commits instead of falling back to plain
git rebase; locate the code by the match List.last kept expression and the
rebase_onto caller to implement the changes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f09e5e47-3bb0-470b-a351-1d565323050a
📒 Files selected for processing (3)
lib/worktree.mllib/worktree.mlitest/test_properties.ml
✅ Files skipped from review due to trivial changes (1)
- lib/worktree.mli
🚧 Files skipped from review as they are similar to previous changes (1)
- test/test_properties.ml
Review SummaryThis diff correctly addresses all must-fix and should-fix items from prior review turns. The new Three carry-over
3 comments posted · Model: |
find_old_base now switches to `git log --format='%H %s'` so it can see each candidate commit's subject, and filters out commits whose subject matches `[<project>] Patch N:` for any transitive ancestor N. The patch-id fallback from `git rev-list --cherry-pick` alone isn't enough when upstream squash-merges: the squash carries a fresh patch-id that cherry-pick can't equate with the feature branch's original commit, so the ancestor's commit survives the rebase as an empty/no-op or conflicting replay. Graph gains a `transitive_ancestors` helper; both rebase call sites in bin/main.ml pass the computed ancestor set through execute_worktree_plan to the new `Worktree.rebase_onto ~project_name ~ancestor_ids` signature. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Delete oldest_unique_commit (.ml + .mli); migrate its tests to oldest_non_ancestor_commit ~ancestor_ids:[] which subsumes the old behaviour for real git log --format=%H %s output. - In oldest_non_ancestor_commit, treat a line without a space as malformed and skip it rather than treating the whole line as a SHA. Real git output always has "<sha> <subject>" with a mandatory space; the previous fallback would have fed garbage into rev-parse. - Clarify the is_ancestor_patch_subject doc comment to match the actual predicate (any non-[alnum-_] character terminates the id). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Graph.transitive_ancestors now seeds the [seen] set with the root patch_id and removes it from the result, so a dependency cycle (p -> q -> p) can no longer leak p back into its own ancestors. - is_ancestor_patch_subject: short-circuit when project_name or ancestor_ids is empty, so "[] Patch N:" can never accidentally match. Widen the id terminator set to stop only on whitespace or ':' — any Patch_id.to_string that doesn't itself contain those now round-trips, so the function is robust to non-numeric ids (slugs, UUIDs, etc). - oldest_non_ancestor_commit: drop the [String.strip] on the whole log output. An oldest commit with an empty subject emits "<sha> " via [%H %s]; stripping would erase the separator and silently drop the SHA, picking the wrong old_base. Blank/ whitespace lines are filtered explicitly. - Add regressions for the empty-subject and empty-project cases. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- oldest_non_ancestor_commit now rstrips only trailing [\r] on each
line before splitting, so CRLF line endings don't leak the [\r]
into the sha. Explicitly not a full rstrip: the mandatory trailing
space on empty-subject lines ("<sha> ") must survive to keep
[lsplit2] parsing it as a separator.
- prop_transitive_ancestors_transitive: a dep graph with a back-edge
to the root (B depends on root A) would have grand = A, which the
fixed [transitive_ancestors] correctly excludes from ancestors.
Allow grand = p.id as an exception to the closure check.
- Add a CRLF regression test for oldest_non_ancestor_commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- find_old_base now passes --no-signature to git log so a log.showSignature=true in the user's global gitconfig can't inject "gpg: Signature made ..." lines into %H %s output. Without this, oldest_non_ancestor_commit would treat the signature line as a sha and propagate a confusing rev-parse failure into the rebase fallback. - Sharpen is_ancestor_patch_subject's doc comment: the id terminator is any Char.is_whitespace or ':' (not just space and colon as the old wording implied). - Hand-crafted cycle test for Graph.transitive_ancestors: gen_patch_dag only produces DAGs, so the Set.mem termination guard isn't exercised by the property tests. Build A↔B directly, verify the walk terminates and excludes the root. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- worktree.mli: rebase_onto's doc comment now says "git log --cherry-pick" to match the actual implementation. Was still referencing the old "git rev-list --cherry-pick" invocation that this PR replaced. - test_properties.ml: sort both sides of the cycle-test equality so the assertion doesn't depend on Set.to_list traversal order. With only one element per set it's moot today, but the pattern would break silently if future cycles had multiple ancestors. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
git log's option is --show-signature / --no-show-signature; the bare --no-signature form is not recognized and fails with exit 128, which would have made find_old_base fall back to plain rebase on every call once this landed. Verified locally: $ git log --no-signature -n 1 fatal: unrecognized argument: --no-signature Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- oldest_non_ancestor_commit: skip lines where lsplit2 yields an
empty SHA (leading whitespace). Real git log --format=%H %s can't
produce these, but the guard keeps "" from ever reaching
git rev-parse and producing a confusing fallback.
- test 10: the original trailing-newline drift didn't actually
change patch-id (git normalizes trailing whitespace), so
cherry-pick alone already stripped Patch 1 and the subject-filter
path wasn't exercised. Switch to content-level drift
("dep-v1-drift") and add a sanity assertion that confirms
cherry-pick keeps the drifted commit — so the end-state assertion
is genuinely proving the subject-filter code path.
- graph.mli: document that transitive_ancestors returns a list
sorted by Patch_id.compare and is cycle-safe.
- worktree.mli: spell out the full git log invocation
(--no-merges, --no-show-signature) that oldest_non_ancestor_commit
expects as input.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- worktree.ml: expand the comment above the per-line stripping in oldest_non_ancestor_commit to explain why both rstrip (\r only) and the strip-based blank-line gate are needed — one handles CRLF, the other handles space-only separator lines, and "<sha> " must survive both. - test_rebase_onto.ml: add prop_single_line_empty_subject to cover the narrow "<sha> \n" case directly (existing prop_empty_subject_preserved covers the multi-line variant). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
78f6834 to
215ba85
Compare
- worktree.mli: document that project_name="" or ancestor_ids=[] opts out of the subject filter (cherry-pick dedup still applies). - test_properties.ml: extend the hand-crafted cycle block with a three-node A→B→C→A ring, so the [seen]-set termination guard is exercised on a non-adjacent back-edge (C→A is reached only after A→B→C, not directly from A). - test_rebase_onto.ml: rename prop_trailing_whitespace to clarify that the invariant under test is "subject-side trailing spaces don't corrupt the SHA" (the SHA side is always safe because lsplit2 splits at the first space). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Review SummaryThis PR correctly implements the ancestor-subject filter for squash-merged trunks and is well-tested. All previously raised must-fix and should-fix issues from prior turns have been addressed. The remaining open items are all note-level carry-overs:
4 comments posted · Model: |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
lib/worktree.mli (1)
54-57: Document the actual patch-id terminator rule in the interface.The implementation intentionally matches subjects like
[proj] Patch 6 (HEAD -> main)as well as[proj] Patch 6: ..., but this comment currently reads as though:is required. The.mlishould expose the same delimiter contract as the implementation/tests.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/worktree.mli` around lines 54 - 57, Update the comment for the commit-subject matcher in lib/worktree.mli to document the exact patch-id terminator rule: state that the pattern is "[<project_name>] Patch <N>" where the "<N>" may be followed by either a colon (:) or whitespace and then additional tokens (e.g. git ref/parentheses like "(HEAD -> main)") or end-of-line; ensure the comment mentions this behavior alongside the existing reference to ancestor_ids so it reflects the implementation/tests that accept both ":" and space/parenthesis-terminated forms.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@lib/worktree.mli`:
- Around line 54-57: Update the comment for the commit-subject matcher in
lib/worktree.mli to document the exact patch-id terminator rule: state that the
pattern is "[<project_name>] Patch <N>" where the "<N>" may be followed by
either a colon (:) or whitespace and then additional tokens (e.g. git
ref/parentheses like "(HEAD -> main)") or end-of-line; ensure the comment
mentions this behavior alongside the existing reference to ancestor_ids so it
reflects the implementation/tests that accept both ":" and
space/parenthesis-terminated forms.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 13d23ce5-e873-400a-ba39-7f12adf28fdc
📒 Files selected for processing (7)
bin/main.mllib/graph.mllib/graph.mlilib/worktree.mllib/worktree.mlitest/test_properties.mltest/test_rebase_onto.ml
🚧 Files skipped from review as they are similar to previous changes (4)
- lib/graph.ml
- test/test_properties.ml
- bin/main.ml
- test/test_rebase_onto.ml
Review SummaryOverall the PR is in good shape. All prior One new One carry-over
2 comments posted · Model: |
- test_rebase_onto.ml test 10: assert the cherry-pick log contains both commits (Patch 1 + Patch 7) before the SHA comparison, so a future git version that patch-id-equates the drift with the squash fails as an explicit line-count mismatch rather than a confusing SHA mismatch. - graph.mli: spell out that transitive_ancestors returns the list in "ascending order" by Patch_id.compare, removing the (minor) direction ambiguity. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Review SummaryThe core implementation —
3 comments posted · Model: |
- test_rebase_onto.ml test 10: add a comment at the SHA assertion explaining why oldest_non_ancestor_commit returns HEAD~1 (oldest = Patch 1, last line of newest-first git log output) and noting that the [git] helper already strips its subprocess output. - worktree.mli: elaborate the oldest_non_ancestor_commit advisory to explain why both --no-merges and --no-show-signature are load-bearing for callers who build their own git log invocation (merge commits give wrong ~1 parent; GPG lines get mistaken for SHAs). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
You're iterating quickly on this pull request. To help protect your rate limits, cubic has paused automatic reviews on new pushes for now—when you're ready for another review, comment |
| with | ||
| | Result.Ok sha -> | ||
| (* With [~ancestor_ids:[]] the subject filter is inactive, so both | ||
| log lines survive; [oldest_non_ancestor_commit] returns the last |
There was a problem hiding this comment.
[SHOULD FIX] test/test_rebase_onto.ml:823 — Carry-over from Turn 12: The sanity-check now has the line-count assertion (good), but the SHA assertion immediately after lacks a comment explaining why oldest_non_ancestor_commit must return HEAD~1 rather than HEAD. oldest_non_ancestor_commit returns List.last kept — the last (oldest) line of newest-first git log output. With two commits in the log (Patch 7 = HEAD = first line, Patch 1 = HEAD~1 = last line), List.last returns Patch 1's SHA. Without a comment a reader may assume the function returns the newest SHA and wonder why we assert against HEAD~1.
| log lines survive; [oldest_non_ancestor_commit] returns the last | |
| ```ocaml | |
| (* oldest_non_ancestor_commit returns the *last* (oldest) line of the | |
| newest-first log, so with two commits the result is Patch 1 = HEAD~1, | |
| not Patch 7 = HEAD. *) | |
| let patch1_sha = git ~process_mgr ~dir [ "rev-parse"; "HEAD~1" ] in | |
| assert_eq "test10: cherry-pick alone keeps drifted Patch 1" patch1_sha sha |
> **Scope:** Two lines in `test/test_rebase_onto.ml` around line 823; no other files need changing.
| excluding [patch_id] itself. Missing IDs in [deps_map] act as leaves. | ||
| Dependency cycles that point back to [patch_id] do not re-add it. *) | ||
| let transitive_ancestors t patch_id = | ||
| let rec visit seen pid = |
There was a problem hiding this comment.
[NOTE] lib/graph.ml:43 — transitive_ancestors returns a Patch_id.t list and is_ancestor_patch_subject in worktree.ml uses List.mem (O(n) per commit line scanned). For current gameplan sizes (tens of patches) this is harmless. If ancestor sets grow large, passing a Set.M(Patch_id).t directly and replacing List.mem with Set.mem (O(log n)) would improve performance.
| let rec visit seen pid = | |
| No change required at current scale. If ancestor counts grow, change the return type to `Set.M(Patch_id).t` and update `is_ancestor_patch_subject` to use `Set.mem`. |
Scope:
lib/graph.ml,lib/graph.mli,lib/worktree.ml, and bothbin/main.mlcall sites.
| patch_id) | ||
| in | ||
| let ancestor_ids = | ||
| Runtime.read runtime (fun snap -> |
There was a problem hiding this comment.
[NOTE] bin/main.ml:2571 — agent and ancestor_ids are fetched in two separate Runtime.read calls (two mutex acquisitions). The graph is immutable after startup so there is no real race today. The same pattern appears at the merge-conflict rebase path (~line 2906). If the graph ever becomes mutable, combining into a single read would be the safe fix.
| Runtime.read runtime (fun snap -> | |
| No change required while the graph is immutable. If it becomes mutable, combine into a single read: | |
| ```ocaml | |
| let agent, ancestor_ids = | |
| Runtime.read runtime (fun snap -> | |
| let agent = Orchestrator.agent snap.Runtime.orchestrator patch_id in | |
| let ancestor_ids = | |
| Graph.transitive_ancestors | |
| (Orchestrator.graph snap.Runtime.orchestrator) patch_id | |
| in | |
| (agent, ancestor_ids)) | |
| in |
> **Scope:** Both rebase call sites in `bin/main.ml`.
Review SummaryThis PR is in good shape. All must-fix and should-fix issues from prior review turns have been resolved in this diff:
One remaining should-fix: the SHA-comparison block in test 10 would benefit from a one-line comment explaining that Two carry-over notes (O(n)
3 comments posted · Model: |
Summary
Worktree.find_old_basenow reads each candidate commit's subject alongside its SHA (viagit log --format='%H %s') and filters out commits matching[<project>] Patch N:for any transitive ancestorN, in addition to the existinggit rev-list --cherry-pickpatch-id filter.trigger-only-execution/patch-7— the auto--onto mainreplayed merged Patch 1 and Patch 2 back onto main because their feature-branch patch-ids had drifted from the squashes.Graph.transitive_ancestorsexposes the ancestor set; both rebase call sites inbin/main.ml(theRebaseop and merge-conflict rebase) thread it throughexecute_worktree_plan→Worktree.rebase_onto ~project_name ~ancestor_ids.Test plan
dune build(fatal-warnings clean)dune runtest --force— full suite passes, including:is_ancestor_patch_subjectandoldest_non_ancestor_commit(covers squash(#42)suffix, wrong-project tag, ad-hoc agent subjects, missing-colon form, empty-ancestor passthrough)Graph.transitive_ancestors(self-exclusion, direct-deps coverage, deps-closure)test_rebase_onto.ml: reproduces the patch-7 drift — an ancestor commit with the[proj] Patch 1:subject whose content was touched (so its patch-id no longer matches main's squash) is correctly stripped whenancestor_ids=[1]; only the current patch's commit sits on top of the squashrebase_ontointegration tests keep passing by threading~project_name:""and~ancestor_ids:[](no-op filter preserves pre-change behavior)🤖 Generated with Claude Code
Summary by cubic
Stops rebase --onto from replaying squash-merged ancestor commits by reading
git logsubjects and skipping[<project>] Patch N:for transitive ancestors. Picks the correct old base on squash-merging trunks and hardens log parsing.Bug Fixes
Worktree.find_old_basenow usesgit log --cherry-pick --format='%H %s' --no-show-signature --no-merges, returns the oldest SHA after dropping lines whose subjects match[<project>] Patch N:for anyancestor_ids; strips trailing\r, skips blank/malformed/empty-SHA lines, and supports opt-out viaproject_name=""orancestor_ids=[]. Docs clarify why--no-mergesand--no-show-signatureare required.Graph.transitive_ancestors(cycle-safe, ascending order) and threaded its result intoWorktree.rebase_onto ~project_name ~ancestor_idsat both rebase call sites; tests include a line-count sanity check to prove the subject-filter path.Refactors
oldest_unique_commitwitholdest_non_ancestor_commit, which subsumes the old behavior.Written for commit 32b2a31. Summary will update on new commits.
Summary by CodeRabbit
New Features
Bug Fixes
Tests