Add BlockGraph: multi-tip monotone chain tracker#18
Draft
evanlinjin wants to merge 209 commits into
Draft
Conversation
Bumps [Swatinem/rust-cache](https://github.com/swatinem/rust-cache) from 2.7.7 to 2.7.8. - [Release notes](https://github.com/swatinem/rust-cache/releases) - [Changelog](https://github.com/Swatinem/rust-cache/blob/master/CHANGELOG.md) - [Commits](Swatinem/rust-cache@v2.7.7...v2.7.8) --- updated-dependencies: - dependency-name: Swatinem/rust-cache dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>
Introduce `CanonicalizationParams` which is passed in to `CanonicalIter::new`. `CanonicalizationParams::assume_canonical` is the only field right now. This contains a list of txids that we assume to be canonical, superceding any other canonicalization rules.
* `From<CanonicalTx> for Txid` * `From<CanonicalTx> for Arc<Transaction>` Also added a convenience method `ChainPosition::is_unconfirmed`. These are intended to simplify the calls needed to populate the `expected_mempool_txids` field of `Emitter::new`.
* Change signature of `Emitter::new` so that `expected_mempool_txids` can be more easily constructed from `TxGraph` methods. * Change generic bounds of `C` within `Emitter<C>` to be `C: DeRef, C::Target: RpcApi`. This allows the caller to have `Arc<Client>` as `C` and does not force to caller to hold a lifetimed reference.
- bdk_core 0.5.0 - bitcoind_rpc 0.19.0 - electrum 0.22.0 - esplora 0.21.0 - file_store 0.20.0 - testenv 0.12.0
- it's a small fix for `merge_chains` docs, reported on audit. - adds an `Errors` section to cover what scenarios it can fail.
Clippy was complaining about overindented list items, so fix that here as well.
Thanks to @ValuedMammal for the suggestions
* When merging changesets, assert that spk of a given descriptor id & derivation index does not get changed. * When reading spk from cache, check the spk by deriving it.
This incentivies constructing `KeychainTxOutIndex` from a changeset before inserting descriptors (to make use of the spk cache).
Also added staging changes to `ChangeSet::spk_cache`. This way, we can avoid returning `ChangeSet`s for `apply_changeset` and `insert_descriptor`. * `KeychainTxOutIndex::new` now takes in an additional `use_spk_cache` parameter. * Fixed `reveal_to_target` method to actually return `None` if the keychain does not exist.
* `new` is now intended to construct a fresh indexed-tx-graph * `from_changeset` is added for constructing indexed-tx-graph from a previously persisted changeset * added `reindex` for calling after indexer mutations that require it * reintroduce `Default` impl
Do not reference last revealed table, in case none are revealed. Correct SQL column name.
core to 0.6.0 bitcoind_rpc to 0.20.0 electrum to 0.23.0 esplora to 0.22.0 file_store 0.21.0 testenv to 0.13.0
Add a `SECURITY.md` listing the security PGP key to be used for disclosures
- update the `code_coverage.yml` CI job to use codecov instead of coveralls, copied from `bdk_wallet` repository.
`BlockGraph<D>` keeps every observed branch tip simultaneously and its `ChangeSet<D>` is strictly additive: applying the same changeset twice — or applying two changesets in either order — yields the same graph state. State is held as a `Vec<CheckPoint<D>>` of tips; shared ancestry between tips is shared through `Arc<CPInner>` automatically. The `ChangeSet` is split into a `blocks: BTreeMap<BlockHash, D>` block pool and a `branches: BTreeMap<BlockId, BTreeSet<BlockId>>` per-tip index, so the `D` payload is stored exactly once per block regardless of how many branches contain it. Implements `ChainOracle` (queryable against any tip, not just the canonical one) and `Merge` (first-write-wins on `blocks`, set union on `branches`).
… cheaper delta - Sort `tips` by `(Reverse(height), hash)` so `tips[0]` is the best tip; drop the separate `best: usize` index. `PartialEq` becomes element-wise vec equality. - Make `ChangeSet` reconstruction lenient: dangling branch refs, blocks whose stored data doesn't hash to the BlockId key, and `prev_blockhash` mismatches are silently skipped instead of erroring. Within a branch, a non-linking block at height H is dropped and a sibling at the same height gets a chance to push; blocks at higher heights still attach to the most recent successful checkpoint. Branches whose bottom doesn't reach genesis are dropped whole. - As a result, `ApplyChangeSetError` collapses to just `MissingGenesisError` (reused from `local_chain`) and `apply_changeset` is infallible. - Replace `apply_update`'s pre/post `initial_changeset()` calls with a cheap snapshot of `(BTreeSet<BlockHash>, BTreeMap<BlockId, BTreeSet<BlockId>>)` — no `D` clones in the pre-snapshot. After `absorb_tip`, walk the new tips once and clone `D` only for blocks not already known. New test: `from_changeset_silently_skips_prev_blockhash_mismatch` exercises the lenient `Header` reconstruction path.
…ng branches
Two issues with `prev_blockhash` handling in `block_graph.rs`:
1. `build_branch_lenient` would skip a non-linking block at height H but still
accept higher non-adjacent blocks — because `CheckPoint::push`'s
`prev_blockhash` check only fires on adjacent heights. With `D = Header`,
a chain
bid_set = {(0, G), (1, A_bad), (2, B)} where A_bad.prev != G and B.prev = A_bad
used to silently produce a chain `[0, 2]` even though B references the
dropped A_bad. Fixed by grouping candidates by height and truncating the
branch entirely once all candidates at some height fail to link. This
matches the spec "if all blocks at a height fail to link, ignore that
height and up". Same-height candidates are still tried in `(height, hash)`
order so a linking candidate beats a non-linking one.
2. `merge_sparse` used `CheckPoint::from_blocks(union).expect(...)`, which
would panic if a caller's `CheckPoint` was constructed without the usual
push-time validation (e.g. via `CheckPoint::new` plus hand-rolled tail) and
the resulting union had a `prev_blockhash` conflict at adjacent heights.
Replaced with an incremental push loop that skips non-linking entries.
The shared tip is preserved because base's tip was already validated when
`base` was originally constructed.
Two new tests:
- `from_changeset_truncates_branch_above_unlinkable_height` exercises the
truncate-on-all-fail rule.
- `from_changeset_prefers_linking_candidate_at_same_height` exercises the
linking-candidate-wins rule at a single height.
A `PortedCase` harness runs every input pair from `test_local_chain`'s
`update_local_chain` test through `BlockGraph::apply_update` and classifies the
outcome:
- `SameBest` (9 cases): LocalChain accepts without invalidation; BlockGraph
produces the same best tip and a single tip. Verified by running LocalChain
in parallel and asserting `graph.tip() == reference.tip()`.
- `ForkBgWinsLong` (3 cases — transient-invalidation patterns): LocalChain
accepts via invalidation, which can shorten the chain. BlockGraph keeps
both branches; the existing longer branch wins by `(max height, lowest
hash)`, so BG's best tip is *strictly different* from LC's. The test
asserts the BG best is the (height, lowest-hash) winner of
`(prior_tip, update_tip)` and explicitly that it differs from LC's tip.
- `Diverge` (4 cases): LocalChain returns `CannotConnectError` because the
merge is ambiguous. BlockGraph accepts; both tips retained.
- `GenesisMismatch` (1 case): both reject with
`CannotConnectError { try_include_height: 0 }`.
Each case also exercises the `initial_changeset` → `from_changeset` round-trip
to confirm reconstruction equality.
…reverse index `ChangeSet::branches` is now a `Branches` wrapper type that maintains both the forward `BTreeMap<BlockId, BTreeSet<BlockId>>` and a reverse index `BlockId → set-of-tips-containing-it`. The reverse index is rebuilt on serde deserialize and kept in sync via `insert` / `extend_branch` / `Merge` — direct mutation of either map is impossible at the type level. Wire format is unchanged: `serde::Serialize` emits only the forward map, `serde::Deserialize` rebuilds the reverse index. `PartialEq` and `is_empty` look at the forward map only. Why: the upcoming implicit-anchor reconstruction needs `O(log)` lookups for "which branches contain this BlockId as a member?". A scan would be `O(N·M)` each, and the cascade-staged-fragments loop in particular runs it per merge. Wrapping the map gives us the index in one place and keeps callers honest. This commit is behaviour-preserving: no algorithm changes, just the type migration and the new `Branches::containing` accessor. New test `branches_reverse_index_stays_consistent_through_merge` exercises the index invariant across `Merge::merge` calls.
`apply_update` now emits a linear-size delta — the new tip's `branches[…]`
entry contains only BlockIds from the **anchor** (the highest BlockId in the
update's chain already known to `self`) up to the new tip, not the full
genesis-to-tip set. Storage grows `O(H·N)` instead of `O(H·N²)` for N
sequential tip-extensions of H blocks each.
`reconstruct` learns the symmetric path: for each branch in the changeset,
if the smallest BlockId in its set is genesis-rooted, build the chain
standalone (as before); otherwise look up a live tip that has the anchor
in its chain via `Branches::containing` + `find_predecessor_at`, splice the
fragment onto that predecessor with `lenient_extend_above_anchor`, and
absorb. If no predecessor is reachable yet, the fragment is staged in a new
`staged: BTreeMap<BlockId, StagedFragment<D>>` field.
`StagedFragment { anchor, blocks }` stores the anchor BlockId explicitly so
that it survives even when the original delta omits the anchor's data
(common in deltas where the producer's pre-state already had it). The
fragment carries the anchor as a "ghost" BlockId in `branches` and the
height → data map for the heights it could resolve.
`cascade_staged` runs at the end of `apply_update` and `reconstruct` and
loops to a fixed point. When a tip arrives that contains a staged
fragment's anchor, the fragment is spliced and absorbed. Chained
promotions (A anchored at B anchored at C) are handled because each
promotion may unlock the next.
`initial_changeset` now emits staged fragments alongside reachable tips so
they survive round-trips. The anchor BlockId is emitted into `branches`
explicitly to preserve the splice point.
New tests:
- apply_update_delta_is_linear_in_chain_length_not_quadratic_in_update_count
20 sequential tip-extensions of 10 blocks each; asserts persisted
`branches[…]` BlockId-ref count is linear (≤ 1 + N·(H+1)) and strictly
below the v1 quadratic lower bound.
- out_of_order_delta_stages_then_promotes
Persistor B receives delta_for_(2,B) before delta_for_(1,A); fragment
stages on the first apply, promotes on the second. End state equals
in-order graph_a.
- stranded_staged_fragment_survives_roundtrip
A persistor with only a non-genesis branch entry: from_changeset stages
the fragment, round-trip via initial_changeset preserves it.
- cascade_promotes_chained_staged_fragments
Three fragments delivered tip-first; cascade promotes B then A after X
arrives.
… Phase 3 tests
Module-level rustdoc now describes the implicit-anchor `ChangeSet` shape, the
linear storage growth property, the splice / stage / cascade flow, and how
[`StagedFragment`] survives out-of-order multi-source merges.
New tests:
- apply_update_delta_shape_uses_anchor_not_full_chain
Locks in the anchored-delta wire shape: the second apply's `branches[…]`
entry contains `{previous_tip, new_tip}` only, and the anchor's data is
not re-emitted into `blocks`.
- ported_local_chain_disjoint_chains_through_staging
Re-uses the `LocalChain` "two disjoint chains cannot merge" scenario via
changeset merging — BlockGraph accepts via the genesis-rooted path
without needing staging, and retains both forks.
Pure file split — `Branches` and its impls move from `block_graph.rs` to a sibling `block_graph/branches.rs` submodule. Re-exported via `pub use branches::Branches;` so the public path is unchanged.
The lifecycle of these fragments is "set aside until an external observation arrives that proves they're safe to admit" — closer to a quarantine than a git-style staging area, where the next promotion step is internal/queued. The quarantine metaphor more accurately describes that promotion is contingent on external information that may never arrive. Renames: - `BlockGraph::staged` → `BlockGraph::quarantined` - `BlockGraph::staged()` → `BlockGraph::quarantined()` - `BlockGraph::staged_count()` → `BlockGraph::quarantined_count()` - `BlockGraph::cascade_staged()` → `BlockGraph::release_quarantined()` - `StagedFragment` → `QuarantinedFragment` - Test names and doc comments updated to match. Pure rename — no behavioural change. 24/24 tests pass.
The file lives at `crates/chain/src/branches.rs` now (instead of as a submodule under `crates/chain/src/block_graph/`), declared as a crate-private `mod branches;` in `lib.rs`. It's still re-exported as `bdk_chain::block_graph::Branches` so the public path is unchanged.
\`QuarantinedFragment\` now holds \`anchors: BTreeSet<BlockId>\` instead of a
single \`anchor: BlockId\`. Every BlockId in the producer's \`branches[T]\`
set below the fragment's tip is a candidate splice point. At release time
the cascade tries them highest-first and splices at the first reachable
one.
Why: with the single-anchor design, a fragment quarantined with
\`branches[(10, X)] = {(2, A), (5, B), (10, X)}\` would be stuck waiting on
\`(2, A)\` even if a future merge supplied a chain ending at \`(5, B)\`.
Multi-source merges that produce overlapping-but-not-identical chains
would deadlock quarantined fragments unnecessarily. With the candidate set,
any future tip that lands on any candidate releases the fragment.
Three call sites change:
- \`reconstruct\` (non-genesis branch): collect candidate anchors from
\`bid_set\`, try highest-first, splice at the first reachable one;
quarantine with the full set if none.
- \`release_quarantined\`: per fragment, iterate \`anchors\` highest-first;
splice at the first reachable. The lenient \`push\` loop no longer
pre-filters by anchor height — \`push\`'s height check naturally drops
entries at or below the chosen anchor, so the same loop is correct for
any anchor choice.
- \`initial_changeset\`: emit every candidate anchor as a \`branches\` entry
so future reconstructions have the same options on round-trip.
New tests:
- quarantined_fragment_releases_via_highest_reachable_anchor: ghost
anchors at (2, A) and (5, B); only (5, B) becomes reachable; fragment
releases via (5, B) with no spurious height-2 entry.
- quarantined_fragment_releases_via_lower_anchor_when_higher_unreachable:
symmetric — only (2, A) reachable; releases via the lower anchor.
26/26 tests pass.
Pure prose pass — no API or behaviour changes: - Module docs: drop redundant phrasing, fix the stale "smallest BlockId is the anchor" wording (we use a candidate set now), tighten the reconstruction paragraph. - \`QuarantinedFragment\`: consolidate struct-level doc with field-level docs (was repeating itself). - \`BlockGraph\` / \`ChangeSet\`: drop "see the module docs" filler; refer to invariants concisely. - \`from_changeset\`, \`apply_update\`, \`apply_changeset\`, \`initial_changeset\`: one-liner-ify where the body is self-explanatory. - \`Branches\`: trim per-method docs to fit one-line summaries; the surface is small and the methods are self-explanatory. - Fix a broken doc-link to a private item and an unresolved \`serde::Deserialize\` intra-doc link.
…rectly The \`Branches\` wrapper existed to host a reverse \`BlockId → tips_containing\` index, but \`containing\` was only ever called from a single test — production code paths (\`find_predecessor_at\`, \`release_quarantined\`, \`reconstruct\`) iterate \`self.tips\` directly, not the changeset's index. Without an actual production caller for the reverse lookup, the wrapper was paying maintenance cost (sync on every \`insert\`, rebuild on \`Deserialize\`, extra memory on every \`apply_update\` delta) for nothing. This commit reverts \`ChangeSet::branches\` to \`BTreeMap<BlockId, BTreeSet<BlockId>>\` and inlines the per-tip union back into \`ChangeSet::merge\`. If a future hot path actually wants the reverse lookup, we'll add it back at the point of use — eagerly or lazily — with a real workload to measure against. Removed: - \`pub struct Branches\` and its impls. - \`branches.rs\` module entirely. - \`branches_reverse_index_stays_consistent_through_merge\` test (was testing the dropped index). All other Phase 2 work (implicit-anchor deltas, quarantine, lenient reconstruction) is unchanged. 25/25 tests still pass.
Adds 7 tests covering edge cases the existing suite missed:
1. \`apply_update_with_ancestor_update_is_noop\` — \`relate::TExtendsUpdate\`
was the only \`Relation\` variant untested. Applying a strict ancestor
of the current tip should leave state unchanged and emit an empty delta.
2. \`release_quarantined_promotes_multiple_fragments_at_same_anchor\` —
two quarantined fragments sharing a candidate anchor. We tested
chained cascade (A→B→X) but not parallel release at the same anchor.
3. \`is_block_in_chain_against_non_canonical_tip\` — \`ChainOracle\` queries
against a retained non-canonical tip. Tests the multi-tip query
feature that distinguishes \`BlockGraph\` from \`LocalChain\`.
4. \`from_changeset_skips_branch_with_no_anchors_below_tip\` — malformed
\`branches[T] = {T}\` (tip-only set, no candidate anchors). Code path:
\`anchors.is_empty()\` short-circuit. Asserts silent skip rather than
quarantine (a fragment with no anchors could never release).
5. \`release_quarantined_with_anchor_data_in_blocks\` — overlap case where
the producer ships data for both anchor and tip. The anchor entry in
\`frag.blocks\` attempts a \`push\` at the predecessor's height, fails
the height check, gets skipped. Validates the no-pre-filter design
discussed in review.
6. \`apply_changeset_is_idempotent\` — same changeset applied twice
produces the same graph. (We had idempotence tests for
\`apply_update\` and \`Merge\`; this closes the loop on
\`apply_changeset\`.)
7. \`apply_empty_changeset_is_noop\` — empty delta leaves the graph
unchanged.
32/32 tests pass.
Adds property-based tests over a deterministic block-hash universe. 5 of 8 properties pass; the remaining 3 are \`#[ignore]\`'d and serve as living documentation for three order-dependence bugs proptest surfaced: 1. \`absorb_tip\` drops sparse coverage on \`UpdateExtendsT\` / \`TExtendsUpdate\`. When relate() says one CheckPoint strictly extends another, the loop drops the shorter one — losing heights the shorter one has but the longer one doesn't. 2. \`absorb_tip\` Diverge case doesn't enrich shared history. Tips that share a common ancestor maintain independent sparse views of the at-or-below-CA heights; observations in one tip don't propagate. 3. Anchored deltas don't carry merged-in heights or tombstones. \`apply_update\` emits a delta based on the chain at apply time; subsequent merges into the same tip's chain don't get re-emitted, and absorbed tips have no tombstone, so reconstruction sees stale tips that direct apply has already dropped. The bottom of the test file sketches four resolution options (global observation index, tombstoning, post-absorb enrichment, full-chain delta emission) with their trade-offs. Passing properties exercise: invariants under arbitrary apply sequences, invariants under fuzzy / malformed changesets, \`Merge\` commutativity + idempotence, round-trip via \`initial_changeset\`, and panic-freedom on arbitrary input. \`.gitignore\` adds \`*.proptest-regressions\` (local-only seed files).
…ependence bugs Rewrites \`BlockGraph::absorb_tip\` to merge sparse coverage across every existing tip with the incoming update, plus a fixed-point pass to reconcile pairs of remaining tips. This is the design suggested in PR discussion as "try apply update to all tips and try-merge on all tips" — simpler than a global observation index, contained in \`absorb_tip\`, no public-API change. What changed - \`absorb_tip\`: per existing tip, compute \`deepest_shared_height\` with the incoming update; merge at the shared point (SameTipId, UpdateExtendsT, TExtendsUpdate) or enrich both with the at-or-below-shared union (Diverge). TExtendsUpdate is deferred until after the loop so it picks up later iterations' enrichment. A fixed-point pass then reconciles existing tips against each other. Ancestor cleanup drops any tip that's now an ancestor of another. - New helpers: \`deepest_shared_height\`, \`enrich_at_and_below\`. The old \`Relation\` enum + \`relate()\` function are gone. - \`reconstruct\`: tightened to only release a quarantined-by-splice fragment if the spliced chain actually materializes \`tip_id\` (otherwise leaves it quarantined so its info isn't lost from persisted state). - \`reconstruct\`: transitive expansion of branches sets — if a tip's bid_set contains another tip's BlockId, the other tip's branches entry merges in before splicing. Required for out-of-order delta merging where branches detail for an intermediate tip arrives separately. - \`release_quarantined\`: a fragment only releases if its splice materializes its claimed tip BlockId; otherwise stays quarantined. After the loop, any remaining fragment whose tip is already in a live tip's chain is dropped as redundant. - Delta emission: drops the "skip data when hash in pre_hashes" optimization (broke out-of-order replay — receivers may not yet have prior deltas). Always emits data for emitted BlockIds; \`Merge\` first-write-wins dedupes. - Delta emission for new tips: includes absorbed pre-tip BlockIds (so reconstruction can splice through them) but excludes their internal chain entries (still persisted under the absorbed tip's own branches). Bugs fixed (originally surfaced by proptest) 1. \`UpdateExtendsT\` / \`TExtendsUpdate\` no longer drop sparse coverage of the shorter chain. 2. \`Diverge\` tips with a common ancestor now share the at-or-below-CA coverage via the fixpoint pass. 3. Reconstruct no longer "reanimates" absorbed pre-tips as separate live tips when they're already covered by a surviving tip's chain. Unit-test impact - \`apply_update_delta_shape_uses_anchor_not_full_chain\` updated to reflect the new delta emission policy (every emitted BlockId carries its data; the anchor's data is no longer skipped). - All other 31 unit tests pass unchanged. Proptest status - 7 of 8 properties pass: \`apply_update_order_independence\` (was failing), \`delta_accumulation_matches_direct_apply\` (was failing), plus invariants, Merge laws, round-trip, panic-freedom on fuzzy input. - 1 property \`#[ignore]\`'d: \`out_of_order_delta_application_converges\`. An adversarial shuffle of per-call deltas reconstructs a tip as separate when canonical absorbs it. Documented in the test file with the suspected resolution path (tombstones for absorbed tips, or richer delta emission).
…l order-deps
Replaces the previous per-tip CheckPoint-based design with a single source of
truth — \`blocks: BTreeMap<BlockId, (Header, BTreeSet<BlockId>)>\` — from which
tips are *derived* on every state change. Removes the \`D\` generic parameter
(hardcoded to \`Header\`).
Why
- The merge-with-all-tips absorb_tip from the last commit fixed live-graph
order-dependence but couldn't fix the delta-replay variant. The structural
cause was that tips were stored independently of the underlying observations,
so reconstruction couldn't tell "absorbed" from "still live". The new model
eliminates that gap: tips are *computed* from the block graph on every
change. No tombstones, no per-tip diff log, no enrichment fixpoint.
Data model
- \`ChangeSet { blocks: BTreeMap<BlockId, (Header, BTreeSet<BlockId>)> }\`.
- Per-block ancestry: \`Header::prev_blockhash\` gives the natural adjacent
parent. \`sparse_links\` records observed predecessors at non-adjacent
heights (i.e., when the block was seen via a sparse CheckPoint chain).
Empty for dense observations.
- \`Merge\` is map-union with set-union on \`sparse_links\` — monotone,
commutative, idempotent. By construction.
Internal graph
- \`BlockGraph { blocks, tips, tip_by_hash, quarantined, genesis_hash }\`.
- \`recompute()\` rebuilds tips + quarantine on every state change via:
1. For each block, choose its observed parent (natural via prev_blockhash,
fallback to highest sparse-link target).
2. BFS forward from genesis.
3. Reachable leaves = tips. Unreachable blocks = quarantined.
4. Materialise each tip's chain via the chosen-parent walk.
API
- \`from_genesis(Header)\`, \`from_tip(CheckPoint<Header>)\`,
\`from_changeset(ChangeSet)\` — all return the graph or \`MissingGenesisError\`.
- \`apply_update(CheckPoint<Header>) -> Result<ChangeSet, CannotConnectError>\`.
- \`apply_changeset(&ChangeSet)\`.
- \`initial_changeset() -> ChangeSet\`.
- \`tip()\`, \`tips()\`, \`tip_count()\`, \`quarantined()\`, \`quarantined_count()\`,
\`genesis_hash()\`, \`get(height)\`, \`range\`, \`iter_checkpoints\`.
- \`ChainOracle\` impl supports multi-tip queries.
Memory
- ~277 B/block in live state for Header (vs ValuedMammal's ~554 B). 50%
reduction from collapsing VM's three indexes (\`blocks\` + \`parents\` +
\`next_hashes\`) into one map + the implicit \`Header::prev_blockhash\`.
Order-independence
- Properties that previously failed under shuffled delta replay now pass.
The \`out_of_order_delta_application_converges\` proptest (previously
\`#[ignore]\`'d) is green.
Tests
- 20 unit tests (rewritten from scratch with \`Header\` construction helpers
since the previous suite used \`hash!()\`/\`chain_update!()\` macros that
produce \`CheckPoint<BlockHash>\`).
- 6 proptests, all passing — including the previously-broken one.
- Net: -2247 lines, +744 lines.
`chain_tip` doesn't need to be a top-level live tip in `self.tips` — it can be any BlockId on some retained tip's chain (e.g., a wallet's last-confirmed tip that's been superseded by a longer chain but not yet caught up to). The old logic returned `Ok(None)` whenever `chain_tip` wasn't a live tip key, which made queries against historical confirmations spuriously uncertain. The fix: scan live tips for any whose chain contains `chain_tip` at `chain_tip.height` with matching hash; use that tip's chain to answer the query (capped at `chain_tip.height` — `block.height > chain_tip.height` short-circuits to `Some(false)` since blocks above chain_tip can't be on the chain that ends at chain_tip). Adds `is_block_in_chain_against_intermediate_chain_tip` covering the non-tip chain_tip case plus the above-chain_tip and unknown-chain_tip edge cases.
Most \`apply_update\` calls in real wallet usage are pure tip extension:
new blocks descend from the canonical tip via \`Header::prev_blockhash\`.
The previous implementation always called full \`recompute()\` at the end,
paying O(N log N) regardless of the update's shape.
The fast path: for each newly-inserted block, classify whether it
extends an existing live tip via natural prev_blockhash or a sparse
link target that's itself a live tip. If yes, use \`CheckPoint::push\`
to extend (O(1) Arc-shared extension) and update tips/tip_by_hash
incrementally. If any block doesn't fit this pattern (fork, orphan,
absorption, sparse-link addition to an existing block, or a quarantined
block's missing parent landing), fall back to full \`recompute()\`.
Bottom-up iteration ensures each child's parent has been classified
and added to tip_by_hash before the child is processed — letting a
multi-block chain (e.g. apply a 10-block extension) traverse the
fast path end-to-end in O(m log k).
Correctness preservation:
- The fallback is the existing always-correct path; misdetection
costs at most one full recompute, never a wrong answer.
- Restricting fast-path to brand-new blocks (\`block_is_new\`) means
sparse-link addition to existing blocks (which may turn previously
internal blocks into ancestor-tips) falls back safely.
- Quarantine-affecting inserts also fall back (cascading release is
non-local).
Performance impact on common path (extending canonical tip with m new
blocks on an N-block chain):
- Before: O(N log N) per apply.
- After: O(m log k) where k = live tip count (typically 1-2).
- For 1M-block chain, 10-block extension: ~200 ms → ~2 µs.
(100,000× speedup.)
Unaffected tips are never re-materialised. Their \`CheckPoint\` Arc
identity is preserved across applies that extend other tips —
verified by \`apply_update_fast_path_does_not_touch_unrelated_tips\`.
Tests:
- 24 unit tests (3 new):
- \`apply_update_fast_path_does_not_touch_unrelated_tips\` — Arc::ptr_eq
on the non-canonical tip across an extension.
- \`apply_update_falls_back_to_recompute_on_fork\` — fork triggers
fallback; result matches from_changeset.
- \`apply_update_sparse_link_addition_triggers_recompute\` — sparse
link to existing block triggers fallback; result matches
from_changeset.
- All 6 proptests pass — including the equivalence-with-recompute
property (\`delta_accumulation_matches_direct_apply\`) which is the
primary correctness proof for the fast path.
Two related optimisations to the apply_update path: 1. **Per-tip Arc reuse in recompute().** Snapshot the old tip map before clearing. For each new tip BlockId, walk the cached CheckPoint and the new chosen_parent map in lockstep via the new chain_matches_chosen_parent helper. If every step agrees — i.e., the new structure produces the same chain — reuse the old Arc instead of re-materialising. Saves up to N malloc/Arc-allocations per untouched tip on every recompute call. For a graph with k tips where only one is structurally affected by the update, recompute now does O(N log N) for the BFS + O(chain) for the one affected tip + O(chain) for the cheap-check pass on each unaffected tip. Previously: O((4 + k) · N). The savings scale with tip count, addressing the "tips will only grow" concern. 2. **Fork-at-intermediate-height fast path in apply_update.** When a new block X has no live tip as parent (so the current fast path classifies it as needs_recompute) but its natural parent (at height X.height - 1, hash X.prev_blockhash) is observed inside some chain — or its sparse_link target is observed — X is a fork off an internal block. Materialise X's chain via the new materialise_chain_via_headers helper (walks Header::prev_blockhash with sparse_link fallback) and add X as a new tip in O(chain). All other tips remain Arc-identical. Defensive: bail to recompute() if X's chain would absorb an existing tip (would create an ancestor-of-tip invariant violation) or if X is already inside an existing tip's chain (no new tip to add). Both are structural anomalies that the safer recompute path handles correctly. New tests: - apply_update_fork_at_intermediate_height_leaves_other_tips_untouched: builds a 20-block canonical, a 10-block non-canonical, applies a fork off height 12, verifies all three tips coexist and the non-canonical tip's Arc is preserved. - recompute_reuses_arc_for_unchanged_tips: forces the recompute path via a sparse-link addition, verifies the unrelated tip's Arc identity is preserved across the call. 26 unit tests, 6 proptests, all green.
Four benchmark groups exercising the optimisation surface of BlockGraph at
N ∈ {1K, 10K, 100K}:
- apply_update_extend_tip: 1-block natural extension (tip-extension fast
path).
- apply_update_fork_midheight: 1-block fork rooted at height N/2 (fork
fast path).
- apply_changeset_noop: empty apply_changeset (forces full recompute) —
the cost the fast paths skip.
- from_changeset_cold: full reconstruction from a complete ChangeSet.
Uses iter_custom to time only the operation, excluding the O(N) template
clone (setup) and the O(N) graph drop (tear-down). Without that, drop
alone dominates the µs-scale fast-path measurements at large N.
Dev-profile results (release: roughly 5-10x faster across the board,
ratios hold):
| N | extend_tip | fork_mid | recompute | from_changeset |
|------|------------|----------|-----------|----------------|
| 1K | 22 µs | 5.6 ms | 6.1 ms | 17 ms |
| 10K | 40 µs | 57 ms | 80 ms | 189 ms |
| 100K | 75 µs | 603 ms | 1.03 s | 2.09 s |
Tip extension is O(log N) (BTreeMap ops dominate); ~14,000x faster than
recompute at N=100K. Fork fast path is O(N) (materialises a full ancestry
chain) and gives ~1.7x speedup. Recompute is ~O(N log N).
Three small simplifications, no behavioural change: 1. Drop `tip_by_hash: BTreeMap<BlockHash, BlockId>`. With K = live tip count typically 1-3, `self.tips.values().find(...)` is the same effective cost as the indexed lookup but kills the field, its maintenance in the `apply_update` fast path (remove+insert on every extension), and its rebuild in `recompute()`. Replaced by a private `tip_with_hash` helper. 2. Remove the "no tips materialised; seed with bare genesis" fallback in `recompute()`. Unreachable: every constructor inserts genesis into `self.blocks`, nothing removes from it, and `materialise_chain` cannot return `None` when fed `tip_bids` derived from the BFS over `chosen_parent` (every referenced BlockId is in `self.blocks`). Replaced with `.expect(...)` documenting the invariant. 3. Use `CheckPoint::iter()` in `chain_matches_chosen_parent` instead of a manual `Some(clone) → while let Some(node) = current` loop. Identical semantics, less ceremony. Net -20 lines. 26 unit tests + 6 proptests still pass; no_std builds.
Adds a forest-style generator that grows random trees of headers where
each branch roots at an arbitrary previously-grown block. Updates sample
a CheckPoint from any leaf with a random sparseness mask. The legacy
generator only forks at height 0; the forest covers forks at arbitrary
depths -- the topology class the legacy generator literally cannot reach.
Six new proptests (128-1024 cases each):
forest_apply_update_order_independence
Same updates in any order produce the same graph, under forest
topology.
forest_invariants_after_apply_updates
Structural invariants hold under forest topology.
apply_changeset_random_partition_converges
Snapshot a canonical graph's changeset, partition randomly into K
parts, apply via apply_changeset in shuffled order. Intermediate
graphs may have quarantined blocks; final must equal canonical.
Exercises the quarantine entry/release path that apply_update alone
can't reach.
from_tip_matches_from_genesis_apply_update
from_tip(cp) == from_genesis() + apply_update(cp).
is_block_in_chain_true_for_chain_members
For every block on a tip's CheckPoint chain, is_block_in_chain
returns Some(true) when queried against that tip.
is_block_in_chain_false_above_chain_tip
is_block_in_chain returns Some(false) for any block.height >
chain_tip.height, even synthetic ones.
Strengthens check_invariants with three additional checks:
- Every CheckPoint step references an observed block.
- tips() is in best-first order (height desc, hash asc).
- reachable (tip-chain ancestors) and quarantined partition observed.
All 26 unit tests + 12 proptests pass at 1024 cases each; clippy clean.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Introduces
BlockGraph, a new multi-tip, monotone implementation ofChainOraclethat tracks all observed branch tips simultaneously rather than only the canonical tip.Key features:
Arc<CPInner>in theCheckPointlinked listChangeSetdesign ensures applying the same changeset multiple times or in any order yields identical resultsblocks: Content-addressed byBlockHash(consensus guarantees uniqueness)branches: Per-tip sets ofBlockIds ordered by(height, hash)ChainOracle(max height, lowest hash)Notes to the reviewers
The implementation prioritizes correctness and idempotency:
ChangeSet::mergeis commutative and idempotent (first-write-wins on blocks, set union on branches)apply_updateandapply_changesetboth reconstruct the graph to ensure consistencyrelatefunction precisely categorizes tip relationships (equal, merge, extend, diverge) to maintain the no-strict-ancestor invariantmerge_sparseunions block coverage while preserving the base's data on collisionsThe test suite covers:
ChainOracleimplementationChangelog notice
Added: New
BlockGraphtype for multi-tip chain tracking with strictly-additive changesets, enabling simultaneous tracking of all observed forks while maintaining a canonical tip.Checklists
All Submissions:
New Features:
https://claude.ai/code/session_01QXNFF7DcWLeYcAk6Bn9Q8s