ERA → static cold backend: end-to-end integration (42× faster than KV) #76

Open
dapplion wants to merge 47 commits into unstable from experiment-era-static-cold-load

@dapplion
Owner

End-to-end integration of the static cold backend (#75) + the ERA file consumer/producer in lcli (#69) plus a few hot-path optimizations on top. The branch is what produced the 42× speedup measurement reported in #75's comment thread.

What's in here

Three building blocks composed:

  1. Static cold backend: the code from #75, "Generalize cold DB: ColdStore trait + slot-keyed static archive" (static-files-generalization-spec), branched at the put_batch-fix point.
  2. ERA file importer in lcli: the code from #69 / sigp/lighthouse#9273, "ERA file consumer and producer via LCLI" (era-lcli-upstream).
  3. Optimizations layered on the union:
    • era::custom_blinder — direct-byte SSZ blinder for Capella + Deneb. Walks BeaconBlockBody SSZ container offsets without typed parsing; only typed-decodes Transactions+Withdrawals slices for tree-hash. Verified byte-identical against clone_as_blinded().as_ssz_bytes().
    • Custom Transactions tree-hasher (also on its own clean branch transactions-tree-hash-from-ssz-bytes). Skips the per-transaction `Vec` allocation by walking the SSZ List offset table directly.
    • ERA consumer's per-block path collapsed into a two-pass parallel pipeline (decompress → blind) with fork dispatch: pre-Bellatrix passthrough (no payload, so FullPayload SSZ ≡ BlindedPayload SSZ), custom blinder for Capella and Deneb, typed fallback for Bellatrix and Electra+.
    • Reconstruction loop sequentialised. The static archive enforces strictly-ascending slot writes per column, so the parallel reconstruction would commit batches out of order.
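
The fork dispatch described in the list above can be sketched as a single `match`. This is a minimal illustration, not the code in consumer.rs: the `Fork` and `BlindStrategy` enums and the `blind_strategy` function are hypothetical names introduced here, and only the mapping from fork to strategy is taken from the description.

```rust
/// Consensus forks relevant to the dispatch. The enum is a stand-in for
/// illustration, not the type used by the importer.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Fork {
    Phase0,
    Altair,
    Bellatrix,
    Capella,
    Deneb,
    Electra,
    Fulu,
}

/// How a full block's SSZ bytes are turned into blinded SSZ bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BlindStrategy {
    /// No execution payload exists, so full and blinded encodings coincide.
    Passthrough,
    /// Direct-byte blinder that walks SSZ offsets.
    CustomBlinder,
    /// Typed parse, then clone_as_blinded() + as_ssz_bytes().
    TypedFallback,
}

/// Picks the blinding strategy for a block of the given fork.
pub fn blind_strategy(fork: Fork) -> BlindStrategy {
    match fork {
        Fork::Phase0 | Fork::Altair => BlindStrategy::Passthrough,
        Fork::Capella | Fork::Deneb => BlindStrategy::CustomBlinder,
        // Bellatrix and Electra onwards are not covered by the custom blinder.
        Fork::Bellatrix | Fork::Electra | Fork::Fulu => BlindStrategy::TypedFallback,
    }
}
```

Keeping the match exhaustive (no `_` arm) means a newly added fork fails to compile until someone decides which path it takes.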

End-to-end result

Mainnet 1260 era files (/mnt/ssd/era-mainnet-nimbus/, eras 0..1260, ~10.3M slots), same hardware as the KV reference run (era-import-timing.csv):

| backend | total | mean s/era |
|---|---|---|
| KV (tuned) | 51.27 h | 146.5 |
| Static (this branch, phase 1) | 1.22 h | 3.49 |

42.0× faster on phase 1 (per-era ERA import). Per-phase tracing + per-era CSV in the same shape as the KV reference are captured.

Per-block microbench numbers driving the speedup (beacon_node/beacon_chain/examples/blinder_bench.rs, examples/hash_bench.rs):

| block | typed parse + clone_as_blinded + as_ssz_bytes | custom blinder + custom tx hasher | speedup |
|---|---|---|---|
| Capella (128 KB) | 4.83 ms | 1.58 ms | 3.05× |
| Deneb (86 KB) | 4.45 ms | 0.93 ms | 4.78× |

Known limitation

Phase 2 reconstruction does not run end-to-end on this branch. Phase 1 writes era-boundary states to the StateSnapshot/StateDiff columns at slots 8192, 16384, …, 1260·8192. Phase 2 then tries to backfill intermediate states beneath those slots, which the static archive's per-column monotonic-forward invariant rejects. Sequentialising the reconstruction loop (which this branch does) doesn't help: the conflict is between phase 1's boundary writes and phase 2's intermediate writes within the same column, not between parallel workers. The real fix is architectural: either (a) allow random-slot writes within an existing file_id by tracking per-slot `is this offset populated?` rather than just `highest_written_slot`, or (b) interleave reconstruction with phase 1 so all writes are slot-ascending. Discussed in the PR #75 comment thread.
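
To make option (a) concrete, here is a rough sketch contrasting the current high-water-mark check with a per-slot populated check. Everything in it is an assumption for illustration: `FileTracker`, its fields, and the file size are hypothetical and do not reflect the on-disk format of the static archive.

```rust
/// Hypothetical per-file write tracker. `highest_written_slot` models the
/// current rule; `populated` models the per-slot tracking of option (a).
pub struct FileTracker {
    base_slot: u64,
    highest_written_slot: Option<u64>,
    populated: Vec<bool>,
}

impl FileTracker {
    pub fn new(base_slot: u64, slots_per_file: usize) -> Self {
        Self {
            base_slot,
            highest_written_slot: None,
            populated: vec![false; slots_per_file],
        }
    }

    /// Position of `slot` inside this file, if it belongs here.
    fn index(&self, slot: u64) -> Option<usize> {
        let i = usize::try_from(slot.checked_sub(self.base_slot)?).ok()?;
        (i < self.populated.len()).then_some(i)
    }

    /// Current rule: only slots above the high-water mark are accepted.
    pub fn accepts_monotonic(&self, slot: u64) -> bool {
        self.highest_written_slot.map_or(true, |h| slot > h)
    }

    /// Option (a): any in-range slot that is not yet populated is accepted.
    pub fn accepts_per_slot(&self, slot: u64) -> bool {
        self.index(slot).map_or(false, |i| !self.populated[i])
    }

    /// Records a write; returns false if the slot is outside this file.
    pub fn mark_written(&mut self, slot: u64) -> bool {
        let Some(i) = self.index(slot) else { return false };
        self.populated[i] = true;
        self.highest_written_slot =
            Some(self.highest_written_slot.map_or(slot, |h| h.max(slot)));
        true
    }
}
```

With a boundary record at slot 8192 already written, a backfill at slot 4096 fails `accepts_monotonic` but passes `accepts_per_slot`, which is the phase 1 versus phase 2 conflict in miniature.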

What this is for

Integration testbed that demonstrates #75 + #69 work together end-to-end, with the perf numbers backed by code in one diff. NOT meant to land as-is into upstream — the optimizations should land in the appropriate PRs:

dapplion added 30 commits May 8, 2026 19:36
Add a slot-keyed durable archive (`StaticBlockStore`) for finalized blinded
blocks, integrated into `migrate_database` as a second pass that runs
alongside the existing cold-state migration. File format and manifest
persistence remain `todo!()` — this is the wiring scaffold.

- New `DBColumn::BeaconBlockSlot` reverse index (root → slot).
- `HotColdDB::get_block_with` and `block_exists` fall through to the
  archive after a hot-KV miss.
- Archival driven inside `migrate_database`: cold ops (BeaconBlockRoots +
  BeaconBlockSlot) commit atomically, hot deletes after split commit.
- Skip-slot dedup seeded from `BeaconBlockRoots[current_split.slot - 1]`,
  with `Hash256::ZERO` for the genesis case.
- Spec at `specs/static-blocks.md`.
Companion document describing the static-file backend for `BlobSidecar`
archival via `.erb` files. Initialization via genesis sync or imported
era files; checkpoint sync and P2P blob backfill rejected at startup.
Replaces the byte-keyed Cold: ItemStore<E> bound on HotColdDB with a slot-typed
ColdStore<E> trait: get/put_batch/exists/iter_from for slot-keyed columns plus
get_index/put_index_batch over a tight DBColumnColdIndex enum (BlockSlot,
ColdStateSummary). KV backends (BeaconNodeBackend, MemoryStore) implement it
by translating slot/root keys into the existing KeyValueStore byte API.

StaticBlockStore generalised to StaticColdStore: one type, columns dispatched
on each call. Per-column subdirectory; per-column settings (record_type,
compression, max_decompressed) come from a build-time column_config table on
first creation and are persisted in each column's conf so future builds with
different defaults stay compatible. Conf magic bumped to LHSTBLK2.

Removes prune_historic_states + the lighthouse db prune-states CLI: the mode
they produce ("cold blocks present, cold states absent") isn't in the
startup-path table in specs/static-cold-backend.md and the spec doesn't
support runtime mode transitions. full_state_pruning_enabled goes with it.

Other: store_cold_state* helpers take separate slot-keyed and root-index
buffers; migration writes slot-keyed cold data first, root indices after, so
a crash leaves no dangling indices.
- Move beacon_node/store/src/static_blocks.rs to static_cold.rs (the type
  is no longer block-specific).
- Add DBColumnCold (slot-keyed cold columns) alongside DBColumnColdIndex.
  StaticColdStore is keyed by DBColumnCold all the way through; no DBColumn
  conversion happens inside static_cold.rs. column_config returns a plain
  ColumnConfig (was Option) and UnsupportedColumn errors go away — the
  tighter enum makes them unrepresentable.
- Eager-open every cold column at boot, freeze the columns map. No outer
  Mutex/RwLock; the per-column writer state mutex is the only sync point.
- Rename ColumnConfig::max_decompressed -> max_value_bytes (it bounds the
  raw payload size on uncompressed reads too, defending against corrupt
  headers).
- BeaconStateDiff: compression: false. HDiff is already compressed
  internally (zstd'd validator/balance chunks) so snappy on top is wasteful.
The slot-keyed methods on ColdStore (get/put_batch/contains/iter_from) now
take the tight DBColumnCold enum instead of DBColumn, mirroring the existing
DBColumnColdIndex shape on the index methods. This drops DBColumn from
static_cold.rs entirely.

KV backend impls (BeaconNodeBackend, MemoryStore) translate via
column.db_column(). FrozenForwardsIterator::new still accepts DBColumn at
the public boundary and converts at the call to cold_db.iter_from.

Also: delete static_blobs.rs (was a stub returning Unsupported on every
call, with no callers). Revert noise renames (io_batch, cold_db_block_ops,
cold_db_state_ops, ops, .map_err(|e| e.into())) to keep the diff against
unstable focused on real semantic changes.
`BeaconBlockSlot` (and the `DBColumnColdIndex::BlockSlot` variant that wrapped
it) was added for a static-archive read-fallback path that was removed earlier
in this branch. Nothing writes or reads it now, so drop the variant from the
DBColumn enum, the matching DBColumnColdIndex variant, the
`MissingFrozenBlockSlot` error, and the corresponding key_size match arm.

Rewrite TODO-static-block-storage.md to reflect the current branch state:
the static-cold generalization is in, the prune-states removal is in, and the
remaining work is cold-backend selection (flag), review of block read/write
paths now that BeaconBlockSlot is gone, an invariants review, and tests.
The two explicit impls (BeaconNodeBackend, MemoryStore) were identical
boilerplate translating slot/root keys into the underlying byte-keyed
KeyValueStore. Replace with a single blanket impl in lib.rs.

Forecloses a future ColdStore impl that isn't a KeyValueStore (e.g. wiring
StaticColdStore directly as the Cold parameter); reversible if/when that
becomes wanted.
The blanket `ColdStore` impl writes `slot.as_ssz_bytes()` for
`BeaconColdStateSummary`, where older releases wrote SSZ-encoded
`ColdStateSummary { slot }`. The two encodings are byte-identical (an SSZ
container of one fixed-size field equals the field), but the equality is
load-bearing for read compatibility with existing databases. Add a
regression test that pins it.
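
The encoding equality can be illustrated without the real types. The sketch below hand-rolls the two encodings under SSZ's rules (a uint64 is 8 little-endian bytes; a container of fixed-size fields is the concatenation of its fields). `ColdStateSummary` here is a local stand-in with a plain `u64` slot, not the store's type, and the helper names are made up for the example.

```rust
/// SSZ encodes a uint64 as 8 little-endian bytes.
pub fn ssz_encode_u64(value: u64) -> Vec<u8> {
    value.to_le_bytes().to_vec()
}

/// Local stand-in for the older on-disk type: one fixed-size field.
pub struct ColdStateSummary {
    pub slot: u64,
}

impl ColdStateSummary {
    /// A container whose fields are all fixed-size serialises as the
    /// concatenation of its fields, with no offsets and no length prefix.
    pub fn as_ssz_bytes(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(8);
        out.extend_from_slice(&ssz_encode_u64(self.slot));
        out
    }
}
```

A regression test in this spirit only needs to compare the two byte vectors for a handful of slots.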
The slot-walk rewrite of `check_cold_state_diff_consistency` was forced by
not having an index iterator on the trait. Add `iter_index(col)` (yields
`(Hash256, Slot)`) and restore the invariant to iterating
`BeaconColdStateSummary` directly, matching unstable's structure modulo
the slot-typed API.
Replace the two-buffer (slot-keyed data + state-root index) helper signatures
with a single `&mut ColdBatch` and add `commit_cold_batch` that flushes data,
syncs, then commits the index — encoding the data-before-index ordering at
the API.
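
A minimal sketch of the data-before-index ordering, assuming a simplified store interface. The `ColdSink` trait, the `Recorder` type, and the field layout of `ColdBatch` are hypothetical; only the order of the three steps comes from the description above.

```rust
/// Hypothetical batch: slot-keyed data plus root -> slot index entries.
#[derive(Default)]
pub struct ColdBatch {
    pub data: Vec<(u64, Vec<u8>)>,
    pub index: Vec<([u8; 32], u64)>,
}

/// Minimal store interface assumed for this sketch.
pub trait ColdSink {
    fn write_data(&mut self, items: &[(u64, Vec<u8>)]);
    fn sync(&mut self);
    fn write_index(&mut self, entries: &[([u8; 32], u64)]);
}

/// Flushes data, syncs, then commits the index, so a crash between the
/// steps can leave unindexed data but never an index entry without data.
pub fn commit_cold_batch<S: ColdSink>(store: &mut S, batch: ColdBatch) {
    store.write_data(&batch.data);
    store.sync();
    store.write_index(&batch.index);
}

/// Sink that records the order of calls, useful for checking the contract.
#[derive(Default)]
pub struct Recorder {
    pub calls: Vec<&'static str>,
}

impl ColdSink for Recorder {
    fn write_data(&mut self, _items: &[(u64, Vec<u8>)]) {
        self.calls.push("data");
    }
    fn sync(&mut self) {
        self.calls.push("sync");
    }
    fn write_index(&mut self, _entries: &[([u8; 32], u64)]) {
        self.calls.push("index");
    }
}
```

Because callers only ever hand over a whole batch, they cannot commit the index first by accident.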

`put_state` and `reconstruct.rs` collapse to "build batch, commit batch."
The migration loop keeps a top-level summary index that accumulates across
states and is flushed at end-of-migration; per-iteration data still goes
through `commit_cold_data` (renamed from `commit_cold_items`).
Drops the `KeyValueStore -> ColdStore` blanket and replaces it with an
explicit per-backend impl. `BeaconNodeBackend` no longer impls `ColdStore`
directly — its byte-translation is inlined inside the `ColdBackend::Kv` arm
where it's actually used. `MemoryStore` keeps an explicit impl (still used
as the Cold parameter in tests via `EphemeralHarnessType`).

`ColdBackend<E>` is a new enum with `Kv(BeaconNodeBackend)` /
`Static(StaticColdStore)` variants, picked at startup from
`StoreConfig::cold_backend` (default `Kv`). Production type signatures swap
the second `BeaconNodeBackend<E>` slot to `ColdBackend<E>` (3 production
sites, 6 test sites, 3 database_manager sites).

`StaticColdBackend<E>` wrapper from the previous commit collapsed into a
direct `impl<E> ColdStore<E> for StaticColdStore`. Index methods stub
`Unsupported` for now — wiring the embedded KV is the next piece.
Genesis sync against the static cold backend was failing for two reasons:

1. `BeaconColdStateSummary` and friends are root-keyed indices; the static
   files are slot-keyed. The previous `Unsupported` stubs blocked the very
   first migration. Embed a `BeaconNodeBackend<E>` at `<root>/index/` and
   serve `get_index` / `put_index_batch` / `iter_index` from it. Forwards
   iteration over slot-keyed columns (`iter_from`) is now also implemented
   by walking the column's `.off` sidecar.

2. `BeaconChainBuilder::genesis` pre-writes the genesis block_root to cold
   `BlockRoots` at slot 0, then the first migration writes the same
   (slot, root) again. KV cold accepts the overwrite; the static backend's
   strict-ascending check rejected it. `Column::put` now treats a re-put of
   an identical value at the current highest slot as a no-op, and errors
   only on a value mismatch (a real bug).

Threads `StoreConfig` into `StaticColdStore::open` so the embedded KV picks
up the same backend (`leveldb` / `redb`) and tuning as the hot/blobs DBs.

Adds `genesis_sync_static_cold` covering ~1000 finalized blocks with the
static backend and a load of every cold state through the new index.
Drops the bespoke 1000-block static-cold test and instead has get_store
read the cold backend from COLD_BACKEND=static|kv. CI / local can now run
the existing store_tests suite against either backend without duplicating
test bodies.

Also trims ColdBackendKind to the derives actually exercised today.
Display, EnumString, VariantNames, Copy were forward-looking for the
not-yet-wired --cold-backend CLI flag - re-add when that lands.
The static cold backend is append-only in ascending slot order, so
checkpoint/weak-subjectivity sync (which backfills slots below the anchor)
is fundamentally incompatible. Refuse the combination explicitly in
BeaconChainBuilder::weak_subjectivity_state instead of failing later
with an opaque 'static cold put out of order' error.

The 6 weak_subjectivity_sync_* tests early-return under
COLD_BACKEND=static so the test suite passes against either backend.

Adds the --cold-backend CLI flag (kv|static, default kv) so operators
can opt into the static backend at startup. Re-adds EnumString and
VariantNames on ColdBackendKind for clap parsing.
Test vectors are now hosted at dapplion/era-test-vectors and downloaded
via Makefile (same pattern as slashing_protection interchange tests).
- Add docs to EraFileDir, import_all, and module-level usage example
- Rename let _span to let _ for debug spans
- Remove unused _start_slot variable
- Extract parse_era_filename with unit tests
- Add rejects_wrong_trusted_slot test
EraFileDir::new now takes genesis_validators_root and EraImportTrust:
- TrustedStateRoot(era_number, root): uses that ERA as reference,
  verifies its state root, imports only ERAs 0..=era_number
- Untrusted: uses highest ERA in directory as reference

Trust checks (genesis_validators_root, state root) moved from
import_all/import_era_file into EraFileDir::new. Removed all
expects/unwraps from production code.
Add `init_genesis_store` + `advance_store_to_era` so that after ERA import
the store metadata (split, anchor, fork choice) is fully set up for the
regular `resume_from_db` → `build()` startup path.

Key changes in consumer.rs:
- `init_genesis_store`: standalone genesis init (block, state, anchor, fork choice)
- `advance_store_to_era`: advances split/anchor/fork choice to ERA boundary
- `write_state_root_index_for_era`: writes both BeaconStateRoots (slot→root)
  and BeaconColdStateSummary (root→slot) for every slot
- Uses `from_persisted` instead of `get_forkchoice_store` to avoid deriving
  a wrong anchor block root from the ERA boundary state's latest_block_header

Test `chain_boots_from_imported_db` verifies:
- canonical_head matches expected head root
- Every slot's state is accessible via state_root_at_slot → get_state
- Every slot's block is accessible via block_root_at_slot → get_blinded_block
- Blocks form a valid parent chain (parent_root linkage)

Also fixes producer to use get_blinded_block + make_full_block for cold blocks
where get_full_block fails when prune_payloads is enabled.
Separate store initialization from the ERA consumer since it's not part
of the production beacon node path.
dapplion added 13 commits May 8, 2026 23:27
`hot_storage_strategy` reads `hot_hdiff_start_slot()` which returns
`anchor_slot`. With the anchor still at slot 0 from `init_genesis_store`,
storing the head state at the ERA boundary produces `DiffFrom(intermediate)`
and fails because the hot DB has no preceding diffs (we just imported into
the cold DB).

Update the anchor in memory first so the strategy sees `anchor_slot ==
head_slot` and stores the head as a `Snapshot`. The kv-op is still added to
the same atomic batch as `set_split` and the persisted fork choice for
crash-consistent persistence.
Idempotent put at any committed slot makes `migrate_database` retries
safe after a mid-loop crash. The previous put accepted re-puts only at
exactly `highest_written_slot`; on retry, the re-put at slot 0 (below
`highest_written_slot`) failed the out-of-order check. Now any committed
slot accepts an identical-value re-put; mismatched values and
skipped-slot fills still error.
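
The put contract described above could look roughly like the following. This is an in-memory sketch under stated assumptions: `Column` here keeps records in a `BTreeMap` instead of files, and the `PutError` variants are hypothetical names.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq, Eq)]
pub enum PutError {
    /// A committed slot was re-put with different bytes.
    ValueMismatch { slot: u64 },
    /// A never-written slot below the high-water mark was put.
    OutOfOrder { slot: u64, highest: u64 },
}

/// In-memory stand-in for one slot-keyed column.
#[derive(Default)]
pub struct Column {
    records: BTreeMap<u64, Vec<u8>>,
}

impl Column {
    pub fn highest_written_slot(&self) -> Option<u64> {
        self.records.keys().next_back().copied()
    }

    pub fn put(&mut self, slot: u64, value: &[u8]) -> Result<(), PutError> {
        // Re-put of a committed slot: a no-op if identical, an error otherwise.
        if let Some(existing) = self.records.get(&slot) {
            return if existing.as_slice() == value {
                Ok(())
            } else {
                Err(PutError::ValueMismatch { slot })
            };
        }
        // New slot: must not fill a gap below the high-water mark.
        if let Some(highest) = self.highest_written_slot() {
            if slot < highest {
                return Err(PutError::OutOfOrder { slot, highest });
            }
        }
        self.records.insert(slot, value.to_vec());
        Ok(())
    }
}
```

A retry that replays slots 0 through the crash point then succeeds as long as every replayed value matches what was committed.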

New `COLD_BACKEND_KEY` in `BeaconMeta` pins the backend kind on first
open and refuses mismatched re-opens (Static and Kv on-disk layouts are
incompatible). `reconstruct_historic_states` refuses to run under
static cold — the slots it would write are below every column's
high-water mark.
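
The pin-on-first-open rule can be sketched as a pure function over the persisted and requested kinds. The function name `pin_backend` and the error text are made up for this example; how the value is read from and written to `BeaconMeta` is left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ColdBackendKind {
    Kv,
    Static,
}

/// Returns the kind to persist under the metadata key, or an error when the
/// database was created with a different backend than the one requested.
pub fn pin_backend(
    persisted: Option<ColdBackendKind>,
    requested: ColdBackendKind,
) -> Result<ColdBackendKind, String> {
    match persisted {
        // First open: pin whatever was requested.
        None => Ok(requested),
        Some(kind) if kind == requested => Ok(kind),
        Some(kind) => Err(format!(
            "cold backend mismatch: database uses {kind:?}, requested {requested:?}"
        )),
    }
}
```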

`max_value_bytes` ratchets upward on open if the build default exceeds
disk, so a newer build can write larger records than an older one
persisted, and re-persists immediately for stable re-opens.

Per-column files renamed `static_blocks_*` -> `data_*`,
`static_blocks.conf` -> `column.conf` — the literal prefix was
misleading after the per-column generalisation.

`kv_cold_store` helper module dropped; `MemoryStore`'s `ColdStore` impl
inlined to match `ColdBackend::Kv`. Two impls, no shared helper.
`decompress_record` returns `Result<Vec<u8>>` (was `Result<Option<Vec<u8>>>`
with `Some` on every success path).

`TODO(static)` markers added for `iter_from` perf, the migrate-vs-index
transient invariant 11 window, invariants 10/11/12 re-review under
static cold, and the missing test set.

Spec cleanup: delete `specs/static-blocks.md` (stale, ~60% contradicted
the code) and `TODO-static-block-storage.md`. Rewrite the
`static_cold.rs` module header as the canonical byte-level format
reference (layout, data file, `column.conf`, put contract, recovery).
Combines sigp#9273 (ERA consumer/producer via lcli) on top of #75 (ColdStore
trait + StaticColdStore), and rewires the importer so finalized blinded
SignedBeaconBlocks land in DBColumnCold::Block of the static archive
instead of the hot DB. Hardcodes ColdBackendKind::Static in the lcli
runners. PLAN.md captures the experiment goal: benchmark ERA load speed
and write amplification of the static backend.

Block writes use clone_as_blinded() + SSZ + put_batch(Block, slot-keyed
ascending). Genesis init drops the legacy hash-keyed put_block() (and
the Hash256::zero alias) for the same slot-keyed cold write.
Adds a sibling job to `beacon-chain-tests` that runs
`beacon_chain::store_tests::*` with `COLD_BACKEND=static` (and `FORK_NAME=fulu`)
to exercise the static slot-keyed cold-DB backend on every CI run. Mirrors the
existing job's runner, toolchain, cache, and feature flags
(`fork_from_env,slasher/lmdb,portable`). Added to `test-suite-success` so the
merge queue blocks on it.
Adds the missing pieces so the static cold archive can serve block-by-root
reads without keeping a duplicate in hot indefinitely.

Schema (re-adds what f671da1 dropped):
- `DBColumn::BeaconBlockSlot` (tag `bbs`, 32-byte key, 8-byte SSZ Slot)
- `DBColumnColdIndex::BlockSlot` variant

Migrate (`migrate_database`):
- alongside the existing block-bulk push to `cold.Block`, push the matching
  `(block_root, slot)` to `cold_block_slot_index` and the `block_root` to
  `hot_block_delete_roots`
- end-of-loop: `put_index_batch(BlockSlot, ...)` after `ColdStateSummary`,
  before split commit
- post split commit: `hot_db.do_atomically(deletes)` reclaims hot space for
  the just-migrated blocks. Hot delete only runs after cold bytes + cold
  index are durable, so a crash here leaves cold canonical and reads fall
  through. KV mode keeps `move_blocks_to_static_cold` false → all the new
  buffers stay empty → status quo.

Read fallback (`get_block_with`, `block_exists`):
- hot first; on miss, `cold.get_index(BlockSlot, root)` then
  `cold.get(Block, slot)`. Missing bulk for an indexed slot raises
  `MissingFrozenBlock` (corruption). KV mode's empty BlockSlot index makes
  the fallback always return None on hot miss — identical to before.
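
The read path above can be sketched with three maps standing in for the hot DB, the BlockSlot index, and the slot-keyed cold column. The `Store` struct and its field names are hypothetical; the real `get_block_with` works on typed blocks, not raw bytes.

```rust
use std::collections::HashMap;

pub type Root = [u8; 32];

#[derive(Debug, PartialEq, Eq)]
pub enum StoreError {
    /// The index names a slot whose block bytes are absent (corruption).
    MissingFrozenBlock { slot: u64 },
}

/// In-memory stand-in for the hot DB plus the static cold archive.
#[derive(Default)]
pub struct Store {
    pub hot_blocks: HashMap<Root, Vec<u8>>,
    pub cold_block_slot_index: HashMap<Root, u64>,
    pub cold_blocks: HashMap<u64, Vec<u8>>,
}

impl Store {
    pub fn get_block(&self, root: &Root) -> Result<Option<Vec<u8>>, StoreError> {
        // 1. Hot first.
        if let Some(bytes) = self.hot_blocks.get(root) {
            return Ok(Some(bytes.clone()));
        }
        // 2. On a hot miss, resolve root -> slot through the cold index.
        //    An empty index (KV mode) makes this return None, as before.
        let Some(&slot) = self.cold_block_slot_index.get(root) else {
            return Ok(None);
        };
        // 3. Fetch the slot-keyed bytes; a missing record is an error.
        match self.cold_blocks.get(&slot) {
            Some(bytes) => Ok(Some(bytes.clone())),
            None => Err(StoreError::MissingFrozenBlock { slot }),
        }
    }
}
```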

Invariant 10 (`check_cold_block_root_indices`):
- now uses `self.block_exists(&block_root)` (the public read with cold
  fallback) instead of the bare `hot_db.key_exists(...)`. Required because
  hot-delete makes the bare hot check fire spuriously for every migrated
  slot under Static cold.

Init-path coverage:
- Genesis + KV: cold writes gated off, BlockSlot empty, fallback always
  None on hot miss. Status quo.
- Genesis + Static: migrate writes block + index to cold, deletes from
  hot. Reads ≥ split.slot hit hot; < split.slot hit cold via fallback.
- Era + Static: hot has only post-anchor blocks. cold has 0..S from era
  (future era-import path) + post-S from migrate. Fallback is the read
  path for slot < S.
- Ckpt + KV: BlockSlot empty as in Genesis + KV. Backfill fills hot.
- Ckpt + Static (no era): rejected by the existing WSS guard.
`make cli-local` after `e259a5157b` introduced `--cold-backend` without
touching `book/src/help_bn.md`, so `cli-check` failed on every push.
Re-added in `bbc3badfd2` (`BeaconBlockSlot`); the hardcoded snapshot in
`check_db_columns` wasn't updated, so the test asserted on a stale list.
Layered on top of the put_batch fix (which is on its own branch
static-cold-batched-fsync). Three changes:

1. era::custom_blinder: a direct-byte SSZ blinder that walks
   SignedBeaconBlock container offsets and only typed-decodes the
   transactions/withdrawals slices for tree-hash. Avoids the typed
   BeaconBlock parse (which dominates per-block CPU because of
   attestation/sync-aggregate/etc. allocations) and the subsequent
   clone_as_blinded + as_ssz_bytes round-trip. Implements Capella and
   Deneb. Verified byte-identical against
   clone_as_blinded().as_ssz_bytes() in
   beacon_node/beacon_chain/examples/blinder_bench.rs:
       Capella: 8.4 ms -> 2.1 ms per block (4.03x)
       Deneb:   7.7 ms -> 1.15 ms per block (6.70x)

2. consumer.rs: dispatch by fork.
       Phase 0/Altair: byte passthrough (FullPayload == BlindedPayload SSZ)
       Bellatrix:      typed-parse fallback (not implemented in custom)
       Capella/Deneb:  custom_blinder
       Electra+:       typed-parse fallback
   Three previous parallel passes (decompress / ssz_parse / blind+encode)
   collapse into one parallel pipeline.

3. consumer.rs: reconstruction loop sequentialised. The static cold
   archive enforces strictly-ascending slot writes per column.

End-to-end measurement on mainnet 1260 eras:
       KV (era-import-timing.csv from tuned KV run):  51.27 h
       Static (this branch, phase 1 only):              1.22 h  (42x)

Phase 2 reconstruction still fails because the static archive's
monotonic-forward constraint conflicts with backfilling intermediate
states into columns where phase 1 already wrote era-boundary records at
much higher slots. Out of scope for this experiment.
Restores the two-pass shape: `era_import_decompress_blocks` (snappy
decompress, parallel) -> Vec<Vec<u8>> -> `era_import_blind_blocks`
(fork-dispatched blinder, parallel). Intermediate Vec costs ~8k
mallocs/era which is sub-millisecond at typical allocator latency
(<0.1% of per-era cost), but the per-phase visibility is worth more
than that — we can now see snappy vs blinder regressions independently.
…plit_outer

- consumer.rs: drop unused decode_block helper.
- custom_blinder.rs: drop unused ssz::Encode; replace 6-tuple OuterFields
  return with a named struct (clippy::type_complexity).
- consumer.rs: keep ssz::Encode for the Bellatrix/Electra+ fallback's
  clone_as_blinded().as_ssz_bytes() calls.
…han typed)

Transactions::from_ssz_bytes allocates one Vec<u8> per transaction
(hundreds per block) just to throw it away after tree_hash_root.
Replaced with a direct-byte hasher that walks the SSZ
List<Transaction, MAX_TX> offset table, hashes each Transaction's
bytes in place via tree_hash::merkle_root, and list-merkleizes the
per-tx roots. Byte-identical to the typed path (verified in
beacon_chain/examples/hash_bench.rs).

Microbench on real mainnet blocks:
                          typed      custom    speedup
  Capella tx hash         1498 us     607 us    2.47x
  Deneb tx hash            991 us     519 us    1.91x
  Capella withdrawals       8.2 us     7.9 us    1.04x   <- not worth it
  Deneb withdrawals         9.4 us     8.9 us    1.06x   <- not worth it

Withdrawals stays on the typed path: <=16 fixed-size 44-byte records,
the typed allocation is already in the noise. Maintenance cost of a
hand-rolled hasher isn't justified for ~0.3-0.5 us of saving per block.

End-to-end blinder bench (full block in -> blinded SSZ out):
                          typed (A3)   custom blinder    speedup
  Capella block (128 KB)   4828 us       1581 us         3.05x
  Deneb block (86 KB)      4445 us        930 us         4.78x
Replace the per-slot fsync loop in `put_batch` with one fsync per file:
items are grouped by file_id, all records appended through a BufWriter,
then a single sync_all for the data file, all offsets written, single
sync_all for the offset file, and a single atomic config commit per
batch.
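
A simplified sketch of the batched shape, using only the standard library. The file names (`data_<id>.bin`, `data_<id>.off`), the 8-byte end-offset encoding, and the function signature are assumptions for illustration, and the per-batch config commit is left out, so this version issues two `sync_all` calls per touched file instead of the roughly three mentioned below.

```rust
use std::collections::BTreeMap;
use std::fs::OpenOptions;
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Appends `items` (slot, record), assumed sorted by ascending slot, and
/// returns how many `sync_all` calls were issued.
pub fn put_batch(
    dir: &Path,
    slots_per_file: u64,
    items: &[(u64, Vec<u8>)],
) -> io::Result<usize> {
    // Group records by the file that owns their slot.
    let mut by_file: BTreeMap<u64, Vec<&(u64, Vec<u8>)>> = BTreeMap::new();
    for item in items {
        by_file.entry(item.0 / slots_per_file).or_default().push(item);
    }

    let mut fsyncs = 0;
    for (file_id, group) in by_file {
        let data_path = dir.join(format!("data_{file_id}.bin"));
        let data_file = OpenOptions::new().create(true).append(true).open(data_path)?;
        let mut end = data_file.metadata()?.len();
        let mut data = BufWriter::new(data_file);
        let mut offsets = Vec::with_capacity(group.len() * 8);
        for (_slot, record) in group {
            data.write_all(record)?;
            end += record.len() as u64;
            offsets.extend_from_slice(&end.to_le_bytes());
        }
        // One fsync covers every record appended to this data file.
        data.flush()?;
        data.get_ref().sync_all()?;
        fsyncs += 1;

        // Offsets are written only after the data is durable, then synced once.
        let off_path = dir.join(format!("data_{file_id}.off"));
        let mut off_file = OpenOptions::new().create(true).append(true).open(off_path)?;
        off_file.write_all(&offsets)?;
        off_file.sync_all()?;
        fsyncs += 1;
    }
    Ok(fsyncs)
}
```

The sync count depends on how many files the batch touches, not on how many records it contains.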

Same caller-visible "batch durable on return" contract. For an
8192-item batch (one ERA's worth of slot-keyed writes) this drops
fsync count from ~32k (4 per slot) to ~3, with measured speedups
between 155x and 775x per column on /mnt/ssd NVMe.

Spec updated to reflect the batched semantics.
dapplion requested a review from michaelsproul as a code owner May 11, 2026 01:28
dapplion added 4 commits May 11, 2026 03:32
…n typed)

When the caller has the SSZ-encoded bytes of a Transactions list and
only needs its tree_hash_root, going through

    Transactions::from_ssz_bytes(bytes)?.tree_hash_root()

allocates one Vec<u8> per transaction (hundreds per mainnet block) just
to throw it away after hashing. The new helper walks the SSZ List
container's offset table directly, hashes each Transaction's bytes in
place via tree_hash::merkle_root, and list-merkleizes the per-tx roots,
with no intermediate typed allocation.
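
The offset-table walk can be sketched on its own, without the hashing. The function below is an illustration, not the helper added in this commit: `split_variable_list` is a hypothetical name, and it only splits an SSZ-encoded list of variable-length items into borrowed slices. The real helper would then feed each slice to `tree_hash::merkle_root` and merkleize the per-item roots; that step is omitted because it needs a SHA-256 dependency.

```rust
/// Splits the SSZ encoding of a list of variable-length items (such as a
/// Transactions list) into per-item byte slices without copying.
///
/// Layout: N little-endian u32 offsets followed by the item bytes. The first
/// offset equals 4 * N, and item i spans offset[i]..offset[i + 1], with the
/// last item running to the end of the buffer.
pub fn split_variable_list(bytes: &[u8]) -> Result<Vec<&[u8]>, String> {
    if bytes.is_empty() {
        return Ok(Vec::new());
    }
    let read_offset = |i: usize| -> Option<usize> {
        let raw = bytes.get(i * 4..i * 4 + 4)?;
        Some(u32::from_le_bytes([raw[0], raw[1], raw[2], raw[3]]) as usize)
    };
    let first = read_offset(0).ok_or("truncated offset table")?;
    if first == 0 || first % 4 != 0 || first > bytes.len() {
        return Err("invalid first offset".to_string());
    }
    let count = first / 4;
    let mut items = Vec::with_capacity(count);
    for i in 0..count {
        let start = read_offset(i).ok_or("truncated offset table")?;
        let end = if i + 1 < count {
            read_offset(i + 1).ok_or("truncated offset table")?
        } else {
            bytes.len()
        };
        // Offsets must be non-decreasing and point past the offset table.
        if start < first || start > end || end > bytes.len() {
            return Err(format!("offset {i} out of bounds"));
        }
        items.push(&bytes[start..end]);
    }
    Ok(items)
}
```

The only allocation is the `Vec` of slice references, compared with one `Vec<u8>` per transaction on the typed path.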

Output is byte-identical to the typed path (six tests covering empty,
single-tx, many-tx, mixed-size, single-large, and chunk-boundary
edges).

Microbench on real mainnet blocks (beacon_chain/examples/hash_bench.rs
in a separate experiment branch):

                          typed       custom    speedup
  Capella (92.5 KB txs)   1498 us     607 us    2.47x
  Deneb (46.4 KB txs)      991 us     519 us    1.91x

Motivating use case: ERA-file importer producing blinded
SignedBeaconBlock SSZ for the static cold archive without going through
clone_as_blinded() + as_ssz_bytes(). General-purpose helper, no caller
in this PR.
…m_ssz_bytes

Drops the private duplicate now that the helper is a public types-crate
API (added in the cherry-picked f82265e). Same byte-identical output.
Three new criterion benches replacing the ad-hoc Instant-loop examples:
- store/benches/static_cold.rs           (put_batch per ERA, 3 columns)
- beacon_chain/benches/blinder.rs        (typed vs custom blinder, Capella + Deneb)
- beacon_chain/benches/hash.rs           (typed vs custom Transactions/Withdrawals tree-hash)

extract_block stays as an example (one-shot fixture extractor, not a bench).