eth/protocols/wit, consensus/bor: WIT2 — BP-signed witness announcements with transitive relay and pre-import serving#2208
Adds WIT2 (protocol version 3): block producers sign a chunked-parallel commitment over each witness, peers verify the signature and relay the announcement at network-RTT speed without execution, and any peer holding the body can serve it pre-import from an in-memory cache. Byte-correctness is verified by requesters against the BP-signed WitnessHash, attaching tampering blame to the server; content-correctness (state-root) failures attach to the BP. This removes the per-hop ~500 ms execution gate that today serialises witness propagation through stateless validators.

The witness commitment uses 1 MiB chunked-parallel keccak (keccak256 of the concatenation of per-chunk hashes), measured at ~13.5 ms wall-clock for 50 MiB witnesses on 8 cores vs ~88 ms single-shot. Wire format and signature shape are unchanged from a single-keccak commitment; only the function mapping bytes to the 32-byte commitment changes.

Producer-side signing reuses the engine SignerFn via consensus/bor.SignBytes with a dedicated mimetype (application/x-bor-wit2-announce) and a domain-separated digest tag, replay-resistant at both the digest and signer-call levels. Receivers verify ecrecover against the scheduled producer for the announced block; announces for blocks whose header is not yet locally available are deferred (no strike), so the block-cosend race does not punish honest relayers.

The pre-import serving cache (capacity 10) is fed from the paged-fetch path the moment the byte-correctness check passes, before chain write. Cache entries are gated on a BP-signed WitnessHash being on file — relayers never cache unverified bytes, and WIT1 fallback paths skip the cache entirely. handleGetWitness consults the cache before chain storage.

Wire: new protocol version WIT2 = 3, new message SignedNewWitnessHashesMsg = 0x06 with up to 64 announcements per packet. WitnessMetadataResponse is extended with WitnessHash. WIT1 peers continue using NewWitnessHashes; a mixed mesh is tolerated.

Rate-limits: 200 ms per-(blockHash, peer) relay rate-limit, 30 s announce TTL, per-peer token bucket (burst 256, refill 64/s), strike disconnect at 5 invalid signed announces per minute. A conflicting WitnessHash for the same BlockHash is rejected via signedWitnessCache.putIfNewer.

Operator note: validators running Clef as their signer must whitelist the mimetype application/x-bor-wit2-announce; without it the producer falls back to unsigned WIT1 announces.
Code Review — 3 issues found. Checked for bugs and CLAUDE.md compliance.

1. Performance: redundant witness encoding. File: …
On a 50 MiB witness this adds ~100–300 ms of redundant CPU work per verified fetch — meaningful given WIT2's goal of eliminating per-hop latency. Suggested fix: Have …

2. Performance: unconditional encode+hash before signed-announcement check. File: …
Every witness broadcast — including from WIT1 peers — pays the full encode+hash cost (~150–450 ms on 50 MiB witnesses) even when the result is never used. Suggested fix: Check …

3. Bug: peer dropped on local EncodeRLP failure. File: …
When … This is inconsistent with the pattern in … Suggested fix: Change to
m.handleWitnessFetchFailureExt(hash, "", fmt.Errorf("witness encode failed: %w", err), false)
Codecov Report

❌ Patch coverage check failed: the patch coverage (52.37%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files:

```
@@ Coverage Diff @@
## develop #2208 +/- ##
===========================================
- Coverage 52.29% 52.22% -0.07%
===========================================
  Files 884 886 +2
  Lines 155571 156147 +576
===========================================
+ Hits 81355 81548 +193
- Misses 68989 69356 +367
- Partials 5227 5243 +16
===========================================
```

... and 21 files with indirect coverage changes
- eth/handler_wit2.go: remove unused errInvalidSigner, contextBackground, wit2SpanLookupMissMeter, and the now-unused context import
- core/stateless/witness_commit_bench_test.go: drop redundant c := c loop-var copies (Go 1.22+ copyloopvar)
- goimports formatting on accounts/accounts.go, witness_commit_bench_test.go, witness_commit_helpers_test.go, eth/fetcher/witness_manager.go, eth/fetcher/witness_manager_wit2_test.go, eth/handler_wit2.go, eth/protocols/wit/protocol.go
… drop
- eth/fetcher/witness_manager.go: verifyAgainstSignedHash now returns the canonically-encoded body and signed hash on success, so the pre-import serving cache no longer re-encodes the same witness (~14 ms saved per verified fetch on 50 MiB witnesses). cacheVerifiedWitnessForServing takes the precomputed body directly.
- eth/fetcher/witness_manager.go: a local EncodeRLP failure inside verifyAgainstSignedHash no longer drops the peer — re-encoding bytes the peer already delivered as valid RLP is a local invariant violation, not peer misbehavior. Mirrors the pattern already used by the cache path.
- eth/handler_wit.go: hoist signedWitnesses.get(hash) above the EncodeRLP + WitnessCommitHash work in handleBroadcastWitness. WIT1 broadcasts (no signed announcement on file) used to pay the full encode+hash cost only to discard the result; now they short-circuit.
- eth/fetcher/witness_manager_wit2_test.go: rename and retarget the no-signed-hash regression test onto verifyAgainstSignedHash, where the invariant now lives.
Summary
Adds WIT2 (witness protocol version 3): block producers sign a commitment over each witness, peers verify the signature and relay the announce at network-RTT speed without executing the block, and any peer that has fetched the body can serve it pre-import from an in-memory cache. The slow part of witness propagation — re-execution before relay — is removed from the critical path. Mixed mesh with WIT1 nodes is tolerated; no flag-day rollout required.
Devnet result (4 scenarios, post-fork-only window, hop-chain topology with +300 ms per-hop import knob): see Local devnet validation below.
What we're solving
Today on Polygon mainnet, witness propagation through a stateless validator that is multiple hops away from a block producer accumulates a per-hop ~500 ms execution gate: each intermediate node must finish executing the block before it will relay the witness downstream. This serialises along the path and shows up at the receiver as milestone-voting latency — slow milestone votes on a fraction of blocks at multi-hop stateless validators. Adding more peers does not help: the cost is the dependency chain, hop depth × execution time, not peer fan-in.
The deliverable is to detach announce from execute so witness availability propagates at gossip speed, while keeping the same byte-correctness guarantee (hash check at the requester, with on-chain blame) and the same content-correctness guarantee (state-root, with BP blame).
How the code achieves it
1. BP-signed witness commitment
The producer needs to commit to which witness bytes are correct without paying ~88 ms of single-thread keccak on the announce path (otherwise we re-introduce the same gate we're trying to remove, just on a different node). See Signing-scheme evaluation below — short version: chunked-parallel keccak at 1 MiB chunks beats the next-best viable candidate by a clear margin and keeps the WIT1 wire format intact.
core/stateless/witness_commit.go::WitnessCommitHash(bytes) = keccak256(concat(per-1MiB-chunk-keccak)). Each 1 MiB chunk is hashed in parallel; the final aggregate is one extra keccak over <1 KiB of chunk hashes. ~13.5 ms wall-clock for 50 MiB witnesses on 8 cores vs ~88 ms single-shot keccak — a 6.5× speedup with no wire-format change. Producer and verifier agree on the chunk size as a protocol constant. Signing goes through consensus/bor.SignBytes, reusing the engine's SignerFn, with a dedicated mimetype application/x-bor-wit2-announce and a domain-separated digest tag — replay-resistant at both the digest and signer-call levels.
2. Verify-and-relay without execution

New protocol version WIT2 = 3 (eth/protocols/wit/protocol.go) and new message SignedNewWitnessHashesMsg = 0x06, carrying up to 64 announcements per packet. eth/handler_wit2.go::handleSignedWitnessAnnouncements does ecrecover against the scheduled producer for the announced block; on success the announce is cached and immediately relayed to peers that have not seen this hash. No state execution is touched.
3. Pre-import serving cache

pendingWitnessBodies (capacity 10) in the WIT2 handler is fed from the paged-fetch path the moment byte-correctness verification against the BP-signed WitnessHash passes — i.e. before chain write. handleGetWitness consults this cache before chain storage, so a peer that just received the body can serve it to a downstream stateless node before it has itself finished executing.
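The cache's gating rule can be sketched as a small bounded map. Names and the FIFO eviction policy are hypothetical — the real pendingWitnessBodies may evict differently — but the invariant shown (never cache without a signed hash on file) is the one the PR describes.

```go
package main

import "fmt"

// bodyCache is an illustrative capacity-bounded FIFO standing in for
// pendingWitnessBodies.
type bodyCache struct {
	cap    int
	order  [][32]byte          // insertion order, for FIFO eviction
	bodies map[[32]byte][]byte // blockHash -> verified witness bytes
}

func newBodyCache(capacity int) *bodyCache {
	return &bodyCache{cap: capacity, bodies: make(map[[32]byte][]byte)}
}

// put stores verified bytes only when a BP-signed witness hash is on file
// for the block — relayers never cache unverified bytes.
func (c *bodyCache) put(block [32]byte, body []byte, signedHashOnFile bool) bool {
	if !signedHashOnFile {
		return false // WIT1 fallback path: skip the cache entirely
	}
	if _, ok := c.bodies[block]; !ok {
		if len(c.order) == c.cap { // evict the oldest entry
			old := c.order[0]
			c.order = c.order[1:]
			delete(c.bodies, old)
		}
		c.order = append(c.order, block)
	}
	c.bodies[block] = body
	return true
}

// get is what a handleGetWitness-style server would consult before
// falling back to chain storage.
func (c *bodyCache) get(block [32]byte) ([]byte, bool) {
	b, ok := c.bodies[block]
	return b, ok
}

func main() {
	c := newBodyCache(10)
	c.put([32]byte{1}, []byte("witness-bytes"), true)
	_, hit := c.get([32]byte{1})
	fmt.Println(hit) // true
}
```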
4. Blame model preserved

Byte-correctness is checked by the requester against the BP-signed WitnessHash; failure attaches to the server that returned the bytes. Content-correctness (state-root) failures attach to the BP. A conflicting WitnessHash for the same BlockHash is rejected via signedWitnessCache.putIfNewer, so a peer cannot equivocate witnesses across announcements.

5. Rate-limits & DoS shape

- 200 ms per-(blockHash, peer) relay rate-limit
- 30 s announce TTL
- per-peer token bucket (burst 256, refill 64/s)
- strike disconnect at 5 invalid signed announces per minute
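The per-peer token bucket (burst 256, refill 64/s) can be sketched deterministically by passing time in explicitly — illustrative names, not bor's actual limiter:

```go
package main

import "fmt"

// tokenBucket is a minimal per-peer limiter sketch. Time is an explicit
// parameter (seconds) so behaviour is deterministic and testable.
type tokenBucket struct {
	tokens     float64
	burst      float64
	refillRate float64 // tokens per second
	last       float64 // time of last update, seconds
}

func newTokenBucket(burst, refillRate float64) *tokenBucket {
	return &tokenBucket{tokens: burst, burst: burst, refillRate: refillRate}
}

// allow spends one token at time now; returns false when the bucket is drained.
func (b *tokenBucket) allow(now float64) bool {
	b.tokens += (now - b.last) * b.refillRate
	if b.tokens > b.burst {
		b.tokens = b.burst
	}
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

func main() {
	tb := newTokenBucket(256, 64)
	ok := 0
	for i := 0; i < 300; i++ { // 300 announces arriving in the same instant
		if tb.allow(0) {
			ok++
		}
	}
	fmt.Println(ok) // 256: the burst is absorbed, the excess rejected
}
```

The burst absorbs a legitimate flood of announcements around block production while the refill rate caps a sustained spammer; the separate strike counter handles announces that are invalid rather than merely too frequent.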
6. Compatibility
WIT1 peers continue using NewWitnessHashes. A mixed WIT1/WIT2 mesh is tolerated: WIT2 nodes downgrade to the WIT1 wire when peering with WIT1 peers (the relay handler skips peers with Version() < wit.WIT2). The WitnessHash field on WitnessMetadataResponse is set by WIT2 servers and ignored by WIT1 readers — wire forward-compatible.

Signing-scheme evaluation
Picking the right commitment function for the announce signature is load-bearing for the whole PR: too slow on the producer and we just move the per-hop gate from "execute the block" to "hash the witness"; too weak and we lose the byte-blame property that lets a downstream node disconnect a peer that returned tampered bytes. Four candidates were evaluated end-to-end on synthetic 1–50 MiB witnesses (Apple M4 Pro, Go 1.26.2,
go test -benchtime=3s -count=3, median of three).

Candidates
- A: keccak256(canonical_RLP(witness)), single-thread
- B: keccak(chunk0_hash ‖ … ‖ chunkN_hash), chunks hashed concurrently
- C: a commitment supporting sub-witness membership proofs (see Why not C below)
- D: intrinsic — walk the witness against the BP-signed header.StateRoot to detect bad bytes

Result at 50 MiB — verifier wall-clock (best parallel config)
D (intrinsic, 4 cores): 44 ms verifier wall-clock, 2.0× speedup, 0 ms producer cost

Why D was rejected post-bench
D had the most attractive numbers (zero producer cost, 2× verifier speedup, no signature on the announce path) — but a peer can serve a truncated witness whose included nodes all hash consistently up to the BP-signed header.StateRoot. Branch nodes embed child references as 32-byte hashes inside their own bytes, so dropping a subtree leaves the parent branch nodes' hashes unchanged. The intrinsic walker has no way to distinguish "this hash-reference belongs to a path that was never touched and is intentionally absent" from "this hash-reference belongs to a path that was touched and was adversarially omitted" — only attempting execution would. That destroys pre-execute byte-blame, which is the whole reason WIT2 introduces a content commitment in the first place. A, B, and C all preserve byte-blame because they sign over content: truncation changes the commitment, the signature no longer verifies, and the peer is dropped pre-execute.
Why B at 1 MiB chunks won

A chunk-size sweep at 50 MiB / 8 cores:
512 KiB shaves a tenth of a ms over 1 MiB at the cost of doubling the chunk count and the per-chunk overhead — 1 MiB is the knee of the curve. Below 512 KiB, per-chunk setup starts dominating. The 4 GB/s ceiling is the M4 Pro's aggregate keccak throughput across 8 P-cores; further parallelism doesn't help with the current keccak primitive.
Verifier-side scaling — B beats A non-trivially only ≥ 30 MiB
For the small witnesses Polygon emits today (typically 1–10 MiB) B is comparable to A; for the large witnesses we already see at the upper tail (30–50 MiB) B is the difference between the producer/verifier paying a ~90 ms gate vs ~14 ms. The fix is most impactful exactly where the problem is worst.
Why not C
C is dominated by every other viable candidate on these numbers: slower verifier than A (122 ms vs 88 ms), 91 MiB / 614 k allocations per verify at 50 MiB, no wire saving. C only becomes interesting if a future design needs sub-witness proofs (proving a specific node belongs to the committed set without sending the full body) — that's not on the roadmap, so C is a no-vote here.
Sensitivity caveats
Full bench artifact (raw numbers, reproduction commands, allocation breakdown):
agent-zero/investigations/witness-propagation/witness-commit-bench.md

Local devnet validation
A 9-node hop-chain devnet on
kurtosis-pos: 4 BPs full-mesh, two relay full-nodes (F1/F2) carrying a +300 ms per-hop import-delay knob to amplify the gate without heavy tx load, and three stateless validators at hop distances 1 / 2 / 3 from the closest BP (S1 ↔ BP1, S2 ↔ F1, S3 ↔ F2). Topology was enforced post-launch via admin_removePeer after every node imported past Giugliano (block 128 + 72-block settle), so the measurement window is post-fork and post-prune only — pre-fork blocks (a different code path) are excluded.

Four scenarios, ~30 measured blocks each:
1. all nodes bor:develop (control)
2. all nodes bor:wit2
3. … bor:wit2, rest = bor:develop
4. … bor:wit2, rest = bor:develop

F2 import-lag (the relay just before S3) shows the mechanism: the median drops 805 → 305 ms in scenario 2 — one full per-hop inject overlapped with WIT2 announcement-driven pre-fetch, exactly what the design predicts. S3's residual p95 of 260 ms in scenario 2 is the single +300 ms inject on F2 still in the critical path: WIT2 lets the F1 hop overlap, but F2 still has to receive and execute the block before serving S3. Without the artificial knob (i.e., on mainnet), the natural per-hop gate is ~50–100 ms and this residual shrinks proportionally.
Full report (per-scenario logs, lag tables, errors/warnings, peer-count snapshots, prune timestamps, image map):
agent-zero/investigations/witness-propagation/devnet-validation-2026-04-30b.md

Backward compatibility — explicit checks
- pendingWitnessBodies is skipped when no signed WitnessHash is on file (eth/handler_wit2.go::resolveWitnessBytes)
- the WitnessHash field on WitnessMetadataResponse is ignored by WIT1 readers
- consensus/bor.SignBytes is covered by consensus/bor/signbytes_test.go

Test plan
- Unit tests: core/stateless/witness_commit_test.go, witness_commit_bench_test.go, consensus/bor/signbytes_test.go, eth/handler_wit2_test.go, eth/handler_wit_test.go, eth/peerset_test.go, eth/protocols/wit/protocol_wit2_test.go, eth/fetcher/witness_manager_wit2_test.go.
- Reviewer input wanted on the capacity (10) of pendingWitnessBodies. We don't expect more than a few in-flight unique witnesses at a time, but worth a second opinion under burst conditions.
- Operator-facing: Clef signers must whitelist the mimetype application/x-bor-wit2-announce.

Diffguard / quality-gate notes
- Unused symbols in eth/handler_wit2.go (errInvalidSigner, contextBackground, wit2SpanLookupMissMeter) — left over from earlier iterations; worth removing before merge.
- eth/handler_wit2.go is 504 lines (4 over the 500 threshold). Up to reviewers whether to split.
- WitnessCommitHash cognitive complexity is 18 vs the 10 threshold — driven by the parallel-keccak fan-out with bounded goroutines; not naturally simplifiable below ~12 without losing the parallelism. Open to suggestions.