
Use block headers in cache handleHead, full blocks are best-effort now#397

Merged
AntiD2ta merged 10 commits into master from
fix/cache-block-header
Apr 23, 2026

Conversation

Contributor

@AntiD2ta AntiD2ta commented Apr 9, 2026

Problem

Vouch's cache handleHead fetches full signed beacon blocks via /eth/v2/beacon/blocks/<root> on every head event. Lighthouse prunes block bodies by default, returning 404 for this endpoint. The signedbeaconblock/first strategy silently ignores 404s but never sends a response, causing a timeout and persistent error-level logs on every slot. The execution data extracted from the full block (gas limit, chain head) is only used for builder bid verification and is not critical for core validator duties.

Solution

Use data.Slot from the HeadEvent directly for the root-to-slot cache (consistent with handleBlock), avoiding an HTTP round-trip entirely. Attempt the full block fetch as best-effort for execution payload data, logging failures at debug level instead of error. The startup path in service.go:New uses the BeaconBlockHeader endpoint since no event data is available at that point.

Resolves #393.

Depends on #394, which depends on #396 for passing the Trivy scan.

@AntiD2ta AntiD2ta changed the title fix: use block headers in cache handleHead, full blocks best-effort Use block headers in cache handleHead, full blocks are best-effort now Apr 9, 2026
@AntiD2ta AntiD2ta self-assigned this Apr 9, 2026
@AntiD2ta AntiD2ta changed the base branch from fix/sync-committee-duplicates to master April 9, 2026 15:24
@AntiD2ta AntiD2ta force-pushed the fix/cache-block-header branch from c160486 to 228ac27 Compare April 9, 2026 16:46
@AntiD2ta AntiD2ta marked this pull request as ready for review April 10, 2026 08:54

Sync committee validators can occupy multiple positions in the same
subcommittee, producing duplicate (account, root) pairs that Dirk
rejects as a batch. Deduplicate before signing and scatter the
signatures back to all original positions.

Fixes attestantio/attestant#630
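The dedup-then-scatter step described in this commit can be sketched roughly as below. The `pair` type, the `sign` callback, and the function name are illustrative only; the real code signs through Dirk's SignGenericMulti with accounts and signing roots.

```go
package main

// pair is an illustrative (account, root) signing request.
type pair struct {
	account string
	root    [32]byte
}

// dedupAndScatter collapses duplicate (account, root) pairs, signs each
// unique pair exactly once via sign, and scatters the resulting
// signatures back to every original position.
func dedupAndScatter(pairs []pair, sign func([]pair) []string) []string {
	index := make(map[pair]int)       // unique pair -> index in uniques
	positions := make(map[pair][]int) // unique pair -> original positions
	uniques := make([]pair, 0, len(pairs))
	for i, p := range pairs {
		if _, ok := index[p]; !ok {
			index[p] = len(uniques)
			uniques = append(uniques, p)
		}
		positions[p] = append(positions[p], i)
	}

	sigs := sign(uniques) // one batch, no duplicates for Dirk to reject

	out := make([]string, len(pairs))
	for p, idx := range index {
		for _, pos := range positions[p] {
			out[pos] = sigs[idx]
		}
	}
	return out
}
```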
When a validator is assigned to multiple sync committee subnets, Vouch
sends different signing roots (one per subnet) for the same account in
a single SignGenericMulti call. Dirk's RunRules rejects any batch
containing the same pubkey more than once, causing all signing to fail.

After the existing (account, root) deduplication, split remaining
entries into N batches where each batch contains at most one entry per
account. The fast path (all accounts unique) is preserved with zero
overhead. Each batch is sent as a separate SignGenericMulti call and
signatures are mapped back to original positions.
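The batch-splitting rule above can be sketched as follows. The `entry` type and `splitBatches` name are hypothetical; the point is the assignment rule, which also preserves the single-batch fast path when all accounts are unique.

```go
package main

// entry is an illustrative (account, root) signing request; after
// (account, root) dedup an account can still appear with several
// distinct roots, which Dirk rejects within a single batch.
type entry struct {
	account string
	root    [32]byte
}

// splitBatches assigns the k-th occurrence of an account to batch k, so
// each batch contains at most one entry per account. When all accounts
// are unique this yields exactly one batch, preserving the fast path.
func splitBatches(entries []entry) [][]entry {
	seen := make(map[string]int) // account -> occurrences so far
	var batches [][]entry
	for _, e := range entries {
		k := seen[e.account]
		seen[e.account] = k + 1
		if k == len(batches) {
			batches = append(batches, nil)
		}
		batches[k] = append(batches[k], e)
	}
	return batches
}
```

Each batch would then be sent as its own SignGenericMulti call, with signatures mapped back by remembering each entry's original index.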
Extract multi-signer batch-splitting logic into standalone signMulti
function. Rename uniqueAccounts/uniqueRoots/uniqueData/uniqueSigs to
uniquePairAccounts/uniquePairRoots/uniquePairData/uniquePairSigs to
clarify that uniqueness is at the (account, root) pair level. Wrap
SignGenericMulti errors consistently with the rest of the package.

Add test cases for error propagation when a batch fails mid-signing.
Cover the scenario where all (account, root) pairs are identical,
verifying dedup reduces them to a single unique entry and the
numBatches == 1 fast path in signMulti is exercised.

Beacon nodes like Lighthouse prune block bodies but retain headers.
The cache's handleHead was fetching full blocks via /eth/v2/beacon/blocks/<root>,
which returns 404 for pruned blocks, causing persistent error-level logs
and stale execution chain head data.

Change handleHead and startup to fetch BeaconBlockHeader first (always
available), caching block root to slot mapping. Full block fetch for
execution payload data becomes best-effort — failure is logged at debug
level since execution data is only used for builder bid gas limit
verification and is not critical for core validator duties.

Resolves #393.

Use data.Slot from the HeadEvent for the root-to-slot cache mapping
instead of fetching headers via HTTP. This is consistent with handleBlock
which also uses event data directly, and eliminates a redundant HTTP
request per slot.

Simplify tests to verify slot is always cached from the event regardless
of whether the full block fetch succeeds or fails.

All callers already handle the returned error at their own appropriate
log level. The strategy's internal Warn on timeout is redundant and
produces noise when the strategy is used best-effort (e.g. cache
handleHead on Lighthouse/Erigon where block bodies are pruned).

This aligns with line 81 in the same function where individual provider
failures already log at Debug.

Move clientMonitor.ClientOperation before error type checking, matching
the pattern in all other first strategies. Previously, 404 and 503
responses bypassed monitoring entirely, making pruned-block beacon nodes
invisible in Prometheus metrics.
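The reordering described in this commit can be sketched as below. The `monitor` interface, `printMonitor`, and `fetchWithMonitoring` are invented for illustration; the real code records via clientMonitor.ClientOperation before deciding whether a 404/503 response should short-circuit the strategy.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// monitor is an illustrative stand-in for Vouch's client monitor.
type monitor interface {
	ClientOperation(provider, operation string, succeeded bool, duration time.Duration)
}

// printMonitor records operations so the ordering can be observed.
type printMonitor struct{ records []string }

func (m *printMonitor) ClientOperation(provider, op string, ok bool, d time.Duration) {
	m.records = append(m.records, fmt.Sprintf("%s/%s ok=%v", provider, op, ok))
}

var errNotFound = errors.New("404: block body pruned")

// fetchWithMonitoring records the client operation first, then
// classifies the error, so pruned-block responses still show up in
// metrics instead of bypassing monitoring entirely.
func fetchWithMonitoring(m monitor, provider string, fetch func() error) error {
	started := time.Now()
	err := fetch()
	// Record the operation before any error-type checking ...
	m.ClientOperation(provider, "beacon block", err == nil, time.Since(started))
	// ... then classify: a 404 from a pruned-block node is a soft miss.
	if errors.Is(err, errNotFound) {
		return nil
	}
	return err
}
```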
@AntiD2ta AntiD2ta force-pushed the fix/cache-block-header branch from 228ac27 to 47f1103 Compare April 13, 2026 10:10
AntiD2ta added a commit that referenced this pull request Apr 17, 2026
Comment thread strategies/signedbeaconblock/first/signedbeaconblock.go
Comment thread services/cache/standard/events.go
@AntiD2ta AntiD2ta merged commit 39ae971 into master Apr 23, 2026
5 checks passed
@AntiD2ta AntiD2ta deleted the fix/cache-block-header branch April 23, 2026 09:53


Development

Successfully merging this pull request may close these issues.

Cache service continuously fails to fetch signed beacon blocks by root when block body has been pruned by Lighthouse
