Skip to content

Move BlockProcessingResult match out of block lookups#9327

Merged
mergify[bot] merged 6 commits into
sigp:unstablefrom
dapplion:gloas-lookup-sync-error-type
Jun 2, 2026
Merged

Move BlockProcessingResult match out of block lookups#9327
mergify[bot] merged 6 commits into
sigp:unstablefrom
dapplion:gloas-lookup-sync-error-type

Conversation

@dapplion
Copy link
Copy Markdown
Collaborator

@dapplion dapplion commented May 21, 2026

Issue Addressed

As a result we would have to duplicate x3 the big match on BlockProcessingResult we currently have in block lookups mod.rs

This PR moves the match of BlockProcessingResult to sync_methods to reduce the diff of #9155. There are some subtle changes that deserve dedicated attention, and may be drowned in the bigger diff of #9155 otherwise:

Unstable This PR / #9115
Some error conditions immediately Drop the lookup (no retries). For example for "internal" errors like the BeaconChainError Retries ALL errors 4 times. I believe assuming some errors are internal is risky as dropping a lookup drops all its children potentially forcing the node to resync a lot of blocks because of an internal timeout

dapplion added 2 commits May 20, 2026 06:08
Reshape BlockProcessingResult from the AC-verdict-passthrough
Ok/Err/Ignored enum to Imported(info) | Error { penalty, reason }. The
producer (network_beacon_processor) translates beacon-chain
Result<AvailabilityProcessingStatus, BlockError> into this shape via a
new classify_processing_result(), so the consumer only has to resolve
the symbolic WhichPeerToPenalize against an in-scope PeerGroup.

- New WhichPeerToPenalize { BlockPeer, CustodyPeerForColumn(u64) } with
  an `apply(action, &peer_group, reason, cx)` helper. Penalty policy
  lives once in classify_processing_result instead of being duplicated
  across consumer arms.
- Producer emits stable identifiers via two const modules
  (processing_result_info, processing_result_reason) so producer and
  consumer never trade ad-hoc string literals.
- Ignored becomes Error { penalty: None, reason: "processor_overloaded" }
  with a producer-side warn!; the consumer drops the lookup as before.
- DuplicateFullyImported and GenesisBlock map to Imported (which makes
  the consumer's "successfully imported" branch fall out naturally and
  removes the per-BlockError policy block — net -88 lines in mod.rs).
- on_processing_result_inner now captures the downloaded block's
  parent_root up-front, before borrowing request_state mutably, so the
  parent_unknown branch keeps working for any R.

Test rig: three tests construct the new variants directly. Extracted
from sigp#9155 (gloas-lookup-sync) onto bare sigp/unstable.
@dapplion dapplion requested a review from jxs as a code owner May 21, 2026 00:37
@dapplion dapplion force-pushed the gloas-lookup-sync-error-type branch from 50076f0 to 76e344b Compare May 21, 2026 04:04
BlockError::BeaconChainError(_) | BlockError::InternalError(_) => None,
BlockError::DuplicateImportStatusUnknown(_) => None,
BlockError::AvailabilityCheck(inner) => match inner {
AvailabilityCheckError::InvalidColumn((Some(idx), _)) => Some((
Copy link
Copy Markdown
Member

@jimmygchen jimmygchen May 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note here - if column_index is None, it falls to the Malicious category below and penalize all column peers - on unstable, no peer is penalised:

// If no index supplied this is an un-attributable fault. In practice
// this should never happen.
None => vec![],
}

However, it doesn't look like the None column index case is reachable?

return Err((

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The None column index case reaches block_peer_penalty, which penalizes all column peers it seems, the enum variant and mapping here is slightly confusing - we're using BlockPeer variant to penalise all column peers it seems? I could be wrong

WhichPeerToPenalize::BlockPeer => peer_group.all().copied().collect(),

// per-component failure counter, so `SINGLE_BLOCK_LOOKUP_MAX_ATTEMPTS` bounds the
// retry loop and eventually drops the lookup if the failure persists. Whether the
// peer should be downscored is the producer's call (encoded in `penalty`).
debug!(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was previously error, but given we have now collapsed some of the ignore case as well, so it may not worth error on all failures? do we lose any visibility on errors here?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the apply function call below is supposed to handle those?

Copy link
Copy Markdown
Member

@jimmygchen jimmygchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a pair review with @dapplion and the simplification looks good to me!

The impact of the simplication is described in the PR, to summarise:

  • Lookup now retries instead of dropping immediately on an errors, and collapse the Drop and Retry cases into just Retry
  • Penalty messages have been changed and number of unique messages are bound to the BlockError enum variants
  • The error categorization and penality looks correct to me:

match result {
Ok(AvailabilityProcessingStatus::Imported(_)) => Self::Imported(true, "imported"),
Ok(AvailabilityProcessingStatus::MissingComponents(_, _)) => {
Self::Imported(false, "missing_components")
}
Err(e) => {
let penalty = match &e {
BlockError::DuplicateFullyImported(_) => {
return Self::Imported(true, "duplicate");
}
BlockError::GenesisBlock => return Self::Imported(true, "genesis"),
BlockError::ParentUnknown { parent_root, .. } => {
return Self::ParentUnknown {
parent_root: *parent_root,
};
}
BlockError::BeaconChainError(_) | BlockError::InternalError(_) => None,
BlockError::DuplicateImportStatusUnknown(_) => None,
BlockError::AvailabilityCheck(inner) => match inner {
AvailabilityCheckError::InvalidColumn((Some(idx), _)) => Some((
PeerAction::MidToleranceError,
WhichPeerToPenalize::CustodyPeerForColumn(*idx),
(&e).into(),
)),
inner => match inner.category() {
AvailabilityCheckErrorCategory::Internal => None,
AvailabilityCheckErrorCategory::Malicious => block_peer_penalty(inner),
},
},
BlockError::ExecutionPayloadError(epe) => {
if epe.penalize_peer() {
block_peer_penalty(epe)
} else {
None
}
}
// Remaining invalid blocks: penalize the block peer. Listed explicitly so a
// new `BlockError` variant forces a compile error here.
BlockError::FutureSlot { .. }
| BlockError::StateRootMismatch { .. }
| BlockError::WouldRevertFinalizedSlot { .. }
| BlockError::NotFinalizedDescendant { .. }
| BlockError::BlockSlotLimitReached
| BlockError::IncorrectBlockProposer { .. }
| BlockError::UnknownValidator(_)
| BlockError::InvalidSignature(_)
| BlockError::BlockIsNotLaterThanParent { .. }
| BlockError::NonLinearParentRoots
| BlockError::NonLinearSlots
| BlockError::PerBlockProcessingError(_)
| BlockError::WeakSubjectivityConflict
| BlockError::InconsistentFork(_)
| BlockError::ParentExecutionPayloadInvalid { .. }
| BlockError::KnownInvalidExecutionPayload(_)
| BlockError::Slashable
| BlockError::EnvelopeBlockRootUnknown(_)
| BlockError::OptimisticSyncNotSupported { .. }
| BlockError::BlobNotRequired(_)
| BlockError::InvalidBlobCount { .. }
| BlockError::BidParentRootMismatch { .. } => block_peer_penalty(&e),
};
Self::Error {
penalty,
reason: format!("{e:?}"),
}
}
}

Just a small comment above @dapplion

@jimmygchen jimmygchen added the waiting-on-author The reviewer has suggested changes and awaits thier implementation. label May 21, 2026
Copy link
Copy Markdown
Member

@pawanjay176 pawanjay176 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Mostly nits

self.send_sync_message(SyncMessage::BlockComponentProcessed {
process_type,
result: crate::sync::manager::BlockProcessingResult::Ignored,
result: BlockProcessingResult::Error {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we just retrying when the cpu is overloaded? does it simplify things somehow? seems wasteful to retry something we already have and just failed to process

// per-component failure counter, so `SINGLE_BLOCK_LOOKUP_MAX_ATTEMPTS` bounds the
// retry loop and eventually drops the lookup if the failure persists. Whether the
// peer should be downscored is the producer's call (encoded in `penalty`).
debug!(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the apply function call below is supposed to handle those?

debug!(
?block_root,
component = ?R::response_type(),
"Lookup component processing ignored, cpu might be overloaded"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noting that this changes internal availability-processing errors from dropping the whole lookup to retrying.
But given that we are following this up with #9155 , we should be good since every sub state machine there handles its own drop semantics?

Conflict resolutions:
- network_beacon_processor/mod.rs: keep BlockProcessingResult re-export;
  drop unused FixedBlobSidecarList import (removed upstream).
- sync/block_lookups/mod.rs: keep PR's classified BlockProcessingResult
  handling (Imported/ParentUnknown/Error); discard old BlockError match.
- sync/tests/lookups.rs: drop the two new blob crypto-failure tests; blob
  lookup sync was deprecated/removed upstream (sigp#9383), deleting the
  corrupt_last_blob_* helpers they relied on. Column equivalents remain.
- sync_methods.rs: drop BlockError::BlobNotRequired arm (variant removed
  upstream) from the exhaustive penalty match.
@pawanjay176 pawanjay176 added ready-for-merge This PR is ready to merge. and removed waiting-on-author The reviewer has suggested changes and awaits thier implementation. labels Jun 2, 2026
@mergify mergify Bot added the queued label Jun 2, 2026
@mergify
Copy link
Copy Markdown

mergify Bot commented Jun 2, 2026

Merge Queue Status

This pull request spent 30 minutes 46 seconds in the queue, including 27 minutes 57 seconds running CI.

Required conditions to merge

mergify Bot added a commit that referenced this pull request Jun 2, 2026
@mergify mergify Bot merged commit bbe7ead into sigp:unstable Jun 2, 2026
38 checks passed
@mergify mergify Bot removed the queued label Jun 2, 2026
mergify Bot pushed a commit that referenced this pull request Jun 3, 2026
… classification

Combine PR #9327's processing-result classification redesign (BlockProcessingResult
becomes Imported/ParentUnknown/Error with producer-side penalty classification in
network_beacon_processor) with PR #9155's RequestState trait removal (concrete
BlockRequest/DataRequest state machine in single_block_lookup.rs, concrete download/
processing handlers in block_lookups/mod.rs). The common.rs RequestState trait is
deleted. ParentUnknown remains reactive (surfaced from the processing result), as in
unstable. All gloas constructs are stripped.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-for-merge This PR is ready to merge. syncing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants