Gloas range sync#9362
Conversation
|
Tested on a gloas kurtosis network and mainnet and it works fine. Some cases where we went into optimistic mode because the parent envelope was not downloaded by range sync. Can be handled with #9039 |
|
Some required checks have failed. Could you please take a look @pawanjay176? 🙏 |
eserilev
left a comment
There was a problem hiding this comment.
Looks good on the whole. Had some thoughts about a few things, lmk what you think
| } | ||
|
|
||
| /// Constructs a Gloas `RangeSyncBlock` with block and optional envelope. | ||
| pub fn new_gloas( |
There was a problem hiding this comment.
I think we should raise an error here if gloas_enabled is false
4c4f59f to
27de595
Compare
|
For the block_verification tests, I constructed the We should probably have some tests with envelopes, but I think we can do it in a later PR. Made an issue for that #9400 |
| .collect(); | ||
|
|
||
| // Group data columns by block_root | ||
| let mut data_columns_by_block = HashMap::<Hash256, Vec<Arc<DataColumnSidecar<E>>>>::new(); |
There was a problem hiding this comment.
| let mut data_columns_by_block = HashMap::<Hash256, Vec<Arc<DataColumnSidecar<E>>>>::new(); | |
| let mut data_columns_by_block_root = HashMap::<Hash256, Vec<Arc<DataColumnSidecar<E>>>>::new(); |
| .entry(block_root) | ||
| .or_default() | ||
| .push(column); | ||
| } |
There was a problem hiding this comment.
nit: we could consolidate some of this logic as well since i think something similar exists for fulu responses
| let blocks_req_id = blocks_id(components_id()); | ||
| let mut info = | ||
| RangeBlockComponentsRequest::<E>::new(blocks_req_id, None, None, Span::none()); | ||
| RangeBlockComponentsRequest::<E>::new(blocks_req_id, None, None, None, Span::none()); |
There was a problem hiding this comment.
could we add some envelope related tests here?
| std::mem::swap(&mut blocks, &mut filtered_chain_segment); | ||
|
|
||
| // Extract envelopes before passing blocks to signature verification. | ||
| let envelopes: Vec<_> = blocks |
There was a problem hiding this comment.
In a future PR to optimize sync speeds we should:
- Spawn future 1 for signature verification
- Spawn future 2 to send all payloads of this batch to EL
- Spawn future 3 to advance state into this epoch (compute epoch transition)
- Await futures 1,2,3 and then import all blocks and payloads to beacon_chain
There was a problem hiding this comment.
Yeah this is meant to be a v1
| Ok(range_sync_blocks) | ||
| } | ||
|
|
||
| fn responses_gloas( |
There was a problem hiding this comment.
we dont pass the column_to_peer_id mapping so we lose out on the ability to penalize individual peers like we do in responses_with_custody_columns. I think a lot of that functionality can be shared between responses_with_custody_columns and responses_gloas
There was a problem hiding this comment.
we also dont check which columns should be there. and we dont do any retry/exceeded_retries accounting
There was a problem hiding this comment.
good call. i moved the gloas logic also to responses_with_custody_columns
| imported_blocks.push((block_root, block_slot)); | ||
| continue; |
There was a problem hiding this comment.
Just delete the continue to prevent re-implementing this twice
| /// ## Peer scoring | ||
| /// | ||
| /// The envelope is invalid and the peer is faulty. | ||
| EnvelopeError(Box<EnvelopeError>), |
There was a problem hiding this comment.
Now this makes the types circular. EnvelopeError contains BlockError and BlockError contains EnvelopeError
There was a problem hiding this comment.
Sorry, I started fixing this up but it got too messed up. The envelope processing functions returning block errors is so annoying. Gonna try to fix this up in a separate PR
Range sync does not import payload envelopes yet (coming in sigp#9362), so a multi-block segment of consecutive post-Gloas FULL blocks cannot be imported by `process_chain_segment`. Skip the affected `block_verification` tests under `FORK_NAME=gloas` until then, and drop the `import_chain_segment_with_envelopes` placeholder helper and the per-test workarounds it required.
Reduce the weak-subjectivity forward-sync loop to the minimum needed to import post-Gloas blocks: persist the payload envelope and mark it received in fork choice. The data-column reconstruction was unnecessary (the WSS test chain has no blobs). Placeholder until range sync imports payload envelopes (sigp#9362).
The signature was simplified to take no arguments; update the three remaining test call sites that still passed (&da_checker, spec).
Range sync now imports payload envelopes, so the chain_segment and invalid_signature tests no longer need to early-return under Gloas. Drop the now-unused fork_name_from_env import.
| pub fn make_available( | ||
| &self, | ||
| spec: &Arc<ChainSpec>, | ||
| _spec: &Arc<ChainSpec>, |
There was a problem hiding this comment.
I think this is from lions changes. we can just delete this argument
| .data_availability_checker | ||
| .batch_verify_kzg_for_available_blocks(&available_blocks)?; | ||
|
|
||
| // Verify KZG for Gloas envelope columns out-of-band (commitments live in block bid). |
There was a problem hiding this comment.
| // Verify KZG for Gloas envelope columns out-of-band (commitments live in block bid). |
| Ok(Self::Base(available_block)) | ||
| } | ||
|
|
||
| /// Constructs a Gloas `RangeSyncBlock` with block and optional envelope. |
There was a problem hiding this comment.
| /// Constructs a Gloas `RangeSyncBlock` with block and optional envelope. | |
| /// Constructs a Gloas `RangeSyncBlock` with block and optional `AvailableEnvelope` which wraps the payload envelope with its data columns. |
| /// process the block that builds on top of this block. | ||
| pub fn new_gloas( | ||
| block: Arc<SignedBeaconBlock<E>>, | ||
| envelope: Option<Box<AvailableEnvelope<E>>>, |
There was a problem hiding this comment.
I thought about this some more
I think AvailableEnvelope construction should be updated to reflect how we construct an AvailableBlock
AvailableBlock construction checks if data columns are expected e.g. If they are and columns arent provided (via AvailableBlockData) the constructor fails, If kzg commitment on the columns doesn't match the blocks, the constructor fails, etc.
We could have an AvailableEnvelopeData enum for example that matches what AvailableBlockData does.
Could add a TODO if its too much for this PR, but I think we should do it at some point at least
Issue Addressed
N/A
Proposed Changes
Implement range sync in gloas.
Basically requests blocks and payloads post gloas from the same peer, couples them and sends it for processing.
Does not change sync much at all other than adding the machinery for payloads by range requests.
Main changes are:
RangeSyncBlockwhich used to be a struct is an enum to account for the Gloas case. This allows a clear separation between gloas and pre-gloas code.AvailableBlockDatanow has aBlockInEnvelopevariant. This is to clearly indicate the post gloas case. I feel this is simpler to follow compared toNoDatavariant.Tries to extract post gloas logic into its own functions so that there is minimal logic change in mainnet range sync behaviour.
This is meant as a stable base on which we can iterate further to make range sync cleaner and for unblocking range sync support on devnet. Some ideas for later is removing the retry mechanism in favour of delegating column fetching to lookup sync which can be done post #9155 and batch signature verifying envelopes.