
dd: proposer pipelining "build-ahead"#99

Open
Maddiaa0 wants to merge 10 commits into main from md/pipeline

@Maddiaa0
Member

No description provided.

@Maddiaa0
Member Author

Maddiaa0 commented Feb 21, 2026

TODO:

  • data structure changes
  • double-check agent-produced diagrams

@Maddiaa0 Maddiaa0 marked this pull request as draft February 21, 2026 12:05
Contributor

@iAmMichaelConnor iAmMichaelConnor left a comment

Great doc!


### Option 2: Speculative — checkpoint_proposal-Gated

The next proposer starts building (locally, **not broadcasting**) as soon as the `checkpoint_proposal` is received. Blocks are only **broadcast after attestations arrive**.
Contributor

@iAmMichaelConnor iAmMichaelConnor Feb 23, 2026

Would the next proposer validate the previous proposer's checkpoint (verify the proofs, simulate, check for duplicate nullifiers, check fee juice balances)? If so, the time saved by this Option 2 is lessened vs Option 1, since the next proposer will be performing a lot of the time-consuming computation that will happen as part of the attestation process.
At the very least, the next proposer will need to personally simulate all txs of the previous checkpoint (even if not taking time to verify the proofs), so that they can begin simulating tree insertions for their own checkpoint.

Edit:

> B builds silently for ~2s before attestations arrive

Oh, maybe you're already assuming this, actually. "2s" suggests your only saving is the time for the committee to send attestations back to the previous proposer.

Member Author

I'll combine the two sections; they're not different enough to warrant separate sections.

Member Author

I'm assuming that the next proposer will be validating blocks as they arrive, so they shouldn't need to verify all proofs etc. The ideal situation is that the node is at the tip by the time the checkpoint proposal arrives (or at least only slightly behind).


## L1 Submission Handoff

The predecessor attempts L1 submission during their slot as normal. At the **slot boundary**, the predecessor stops trying. After the slot boundary, anyone can submit the predecessor's checkpoint to L1 — the current `ProposeLib.sol` already allows any address to submit. The incentive for the next proposer to submit is indirect: B needs A's checkpoint on L1 to make B's own blocks valid.
Contributor

> the current `ProposeLib.sol` already allows any address to submit

> The incentive for the next proposer to submit is indirect

It could be that there isn't sufficient "indirect incentive":

If the "predecessor" gets the block reward, how does the submitter (be it the next proposer or someone else) get reimbursed for their L1 ETH submission costs?
If the ETH cost of the next proposer submitting both the previous checkpoint proposal and their own proposal[1] exceeds the block reward, there might not be sufficient incentive.

[1] a prudent proposer would want to submit their own proposal so as to reduce the likelihood of getting slashed, presumably.

Member Author

@Maddiaa0 Maddiaa0 Feb 23, 2026

I agree here; I've moved this section to optional at the end. In the original discussions we were planning to slash the proposer who missed checkpoint submission and/or give most of their reward to the submitter (if the submission window is open for longer to maintain liveness).


## L1 Submission Handoff

The predecessor attempts L1 submission during their slot as normal. At the **slot boundary**, the predecessor stops trying. After the slot boundary, anyone can submit the predecessor's checkpoint to L1 — the current `ProposeLib.sol` already allows any address to submit. The incentive for the next proposer to submit is indirect: B needs A's checkpoint on L1 to make B's own blocks valid.
Contributor

Presumably Rollup.sol is already well-designed enough to revert early in the case of a duplicated submission, to save the subsequent submitter(s) unnecessary gas?


The current `ProposeLib.sol` already allows **any address** to submit a checkpoint — there is no proposer validation on the submitter. This design preserves that property.

The only L1 contract change required is **slot validation**: today the contract requires `slot == currentSlot`. For Build Ahead, the contract must also accept `slot == currentSlot - 1` after the slot boundary has elapsed. This allows a checkpoint from slot N to be submitted during slot N+1's window.
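The relaxed slot check described above could be sketched as follows. This is a TypeScript illustration of the rule only, not the Solidity in `ProposeLib.sol`; the function name `isSubmissionSlotValid` and the `buildAheadEnabled` flag are hypothetical.

```typescript
// Hypothetical sketch of the slot-validation rule. Names are illustrative
// and are not taken from ProposeLib.sol.
function isSubmissionSlotValid(
  checkpointSlot: number,
  currentSlot: number,
  buildAheadEnabled: boolean,
): boolean {
  // Today's rule: the checkpoint must target the slot it is submitted in.
  if (checkpointSlot === currentSlot) {
    return true;
  }
  // Proposed relaxation: also accept the immediately preceding slot.
  // The currentSlot > 0 guard avoids accepting a slot before genesis.
  return buildAheadEnabled && currentSlot > 0 && checkpointSlot === currentSlot - 1;
}
```

Under this sketch a checkpoint for slot N is accepted during slot N and slot N+1, and rejected from slot N+2 onward.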
Contributor

Something to discuss:

It could be that any protocol changes which aim at a post-alpha release should go through the (not yet finalised) AZIP process. An L1 change is such a "protocol change".

Member Author

oh no

The next proposer builds from the **last confirmed state** (the most recent L1-confirmed tip). This is equivalent to the current system's behavior when a proposer is offline, but triggered earlier within the slot rather than waiting for the slot boundary.

The exact timeout value is TBD and will be tuned via testing. It should be long enough to avoid false positives (slow proposers) but short enough to recover meaningful build time.

Contributor

What happens, slashing-wise, in the following scenario:

The previous proposer proposes a checkpoint.

The next proposer begins building block 1 of their checkpoint. After 6s, they are done building that block. Q1: Do they broadcast this single block to their peers? Or do they wait until the end of their checkpoint to broadcast this block?

"Multiple blocks per checkpoint" doesn't make much sense if the latter, so presume they broadcast this block after 6s.

So the next proposer has now proposed a block before the previous checkpoint has been submitted to L1 (and, under Option 2 above, before the previous block has been attested-to).

Now suppose:

  • With Option 1 above:
    • The prev checkpoint is not submitted to L1, or it doesn't reach L1 in time.
  • With Option 2 above:
    • The prev checkpoint is not attested; or it is not submitted to L1; or it doesn't reach L1 in time.

Q2: Can/should the next proposer be slashed for technically proposing a block that builds on a checkpoint that never got proposed to L1?

Member Author

They should only broadcast once they've received attestations. You should not be slashed for building on a block that has attestations, even if it never makes it to L1 in a checkpoint.
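
A minimal sketch of that rule, assuming the next proposer builds locally and holds blocks back until the parent checkpoint is attested. The `Block` type and `BroadcastGate` class are hypothetical, and what counts as "enough" attestations is left to the caller.

```typescript
// Hypothetical types; these are not the node's actual data structures.
type Block = { number: number; parentCheckpointId: string };

class BroadcastGate {
  private pending: Block[] = [];
  private attested = new Set<string>();

  // Called when a locally built block is ready.
  // Returns the blocks that may be broadcast right now.
  onBlockBuilt(block: Block): Block[] {
    if (this.attested.has(block.parentCheckpointId)) {
      return [block];
    }
    this.pending.push(block); // keep building silently
    return [];
  }

  // Called once the caller has seen sufficient attestations for a checkpoint.
  // Releases every held block that builds on that checkpoint.
  onCheckpointAttested(checkpointId: string): Block[] {
    this.attested.add(checkpointId);
    const ready = this.pending.filter((b) => b.parentCheckpointId === checkpointId);
    this.pending = this.pending.filter((b) => b.parentCheckpointId !== checkpointId);
    return ready;
  }
}
```

Blocks built before attestations arrive stay local, so nothing is ever broadcast on top of an unattested checkpoint.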


3. **Coordinated upgrade required.** The L1 contract changes are a hard fork. All validators must upgrade simultaneously — likely at an epoch boundary.

4. **Remaining dead zone.** Even with Build Ahead, 10-12s of dead zone remains per slot (the checkpoint finalization overhead: re-execution + assembly + P2P round-trip). This is inherent to the attestation-based protocol and cannot be eliminated without changing the trust model (e.g., optimistic attestation).
Contributor

Can't the attestors re-execute block 1 of a checkpoint as soon as they see it (6s into the checkpoint), rather than only commencing re-execution of the checkpoint's blocks after the final block in the checkpoint has been proposed?
Wouldn't this then reduce the "re-execution" component of the dead zone to "re-execution of the final block of the checkpoint", which is 10x faster?

Member Author

Yeah, I'll delete these lines; they're not correct.


# Open Questions

- **Reward economics:** Rewards are distributed at **proof arrival time**, not at checkpoint submission time. This means the submitter of the L1 checkpoint doesn't automatically get the reward — the proof associates the reward with the original proposer. The incentive for B to submit A's checkpoint is indirect (B needs A's checkpoint on L1 to make B's own blocks valid). The exact reward mechanism and incentive alignment needs formal analysis but is deferred from this design.
Contributor

Nice. This covers one of my earlier comments. Perhaps this analysis shouldn't be deferred, since it is important to the feasibility of this design?

Member Author

Yeah, it is. I'll reword to say that we must address this before proceeding.

Member Author

Or we can work without L1 changes.

Member Author

For implementation I have opted to run this without L1 changes. These can be part of a separate proposal to improve submission across the board. Nothing should be specific to this implementation.

@Maddiaa0 Maddiaa0 marked this pull request as ready for review February 23, 2026 15:50

KPIs: we are trying to reduce **user-perceived latency** (time from TX submission to "proposed chain" visibility) by up to 12s and increase **effective chain throughput** from 8 blocks per slot to 10 blocks per slot (+25% improvement).

The solution: allow the next slot's proposer to begin building blocks as soon as the predecessor's checkpoint data appears on P2P — before it lands on L1. The next proposer builds and broadcasts blocks during what is currently dead time, and anyone can submit the predecessor's checkpoint to L1 (the next proposer is incentivized to do so because their own blocks depend on it). All blocks B builds during the overlap go into B's checkpoint for B's slot.
Contributor

Is it not possible to stagger the slots entirely?

So, proposer A builds checkpoint 10 in slot 9 and publishes to L1 in slot 10. Meanwhile, proposer B builds checkpoint 11 in slot 10 (having seen checkpoint 10 on p2p towards the end of slot 9) and publishes in slot 11.

I guess everyone needs to agree to only attest to a checkpoint for slot N broadcasted in slot N - 1. Otherwise the proposer for slot N could just keep building all the way through slot N. But that should be ok.

Member Author

Yes it is, the implementation plan will likely go this route.

As long as the majority of the node software agrees to NOT attest to proposals that appear outside of the pipeline, then it should be safe.

An area of risk is when we pass the agreed "timeout". If a proposer publishes their proposal late, we need the majority of attestors to ignore it; otherwise they could publish a proposal within the next pipelined slot, causing the new proposer to have to scramble. This case will need to be recoverable within the implementation, as there may be an attestation "withholding" attack that could crash the node software if this were not covered.
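
The attestation rule for staggered slots discussed in this thread (attest to a checkpoint for slot N only if it was broadcast in slot N - 1) can be sketched as below. The helper names and the `genesisSec` / `slotDurationSec` parameters are assumptions for illustration, not the node's actual API, and the sketch ignores the mid-slot timeout.

```typescript
// Slot index that contains a given timestamp (in seconds).
// genesisSec and slotDurationSec are illustrative parameters.
function slotAt(timestampSec: number, genesisSec: number, slotDurationSec: number): number {
  return Math.floor((timestampSec - genesisSec) / slotDurationSec);
}

// Attest only if the proposal targeting `targetSlot` was received
// during the slot immediately before it.
function shouldAttest(
  targetSlot: number,
  receivedAtSec: number,
  genesisSec: number,
  slotDurationSec: number,
): boolean {
  return slotAt(receivedAtSec, genesisSec, slotDurationSec) === targetSlot - 1;
}
```

A proposal that arrives once its own target slot has started is ignored under this rule, which is the behaviour a majority of attestors would need to share for late proposals to be safely dropped.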


## L1 Submission Handoff

The predecessor attempts L1 submission during their slot as normal. At the **slot boundary**, the predecessor stops trying. After the slot boundary, anyone can submit the predecessor's checkpoint to L1 — the current `ProposeLib.sol` already allows any address to submit. The incentive for the next proposer to submit is indirect: B needs A's checkpoint on L1 to make B's own blocks valid, but also should yield some of proposer A's block rewards.
Contributor

@aminsammara aminsammara Feb 28, 2026

> but also should yield some of proposer A's block rewards.

This is on the roadmap. Data shows undeniably that certain publishers are much better at landing checkpoints on L1 faster and cheaper. Longer term, I don't think proposers should submit anything to L1; that should be handled by specialized publishers.

Without this, don't assume anyone is ever gonna submit someone else's block to L1. The current avg cost per block is 77% of the reward (assuming the increase in block rewards).
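
To make the 77% figure concrete, here is a rough payoff sketch in units of one block reward. It assumes that submitting someone else's checkpoint costs the same as submitting your own, and `rewardShare` is a hypothetical parameter for the fraction of the predecessor's reward paid to the submitter; neither assumption comes from the design doc.

```typescript
// All amounts are in units of one block reward.
// costFraction: L1 submission cost per checkpoint (0.77 per the figure above).
// rewardShare: hypothetical fraction of the predecessor's reward paid to the submitter.

// Net result of submitting only the predecessor's checkpoint.
function netPayoffSubmittingPredecessor(rewardShare: number, costFraction: number): number {
  return rewardShare - costFraction;
}

// Net result of submitting both the predecessor's checkpoint and your own,
// assuming you receive your own full reward.
function netPayoffSubmittingBoth(rewardShare: number, costFraction: number): number {
  return 1 + rewardShare - 2 * costFraction;
}
```

With `costFraction = 0.77` and no reward share, both payoffs are negative, so under these assumptions the submitter needs more than about half of the predecessor's reward before submitting both checkpoints is profitable.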

Member Author

Yeah, I'm a believer in decoupling these actions somewhat.


If the current proposer is unresponsive — no block proposals or checkpoint data appear on P2P — the network should not wait for the full slot to elapse. A **globally agreed, configurable timeout** fires mid-slot: if no proposals from the current proposer are observed by this deadline, honest nodes start accepting proposals from the next proposer.

The next proposer builds from the **last confirmed state** (the most recent L1-confirmed tip). This is equivalent to the current system's behavior when a proposer is offline, but triggered earlier within the slot rather than waiting for the slot boundary.
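The timeout rule above could be expressed roughly as follows. This is a sketch under stated assumptions: `acceptedProposer`, `msIntoSlot`, and `timeoutMs` are hypothetical names, and the actual timeout value is not specified in the doc.

```typescript
type ProposerRole = "current" | "next";

// Decide whose proposals an honest node accepts at a given point in the slot.
// timeoutMs stands for the globally agreed, configurable timeout.
function acceptedProposer(
  msIntoSlot: number,
  timeoutMs: number,
  sawCurrentProposerActivity: boolean,
): ProposerRole {
  // Any observed block proposal or checkpoint data keeps the current proposer in charge.
  if (sawCurrentProposerActivity) {
    return "current";
  }
  // No activity by the deadline: start accepting the next proposer,
  // who builds from the most recent L1-confirmed tip.
  return msIntoSlot >= timeoutMs ? "next" : "current";
}
```

The sketch treats the decision as a pure function of local observations, so nodes that see the current proposer's activity at different times can disagree near the deadline.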
Contributor

> last confirmed state (the most recent L1-confirmed tip).

I don't think this is worth it. Any m less than the slotDuration leaves the proposer N+1 who builds early after the timeout trigger at risk of wasting work.

Image

Contributor

Sorry for the confusing N+3 stuff, but my point is: as long as the L1 contract accepts proposals for the duration of the slot, having the next proposer build early could be a griefing vector. It gives us nothing in the happy path (i.e. next proposer sees block proposals before m); in the unhappy path it still has shortcomings (what I describe above), and it is probably a lot of work to implement.


# Rollout Plan

This **requires** a coordinated rollout - creating a second proposer view will require all nodes to run compatible versions such that they are building along the correct tip.
Contributor

It takes time for nodes to update. A significant % will not update for days and this will split the network's view of what the correct tip is.
