Skip to content

WIP: Enable downloading of transaction sets in parallel with early SCP stages#5209

Draft
bboston7 wants to merge 15 commits intostellar:masterfrom
bboston7:skip-ledgers-p25
Draft

WIP: Enable downloading of transaction sets in parallel with early SCP stages#5209
bboston7 wants to merge 15 commits intostellar:masterfrom
bboston7:skip-ledgers-p25

Conversation

@bboston7
Copy link
Copy Markdown
Contributor

@bboston7 bboston7 commented Apr 8, 2026

This is very much a work-in-progress. This is littered with TODOs that I am actively chipping away at. I've opened this draft PR as an easier way to point people towards the prototype as it evolves into a production-ready feature.

I try to keep this branch in a good enough state such that it:

  • Compiles
  • Can join the network
  • Can run on supercluster

Therefore, I tend to do most work on other branches and merge it in here when ready in larger single commits.

I plan to split this whole thing into multiple smaller PRs when it's time to actually merge it in.

@bboston7
Copy link
Copy Markdown
Contributor Author

I just rebased this on master and force pushed up so we can get a CI run. Ignore the terrible branch name; this is built on top of protocol 26, not 25.

bboston7 added 12 commits April 20, 2026 13:26
Some early notes

Item fetcher note

More notes

Notes

Interface improvements

Add way to get time spent waiting for tx hash

Formatting

Expose download times, and check in ballot protocol

Find actual "step 3" location

Notes

Consider values being downloaded as valid for nomination

Allow partially validated values to step 3

Fix nomination validation check

Allow proceeding without tx set

Logging

More general logging

More targeted log messages

More super detailed logging

More logging

Improve pending envelope flow

Handle combining candidates

Sim network setup

Manually feed SCP messages to node0

Loadgen

Catchup two ledgers worth

Optional checks

fmt

Gate feature on app being synced

Lots of log changes and sanity checks

Fix loadgen issues

Allow processed messages through

Add `feedTxSetFromStatement`

Cleanup

Add ability to feed all tx sets from a given slots

Feed everywhere

Test different injection points

TODO

Check wait time

Revert "Check wait time"

This reverts commit 0f8d1e4.

Fix test for cases where the voted on value changes

Harden test to handle case where first slot experiences issue

Reapply "Check wait time"

This reverts commit 53719f7.

Outline skip ledger test

Add support for message filtering

Reliably force skip ledger condition

Debug timers

Simple vote-to-skip based on static timeout

Checkpoint

Builds

Add `isSkipLedgerValue`

Fix test instances

Downstream skip ledger handling

Take largest existing skip value when available

todo

Downstream skip ledger handling

Replace v_3

v_2

Update some comments

Bump xdr

Handle high level union

TODO

signature checking

Generate new skip values

Don't short circuit on stall

Refactor skip location

Programmatically check that SKIP is externalized

Better skip checking

Metrics

New (nonworking) test

Fix vote reversal test

Configurable timeout

Bump default timeout

Move debug messages to own partition

Add new metric tracking time blocked during balloting

Some testing notes

Add some low level skip tests

Fix errors introduced in rebase

Cleanup + triage TODOs

Fix lifetimes of tx set objects

Formatting

Handles todos 1 and 2

Fix todo 3

TODO 6

TODOs 7, 9, and 10

Add some details around TODOs 11 and 12

Value -> StellarValue error handling

Add new fields to better construct skip value tx sets

Replace `kInvalidValue` with skip in `maybeReplaceValueWithSkip`

Remove todo 16

Clarify envelope parsing order

Remove todo 20

Remove todo 21

Error handling around missing wait times

Clean up TODOs around stall point

Clean up catchup related TODOs

Remove TODO(30)

Remove TODO(31)

Remove todos around min validation level

Disallow skip values in nomination
Fix `getStatementValues` to extract all values, move validtion relax
We expect an extra cache hit now from the additional validation check
before setting `mCommit` in the ballot protocol
return ret;
}

// If no more fully ready envelopes, proceed to processing partially
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be an implementation detail or a naming issue, but doesn't having a single ready envelope for the slot imply all partial envelopes become ready?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it depends on which envelopes are "ready" - if we can't vote any new values, then a ready envelope would imply all pending envelopes are ready.

Copy link
Copy Markdown
Contributor Author

@bboston7 bboston7 Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't having a single ready envelope for the slot imply all partial envelopes become ready

Not necessarily. There could be multiple Values on the network for a given slot, and the local node may have some but not all of them. One way this can happen is if nomination has multiple round leaders (timeouts have occurred) proposing different values. Processing "ready" envelopes first nudges the network towards values that the local node sees as available in the case of multiple values on the network.

You're right though that in the common case (a single round leader), either all envelopes will be "ready" or they'll all be "partially ready".

// accepted so that checkHeardFromQuorum can see a quorum and arm
// the ballot timer. The commit gate in setConfirmPrepared
// independently validates values before voting to commit.
return SCPDriver::kFullyValidatedValue;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, generic validation function returning a blanket "valid" for the whole PREPARE phase seems really sketchy - could you please walk me through what necessitates this change? Is this related to the relaxed validity checks we discussed before?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might mean we need to rewrite the validation flow a bit (for example, I'm not sure why we'd validate values at commit time outside of this function)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exactly the relaxation we needed on PREPARE statements. The only things we were checking before were p, p', and ballot, but with parallel downloading these values may be legetimately invalid. I'm not really sure what's left to check here.

I'm not sure why we'd validate values at commit time outside of this function

I think I addressed this in the comment, but the added check at commit time isn't related to this. It's mainly to provide a more useful error if someone forgets to update their stellar-core instance.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the issue that I'm seeing is that we're not returning "maybe valid" to indicate relaxed validation, and instead we return "fully valid". It seems error-prone as the caller could misinterpret this result. Is there a reason we don't return "maybe valid" here?

mSlot.getSlotIndex(), mSlot.getSCP().ballotToStr(c),
mSlot.getSCP().ballotToStr(h));

throwIfValueInvalidForCommit(c.value, "setAcceptCommit");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is related to my other comment, but I don't quite follow why an additional validation is needed here given that CONFIRM messages would have gone through validateValues that should have the tx set by then (or is this just a sanity check?)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really just here to offer a better error message when a quorum of nodes pushes through a value that the local node sees as invalid. Without this, we would still crash in that scenario, but we'd crash at the emitCurrentStateStatement call at the end of this function.

I don't think we really care that a malicious quorum can crash the node, but in practice this is most likely to occur when someone forgets to update prior to a protocol boundary. This check lets us tell them why their node crashed.

{
auto validationLevel = mSlot.getSCPDriver().validateValue(
mSlot.getSlotIndex(), value, /*nomination=*/false);
if (validationLevel != SCPDriver::kInvalidValue)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this might be an abstraction/implementation issue, but how come it's safe to have the value in "waiting for download" state at commit time?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this is a weird one. There are two ways to get to this function:

  1. Via setConfirmCommit. setConfirmCommit is only called when the hint ballot is a CONFIRM or EXTERNALIZE ballot. PendingEnvelopes holds those back if they aren't fully downloaded, so a kAwaitingDownload value cannot end up here.
  2. Via setAcceptCommit. Unlike setConfirmCommit, you can wind up at setAcceptCommit via a PREPARE hint ballot. In that case however, a quorum has already determined the value to be valid, so it must be valid. It's safe to set mCommit in that instance. It's worth noting that the node will naturally stall after this, as it cannot make further progress into the CONFIRM phase without the transaction set because PendingEnvelopes will hold partially fetched CONFIRM messages back. This is the one weird case where it's OK to have a kAwaitingDownload value in c.

Again, this function is technically unnecessary altogether. It really is just there to emit a better error message in a single edge case. I added it because of some confusion that came up in a model checker run where the model checker ran a scenario with a malicious quorum. I'm curious whether you think its existence makes the code more confusing overall, in which case I can remove it or better document its purpose.

Comment thread src/herder/HerderImpl.cpp
auto maybeQSet = mApp.getHerder().getQSet(qSetHash);
if (maybeQSet)
{
processSCPQueue(true);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this correct? wouldn't it be possible to have other message types in this queue, which would cause them to get processed as well?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's fine? Any CONFIRM or EXTERNALIZE messages in the queue would have fully downloaded tx sets and should be safe to process. The "partially ready" queue only contains nomination and PREPARE messages, which should also be safe to process.

I don't see how this is different than the processSCPQueue call in the Herder::ENVELOPE_STATUS_READY case. In either case, all messages in the queue should safe to process, even though you may end up processing messages other than the one you just received.

Do you see a scenario where it would be wrong to process other message types here?

Copy link
Copy Markdown
Contributor

@marta-lokhova marta-lokhova Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My worry is processing messages in unexpected order, such that it trips downstream invariance (e.g. value must be downloaded, but we accidentally fed the message that's not fully ready, crashing the node).

I don't see how this is different than the processSCPQueue call in the Herder::ENVELOPE_STATUS_READY case.

Do we understand why? Messages can be received in any order, so let's say we feed a ballot message with 1 value followed by a nomination message referencing 3 values that aren't all downloaded. I don't quite understand how this wouldn't trip SCP.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, is this due to isNewerStatement checks in SCP itself? If so, that seems quite brittle at the two subsystems have an implicit correctness dependency between them that isn't documented. I know this was here before, but what worries me is that we're building on top of badness, making this even harder to maintain and reason about.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value must be downloaded, but we accidentally fed the message that's not fully ready, crashing the node

If the value must be downloaded (such as for CONFIRM and EXTERNALIZE), then the envelope isn't considered ready or partially ready and isn't in either queue, so processSCPQueue won't pass it on.

Messages can be received in any order, so let's say we feed a ballot message with 1 value followed by a nomination message referencing 3 values that aren't all downloaded. I don't quite understand how this wouldn't trip SCP.

I don't understand why this would be a problem? Nomination can run concurrently with balloting. In this case, we'd treat the undownloaded values as valid in nomination. If balloting times out and hasn't set mValueOverride, it'll pick up the latest composite candidate from nomination. If it has set mValueOverride (and therefore set h), it will stick with the value from balloting. This is no different than the situation without parallel downloading.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following up about this (and the other comments on HerderImpl) to capture some discussion from our meeting.

The plan is to try pushing much of this logic (and related logic in PendingEnvelopes) down to SCP. My first thought is that it would live in Slot. PendingEnvelopes would treat all messages with downloaded qsets as "fully ready" (there would be no "partially ready" queue anymore) and pass them on to Slot. Slot would then interface with item fetcher as needed, and manage which values get passed to nomination/ballot protocol, and which get held back (CONFIRM and EXTERNALIZE messages without downloaded qsets). Slot would also handle delivering / replaying messages as tx sets arrive, potentially via a callback registered with ItemFetcher.

The first step here is to draw out the various data flows between Herder, PendingEnvelopes, ItemFetcher, and SCP. From there we can decide how to change these interfaces.

Comment thread src/herder/HerderImpl.cpp
// a NOMINATE or PREPARE message. Technically both of these criteria
// should be properly handled downstream, but this provides some
// additional assurance.
if (mApp.getState() == Application::State::APP_SYNCED_STATE &&
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, herder having to deal with the guts of SCP and differentiate between nomination, prepare and commit seems like an improper abstraction/separation of responsibilities situation. It also seems quite brittle - what if we change SCP in the future such that some CONFIRM messages would be allowed and some won't? or some PREPARE messages should be stalled?

It looks to me like the main problem is the weird mechanism in herder (which existed prior to your changes), which decides when to feed new messages to SCP. Does it really need to live in herder? Have you considered moving the mechanism inside of SCP, so it can automatically decide when to advance the protocol vs when to stall (then re-feed the messages back to the protocol).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I see what you mean. I'll try moving this into SCP after wrapping up the test harness work.

}
break;
default:
// Value is valid or maybe valid, so we shouldn't replace it with skip
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just wanted to clarify: this function should never be called with maybe valid state value right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can end up with kMaybeValidValue here when processing messages for slots beyond LCL+1.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, based on what we discussed last week, I was under impression we weren't going to allow skip values for future slots - so it's unexpected to see maybe swapping values for slots we can't actually validate. Possibly this is related to the existing implementation of SCP and the hoops we have to jump through to get skip value to work. Is this because of how bumpState is implemented for all slots, current and future?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think there might be a bug here. The intent was that this only had an impact on the current slot, but once I added the kInvalidValue arm to the switch statement, it can technically fire for future slots containing invalid values.

The root issue is that the BallotProtocol itself doesn't know whether this is a future slot or not (that's a Herder concern). I think what we might want is to make validateValue communicate a bit more information back. It should probably tell us whether this value belongs to a future slot or not.

Additionally, if we add more details about why a value is invalid (structural vs something related to the tx set), then we can do a bit more validation for PREPARE messages as well. I need to think about that a bit to make sure it wouldn't break anything.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants