WIP: Enable downloading of transaction sets in parallel with early SCP stages#5209
WIP: Enable downloading of transaction sets in parallel with early SCP stages#5209bboston7 wants to merge 15 commits intostellar:masterfrom
Conversation
285df2f to
1692131
Compare
|
I just rebased this on master and force pushed up so we can get a CI run. Ignore the terrible branch name; this is built on top of protocol 26, not 25. |
Some early notes Item fetcher note More notes Notes Interface improvements Add way to get time spent waiting for tx hash Formatting Expose download times, and check in ballot protocol Find actual "step 3" location Notes Consider values being downloaded as valid for nomination Allow partially validated values to step 3 Fix nomination validation check Allow proceeding without tx set Logging More general logging More targeted log messages More super detailed logging More logging Improve pending envelope flow Handle combining candidates Sim network setup Manually feed SCP messages to node0 Loadgen Catchup two ledgers worth Optional checks fmt Gate feature on app being synced Lots of log changes and sanity checks Fix loadgen issues Allow processed messages through Add `feedTxSetFromStatement` Cleanup Add ability to feed all tx sets from a given slots Feed everywhere Test different injection points TODO Check wait time Revert "Check wait time" This reverts commit 0f8d1e4. Fix test for cases where the voted on value changes Harden test to handle case where first slot experiences issue Reapply "Check wait time" This reverts commit 53719f7. Outline skip ledger test Add support for message filtering Reliably force skip ledger condition Debug timers Simple vote-to-skip based on static timeout Checkpoint Builds Add `isSkipLedgerValue` Fix test instances Downstream skip ledger handling Take largest existing skip value when available todo Downstream skip ledger handling Replace v_3 v_2 Update some comments Bump xdr Handle high level union TODO signature checking Generate new skip values Don't short circuit on stall Refactor skip location Programmatically check that SKIP is externalized Better skip checking Metrics New (nonworking) test Fix vote reversal test Configurable timeout Bump default timeout Move debug messages to own partition Add new metric tracking time blocked during balloting Some testing notes Add some low level skip tests Fix errors introduced in rebase Cleanup + triage TODOs Fix lifetimes of tx set objects Formatting Handles todos 1 and 2 Fix todo 3 TODO 6 TODOs 7, 9, and 10 Add some details around TODOs 11 and 12 Value -> StellarValue error handling Add new fields to better construct skip value tx sets Replace `kInvalidValue` with skip in `maybeReplaceValueWithSkip` Remove todo 16 Clarify envelope parsing order Remove todo 20 Remove todo 21 Error handling around missing wait times Clean up TODOs around stall point Clean up catchup related TODOs Remove TODO(30) Remove TODO(31) Remove todos around min validation level Disallow skip values in nomination
Fix `getStatementValues` to extract all values, move validtion relax
cf854d3 to
0564c22
Compare
We expect an extra cache hit now from the additional validation check before setting `mCommit` in the ballot protocol
| return ret; | ||
| } | ||
|
|
||
| // If no more fully ready envelopes, proceed to processing partially |
There was a problem hiding this comment.
This might be an implementation detail or a naming issue, but doesn't having a single ready envelope for the slot imply all partial envelopes become ready?
There was a problem hiding this comment.
I guess it depends on which envelopes are "ready" - if we can't vote any new values, then a ready envelope would imply all pending envelopes are ready.
There was a problem hiding this comment.
doesn't having a single ready envelope for the slot imply all partial envelopes become ready
Not necessarily. There could be multiple Values on the network for a given slot, and the local node may have some but not all of them. One way this can happen is if nomination has multiple round leaders (timeouts have occurred) proposing different values. Processing "ready" envelopes first nudges the network towards values that the local node sees as available in the case of multiple values on the network.
You're right though that in the common case (a single round leader), either all envelopes will be "ready" or they'll all be "partially ready".
| // accepted so that checkHeardFromQuorum can see a quorum and arm | ||
| // the ballot timer. The commit gate in setConfirmPrepared | ||
| // independently validates values before voting to commit. | ||
| return SCPDriver::kFullyValidatedValue; |
There was a problem hiding this comment.
Hmm, generic validation function returning a blanket "valid" for the whole PREPARE phase seems really sketchy - could you please walk me through what necessitates this change? Is this related to the relaxed validity checks we discussed before?
There was a problem hiding this comment.
This might mean we need to rewrite the validation flow a bit (for example, I'm not sure why we'd validate values at commit time outside of this function)
There was a problem hiding this comment.
This is exactly the relaxation we needed on PREPARE statements. The only things we were checking before were p, p', and ballot, but with parallel downloading these values may be legetimately invalid. I'm not really sure what's left to check here.
I'm not sure why we'd validate values at commit time outside of this function
I think I addressed this in the comment, but the added check at commit time isn't related to this. It's mainly to provide a more useful error if someone forgets to update their stellar-core instance.
There was a problem hiding this comment.
the issue that I'm seeing is that we're not returning "maybe valid" to indicate relaxed validation, and instead we return "fully valid". It seems error-prone as the caller could misinterpret this result. Is there a reason we don't return "maybe valid" here?
| mSlot.getSlotIndex(), mSlot.getSCP().ballotToStr(c), | ||
| mSlot.getSCP().ballotToStr(h)); | ||
|
|
||
| throwIfValueInvalidForCommit(c.value, "setAcceptCommit"); |
There was a problem hiding this comment.
I think this is related to my other comment, but I don't quite follow why an additional validation is needed here given that CONFIRM messages would have gone through validateValues that should have the tx set by then (or is this just a sanity check?)
There was a problem hiding this comment.
This is really just here to offer a better error message when a quorum of nodes pushes through a value that the local node sees as invalid. Without this, we would still crash in that scenario, but we'd crash at the emitCurrentStateStatement call at the end of this function.
I don't think we really care that a malicious quorum can crash the node, but in practice this is most likely to occur when someone forgets to update prior to a protocol boundary. This check lets us tell them why their node crashed.
| { | ||
| auto validationLevel = mSlot.getSCPDriver().validateValue( | ||
| mSlot.getSlotIndex(), value, /*nomination=*/false); | ||
| if (validationLevel != SCPDriver::kInvalidValue) |
There was a problem hiding this comment.
Hmm, this might be an abstraction/implementation issue, but how come it's safe to have the value in "waiting for download" state at commit time?
There was a problem hiding this comment.
Ah, this is a weird one. There are two ways to get to this function:
- Via
setConfirmCommit.setConfirmCommitis only called when the hint ballot is a CONFIRM or EXTERNALIZE ballot. PendingEnvelopes holds those back if they aren't fully downloaded, so akAwaitingDownloadvalue cannot end up here. - Via
setAcceptCommit. UnlikesetConfirmCommit, you can wind up atsetAcceptCommitvia a PREPARE hint ballot. In that case however, a quorum has already determined the value to be valid, so it must be valid. It's safe to setmCommitin that instance. It's worth noting that the node will naturally stall after this, as it cannot make further progress into the CONFIRM phase without the transaction set because PendingEnvelopes will hold partially fetched CONFIRM messages back. This is the one weird case where it's OK to have akAwaitingDownloadvalue inc.
Again, this function is technically unnecessary altogether. It really is just there to emit a better error message in a single edge case. I added it because of some confusion that came up in a model checker run where the model checker ran a scenario with a malicious quorum. I'm curious whether you think its existence makes the code more confusing overall, in which case I can remove it or better document its purpose.
| auto maybeQSet = mApp.getHerder().getQSet(qSetHash); | ||
| if (maybeQSet) | ||
| { | ||
| processSCPQueue(true); |
There was a problem hiding this comment.
is this correct? wouldn't it be possible to have other message types in this queue, which would cause them to get processed as well?
There was a problem hiding this comment.
I think that's fine? Any CONFIRM or EXTERNALIZE messages in the queue would have fully downloaded tx sets and should be safe to process. The "partially ready" queue only contains nomination and PREPARE messages, which should also be safe to process.
I don't see how this is different than the processSCPQueue call in the Herder::ENVELOPE_STATUS_READY case. In either case, all messages in the queue should safe to process, even though you may end up processing messages other than the one you just received.
Do you see a scenario where it would be wrong to process other message types here?
There was a problem hiding this comment.
My worry is processing messages in unexpected order, such that it trips downstream invariance (e.g. value must be downloaded, but we accidentally fed the message that's not fully ready, crashing the node).
I don't see how this is different than the processSCPQueue call in the Herder::ENVELOPE_STATUS_READY case.
Do we understand why? Messages can be received in any order, so let's say we feed a ballot message with 1 value followed by a nomination message referencing 3 values that aren't all downloaded. I don't quite understand how this wouldn't trip SCP.
There was a problem hiding this comment.
Oh, is this due to isNewerStatement checks in SCP itself? If so, that seems quite brittle at the two subsystems have an implicit correctness dependency between them that isn't documented. I know this was here before, but what worries me is that we're building on top of badness, making this even harder to maintain and reason about.
There was a problem hiding this comment.
value must be downloaded, but we accidentally fed the message that's not fully ready, crashing the node
If the value must be downloaded (such as for CONFIRM and EXTERNALIZE), then the envelope isn't considered ready or partially ready and isn't in either queue, so processSCPQueue won't pass it on.
Messages can be received in any order, so let's say we feed a ballot message with 1 value followed by a nomination message referencing 3 values that aren't all downloaded. I don't quite understand how this wouldn't trip SCP.
I don't understand why this would be a problem? Nomination can run concurrently with balloting. In this case, we'd treat the undownloaded values as valid in nomination. If balloting times out and hasn't set mValueOverride, it'll pick up the latest composite candidate from nomination. If it has set mValueOverride (and therefore set h), it will stick with the value from balloting. This is no different than the situation without parallel downloading.
There was a problem hiding this comment.
Following up about this (and the other comments on HerderImpl) to capture some discussion from our meeting.
The plan is to try pushing much of this logic (and related logic in PendingEnvelopes) down to SCP. My first thought is that it would live in Slot. PendingEnvelopes would treat all messages with downloaded qsets as "fully ready" (there would be no "partially ready" queue anymore) and pass them on to Slot. Slot would then interface with item fetcher as needed, and manage which values get passed to nomination/ballot protocol, and which get held back (CONFIRM and EXTERNALIZE messages without downloaded qsets). Slot would also handle delivering / replaying messages as tx sets arrive, potentially via a callback registered with ItemFetcher.
The first step here is to draw out the various data flows between Herder, PendingEnvelopes, ItemFetcher, and SCP. From there we can decide how to change these interfaces.
| // a NOMINATE or PREPARE message. Technically both of these criteria | ||
| // should be properly handled downstream, but this provides some | ||
| // additional assurance. | ||
| if (mApp.getState() == Application::State::APP_SYNCED_STATE && |
There was a problem hiding this comment.
Hmm, herder having to deal with the guts of SCP and differentiate between nomination, prepare and commit seems like an improper abstraction/separation of responsibilities situation. It also seems quite brittle - what if we change SCP in the future such that some CONFIRM messages would be allowed and some won't? or some PREPARE messages should be stalled?
It looks to me like the main problem is the weird mechanism in herder (which existed prior to your changes), which decides when to feed new messages to SCP. Does it really need to live in herder? Have you considered moving the mechanism inside of SCP, so it can automatically decide when to advance the protocol vs when to stall (then re-feed the messages back to the protocol).
There was a problem hiding this comment.
Yeah, I see what you mean. I'll try moving this into SCP after wrapping up the test harness work.
| } | ||
| break; | ||
| default: | ||
| // Value is valid or maybe valid, so we shouldn't replace it with skip |
There was a problem hiding this comment.
just wanted to clarify: this function should never be called with maybe valid state value right?
There was a problem hiding this comment.
You can end up with kMaybeValidValue here when processing messages for slots beyond LCL+1.
There was a problem hiding this comment.
Hm, based on what we discussed last week, I was under impression we weren't going to allow skip values for future slots - so it's unexpected to see maybe swapping values for slots we can't actually validate. Possibly this is related to the existing implementation of SCP and the hoops we have to jump through to get skip value to work. Is this because of how bumpState is implemented for all slots, current and future?
There was a problem hiding this comment.
Ah, I think there might be a bug here. The intent was that this only had an impact on the current slot, but once I added the kInvalidValue arm to the switch statement, it can technically fire for future slots containing invalid values.
The root issue is that the BallotProtocol itself doesn't know whether this is a future slot or not (that's a Herder concern). I think what we might want is to make validateValue communicate a bit more information back. It should probably tell us whether this value belongs to a future slot or not.
Additionally, if we add more details about why a value is invalid (structural vs something related to the tx set), then we can do a bit more validation for PREPARE messages as well. I need to think about that a bit to make sure it wouldn't break anything.
This is very much a work-in-progress. This is littered with TODOs that I am actively chipping away at. I've opened this draft PR as an easier way to point people towards the prototype as it evolves into a production-ready feature.
I try to keep this branch in a good enough state such that it:
Therefore, I tend to do most work on other branches and merge it in here when ready in larger single commits.
I plan to split this whole thing into multiple smaller PRs when it's time to actually merge it in.