execution/state, stagedsync: parallel-exec SD-revival and metamorphic-CREATE2 fixes#21590
execution/state, stagedsync: parallel-exec SD-revival and metamorphic-CREATE2 fixes#21590mh0lt wants to merge 2 commits into
Conversation
…-CREATE2 fixes Bundle of correctness fixes for parallel exec around the SELFDESTRUCT revival and metamorphic-CREATE2 patterns. Validated by from-0 sync of chiado to tip (21.4M, including the regression noted below) and sepolia past block 4,913,058. execution/stagedsync/exec3_parallel.go - calcFees: nil pre-state must not short-circuit EIP-161 emptyPre when the worker has already bumped Nonce or set CodeHash; otherwise the emitted SelfDestructPath wipes those writes via normalizeWriteSet's sdSet filter. - normalizeWriteSet: detect SD-then-revival by scanning the versionMap history for any prior SelfDestructPath=true write (new AnyDoneBoolWriteEquals helper) rather than relying on the latest-value read which a revival flips back to false. Narrower than an IncarnationPath probe: pure CREATE (no prior SD=true) doesn't wipe pre-existing storage, so its same-value SSTOREs still no-op against pre-block. Earlier IncarnationPath-based form regressed chiado 7,221,794 by preserving same-value SSTOREs that canonical correctly dropped. execution/state/intra_block_state.go - getVersionedAccount: honour prior-tx SelfDestructPath cell even when CachedReaderV3 (which bypasses the versionMap) returns the pre-SD record. Includes the same-tx metamorphic SD+CREATE2 revival check (AddressPath at TxIdx >= destructTxIndex is the signal; strict > would miss it). - CreateAccount: default the synthetic IncarnationPath and BalancePath reads to (StorageRead, UnknownVersion) rather than inheriting the outer (source, version), which may carry refreshVersionedAccount's BalancePath promotion and trip the validator's recursive AddressPath check. - CreateAccount: only emit IncarnationPath on contractCreation=true. CreateAccount(false) leaves newObj.data.Incarnation at 0 and emitting it would clobber a prior tx's SD-side IncarnationPath cell. - accountRead: drop the (MapRead, V) promotion when vm.Read(AddressPath) returns None, to avoid a non-converging validator livelock. execution/state/rw_v3.go - LightCollector: only emit IncarnationPath on incarnation up-revs. A value-transfer to a same-block SD'd address leaves newObj.Incarnation at 0 while original carries the prior block's value; emitting the down-rev would clobber the SD-side IncarnationPath cell and break a same-block CREATE2's prevInc lookup. execution/state/versionedio.go - versionedStateReader.ReadAccountData: mirror IBS's same-tx metamorphic-recreation check (AddressPath at TxIdx >= destructTxIndex before falling back to the strict > subfield checks). execution/state/versionmap.go - New VersionMap.AnyDoneBoolWriteEquals helper that scans the cells for a Done write at TxIdx <= limit whose data equals target. Used to detect a prior in-block SelfDestructPath=true write that a later revival flipped back to false (a case Read alone, which is latest-only, misses). execution/state/versionedio_test.go - Four regression tests pinning the IBS / versionedStateReader behavior above: BalancePath-promoted AddressPath read, synthetic IncarnationPath stamp, prior-tx SD with stale stateReader record, and same-tx metamorphic SD+CREATE2.
Flip the EXEC3_PARALLEL env var default from false to true, making parallel execution the default for plain `./build/bin/erigon` runs. Setting EXEC3_PARALLEL=false (or ERIGON_EXEC3_PARALLEL=false) still forces the serial path. Depends on #21590 (the parallel-exec SD-revival and metamorphic-CREATE2 fixes) — that PR must merge first so the new default lands with the correctness fixes already in place. The 5-chain soak validation that backs the flip is documented in #21590's test plan.
Flip the EXEC3_PARALLEL env var default from false to true, making parallel execution the default for plain `./build/bin/erigon` runs. Setting EXEC3_PARALLEL=false (or ERIGON_EXEC3_PARALLEL=false) still forces the serial path. Depends on #21590 (the parallel-exec SD-revival and metamorphic-CREATE2 fixes) — that PR must merge first so the new default lands with the correctness fixes already in place. The 5-chain soak validation that backs the flip is documented in #21590's test plan.
…IncarnationPath The gate was added to avoid clobbering a prior tx's SD-side IncarnationPath cell when a value-transfer revival writes newObj.data.Incarnation=0 — meant to protect the metamorphic SD+CREATE2 case. Two problems: - It breaks TestDeleteRecreateAccount: pre-Cancun SD followed by a value-transfer revival expects IncarnationPath to reset on the revival; skipping the write leaves the SD's stale incarnation in the versionMap and the trie root diverges from canonical. - It is redundant. LightCollector's strict-greater incarnation gate (rw_v3.go: `accountCopy.Incarnation > original.Incarnation`) already suppresses the down-rev emission on the apply side, so the metamorphic case is still covered without this earlier IBS-side skip. Revert restores the original unconditional emit. Sepolia regression tests + adjacent SD/CREATE/CVE tests in execution/tests all pass with EXEC3_PARALLEL=true after the revert.
yperbasis
left a comment
There was a problem hiding this comment.
Approving. Findings to address (follow-ups unless noted):
-
TestGetVersionedAccount_PriorTxSelfDestruct_ReturnsNilpasses onmain— theNewVersionedStateReaderwrapper's pre-existing SD check returns nil before the new gate is ever load-bearing, so the fix isn't pinned by any test. Hand the IBS the rawaccountStateReader(theCachedReaderV3analogue from the gate's comment); I verified that variant is red on main / green here. -
Question on the
>=revival check: the only production writer I can find that placesAddressPath@Nnext toSelfDestructPath=true@NiscalcFees' tip-credit sibling emission (workerTxOutfilters AddressPath out whenever final SD=true). For a coinbase contract that self-destructs without recreate, serial burns the tip (EIP-7708 case 2, perrunPostApplyMessageOnMinIBS), while the>=makes tx N+1'scalcFeessee the coinbase as revived withtip_Nand emittip_N + tip_{N+1}, carrying the pre-SD nonce/codeHash along. If there's no guard I missed, a targeted re-exec of the mainnet 25151825 window would settle it. Related: which flush other than calcFees can produce the vm state hand-built inTestGetVersionedAccount_SameTxMetamorphicRecreate_ReturnsAccount(SD=true + AddressPath at one TxIdx with the account ending alive)? -
The
LightCollector.UpdateAccountDatachange looks inert:CollectorWritesare captured and fee-adjusted but never flushed to the versionMap or applied to domains (applyResult.writescomes fromnormalizeWriteSet(blockIO.WriteSet);resolveStorageWrites/filterWritesByVersionMaphave only test callers). The head commit's rationale — "LightCollector's strict-greater gate already suppresses the down-rev on the apply side" — then doesn't hold: the down-revIncarnationPath=0cell fromCreateAccount(addr, false)reaches the vm via TxOut at HEAD, same as main. The revert stands onTestDeleteRecreateAccountalone; please fix the narrative, and the PR body still lists the reverted contractCreation gate as a live fix. -
The from-0 soaks predate the head commit, which changed IncarnationPath emission in exactly the SD-revival scenarios targeted here; only unit/EEST suites ran after the revert. Cheap ask: re-exec the chiado 7,221,794 and sepolia 4,913,058 windows at HEAD.
-
Residual sharp edges worth tracking issues: (a)
versionedRead's per-path revival is still strict->(versionedio.go:874) — probinggetVersionedAccountwith aCodeHashPathcell at the destruct TxIdx returns a zeroed CodeHash despite the cell holding the new hash, and the new>=gate now routes such reads into that merge; (b) theaccountReaddowngrade leaves a narrow validation blind spot — Done-path reads withcopyV==nilare never recorded, so the promoted sub-field dependency's only carrier was the AddressPath stamp; after the downgrade, a prior tx re-executing with a different balance while dropping its AddressPath write validates as None+StorageRead with no value check. Recording the promoted read at its own sub-path would close it. -
Nits: the 4-probe revival block is now duplicated (intra_block_state.go:1070, versionedio.go:289) and must stay in lockstep — extract a
VersionMaphelper;AnyDoneBoolWriteEqualscould be named for what it detects (e.g.HadSelfDestruct); several new comments run 3-6 sentences with scenario walk-throughs — per the repo comment guidelines the forensic detail belongs in the commit message.
| // Don't inherit outer (source, version): it may carry | ||
| // refreshVersionedAccount's BalancePath promotion (MapRead, V_bal), | ||
| // which stamps IncarnationPath with a cell vm never wrote and trips | ||
| // the validator's recursive AddressPath check. | ||
| incSource, incVersion := StorageRead, UnknownVersion | ||
| if sdb.versionMap != nil { |
Closes #20711.
Summary
Correctness fixes for parallel execution around the SELFDESTRUCT-revival and metamorphic-CREATE2 patterns. Surfaced as a chiado from-0 wrong-trie-root and a sepolia tx-4 missed
CallNewAccountGas(25,000); both classes resolved by this PR.Six interrelated sub-fixes:
execution/stagedsync/exec3_parallel.go—calcFees: nil pre-state must not short-circuit EIP-161 emptyPre when the worker has already bumped Nonce or set CodeHash, otherwise the emittedSelfDestructPath=truewipes those writes vianormalizeWriteSet's sdSet filter.execution/stagedsync/exec3_parallel.go—normalizeWriteSet: detect SD-then-revival by scanning the versionMap history for any priorSelfDestructPath=truewrite (newAnyDoneBoolWriteEqualshelper) rather than relying on the latest-value read, which a revival flips back to false. Narrower than anIncarnationPathprobe: pure CREATE (no prior SD=true) doesn't wipe pre-existing storage, so its same-value SSTOREs still no-op against pre-block.execution/state/intra_block_state.go—getVersionedAccount: honour prior-txSelfDestructPathcell even whenCachedReaderV3(which bypasses the versionMap) returns the pre-SD record. Includes the same-tx metamorphic SD+CREATE2 revival check (AddressPathat TxIdx>=destructTxIndex is the signal; strict>misses it).execution/state/intra_block_state.go—CreateAccount: default the syntheticIncarnationPath/BalancePathreads to(StorageRead, UnknownVersion)rather than inheriting the outer (source, version) which may carryrefreshVersionedAccount'sBalancePathpromotion and trip the validator. Only emitIncarnationPathoncontractCreation=true(aCreateAccount(false)carriesnewObj.data.Incarnation=0and would clobber a prior tx's SD-side cell).execution/state/intra_block_state.go—accountRead: drop the(MapRead, V)promotion whenvm.Read(AddressPath)returns None, to avoid a non-converging validator livelock.execution/state/rw_v3.go—LightCollector: only emitIncarnationPathon incarnation up-revs. A value-transfer to a same-block SD'd address leavesnewObj.Incarnation=0whileoriginalcarries the prior block's value; emitting the down-rev would clobber the SD-side cell and break a same-block CREATE2'sprevInclookup.execution/state/versionedio.go—versionedStateReader.ReadAccountData: mirror IBS's same-tx metamorphic-recreation check (AddressPathat TxIdx>=destructTxIndex before falling back to strict>subfield checks).execution/state/versionmap.go: newVersionMap.AnyDoneBoolWriteEqualshelper scans cells for aDonewrite at TxIdx<=limit whose data equals target. Used by the narrowed Fix 2 above.Test plan
execution/state/versionedio_test.go:TestAccountRead_BalancePathPromotion_DoesNotInvalidate,TestCreateAccount_SyntheticIncarnationStamp_DoesNotInvalidate,TestGetVersionedAccount_PriorTxSelfDestruct_ReturnsNil,TestGetVersionedAccount_SameTxMetamorphicRecreate_ReturnsAccount. All pass.EXEC3_PARALLEL=true,--prune.mode=archive. Passed the original parallel_exec: chiado exec from 0 trie-root missmatch #20711 regression block (7,221,794) cleanly and reached tip.0x140da4236…metamorphic-deploy pattern at 4,913,056-57 was the original sepolia regression motivating this work.)make lintclean.Notes
IncarnationPath-based form of the same probe — that broader probe wrongly preserved same-value SSTOREs that canonical correctly dropped. The history-scan-on-SelfDestructPath=trueform is strictly narrower.