feat(test-fill): align `fill-stateful` with `gas-benchmarks` implementation by LouisTsai-Csie · Pull Request #2923 · ethereum/execution-specs

LouisTsai-Csie · 2026-05-27T08:43:26Z

🗒️ Description

Add a few features & refactors to the stateful filling implementation.

Commit d198221: Replace `debug_setHead` with FCU approach.

debug_setHead only supports in Geth, Besu and Erigon. Nethermind implements debug_resetHead that takes block hash as input. Reth has the method registered but not implemented (Link), Nimbus comments out the method (Link), no implementation either.

Consider some features (e.g., built-in opcode tracing) is now only supported in Nethermind master branch, this PR changes the debug_setHead to FCU version, in order to set the canonical chain.

Commit 0252bec: Add a new CLI flag `--gas-bump-blocks`.

For perf-devnet-3 snapshot, we need to bump the block gas limit to ~1T block gas limit. This is done by sending empty blocks in gas-benchmarks. User could use --gas-bump-blocks to assign number of empty block to increase the block gas limit.

In gas-benchmarks, we send 5000 empty blocks to bump the block gas limit for perf-devnet-3. For Jochmnet, we do not need to bump the gas limit, as the block gas limit is already high enough (~1 gigagas).

Commit 05f3c16: Add deterministic sender pooling.

For test_ether_transfers_onchain_receivers benchmark, to prevent the sender account being cached, we pre-fund the account via CL withdrawal. We inject 15K withdrawal into funding phase, starting with this pk range:

SENDER_BASE_KEY = int.from_bytes(
    keccak256(b"gas-repricings-private-key"), "big"
)

Commit 54f43ad: Batch tx receipt request

Batch tx receipt RPC request, to avoid large amount of RPC request at the same time. This was a bottleneck in the past when using gas-benchmarks.

Commit 8477bfe: refactor `fill_stateful.py`

Split the hive setup version from the original fill_stateful.py file, to keep the file small and focus on the filler feature.

These senders are then consumed by test_ether_transfers_onchain_receivers, this aligns changes in gas-benchmarks PR NethermindEth/gas-benchmarks#146

Verification

Verify the test via Hive mode, works successfully.

🔗 Related Issues or PRs

Issue #2910

✅ Checklist

All: Ran fast static checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
```
just static
```
All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
All: Considered updating the online docs in the ./docs/ directory.
All: Set appropriate labels for the changes (only maintainers can apply labels).
Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.
Tests: For PRs implementing a missed test case, update the post-mortem document to add an entry the list.
Ported Tests: All converted JSON/YML tests from ethereum/tests or tests/static have been assigned @ported_from marker.

Cute Animal Picture

codecov · 2026-05-27T09:01:09Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.43%. Comparing base (1596cf6) to head (8477bfe).
⚠️ Report is 4 commits behind head on forks/amsterdam.

Additional details and impacted files

@@                 Coverage Diff                 @@
##           forks/amsterdam    #2923      +/-   ##
===================================================
- Coverage            90.44%   90.43%   -0.01%     
===================================================
  Files                  535      535              
  Lines                32439    32430       -9     
  Branches              3012     3012              
===================================================
- Hits                 29338    29329       -9     
  Misses                2573     2573              
  Partials               528      528

Flag	Coverage Δ
unittests	`90.43% <ø> (-0.01%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

marioevz

Thanks for doing this, I added a couple of comments which I feel need to be resolved before merging.

marioevz · 2026-06-03T17:21:20Z

+        head_block = self.get_block_by_hash(head_block_hash)
+        assert head_block is not None, (
+            f"cannot reset head to unknown block {head_block_hash}"
+        )


Maybe this is unnecessary, since the client itself will make the same complaint if it cannot find the block.
It's just one less RPC call to make IMO.

marioevz · 2026-06-03T17:22:54Z

+        head_fork = self.fork.fork_at(
+            block_number=HexNumber(head_block["number"]),
+            timestamp=HexNumber(head_block["timestamp"]),
+        )


I see now the reason behind the previous call, but perhaps the solution is that we should never expect TransitionFork here and assume that ChainBuilderEthRPC is always going to be initialized with Fork, otherwise error out.

marioevz · 2026-06-03T17:25:20Z

+        fcu_version = head_fork.engine_forkchoice_updated_version()
+        assert fcu_version is not None, (
+            "Fork does not support engine forkchoice_updated"
+        )


We could do this once during initialization of ChainBuilderEthRPC, save fcu_version as a field of the chain builder instance, and then just use it as self.fcu_version.
This is also under the assumption that fork is always Fork and never TransitionFork.

marioevz · 2026-06-03T17:31:17Z

 @pytest.fixture(autouse=True, scope="function")
 def _reset_chain_between_tests(


Maybe out of scope for this PR, but the way this fixture is auto-used feels off.
I think a better structure would be to have a session_client_backend fixture that is session-scoped, and a client_backend fixture that is function-scoped.

The session_client_backend fixture initializes the object for the entire session, most of what _session_pre_run and client_backend fixtures do now.

The new client_backend fixture prepares and cleans up the ClientBackend object before and after every test, so what _reset_chain_between_tests currently does.

See example of session_t8n and t8n session and function scoped fixtures that use the same object internally:

execution-specs/packages/testing/src/execution_testing/cli/pytest_commands/plugins/filler/filler.py

Line 1093 in 7771d27

def session_t8n(

marioevz · 2026-06-03T17:34:08Z

        return
    try:
-        debug_rpc.set_head(start_hex)
+        eth_rpc.set_canonical_head(Hash(expected_hash))


This call feels like we are conflating two different objects to achieve a purpose.

We could have client_backend.reset_to_start_block or similar because client_backend already knows start_block so it can do everything by itself.

This also makes the code more maintainable because if we ever modify the client_backend structure, you only have to modify methods inside of the self-contained ClientBackend class.

marioevz · 2026-06-03T17:45:40Z

+    group.addoption(
+        "--gas-bump-blocks",
+        action="store",
+        dest="gas_bump_blocks",
+        default=5000,
+        type=int,
+        help=(
+            "Empty blocks at start to increase block gas limit."
+            "Each block increase ≤ parent_gas_limit/1024."
+            "Default: 5000 blocks."
+        ),
+    )


This is the right idea but the wrong abstraction.

A user normally only knows the gas limit their test needs, not how many empty blocks produce it. With this parameter they still have to (a) find the starting block gas limit of the snapshot, and (b) compute the number of blocks needed to ramp up to their target. That's our math to do, not theirs.

Two suggestions:

For the MVP, make the parameter --block-gas-limit (the target), and internally derive how many empty blocks are needed to reach it from the start block's gas limit. The ≤ parent_gas_limit/1024 per-block constraint is exactly the kind of detail we should be hiding behind this.

Longer term, and probably out of scope here: even that parameter may be unnecessary. We already have a default Environment with a default gas_limit, so we could just target that. A test that deviates inherently needs a different setup, and therefore a different number of gas-bump-blocks anyway.

marioevz · 2026-06-03T17:57:46Z

 # Placed outside pre-allocation to ensure accounts remain uncached.
-SENDER_BASE_KEY = (
-    0x1111111111111111111111111111111111111111111111111111111111111111
+SENDER_BASE_KEY = int.from_bytes(


My main concern here is that this sidesteps the standard pre.fund_eoa path. That abstraction exists so we can change how funding works under the hood without touching tests — and this is the one test that opts out, which means every framework change now has to account for this special case separately. Case in point: This PR adds a whole new parameter just to deal with this (--sender-pool-size).

It also makes the test order-dependent: yield_distinct_sender() hands out senders based on execution order, so the accounts a given test gets aren't stable.

I'd rather we solve the underlying thing generally. If the goal is funding via withdrawals to keep accounts uncached, that's something pre.fund_eoa could do for all accounts — at which point this becomes automatic and the special case (and the new parameter) goes away.

LouisTsai-Csie added 3 commits May 27, 2026 07:21

feat: repleace debug setHead by FCU update

d198221

feat: increase gas limit via empty blocks

0252bec

feat: deterministic sender pool allocation

05f3c16

LouisTsai-Csie self-assigned this May 27, 2026

LouisTsai-Csie added A-test-fill Area: execution_testing.cli.pytest_commands.plugins.filler C-feat Category: an improvement or new feature labels May 27, 2026

LouisTsai-Csie mentioned this pull request May 27, 2026

Stateful filling refactors & missing feature tracker #2910

Open

9 tasks

LouisTsai-Csie requested a review from marioevz May 27, 2026 13:30

LouisTsai-Csie added 2 commits May 27, 2026 14:04

refactor: batch receipt request

54f43ad

refactor: split large fill-stateful file

8477bfe

LouisTsai-Csie mentioned this pull request Jun 3, 2026

STEEL Weekly Meeting, Wed 3rd June, 2026 ethsteel/pm#70

Closed

marioevz requested changes Jun 3, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(test-fill): align `fill-stateful` with `gas-benchmarks` implementation#2923

feat(test-fill): align `fill-stateful` with `gas-benchmarks` implementation#2923
LouisTsai-Csie wants to merge 5 commits into
ethereum:forks/amsterdamfrom
LouisTsai-Csie:refactor-stateful-filling

LouisTsai-Csie commented May 27, 2026 •

edited

Loading

Uh oh!

codecov Bot commented May 27, 2026 •

edited

Loading

Uh oh!

marioevz left a comment

Uh oh!

marioevz Jun 3, 2026

Uh oh!

marioevz Jun 3, 2026

Uh oh!

marioevz Jun 3, 2026

Uh oh!

marioevz Jun 3, 2026

Uh oh!

marioevz Jun 3, 2026

Uh oh!

marioevz Jun 3, 2026

Uh oh!

marioevz Jun 3, 2026

Uh oh!

marioevz Jun 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		@pytest.fixture(autouse=True, scope="function")
		def _reset_chain_between_tests(

Conversation

LouisTsai-Csie commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🗒️ Description

Commit d198221: Replace debug_setHead with FCU approach.

Commit 0252bec: Add a new CLI flag --gas-bump-blocks.

Commit 05f3c16: Add deterministic sender pooling.

Commit 54f43ad: Batch tx receipt request

Commit 8477bfe: refactor fill_stateful.py

Verification

🔗 Related Issues or PRs

✅ Checklist

Cute Animal Picture

Uh oh!

codecov Bot commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

marioevz left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

LouisTsai-Csie commented May 27, 2026 •

edited

Loading

Commit d198221: Replace `debug_setHead` with FCU approach.

Commit 0252bec: Add a new CLI flag `--gas-bump-blocks`.

Commit 8477bfe: refactor `fill_stateful.py`

codecov Bot commented May 27, 2026 •

edited

Loading