perf(evm): depth-indexed InterpreterExecContext pool for nested calls #482

Merged: zoowii merged 1 commit into DTVMStack:main from ys8888john:feat/perf-fold-stack-check-into-chunk on Apr 29, 2026

@ys8888john
Contributor

1. Does this PR affect any open issues?(Y/N) and add issue references (e.g. "fix #123", "re #123".):

  • N
  • Y

2. What is the scope of this PR (e.g. component or file name):

3. Provide a description of the PR(e.g. more details, effects, motivations or doc link):

  • Affects user behaviors
  • Contains CI/CD configuration changes
  • Contains documentation changes
  • Contains experimental features
  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Other

4. Are there any breaking changes?(Y/N) and describe the breaking changes(e.g. more details, motivations or doc link):

  • N
  • Y

5. Are there test cases for these changes?(Y/N) select and add more details, references or doc links:

  • Unit test
  • Integration test
  • Benchmark (add benchmark stats below)
  • Manual test (add detailed scripts or steps below)
  • Other

6. Release note

None

Comment thread src/evm/interpreter.cpp
Contributor

Copilot AI left a comment


Pull request overview

This PR significantly changes the EVM execution hot path to reduce interpreter overhead and introduce a profile-guided, background JIT compilation flow, while also adding a depth-indexed InterpreterExecContext pool to avoid per-nested-call allocations.

Changes:

  • Reuse InterpreterExecContext per call depth to eliminate repeated large frame-stack allocations in deeply nested calls.
  • Add profile-guided JIT triggering with a sliding-window profiler and a background compilation thread pool, plus new config/CLI options.
  • Inline several “pure”/hot EVM opcodes in the interpreter dispatch loop to reduce handler overhead.
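
A minimal sketch of the depth-indexed reuse idea from the first bullet, under stated assumptions. `ExecContext` and `DepthIndexedPool` are hypothetical stand-ins; the real `InterpreterExecContext` and the pool in `src/vm/dt_evmc_vm.cpp` are not shown on this page and may be organized differently. The point illustrated is only that a nested call at depth N reuses the slot for depth N, so the large frame-stack allocation is paid once per depth rather than once per call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for InterpreterExecContext: only the "large,
// reusable buffer" aspect is modeled here.
struct ExecContext {
  std::vector<std::uint64_t> FrameStack; // large allocation we want to reuse
  std::size_t UseCount = 0;              // number of calls served by this slot

  void reset() { FrameStack.clear(); } // drops contents, keeps capacity
};

// One context per call depth, created lazily the first time a depth is seen.
class DepthIndexedPool {
public:
  static constexpr std::size_t MaxDepth = 1024; // EVM call-depth limit

  ExecContext &acquire(std::size_t Depth) {
    assert(Depth <= MaxDepth && "call depth exceeds the EVM limit");
    if (Depth >= Slots.size())
      Slots.resize(Depth + 1);
    if (!Slots[Depth]) {
      Slots[Depth] = std::make_unique<ExecContext>();
      Slots[Depth]->FrameStack.reserve(1024); // paid once per depth
      ++Allocations;
    }
    ExecContext &Ctx = *Slots[Depth];
    Ctx.reset();
    ++Ctx.UseCount;
    return Ctx;
  }

  std::size_t allocations() const { return Allocations; }

private:
  // unique_ptr keeps each context at a stable address when Slots grows.
  std::vector<std::unique_ptr<ExecContext>> Slots;
  std::size_t Allocations = 0;
};
```

Indexing by depth works because a frame at depth N is always finished before another frame at depth N starts, so a slot is never needed by two live calls at once. Whether the real pool is per-thread or per-VM instance is not stated here.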

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| src/vm/dt_evmc_vm.cpp | Adds depth-indexed exec-context reuse and implements profile-guided JIT + background compilation plumbing in execute() |
| src/tests/spec_unit_tests.cpp | Adds --enable-profile-guided-jit CLI flag for the spec test runner |
| src/tests/solidity_contract_tests.cpp | Adds --enable-profile-guided-jit flag but removes --enable-multipass-lazy |
| src/runtime/runtime.cpp | Switches between interpreter/JIT based on whether JIT code is actually available |
| src/runtime/evm_module.h | Makes JIT code pointer atomic and adds a std::future for background compilation |
| src/runtime/evm_module.cpp | Waits for any in-flight background JIT compilation in EVMModule destructor; skips eager compile when PGJ is enabled |
| src/runtime/config.h | Adds EnableProfileGuidedJIT and NumJITCompileThreads runtime config fields |
| src/evm/interpreter.cpp | Inlines several opcode implementations inside the dispatch loop (calldata/txcontext/memory/misc/transient storage) |
| src/compiler/evm_frontend/evm_mir_compiler.cpp | Minor comment tweak in MIR builder init |
| src/compiler/evm_compiler.cpp | Wraps eager EVM JIT compile in try/catch and changes how JIT code pointer is published |
| src/cli/dtvm.cpp | Adds --enable-profile-guided-jit to the CLI |
| src/action/compiler.cpp | Removes the lazy-compilation warning branch and always runs eager EVM JIT compile in multipass mode |


Comment thread src/vm/dt_evmc_vm.cpp Outdated
Comment thread src/vm/dt_evmc_vm.cpp Outdated
Comment thread src/vm/dt_evmc_vm.cpp Outdated
Comment thread src/runtime/config.h Outdated
Comment thread src/compiler/evm_compiler.cpp Outdated
Comment thread src/tests/solidity_contract_tests.cpp
Comment thread src/vm/dt_evmc_vm.cpp Outdated
@github-actions

github-actions Bot commented Apr 28, 2026

⚡ Performance Regression Check Results

✅ Performance Check Passed (interpreter)

Performance Benchmark Results (threshold: 25%)

Benchmark Baseline (us) Current (us) Change Status
total/main/blake2b_huff/8415nulls 1.56 1.52 -2.4% PASS
total/main/blake2b_huff/empty 0.02 0.02 -2.4% PASS
total/main/blake2b_shifts/8415nulls 11.67 11.90 +2.0% PASS
total/main/sha1_divs/5311 5.16 5.12 -0.9% PASS
total/main/sha1_divs/empty 0.06 0.06 -0.0% PASS
total/main/sha1_shifts/5311 2.86 2.85 -0.4% PASS
total/main/sha1_shifts/empty 0.04 0.04 +1.2% PASS
total/main/snailtracer/benchmark 53.03 52.98 -0.1% PASS
total/main/structarray_alloc/nfts_rank 1.01 0.98 -2.4% PASS
total/main/swap_math/insufficient_liquidity 0.00 0.00 +1.8% PASS
total/main/swap_math/received 0.00 0.00 +1.6% PASS
total/main/swap_math/spent 0.00 0.00 +2.1% PASS
total/main/weierstrudel/1 0.29 0.28 -1.7% PASS
total/main/weierstrudel/15 3.16 3.15 -0.5% PASS
total/micro/JUMPDEST_n0/empty 1.47 1.64 +11.5% PASS
total/micro/jump_around/empty 0.09 0.09 -1.8% PASS
total/micro/loop_with_many_jumpdests/empty 22.55 21.71 -3.7% PASS
total/micro/memory_grow_mload/by1 0.09 0.10 +9.0% PASS
total/micro/memory_grow_mload/by16 0.10 0.10 +6.3% PASS
total/micro/memory_grow_mload/by32 0.11 0.11 -5.1% PASS
total/micro/memory_grow_mload/nogrow 0.08 0.09 +10.2% PASS
total/micro/memory_grow_mstore/by1 0.10 0.10 +3.0% PASS
total/micro/memory_grow_mstore/by16 0.11 0.11 +0.8% PASS
total/micro/memory_grow_mstore/by32 0.12 0.12 -3.0% PASS
total/micro/memory_grow_mstore/nogrow 0.10 0.10 +0.4% PASS
total/micro/signextend/one 0.23 0.23 -1.4% PASS
total/micro/signextend/zero 0.23 0.23 -0.4% PASS
total/synth/ADD/b0 3.13 2.00 -36.3% PASS
total/synth/ADD/b1 1.97 2.00 +1.4% PASS
total/synth/ADDRESS/a0 4.98 4.83 -3.0% PASS
total/synth/ADDRESS/a1 5.36 5.43 +1.3% PASS
total/synth/AND/b0 1.71 1.71 +0.0% PASS
total/synth/AND/b1 1.71 1.72 +0.3% PASS
total/synth/BYTE/b0 6.08 6.12 +0.7% PASS
total/synth/BYTE/b1 4.74 4.76 +0.3% PASS
total/synth/CALLDATASIZE/a0 3.10 3.58 +15.7% PASS
total/synth/CALLDATASIZE/a1 4.10 3.60 -12.2% PASS
total/synth/CALLER/a0 4.96 4.82 -2.8% PASS
total/synth/CALLER/a1 5.37 5.41 +0.8% PASS
total/synth/CALLVALUE/a0 3.26 3.76 +15.3% PASS
total/synth/CALLVALUE/a1 3.52 3.76 +6.8% PASS
total/synth/CODESIZE/a0 3.67 3.98 +8.5% PASS
total/synth/CODESIZE/a1 3.93 4.00 +2.0% PASS
total/synth/DUP1/d0 1.06 1.39 +30.4% PASS
total/synth/DUP1/d1 1.31 1.40 +6.7% PASS
total/synth/DUP10/d0 1.15 1.31 +13.7% PASS
total/synth/DUP10/d1 1.10 1.40 +26.7% PASS
total/synth/DUP11/d0 1.22 1.31 +7.0% PASS
total/synth/DUP11/d1 1.00 1.40 +39.3% PASS
total/synth/DUP12/d0 1.15 1.39 +21.3% PASS
total/synth/DUP12/d1 1.23 1.40 +13.7% PASS
total/synth/DUP13/d0 1.15 1.31 +13.8% PASS
total/synth/DUP13/d1 1.14 1.16 +1.8% PASS
total/synth/DUP14/d0 1.22 1.31 +7.4% PASS
total/synth/DUP14/d1 1.23 1.40 +13.2% PASS
total/synth/DUP15/d0 1.22 1.27 +4.0% PASS
total/synth/DUP15/d1 1.00 1.40 +39.4% PASS
total/synth/DUP16/d0 1.15 1.07 -6.9% PASS
total/synth/DUP16/d1 1.23 1.40 +13.2% PASS
total/synth/DUP2/d0 1.14 1.30 +14.2% PASS
total/synth/DUP2/d1 1.16 1.40 +20.5% PASS
total/synth/DUP3/d0 1.14 1.31 +14.2% PASS
total/synth/DUP3/d1 1.23 1.40 +13.3% PASS
total/synth/DUP4/d0 1.14 1.20 +5.4% PASS
total/synth/DUP4/d1 1.23 1.40 +13.3% PASS
total/synth/DUP5/d0 1.22 1.30 +6.5% PASS
total/synth/DUP5/d1 1.00 1.39 +39.5% PASS
total/synth/DUP6/d0 1.14 1.31 +14.4% PASS
total/synth/DUP6/d1 1.00 1.16 +15.8% PASS
total/synth/DUP7/d0 0.96 1.19 +24.0% PASS
total/synth/DUP7/d1 1.00 1.40 +39.7% PASS
total/synth/DUP8/d0 1.06 1.15 +8.2% PASS
total/synth/DUP8/d1 1.01 1.39 +38.6% PASS
total/synth/DUP9/d0 1.15 1.07 -7.0% PASS
total/synth/DUP9/d1 1.23 1.39 +12.7% PASS
total/synth/EQ/b0 2.76 2.76 -0.1% PASS
total/synth/EQ/b1 1.40 1.34 -4.0% PASS
total/synth/GAS/a0 3.91 3.83 -2.1% PASS
total/synth/GAS/a1 3.92 3.84 -2.1% PASS
total/synth/GT/b0 2.62 2.60 -0.5% PASS
total/synth/GT/b1 1.48 1.56 +5.8% PASS
total/synth/ISZERO/u0 1.15 1.15 -0.0% PASS
total/synth/JUMPDEST/n0 1.47 1.72 +17.0% PASS
total/synth/LT/b0 2.62 2.60 -0.6% PASS
total/synth/LT/b1 1.48 1.56 +5.7% PASS
total/synth/MSIZE/a0 4.25 4.26 +0.3% PASS
total/synth/MSIZE/a1 4.75 4.82 +1.4% PASS
total/synth/MUL/b0 5.41 5.32 -1.6% PASS
total/synth/MUL/b1 5.40 5.39 -0.2% PASS
total/synth/NOT/u0 1.68 1.84 +10.0% PASS
total/synth/OR/b0 1.64 1.64 +0.1% PASS
total/synth/OR/b1 1.71 1.72 +0.1% PASS
total/synth/PC/a0 3.42 3.59 +4.8% PASS
total/synth/PC/a1 3.99 3.59 -9.8% PASS
total/synth/PUSH1/p0 1.14 1.13 -1.4% PASS
total/synth/PUSH1/p1 1.31 1.38 +4.8% PASS
total/synth/PUSH10/p0 1.18 1.15 -2.7% PASS
total/synth/PUSH10/p1 1.33 1.41 +6.1% PASS
total/synth/PUSH11/p0 1.22 1.15 -5.8% PASS
total/synth/PUSH11/p1 1.33 1.42 +6.4% PASS
total/synth/PUSH12/p0 1.15 1.15 +0.1% PASS
total/synth/PUSH12/p1 1.34 1.41 +5.9% PASS
total/synth/PUSH13/p0 1.15 1.15 -0.1% PASS
total/synth/PUSH13/p1 1.33 1.41 +6.3% PASS
total/synth/PUSH14/p0 1.20 1.15 -3.3% PASS
total/synth/PUSH14/p1 1.34 1.42 +5.8% PASS
total/synth/PUSH15/p0 1.15 1.15 +0.0% PASS
total/synth/PUSH15/p1 1.41 1.52 +8.2% PASS
total/synth/PUSH16/p0 1.22 1.15 -5.6% PASS
total/synth/PUSH16/p1 1.36 1.41 +3.6% PASS
total/synth/PUSH17/p0 1.22 1.15 -5.7% PASS
total/synth/PUSH17/p1 1.32 1.41 +7.2% PASS
total/synth/PUSH18/p0 1.15 1.15 -0.2% PASS
total/synth/PUSH18/p1 1.34 1.42 +5.8% PASS
total/synth/PUSH19/p0 1.15 1.21 +5.7% PASS
total/synth/PUSH19/p1 1.34 1.40 +5.1% PASS
total/synth/PUSH2/p0 1.15 1.13 -0.9% PASS
total/synth/PUSH2/p1 1.32 1.38 +4.9% PASS
total/synth/PUSH20/p0 1.21 1.12 -7.5% PASS
total/synth/PUSH20/p1 1.32 1.41 +6.6% PASS
total/synth/PUSH21/p0 1.15 1.15 +0.1% PASS
total/synth/PUSH21/p1 1.33 1.42 +6.3% PASS
total/synth/PUSH22/p0 1.22 1.15 -5.5% PASS
total/synth/PUSH22/p1 1.33 1.40 +5.9% PASS
total/synth/PUSH23/p0 1.15 1.15 +0.1% PASS
total/synth/PUSH23/p1 1.33 1.44 +7.8% PASS
total/synth/PUSH24/p0 1.15 1.15 -0.1% PASS
total/synth/PUSH24/p1 1.34 1.42 +6.1% PASS
total/synth/PUSH25/p0 1.22 1.18 -3.0% PASS
total/synth/PUSH25/p1 1.34 1.42 +6.1% PASS
total/synth/PUSH26/p0 1.15 1.15 -0.1% PASS
total/synth/PUSH26/p1 1.35 1.43 +6.4% PASS
total/synth/PUSH27/p0 1.17 1.15 -1.6% PASS
total/synth/PUSH27/p1 1.34 1.41 +5.2% PASS
total/synth/PUSH28/p0 1.15 1.12 -2.4% PASS
total/synth/PUSH28/p1 1.34 1.42 +5.9% PASS
total/synth/PUSH29/p0 1.11 1.07 -4.0% PASS
total/synth/PUSH29/p1 1.34 1.41 +5.2% PASS
total/synth/PUSH3/p0 1.19 1.15 -3.2% PASS
total/synth/PUSH3/p1 1.32 1.41 +7.0% PASS
total/synth/PUSH30/p0 1.17 1.16 -1.0% PASS
total/synth/PUSH30/p1 1.35 1.43 +6.2% PASS
total/synth/PUSH31/p0 1.21 1.15 -5.0% PASS
total/synth/PUSH31/p1 1.46 1.52 +4.2% PASS
total/synth/PUSH32/p0 1.15 1.15 -0.0% PASS
total/synth/PUSH32/p1 1.36 1.43 +5.0% PASS
total/synth/PUSH4/p0 0.99 1.11 +12.3% PASS
total/synth/PUSH4/p1 1.33 1.39 +5.0% PASS
total/synth/PUSH5/p0 1.19 1.15 -3.4% PASS
total/synth/PUSH5/p1 1.33 1.42 +6.9% PASS
total/synth/PUSH6/p0 1.15 1.15 -0.1% PASS
total/synth/PUSH6/p1 1.30 1.42 +8.7% PASS
total/synth/PUSH7/p0 1.15 1.11 -2.8% PASS
total/synth/PUSH7/p1 1.33 1.42 +7.0% PASS
total/synth/PUSH8/p0 1.07 1.15 +7.3% PASS
total/synth/PUSH8/p1 1.34 1.40 +4.7% PASS
total/synth/PUSH9/p0 1.15 1.15 +0.0% PASS
total/synth/PUSH9/p1 1.29 1.40 +8.5% PASS
total/synth/RETURNDATASIZE/a0 3.51 3.99 +13.8% PASS
total/synth/RETURNDATASIZE/a1 3.76 4.00 +6.3% PASS
total/synth/SAR/b0 3.77 3.79 +0.6% PASS
total/synth/SAR/b1 4.31 4.31 -0.1% PASS
total/synth/SGT/b0 2.60 2.60 -0.0% PASS
total/synth/SGT/b1 1.64 1.56 -4.6% PASS
total/synth/SHL/b0 3.03 3.04 +0.1% PASS
total/synth/SHL/b1 1.60 1.56 -2.3% PASS
total/synth/SHR/b0 2.94 2.93 -0.1% PASS
total/synth/SHR/b1 1.56 1.52 -2.4% PASS
total/synth/SIGNEXTEND/b0 3.12 3.66 +17.4% PASS
total/synth/SIGNEXTEND/b1 3.22 3.72 +15.3% PASS
total/synth/SLT/b0 2.60 2.61 +0.7% PASS
total/synth/SLT/b1 1.50 1.48 -1.5% PASS
total/synth/SUB/b0 3.13 1.98 -36.8% PASS
total/synth/SUB/b1 1.97 1.98 +0.2% PASS
total/synth/SWAP1/s0 1.49 1.49 -0.1% PASS
total/synth/SWAP10/s0 1.50 1.50 -0.1% PASS
total/synth/SWAP11/s0 1.50 1.50 -0.1% PASS
total/synth/SWAP12/s0 1.50 1.50 +0.1% PASS
total/synth/SWAP13/s0 1.51 1.51 -0.0% PASS
total/synth/SWAP14/s0 1.51 1.51 +0.1% PASS
total/synth/SWAP15/s0 1.51 1.51 -0.0% PASS
total/synth/SWAP16/s0 1.51 1.51 +0.1% PASS
total/synth/SWAP2/s0 1.49 1.49 +0.1% PASS
total/synth/SWAP3/s0 1.49 1.49 +0.1% PASS
total/synth/SWAP4/s0 1.49 1.49 +0.2% PASS
total/synth/SWAP5/s0 1.49 1.50 +0.2% PASS
total/synth/SWAP6/s0 1.49 1.50 +0.2% PASS
total/synth/SWAP7/s0 1.50 1.50 +0.0% PASS
total/synth/SWAP8/s0 1.50 1.50 +0.1% PASS
total/synth/SWAP9/s0 1.50 1.50 +0.1% PASS
total/synth/XOR/b0 1.55 1.55 -0.1% PASS
total/synth/XOR/b1 1.55 1.56 +0.3% PASS
total/synth/loop_v1 4.58 4.32 -5.7% PASS
total/synth/loop_v2 4.59 4.30 -6.2% PASS

Summary: 194 benchmarks, 0 regressions


✅ Performance Check Passed (multipass)

Performance Benchmark Results (threshold: 25%)

Benchmark Baseline (us) Current (us) Change Status
total/main/blake2b_huff/8415nulls 0.85 0.86 +0.9% PASS
total/main/blake2b_huff/empty 0.01 0.01 -1.4% PASS
total/main/blake2b_shifts/8415nulls 4.56 4.61 +0.9% PASS
total/main/sha1_divs/5311 0.59 0.59 -0.1% PASS
total/main/sha1_divs/empty 0.01 0.01 +0.2% PASS
total/main/sha1_shifts/5311 0.55 0.55 +0.4% PASS
total/main/sha1_shifts/empty 0.01 0.01 +0.2% PASS
total/main/snailtracer/benchmark 31.34 32.19 +2.7% PASS
total/main/structarray_alloc/nfts_rank 0.31 0.31 -0.3% PASS
total/main/swap_math/insufficient_liquidity 0.00 0.00 -1.4% PASS
total/main/swap_math/received 0.00 0.00 -1.7% PASS
total/main/swap_math/spent 0.00 0.00 -0.5% PASS
total/main/weierstrudel/1 0.24 0.24 -0.9% PASS
total/main/weierstrudel/15 2.60 2.60 -0.0% PASS
total/micro/JUMPDEST_n0/empty 0.00 0.00 +0.9% PASS
total/micro/jump_around/empty 0.05 0.06 +1.9% PASS
total/micro/loop_with_many_jumpdests/empty 0.00 0.00 +0.2% PASS
total/micro/memory_grow_mload/by1 0.01 0.01 -0.2% PASS
total/micro/memory_grow_mload/by16 0.01 0.01 -0.9% PASS
total/micro/memory_grow_mload/by32 0.01 0.01 -0.6% PASS
total/micro/memory_grow_mload/nogrow 0.01 0.01 +0.4% PASS
total/micro/memory_grow_mstore/by1 0.01 0.01 -0.5% PASS
total/micro/memory_grow_mstore/by16 0.02 0.02 -0.6% PASS
total/micro/memory_grow_mstore/by32 0.02 0.02 -0.6% PASS
total/micro/memory_grow_mstore/nogrow 0.01 0.01 +0.1% PASS
total/micro/signextend/one 0.07 0.08 +2.8% PASS
total/micro/signextend/zero 0.07 0.08 +2.6% PASS
total/synth/ADD/b0 0.00 0.00 +1.3% PASS
total/synth/ADD/b1 0.00 0.00 +1.6% PASS
total/synth/ADDRESS/a0 0.15 0.15 +0.0% PASS
total/synth/ADDRESS/a1 0.15 0.15 -0.0% PASS
total/synth/AND/b0 0.00 0.00 +1.1% PASS
total/synth/AND/b1 0.00 0.00 +1.4% PASS
total/synth/BYTE/b0 0.00 0.00 +1.1% PASS
total/synth/BYTE/b1 0.00 0.00 +1.3% PASS
total/synth/CALLDATASIZE/a0 0.07 0.07 -0.2% PASS
total/synth/CALLDATASIZE/a1 0.07 0.07 -0.0% PASS
total/synth/CALLER/a0 0.18 0.18 +0.1% PASS
total/synth/CALLER/a1 0.18 0.18 +0.0% PASS
total/synth/CALLVALUE/a0 0.26 0.26 +0.3% PASS
total/synth/CALLVALUE/a1 0.26 0.26 +0.3% PASS
total/synth/CODESIZE/a0 0.07 0.07 -0.3% PASS
total/synth/CODESIZE/a1 0.07 0.07 -0.2% PASS
total/synth/DUP1/d0 0.00 0.00 +1.2% PASS
total/synth/DUP1/d1 0.00 0.00 +1.6% PASS
total/synth/DUP10/d0 0.00 0.00 +1.6% PASS
total/synth/DUP10/d1 0.00 0.00 +1.3% PASS
total/synth/DUP11/d0 0.00 0.00 +1.3% PASS
total/synth/DUP11/d1 0.00 0.00 +0.8% PASS
total/synth/DUP12/d0 0.00 0.00 +1.3% PASS
total/synth/DUP12/d1 0.00 0.00 +1.3% PASS
total/synth/DUP13/d0 0.00 0.00 +1.6% PASS
total/synth/DUP13/d1 0.00 0.00 +1.3% PASS
total/synth/DUP14/d0 0.00 0.00 +1.2% PASS
total/synth/DUP14/d1 0.00 0.00 +1.4% PASS
total/synth/DUP15/d0 0.00 0.00 +1.2% PASS
total/synth/DUP15/d1 0.00 0.00 +0.8% PASS
total/synth/DUP16/d0 0.00 0.00 +1.3% PASS
total/synth/DUP16/d1 0.00 0.00 +1.4% PASS
total/synth/DUP2/d0 0.00 0.00 +1.1% PASS
total/synth/DUP2/d1 0.00 0.00 +1.4% PASS
total/synth/DUP3/d0 0.00 0.00 +1.2% PASS
total/synth/DUP3/d1 0.00 0.00 +1.3% PASS
total/synth/DUP4/d0 0.00 0.00 +1.3% PASS
total/synth/DUP4/d1 0.00 0.00 +1.1% PASS
total/synth/DUP5/d0 0.00 0.00 +1.3% PASS
total/synth/DUP5/d1 0.00 0.00 +1.2% PASS
total/synth/DUP6/d0 0.00 0.00 +0.0% PASS
total/synth/DUP6/d1 0.00 0.00 +1.1% PASS
total/synth/DUP7/d0 0.00 0.00 +1.2% PASS
total/synth/DUP7/d1 0.00 0.00 +1.1% PASS
total/synth/DUP8/d0 0.00 0.00 +1.0% PASS
total/synth/DUP8/d1 0.00 0.00 +1.7% PASS
total/synth/DUP9/d0 0.00 0.00 +1.4% PASS
total/synth/DUP9/d1 0.00 0.00 +1.3% PASS
total/synth/EQ/b0 0.00 0.00 +1.3% PASS
total/synth/EQ/b1 0.00 0.00 +1.7% PASS
total/synth/GAS/a0 0.76 0.76 +0.0% PASS
total/synth/GAS/a1 0.76 0.76 -0.0% PASS
total/synth/GT/b0 0.00 0.00 +1.3% PASS
total/synth/GT/b1 0.00 0.00 +1.4% PASS
total/synth/ISZERO/u0 0.00 0.00 +1.4% PASS
total/synth/JUMPDEST/n0 0.00 0.00 +1.2% PASS
total/synth/LT/b0 0.00 0.00 +1.5% PASS
total/synth/LT/b1 0.00 0.00 +1.0% PASS
total/synth/MSIZE/a0 0.00 0.00 +1.2% PASS
total/synth/MSIZE/a1 0.00 0.00 +1.7% PASS
total/synth/MUL/b0 0.00 0.00 +1.3% PASS
total/synth/MUL/b1 0.00 0.00 +1.4% PASS
total/synth/NOT/u0 0.00 0.00 +1.3% PASS
total/synth/OR/b0 0.00 0.00 +1.2% PASS
total/synth/OR/b1 0.00 0.00 +1.4% PASS
total/synth/PC/a0 0.00 0.00 +1.7% PASS
total/synth/PC/a1 0.00 0.00 +1.2% PASS
total/synth/PUSH1/p0 0.00 0.00 +1.5% PASS
total/synth/PUSH1/p1 0.00 0.00 +1.3% PASS
total/synth/PUSH10/p0 0.00 0.00 +1.2% PASS
total/synth/PUSH10/p1 0.00 0.00 +1.4% PASS
total/synth/PUSH11/p0 0.00 0.00 +1.3% PASS
total/synth/PUSH11/p1 0.00 0.00 +1.4% PASS
total/synth/PUSH12/p0 0.00 0.00 +1.4% PASS
total/synth/PUSH12/p1 0.00 0.00 +1.3% PASS
total/synth/PUSH13/p0 0.00 0.00 +1.1% PASS
total/synth/PUSH13/p1 0.00 0.00 +1.3% PASS
total/synth/PUSH14/p0 0.00 0.00 +1.3% PASS
total/synth/PUSH14/p1 0.00 0.00 +1.3% PASS
total/synth/PUSH15/p0 0.00 0.00 +1.4% PASS
total/synth/PUSH15/p1 0.00 0.00 +1.7% PASS
total/synth/PUSH16/p0 0.00 0.00 +1.1% PASS
total/synth/PUSH16/p1 0.00 0.00 +1.2% PASS
total/synth/PUSH17/p0 0.00 0.00 +1.4% PASS
total/synth/PUSH17/p1 0.00 0.00 +1.1% PASS
total/synth/PUSH18/p0 0.00 0.00 +1.3% PASS
total/synth/PUSH18/p1 0.00 0.00 +1.3% PASS
total/synth/PUSH19/p0 0.00 0.00 +1.5% PASS
total/synth/PUSH19/p1 0.00 0.00 +1.3% PASS
total/synth/PUSH2/p0 0.00 0.00 +1.3% PASS
total/synth/PUSH2/p1 0.00 0.00 +1.3% PASS
total/synth/PUSH20/p0 0.00 0.00 +1.6% PASS
total/synth/PUSH20/p1 0.00 0.00 +1.3% PASS
total/synth/PUSH21/p0 0.00 0.00 +0.7% PASS
total/synth/PUSH21/p1 0.00 0.00 +0.9% PASS
total/synth/PUSH22/p0 1.16 1.14 -1.4% PASS
total/synth/PUSH22/p1 1.34 1.45 +8.1% PASS
total/synth/PUSH23/p0 1.23 1.15 -6.5% PASS
total/synth/PUSH23/p1 1.42 1.43 +0.7% PASS
total/synth/PUSH24/p0 1.15 1.15 -0.1% PASS
total/synth/PUSH24/p1 1.34 1.43 +6.7% PASS
total/synth/PUSH25/p0 1.15 1.15 +0.1% PASS
total/synth/PUSH25/p1 1.34 1.43 +7.3% PASS
total/synth/PUSH26/p0 0.99 0.83 -15.8% PASS
total/synth/PUSH26/p1 1.35 1.45 +7.5% PASS
total/synth/PUSH27/p0 1.23 1.15 -6.5% PASS
total/synth/PUSH27/p1 1.34 1.43 +6.2% PASS
total/synth/PUSH28/p0 1.17 1.15 -1.8% PASS
total/synth/PUSH28/p1 1.35 1.43 +5.5% PASS
total/synth/PUSH29/p0 1.15 1.15 +0.1% PASS
total/synth/PUSH29/p1 1.34 1.46 +9.0% PASS
total/synth/PUSH3/p0 0.00 0.00 +1.3% PASS
total/synth/PUSH3/p1 0.00 0.00 +1.3% PASS
total/synth/PUSH30/p0 1.18 1.16 -2.3% PASS
total/synth/PUSH30/p1 1.35 1.46 +8.0% PASS
total/synth/PUSH31/p0 1.15 1.15 +0.1% PASS
total/synth/PUSH31/p1 1.48 1.55 +4.4% PASS
total/synth/PUSH32/p0 1.15 1.15 +0.2% PASS
total/synth/PUSH32/p1 1.36 1.45 +7.3% PASS
total/synth/PUSH4/p0 0.00 0.00 +1.2% PASS
total/synth/PUSH4/p1 0.00 0.00 +1.1% PASS
total/synth/PUSH5/p0 0.00 0.00 +1.3% PASS
total/synth/PUSH5/p1 0.00 0.00 +1.2% PASS
total/synth/PUSH6/p0 0.00 0.00 +1.3% PASS
total/synth/PUSH6/p1 0.00 0.00 +1.3% PASS
total/synth/PUSH7/p0 0.00 0.00 +1.1% PASS
total/synth/PUSH7/p1 0.00 0.00 +1.2% PASS
total/synth/PUSH8/p0 0.00 0.00 +1.6% PASS
total/synth/PUSH8/p1 0.00 0.00 +1.5% PASS
total/synth/PUSH9/p0 0.00 0.00 +1.3% PASS
total/synth/PUSH9/p1 0.00 0.00 +1.2% PASS
total/synth/RETURNDATASIZE/a0 0.03 0.03 +0.0% PASS
total/synth/RETURNDATASIZE/a1 0.03 0.03 +0.0% PASS
total/synth/SAR/b0 0.00 0.00 +1.4% PASS
total/synth/SAR/b1 0.00 0.00 +1.4% PASS
total/synth/SGT/b0 0.00 0.00 +1.1% PASS
total/synth/SGT/b1 0.00 0.00 +1.3% PASS
total/synth/SHL/b0 0.00 0.00 +1.4% PASS
total/synth/SHL/b1 0.00 0.00 +1.3% PASS
total/synth/SHR/b0 0.00 0.00 +0.9% PASS
total/synth/SHR/b1 0.00 0.00 +1.3% PASS
total/synth/SIGNEXTEND/b0 0.00 0.00 +1.4% PASS
total/synth/SIGNEXTEND/b1 0.00 0.00 +1.0% PASS
total/synth/SLT/b0 0.00 0.00 +1.3% PASS
total/synth/SLT/b1 0.00 0.00 +1.4% PASS
total/synth/SUB/b0 0.00 0.00 +1.3% PASS
total/synth/SUB/b1 0.00 0.00 +1.4% PASS
total/synth/SWAP1/s0 0.00 0.00 +1.5% PASS
total/synth/SWAP10/s0 0.00 0.00 +1.3% PASS
total/synth/SWAP11/s0 0.00 0.00 +1.3% PASS
total/synth/SWAP12/s0 0.00 0.00 +1.3% PASS
total/synth/SWAP13/s0 0.00 0.00 +1.4% PASS
total/synth/SWAP14/s0 0.00 0.00 +1.3% PASS
total/synth/SWAP15/s0 0.00 0.00 +1.2% PASS
total/synth/SWAP16/s0 0.00 0.00 +1.3% PASS
total/synth/SWAP2/s0 0.00 0.00 +1.1% PASS
total/synth/SWAP3/s0 0.00 0.00 +1.3% PASS
total/synth/SWAP4/s0 0.00 0.00 +1.2% PASS
total/synth/SWAP5/s0 0.00 0.00 +0.7% PASS
total/synth/SWAP6/s0 0.00 0.00 +1.1% PASS
total/synth/SWAP7/s0 0.00 0.00 +1.0% PASS
total/synth/SWAP8/s0 0.00 0.00 +1.1% PASS
total/synth/SWAP9/s0 0.00 0.00 +0.7% PASS
total/synth/XOR/b0 0.00 0.00 +1.0% PASS
total/synth/XOR/b1 0.00 0.00 +1.4% PASS
total/synth/loop_v1 1.19 1.19 +0.7% PASS
total/synth/loop_v2 1.06 1.07 +0.8% PASS

Summary: 194 benchmarks, 0 regressions


@ys8888john ys8888john force-pushed the feat/perf-fold-stack-check-into-chunk branch from 47a7f32 to 99f4061 Compare April 28, 2026 09:47
Contributor

@starwarfan starwarfan left a comment


+1

Contributor

Copilot AI left a comment


Pull request overview

This PR targets interpreter-mode EVM performance by reducing per-call allocations for nested calls and inlining several hot opcodes directly into the computed-goto dispatch loop.

Changes:

  • Add a depth-indexed InterpreterExecContext pool to reuse execution contexts across nested EVMC calls.
  • Inline multiple “pure read” opcodes and several memory/misc/transient-storage opcodes in BaseInterpreter::interpret() to avoid handler overhead and reduce hot-path work.
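
To make the second bullet concrete, here is a toy computed-goto loop in which one opcode is inlined at its label and another goes through an out-of-line handler. This is a sketch, not the code in `BaseInterpreter::interpret()`: the opcode set, `addHandler`, and `interpret` are invented for illustration, the stack holds 64-bit values instead of 256-bit words, and gas accounting is omitted. It relies on the GCC/Clang "labels as values" extension.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy opcode set; numbering is illustrative and unrelated to real EVM opcodes.
enum Op : std::uint8_t { OP_STOP, OP_PUSH1, OP_ADD, OP_CALLDATASIZE };

// Out-of-line handler, standing in for a larger op that stays behind a call.
void addHandler(std::vector<std::uint64_t> &Stack) {
  assert(Stack.size() >= 2 && "stack underflow");
  const std::uint64_t Rhs = Stack.back();
  Stack.pop_back();
  Stack.back() += Rhs;
}

// Relies on the GCC/Clang "labels as values" extension (&&label, goto *ptr).
// Assumes well-formed bytecode that ends with OP_STOP.
std::uint64_t interpret(const std::vector<std::uint8_t> &Code,
                        std::uint64_t CallDataSize) {
  void *const Table[] = {&&DoStop, &&DoPush1, &&DoAdd, &&DoCallDataSize};
  std::vector<std::uint64_t> Stack;
  std::size_t Pc = 0;

#define DISPATCH()                                                             \
  do {                                                                         \
    assert(Pc < Code.size() && Code[Pc] <= OP_CALLDATASIZE);                   \
    goto *Table[Code[Pc++]];                                                   \
  } while (0)

  DISPATCH();

DoPush1:
  Stack.push_back(Code[Pc++]); // immediate byte follows the opcode
  DISPATCH();
DoAdd:
  addHandler(Stack); // dispatched through a handler call
  DISPATCH();
DoCallDataSize:
  Stack.push_back(CallDataSize); // inlined "pure read": no handler call
  DISPATCH();
DoStop:
  return Stack.empty() ? 0 : Stack.back();
#undef DISPATCH
}
```

The inlined case saves a call and lets the compiler keep loop state in registers across the opcode, at the cost of a longer dispatch function; that trade-off is what the review threads on this PR discuss.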

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/vm/dt_evmc_vm.cpp | Adds a per-depth InterpreterExecContext reuse pool for interpreter fast-path execution. |
| src/evm/interpreter.cpp | Inlines multiple opcode implementations inside the computed-goto interpreter loop for performance. |


Comment thread src/vm/dt_evmc_vm.cpp
Comment thread src/vm/dt_evmc_vm.cpp
Comment thread src/evm/interpreter.cpp Outdated
Comment on lines +1182 to +1194
// ---- Memory ops (inlined: MLOAD/MSTORE/MSTORE8) ----
// Each opcode performs:
//   1. stack underflow check
//   2. memory expansion + gas charge (mirror of
//      checkMemoryExpandAndChargeGas in opcode_handlers.cpp)
//   3. memcpy load/store
// The gas formula MUST match the canonical
// calculateMemoryExpansionCost EXACTLY:
//   MemoryCost(W) = (W*W)/512 + 3*W  (computed in __int128)
//   delta = MemoryCost(NewWords) - MemoryCost(CurrentWords)
//   NewWords/CurrentWords = ceil(size/32) where size is byte length.
// Any deviation (e.g. inlining the subtraction before the divide)
// can desynchronize gas accounting and break consensus.
Contributor Author


Agree the duplication is a maintenance concern. I'll extract a static inline helper for the expansion+charge sequence in a follow-up.
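
For reference, a sketch of what such a shared helper might look like, following only the formula quoted in the code comment above. The names `toWords`, `memoryCost`, and `chargeMemoryExpansion` are hypothetical and are not the project's `calculateMemoryExpansionCost` or `checkMemoryExpandAndChargeGas`; the signatures and the `int64_t` gas counter are assumptions. `__int128` is a GCC/Clang extension.

```cpp
#include <cassert>
#include <cstdint>

// Words = ceil(SizeBytes / 32), written so it cannot overflow near UINT64_MAX.
inline std::uint64_t toWords(std::uint64_t SizeBytes) {
  return SizeBytes / 32 + (SizeBytes % 32 != 0 ? 1 : 0);
}

// MemoryCost(W) = (W*W)/512 + 3*W, evaluated in 128-bit arithmetic so the
// square cannot overflow. The division floors each cost separately.
inline __int128 memoryCost(std::uint64_t Words) {
  const __int128 W = Words;
  return (W * W) / 512 + 3 * W;
}

// Hypothetical helper: charges the expansion delta against GasLeft.
// Returns false and leaves GasLeft untouched when gas is insufficient.
inline bool chargeMemoryExpansion(std::uint64_t CurrentSizeBytes,
                                  std::uint64_t NewSizeBytes,
                                  std::int64_t &GasLeft) {
  if (NewSizeBytes <= CurrentSizeBytes)
    return true; // no growth, nothing to charge
  const __int128 Delta =
      memoryCost(toWords(NewSizeBytes)) - memoryCost(toWords(CurrentSizeBytes));
  if (Delta > static_cast<__int128>(GasLeft))
    return false;
  GasLeft -= static_cast<std::int64_t>(Delta);
  return true;
}
```

A worked case shows why the order of operations matters. Growing from 16 to 32 words costs MemoryCost(32) - MemoryCost(16) = 98 - 48 = 50 gas. Folding the subtraction into the divide gives (1024 - 256)/512 + 3*16 = 1 + 48 = 49, one gas short.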

@zoowii
Contributor

zoowii commented Apr 28, 2026

CI failed. @ys8888john

@ys8888john ys8888john force-pushed the feat/perf-fold-stack-check-into-chunk branch 2 times, most recently from 512ee4e to 66d72f1 Compare April 28, 2026 13:54
Comment thread src/evm/interpreter.cpp Outdated
if (OffsetVal > intx::uint256(InputSize)) {
  Frame->Stack[sp - 1] = intx::uint256(0);
} else {
  // OffsetVal <= InputSize fits safely in size_t.
Contributor


Only inline small functions, so the code stays easy for humans to read.
The code in the else branch is too large to inline.

Contributor Author


This inline policy is inspired by evmone baseline, but for readability larger ops are still dispatched through *Handler::doExecute(), e.g. RETURN, RETURNDATACOPY, etc.
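
The bounds check quoted at the top of this thread looks like the offset guard of a CALLDATALOAD-style read, though the opcode is not named in the excerpt and the else branch is cut off. As a sketch of the kind of logic being debated, assuming the standard semantics (read 32 bytes at the offset, with bytes past the end of the input reading as zero): `Word` and `loadWordZeroPadded` are hypothetical names, and the offset is a 64-bit value here rather than `intx::uint256`.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using Word = std::array<std::uint8_t, 32>; // 32 bytes, most significant first

// Reads 32 bytes starting at Offset; bytes past the end of Input read as zero.
// Uses >= rather than > for the out-of-range test: at Offset == size both
// give an all-zero word, and >= avoids a zero-length memcpy.
inline Word loadWordZeroPadded(const std::vector<std::uint8_t> &Input,
                               std::uint64_t Offset) {
  Word Out{}; // value-initialized: all 32 bytes are zero
  if (Offset >= Input.size())
    return Out;
  // Offset < Input.size() here, so it fits safely in size_t.
  const std::size_t Start = static_cast<std::size_t>(Offset);
  const std::size_t CopySize = std::min<std::size_t>(32, Input.size() - Start);
  std::memcpy(Out.data(), Input.data() + Start, CopySize);
  return Out;
}
```

Kept as a small free function like this, the copy-and-pad path can be called from the dispatch loop or inlined by the compiler, which is one way to reconcile the readability concern with the performance goal.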

@ys8888john ys8888john force-pushed the feat/perf-fold-stack-check-into-chunk branch from 66d72f1 to cad7286 Compare April 29, 2026 01:20
@ys8888john ys8888john force-pushed the feat/perf-fold-stack-check-into-chunk branch from cad7286 to 1a62057 Compare April 29, 2026 02:06
@zoowii zoowii merged commit 5e5fddd into DTVMStack:main Apr 29, 2026
16 checks passed
abmcar added a commit to abmcar/DTVM that referenced this pull request May 12, 2026
After rebasing onto current upstream/main (which now includes DTVMStack#458 / DTVMStack#460
/ DTVMStack#482 / DTVMStack#483 perf work) and running a 10-rep evmone-bench on the 27 paper
benches, the cumulative PR delta has collapsed to noise (raw geomean
+1.15%, +0.46% after correcting a single-iteration outlier on
main/blake2b_shifts/8415nulls via a focused 20-rep re-measurement).
0 benches above the +/-25% CI gate.

The A-vs-PR-base -2.73% from this commit's own optimization is unchanged;
the framing shift is that the absolute runtime delta of the whole PR vs
unmodified main has been absorbed by the intervening upstream perf
optimizations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>