test(bench): Phase 2 real-flow CU bench — v1 vs new on Hadrian#171
Open
anil-rome wants to merge 5 commits into
added 5 commits
May 17, 2026 00:53
Deploys NEW (post-#165/#166/#167/#168/#169) versions of ERC20SPLFactory, SPL_ERC20 wBench wrapper, and RomeBridgeWithdraw side-by-side with v1 (currently deployed) contracts on Hadrian. Measures real production-contract CU on 6 user-facing flows via the same 3-sample real-receipt methodology as the primitive baseline.
Methodology: production-contract tx -> rome_solanaTxForEvmTx -> Solana getTransaction -> meta.computeUnitsConsumed. No synthetic probes; these are the real CU users pay.
Side-by-side architecture:
- NEW ERC20SPLFactory creates own ERC20Users + a wBench mint via the factory's own create_token_mint + init_token_mint path (exercises A4 + A5)
- NEW SPL_ERC20 wBench wraps the fresh mint we control; deployer mints supply via HelperProgram.mint_spl
- NEW RomeBridgeWithdraw constructor reuses v1's wUSDC + wETH wrappers so both bridges operate on the same wrapper state for fair compare
Captured (mean of 3 samples, Hadrian 2026-05-16/17):
Op                                  v1 CU     v2 CU      Delta CU   %
--                                  -----     -----      --------   -
SPL_ERC20.transfer                  261,432   262,724    +1,292     +0.5%
SPL_ERC20.approve                   414,104   211,997    -202,107   -48.8%
SPL_ERC20.transferFrom              258,015   263,313    +5,298     +2.1%
SPL_ERC20.bridgeOutToSolana         582,407   351,968    -230,439   -39.6%
RomeBridgeWithdraw.burnUSDC         952,642   799,637    -153,005   -16.1%
ERC20SPLFactory.create_token_mint   460,082   198,706    -261,376   -56.8%
ERC20SPLFactory.init_token_mint     229,548   (errored)  n/a        n/a
Per-flow expectations matched:
- transfer/transferFrom show ~0 delta as they were already migrated in #143/#163 (no regression confirmed).
- bridgeOutToSolana shows the A1 win (recipient ATA-create migration).
- burnUSDC shows the A3 + canonical-ATA win.
- create_token_mint shows the A4 (System CreateAccount via HelperProgram.create_mint_account) win.
Unmeasured this run:
- RomeBridgeWithdraw.approveBurnETH + burnETH (deployer has no wETH balance on Hadrian; bridging ETH in from Sepolia takes ~10-15 min, deferred)
- init_token_mint partial: v1 captured 1 of 3 samples; v2 errored with code -32000 (likely insufficient PDA lamports after multiple create_token_mint calls drained the reserve; future runs should top up more aggressively)
Artifacts:
- deployments/hadrian.real-flow-bench.json: v1 + v2 addresses + wBench mint identity
- deployments/hadrian.real-flow-bench.results.json: per-sample EVM tx hashes + Solana sigs + CU readings
Scripts:
- scripts/bench/deploy-real-flow-v2.ts: single-shot deploy harness
- scripts/bench/measure-real-flows.ts: 9-op x 2-version x 3-sample bench runner
Replay:
export HARDHAT_VAR_HADRIAN_PRIVATE_KEY=...
npx hardhat run scripts/bench/deploy-real-flow-v2.ts --network hadrian
npx hardhat run scripts/bench/measure-real-flows.ts --network hadrian
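The receipt-to-CU lookup described in this commit (EVM tx -> rome_solanaTxForEvmTx -> getTransaction -> meta.computeUnitsConsumed) could look roughly like the sketch below. It is not the code in measure-real-flows.ts. The `RpcCall` transport, the function names, and the parameter and return shape of `rome_solanaTxForEvmTx` (one EVM tx hash in, an array of Solana signatures out) are assumptions, as is summing CU when more than one signature comes back. Only the `getTransaction` method and its `meta.computeUnitsConsumed` field come from the standard Solana JSON-RPC API.

```typescript
// Minimal JSON-RPC transport, injected so the sketch stays network-free (hypothetical type).
type RpcCall = (url: string, method: string, params: unknown[]) => Promise<unknown>;

// Subset of the Solana getTransaction result that a CU bench needs.
interface SolanaTxResult {
  meta: { computeUnitsConsumed?: number; err: unknown } | null;
}

// Returns the CU reading, or null when the tx is missing, failed, or lacks the field.
function extractCu(tx: SolanaTxResult | null): number | null {
  if (!tx || !tx.meta || tx.meta.err !== null) return null;
  const cu = tx.meta.computeUnitsConsumed;
  return typeof cu === "number" ? cu : null;
}

// EVM tx hash -> Solana signature(s) -> total CU.
// ASSUMPTION: rome_solanaTxForEvmTx accepts [evmTxHash] and returns string[].
async function cuForEvmTx(
  rpc: RpcCall,
  romeUrl: string,
  solanaUrl: string,
  evmTxHash: string,
): Promise<number | null> {
  const sigs = (await rpc(romeUrl, "rome_solanaTxForEvmTx", [evmTxHash])) as string[];
  let total = 0;
  for (const sig of sigs) {
    const tx = (await rpc(solanaUrl, "getTransaction", [
      sig,
      { encoding: "json", maxSupportedTransactionVersion: 0 },
    ])) as SolanaTxResult | null;
    const cu = extractCu(tx);
    if (cu === null) return null; // refuse to report a partial reading
    total += cu; // ASSUMPTION: multi-signature flows are summed
  }
  return total;
}
```

Returning null instead of a partial sum keeps a failed or missing Solana transaction from being recorded as a low CU sample.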
Adds Romeswap (Uniswap V2 fork) DEX flow benchmark scaffold for Hadrian.
Bench covers 2 pair types:
- wrapped x wrapped: wUSDC (v1) x wBench (v2)
- wrapped x plain ERC20: wUSDC x MOCK (deployed via existing ERC20Factory)
Per-pair ops (Rome-required multi-tx breakdown):
- createPair (1 sample; idempotent)
- addLiquidity (3-tx: tokenA->pair, tokenB->pair, pair.mint)
- swap (2-tx: tokenIn->pair, pair.swap)
- removeLiquidity (2-tx: lp->pair, pair.burn)
CAPTURED THIS RUN:
Pair A (wUSDC x wBench) createPair: 1,085,739 CU
Pair B (wUSDC x MOCK) createPair: 1,088,488 CU
Pair addresses:
A: 0x45350dF36fA7334C2E267598Af8fC136e4982A9E
B: 0x9FB2471A400CA670F5459829b622A2f4d4824642
MOCK token: 0x5cB734B113E31005487D7E4bcA39BCC3e17B8e9A
Both ~1.08M CU per pair: that's the cost of CREATE2-deploying a Romeswap
pair contract on Hadrian today, about 22% below the 1.4M-CU Solana tx ceiling.
No v1-vs-v2 split — UniswapV2Factory's createPair logic is wrapper-agnostic.
UNMEASURED THIS RUN (state-setup blockers; needs follow-up):
- addLiquidity / swap / removeLiquidity for both pair types
Root cause: each pair receives wrapper SPL tokens which auto-creates the
pair's external_auth PDA + per-mint ATAs on Solana, with rent paid from the
CALLER (deployer)'s PDA reserve. After multiple prior bench runs depleted
the reserve and the top-up tx (swap_gas_to_lamports 20M) failed silently
on this run, the wrapper.transfer-to-pair txs reverted with
SimulateTransactionError: mollusk error: Failure(Custom(1)) = InsufficientFunds.
Fix path for next run: top up PDA lamports in smaller chunks (5M x 4)
with explicit balance verification between steps; reduce surface to
ONE pair type per session to avoid cascading state needs.
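The fix path above (smaller chunks, balance verified between steps) can be sketched as follows. `TopUpHooks`, `planTopUps`, and `topUpVerified` are hypothetical names, and the two hooks stand in for whatever the bench uses to read the PDA balance and to send the swap_gas_to_lamports tx. This is an illustration under those assumptions, not the harness code.

```typescript
// Split a total top-up into fixed-size chunks; the last chunk takes any remainder.
function planTopUps(total: number, chunk: number): number[] {
  if (total <= 0 || chunk <= 0) throw new Error("total and chunk must be positive");
  const plan: number[] = [];
  for (let left = total; left > 0; left -= chunk) plan.push(Math.min(chunk, left));
  return plan;
}

// Hypothetical hooks: how the balance is read and how a top-up is sent is up to the caller.
interface TopUpHooks {
  balance: () => Promise<number>;
  topUp: (amount: number) => Promise<void>;
}

// Sends each chunk and checks that the balance actually grew before continuing,
// so a silently failed top-up stops the run instead of surfacing later as a revert.
async function topUpVerified(hooks: TopUpHooks, total: number, chunk: number): Promise<number> {
  let before = await hooks.balance();
  for (const amount of planTopUps(total, chunk)) {
    await hooks.topUp(amount);
    const after = await hooks.balance();
    if (after <= before) throw new Error(`top-up of ${amount} did not land`);
    before = after;
  }
  return before;
}
```

`planTopUps(20_000_000, 5_000_000)` yields the four 5M chunks proposed above.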
Also includes measure-bridge-eth-flows.ts (focused script awaiting wETH
bridge-in; not run this session per operator's pivot to DEX).
Replay:
export HARDHAT_VAR_HADRIAN_PRIVATE_KEY=...
npx hardhat run scripts/bench/measure-dex-flows.ts --network hadrian
Hypothesis: the Rome convention of splitting addLiquidity/removeLiquidity/swap
across multiple EVM txs is unnecessary. The router overhead (slippage logic,
optimal-amounts derivation, path math) is what busts the 1.4M CU ceiling,
NOT the underlying pair operations themselves.
Test: deploy a minimal RomeswapDirect.sol that takes pre-computed amounts
and does the token transfers + pair op inline as a single function.
Measure on Hadrian via real on-chain receipts (mean of 3 samples,
post-pair-warm-up).
RESULT — all three flows fit comfortably:
addLiq 1-tx mean 972,935 CU ✓ 31% margin to 1.4M ceiling
swap 1-tx mean 886,119 CU ✓ 37% margin
removeLiq 1-tx mean 1,187,699 CU ✓ 15% margin
Per-sample CU (cold sample 1 includes pair-state initialization):
addLiq: [1,073,102 | 929,628 | 916,075] steady-state ~920K
swap: [ 897,304 | 881,564 | 879,488] steady-state ~880K
removeLiq: [1,200,709 | 1,180,440 | 1,181,948]
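The means and margins above follow directly from the per-sample readings. A small sketch of the arithmetic, with hypothetical helper names and margin defined as headroom relative to the 1.4M ceiling:

```typescript
const CU_CEILING = 1_400_000;

// Arithmetic mean of the samples, rounded to a whole CU.
function meanCu(samples: number[]): number {
  if (samples.length === 0) throw new Error("no samples");
  return Math.round(samples.reduce((a, b) => a + b, 0) / samples.length);
}

// Headroom below the ceiling, as a whole-number percentage of the ceiling.
function marginPct(cu: number, ceiling: number = CU_CEILING): number {
  return Math.round(((ceiling - cu) / ceiling) * 100);
}

const addLiq = meanCu([1_073_102, 929_628, 916_075]);      // 972,935
const swap = meanCu([897_304, 881_564, 879_488]);          // 886,119
const removeLiq = meanCu([1_200_709, 1_180_440, 1_181_948]); // 1,187,699

console.log(addLiq, marginPct(addLiq));       // 972935 31
console.log(swap, marginPct(swap));           // 886119 37
console.log(removeLiq, marginPct(removeLiq)); // 1187699 15
```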
Implication: a thin Solidity router (RomeswapDirect-style) collapses the
3-tx addLiquidity / 2-tx swap / 2-tx removeLiquidity flows to ONE EVM tx
each. No precompile needed. The user signs once per op instead of 2-3 times.
The pattern that works:
- Pre-compute all amounts off-chain (rome-ui or any frontend)
- Submit one tx through a minimal router
- User pre-approves the router on the tokens (one-time setup per token)
What the existing UniswapV2Router does that pushes it over the ceiling:
- optimalAmounts() math (~150K CU of EVM ops)
- slippage check (additional storage reads)
- multi-hop path walking
- per-token allowance double-check via transferFrom retries
Trim those → fits. RomeswapMinimalRouter spec recommended as a follow-up.
Note: createPair + addLiq cannot be combined atomically (createPair alone
is ~1.08M; +addLiq ~970K = ~2M > 1.4M). They stay as two separate user txs.
Deployment receipts on Hadrian:
- RomeswapDirect: 0x95E85BEaeF5D3043f415D61216bAECb1b131BE44
- Pair A (wUSDC x wBench): 0x45350dF36fA7334C2E267598Af8fC136e4982A9E
(reserves: 102K / 102K post-bench)
Per-sample EVM tx hashes + Solana sigs in
deployments/hadrian.romeswap-direct.results.json
Replay:
export HARDHAT_VAR_HADRIAN_PRIVATE_KEY=...
npx hardhat run scripts/bench/measure-romeswap-direct.ts --network hadrian
Hypothesis: 2-hop swap (wBench → wUSDC → MOCK via Pair A + Pair B)
fits in 1.4M CU atomically.
Result: HYPOTHESIS FALSIFIED. All 3 samples rejected at rome-sdk
preflight with code -32000 (TooManyComputeUnitsInAtomicTx signature).
Setup verified:
Pair A seeded: 100266 / 99684 (wUSDC / wBench)
Pair B seeded: 100000 / 100000 (MOCK / wUSDC)
Quote: 100 wBench → 100 wUSDC → 99 MOCK
All approvals + lamport top-ups confirmed before bench
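The quote above is consistent with the stock Uniswap V2 constant-product formula applied to the seeded reserves. The sketch below assumes the Romeswap fork keeps the unmodified 0.3% fee (997/1000), which the source does not state. `getAmountOut` mirrors the UniswapV2Library helper of the same name.

```typescript
// Uniswap V2 quote: integer math, 0.3% fee taken from the input amount.
// ASSUMPTION: the Romeswap pair uses the same fee as upstream Uniswap V2.
function getAmountOut(amountIn: bigint, reserveIn: bigint, reserveOut: bigint): bigint {
  if (amountIn <= 0n) throw new Error("insufficient input amount");
  if (reserveIn <= 0n || reserveOut <= 0n) throw new Error("insufficient liquidity");
  const amountInWithFee = amountIn * 997n;
  // bigint division truncates, matching Solidity's integer division.
  return (amountInWithFee * reserveOut) / (reserveIn * 1000n + amountInWithFee);
}

// Pair A holds 100266 wUSDC / 99684 wBench; Pair B holds 100000 MOCK / 100000 wUSDC.
const hop1 = getAmountOut(100n, 99684n, 100266n); // wBench -> wUSDC
const hop2 = getAmountOut(hop1, 100000n, 100000n); // wUSDC -> MOCK

console.log(hop1, hop2); // 100n 99n
```

This is the kind of computation a frontend can do off-chain before submitting pre-computed amounts to a minimal router.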
Bench: 3 samples, all rejected at preflight (off-chain simulation
exceeds 1.4M; no Solana tx submitted).
Analytical estimate had 2-hop at ~1.34M (4% margin). Real cost lands
just above ceiling — likely due to auto-ATA verification per pair.swap
destination and EVM-side composition overhead.
Implications for target architecture:
- 1-hop swap: 1-tx atomic (886K, comfortable) ✓
- 2+ hop swap: split into N sequential 1-tx swaps by rome-ui
- Each split tx retains slippage + deadline; only inter-hop
atomicity is lost (intermediate token sits in user wallet ~5s
between hops)
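Splitting a multi-hop path into sequential single-hop swaps, as proposed above for rome-ui, reduces to pairing adjacent tokens. A minimal sketch with hypothetical names (`Hop`, `splitIntoHops`); how each hop is quoted, signed, and submitted is left to the caller:

```typescript
interface Hop {
  tokenIn: string;
  tokenOut: string;
}

// Turns a path [A, B, C] into the hops A->B and B->C, in execution order.
function splitIntoHops(path: string[]): Hop[] {
  if (path.length < 2) throw new Error("path needs at least two tokens");
  const hops: Hop[] = [];
  for (let i = 0; i + 1 < path.length; i++) {
    hops.push({ tokenIn: path[i], tokenOut: path[i + 1] });
  }
  return hops;
}

const hops = splitIntoHops(["wBench", "wUSDC", "MOCK"]);
console.log(hops.length); // 2
```

Each hop would carry its own slippage bound and deadline, since only inter-hop atomicity is given up.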
Path to 2-hop atomic (if ever needed): strip fee-on-transfer support
from pair.swap (save ~100K per hop → 2-hop ~1.24M, would fit). The trade-off
locks out Token-2022 fee tokens + MetaHook + tax-token forks. Not
recommended; future composability > one extra atomic hop.
Adds:
- contracts/cpi/test/RomeswapDirect.sol: swap2Hop function
- scripts/bench/measure-2hop-swap.ts: dedicated 2-hop bench
- deployments/hadrian.2hop-bench.results.json: per-sample data
…-#364
Tests whether canonical Uniswap V2 Router operations (unmodified — plain
vanilla Uniswap V2 deployed at 0xB342f70D...) fit Solana's 1.4M CU ceiling
on Hadrian after universal-delegation savings (#364).
RESULT: only single-hop swap fits. All other ops bust 1.4M.
Op                                                   CU         Verdict
--                                                   --         -------
swapExactTokensForTokens single-hop (wUSDC→wBench)   1,101,044  ✓ fits (21% margin)
swapExactTokensForTokens 2-hop                       n/a        ✗ preflight reject
addLiquidity                                         n/a        ✗ preflight reject
removeLiquidity (LP pre-approved)                    n/a        ✗ preflight reject
removeLiquidityWithPermit (valid signature)          n/a        ✗ preflight reject
Comparison vs lean direct executor (RomeswapDirect from prior bench):
Op                 lean    canonical  overhead
swap single-hop    886K    1,101K     +215K (24%)
addLiquidity       973K    busts      >150K (pushes over)
removeLiquidity    1,187K  busts      >213K (pushes over)
The canonical router's ~200-300K of safety overhead (safeTransferFrom
wrappers, slippage logic, _addLiquidity optimal-amounts derivation, multi-hop
path-walking) is what separates "fits" from "busts." Post-#364 savings
helped the lean path; not enough to cover the canonical overhead.
removeLiquidityWithPermit ALSO busts — permit eliminates a separate approve
tx but the underlying remove operation itself doesn't fit. So canonical
router's permit family is unusable on Rome.
Implication for unified DEX UX using only canonical infrastructure:
- 1-tx swap (single-hop): YES via canonical, full Uniswap V2 safety features
- addLiquidity: stays 3-tx (Rome direct-pair convention)
- removeLiquidity: stays 2-tx (Rome direct-pair convention)
- 2+ hop swap: stays N-tx (split per hop)
Permit setup verified: EIP-712 domain name is 'Romeswap V2' (not 'Uniswap V2');
the rome-uniswap-v2 fork branded the LP token. With correct domain, permit
signature recovers correctly — busts CU during the underlying operation,
not at signature verification.
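For reference, the permit payload with the corrected domain name might be assembled as in the sketch below. The `Permit` struct and the four domain fields follow upstream UniswapV2ERC20. That the fork changed only the name (so `version` stays "1") is an assumption, and `buildPermitTypedData` is a hypothetical helper. The chain ID is left as a parameter because the source does not give Hadrian's.

```typescript
interface PermitMessage {
  owner: string;
  spender: string;
  value: string;    // uint256 as a decimal string
  nonce: string;    // pair.nonces(owner)
  deadline: string; // unix seconds
}

// Builds EIP-712 typed data for the LP token's permit().
// ASSUMPTION: only the domain name differs from upstream Uniswap V2.
function buildPermitTypedData(chainId: number, pairAddress: string, message: PermitMessage) {
  return {
    domain: {
      name: "Romeswap V2", // not "Uniswap V2"
      version: "1",
      chainId,
      verifyingContract: pairAddress,
    },
    types: {
      Permit: [
        { name: "owner", type: "address" },
        { name: "spender", type: "address" },
        { name: "value", type: "uint256" },
        { name: "nonce", type: "uint256" },
        { name: "deadline", type: "uint256" },
      ],
    },
    primaryType: "Permit" as const,
    message,
  };
}
```

The returned object has the domain/types/message shape that common EIP-712 signing helpers accept. Signing with the wrong domain name yields a signature that recovers to a different address, so the permit is rejected.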
Per-sample EVM tx hashes + Solana sigs in
deployments/hadrian.canonical-router.results.json.
Replay:
npx hardhat run scripts/bench/measure-canonical-router.ts --network hadrian
Summary
Phase 2 of the universal-delegation CU benchmark. Where Phase 1 (#170) measured primitive-level CU savings for A1-A6 selectors, this PR measures real production-contract flow-level CU on Hadrian by deploying NEW (post-#165/#166/#167/#168/#169) versions side-by-side with v1 (currently deployed) contracts and benchmarking user-facing ops.
Headline real-flow numbers (Hadrian, mean of 3 samples)
- SPL_ERC20.transfer
- SPL_ERC20.approve
- SPL_ERC20.transferFrom (self-allowance)
- SPL_ERC20.bridgeOutToSolana
- RomeBridgeWithdraw.burnUSDC
- ERC20SPLFactory.create_token_mint
Expected-vs-measured alignment:
- transfer / transferFrom: migrated in #143 (Add transfer_spl(bytes32,bytes32,uint64,bytes32) overload to IHelperProgram) and #163 (feat(erc20spl): direct-precompile rewrite, 5 hot paths via new HelperProgram selectors), no further savings expected → confirmed (≤2% noise)
- approve: v1's deployed bytecode predates #163 → still uses SplTokenLib.approve + invoke_signed → A1-A2-era selector dispatch shaves ~50%
- bridgeOutToSolana: A1 create_ata_for_key collapses ATA-create leg → big win
- burnUSDC: A3 pda_with_salt collapses CCTP message-PDA derive → matches Phase 1 primitive saving (~93K) plus other path simplifications
- create_token_mint: A4 create_mint_account collapses SystemProgramLib.create_account + invoke_signed → biggest %-win
Architecture: Option B (side-by-side)
- ERC20SPLFactory: v1 0x21cc267f…, NEW 0xfF96ab9E… (deploys own ERC20Users)
- SPL_ERC20 for the bench: v1 0x94AC3E5e… (wUSDC), NEW 0xD39EC36a… (wBench wraps a fresh mint; deployer minted supply via HelperProgram.mint_spl)
- RomeBridgeWithdraw: v1 0xf48902e4…, NEW 0xAa457897… (constructor reuses v1's wUSDC + wETH wrappers so bridges share state)
Why NEW.SPL_ERC20 wraps a fresh mint instead of v1's wUSDC: separate wrapper contracts own separate Solana PDAs, which own separate ATAs, so there's no way to share v1's wUSDC SPL balance with a new wrapper. The wrapper-level operations (transfer/approve/transferFrom/bridgeOutToSolana) have identical code paths regardless of underlying mint, so the CU comparison is fair.
Unmeasured this run
- RomeBridgeWithdraw.approveBurnETH / burnETH: deployer has no wETH balance on Hadrian. Bridging in ETH from Sepolia would take ~10-15 min; deferred to a follow-up. Phase 1 primitive measurements (A2 −154K + A3 ×2 = −187K) give an expected −241K CU per Wormhole outbound bridge.
- ERC20SPLFactory.init_token_mint: partial (1/3 v1 samples captured; v2 errored). The deployer PDA lamport reserve was likely exhausted after multiple create_token_mint calls. Future runs should top up more aggressively.
Test plan
- … transfer, transferFrom)
- burnETH / approveBurnETH after bridging in wETH
- init_token_mint with larger PDA lamport top-up
Replay
export HARDHAT_VAR_HADRIAN_PRIVATE_KEY=...
npx hardhat run scripts/bench/deploy-real-flow-v2.ts --network hadrian
npx hardhat run scripts/bench/measure-real-flows.ts --network hadrian
Cross-references