feat(engine): multi-signal candidate gating for detected arb cycles #130
Merged
Adds a gating layer between Bellman-Ford cycle detection and EVM simulation. It catches corruption signatures (stale graph edges with zero reserves, an identical `profit_factor` repeated across many sibling cycles, f64-overflow profit values) that the existing detector and simulator silently passed through, polluting the candidate stream and wasting 100-500 ms of revm fork sim per bogus cycle.

## Background

A 1h 20m live mainnet capture on PR #118's branch surfaced cycles reporting `expected_net_eth = 21,539,349,978.775558` ETH: 21.5 billion ETH per cycle, the same value repeated across 444 cycles within a single block.

Root cause: at least one pool in the registry had stale reserves on its price-graph edge; Bellman-Ford walked that edge through many path permutations and every cycle inherited the same astronomical implied rate. revm sim then ran ~37 k attempts in 80 minutes against these poisoned candidates, each call costing one or more RPC reads to Alchemy and pushing the provider into rate-limit territory.

The detector was correct; the graph snapshot was wrong; the existing pipeline had no defense between the two.

## What ships

`crates/grpc-server/src/cycle_gating.rs` (new): four drop gates plus one soft-warn band for operator audit, each keyed to a corruption signature observed in real or plausible runs:

1. **TVL**: every edge in the cycle must have `reserve_in >= GatingConfig::min_reserve_f64` (default 1e6 wei). Empty pools cannot produce real arbs regardless of the rate the graph thinks they have.
2. **Multi-cycle fingerprint**: `profit_factor` values are quantised to 1e-6 and bucketed. When >= `fingerprint_min_cluster` cycles land in the same bucket (default 5), the cluster is dropped. This is the signature of a single corrupt edge feeding many paths.
3. **Hard sanity cap**: `profit_factor > 100.0` (10000%) or non-finite (NaN, +/-inf) is dropped immediately. UST's peak depeg was 90%, so anything in this band is unambiguously broken math.
4. **Soft warn**: `profit_factor > 0.5` logs a `warn` with cycle details for operator audit; never drops on its own. Surfaces suspicious but possibly real depeg / launch events.
5. **Post-sim revm cross-check**: after fork simulation, compare `sim_result.profit_wei` against the detector's `expected_net_wei`. If the two differ by more than `revm_profit_mismatch_threshold` (default 0.5 fractional), trust revm and drop the candidate. Catches the residual case where the local graph was stale at detection time but the pre-sim gates could not see it.

Each drop bumps `aether_cycle_gate_dropped_total{reason="..."}` with a label drawn from a fixed set of four (`profit_factor_impossible`, `reserves_too_low`, `fingerprint_cluster`, `revm_contradicts`), so dashboards stay enumerable without label churn.

## Engine integration

`AetherEngine` cycle loop (`engine.rs` ~line 1325) builds the fingerprint index once per detection pass via `build_fingerprint_index` (O(N) batch) so the per-cycle gate is O(1). Without this, the multi-cycle gate would be O(N) per cycle, O(N^2) across the batch, blowing the 3 ms detection budget on dense graphs.

Post-sim gate fires immediately after `if !sim_result.success` so it costs at most one HashMap probe + one f64 ratio per simulated cycle.

`EngineConfig` gains a `gating: GatingConfig` field with strict production defaults; tests that exercise the detection cycle with synthetic graphs (`add_edge` without seeded reserves) override with `GatingConfig::permissive()` so they assert detection behaviour rather than gating behaviour. Four existing test fixtures updated accordingly; one test, `test_engine_custom_config`, is left untouched since it inspects `min_profit_threshold_wei` only.
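The batch-then-probe pattern described above can be sketched as follows. This is a minimal illustration, not the code in `cycle_gating.rs`: the function names `fingerprint_bucket` and `build_fingerprint_index` come from the description, but their signatures, the `i64` bucket key, the sentinel bucket, and the helper `in_fingerprint_cluster` are assumptions made for the example.

```rust
use std::collections::HashMap;

/// Quantisation step for profit factors (1e-6, as stated in the description).
const QUANTUM: f64 = 1e-6;

/// Map a profit factor to its bucket key (assumed representation: i64).
/// Every non-finite value (NaN, +inf, -inf) collapses into one sentinel bucket.
/// Very large finite values saturate to the same key under Rust's `as` cast.
fn fingerprint_bucket(profit_factor: f64) -> i64 {
    if !profit_factor.is_finite() {
        return i64::MAX;
    }
    (profit_factor / QUANTUM).round() as i64
}

/// One O(N) pass over the batch: count profitable cycles per bucket.
fn build_fingerprint_index(profit_factors: &[f64]) -> HashMap<i64, usize> {
    let mut index = HashMap::new();
    for &pf in profit_factors {
        // Skip unprofitable cycles. NaN fails this comparison, so it is kept
        // and counted in the sentinel bucket.
        if pf <= 0.0 {
            continue;
        }
        *index.entry(fingerprint_bucket(pf)).or_insert(0) += 1;
    }
    index
}

/// O(1) per-cycle probe (hypothetical helper): is this cycle's bucket
/// shared by at least `min_cluster` cycles in the batch?
fn in_fingerprint_cluster(
    index: &HashMap<i64, usize>,
    profit_factor: f64,
    min_cluster: usize,
) -> bool {
    let count = index
        .get(&fingerprint_bucket(profit_factor))
        .copied()
        .unwrap_or(0);
    count >= min_cluster
}
```

With the index built once per detection pass, each cycle costs a single `HashMap` lookup, which is what keeps the whole batch at O(N) rather than O(N^2).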
## Tests

`crates/grpc-server/src/cycle_gating.rs` ships 12 unit tests covering:

- `fingerprint_bucket` quantisation correctness, including NaN / +inf / -inf collapsing into a single bucket (so the fingerprint gate also catches the f64-overflow signature even when the hard cap somehow does not).
- `build_fingerprint_index` skipping unprofitable cycles.
- Pre-sim gate behaviour for impossible profit, NaN profit, low reserves, fingerprint cluster, and normal arbs.
- Post-sim gate behaviour for matching profits, zero actual vs nonzero expected, large mismatch (the 21B vs 1e15 case), and the `expected == 0` no-op pass.

488 workspace tests pass. `cargo clippy --workspace -- -D warnings` clean.

## Out of scope

- Edge freshness tracking (gate 2 of the original design doc): requires adding `last_update_block: u64` to `PriceEdge` and propagating block numbers through ~30 call sites in `engine.rs` + tests. Filed for a separate PR.
- Cross-pool agreement check: needs a new pool-registry iteration helper. Filed for a separate PR.
- Configurable thresholds via env / config file: defaults are calibrated for the blue-chip pool registry shipped in `config/pools.toml`. Operators monitoring long-tail or freshly-launched pools may need to relax `min_reserve_f64` and `profit_factor_impossible` once we expose the knobs.

## Verification plan

After merge, re-run the 10-min capture script from PR #118's branch (`scripts/mempool_capture.sh`). Expect:

- `aether_cycle_gate_dropped_total{reason="reserves_too_low"}` or `{reason="fingerprint_cluster"}` ticking up where the 21B-ETH cycles were previously surviving.
- `aether_pending_arb_candidates_total` numbers landing in the realistic 1-200 bps profit band instead of the `gt_200bps >> 7000` we saw.
- `aether_simulations_run_total` rate dropping proportionally, with the saved RPC budget freed for the cycles that survive gating.
Problem
A 1h 20m live mainnet capture on PR #118's branch surfaced cycles reporting `expected_net_eth = 21,539,349,978.775558` ETH: 21.5 billion ETH per cycle, the same value repeated across 444 cycles within a single block.

Root cause: at least one pool in the registry had stale reserves on its price-graph edge. Bellman-Ford walked that edge through many path permutations and every cycle inherited the same astronomical implied rate. revm sim then ran ~37 k attempts in 80 minutes against these poisoned candidates, each call costing one or more RPC reads to Alchemy and pushing the provider into rate-limit territory.
The detector was correct. The graph snapshot was wrong. The existing pipeline had no defense between the two.
What ships
`crates/grpc-server/src/cycle_gating.rs` (new, 350 LOC): four drop gates plus one soft-warn band, each keyed to a corruption signature observed in real or plausible runs:

| Gate | Trigger | Outcome |
|---|---|---|
| 1 (TVL) | `reserve_in < 1e6 wei` on any edge | drop, `reserves_too_low` |
| 2 (fingerprint) | 5 or more cycles share the same `profit_factor` (quantised to 1e-6) | drop, `fingerprint_cluster` |
| 3 (hard cap) | `profit_factor > 100.0` (10000%) or NaN/±∞ | drop, `profit_factor_impossible` |
| Soft warn | `profit_factor > 0.5` | `warn` with cycle details; never drops |
| 4 (post-sim) | revm `profit_wei` differs from detector's `expected_net_wei` by >50% fractional | drop, `revm_contradicts` |

Each drop bumps `aether_cycle_gate_dropped_total{reason="..."}` with a label drawn from a fixed set of four, so dashboards stay enumerable without label churn.

Why each gate exists
Gate 1 (TVL). Empty pools cannot produce real arbs regardless of the rate the graph thinks they have. Corrupt edges most commonly present as `reserve_in = 0.0` (a placeholder seed never refreshed by chain events).

Gate 2 (fingerprint). The 21B-ETH bug observed in production is the canonical example: 444 cycles produced identical profit factors, which is impossible if the underlying graph state varied per cycle. Quantising and bucketing catches the pattern in O(1) per cycle after an O(N) batch index build.

Gate 3 (hard cap). UST's peak depeg was 90% (`profit_factor = 0.9`). Anything beyond `100.0` (10000%) is unambiguously broken math, never a real opportunity; even the most extreme stablecoin event in DeFi history stayed far below it.

Soft warn. Real arbs on blue-chip pools live well below 50%. Depegs and launch events live above. These are worth flagging for operator audit but not dropping outright; the operator may legitimately want to chase depeg arbs.

Gate 4 (post-sim). Catches the residual case where revm's fork sim reveals a graph-vs-chain mismatch the pre-sim gates could not have seen. revm uses current chain state via RPC; the detector uses a cached snapshot. When they disagree by more than 50%, trust revm and drop the candidate before it reaches the executor.
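A minimal sketch of the Gate 4 comparison, under stated assumptions: the real code may use integer wei types rather than `f64`, and the description does not say which value serves as the denominator of the fractional difference. Here it is taken relative to the detector's expectation. The names `post_sim_gate` and `PostSimVerdict` are hypothetical.

```rust
/// Outcome of the post-sim cross-check (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
enum PostSimVerdict {
    Pass,
    /// Corresponds to the `revm_contradicts` drop reason.
    RevmContradicts,
}

/// Compare the detector's expected profit with the revm result.
/// `threshold` is the allowed fractional mismatch (0.5 by default per the description).
fn post_sim_gate(expected_net_wei: f64, sim_profit_wei: f64, threshold: f64) -> PostSimVerdict {
    // Nothing to compare against: the `expected == 0` no-op pass.
    if expected_net_wei == 0.0 {
        return PostSimVerdict::Pass;
    }
    // Assumption: mismatch is measured relative to the detector's expectation.
    let mismatch = ((sim_profit_wei - expected_net_wei) / expected_net_wei).abs();
    // A NaN mismatch fails this comparison and is therefore dropped as well.
    if mismatch <= threshold {
        PostSimVerdict::Pass
    } else {
        PostSimVerdict::RevmContradicts
    }
}
```

For the capture described above, an expectation of roughly 2.15e28 wei against a simulated 1e15 wei gives a mismatch of almost exactly 1.0, well past the 0.5 threshold, so the candidate would be dropped.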
Why not just clamp
A single hard threshold ("anything above 50% is fake") would also drop real depeg arbs: UST in 2022, USDC during the March 2023 SVB collapse, every wrapped-token de-backing. The four-gate stack catches the 21B bug (which fails gates 1, 2, and 4 simultaneously) without losing legitimate >50% opportunities (which pass gates 1, 2, and 4, stay under the gate 3 hard cap, and only trigger the soft warn).
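The difference between a single clamp and the layered pre-sim checks can be sketched as below. This is an illustration only: the thresholds are the defaults quoted in this description, but the gate order, the `PreSim` and `DropReason` types, and both function names are assumptions, and the fingerprint gate is left out for brevity.

```rust
/// Pre-sim drop reasons; `label` returns the metric label values named above.
#[derive(Debug, PartialEq)]
enum DropReason {
    ProfitFactorImpossible,
    ReservesTooLow,
}

impl DropReason {
    fn label(&self) -> &'static str {
        match self {
            DropReason::ProfitFactorImpossible => "profit_factor_impossible",
            DropReason::ReservesTooLow => "reserves_too_low",
        }
    }
}

/// Hypothetical result type: a soft warn still lets the cycle through.
#[derive(Debug, PartialEq)]
enum PreSim {
    Pass,
    PassWithWarn,
    Drop(DropReason),
}

const HARD_CAP: f64 = 100.0; // 10000%
const SOFT_WARN: f64 = 0.5; // 50%
const MIN_RESERVE: f64 = 1e6; // wei

/// Layered check: hard cap, then TVL, then soft warn (assumed order).
fn pre_sim_gate(profit_factor: f64, reserves_in: &[f64]) -> PreSim {
    if !profit_factor.is_finite() || profit_factor > HARD_CAP {
        return PreSim::Drop(DropReason::ProfitFactorImpossible);
    }
    // Every edge must carry at least MIN_RESERVE; NaN reserves count as empty.
    if reserves_in.iter().any(|&r| r.is_nan() || r < MIN_RESERVE) {
        return PreSim::Drop(DropReason::ReservesTooLow);
    }
    if profit_factor > SOFT_WARN {
        return PreSim::PassWithWarn;
    }
    PreSim::Pass
}

/// The rejected alternative: one clamp at 50% that drops everything above it.
fn naive_clamp_drops(profit_factor: f64) -> bool {
    profit_factor.is_nan() || profit_factor > SOFT_WARN
}
```

A 90% depeg arb on funded pools, `pre_sim_gate(0.9, &[5e21, 3e12])`, comes back as `PassWithWarn`, whereas `naive_clamp_drops(0.9)` would have discarded it.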
Engine integration
`AetherEngine` cycle loop (`engine.rs` ~line 1325) builds the fingerprint index once per detection pass via `build_fingerprint_index` (O(N) batch) so the per-cycle gate is O(1). Without this, the multi-cycle gate would be O(N) per cycle, O(N²) across the batch, blowing the 3 ms detection budget on dense graphs.

Post-sim gate fires immediately after `if !sim_result.success` so it costs at most one HashMap probe + one f64 ratio per simulated cycle.

`EngineConfig` gains a `gating: GatingConfig` field with strict production defaults. Tests that exercise the detection cycle with synthetic graphs (`add_edge` without seeded reserves) override with `GatingConfig::permissive()` so they assert detection behaviour rather than gating behaviour. Four existing test fixtures updated accordingly.

Tests
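12 new unit tests in `cycle_gating.rs` covering:

- `fingerprint_bucket` quantisation correctness, including NaN / +∞ / -∞ collapsing into a single bucket
- `build_fingerprint_index` skipping unprofitable cycles
- Pre-sim gate behaviour for impossible profit, NaN profit, low reserves, fingerprint cluster, and normal arbs
- Post-sim gate behaviour for matching profits, zero actual vs nonzero expected, large mismatch (the 21B vs 1e15 case), and the `expected == 0` no-op pass

488 workspace tests pass. `cargo clippy --workspace -- -D warnings` clean.

Out of scope (filed for separate PRs)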
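- Edge freshness tracking: requires adding `last_update_block: u64` to `PriceEdge` and propagating block numbers through ~30 call sites. Schema change to `aether-state` crate.
- Cross-pool agreement check: needs a new pool-registry iteration helper.
- Configurable thresholds via env / config file: defaults are calibrated for the blue-chip pool registry shipped in `config/pools.toml`. Operators monitoring long-tail or freshly-launched pools may need to relax `min_reserve_f64` and `profit_factor_impossible` once knobs are exposed.

Verification plan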
After merge, re-run the 10-min capture script from PR #118's branch (`scripts/mempool_capture.sh`). Expect:

- `aether_cycle_gate_dropped_total{reason="reserves_too_low"}` or `{reason="fingerprint_cluster"}` ticking up where the 21B-ETH cycles were previously surviving
- `aether_pending_arb_candidates_total` numbers landing in the realistic 1-200 bps profit band instead of the `gt_200bps >> 7000` we saw
- `aether_simulations_run_total` rate dropping proportionally, freeing RPC budget for the cycles that actually survive gating

Diff
| File | Change |
|---|---|
| `crates/grpc-server/src/cycle_gating.rs` | new file |
| `crates/grpc-server/src/engine.rs` | `gating` field on `EngineConfig`, update 4 test fixtures |
| `crates/grpc-server/src/metrics.rs` | `aether_cycle_gate_dropped_total{reason}` counter + accessor |
| `crates/grpc-server/src/main.rs` | `mod cycle_gating` |

Net: +667 lines, 4 files.