feat: mempool shadow-run suite — discovery, WS keepalive, retention, config hardening, dashboard#184

Open
Pablosinyores wants to merge 2 commits into develop from feat/shadow-run-suite

@Pablosinyores

Owner

Net-new features for the unattended public-mempool shadow run, landed on develop's foundation. No submission path is touched — all of this is observability / discovery / ops hardening.

What's included

  • Hot-token discovery (AETHER_DISCOVERY=1, off by default): HotTokenTracker records every decoded mempool swap; the discovery loop screens frequently traded candidates with a fee-on-transfer round-trip (revm, own timeout), qualifies venue liquidity, appends admitted pools to pools.toml, and reloads the registry (capped at AETHER_MAX_ADMITTED_POOLS). Wired additively into SimContext (new field + a record() hook in the decode loop); develop's backrun design is untouched.
  • WS idle keepalive: subscribe_once gains an idle-watchdog arm (MEMPOOL_WS_IDLE_TIMEOUT_SECS, default 60, 0=off) that drops and reconnects on a silent TCP half-close. Registers MempoolIngestMetrics so idle_reconnect_total and the re-encode counter populate.
  • Mempool retention sweep: PruneMempoolLedger + runRetentionLoop in the reconciler (MEMPOOL_RETENTION_HOURS, default 48 / _INTERVAL_MINS, default 60; children dropped via ON DELETE CASCADE).
  • demo.sh: AETHER_ARM gating (A = analytical-only), profit-scorer under spawn_with_restart with a crash-loop bound, start_log_janitor (copytruncate, independent of respawn), and AETHER_DISCOVERY opt-in carrying the discovery→canary / freeze-pools.toml→measurement guardrail.
  • Alerting: aether.shadow.rules group (producer-down, predictions-stalled, retention-errors, reconciler-drops).
  • Postgres tuning: aggressive autovacuum + max_connections=200; x-logging json-file 50MB×5 caps on all compose services.
  • Backrun funnel & diagnostics Grafana dashboard (auto-provisioned).
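
To illustrate the screen-then-admit step in the discovery item above, here is a minimal Rust sketch of the two gating checks. Everything in it is an assumption made for illustration: the function names, the basis-point tolerance, and the idea of comparing the amount sent with the amount received. The PR's actual screen runs a round-trip inside revm and is not reproduced here.

```rust
/// Hypothetical check: true when the recipient received less than was sent,
/// by more than `tolerance_bps` basis points (1 bps = 0.01%).
pub fn looks_fee_on_transfer(sent: u128, received: u128, tolerance_bps: u128) -> bool {
    if received >= sent {
        return false;
    }
    let shortfall = sent - received;
    // shortfall / sent > tolerance_bps / 10_000, cross-multiplied to avoid division.
    shortfall.saturating_mul(10_000) > sent.saturating_mul(tolerance_bps)
}

/// Hypothetical cap check in the spirit of AETHER_MAX_ADMITTED_POOLS:
/// admission stops once the number of admitted pools reaches the cap.
pub fn can_admit(already_admitted: usize, max_admitted: usize) -> bool {
    already_admitted < max_admitted
}
```

With a 50 bps tolerance, a transfer of 1000 units that delivers 990 would be flagged (1% shortfall), while one that delivers 999 would pass.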

Foundation-adaptation notes

  • develop's foundation is treated as canonical. The 6 foundation files that diverged (router_decoder, mempool_backrun, fork, mempool_pipeline, metrics, mempool) plus engine.rs/cycle_gating.rs/price_graph.rs/pools.toml use develop's version; only net-new feature deltas are layered on.
  • The dynamic-gas oracle was dropped: develop's backrun sizes with a fixed input_amount_wei and prices gas in the revm sim — there is no static pre-revm coarse gate for it to make dynamic, so the feature is obsolete here.

Validation

cargo clippy --workspace clean · grpc-server (75) / ingestion (82) / simulator (96) / *-bins tests green · go build/vet/test · bash -n demo.sh · docker compose --profile ledger --profile node config · promtool (12 rules).

Land two net-new mempool features on develop's foundation.

Hot-token discovery (AETHER_DISCOVERY=1, off by default): a HotTokenTracker
records every decoded mempool swap; the discovery loop reads frequently-traded
candidates, screens them with a fee-on-transfer round-trip (revm, own timeout),
qualifies venue liquidity, and appends admitted pools to pools.toml + reloads
the registry (capped at AETHER_MAX_ADMITTED_POOLS). The tracker is wired
additively into SimContext (new field + a record() hook in the decode loop);
develop's backrun design is otherwise untouched.
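
A minimal sketch of what a tracker of this kind could look like, assuming it only needs per-token swap counts. The struct below reuses the name HotTokenTracker and the record() hook from the description, but its fields, the string token key, and the hot_candidates method with its min_swaps and limit parameters are hypothetical and not taken from the PR.

```rust
use std::collections::HashMap;

/// Illustrative stand-in for the PR's tracker: counts decoded swaps per token.
#[derive(Default)]
pub struct HotTokenTracker {
    counts: HashMap<String, u64>,
}

impl HotTokenTracker {
    /// Called once per decoded mempool swap (the record() hook in the decode loop).
    pub fn record(&mut self, token: &str) {
        *self.counts.entry(token.to_string()).or_insert(0) += 1;
    }

    /// Tokens seen at least `min_swaps` times, most frequent first.
    /// Ties are broken by token string so the order is deterministic.
    pub fn hot_candidates(&self, min_swaps: u64, limit: usize) -> Vec<(String, u64)> {
        let mut hot: Vec<(String, u64)> = self
            .counts
            .iter()
            .filter(|(_, n)| **n >= min_swaps)
            .map(|(t, n)| (t.clone(), *n))
            .collect();
        hot.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
        hot.truncate(limit);
        hot
    }
}
```

A discovery loop could poll hot_candidates on a timer and pass each result to the fee-on-transfer screen. A real tracker would probably also need a time window or decay so that stale tokens age out, which this sketch omits.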

WS idle keepalive: subscribe_once gains an idle-watchdog select arm
(MEMPOOL_WS_IDLE_TIMEOUT_SECS, default 60, 0=off) that drops + reconnects when
no pending tx arrives in the window, closing the silent-half-close stall.
MempoolIngestMetrics is registered via with_metrics so idle_reconnect_total +
the re-encode counter finally populate.
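
The idle-watchdog idea can be sketched with the standard library alone. This is a blocking analogue built on std::sync::mpsc, not the async select arm the commit describes. The names StreamEnd, pump_until_idle, and idle_window are hypothetical.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Why the receive loop returned.
#[derive(Debug, PartialEq)]
pub enum StreamEnd {
    /// Nothing arrived within the idle window; the caller should drop and reconnect.
    IdleTimeout,
    /// The sending side went away.
    Closed,
}

/// Map a seconds setting to an optional window; 0 turns the watchdog off.
pub fn idle_window(secs: u64) -> Option<Duration> {
    if secs == 0 {
        None
    } else {
        Some(Duration::from_secs(secs))
    }
}

/// Hand each message to `on_msg` until the stream goes idle or closes.
/// With `idle == None` the loop only ends when the sender is dropped.
pub fn pump_until_idle<T>(
    rx: &Receiver<T>,
    idle: Option<Duration>,
    mut on_msg: impl FnMut(T),
) -> StreamEnd {
    loop {
        match idle {
            Some(window) => match rx.recv_timeout(window) {
                Ok(msg) => on_msg(msg),
                Err(RecvTimeoutError::Timeout) => return StreamEnd::IdleTimeout,
                Err(RecvTimeoutError::Disconnected) => return StreamEnd::Closed,
            },
            None => match rx.recv() {
                Ok(msg) => on_msg(msg),
                Err(_) => return StreamEnd::Closed,
            },
        }
    }
}
```

The timer restarts on every received message, so only a full window of silence triggers IdleTimeout. That is how a silent half-close, where the socket never reports an error, can be told apart from a stream that is merely quiet for a moment.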

Adds 4 discovery metrics (pools_admitted/rejected, fot_screen, hot_candidates).
cargo clippy --workspace + grpc-server/ingestion/simulator tests green.
…+ funnel dashboard

Operational hardening for the unattended public-mempool shadow run.

- Reconciler retention sweep (PruneMempoolLedger + runRetentionLoop): deletes
  old mempool_predictions (children dropped ON DELETE CASCADE) on
  MEMPOOL_RETENTION_HOURS (def 48, <=0 disables) / _INTERVAL_MINS (def 60);
  metrics aether_mempool_retention_pruned_total / _errors_total.
- demo.sh: AETHER_ARM gating (A = analytical-only, executor + backrun validator
  off), profit-scorer supervised under spawn_with_restart with a crash-loop
  bound, start_log_janitor (copytruncate $LOG_DIR/*.log past DEMO_LOG_CAP_BYTES,
  independent of respawn), and AETHER_DISCOVERY opt-in carrying the
  discovery->canary / freeze-pools.toml->measurement guardrail.
- Prometheus aether.shadow.rules group (producer-down, predictions-stalled,
  retention-errors, reconciler-drops).
- Postgres autovacuum + max_connections tuning; x-logging json-file 50MBx5
  caps on all compose services.
- Backrun funnel & diagnostics Grafana dashboard (auto-provisioned).
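
The retention knobs in the first item can be illustrated with a small sketch of the window parsing and cutoff arithmetic. The function names are hypothetical, and falling back to the default on unparsable input is an assumption rather than documented behavior. The reconciler itself is built with the Go toolchain, so this Rust version only mirrors the logic.

```rust
use std::time::Duration;

/// Interpret a MEMPOOL_RETENTION_HOURS-style value.
/// Unset or unparsable input falls back to `default_hours` (an assumption);
/// a value <= 0 disables the sweep, signalled here by `None`.
pub fn retention_window(raw: Option<&str>, default_hours: i64) -> Option<Duration> {
    let hours = raw
        .and_then(|s| s.trim().parse::<i64>().ok())
        .unwrap_or(default_hours);
    if hours <= 0 {
        None
    } else {
        Some(Duration::from_secs((hours as u64).saturating_mul(3600)))
    }
}

/// Rows older than the returned Unix timestamp (seconds) are eligible for deletion.
pub fn prune_cutoff(now_unix_secs: u64, window: Duration) -> u64 {
    now_unix_secs.saturating_sub(window.as_secs())
}
```

A retention loop would compute the cutoff on each tick and delete parent rows older than it. Child rows need no separate statement when their foreign keys are declared ON DELETE CASCADE.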

go build/vet/test, promtool (12 rules), docker compose config, bash -n green.
@vercel

vercel Bot commented Jun 1, 2026

The latest updates on your projects.

| Project | Deployment | Actions | Updated (UTC) |
|---|---|---|---|
| aether | Ready | Preview, Comment | Jun 1, 2026 1:57pm |
| aether-63xv | Ready | Preview, Comment | Jun 1, 2026 1:57pm |
