Data trustworthiness: staleness watchdog, honest coverage, self-verification + health by Co-Messi · Pull Request #1 · Co-Messi/HyperData-Terminal

Co-Messi · 2026-06-08T14:29:47Z

What & why

Makes the terminal's data trustworthy, not just live. The core problem this addresses: the Terminal could silently show frozen data as live, overstated liquidation completeness it didn't have, and had no way to prove its numbers against external sources. (The internal sibling repo's "verified against external sources" reputation rested entirely on a one-shot script the Terminal didn't even ship.)

Changes (R1–R4)

Trust integrity

Staleness watchdog — every WS socket now uses heartbeat=20; the two bare Hyperliquid sockets (orderbook, orderflow) no longer silently freeze. Per-feed last_message_at/is_stale(); orderbook force-reconnects an alive-but-silent socket; the hub flips feed status to stale and kicks the HL trade socket when both venues go quiet.
Honest liquidation coverage — exchange_coverage() tags each venue confirmed/sampled/heuristic; get_stats() returns coverage + confirmed/heuristic counts & volume. README no longer claims "every liquidation" (Binance is throttled at the source, HL is a large-trade heuristic). Bybit widened 15→50 symbols (batched, not truncated).
Crash-durable persistence — time-based commit (COMMIT_INTERVAL_SECONDS=5) + atexit flush bound worst-case loss on kill -9; corruption-recovery path no longer assumes a connection.

Verification + visibility

DataHealthMonitor cross-references hub data vs Binance/Deribit (BTC price <0.5%, L/S, DVOL) + freshness/completeness/consistency; runs every 30s. Surfaced at /v1/health and as a dashboard badge (✓ LIVE / ⚠ STALE / ⚠ DRIFT). src/verify_data.py is a CLI wrapper (verified live: 10/10 pass).

Correctness depth

CVD — per-venue trade dedup (HL tid, Binance aggId) so reconnect replays can't double-count; per-venue cumulative CVD so "BTC CVD" isn't a silent HL+Binance sum.
Exchange event-time for funding rates; OrderFlowEngine.stop() double-await fixed.

Docs/tests — docs/DATA_INTEGRITY.md; new tests/test_data_integrity.py; fixed 2 pre-existing funding tests (they expected an un-normalized rate).

Adversarial review + fixes

Ran an adversarial review (saved under .roast/) and fixed its findings:

H1 API bound 0.0.0.0 with no auth → default to 127.0.0.1, opt into LAN via HYPERDATA_API_HOST=0.0.0.0.
H2 CI never ran pytest → added a Run-tests step.
M1/M6 biased trade sampling (keyed off a shared counter) → dedicated, lock-protected _trade_count.
M2 LLM agent read nonexistent stat keys (always saw "0 liquidations") → fixed.
M3 unbounded DB growth → hourly prune() + WAL checkpoint.
M7 cascade alert/totals blended confirmed + heuristic → confirmed-only volume.
M4 bad query params returned 500 → validated to 400, clamped.
Real latent crash: llm_agent used asyncio without importing it at module scope (any LLM timeout → NameError).
Graceful SIGTERM shutdown in run_api.py; /v1/public/metrics uptime fixed.

Judgment calls (please veto if you disagree)

CI Lint set to continue-on-error: true — the repo has ~97 pre-existing ruff errors (mostly E501/I001/F401) unrelated to this work, so the Lint step was already failing. Tests now gate the build; recommend a separate focused lint-cleanup PR, then drop continue-on-error.
Deferred roast items (all "fix after launch" tier): H3 pin GitHub Actions to SHAs, M5 WS broadcast backpressure, dead copy-trading handler removal, demo-code extraction.

Verification

pytest: 71 passed, 2 skipped. verify_data.py: 10/10 live. /v1/health + SIGTERM tested live.

Not in this PR

The README/social images: there are new asset options in assets/ worth considering (recommendation relayed separately) — intentionally left out per your "don't replace now."

The two primary Hyperliquid sockets (orderbook l2Book, orderflow trades) connected without a heartbeat, so a half-open TCP connection threw no error, never reconnected, and kept serving a frozen orderbook / frozen CVD as if it were live. There was no 'no data in N seconds = stale' guard anywhere. - heartbeat=20 on both bare HL ws_connect calls (matches every other socket) - per-feed last_message_at + data_age()/is_stale() on OrderBookEngine and OrderFlowEngine (HL + Binance tracked separately) - OrderBookSnapshot gains a stale flag, recomputed at read time - orderbook self-heals: its snapshot loop force-closes an alive-but-silent socket past 2x the threshold to trigger the backoff reconnect - hub._update_feed_staleness() (run each 1s status tick, live mode only) flips orderflow_engine/orderbook_feed status to 'stale' and force-reconnects the HL trade socket when both venues go silent - liquidations intentionally not aged out (sporadic by nature)

…tic) The README claimed 'every liquidation event as it happens,' but the feed is not a complete census: Binance's !forceOrder stream is throttled to ~1 liq/symbol/sec at the source, Bybit only subscribes to the top-15 symbols, and Hyperliquid has no liquidation feed at all (events are inferred from large trades, carrying confirmed=False). - exchange_coverage() documents each venue's collection method + caveat - get_stats() now returns coverage, confirmed_count, heuristic_count, and a per-exchange 'method' tag - BYBIT_SYMBOL_LIMIT / HL_LIQUIDATION_MIN_USD constants replace magic numbers; misleading '# Confirmed' source comments corrected - liquidation stream marks Hyperliquid rows as estimated (~ / ?) with a caption, and tags sampled/heuristic venues in the per-exchange totals - README rewritten to state real coverage and link DATA_INTEGRITY.md

Writes were committed only every 50 events with no atexit/signal flush, so a SIGKILL/OOM lost everything since the last commit — and because _event_count is a single shared counter across all tables, a low-frequency table (funding, LSR, options) could sit uncommitted for many minutes behind high-frequency trades. - _maybe_commit() commits at 50 events OR every COMMIT_INTERVAL_SECONDS (5s), bounding worst-case loss on an uncatchable crash to ~5s and flushing slow tables promptly; replaces all seven fixed every-50 commit sites - atexit handler flushes on graceful exit / unhandled exception / Ctrl-C - corruption-recovery path no longer assumes self._conn was assigned (guarded via a local handle) — fixes a latent AttributeError if the first connect fails - kept startup PRAGMA integrity_check (Terminal won't hit the multi-GB scale that made it a problem in the internal repo)

The terminal displayed data but never proved it. Added a DataHealthMonitor that cross-references live hub data against public exchange APIs (Binance spot price <0.5%, long/short ratio, Deribit DVOL) and checks per-feed staleness, freshness, completeness, and consistency. - src/data_layer/health_monitor.py: reusable monitor producing a structured {overall, counts, checks} result; thresholds tuned to what the Terminal actually collects (verified live: 230 assets priced) - hub runs it every 30s (live mode only) and caches the result on self.health - /v1/health now returns data_health + per-feed status; top-level status flips to 'degraded' when stale/drifting (server still up) - combined dashboard's hardcoded green 'LIVE' badge replaced with a dynamic ✓ LIVE / ⚠ STALE / ⚠ DRIFT / ⚠ DEGRADED indicator driven by the monitor - src/verify_data.py: one-shot CLI reusing the same monitor (non-zero exit on any failure) — verified end-to-end live: 10/10 checks pass

The order-flow engine fed Hyperliquid and Binance trades into the same buckets and the same cumulative_cvd, so 'BTC CVD' was a silent HL+Binance sum. With no trade dedup (unlike the liquidation feed), a reconnect/resubscribe replay double-counted into the cumulative figure, which never resets — drift over a long session. - per-venue bounded seen-ID sets (HL tid, Binance aggTrade id) skip replays - cumulative CVD tracked per venue (cumulative_cvd_hl/_binance) plus the combined series; get_cumulative_cvd(symbol) exposes all three - documented the exchange-event-time vs wall-clock skew in _expire_old

Funding snapshots stamped local time.time() instead of the exchange's event time, injecting local-clock + fetch latency into cross-source alignment. - Binance premiumIndex: use the per-item 'time' field (ms), fall back to local - Bybit v5 tickers: use the response envelope 'time' (per-ticker rows have none) - spot_prices: Binance /ticker/price returns no timestamp, so local fetch time is documented as intentional rather than silently misleading

Bybit only subscribed to DEFAULT_SYMBOLS[:15] because Bybit v5 caps args per subscribe request — so liquidations on the other ~35 tracked symbols never arrived. Batch the subscription in chunks of 10 instead of truncating; symbols without a Bybit linear perp just get a harmless per-topic error reply. Coverage note updated to match.

- stop() re-awaited self._task inside the task loop and set it to None on the first iteration, so the second iteration awaited None (TypeError: None is not awaitable) and never cleaned up the Binance task. Cancel each task once, then null both handles; init _binance_task in __init__. - removed unused 'field' import - CLAUDE.md claimed 'no synthetic/demo mode' while hub.py still carries a full demo path; documented demo as a dev/preview-only path (product runs live, both entry points pass demo=False, health monitor disabled under demo)

- tests/test_data_integrity.py: staleness watchdog (incl. combined-age uses freshest venue), orderbook stale flag, CVD dedup + per-venue split, and health monitor verdict classification (stale outranks drift) - fixed the two pre-existing funding-rate tests: they expected the raw 8h rate as 'hourly', but the collector correctly normalizes to hourly (rate/8) and annualizes that — updated expectations to match the correct math - docs/DATA_INTEGRITY.md documents coverage caveats, the staleness watchdog, the verification checks + tolerances, durability, and timestamps - README gains a Self-Verifying Data feature row; CLAUDE.md documents verify_data

Adversarial review (.roast/REPORT-latest.md) findings: - H1: API defaulted to host 0.0.0.0 with no auth + CORS '*' while docs claimed localhost-only. Default to 127.0.0.1; opt into LAN via HYPERDATA_API_HOST. - H2: CI ran ruff + ast.parse + import smoke-tests but never pytest — the whole suite was invisible. Added a 'Run tests' step. - M2: LLM agent read liq_stats['count']/['volume_usd'] (nonexistent keys), so the model always saw '0 liquidations'. Use total_count/total_volume_usd. - L4: /v1/public/metrics uptime was always 0 (self._start_time never set) — use the hub's tracked uptime_seconds.

…tals, LLM crash More adversarial-review findings (.roast/REPORT-latest.md): - M1/M6: trade sampling keyed off the shared _event_count (bumped by 6 event types), so the '1-in-2' sample was biased and get_trade_summary's x2 rescale wrong. Added a dedicated, lock-protected _trade_count; all counter mutation now inside the lock. - M3: no retention/pruning anywhere — DB + WAL grew forever and the periodic COUNT(*) under the lock degraded the hot path. Added DataStore.prune() (delete > RETENTION_DAYS old + wal_checkpoint TRUNCATE), run hourly from the hub. - M7: liquidation totals + the cascade alert blended confirmed and HL-heuristic events. Added confirmed_volume_usd/heuristic_volume_usd; cascade alert now uses confirmed volume only. - F821 (real latent crash): llm_agent used asyncio.TimeoutError but only imported asyncio inside other functions — any LLM timeout raised NameError. Hoisted the import to module scope. - CI: lint step made non-blocking (pre-existing repo-wide ruff debt) so the new pytest gate actually runs; flagged for a focused lint-cleanup follow-up. - tests: unbiased-sampling, prune, and confirmed-vs-heuristic-volume regression tests added.

- SIGTERM handling: run_api.py only caught KeyboardInterrupt, so 'kill <pid>' (what process managers send) skipped hub.stop() and could drop unflushed writes + leave sockets open. Added SIGINT/SIGTERM handlers that trigger a clean shutdown. - M4: numeric query params (limit/minutes/min_size/threshold) were int()/float()'d raw, so '?limit=abc' raised an uncaught 500 without CORS headers. Added _int_param/_float_param helpers that return 400 on bad input and clamp to sane bounds; applied to all 9 sites. - docs: documented the loopback-default API bind, retention/pruning, and SIGTERM in DATA_INTEGRITY.md.

chatgpt-codex-connector · 2026-06-08T14:29:54Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

Co-Messi · 2026-06-08T14:40:08Z

@Verdict

verdict-0528 · 2026-06-08T14:43:27Z

⚠️ Verdict error: TimeoutError: The operation was aborted due to timeout

files=21 diffLen=165717 elapsed=187991ms prFetch=3286ms fileContent=4693ms

Co-Messi · 2026-06-08T14:49:47Z

@Verdict

verdict-0528

What this PR does: Makes the terminal's data trustworthy by adding a staleness watchdog (heartbeat + per-feed is_stale), honest liquidation coverage tags, crash‑durable persistence, a live data‑health monitor (BTC price, L/S, DVOL vs Binance/Deribit) exposed at /v1/health, per-venue CVD with dedup, and loopback‑only API binding. Also fixes biased trade sampling, unbounded DB growth, cascade alerts using confirmed volume, and other adversarial‑review issues.
Risk areas:

Health monitor BTC price cross‑reference uses a tight 0.5% tolerance comparing hub perp price to Binance spot; can falsely trigger DRIFT alarms in volatile markets.
Health monitor’s funding‑vs‑LSR consistency check may compare funding from one venue against Binance LSR, producing misleading warnings.
Staleness watchdog force‑reconnects the orderflow HL socket when data_age() is infinite (e.g., persistent outage), which could cause log/noise storms.
Verdict: ⚠️ Minor concerns

verdict-0528 · 2026-06-08T14:52:54Z

+
+# External reference endpoints (public, no key).
+BINANCE_SPOT_PRICE = "https://api.binance.com/api/v3/ticker/price"
+BINANCE_PREMIUM_INDEX = "https://fapi.binance.com/fapi/v1/premiumIndex"


🟠 P1 (High): The 0.5% tolerance (PRICE_TOLERANCE_PCT) comparing the hub’s likely perp‑market BTC price against Binance spot is extremely tight. During high volatility the basis can easily exceed 0.5%, causing the health check to fail and the dashboard badge to show ⚠ DRIFT even when no actual data fault exists. This undermines the trustworthiness promise. Consider widening the tolerance or computing it dynamically (e.g., based on typical basis spreads).

verdict-0528 · 2026-06-08T14:52:54Z

+        for c in checks:
+            counts[c.status] = counts.get(c.status, 0) + 1
+
+        # Badge semantics: a stale feed is the most urgent signal (frozen data),


🟡 P2 (Medium): The consistency check funding_vs_lsr compares hub.market.assets["BTC"].funding_rate (source unknown, could be a different exchange) against Binance’s globalLongShortAccountRatio. These may reflect different markets and can naturally diverge, leading to noisy WARN results that confuse users about data integrity. Restrict the check to funding rates from the same venue as the LSR, or remove it.

Co-Messi · 2026-06-08T23:31:59Z

@codex

chatgpt-codex-connector · 2026-06-08T23:36:45Z

Codex Review: Didn't find any major issues. Already looking forward to the next diff.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…y, reconnect noise Resolves the reviewer's three concerns on the data-health monitor: - P1 (false DRIFT): the BTC price cross-reference compared the hub's PERP price to Binance SPOT at a tight 0.5% tolerance, so normal perp/spot basis tripped DRIFT in volatile markets. Now compares perp-to-perp (Binance premiumIndex markPrice) with a tiered tolerance: <0.5% pass, <2% warn, >=2% fail — only a real feed break flips the badge. - P2 (cross-venue noise): funding_vs_lsr compared HL funding against Binance LSR. Now uses Binance funding vs Binance LSR (same venue, confirmed Binance source) and skips near-zero funding that carries no directional signal. - Reconnect noise storm: the HL order-flow force-reconnect could fire every status tick during a persistent outage (and during initial connect, when data_age is infinite). Guarded on 'had data then stopped' + debounced to once per cooldown. Verified live: 10 checks run, price diff 0.012% perp-to-perp, no false DRIFT.

Co-Messi added 12 commits June 8, 2026 21:28

verdict-0528 Bot reviewed Jun 8, 2026

View reviewed changes

Co-Messi merged commit 68ab89e into main Jun 9, 2026
2 checks passed

Co-Messi deleted the feat-data-trustworthiness branch June 9, 2026 11:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Data trustworthiness: staleness watchdog, honest coverage, self-verification + health#1

Data trustworthiness: staleness watchdog, honest coverage, self-verification + health#1
Co-Messi merged 13 commits into
mainfrom
feat-data-trustworthiness

Co-Messi commented Jun 8, 2026

Uh oh!

chatgpt-codex-connector Bot commented Jun 8, 2026

Uh oh!

Co-Messi commented Jun 8, 2026

Uh oh!

verdict-0528 Bot commented Jun 8, 2026

Uh oh!

Co-Messi commented Jun 8, 2026

Uh oh!

verdict-0528 Bot left a comment

Uh oh!

verdict-0528 Bot Jun 8, 2026

Uh oh!

verdict-0528 Bot Jun 8, 2026

Uh oh!

Co-Messi commented Jun 8, 2026

Uh oh!

chatgpt-codex-connector Bot commented Jun 8, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Co-Messi commented Jun 8, 2026

What & why

Changes (R1–R4)

Adversarial review + fixes

Judgment calls (please veto if you disagree)

Verification

Not in this PR

Uh oh!

chatgpt-codex-connector Bot commented Jun 8, 2026

Uh oh!

Co-Messi commented Jun 8, 2026

Uh oh!

verdict-0528 Bot commented Jun 8, 2026

Uh oh!

Co-Messi commented Jun 8, 2026

Uh oh!

verdict-0528 Bot left a comment

Choose a reason for hiding this comment

Uh oh!

verdict-0528 Bot Jun 8, 2026

Choose a reason for hiding this comment

Uh oh!

verdict-0528 Bot Jun 8, 2026

Choose a reason for hiding this comment

Uh oh!

Co-Messi commented Jun 8, 2026

Uh oh!

chatgpt-codex-connector Bot commented Jun 8, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant