The screener is the surface, but the thing behind it is a snapshot pipeline — 10.8M+ rows across 13,963 markets, refreshed daily. I want to ask people who actually backtest prediction-market strategies what fields they keep wishing were there.
What the snapshots currently capture (per market per refresh):
price (yes-side mid)
volume (cumulative) + volume_24h
liquidity (Gamma's number, not orderbook-derived)
one_day_change, one_hour_change, one_week_change
closed, archived, active flags
end_date, category, tags
outcome_prices (full multi-outcome where applicable)
What I know is missing and have not (yet) added:
1. Order-book depth at multiple levels. Currently zero. Would need to hit CLOB per market, expensive at 13k markets but maybe doable for a curated top-N by liquidity.
2. Trade-by-trade tape. Not snapshotted, only aggregated volumes. Without the tape you can't reconstruct VWAP or detect single-fill spikes.
3. Resolution outcome + timestamp for closed markets. We have closed=true, but the resolved outcome is not always cleanly joined back to the historical snapshots, so survivorship-bias-aware backtests are awkward.
4. News/event tagging. Markets that moved 20% in an hour: was there a tweet, a court ruling, an earnings print? Currently zero linkage.
5. Funding-rate / borrow analogues. Polymarket doesn't have these, but the cost-of-carry equivalent (capital lockup until resolution) is computable from end_date + price and we don't expose it as a field.
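To make the cost-of-carry item concrete, here is a minimal Python sketch of one way such a field could be derived from end_date + price. This is an illustration, not what the pipeline does: the function names, the snapshot timestamp argument, and the `annual_rate` opportunity-cost parameter are all assumptions, and "carry" is defined here as (a) the simple annualized return if the YES side pays out 1.0 at end_date and (b) the opportunity cost of the capital locked until then.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def days_to_resolution(snapshot_ts: datetime, end_date: datetime) -> float:
    """Days of capital lockup remaining at snapshot time (may be <= 0)."""
    return (end_date - snapshot_ts).total_seconds() / 86400.0


def annualized_yield_if_yes(price: float, snapshot_ts: datetime,
                            end_date: datetime) -> Optional[float]:
    """Simple (non-compounded) annualized return from buying YES at `price`
    and receiving 1.0 at end_date. Returns None when undefined."""
    days = days_to_resolution(snapshot_ts, end_date)
    if not (0.0 < price < 1.0) or days <= 0:
        return None
    gross_return = (1.0 - price) / price
    return gross_return * 365.0 / days


def lockup_cost(price: float, snapshot_ts: datetime, end_date: datetime,
                annual_rate: float = 0.05) -> Optional[float]:
    """Opportunity cost per share of locking `price` until end_date,
    at an assumed annual rate (hypothetical default of 5%)."""
    days = days_to_resolution(snapshot_ts, end_date)
    if days <= 0:
        return None
    return price * annual_rate * days / 365.0


# Usage: a YES share at 0.80 with 100 days left until end_date.
ts = datetime(2024, 1, 1, tzinfo=timezone.utc)
end = ts + timedelta(days=100)
print(annualized_yield_if_yes(0.80, ts, end))
print(lockup_cost(0.80, ts, end))
```

Both numbers ignore fees and the possibility that a market resolves before or after end_date, so treat them as rough per-row annotations rather than a tradable rate.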
Question for anyone running models on prediction-market data:
Which of (1)–(5) would change what you can backtest vs just being nice-to-have?
Is there a 6th thing I'm not listing that you've had to scrape yourself?
If you could only add one field per snapshot row, what would it be?
The full historical pull (SQLite + CSV) is on Gumroad — $9, freely redistributable for research. The screener stays free. Answers here genuinely shape what the next refresh adds, so be specific.
Methodology background on the existing crash-signal column lives in Discussion #2 if useful context.