BTC event trading pipeline update #80

Open
EvW1329 wants to merge 1 commit into main from wyf/polymarket_btc_range

Collaborator

@EvW1329 EvW1329 commented Apr 16, 2026

Summary

This PR adds a complete end-to-end pipeline for trading Polymarket BTC binary
contracts using a direct event-probability model. It includes data ingestion
(both synthetic and real), model training, per-contract predictions, a
portfolio strategy, unit tests, and documentation.

What's included

Core pipeline (polymarket/)

| Module | Description |
| --- | --- |
| contracts.py | Parse Polymarket JSONL snapshots into typed ContractMetadata objects; extract strike/direction from question text (supports above, below, reach/hit/exceed/surpass, K-suffix like $88K) |
| event_features.py | 15 causal features in 3 groups: contract-level (log-moneyness, sigma_h, Black-Scholes prob), BTC price (returns, realised vol, RSI, momentum), on-chain lagged 24h (MVRV, hash-rate, difficulty) |
| event_dataset.py | Synthetic training data: weekly BTC settlement times (2018–2025), 7 relative strikes, 168 hourly observations per event (~363K rows total) |
| event_model.py | Calibrated logistic regression with temporal train/val split; isotonic calibration on held-out fold; AUC 0.968 on held-out data |
| data_builder.py | Build synthetic OHLCV feather files per contract; generate per-contract event_probs.csv for the backtester |
| real_data_builder.py | Build feather files from a real Polymarket trade-history parquet; parses BTC markets, checks 60% hourly coverage, forward-fills gaps ≤ 6h, writes real_contracts.jsonl |
| synthetic_prices.py | Log-normal price simulation for synthetic contract OHLCV |
| settlement.py | Settlement price resolution helpers |
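
The contract-level features in event_features.py (log-moneyness, sigma_h, Black-Scholes prob) are named above without formulas. The following is a minimal sketch of one common way to compute them, not the module's actual code. It assumes a driftless log-normal model, an annualised volatility input scaled to the remaining horizon, and 24/7 trading; every function name and the `HOURS_PER_YEAR` constant are hypothetical.

```python
import math

# Assumption: BTC trades around the clock, so a year has 24 * 365 hours.
HOURS_PER_YEAR = 24 * 365


def horizon_sigma(annual_vol: float, hours_to_expiry: float) -> float:
    """Scale an annualised volatility to the remaining horizon (sigma_h)."""
    return annual_vol * math.sqrt(hours_to_expiry / HOURS_PER_YEAR)


def log_moneyness(spot: float, strike: float) -> float:
    """ln(S / K): positive when spot is above the strike."""
    return math.log(spot / strike)


def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_prob_above(spot: float, strike: float,
                  annual_vol: float, hours_to_expiry: float) -> float:
    """P(S_T > K) under a driftless log-normal model, i.e. N(d2)."""
    sigma_h = horizon_sigma(annual_vol, hours_to_expiry)
    if sigma_h <= 0.0:
        # At (or past) expiry the outcome is already determined.
        return 1.0 if spot > strike else 0.0
    d2 = (log_moneyness(spot, strike) - 0.5 * sigma_h ** 2) / sigma_h
    return norm_cdf(d2)


# Illustrative inputs only: one day to expiry, 50% annualised vol.
print(round(bs_prob_above(111_500.0, 108_000.0, 0.5, 24.0), 3))
```

For a "below" contract the same sketch would use `1 - bs_prob_above(...)`.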

Strategy (user_data/strategies/)

DualModelPolymarketPortfolio — freqtrade IStrategy that:

  • Loads pre-computed event_probs.csv fair values per contract
  • Computes edge = fair_value − market_price
  • Sizes positions with fractional Kelly + OTM distance penalty
  • Enforces a one-position-per-expiry gate (same-expiry contracts are correlated)
  • Reads contracts_jsonl and predictions_dir from the freqtrade config JSON (no strategy edits needed to switch datasets)
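
The edge and sizing bullets above can be sketched as follows. This is an illustration under stated assumptions, not the strategy's code. For a YES share bought at price p that pays 1 on settlement, the full Kelly fraction with fair probability q is (q − p) / (1 − p). The `kelly_scale`, `otm_decay` and `max_fraction` parameters, and the exponential form of the OTM penalty, are hypothetical.

```python
import math


def edge(fair_value: float, market_price: float) -> float:
    """edge = fair_value - market_price, as defined in the PR description."""
    return fair_value - market_price


def kelly_fraction_yes(fair_value: float, market_price: float) -> float:
    """Full Kelly fraction for buying a binary YES share at market_price."""
    if not 0.0 < market_price < 1.0:
        return 0.0  # degenerate prices carry no usable odds
    return max(0.0, edge(fair_value, market_price) / (1.0 - market_price))


def position_fraction(fair_value: float, market_price: float,
                      spot: float, strike: float,
                      kelly_scale: float = 0.25,   # hypothetical fraction of full Kelly
                      otm_decay: float = 5.0,      # hypothetical penalty steepness
                      max_fraction: float = 0.10   # hypothetical hard cap
                      ) -> float:
    """Fractional Kelly stake, shrunk the further an 'above' contract is OTM."""
    # OTM distance in log terms; zero when spot is already above the strike.
    otm_distance = max(0.0, math.log(strike / spot))
    penalty = math.exp(-otm_decay * otm_distance)
    stake = kelly_scale * kelly_fraction_yes(fair_value, market_price) * penalty
    return min(max_fraction, stake)


# Illustrative numbers only: fair value 0.60 against a market price of 0.50.
print(position_fraction(0.60, 0.50, spot=111_500.0, strike=108_000.0))
```

A negative edge yields a zero stake, so the sketch never sizes a position against the model's own fair value.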

Alpha factor (alpha/)

EventProbAlpha — pluggable IAlpha that reads fair-value CSVs and emits ml_edge and ml_fair_value signals.

Preparation script (scripts/prepare_event_model.py)

Four-step pipeline runner:

| Step | Flag to skip | Output |
| --- | --- | --- |
| 0: Build feathers | --skip-feathers | *.feather per contract (synthetic or real via --use-real-data) |
| 1: Build training data | --skip-training-data | event_model_training.parquet |
| 2: Train model | | event_model.pkl |
| 3: Generate predictions | | *-event_probs.csv per contract |
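
A rough sketch of how a runner might wire these flags with `argparse`. The flag names come from this PR; their types, defaults, and the `planned_steps` helper are assumptions for illustration and may differ from scripts/prepare_event_model.py.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Event model preparation (sketch)")
    parser.add_argument("--skip-feathers", action="store_true",
                        help="skip Step 0 (feather generation)")
    parser.add_argument("--skip-training-data", action="store_true",
                        help="skip Step 1 (training data build)")
    parser.add_argument("--use-real-data", action="store_true",
                        help="route Step 0 to the real-data builder")
    parser.add_argument("--parquet-path", default=None,
                        help="trade-history parquet used with --use-real-data")
    return parser


def planned_steps(args: argparse.Namespace) -> List[str]:
    """Return the steps that would run, in order, for the given flags."""
    steps = []
    if not args.skip_feathers:
        source = "real" if args.use_real_data else "synthetic"
        steps.append(f"0: build feathers ({source})")
    if not args.skip_training_data:
        steps.append("1: build training data")
    # Steps 2 and 3 have no skip flag listed, so they always run here.
    steps.append("2: train model")
    steps.append("3: generate predictions")
    return steps


# Explicit argument list so the example does not depend on sys.argv.
print(planned_steps(build_parser().parse_args(["--skip-feathers"])))
```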

Tests (tests/test_polymarket/)

61 unit tests covering contract parsing, event model training/calibration, settlement resolution, and synthetic price
generation.

Documentation (docs/polymarket/)

  • README.md — quick start, strategy reference, backtesting guide, real-data workflow, configurable paths, model
    metrics, limitations
  • training-guide.md — architecture, feature engineering, label construction, temporal split, calibration, inference,
    retraining guidance

freqtrade submodule

Updated to mlsys-io/freqtrade@5fb001168, which includes the precision.amount fix (merged as PR #12).

Validation

End-to-end backtest on real Sep-5 2025 BTC contracts (108K/110K/112K/114K):

  • 1 trade opened: BTCABOVE108K-SEP5-YES (BTC was ~$111,500 at expiry → settled YES)
  • +103.47% per trade, +2.63% portfolio return, 0% drawdown
  • Same-expiry gate correctly blocked 110K/112K/114K after 108K was entered
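
The same-expiry gate seen in this backtest can be sketched as below. This is a simplified illustration, not the strategy's code: it assumes candidates are ranked by edge and that at most one contract per expiry is accepted. The `Candidate` type, the `apply_expiry_gate` name, the pair names other than BTCABOVE108K-SEP5-YES, and all edge values are made up for the example.

```python
from typing import Iterable, List, NamedTuple


class Candidate(NamedTuple):
    pair: str
    expiry: str
    edge: float


def apply_expiry_gate(candidates: Iterable[Candidate],
                      held_expiries: Iterable[str] = (),
                      min_edge: float = 0.0) -> List[Candidate]:
    """Accept at most one candidate per expiry, preferring the largest edge."""
    taken = set(held_expiries)
    accepted = []
    for cand in sorted(candidates, key=lambda c: c.edge, reverse=True):
        if cand.edge <= min_edge or cand.expiry in taken:
            continue  # no edge, or this expiry already has a position
        accepted.append(cand)
        taken.add(cand.expiry)
    return accepted


# Edge values below are invented purely to exercise the gate.
candidates = [
    Candidate("BTCABOVE108K-SEP5-YES", "2025-09-05", 0.12),
    Candidate("BTCABOVE110K-SEP5-YES", "2025-09-05", 0.08),
    Candidate("BTCABOVE112K-SEP5-YES", "2025-09-05", 0.05),
    Candidate("BTCABOVE114K-SEP5-YES", "2025-09-05", 0.02),
]
print([c.pair for c in apply_expiry_gate(candidates)])
```

Because same-expiry contracts settle on the same BTC price, holding several of them would concentrate risk rather than diversify it, which is the motivation the PR gives for the gate.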

- Add polymarket/real_data_builder.py: build feather files from a
  Polymarket trade-history parquet instead of synthetic prices.
  Parses BTC binary markets, checks 60% hourly coverage in the 7-day
  window, forward-fills gaps ≤ 6h, and writes real_contracts.jsonl.

- Extend polymarket/contracts.py: support reach/hit/exceed/surpass
  question patterns; parse K-suffix strikes ($88K → 88,000); add
  skip_unparseable flag to load_contracts() so mixed JSONL files do
  not crash the strategy.

- Extend scripts/prepare_event_model.py: --use-real-data and
  --parquet-path flags route Step 0 to real_data_builder instead of
  the synthetic generator.

- Extend DualModelPolymarketPortfolio: contracts_jsonl and
  predictions_dir are now configurable via the freqtrade config JSON,
  enabling real-data backtests without strategy file edits.

- Pin freqtrade submodule to mlsys-io/freqtrade@5fb001168 (includes
  precision.amount fix, merged as PR #12).

- Update .gitignore: exclude user_data/data/polymarket_ml_real/ and
  report/.

- Update docs/polymarket/README.md and training-guide.md to document
  all new flags, config keys, question patterns, and real-data artefacts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
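
The question patterns and K-suffix handling described for polymarket/contracts.py in the commit message above can be sketched with a single regular expression. This is a hedged illustration, not the parser shipped in the PR: the `parse_strike` name, the regex, the sample questions, and the choice to map reach/hit/exceed/surpass to the "above" direction are assumptions.

```python
import re
from typing import Optional, Tuple

# Direction keyword, then the first dollar amount, with an optional K suffix.
_QUESTION_RE = re.compile(
    r"\b(?P<verb>above|below|reach|hit|exceed|surpass)(?:e?s)?\b"
    r"[^$]*?\$(?P<num>\d[\d,]*(?:\.\d+)?)(?P<k>k\b)?",
    re.IGNORECASE,
)


def parse_strike(question: str) -> Optional[Tuple[str, float]]:
    """Return (direction, strike) or None when the question is unparseable."""
    match = _QUESTION_RE.search(question)
    if match is None:
        return None
    strike = float(match.group("num").replace(",", ""))
    if match.group("k"):
        strike *= 1_000  # "$88K" means 88,000
    # Assumption: every keyword except "below" implies an upside contract.
    direction = "below" if match.group("verb").lower() == "below" else "above"
    return direction, strike


# Sample questions are invented for the example.
print(parse_strike("Will Bitcoin reach $88K by Friday?"))
print(parse_strike("Will BTC be below $108,000 on September 5?"))
print(parse_strike("Who wins the election?"))
```

Returning `None` instead of raising gives a caller the hook it needs to drop unparseable rows, which is the behaviour the skip_unparseable flag is described as enabling.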
@EvW1329 EvW1329 force-pushed the wyf/polymarket_btc_range branch from b2fc200 to cb52cac Compare April 16, 2026 05:02