| Commodity / Exchange | Exchange Rule (Last Trading Day) | Pandas Operation (from YYYY-MM-01) |
|---|---|---|
| NYM - Heating Oil | 2nd last business day of the previous month. | - pd.offsets.BDay(2) |
| ICE - Brent Crude | Last business day of the second month prior to the contract month. | - pd.DateOffset(months=1) - pd.offsets.BDay(1) |
| NYM - Natural Gas | 3rd-to-last business day of the month preceding the contract month. | - pd.offsets.BDay(3) * |
| NYM - RBOB Gasoline | Last business day of the previous month. | - pd.offsets.BDay(1) |
| CBT - Soybeans | 10th business day of the contract month. | - pd.offsets.BDay(1) + pd.offsets.BDay(10) |
| DCE - Soybean Oil | 10th business day of the contract month. | - pd.offsets.BDay(1) + pd.offsets.BDay(10) |
| ECBOT - Mini Wheat | 10th business day of the contract month. | - pd.offsets.BDay(1) + pd.offsets.BDay(10) |
| NYM - Micro WTI | 4 business days prior to the 25th of the preceding month (5 if the 25th is a weekend). | Custom Helper Function (See codebase) |
*Note on Natural Gas: While the NYMEX rulebook specifies 3 business days, some datasets are shifted by 4 depending on the holiday schedule. Choose BDay(3) or BDay(4) after checking against the exchange's historical holiday calendar.
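The rules in the table can be sketched as a small lookup, shown here only as an illustration. The names `RULES`, `last_trading_day`, and `micro_wti_last_trading_day` are hypothetical and are not taken from the codebase, and the sketch skips weekends only, so exchange holidays still need a proper calendar. For the 10th-business-day rule it steps back one business day and then forward ten, which avoids `BMonthBegin(1)` jumping to the following month when the 1st is itself a business day.

```python
import pandas as pd

BDay = pd.offsets.BDay


def micro_wti_last_trading_day(first: pd.Timestamp) -> pd.Timestamp:
    """Hypothetical helper: 4 business days before the 25th of the preceding
    month, or 5 if the 25th falls on a weekend (holidays are ignored)."""
    anchor = (first - pd.DateOffset(months=1)).replace(day=25)
    n = 5 if anchor.weekday() >= 5 else 4  # Saturday=5, Sunday=6
    return anchor - BDay(n)


# Rule name -> function of the contract month's first calendar day.
RULES = {
    "heating_oil": lambda d: d - BDay(2),
    "brent": lambda d: d - pd.DateOffset(months=1) - BDay(1),
    "natural_gas": lambda d: d - BDay(3),
    "rbob": lambda d: d - BDay(1),
    "tenth_business_day": lambda d: d - BDay(1) + BDay(10),
    "micro_wti": micro_wti_last_trading_day,
}


def last_trading_day(contract_month: str, rule: str) -> pd.Timestamp:
    """contract_month is 'YYYY-MM'; weekends are skipped, holidays are not."""
    return RULES[rule](pd.Timestamp(f"{contract_month}-01"))
```

For example, `last_trading_day("2024-06", "natural_gas")` evaluates to 2024-05-29, since 2024-06-01 is a Saturday and the three preceding business days are May 31, 30, and 29.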
By transforming column headers into exact expiry dates, we can compute how many days remain until each contract expires on any given historical row. This lets us construct a Continuous Term Structure (M1, M2, M3, ...) that rolls over only when a contract officially stops trading.
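As a rough sketch of that idea, the snippet below ranks the contracts still trading on an observation date by expiry and labels them M1, M2, and so on. The function names and the `expiries` mapping are hypothetical, and treating a contract as live on its last trading day (so the roll happens the day after) is an assumption of this sketch.

```python
import pandas as pd


def days_to_expiry(obs_date, expiry) -> int:
    """Calendar days between the observation date and the expiry date."""
    return (pd.Timestamp(expiry) - pd.Timestamp(obs_date)).days


def front_contracts(obs_date, expiries: dict, depth: int = 3) -> dict:
    """Label the nearest `depth` contracts still trading on obs_date as M1, M2, ...

    `expiries` maps a contract code to its last trading day. A contract is
    assumed to be live up to and including that day.
    """
    obs = pd.Timestamp(obs_date)
    live = sorted(
        (pd.Timestamp(exp), code)
        for code, exp in expiries.items()
        if pd.Timestamp(exp) >= obs
    )
    return {f"M{k + 1}": code for k, (_, code) in enumerate(live[:depth])}
```

With this convention M1 switches to the next contract only on the first row dated after the front contract's last trading day.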
ideas:
- event study around settlement
- plots against $T_m - t$ instead of $t$
- check whether the conditional distribution is normal for GARCH
- what loss
- what target -> error of measurement makes everything blow up? Intraday realized vol?
- present winsorization, dynamic one
- omega = 0
- rescaling ? std ?
This repository implements a Walk-Forward Stochastic Volatility model designed to forecast the variance of futures term structures. Unlike static models that fit parameters once, this model adapts to changing market regimes by re-calibrating its parameters at every time step.
The core engine is built with Numba JIT (Just-In-Time) compilation, which makes the repeated parameter optimization and recursive filtering fast enough to run at every step; in pure Python this would be impractically slow.
The model simulates a realistic trading scenario where future data is strictly unknown. It employs a "Rolling Refit" approach:
- The Outer Loop (Calendar Time): The model iterates through the dataset day-by-day.
- The Inner Window (Calibration & Warm-up): For every target day $t$, the model looks back at a fixed history window (e.g., the previous 60 days).
  - Fit: It finds the optimal parameters ($\lambda, \theta, \kappa, \eta$) that minimize the QLIKE loss function for that specific window.
  - Filter: Using these fresh parameters, it runs a recursive filter through the window to estimate the current latent variance state ($v_t$).
- The Forecast: It projects the state forward to predict the variance for the next day ($h_{t+1}$).
This process (fit on history, filter the state, forecast the next day) is repeated for every day in the dataset.
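For reference, one common form of the QLIKE loss is sketched below. Whether the repository uses exactly this form is an assumption; another frequently used variant, $RV/h - \log(RV/h) - 1$, differs only by terms that do not depend on the forecast $h$, so both have the same minimizer. The function name `qlike` is illustrative.

```python
import math


def qlike(rv, h) -> float:
    """Average QLIKE loss, using the form log(h) + rv / h.

    rv: observed realized variances, h: forecast variances (all > 0).
    For a single observation the loss is minimized at h == rv.
    """
    if len(rv) != len(h):
        raise ValueError("rv and h must have the same length")
    return sum(math.log(hi) + ri / hi for ri, hi in zip(rv, h)) / len(rv)
```

The loss is asymmetric: under-predicting variance is penalized more heavily than over-predicting it by the same factor.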
The heart of the model is the interaction between the Latent Variance State ($v_t$) and the Expected Variance ($h_t$):
- $v_t$ (Latent State): The instantaneous "spot" variance at the start of the day. This is a hidden variable we cannot see; we must estimate it.
- $h_t$ (Expectation): The total variance we expect to accumulate over the course of the day (integrated variance). This is what we compare against the actual market data.
The model uses a feedback loop (conceptually similar to a Kalman Filter) to update its belief about $v_t$:
- Calculate Expectation ($v \to h$): Given the current spot variance $v_t$, the model calculates $h_t$ by integrating the variance process over the time step $\tau$. It accounts for mean reversion (damping) and the "Samuelson Effect" (variance increasing as maturity approaches). $$h_t = \text{Damping}(\tau) \times [ \text{Term}_A(\theta) + \text{Term}_B(v_t) ]$$
- Observe & Compare: The day passes, and the market generates an actual Realized Variance ($RV_{obs}$). The model calculates the "Surprise" or prediction error: $$\text{Surprise} = RV_{obs} - h_t$$
- Update State ($h \to v$): The model updates the latent state for the next day ($v_{t+1}$) by combining the natural drift (mean reversion) with the shock from the surprise. $$v_{t+1} = \underbrace{\left[ \theta (1 - e^{-\kappa \Delta t}) + v_{t} e^{-\kappa \Delta t} \right]}_{\text{Natural Decay (Drift)}} + \underbrace{\eta (RV_{obs} - h_t)}_{\text{Correction (Shock)}}$$
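This recursion ensures that if the market is wilder than expected ($RV_{obs} > h_t$), the surprise is positive and $v_{t+1}$ is revised upward (for $\eta > 0$); if the market is calmer than expected, $v_{t+1}$ is revised downward.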
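One step of this feedback loop might look like the sketch below. The state update follows the $v_{t+1}$ recursion given in the text. The closed forms used for $\text{Damping}$, $\text{Term}_A$, and $\text{Term}_B$ are assumptions of this sketch (the expected integral of a mean-reverting variance over one step, scaled by $e^{-\lambda \tau}$ with $\tau$ read as time to maturity), as are the function names and the small positivity floor on the state.

```python
import math


def expected_variance(v, theta, kappa, lam, tau, dt=1.0):
    """h_t = Damping(tau) * (Term_A(theta) + Term_B(v)) under ASSUMED forms.

    Assumption: Term_A + Term_B is the expected integral of a mean-reverting
    variance over one step dt, and Damping(tau) = exp(-lam * tau).
    """
    w = (1.0 - math.exp(-kappa * dt)) / kappa  # integral of exp(-kappa*s) on [0, dt]
    term_a = theta * (dt - w)                  # contribution of the long-run mean
    term_b = v * w                             # contribution of the current state
    return math.exp(-lam * tau) * (term_a + term_b)


def update_state(v, rv_obs, h, theta, kappa, eta, dt=1.0, floor=1e-12):
    """v_{t+1} = drift toward theta + eta * surprise, as in the recursion above.

    The floor is an addition of this sketch to keep the state positive.
    """
    decay = math.exp(-kappa * dt)
    drift = theta * (1.0 - decay) + v * decay
    return max(drift + eta * (rv_obs - h), floor)
```

When $v_t = \theta$ and the observation matches the expectation, the state stays at $\theta$; a positive surprise pushes the next state above the pure-drift value by $\eta$ times the surprise.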
It is crucial to distinguish between the two ways the model moves through time:
- The Outer Loop:
  - Role: Moves forward through calendar time (Jan 1 $\to$ Jan 2 $\to$ Jan 3).
  - Action: At each step, it defines a dataset slice called `sub_window` (e.g., indices `i-60` to `i`).
  - Output: A single forecast value and a set of parameters for that specific day.
- The Inner Loop:
  - Role: Processes the `sub_window` defined by the Outer Loop.
  - Action:
    - Optimization: Runs `minimize()` to find parameters that make the error inside this 60-day chunk as small as possible.
    - Filtering: Starts $v$ at the long-run mean and iterates through the 60 days to "warm up" the state.
  - Goal: To arrive at the most accurate possible estimate of $v$ at the very end of the window (today), so we can forecast tomorrow.
By nesting the Inner Loop inside the Outer Loop, the model effectively "re-learns" the market dynamics every single day.
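The nesting can be sketched in a few lines of plain Python. This is an illustration of the loop structure, not the repository's implementation: a coarse grid search stands in for `minimize()`, the Samuelson damping parameter $\lambda$ is left out for brevity, the time step is fixed at one day, and the names `filter_window` and `walk_forward` are hypothetical.

```python
import math
from itertools import product


def filter_window(rv, theta, kappa, eta):
    """Inner loop: warm up v from the long-run mean over one window.

    Returns (mean QLIKE over the window, variance forecast for the next day).
    """
    decay = math.exp(-kappa)      # one-day decay factor (dt = 1)
    w = (1.0 - decay) / kappa     # weight of v in the one-day expectation
    v, loss = theta, 0.0
    for x in rv:
        h = theta * (1.0 - w) + v * w                 # expectation for the day
        loss += math.log(h) + x / h                   # QLIKE contribution
        drift = theta * (1.0 - decay) + v * decay     # mean reversion
        v = max(drift + eta * (x - h), 1e-12)         # shock, floored (assumption)
    return loss / len(rv), theta * (1.0 - w) + v * w


def walk_forward(rv, window=60):
    """Outer loop: for each day i, refit on rv[i-window:i] and forecast day i."""
    forecasts = {}
    for i in range(window, len(rv)):
        sub_window = rv[i - window:i]                 # strictly past data
        mean_rv = sum(sub_window) / window
        grid = product(
            [0.5 * mean_rv, mean_rv, 2.0 * mean_rv],  # theta candidates
            [0.05, 0.2, 1.0],                         # kappa candidates
            [0.0, 0.1, 0.3],                          # eta candidates
        )
        best = min(grid, key=lambda p: filter_window(sub_window, *p)[0])
        forecasts[i] = filter_window(sub_window, *best)[1]
    return forecasts
```

Because each forecast uses only `rv[i-window:i]`, no information from day `i` or later leaks into the fit, which is the property the walk-forward design is meant to guarantee.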