1 change: 1 addition & 0 deletions .gitignore
@@ -6,3 +6,4 @@
ingestion_res/*
scoring_res/*
dev_phase/*
*.pth
67 changes: 50 additions & 17 deletions competition.yaml
@@ -1,8 +1,34 @@
version: 2
title: Templat competition - Dummy classification
description: Dummy classification task
title: "Directional Forecasting of the S&P 500 Index"
# Docker image used by Codabench to run ingestion and scoring.
# Build and push with:
# docker build -t nicolasnoya2001/sp500-challenge:v2 -f tools/Dockerfile .
# docker push nicolasnoya2001/sp500-challenge:v2
docker_image: nicolasnoya2001/sp500-challenge:v2
description: >
Can you predict whether the S&P 500 will close UP or DOWN tomorrow?

Each trading day, participants receive a historical feature vector built from
past daily OHLCV data (Open, High, Low, Close, Volume) of the S&P 500 index.

The target label is binary: **1** if the next trading day's close is strictly
above the current day's close, **0** otherwise. Participants are encouraged to
engineer their own historical context features (e.g., rolling volatility, moving averages)
using the provided sequential data.

Participants submit a PyTorch model via a `submission.py` file exposing a `get_model(train_loader)`
function. The ingestion program passes a `DataLoader` yielding `(x, y)` batches where:
- `x` is a `FloatTensor` of shape `(batch, WINDOW_SIZE, n_features)` — a sliding window of historical daily features
- `y` is a `FloatTensor` of shape `(batch,)` — binary labels (1 = up, 0 = down)

`get_model` must return a trained `torch.nn.Module` whose forward pass accepts a tensor of
shape `(batch, WINDOW_SIZE, n_features)` and returns **probabilities in [0, 1]** of shape `(batch,)`.

Submissions are ranked by their **ROC-AUC** score computed from the predicted probabilities.

This is a DataCamp challenge organised at École Polytechnique (INF554 / MAP583).
image: logo.png
registration_auto_approve: False # if True, do not require approval from admin to join the comp
registration_auto_approve: False

terms: pages/terms.md
pages:
@@ -15,23 +41,30 @@ pages:

tasks:
- index: 0
name: Developement Task
description: 'Tune models with training data, test against examples contained in public test data'
name: Development Task
description: >
Next-day close direction forecasting of the S&P 500 using sliding windows of daily OHLCV data.
Models must be PyTorch modules trained via `get_model(train_loader)` and must output
probabilities (not hard 0s and 1s) to be properly scored via ROC-AUC over a public held-out test window.
input_data: dev_phase/input_data/
reference_data: dev_phase/reference_data/
ingestion_program: ingestion_program/
scoring_program: scoring_program/
public_data: dev_phase/input_data/train
starting_kit: template_starting_kit.ipynb

solutions:
- index: 0
tasks:
- 0
- 0
path: solution/


phases:
- name: Development Phase
description: 'Development phase: tune your models.'
description: >
Tune and validate your forecasting model using the provided historical
S&P 500 training data. Your predictions are scored against a public test set
so you can iterate quickly. Unlimited submissions are allowed in this phase.
start: 10-07-2025
end: 03-31-2026
tasks:
@@ -41,20 +74,20 @@ leaderboards:
- title: Results
key: main
columns:
- title: Test Accuracy
- title: ROC-AUC (public test)
key: test
index: 0
sorting: asc
- title: Private Test Accuracy
sorting: desc # higher is better
- title: ROC-AUC (private test)
key: private_test
index: 1
sorting: asc
hidden: True
- title: Train time
sorting: desc
hidden: True # revealed only after the phase ends
- title: Train Time (s)
key: train_time
index: 2
sorting: desc
- title: Test time
sorting: asc # lower is better
- title: Predict Time (s)
key: test_time
index: 3
sorting: desc
sorting: asc
163 changes: 136 additions & 27 deletions ingestion_program/ingestion.py
@@ -3,53 +3,156 @@
import time
from pathlib import Path

import numpy as np
import pandas as pd
import torch

# Number of past trading days fed as a sequence to the model.
# Must be consistent between training and inference.
WINDOW_SIZE = 50

EVAL_SETS = ["test", "private_test"]


def evaluate_model(model, X_test):

y_pred = model.predict(X_test)
return pd.DataFrame(y_pred)
class SP500Dataset(torch.utils.data.Dataset):
"""PyTorch Dataset for the S&P 500 direction-forecasting challenge.

Each sample is a sliding window of shape (WINDOW_SIZE, n_features)
ending at day `idx`. The target is the binary label of that last day
(1 = close > prev_close, 0 otherwise).

For the first WINDOW_SIZE-1 days, the window is left-padded with zeros.

Parameters
----------
features_path : Path
Path to the features CSV (columns = feature names, rows = trading days
in chronological order).
labels_path : Path or None
Path to the labels CSV (single column, same row order as features).
Pass None for test sets where labels are withheld.
window_size : int
Number of past days (inclusive of the current day) in each sequence.
"""

def __init__(
self, features_path, labels_path=None, window_size=WINDOW_SIZE
):
self.window_size = window_size
# index_col=0: the first column is the row index saved by setup_data.py,
# not a feature — must be excluded from the data arrays.
self.X = pd.read_csv(features_path, index_col=0).values.astype(
np.float32
)
self.n_features = self.X.shape[1]
if labels_path is not None:
self.y = (
pd.read_csv(labels_path, index_col=0)
.values.astype(np.float32)
.ravel()
)
else:
self.y = None # test mode — labels are unknown

def __len__(self):
return len(self.X)

def __getitem__(self, idx):
"""Return (window, label) where window has shape (window_size, n_features).

The label is the binary target for day `idx` (the last day of the window).
During test mode (no labels), only the window tensor is returned.
"""
window_start = max(0, idx - self.window_size + 1)
window = self.X[window_start : idx + 1] # (<=window_size, n_features)

# Left-pad with zeros if we are at the beginning of the series
if len(window) < self.window_size:
padding = np.zeros(
(self.window_size - len(window), self.n_features),
dtype=np.float32,
)
window = np.concatenate([padding, window], axis=0)

x = torch.tensor(
window, dtype=torch.float32
) # (window_size, n_features)

if self.y is not None:
y = torch.tensor(self.y[idx], dtype=torch.float32) # scalar
return x, y
return x # test mode


def get_train_dataset(data_dir):
"""Build the training Dataset from separate features and labels CSVs."""
data_dir = Path(data_dir)
features_path = data_dir / "train" / "train_features.csv"
labels_path = data_dir / "train" / "train_labels.csv"
return SP500Dataset(features_path, labels_path)


def get_train_data(data_dir):
def get_test_dataset(data_dir, eval_set):
"""Build a test Dataset (no labels) for a given evaluation split."""
data_dir = Path(data_dir)
training_dir = data_dir / "train"
X_train = pd.read_csv(training_dir / "train_features.csv")
y_train = pd.read_csv(training_dir / "train_labels.csv")
return X_train, y_train
features_path = data_dir / eval_set / f"{eval_set}_features.csv"
return SP500Dataset(features_path, labels_path=None)


def main(data_dir, output_dir):
# Here, you can import info from the submission module, to evaluate the
# submission
from submission import get_model
def evaluate_model(model, test_dataset):
"""Run inference over a test Dataset and return a DataFrame of probabilities.

X_train, y_train = get_train_data(data_dir)
The model outputs probabilities in [0, 1] (sigmoid already applied).
The scoring program is responsible for applying the decision threshold.
"""
device = next(model.parameters()).device
loader = torch.utils.data.DataLoader(
test_dataset, batch_size=64, shuffle=False
)
probs = []
model.eval()
with torch.no_grad():
for x in loader:
# test_dataset returns bare tensors (no label) — x is already the input
x = x.to(device)
batch_probs = model(x).cpu().numpy().tolist() # floats in [0, 1]
probs.extend(batch_probs)
return pd.DataFrame({"Probability": probs})

print("Training the model")

model = get_model()
def main(data_dir, output_dir):
from submission import (
get_model,
) # imported here so sys.path is set first

data_dir = Path(data_dir)
output_dir = Path(output_dir)

# ── Training ──────────────────────────────────────────────────────────────
train_dataset = get_train_dataset(data_dir)
train_loader = torch.utils.data.DataLoader(
train_dataset, batch_size=32, shuffle=True
)

print("Training the model")
start = time.time()
model.fit(X_train, y_train)
model = get_model(train_loader) # participant trains and returns the model
train_time = time.time() - start
print("-" * 10)

# ── Evaluation ────────────────────────────────────────────────────────────
print("=" * 40)
print("Evaluate the model")
start = time.time()
res = {}
for eval_set in EVAL_SETS:
X_test = pd.read_csv(data_dir / eval_set / f"{eval_set}_features.csv")
res[eval_set] = evaluate_model(model, X_test)
test_dataset = get_test_dataset(data_dir, eval_set)
res[eval_set] = evaluate_model(model, test_dataset)
test_time = time.time() - start
print("-" * 10)
duration = train_time + test_time
print(f"Completed Prediction. Total duration: {duration}")
print(
f"Completed Prediction. Total duration: {train_time + test_time:.1f}s"
)

# Write output files
# ── Write outputs ─────────────────────────────────────────────────────────
output_dir.mkdir(parents=True, exist_ok=True)
with open(output_dir / "metadata.json", "w+") as f:
json.dump(dict(train_time=train_time, test_time=test_time), f)
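For reference, the `metadata.json` written here has this shape (a sketch only — the timing values below are made up for illustration):

```json
{"train_time": 12.3, "test_time": 0.8}
```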
@@ -70,19 +173,25 @@ def main(data_dir, output_dir):
"--data-dir",
type=str,
default="/app/input_data",
help="",
help="Root folder containing train/, test/, and private_test/ splits. "
"Codabench mounts data at /app/input_data. "
"For local testing pass: --data-dir dev_phase/input_data",
)
parser.add_argument(
"--output-dir",
type=str,
default="/app/output",
help="",
help="Folder where prediction CSVs and metadata.json will be written. "
"Codabench expects output at /app/output. "
"For local testing pass: --output-dir ingestion_res",
)
parser.add_argument(
"--submission-dir",
type=str,
default="/app/ingested_program",
help="",
help="Directory containing submission.py. "
"Codabench mounts participant code at /app/ingested_program. "
"For local testing pass: --submission-dir solution",
)

args = parser.parse_args()
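The left-padding behaviour of `SP500Dataset.__getitem__` can be reproduced in isolation. Below is a minimal numpy-only sketch (not the ingestion code itself; `make_window` is a hypothetical helper and the toy window size of 5 stands in for the challenge's 50):

```python
import numpy as np

WINDOW_SIZE = 5  # toy value for illustration; the challenge uses 50


def make_window(X, idx, window_size=WINDOW_SIZE):
    """Return the sliding window ending at row `idx`, left-padded with zeros."""
    window_start = max(0, idx - window_size + 1)
    window = X[window_start: idx + 1]
    if len(window) < window_size:
        padding = np.zeros(
            (window_size - len(window), X.shape[1]), dtype=X.dtype
        )
        window = np.concatenate([padding, window], axis=0)
    return window


X = np.arange(20, dtype=np.float32).reshape(10, 2)  # 10 days, 2 features

early = make_window(X, idx=1)  # only 2 real rows -> 3 rows of zero padding
late = make_window(X, idx=9)   # full window, no padding needed

assert early.shape == late.shape == (WINDOW_SIZE, 2)
assert np.all(early[:3] == 0) and np.all(early[3:] == X[:2])
assert np.all(late == X[5:10])
```

Every sample thus has a fixed shape `(window_size, n_features)`, which is what lets the `DataLoader` stack windows into a single batch tensor.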
3 changes: 2 additions & 1 deletion ingestion_program/metadata.yaml
@@ -1 +1,2 @@
command: python3 ingestion.py
command: python3 ingestion.py
image: nicolasnoya2001/sp500-challenge:v2
Binary file modified logo.png
4 changes: 4 additions & 0 deletions pages/data.md
@@ -0,0 +1,4 @@
You can download the data for this challenge from here:

- Training Features: https://nicolas-public-images.s3.us-east-1.amazonaws.com/train/train_features.csv
- True Labels: https://nicolas-public-images.s3.us-east-1.amazonaws.com/train/train_labels.csv
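If you want to see how the targets relate to the closes, the labelling rule from the challenge description ("1 if the next trading day's close is strictly above the current day's close") can be sketched as below. This uses a toy `Close` series, not the real download, and whether this alignment matches `train_labels.csv` row-for-row is an assumption — the official labels in that file are authoritative:

```python
import pandas as pd

# Toy close prices standing in for the "Close" column of train_features.csv.
close = pd.Series([100.0, 101.5, 101.5, 99.0, 102.0])

# 1 if the NEXT close is strictly above the current close, else 0.
# The final row has no next day, so its label is undefined; drop it here.
labels = (close.shift(-1) > close).astype(int)[:-1]

assert labels.tolist() == [1, 0, 0, 1]  # note: an unchanged close counts as 0
```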
65 changes: 58 additions & 7 deletions pages/participate.md
@@ -1,10 +1,61 @@
# How to participate
# How to Participate

You should submit an untrained model in a python file `model.py` which contains
your `class Model`, which will be imported, trained, and tested on Codalab.
## Objective

See the "Seed" page for the outline of a `Model` class, with the expected
function names.
Build a model that predicts whether the S&P 500 index will **close strictly above** the current day's close on the **next trading day**,
using only the provided historical OHLCV features.

See the "Timeline" page for additional information about the phases of this
competition
## Input Features

Each sample in the dataset is a row in a CSV with the following columns (all values are for the **current trading day** or computed from past days only):

| Column | Description |
|--------|-------------|
| `Open` | Opening price of the trading day |
| `High` | Intraday high |
| `Low` | Intraday low |
| `Close` | Closing price of the trading day |
| `Volume` | Total trading volume |

The ingestion program constructs **sliding windows** of the last **50 trading days** for each sample and feeds them to your model as tensors of shape `(batch, 50, n_features)`.

## Target Label

- **1** — today's close will be **strictly above** the previous close
- **0** — today's close will be **at or below** the previous close

## What to Submit

Submit a single file named **`submission.py`** containing a function:

```python
def get_model(train_loader):
...
return model
```

`train_loader` is a `torch.utils.data.DataLoader` yielding `(x, y)` batches where:
- `x` has shape `(batch, 50, n_features)` — a sliding window of the last 50 daily feature vectors
- `y` has shape `(batch,)` — binary labels `{0, 1}`

Your `get_model` function must **train the model** using the provided loader and return a trained `torch.nn.Module` whose `forward(x)` outputs **probabilities in [0, 1]** of shape `(batch,)` — i.e. sigmoid must already be applied inside `forward`.
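A minimal `submission.py` satisfying this contract might look like the sketch below. It is illustrative only — the mean-pool-plus-linear architecture, epoch count, and learning rate are arbitrary choices, not a reference solution:

```python
import torch
import torch.nn as nn


class MeanPoolClassifier(nn.Module):
    """Average the window over time, then a linear layer + sigmoid."""

    def __init__(self, n_features):
        super().__init__()
        self.linear = nn.Linear(n_features, 1)

    def forward(self, x):              # x: (batch, 50, n_features)
        pooled = x.mean(dim=1)         # (batch, n_features)
        logits = self.linear(pooled)   # (batch, 1)
        return torch.sigmoid(logits).squeeze(-1)  # probabilities, (batch,)


def get_model(train_loader):
    # Infer n_features from the first batch rather than hard-coding it.
    x0, _ = next(iter(train_loader))
    model = MeanPoolClassifier(n_features=x0.shape[-1])
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.BCELoss()  # expects probabilities, matching forward()

    model.train()
    for _ in range(3):  # a few epochs; tune as needed
        for x, y in train_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()

    model.eval()
    return model
```

Because `forward` already applies the sigmoid, the returned values can be fed directly into the ROC-AUC scorer with no further post-processing.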

See the **Seed** page for a working skeleton to get started.

## Evaluation Metric

Submissions are ranked by **ROC-AUC score** on the held-out test set.
A perfect model scores 1.0; random guessing scores ~0.5.
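You can reproduce the score locally with scikit-learn (a sketch with made-up `y_true`/`y_prob` values). Note that ROC-AUC ranks your probabilities, so collapsing them to hard 0s and 1s throws away the ranking information the metric rewards:

```python
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]  # predicted probabilities, NOT hard labels

score = roc_auc_score(y_true, y_prob)
assert abs(score - 0.75) < 1e-9  # 3 of 4 positive/negative pairs ranked correctly
```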

## How to Submit

1. Write your `submission.py` with a `get_model(train_loader)` function.
2. Zip it: `zip submission.zip submission.py`
3. Upload the zip on the **My Submissions** page.

## Rules

- Your model may only use information in the provided feature set — no external data sources.
- External Python libraries (e.g. `torch`, `sklearn`, `numpy`) are allowed.
- You may submit as many times as you like during the Development Phase.
- The private test set is only revealed after the phase ends.