1 change: 1 addition & 0 deletions .gitignore
@@ -6,3 +6,4 @@
ingestion_res/*
scoring_res/*
dev_phase/*
*.pth
67 changes: 50 additions & 17 deletions competition.yaml
@@ -1,8 +1,34 @@
version: 2
title: Templat competition - Dummy classification
description: Dummy classification task
title: "Directional Forecasting of the S&P 500 Index"
# Docker image used by Codabench to run ingestion and scoring.
# Build and push with:
# docker build -t nicolasnoya2001/sp500-challenge:v2 -f tools/Dockerfile .
# docker push nicolasnoya2001/sp500-challenge:v2
docker_image: nicolasnoya2001/sp500-challenge:v2
description: >
Can you predict whether the S&P 500 will close UP or DOWN tomorrow?

Each trading day, participants receive a historical feature vector built from
past daily OHLCV data (Open, High, Low, Close, Volume) of the S&P 500 index.

The target label is binary: **1** if the next trading day's close is strictly
above the current day's close, **0** otherwise. Participants are encouraged to
engineer their own historical context features (e.g., rolling volatility, moving averages)
using the provided sequential data.

Participants submit a PyTorch model via a `submission.py` file exposing a `get_model(train_loader)`
function. The ingestion program passes a `DataLoader` yielding `(x, y)` batches where:
- `x` is a `FloatTensor` of shape `(batch, WINDOW_SIZE, n_features)` — a sliding window of historical daily features
- `y` is a `FloatTensor` of shape `(batch,)` — binary labels (1 = up, 0 = down)

`get_model` must return a trained `torch.nn.Module` whose forward pass accepts a tensor of
shape `(batch, WINDOW_SIZE, n_features)` and returns **probabilities in [0, 1]** of shape `(batch,)`.

Submissions are ranked by their **ROC-AUC** score computed from the predicted probabilities.

This is a DataCamp challenge organised at École Polytechnique (INF554 / MAP583).
image: logo.png
registration_auto_approve: False # if True, do not require approval from admin to join the comp
registration_auto_approve: False

terms: pages/terms.md
pages:
@@ -15,23 +41,30 @@ pages:

tasks:
- index: 0
name: Developement Task
description: 'Tune models with training data, test against examples contained in public test data'
name: Development Task
description: >
Next-day close direction forecasting of the S&P 500 using sliding windows of daily OHLCV data.
Models must be PyTorch modules trained via `get_model(train_loader)` and must output
probabilities (not hard 0s and 1s) to be properly scored via ROC-AUC over a public held-out test window.
input_data: dev_phase/input_data/
reference_data: dev_phase/reference_data/
ingestion_program: ingestion_program/
scoring_program: scoring_program/
public_data: dev_phase/input_data/train
starting_kit: template_starting_kit.ipynb

solutions:
- index: 0
tasks:
- 0
- 0
path: solution/


phases:
- name: Development Phase
description: 'Development phase: tune your models.'
description: >
Tune and validate your forecasting model using the provided historical
S&P 500 training data. Your predictions are scored against a public test set
so you can iterate quickly. Unlimited submissions are allowed in this phase.
start: 10-07-2025
end: 03-31-2026
tasks:
@@ -41,20 +74,20 @@ leaderboards:
- title: Results
key: main
columns:
- title: Test Accuracy
- title: ROC-AUC (public test)
key: test
index: 0
sorting: asc
- title: Private Test Accuracy
sorting: desc # higher is better
- title: ROC-AUC (private test)
key: private_test
index: 1
sorting: asc
hidden: True
- title: Train time
sorting: desc
hidden: True # revealed only after the phase ends
- title: Train Time (s)
key: train_time
index: 2
sorting: desc
- title: Test time
sorting: asc # lower is better
- title: Predict Time (s)
key: test_time
index: 3
sorting: desc
sorting: asc
163 changes: 136 additions & 27 deletions ingestion_program/ingestion.py
@@ -3,53 +3,156 @@
import time
from pathlib import Path

import numpy as np
import pandas as pd
import torch

# Number of past trading days fed as a sequence to the model.
# Must be consistent between training and inference.
WINDOW_SIZE = 50

EVAL_SETS = ["test", "private_test"]


def evaluate_model(model, X_test):

y_pred = model.predict(X_test)
return pd.DataFrame(y_pred)
class SP500Dataset(torch.utils.data.Dataset):
"""PyTorch Dataset for the S&P 500 direction-forecasting challenge.

Each sample is a sliding window of shape (WINDOW_SIZE, n_features)
ending at day `idx`. The target is the binary label of that last day
(1 = close > prev_close, 0 otherwise).

For the first WINDOW_SIZE-1 days, the window is left-padded with zeros.

Parameters
----------
features_path : Path
Path to the features CSV (columns = feature names, rows = trading days
in chronological order).
labels_path : Path or None
Path to the labels CSV (single column, same row order as features).
Pass None for test sets where labels are withheld.
window_size : int
Number of past days (inclusive of the current day) in each sequence.
"""

def __init__(
self, features_path, labels_path=None, window_size=WINDOW_SIZE
):
self.window_size = window_size
# index_col=0: the first column is the row index saved by setup_data.py,
# not a feature — must be excluded from the data arrays.
self.X = pd.read_csv(features_path, index_col=0).values.astype(
np.float32
)
self.n_features = self.X.shape[1]
if labels_path is not None:
self.y = (
pd.read_csv(labels_path, index_col=0)
.values.astype(np.float32)
.ravel()
)
else:
self.y = None # test mode — labels are unknown

def __len__(self):
return len(self.X)

def __getitem__(self, idx):
"""Return (window, label) where window has shape (window_size, n_features).

The label is the binary target for day `idx` (the last day of the window).
During test mode (no labels), only the window tensor is returned.
"""
window_start = max(0, idx - self.window_size + 1)
window = self.X[window_start : idx + 1] # (<=window_size, n_features)

# Left-pad with zeros if we are at the beginning of the series
if len(window) < self.window_size:
padding = np.zeros(
(self.window_size - len(window), self.n_features),
dtype=np.float32,
)
window = np.concatenate([padding, window], axis=0)

x = torch.tensor(
window, dtype=torch.float32
) # (window_size, n_features)

if self.y is not None:
y = torch.tensor(self.y[idx], dtype=torch.float32) # scalar
return x, y
return x # test mode


def get_train_dataset(data_dir):
"""Build the training Dataset from separate features and labels CSVs."""
data_dir = Path(data_dir)
features_path = data_dir / "train" / "train_features.csv"
labels_path = data_dir / "train" / "train_labels.csv"
return SP500Dataset(features_path, labels_path)


def get_train_data(data_dir):
def get_test_dataset(data_dir, eval_set):
"""Build a test Dataset (no labels) for a given evaluation split."""
data_dir = Path(data_dir)
training_dir = data_dir / "train"
X_train = pd.read_csv(training_dir / "train_features.csv")
y_train = pd.read_csv(training_dir / "train_labels.csv")
return X_train, y_train
features_path = data_dir / eval_set / f"{eval_set}_features.csv"
return SP500Dataset(features_path, labels_path=None)


def main(data_dir, output_dir):
# Here, you can import info from the submission module, to evaluate the
# submission
from submission import get_model
def evaluate_model(model, test_dataset):
"""Run inference over a test Dataset and return a DataFrame of probabilities.

X_train, y_train = get_train_data(data_dir)
The model outputs probabilities in [0, 1] (sigmoid already applied).
The scoring program is responsible for applying the decision threshold.
"""
device = next(model.parameters()).device
loader = torch.utils.data.DataLoader(
test_dataset, batch_size=64, shuffle=False
)
probs = []
model.eval()
with torch.no_grad():
for x in loader:
# test_dataset returns bare tensors (no label) — x is already the input
x = x.to(device)
batch_probs = model(x).cpu().numpy().tolist() # floats in [0, 1]
probs.extend(batch_probs)
return pd.DataFrame({"Probability": probs})

print("Training the model")

model = get_model()
def main(data_dir, output_dir):
from submission import (
get_model,
) # imported here so sys.path is set first

data_dir = Path(data_dir)
output_dir = Path(output_dir)

# ── Training ──────────────────────────────────────────────────────────────
train_dataset = get_train_dataset(data_dir)
train_loader = torch.utils.data.DataLoader(
train_dataset, batch_size=32, shuffle=True
)

print("Training the model")
start = time.time()
model.fit(X_train, y_train)
model = get_model(train_loader) # participant trains and returns the model
train_time = time.time() - start
print("-" * 10)

# ── Evaluation ────────────────────────────────────────────────────────────
print("=" * 40)
print("Evaluate the model")
start = time.time()
res = {}
for eval_set in EVAL_SETS:
X_test = pd.read_csv(data_dir / eval_set / f"{eval_set}_features.csv")
res[eval_set] = evaluate_model(model, X_test)
test_dataset = get_test_dataset(data_dir, eval_set)
res[eval_set] = evaluate_model(model, test_dataset)
test_time = time.time() - start
print("-" * 10)
duration = train_time + test_time
print(f"Completed Prediction. Total duration: {duration}")
print(
f"Completed Prediction. Total duration: {train_time + test_time:.1f}s"
)

# Write output files
# ── Write outputs ─────────────────────────────────────────────────────────
output_dir.mkdir(parents=True, exist_ok=True)
with open(output_dir / "metadata.json", "w+") as f:
json.dump(dict(train_time=train_time, test_time=test_time), f)
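For reference, the `metadata.json` written here has this shape (a sketch only — the timing values below are made up for illustration):

```json
{"train_time": 12.3, "test_time": 0.8}
```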
@@ -70,19 +173,25 @@ def main(data_dir, output_dir):
"--data-dir",
type=str,
default="/app/input_data",
help="",
help="Root folder containing train/, test/, and private_test/ splits. "
"Codabench mounts data at /app/input_data. "
"For local testing pass: --data-dir dev_phase/input_data",
)
parser.add_argument(
"--output-dir",
type=str,
default="/app/output",
help="",
help="Folder where prediction CSVs and metadata.json will be written. "
"Codabench expects output at /app/output. "
"For local testing pass: --output-dir ingestion_res",
)
parser.add_argument(
"--submission-dir",
type=str,
default="/app/ingested_program",
help="",
help="Directory containing submission.py. "
"Codabench mounts participant code at /app/ingested_program. "
"For local testing pass: --submission-dir solution",
)

args = parser.parse_args()
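The left-padding behaviour of `SP500Dataset.__getitem__` can be reproduced in isolation. Below is a minimal numpy-only sketch (not the ingestion code itself; `make_window` is a hypothetical helper and the toy window size of 5 stands in for the challenge's 50):

```python
import numpy as np

WINDOW_SIZE = 5  # toy value for illustration; the challenge uses 50


def make_window(X, idx, window_size=WINDOW_SIZE):
    """Return the sliding window ending at row `idx`, left-padded with zeros."""
    window_start = max(0, idx - window_size + 1)
    window = X[window_start: idx + 1]
    if len(window) < window_size:
        padding = np.zeros(
            (window_size - len(window), X.shape[1]), dtype=X.dtype
        )
        window = np.concatenate([padding, window], axis=0)
    return window


X = np.arange(20, dtype=np.float32).reshape(10, 2)  # 10 days, 2 features

early = make_window(X, idx=1)  # only 2 real rows -> 3 rows of zero padding
late = make_window(X, idx=9)   # full window, no padding needed

assert early.shape == late.shape == (WINDOW_SIZE, 2)
assert np.all(early[:3] == 0) and np.all(early[3:] == X[:2])
assert np.all(late == X[5:10])
```

Every sample thus has a fixed shape `(window_size, n_features)`, which is what lets the `DataLoader` stack windows into a single batch tensor.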
3 changes: 2 additions & 1 deletion ingestion_program/metadata.yaml
@@ -1 +1,2 @@
command: python3 ingestion.py
command: python3 ingestion.py
image: nicolasnoya2001/sp500-challenge:v2
Binary file modified logo.png
4 changes: 4 additions & 0 deletions pages/data.md
@@ -0,0 +1,4 @@
You can download the data for this challenge from here:

- Training Features: https://nicolas-public-images.s3.us-east-1.amazonaws.com/train/train_features.csv
- True Labels: https://nicolas-public-images.s3.us-east-1.amazonaws.com/train/train_labels.csv
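If you want to see how the targets relate to the closes, the labelling rule from the challenge description ("1 if the next trading day's close is strictly above the current day's close") can be sketched as below. This uses a toy `Close` series, not the real download, and whether this alignment matches `train_labels.csv` row-for-row is an assumption — the official labels in that file are authoritative:

```python
import pandas as pd

# Toy close prices standing in for the "Close" column of train_features.csv.
close = pd.Series([100.0, 101.5, 101.5, 99.0, 102.0])

# 1 if the NEXT close is strictly above the current close, else 0.
# The final row has no next day, so its label is undefined; drop it here.
labels = (close.shift(-1) > close).astype(int)[:-1]

assert labels.tolist() == [1, 0, 0, 1]  # note: an unchanged close counts as 0
```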
65 changes: 58 additions & 7 deletions pages/participate.md
@@ -1,10 +1,61 @@
# How to participate
# How to Participate

You should submit an untrained model in a python file `model.py` which contains
your `class Model`, which will be imported, trained, and tested on Codalab.
## Objective

See the "Seed" page for the outline of a `Model` class, with the expected
function names.
Build a model that predicts whether the S&P 500 index will **close strictly above** the current day's close on the **next trading day**,
using only the provided historical OHLCV features.

See the "Timeline" page for additional information about the phases of this
competition
## Input Features

Each sample in the dataset is a row in a CSV with the following columns (all values are for the **current trading day** or computed from past days only):

| Column | Description |
|--------|-------------|
| `Open` | Opening price of the trading day |
| `High` | Intraday high |
| `Low` | Intraday low |
| `Close` | Closing price of the trading day |
| `Volume` | Total trading volume |

The ingestion program constructs **sliding windows** of the last **50 trading days** for each sample and feeds them to your model as tensors of shape `(batch, 50, n_features)`.

## Target Label

- **1** — today's close will be **strictly above** the previous close
- **0** — today's close will be **at or below** the previous close

## What to Submit

Submit a single file named **`submission.py`** containing a function:

```python
def get_model(train_loader):
...
return model
```

`train_loader` is a `torch.utils.data.DataLoader` yielding `(x, y)` batches where:
- `x` has shape `(batch, 50, n_features)` — a sliding window of the last 50 daily feature vectors
- `y` has shape `(batch,)` — binary labels `{0, 1}`

Your `get_model` function must **train the model** using the provided loader and return a trained `torch.nn.Module` whose `forward(x)` outputs **probabilities in [0, 1]** of shape `(batch,)` — i.e. sigmoid must already be applied inside `forward`.
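A minimal `submission.py` satisfying this contract might look like the sketch below. It is illustrative only — the mean-pool-plus-linear architecture, epoch count, and learning rate are arbitrary choices, not a reference solution:

```python
import torch
import torch.nn as nn


class MeanPoolClassifier(nn.Module):
    """Average the window over time, then a linear layer + sigmoid."""

    def __init__(self, n_features):
        super().__init__()
        self.linear = nn.Linear(n_features, 1)

    def forward(self, x):              # x: (batch, 50, n_features)
        pooled = x.mean(dim=1)         # (batch, n_features)
        logits = self.linear(pooled)   # (batch, 1)
        return torch.sigmoid(logits).squeeze(-1)  # probabilities, (batch,)


def get_model(train_loader):
    # Infer n_features from the first batch rather than hard-coding it.
    x0, _ = next(iter(train_loader))
    model = MeanPoolClassifier(n_features=x0.shape[-1])
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.BCELoss()  # expects probabilities, matching forward()

    model.train()
    for _ in range(3):  # a few epochs; tune as needed
        for x, y in train_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()

    model.eval()
    return model
```

Because `forward` already applies the sigmoid, the returned values can be fed directly into the ROC-AUC scorer with no further post-processing.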

See the **Seed** page for a working skeleton to get started.

## Evaluation Metric

Submissions are ranked by **ROC-AUC score** on the held-out test set.
A perfect model scores 1.0; random guessing scores ~0.5.
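You can reproduce the score locally with scikit-learn (a sketch with made-up `y_true`/`y_prob` values). Note that ROC-AUC ranks your probabilities, so collapsing them to hard 0s and 1s throws away the ranking information the metric rewards:

```python
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]  # predicted probabilities, NOT hard labels

score = roc_auc_score(y_true, y_prob)
assert abs(score - 0.75) < 1e-9  # 3 of 4 positive/negative pairs ranked correctly
```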

## How to Submit

1. Write your `submission.py` with a `get_model(train_loader)` function.
2. Zip it: `zip submission.zip submission.py`
3. Upload the zip on the **My Submissions** page.

## Rules

- Your model may only use information in the provided feature set — no external data sources.
- External Python libraries (e.g. `torch`, `sklearn`, `numpy`) are allowed.
- You may submit as many times as you like during the Development Phase.
- The private test set is only revealed after the phase ends.