This repository provides code examples for an introduction to uncertainty quantification (UQ) in supervised machine learning, with an emphasis on:
- Conformal Prediction/Inference: a distribution-free framework for constructing prediction sets with guaranteed coverage
- Bayesian Methods: parametric approaches for quantifying uncertainty through posterior distributions

Conformal prediction notebooks:

| File | Description |
|---|---|
| Introduction_Conformal_Prediction.ipynb | Introduction to conformal prediction for classification using the Dry Bean dataset (UCI). Covers data splitting, calibration sets, and prediction set construction. |
| conformal_classifier_MNIST.ipynb | Conformal inference for image classification on the Fashion-MNIST dataset using Random Forest and SVM classifiers. |
| conformal_regression_ex.ipynb | Conformal inference for regression problems. Demonstrates aleatoric uncertainty estimation using the MAPIE library, with quantile regression and coverage score evaluation. |

Bayesian notebooks:

| File | Description |
|---|---|
| bayesian_example.ipynb | Bayesian prediction for autoregressive AR(1) processes. Demonstrates posterior inference for time series forecasting with uncertainty bands. |
| bayesian_fnn.ipynb | Bayesian linear regression using Pyro with variational inference (SVI) and automatic guides. |
| bayesina_nn_pyro.ipynb | Bayesian Neural Networks (BNN) using Pyro. Implements a neural network with probabilistic weights for regression with uncertainty estimation. |
| bayesUC_pymc3.ipynb | Bayesian Unobserved Components (UC) model using PyMC3. Example with US Consumer Price Index data for time series decomposition. |

Supporting files:

| File | Description |
|---|---|
| utils.py | Helper functions including data simulation (simAR1, cycle), embedding utilities (embed), Bayesian posterior computations (posterior_betas, Predictive), and visualization tools (plot_scores, plot_1d_data). |
| requirements.txt | Python package dependencies |

Create a virtual environment and install the dependencies:

    python3 -m venv env_uq
    source env_uq/bin/activate
    pip install -r requirements.txt

For the Pyro-based notebooks:

    pip install pyro-ppl torch

For the PyMC3 notebook:

    pip install pymc3 statsmodels pandas-datareader

Conformal prediction is a distribution-free method that wraps any machine learning model to produce prediction sets with valid coverage guarantees. The key idea is to use a held-out calibration set to determine the prediction thresholds.
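
The calibration step can be sketched with NumPy alone. This is a minimal illustration of split conformal classification, not the code used in the notebooks: the function names `conformal_threshold` and `prediction_sets`, the score (one minus the probability of the true class), and the simulated "model" probabilities are all assumptions made for the example.

```python
import numpy as np


def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Score threshold computed from a held-out calibration set (hypothetical helper)."""
    n = len(cal_labels)
    # Nonconformity score: one minus the probability assigned to the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Rank of the finite-sample corrected (1 - alpha) quantile.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration points: fall back to including every class.
        return 1.0
    return np.sort(scores)[k - 1]


def prediction_sets(test_probs, threshold):
    """Boolean mask: entry (i, k) is True when class k is in the set for sample i."""
    return (1.0 - test_probs) <= threshold


# Simulated stand-in for a fitted classifier's predicted probabilities.
rng = np.random.default_rng(0)


def simulate(n, n_classes=3):
    labels = rng.integers(0, n_classes, size=n)
    logits = rng.normal(size=(n, n_classes))
    logits[np.arange(n), labels] += 2.0  # make the true class more likely
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return probs, labels


cal_probs, cal_labels = simulate(500)
test_probs, test_labels = simulate(2000)

threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.1)
sets = prediction_sets(test_probs, threshold)
coverage = sets[np.arange(len(test_labels)), test_labels].mean()
print(f"threshold={threshold:.3f}, empirical coverage={coverage:.3f}")
```

With `alpha=0.1` the empirical coverage on the simulated test data should land close to 90%. The guarantee is marginal (averaged over calibration and test draws) and relies on the calibration and test points being exchangeable.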
Bayesian methods provide uncertainty estimates through posterior distributions over model parameters, enabling:
- Credible intervals for predictions
- Model averaging
- Principled handling of parameter uncertainty
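
As a small illustration of how a posterior yields credible intervals, the sketch below uses conjugate Bayesian linear regression with a Gaussian prior and a known noise variance. It is not the repository's `posterior_betas` implementation: the names `gaussian_posterior` and `predictive_interval`, the prior variance, and the simulated data are assumptions chosen for the example.

```python
import numpy as np


def gaussian_posterior(X, y, noise_var=1.0, prior_var=10.0):
    """Posterior mean and covariance of the weights under a N(0, prior_var * I) prior."""
    d = X.shape[1]
    precision = np.eye(d) / prior_var + X.T @ X / noise_var
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ y / noise_var
    return mean, cov


def predictive_interval(x_new, mean, cov, noise_var=1.0, z=1.96):
    """Approximate 95% predictive interval (z = 1.96) at the rows of x_new."""
    x_new = np.atleast_2d(x_new)
    pred_mean = x_new @ mean
    # Predictive variance = observation noise + parameter uncertainty.
    pred_var = noise_var + np.einsum("ij,jk,ik->i", x_new, cov, x_new)
    half_width = z * np.sqrt(pred_var)
    return pred_mean - half_width, pred_mean + half_width


# Simulated data: y = 1 + 2x + noise, with noise standard deviation 0.5.
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=200)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=200)

post_mean, post_cov = gaussian_posterior(X, y, noise_var=0.25)
lower, upper = predictive_interval([1.0, 1.5], post_mean, post_cov, noise_var=0.25)
print("posterior mean:", post_mean)
print(f"95% predictive interval at x=1.5: [{lower[0]:.2f}, {upper[0]:.2f}]")
```

The predictive variance adds the parameter uncertainty `x_new @ cov @ x_new` to the noise variance, which is how the posterior over parameters carries through to the width of the interval.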

Key libraries:

- MAPIE - Model Agnostic Prediction Interval Estimator for conformal prediction
- Pyro - Probabilistic programming with PyTorch
- PyMC3 - Bayesian modeling and probabilistic machine learning
- scikit-learn - Machine learning models and utilities