aaronrose227/Carbon-Intensity-Forecasting

Carbon Intensity Forecasting

Oxford Engineering, B1 Sustainable Computing Mini Project (3rd Year) Aaron Rose

The brief: forecast the UK grid's hourly carbon intensity (gCO₂e/kWh) for 2025 from historical data (2009 to 2024), using both a non-ML and an ML method, and discuss the trade-offs. The full write-up is in report/Sustainable_Computing_Report.pdf.

Methods

1. Non-ML: Weighted Historical Average

For a forecast timestamp, take all historical points sharing the same (month, week-of-month ±1, day-of-week, hour) and compute a weighted mean, with recent years weighted more heavily to track the UK's decarbonisation trend (445 → 124 gCO₂e/kWh from 2009 to 2024).

$$\hat{y}_t = \frac{\sum_i w_i \cdot CI_i}{\sum_i w_i} \qquad w_i = 10 \cdot 2^{\,y_i - 2024}$$

Here $y_i$ is the calendar year of historical point $i$, so the 2024 weight is 10, 2023 is 5, 2022 is 2.5, down to about 0.0003 ($10 \cdot 2^{-15}$) for 2009. If no exact pattern match exists, fall back to month + hour, then hour only, then the global mean.
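
The matching and fallback rules above can be sketched as follows. This is a minimal illustration rather than the repository's script: the function names, the week-of-month definition (7-day blocks counted from the 1st), and the choice to weight the global fallback mean are all assumptions. It uses only the standard library so it runs anywhere; a real implementation would more likely vectorise this with pandas.

```python
from datetime import datetime

REF_YEAR = 2024  # most recent training year in the brief


def year_weight(year):
    """w_i = 10 * 2^(y_i - 2024): each year further back halves the weight."""
    return 10.0 * 2.0 ** (year - REF_YEAR)


def week_of_month(ts):
    # Assumption: days 1-7 are week 1, days 8-14 are week 2, and so on.
    return (ts.day - 1) // 7 + 1


def weighted_mean(points):
    """Year-weighted mean of a list of (datetime, carbon_intensity) pairs."""
    total_weight = sum(year_weight(ts.year) for ts, _ in points)
    return sum(year_weight(ts.year) * ci for ts, ci in points) / total_weight


def forecast_one(history, target):
    """Forecast one timestamp from a list of (datetime, carbon_intensity) pairs."""

    def exact(ts):
        return (ts.month == target.month
                and abs(week_of_month(ts) - week_of_month(target)) <= 1
                and ts.weekday() == target.weekday()
                and ts.hour == target.hour)

    def month_hour(ts):
        return ts.month == target.month and ts.hour == target.hour

    def hour_only(ts):
        return ts.hour == target.hour

    # Try the most specific pattern first, then progressively coarser ones.
    for rule in (exact, month_hour, hour_only):
        matches = [(ts, ci) for ts, ci in history if rule(ts)]
        if matches:
            return weighted_mean(matches)
    # Last resort: mean over all history (weighted here, which is an assumption).
    return weighted_mean(history)


# Two illustrative (made-up) Tuesday-noon readings in early March.
history = [(datetime(2023, 3, 7, 12), 200.0), (datetime(2024, 3, 5, 12), 100.0)]
print(forecast_one(history, datetime(2025, 3, 4, 12)))  # (5*200 + 10*100) / 15
```

With weights 5 and 10 for 2023 and 2024, the example forecast lands at about 133.3, twice as close to the 2024 reading as to the 2023 one.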

2. ML: XGBoost Gradient Boosting

A regression tree ensemble trained on 22 engineered features. The continuous calendar variables are encoded cyclically so that, for example, 23:00 and 00:00 sit next to each other in feature space:

$$x_{\text{sin}} = \sin\!\left(\frac{2\pi t}{T}\right), \qquad x_{\text{cos}} = \cos\!\left(\frac{2\pi t}{T}\right)$$

applied to hour ($T=24$), day-of-week ($T=7$), month ($T=12$), and day-of-year ($T=365.25$). Six interaction features (hour × month, hour × season, etc.) capture how the daily cycle changes with season and weekday.
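
As a rough sketch of the cyclical encoding, the snippet below maps each calendar variable to a sin/cos pair using the periods listed above. The feature names and the raw value conventions (Monday = 0, month 1 to 12) are assumptions, not taken from the repository, and the six interaction features are left out because their exact form is not given here.

```python
import math
from datetime import datetime

# Periods from the text: hour, day-of-week, month, day-of-year.
PERIODS = {"hour": 24, "dow": 7, "month": 12, "doy": 365.25}


def cyclical(value, period):
    """Map a periodic value onto the unit circle as (sin, cos)."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def calendar_features(ts):
    """Return the eight sin/cos calendar features for one timestamp."""
    raw = {
        "hour": ts.hour,
        "dow": ts.weekday(),             # Monday = 0
        "month": ts.month,               # 1..12
        "doy": ts.timetuple().tm_yday,   # 1..366
    }
    features = {}
    for name, value in raw.items():
        s, c = cyclical(value, PERIODS[name])
        features[f"{name}_sin"] = s
        features[f"{name}_cos"] = c
    return features


def feature_distance(a, b, period):
    """Euclidean distance between two values after encoding."""
    return math.dist(cyclical(a, period), cyclical(b, period))


# 23:00 and 00:00 end up as close as 00:00 and 01:00; noon is far from midnight.
print(feature_distance(23, 0, 24), feature_distance(0, 1, 24), feature_distance(0, 12, 24))
print(calendar_features(datetime(2025, 1, 1, 0)))
```

Both sin and cos are needed: with only one of them, two different hours (for example 03:00 and 09:00 under sin alone) would collapse onto the same feature value.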

Boosted trees regress toward the conditional mean and under-predict variance, so predictions are passed through a variance-calibration step:

$$\hat{y}_{\text{cal}} = \bar{y} + \left(\hat{y} - \bar{y}\right) \cdot \frac{\sigma_{\text{target}}}{\sigma_{\hat{y}}}$$

This rescales each prediction's deviation from the mean so that the spread of the forecast matches the target standard deviation $\sigma_{\text{target}}$, accepting a small MAE penalty for a much more realistic distribution.
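
A minimal sketch of the calibration formula, under two assumptions that the text does not pin down: $\bar{y}$ is taken as the mean of the raw predictions unless a centre is passed in, and $\sigma_{\text{target}}$ is supplied by the caller (for instance, estimated from recent training data). The function name is hypothetical.

```python
from statistics import fmean, pstdev


def calibrate_variance(preds, target_std, centre=None):
    """Stretch predictions about a centre so their spread equals target_std.

    Implements y_cal = centre + (y_hat - centre) * target_std / std(y_hat).
    """
    if centre is None:
        centre = fmean(preds)  # assumption: centre on the predictions' own mean
    pred_std = pstdev(preds)
    if pred_std == 0:
        # A constant forecast has no spread to rescale; return it unchanged.
        return list(preds)
    scale = target_std / pred_std
    return [centre + (p - centre) * scale for p in preds]


# Illustrative (made-up) flat forecast stretched to a wider target spread.
raw = [100.0, 110.0, 120.0, 130.0]
calibrated = calibrate_variance(raw, target_std=40.0)
print(calibrated, pstdev(calibrated))
```

When the centre is the predictions' own mean, the calibrated series keeps that mean and has a standard deviation of exactly `target_std`. The ordering of the predictions is unchanged, so only the amplitude of the highs and lows is affected.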

Results

Non-ML, MAE = 48.83 gCO₂e/kWh

Non-ML forecast vs actual

ML (XGBoost), MAE = 58.58 gCO₂e/kWh

XGBoost forecast vs actual

Top: daily-averaged hourly forecast across 2025. Bottom: monthly forecast.

Forecasting Challenges

  • Carbon intensity is strongly non-stationary. The grid has changed a lot as the UK has moved off coal and decarbonised, so 2009 data says little about 2025.
  • Both methods end up over-predicting because of this. The grid has decarbonised faster than the history suggests it should.
  • The XGBoost forecast is somewhat too flat. Tree models tend to pull toward the average, so the forecast misses the highs and lows.
  • The weighted average sometimes can't find an exact match for a given hour, so it has to fall back to a coarser pattern.
  • A lot of the hour-to-hour swings come from things like wind ramping up or a gas plant coming online, and you can't really predict that from history alone. You'd need live weather and grid data.
  • The simple weighted average beats XGBoost on MAE (48.83 vs 58.58 gCO₂e/kWh). Most of the signal is in month, day-of-week, and hour, which the simple method captures directly.
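
Several of these points can be checked numerically. The sketch below is one possible set of diagnostics, not the repository's evaluation code: MAE for overall accuracy, mean signed error for systematic over-prediction, and the ratio of forecast to actual standard deviation for flatness. The function name and the sample numbers are made up for illustration.

```python
from statistics import fmean, pstdev


def forecast_diagnostics(actual, forecast):
    """Summarise forecast quality for two equal-length sequences."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and equal length")
    errors = [f - a for a, f in zip(actual, forecast)]
    return {
        "mae": fmean(abs(e) for e in errors),  # mean absolute error
        "bias": fmean(errors),                 # > 0 means over-prediction on average
        # < 1 means the forecast is flatter than reality (needs varying actuals)
        "std_ratio": pstdev(forecast) / pstdev(actual),
    }


# Made-up values: the forecast sits too high on average and varies too little.
actual = [100.0, 150.0, 200.0]
forecast = [150.0, 160.0, 170.0]
print(forecast_diagnostics(actual, forecast))
```

A positive `bias` corresponds to the over-prediction described above, and a `std_ratio` well below 1 corresponds to a forecast that misses the highs and lows.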

Layout

Carbon-Intensity-Forecasting/
├── data/                   # 2009 to 2024 train, 2025 test (half-hourly CSV)
├── non_ml_forecast/        # weighted-average scripts
├── ml_forecast/            # XGBoost + variance calibration
├── plots/                  # the two figures shown above
└── report/                 # full PDF write-up

Running It

pip install numpy pandas matplotlib xgboost scikit-learn

python non_ml_forecast/non_ml_forecast.py
python ml_forecast/xgb_forecast.py
