This project provides a complete, reproducible workflow for automated land cover classification using Sentinel-2 image patches and ESA WorldCover data. The pipeline enables extraction, labeling, and supervised modeling of large geospatial chip collections, supporting quantitative environmental audit and analysis.

Quantamaster/Earth-Observation

🌍 Earth Observation Land Cover Classification Pipeline

AI-Based Geospatial Audit of the Delhi Airshed


📌 Overview

This repository presents a reproducible Earth Observation (EO) and Deep Learning pipeline for automated land-cover classification using Sentinel-2 RGB image patches and ESA WorldCover 2021 data.

The system enables:

  • Automated geospatial data filtering
  • Raster-based ground-truth label generation
  • Supervised CNN training using ResNet-18
  • Quantitative land-use and environmental analysis

The project is framed as an AI-based Geospatial Audit of the Delhi Airshed, suitable for research labs, environmental agencies, and policy analytics.


πŸ›°οΈ Project Context

This work demonstrates how satellite imagery and deep learning can be combined for urban land-use monitoring and environmental auditing.

Data Sources

  • Sentinel-2 (10m resolution RGB imagery)
  • ESA WorldCover 2021 land-cover raster
  • Delhi-NCR administrative boundary (EPSG:4326)

🧠 System Architecture


┌──────────────────────────────┐
│   Delhi-NCR AOI (GeoJSON)    │
└──────────────┬───────────────┘
               ↓
┌──────────────────────────────┐
│   Spatial Grid (60×60 km)    │
└──────────────┬───────────────┘
               ↓
┌──────────────────────────────┐
│ Sentinel-2 RGB Image Chips   │
│  (128×128 @ 10m resolution)  │
└──────────────┬───────────────┘
               ↓
┌──────────────────────────────┐
│ ESA WorldCover Raster (10m)  │
│  → Mode-based Labeling       │
└──────────────┬───────────────┘
               ↓
┌──────────────────────────────┐
│ Clean Labeled Dataset (CSV)  │
└──────────────┬───────────────┘
               ↓
┌──────────────────────────────┐
│ ResNet-18 CNN (PyTorch)      │
└──────────────┬───────────────┘
               ↓
┌──────────────────────────────┐
│ Metrics: Accuracy, F1, CM    │
└──────────────────────────────┘

πŸ“ Directory Structure

/Earth_Observation_Pipeline/
│
├── data/
│   ├── delhi_ncr_region.geojson
│   ├── delhi_ncr_grid.geojson
│   ├── worldcover_bbox_delhi_ncr_2021.tif
│   ├── rgb/                          # Sentinel-2 image patches (128×128)
│   ├── image_coords.csv
│   ├── imgs_within_grid.csv
│   └── labelled_images_clean.csv
│
├── 01_grid_visualization.py
├── 02_label_extract_assignment.py
├── 03_train_test_split.py
├── 04_cnn_train_eval.py
│
├── requirements.txt
└── README.md

📦 Dataset

Kaggle Dataset (Required): https://www.kaggle.com/datasets/rishabhsnip/earth-observation-delhi-airshed

Includes:

  • Sentinel-2 RGB image patches
  • Delhi-NCR shapefiles
  • Image coordinate metadata

πŸ” Pipeline Breakdown

Phase 1: Spatial Reasoning & Filtering

  • Define AOI using Delhi-NCR boundary
  • Generate a uniform 60 × 60 km grid
  • Filter Sentinel-2 images by grid intersection

Script: 01_grid_visualization.py
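
The gridding and filtering steps above can be sketched without any geospatial libraries. This is a minimal illustration, not the contents of 01_grid_visualization.py: the bounding box, the function names `make_grid` and `filter_points`, and the point-in-cell test are all assumptions, and the degree conversion is a spherical approximation. A real run would more likely build the grid in a projected CRS with geopandas and shapely.

```python
import math

KM_PER_DEG_LAT = 111.32  # approximate length of one degree of latitude


def make_grid(min_lon, min_lat, max_lon, max_lat, cell_km=60.0):
    """Tile a lon/lat bounding box with cells of roughly cell_km x cell_km.

    Returns a list of (west, south, east, north) tuples in degrees.
    """
    mid_lat = (min_lat + max_lat) / 2.0
    dlat = cell_km / KM_PER_DEG_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    dlon = cell_km / (KM_PER_DEG_LAT * math.cos(math.radians(mid_lat)))
    n_rows = math.ceil((max_lat - min_lat) / dlat)
    n_cols = math.ceil((max_lon - min_lon) / dlon)
    cells = []
    for r in range(n_rows):
        for c in range(n_cols):
            west = min_lon + c * dlon
            south = min_lat + r * dlat
            cells.append((west, south, west + dlon, south + dlat))
    return cells


def filter_points(points, cells):
    """Keep the (lat, lon) points that fall inside at least one cell."""
    kept = []
    for lat, lon in points:
        if any(w <= lon < e and s <= lat < n for w, s, e, n in cells):
            kept.append((lat, lon))
    return kept


# Hypothetical bounding box around Delhi-NCR, for illustration only.
grid = make_grid(76.8, 27.6, 78.0, 29.2)
images = [(28.61, 77.21), (19.07, 72.88)]  # one point in Delhi, one in Mumbai
inside = filter_points(images, grid)
print(len(grid), inside)
```

Cells are sized from the latitude at the middle of the box, so they are only approximately square in kilometres; for a region the size of Delhi-NCR the distortion is small.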


Phase 2: Label Construction

  • Extract raster patches from ESA WorldCover
  • Assign land-cover class using mode-based sampling
  • Handle missing data and edge effects

Script: 02_label_extract_assignment.py
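
Mode-based labeling can be illustrated on a small in-memory patch. The sketch below is an assumption about how the rule might look, not the code in 02_label_extract_assignment.py: `mode_label` and the `min_valid_fraction` threshold are hypothetical, and any pixel value outside the WorldCover class table is treated as missing data. In the real pipeline the patch would presumably be read from the WorldCover GeoTIFF with rasterio.

```python
from collections import Counter

# ESA WorldCover class codes and names.
WORLDCOVER_CLASSES = {
    10: "Tree cover",
    20: "Shrubland",
    30: "Grassland",
    40: "Cropland",
    50: "Built-up",
    60: "Bare / sparse vegetation",
    70: "Snow and ice",
    80: "Permanent water bodies",
    90: "Herbaceous wetland",
    95: "Mangroves",
    100: "Moss and lichen",
}


def mode_label(patch, min_valid_fraction=0.5):
    """Return the most frequent valid class code in a 2D patch.

    Returns None when the patch is empty or when fewer than
    min_valid_fraction of its pixels carry a known class code
    (for example at raster edges or over nodata regions).
    """
    pixels = [value for row in patch for value in row]
    valid = [value for value in pixels if value in WORLDCOVER_CLASSES]
    if not pixels or len(valid) / len(pixels) < min_valid_fraction:
        return None
    # On ties, Counter keeps the class that was encountered first.
    return Counter(valid).most_common(1)[0][0]


patch = [
    [50, 50, 40],
    [50, 0, 40],
    [10, 50, 40],
]
code = mode_label(patch)
print(code, WORLDCOVER_CLASSES[code])
```

Returning None for mostly-invalid patches lets the cleaning phase drop those rows explicitly instead of training on an arbitrary label.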


Phase 3: Dataset Cleaning & Split

  • Remove invalid labels
  • Perform stratified train-test split
  • Analyze class distribution

Script: 03_train_test_split.py
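
A stratified split keeps each class's share roughly equal in the train and test sets. The following is a dependency-free sketch under assumed names (`stratified_split`, `test_fraction`, `seed`) and an assumed 80/20 ratio; the project lists scikit-learn as a dependency, whose `train_test_split` with the `stratify` argument does the same job.

```python
import random
from collections import Counter, defaultdict


def stratified_split(rows, label_key="label", test_fraction=0.2, seed=42):
    """Split rows into (train, test) while preserving class proportions."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label][:]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Hypothetical rows: 10 built-up chips (50) and 5 cropland chips (40).
rows = [{"filename": f"img_{i}.png", "label": 50} for i in range(10)]
rows += [{"filename": f"img_{i}.png", "label": 40} for i in range(10, 15)]

train, test = stratified_split(rows)
print(Counter(r["label"] for r in train))
print(Counter(r["label"] for r in test))
```

Very small classes can end up with zero test samples under this rounding rule, which is one reason to inspect the class distribution before splitting.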


Phase 4: CNN Training & Evaluation

  • CNN: ResNet-18
  • Input: Sentinel-2 RGB chips
  • Metrics: Accuracy, F1 Score, Confusion Matrix

Script: 04_cnn_train_eval.py


📊 Results & Metrics

(Representative ranges; actual values depend on the training run)

  • Overall Accuracy: ~75–85%
  • Macro F1 Score: ~0.72–0.82
  • Strong Performance: Urban, Vegetation, Water
  • Challenges: Mixed land-cover and boundary regions

Outputs

  • Confusion Matrix
  • Class-wise F1 Scores
  • Correct vs Incorrect Prediction Visualizations
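
To make the reported metrics concrete, here is a small pure-Python sketch of a confusion matrix, class-wise F1, and macro F1. The function names and the toy predictions are illustrative assumptions; the repository lists scikit-learn and torchmetrics as dependencies, and either could compute the same quantities.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_class_f1(matrix):
    """F1 = 2*TP / (2*TP + FP + FN) for each class, 0.0 if undefined."""
    scores = []
    for k in range(len(matrix)):
        tp = matrix[k][k]
        fp = sum(row[k] for row in matrix) - tp
        fn = sum(matrix[k]) - tp
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


# Toy example with three WorldCover codes: tree cover, built-up, water.
labels = [10, 50, 80]
y_true = [50, 50, 50, 10, 10, 80]
y_pred = [50, 50, 10, 10, 50, 80]

cm = confusion_matrix(y_true, y_pred, labels)
f1 = per_class_f1(cm)
macro_f1 = sum(f1) / len(f1)
accuracy = sum(cm[i][i] for i in range(len(labels))) / len(y_true)
print(cm, f1, macro_f1, accuracy)
```

Macro F1 averages the per-class scores with equal weight, so a rare class that is classified poorly lowers it more than it lowers overall accuracy.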

🧠 Model Details

  • Architecture: ResNet-18
  • Framework: PyTorch
  • Loss Function: Cross-Entropy
  • Evaluation Metrics: Accuracy, F1 Score
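
For readers unfamiliar with the loss, the sketch below computes softmax cross-entropy for a single sample from raw class scores (logits). It is a plain-Python illustration with a hypothetical function name, not the training code; in PyTorch the equivalent is `torch.nn.CrossEntropyLoss`, which also takes raw logits and integer class indices and averages over the batch by default.

```python
import math


def cross_entropy(logits, target_index):
    """Softmax cross-entropy for one sample: -log(softmax(logits)[target])."""
    # Subtract the max logit before exponentiating for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target_index]


# Hypothetical logits for three classes; the true class is index 0.
loss_confident = cross_entropy([2.0, 0.5, -1.0], 0)
loss_wrong = cross_entropy([2.0, 0.5, -1.0], 2)
print(loss_confident, loss_wrong)
```

The loss is small when the true class has the highest score and grows as the model puts more probability on other classes.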

βš™οΈ Technical Requirements

Python

  • Python 3.8+

Dependencies

pip install geopandas numpy pandas rasterio shapely scipy \
            matplotlib seaborn scikit-learn \
            torch torchvision torchmetrics \
            geemap

🚀 Usage

1️⃣ Grid Generation

python 01_grid_visualization.py

2️⃣ Label Assignment

python 02_label_extract_assignment.py

3️⃣ Train/Test Split

python 03_train_test_split.py

4️⃣ Model Training

python 04_cnn_train_eval.py

🧾 Data Format

labelled_images_clean.csv

| filename | lat | lon | label | class_str |
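
A quick way to check that a file follows this layout is to load it with the standard `csv` module and verify the header. The column types below (float coordinates, integer `label`, text `class_str`) and the sample rows are assumptions for illustration; the README only documents the column names.

```python
import csv
import io

EXPECTED_COLUMNS = ["filename", "lat", "lon", "label", "class_str"]


def load_labels(handle):
    """Read labelled rows from an open CSV file and validate the header."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [col for col in EXPECTED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for row in reader:
        rows.append({
            "filename": row["filename"],
            "lat": float(row["lat"]),
            "lon": float(row["lon"]),
            "label": int(row["label"]),  # assumed to be a numeric class code
            "class_str": row["class_str"],
        })
    return rows


# Hypothetical contents standing in for labelled_images_clean.csv.
sample = io.StringIO(
    "filename,lat,lon,label,class_str\n"
    "chip_0001.png,28.6139,77.2090,50,Built-up\n"
    "chip_0002.png,28.4595,77.0266,40,Cropland\n"
)
records = load_labels(sample)
print(records[0])
```

With the real file, replace the `io.StringIO` object with `open("data/labelled_images_clean.csv", newline="")`.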


πŸ› οΈ Troubleshooting

  • Ensure all data paths are correct
  • Run scripts in order
  • Verify image presence in rgb/
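
The last check can be automated before training. This is a small sketch with a hypothetical helper, `missing_images`, that reports which filenames from the label table are absent from the image folder; the temporary directory only stands in for data/rgb/.

```python
import tempfile
from pathlib import Path


def missing_images(filenames, image_dir):
    """Return the filenames that do not exist as files under image_dir."""
    image_dir = Path(image_dir)
    return [name for name in filenames if not (image_dir / name).is_file()]


# Demo against a throwaway folder containing a single placeholder image.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "chip_0001.png").touch()
    absent = missing_images(["chip_0001.png", "chip_0002.png"], tmp)
    print(absent)
```

Running a check like this right after the label CSV is produced surfaces path problems before the data loader fails partway through an epoch.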

✅ Best Practices

  • Preserve intermediate outputs
  • Validate class balance before training
  • Track experiment configurations

📚 Citation

Please cite the following if used in research or applications:

  • ESA WorldCover 2021
  • Copernicus Sentinel-2
  • Relevant geospatial data providers

⭐ If this repository helps your work, consider starring it!
