🛰️ Semantic Segmentation of Aerial Drone Imagery

This project focuses on semantic segmentation of aerial drone imagery using both CNN-Transformer hybrid models and pre-trained DeepLabV3 backbones.
The pipeline was designed to handle high-resolution drone images, achieving strong segmentation performance across multiple evaluation metrics.

✨ Key Highlights

🧠 Computer Vision Model: Developed a semantic segmentation framework that compares a pre-trained DeepLabV3 baseline with a custom CNN-Transformer hybrid.
📈 Performance: The hybrid model achieved 84% pixel accuracy, 60% mIoU, and a 74% F1-score on a dataset of 400 labeled aerial images.
🔄 Pipeline Engineering: Built a complete PyTorch pipeline for data preprocessing, training, and evaluation.
🛠️ Preprocessing: Implemented image tiling and mask remapping for efficient training on high-resolution drone imagery.
📊 Evaluation Metrics: Used mIoU, pixel accuracy, and F1-score for comprehensive model evaluation.

🖼️ Example Outputs

⚙️ Methodology

1. Data Preprocessing

Split high-resolution aerial drone images into tiles for GPU-efficient training.
Remapped segmentation masks to a consistent set of class labels.
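
The two preprocessing steps above can be sketched in NumPy as follows. This is a minimal illustration, not the code from the project notebooks: the function names `tile_image` and `remap_mask`, the non-overlapping tiling scheme, the zero padding at the image edges, and the `ignore_index` value of 255 are all assumptions made for the example.

```python
import numpy as np


def tile_image(image, tile_size, pad_value=0):
    """Split an H x W (x C) array into non-overlapping square tiles.

    The bottom and right edges are padded with `pad_value` so that the
    padded height and width are multiples of `tile_size` (an assumption;
    overlapping or cropped tiles are equally valid choices).
    """
    h, w = image.shape[:2]
    pad_h = (-h) % tile_size
    pad_w = (-w) % tile_size
    pad = [(0, pad_h), (0, pad_w)] + [(0, 0)] * (image.ndim - 2)
    padded = np.pad(image, pad, mode="constant", constant_values=pad_value)

    tiles = []
    for top in range(0, padded.shape[0], tile_size):
        for left in range(0, padded.shape[1], tile_size):
            tiles.append(padded[top:top + tile_size, left:left + tile_size])
    return np.stack(tiles)


def remap_mask(mask, mapping, ignore_index=255):
    """Map raw label IDs to contiguous training IDs with a lookup table.

    `mask` is expected to be a uint8 array. Raw IDs missing from
    `mapping` are set to `ignore_index` (a hypothetical convention).
    """
    lut = np.full(256, ignore_index, dtype=np.uint8)
    for raw_id, train_id in mapping.items():
        lut[raw_id] = train_id
    return lut[mask]
```

When tiling masks, passing `pad_value=255` (or whatever value the loss function ignores) keeps the padded border from being counted as a real class.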

2. Model Architectures

DeepLabV3: Used pre-trained backbones for semantic segmentation.
Hybrid CNN-Transformer: Designed a custom architecture combining convolutional layers for local feature extraction and Transformer blocks for global context.
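
To make the hybrid idea concrete, here is a small PyTorch sketch of one way to combine a convolutional stem with Transformer blocks. It is not the architecture used in this project: the class name `HybridSegmenter`, the layer sizes, the depth, and the simple 1x1 convolution head are illustrative assumptions, and positional encodings are omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class HybridSegmenter(nn.Module):
    """Toy CNN-Transformer segmenter (hypothetical layout and sizes)."""

    def __init__(self, num_classes, in_channels=3, dim=64, depth=2, heads=4):
        super().__init__()
        # CNN stem: two stride-2 convolutions extract local features and
        # reduce the spatial resolution by roughly 4x.
        self.stem = nn.Sequential(
            nn.Conv2d(in_channels, dim // 2, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(dim // 2),
            nn.ReLU(inplace=True),
            nn.Conv2d(dim // 2, dim, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(dim),
            nn.ReLU(inplace=True),
        )
        # Transformer encoder: self-attention over all spatial positions
        # provides global context.
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=dim * 2, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        # 1x1 convolution turns features into per-class logits.
        self.head = nn.Conv2d(dim, num_classes, kernel_size=1)

    def forward(self, x):
        feats = self.stem(x)                         # (B, dim, h, w)
        b, c, h, w = feats.shape
        tokens = feats.flatten(2).transpose(1, 2)    # (B, h*w, dim)
        tokens = self.encoder(tokens)
        feats = tokens.transpose(1, 2).reshape(b, c, h, w)
        logits = self.head(feats)
        # Upsample logits back to the input resolution.
        return F.interpolate(
            logits, size=x.shape[-2:], mode="bilinear", align_corners=False
        )
```

Because attention cost grows with the square of the number of tokens, a design like this relies on the stem's downsampling (and on the tiling step) to keep `h * w` small.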

3. Evaluation

Metrics: Mean Intersection over Union (mIoU), Pixel Accuracy, F1-score.
Evaluated on a public dataset of 400 labeled aerial drone images.
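
All three metrics can be derived from a single confusion matrix, as in the sketch below. The function names, the `ignore_index` convention, and the choice to macro-average IoU and F1 over classes that appear in either the ground truth or the predictions are assumptions for illustration; the project notebooks may average differently.

```python
import numpy as np


def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Rows are ground-truth classes, columns are predicted classes."""
    valid = target != ignore_index
    idx = target[valid].astype(np.int64) * num_classes + pred[valid].astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def segmentation_metrics(cm):
    """Compute pixel accuracy, mean IoU, and macro F1 from a confusion matrix."""
    tp = np.diag(cm).astype(np.float64)
    fp = cm.sum(axis=0) - tp  # predicted as class c, but labeled otherwise
    fn = cm.sum(axis=1) - tp  # labeled class c, but predicted otherwise
    seen = (tp + fp + fn) > 0  # skip classes absent from both pred and target

    iou = tp / np.maximum(tp + fp + fn, 1)
    f1 = 2 * tp / np.maximum(2 * tp + fp + fn, 1)
    return {
        "pixel_accuracy": float(tp.sum() / max(cm.sum(), 1)),
        "miou": float(iou[seen].mean()),
        "f1": float(f1[seen].mean()),
    }
```

Accumulating one confusion matrix over the whole evaluation set and computing the metrics once at the end avoids the bias that comes from averaging per-image scores.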

📊 Results

| Model                    | Pixel Accuracy | mIoU | F1 Score |
|--------------------------|----------------|------|----------|
| DeepLabV3 (pre-trained)  | 81%            | 55%  | 70%      |
| CNN-Transformer (Hybrid) | 84%            | 60%  | 74%      |

✅ The hybrid CNN-Transformer outperformed the DeepLabV3 baseline on all three metrics.

📦 Tech Stack

Language: Python 3
Frameworks/Libraries: PyTorch, Torchvision, NumPy, OpenCV, Matplotlib
Deep Learning: DeepLabV3, Transformer layers
Tools: Jupyter Notebook, CUDA-enabled GPU

🔮 Future Improvements

✅ Expand dataset with more diverse aerial imagery.
✅ Extend segmentation to additional classes beyond the current label set.
✅ Deploy as a web-based visualization tool with FastAPI/Streamlit.
✅ Experiment with Vision Transformers (ViT) and Swin Transformers.

👤 Authors

Jacob Almon

Svetya Koppisetty

Connor MacDonald
