Md Shihab Reza¹ · Dr. Muhammad Hasibur Rashid Chayon²
¹ North South University (NSU), Dhaka, Bangladesh
² American International University-Bangladesh (AIUB), Dhaka, Bangladesh
We present DMSA-CloudNet, an ultra-lightweight encoder-decoder architecture for ground-based cloud segmentation. Despite containing only 1,727 parameters (6.75 KB, smaller than a typical JPEG image), DMSA-CloudNet achieves competitive performance against significantly larger baselines on three public benchmarks, SWIMSeg, SWINSeg, and SWINYSeg, with F1 scores of 0.8088, 0.7930, and 0.8254 respectively. The model combines depthwise-separable convolutions, a Dilated Multi-Scale (DMS) bottleneck with four parallel dilated branches (d=1,2,4,8), and Squeeze-Excitation Skip Attention (SESA) to capture multi-scale cloud structures efficiently. A hybrid Dice-BCE loss further stabilises training on imbalanced cloud/sky masks.
Figure 1. Qualitative comparison of DMSA-CloudNet predictions vs. ground truth across SWIMSeg, SWINSeg, and SWINYSeg. Each row represents one dataset. Columns: Original Image | Ground Truth | DMSA-CloudNet Prediction.
| Dataset | Precision | Recall | F1 Score | Dice Score | Error Rate | Model Size |
|---|---|---|---|---|---|---|
| SWIMSeg | 0.7458 | 0.8835 | 0.8088 | 0.8088 | 0.2226 | 6.75 KB |
| SWINSeg | 0.8184 | 0.7690 | 0.7930 | 0.7930 | 0.1957 | 6.75 KB |
| SWINYSeg | 0.7543 | 0.9111 | 0.8254 | 0.8254 | 0.2045 | 6.75 KB |
All metrics are computed on the held-out 20% test split, never seen during training.
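For binary masks, F1 and Dice are the same quantity, 2·TP / (2·TP + FP + FN), which is why those two columns match in every row. The sketch below shows how the five reported metrics relate, assuming they are derived from pixel-level confusion counts with cloud as the positive class and that Error Rate means the fraction of misclassified pixels. The function name and the counts are illustrative and are not taken from utils/metrics.py.

```python
def segmentation_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Pixel-level metrics from confusion counts (cloud = positive class)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall)
    # Dice is written in terms of overlap, but reduces to the same value
    dice = 2 * tp / (2 * tp + fp + fn)
    # Assumed definition: share of pixels assigned to the wrong class
    error_rate = (fp + fn) / (tp + fp + fn + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "dice": dice,
        "error_rate": error_rate,
    }


# Illustrative counts only, not taken from any of the three datasets
m = segmentation_metrics(tp=80, fp=20, fn=10, tn=90)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

A production version would also need to guard against empty masks, where the denominators above become zero.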
┌──────────────────────────────────┐
Input │ DMSA-CloudNet │
(3×H×W) ─────────► │
│ Encoder (6-ch DS-Conv × 3) │
│ ├─ s1 ──► SESA ──────────┐ │
│ ├─ s2 ──► SESA ───────┐ │ │
│ └─ s3 ──► SESA ────┐ │ │ │
│ │ │ │ │
│ Adapt (6 → 8 ch) │ │ │ │
│ │ │ │ │
│ DMS Bottleneck │ │ │ │
│ ├─ d=1 │ │ │ │
│ ├─ d=2 │ │ │ │
│ ├─ d=4 │ │ │ │
│ └─ d=8 │ │ │ │
│ └──cat──►merge │ │ │ │
│ │ │ │ │
│ Decoder (6-ch DS) │ │ │ │
│ ├─ up3 ◄─────────cat◄─┘ │ │
│ ├─ up2 ◄──────────cat◄───┘ │
│ └─ up1 ◄───────────cat◄────────┘
│ │
│ Output conv → Sigmoid │
└─────────────────┬────────────────┘
│
Mask (1×H×W)
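The DMS bottleneck in the diagram can be sketched in PyTorch as follows. This is a minimal illustration, not the code in models/dmsa_cloudnet.py: the class names, the use of BatchNorm and ReLU, the choice of depthwise-separable convolutions inside each branch, and the per-branch width of 8 channels are all assumptions, so the parameter count of this sketch is not expected to match the reported 1,727.

```python
import torch
import torch.nn as nn


class DSConv(nn.Module):
    """Depthwise-separable 3x3 conv: per-channel 3x3, then 1x1 pointwise."""

    def __init__(self, in_ch: int, out_ch: int, dilation: int = 1):
        super().__init__()
        # For a 3x3 kernel, padding == dilation keeps H and W unchanged
        self.depthwise = nn.Conv2d(
            in_ch, in_ch, kernel_size=3, padding=dilation,
            dilation=dilation, groups=in_ch, bias=False,
        )
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)      # assumption: BN + ReLU
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.bn(self.pointwise(self.depthwise(x))))


class DMSBottleneck(nn.Module):
    """Four parallel dilated branches, concatenated, then merged by a 1x1 conv."""

    def __init__(self, channels: int = 8, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            [DSConv(channels, channels, dilation=d) for d in dilations]
        )
        self.merge = nn.Conv2d(channels * len(dilations), channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Every branch sees the same input at a different receptive field
        features = [branch(x) for branch in self.branches]
        return self.merge(torch.cat(features, dim=1))


x = torch.randn(1, 8, 32, 32)   # arbitrary spatial size for the demo
y = DMSBottleneck()(x)
print(y.shape)                  # torch.Size([1, 8, 32, 32])
```

Because each branch pads by its own dilation rate, all four outputs share the input's spatial size and can be concatenated along the channel axis without cropping.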
| Component | Description | Benefit |
|---|---|---|
| Depthwise-Separable Conv | Factorises 3×3 conv into depthwise + pointwise | ~8× fewer parameters |
| DMS Bottleneck | 4 parallel dilated branches (d=1,2,4,8) | Multi-scale cloud context |
| SESA Attention | SE-style channel recalibration on skip connections | Sharpens boundary details |
| HybridDiceBCE Loss | 0.5 × Dice + 0.5 × BCE | Stable on class-imbalanced masks |
| Narrow channels (encoder E=6, bottleneck B=8) | Ultra-low channel widths throughout | Entire model < 7 KB |
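The parameter savings and the model size can be checked with simple arithmetic. The sketch below ignores biases and normalisation layers and assumes weights are stored as 32-bit floats; the helper names are illustrative. It shows that the saving from factorising a 3×3 convolution depends on the channel width: it approaches 9× for wide layers (about 7.9× at 64 channels) and is closer to 4× at the 6 to 8 channels used here.

```python
def conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def ds_conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights in a depthwise k x k conv followed by a 1x1 pointwise conv."""
    return k * k * c_in + c_in * c_out


for c in (6, 8, 64):
    std, ds = conv_params(c, c), ds_conv_params(c, c)
    print(f"{c:>2} ch: standard={std}, separable={ds}, ratio={std / ds:.1f}x")
#  6 ch: standard=324, separable=90, ratio=3.6x
#  8 ch: standard=576, separable=136, ratio=4.2x
# 64 ch: standard=36864, separable=4672, ratio=7.9x

# Model size, assuming 4 bytes (float32) per parameter
size_kib = 1727 * 4 / 1024
print(f"{size_kib:.2f} KiB")   # 6.75 KiB
```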
| Dataset | Split | Images | Setting | GT Format |
|---|---|---|---|---|
| SWIMSeg | 811 train / 202 test | 1,013 | Daytime | PNG binary |
| SWINSeg | 92 train / 23 test | 115 | Nighttime | JPG binary |
| SWINYSeg | 5,414 train / 1,354 test | 6,768 | Day + Night | PNG binary |
Download and arrange under data/:
data/
├── swimseg/
│ ├── images/ # *.png
│ └── GTmaps/ # *_GT.png
├── swinseg/
│ ├── images/ # *.jpg
│ └── GTmaps/ # *_GT.jpg
└── swinyseg/
├── images/ # *.jpg
└── GTmaps/ # *.png (same stem, no suffix)
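The three datasets name their ground-truth files differently, so a loader has to map each image to its mask per dataset. A minimal sketch of that mapping, based only on the layout listed above; the helper name, the rule table, and the example file names are hypothetical and not taken from utils/data_loader.py:

```python
from pathlib import Path

# (suffix appended to the image stem, ground-truth extension) per dataset
GT_RULES = {
    "swimseg": ("_GT", ".png"),
    "swinseg": ("_GT", ".jpg"),
    "swinyseg": ("", ".png"),   # same stem, no suffix
}


def gt_path_for(image_path, dataset: str) -> Path:
    """Return the expected ground-truth path for an image under data/<dataset>/images/."""
    image_path = Path(image_path)
    suffix, ext = GT_RULES[dataset]
    dataset_root = image_path.parent.parent   # .../<dataset>/
    return dataset_root / "GTmaps" / f"{image_path.stem}{suffix}{ext}"


# Hypothetical file names, for illustration only
print(gt_path_for("data/swimseg/images/0001.png", "swimseg").as_posix())
print(gt_path_for("data/swinyseg/images/d0001.jpg", "swinyseg").as_posix())
```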
DMSA-CloudNet/
│
├── models/
│ ├── __init__.py # Model registry + get_model()
│ └── dmsa_cloudnet.py # Full architecture implementation
│
├── utils/
│ ├── data_loader.py # CloudDataset + deterministic splits
│ ├── losses.py # DiceLoss + HybridDiceBCELoss
│ ├── metrics.py # F1, Dice, Precision, Recall, Error Rate
│ └── visualization.py # Training curves, comparison grids
│
├── assets/
│ └── comparison_figure.png # Qualitative results figure
│
├── checkpoints/ # Trained weights (see Pretrained Models)
├── data/ # Datasets (download separately)
├── results/ # Auto-generated evaluation outputs
├── logs/ # Auto-generated training logs
│
├── config.py # All hyperparameters and paths
├── train.py # Training entry point
├── evaluate.py # Evaluation entry point
├── predict.py # Inference on arbitrary images
├── DMSA_CloudNet_Demo.ipynb # End-to-end interactive notebook
│
├── requirements.txt
├── .gitignore
└── README.md
git clone https://github.com/ShihabRezaAdit/DMSA-CloudNet.git
cd DMSA-CloudNet
pip install -r requirements.txt

Download SWIMSeg, SWINSeg, and SWINYSeg and place them under data/ as shown above.
# Single dataset
python train.py --dataset swimseg
# All three datasets sequentially
python train.py --dataset all
# Force re-train (override existing checkpoint)
python train.py --dataset swimseg --force
# Custom epoch count
python train.py --dataset swinyseg --epochs 30

# Single dataset (prints metrics table)
python evaluate.py --dataset swimseg
# All datasets (also saves results/all_datasets_summary.csv)
python evaluate.py --dataset all

# Single image → saves prediction mask
python predict.py --input sky.jpg --dataset swimseg
# Entire folder
python predict.py --input ./my_images/ --dataset swimseg --output ./predictions/

jupyter notebook DMSA_CloudNet_Demo.ipynb

The notebook covers: setup → dataset exploration → training → evaluation → visual comparison → inference demo.
| Setting | Value |
|---|---|
| Random seed | 42 |
| Train / Test split | 80% / 20% (deterministic torch.Generator) |
| Image size | 304 × 304 |
| Optimiser | Adam, lr=1e-3 |
| LR schedule | Exponential decay γ=0.95 |
| Loss | HybridDiceBCE (α=0.5) |
| Max epochs | 50 |
| Early stopping | patience=10 |
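The loss in the table combines a Dice term and a binary cross-entropy term with weight α=0.5. The sketch below is one plausible formulation, not the code in utils/losses.py: it assumes the inputs are sigmoid probabilities (the diagram ends in a Sigmoid), that Dice is computed per image and then averaged, and that a smoothing constant `eps` is used; the class name and `eps` are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class HybridDiceBCE(nn.Module):
    """alpha * DiceLoss + (1 - alpha) * BCE on sigmoid probabilities."""

    def __init__(self, alpha: float = 0.5, eps: float = 1.0):
        super().__init__()
        self.alpha = alpha   # weight of the Dice term
        self.eps = eps       # smoothing constant (assumed value)

    def forward(self, probs: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # probs and target have shape (N, 1, H, W); target is a 0/1 float mask
        p = probs.flatten(1)
        t = target.flatten(1)
        intersection = (p * t).sum(dim=1)
        dice = (2 * intersection + self.eps) / (p.sum(dim=1) + t.sum(dim=1) + self.eps)
        dice_loss = 1 - dice.mean()
        bce = F.binary_cross_entropy(probs, target)
        return self.alpha * dice_loss + (1 - self.alpha) * bce


torch.manual_seed(42)
probs = torch.rand(2, 1, 16, 16)                      # stand-in for model output
target = (torch.rand(2, 1, 16, 16) > 0.5).float()     # stand-in for a GT mask
loss = HybridDiceBCE(alpha=0.5)(probs, target)
print(loss.item())
```

The Dice term measures region overlap and is less sensitive to the cloud/sky pixel ratio, while the BCE term supplies a per-pixel gradient; weighting them equally is the setting reported in the table.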
To fully reproduce all reported results from scratch:
python train.py --dataset all
python evaluate.py --dataset all

All hyperparameters are centralised in config.py; there are no magic numbers elsewhere.
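As a rough guide to what the reported schedule implies, the sketch below computes the learning rate under exponential decay (lr=1e-3, γ=0.95) and shows a simple patience-based stopping rule. It assumes the decay is applied once per epoch and that early stopping monitors a loss that should decrease; the README does not state which quantity is monitored. The helper names are illustrative and are not taken from train.py.

```python
LR = 1e-3        # initial learning rate
GAMMA = 0.95     # exponential decay factor
PATIENCE = 10    # epochs without improvement before stopping


def lr_at(epoch: int) -> float:
    """Learning rate after `epoch` decay steps (epoch 0 = initial rate)."""
    return LR * GAMMA ** epoch


class EarlyStopping:
    """Signals a stop after `patience` consecutive epochs without improvement."""

    def __init__(self, patience: int):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, monitored_loss: float) -> bool:
        if monitored_loss < self.best:
            self.best = monitored_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


for epoch in (0, 10, 49):
    print(f"epoch {epoch:>2}: lr = {lr_at(epoch):.2e}")

# Small demo with patience=2: the loss stops improving after the second epoch
stopper = EarlyStopping(patience=2)
decisions = [stopper.step(v) for v in (1.0, 0.9, 0.95, 0.97)]
print(decisions)   # [False, False, False, True]
```

Under these assumptions, the learning rate at the final epoch of a full 50-epoch run is roughly 8% of its initial value.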
Edit config.py to customise any hyperparameter:
class Config:
    IMG_SIZE = 304          # Input resolution
    TRAIN_SPLIT = 0.8       # 80/20 train-test split
    EPOCHS = 50             # Max training epochs
    PATIENCE = 10           # Early stopping patience
    LR = 1e-3               # Initial learning rate
    LR_DECAY_GAMMA = 0.95   # Exponential LR decay
    ALPHA_HDBL = 0.5        # Dice weight in HybridDiceBCELoss
    RANDOM_SEED = 42        # Global random seed

- Python ≥ 3.9
- PyTorch ≥ 2.0 (CUDA recommended)
- See requirements.txt for the full list

pip install -r requirements.txt

This paper is currently unpublished. If you use this code or build upon this work, please cite:
@misc{reza2026dmsacloudnet,
title = {DMSA-CloudNet: Dilated Multi-Scale Attention for Ultra-Lightweight Ground-Based Cloud Segmentation},
author = {Reza, Md Shihab and Chayon, Muhammad Hasibur Rashid},
year = {2026},
note = {Code available at https://github.com/ShihabRezaAdit/DMSA-CloudNet},
institution = {North South University and American International University-Bangladesh (AIUB), Dhaka, Bangladesh}
}

This project is licensed under the MIT License; see LICENSE for details.
Dataset licenses apply independently:
- SWIMSeg / SWINSeg / SWINYSeg: please refer to each dataset's official page for usage terms.
