DMSA-CloudNet

Dilated Multi-Scale Attention CloudNet for Ground-Based Cloud Segmentation

Md Shihab Reza¹ · Dr. Muhammad Hasibur Rashid Chayon²
¹ North South University (NSU), Dhaka, Bangladesh
² American International University-Bangladesh (AIUB), Dhaka, Bangladesh

Abstract

We present DMSA-CloudNet, an ultra-lightweight encoder-decoder architecture for ground-based cloud segmentation. Despite containing only 1,727 parameters (6.75 KB) — smaller than a typical JPEG image — DMSA-CloudNet achieves competitive performance against significantly larger baselines across three public benchmarks: SWIMSeg, SWINSeg, and SWINYSeg. The model combines depthwise-separable convolutions, a Dilated Multi-Scale (DMS) bottleneck with four parallel receptive fields (d=1,2,4,8), and Squeeze-Excitation Skip Attention (SESA) to capture multi-scale cloud structures efficiently. A Hybrid Dice-BCE loss further stabilises training on imbalanced cloud/sky masks.
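
The 6.75 KB figure is consistent with the parameter count if every weight is stored as a 32-bit float. That storage format is an assumption here (the abstract gives only the count and the size), and the helper name is illustrative:

```python
def model_size_kib(num_params: int, bytes_per_param: int = 4) -> float:
    """Return the raw weight size in KiB (1 KiB = 1024 bytes).

    bytes_per_param=4 assumes float32 storage, which is an assumption.
    """
    return num_params * bytes_per_param / 1024


if __name__ == "__main__":
    # 1,727 is the parameter count reported in the abstract.
    print(f"{model_size_kib(1727):.2f} KB")  # 6.75 KB
```

This counts raw weights only; a saved checkpoint file would typically be larger because of serialisation metadata.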

Qualitative Results

Figure 1. Qualitative comparison of DMSA-CloudNet predictions vs. ground truth across SWIMSeg, SWINSeg, and SWINYSeg. Each row represents one dataset. Columns: Original Image | Ground Truth | DMSA-CloudNet Prediction.

Quantitative Results

Test-Set Performance (80/20 deterministic split, seed=42)

Dataset	Precision	Recall	F1 Score	Dice Score	Error Rate	Model Size
SWIMSeg	0.7458	0.8835	0.8088	0.8088	0.2226	6.75 KB
SWINSeg	0.8184	0.7690	0.7930	0.7930	0.1957	6.75 KB
SWINYSeg	0.7543	0.9111	0.8254	0.8254	0.2045	6.75 KB

All metrics are computed on the held-out 20% test split, never seen during training.
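
A minimal sketch of how the reported metrics relate to pixel-level confusion counts. This is not the code in utils/metrics.py: the function name, the 0.5 threshold, and the definition of error rate as the fraction of misclassified pixels are assumptions. For binary masks the F1 score and the Dice score are the same quantity, which is why the two columns in the table match.

```python
from typing import Dict, Sequence


def binary_metrics(pred: Sequence[float], target: Sequence[int],
                   threshold: float = 0.5) -> Dict[str, float]:
    """Pixel-level metrics for one flattened probability map and binary mask."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, target):
        label = 1 if p >= threshold else 0  # threshold is an assumption
        if label == 1 and t == 1:
            tp += 1
        elif label == 1 and t == 0:
            fp += 1
        elif label == 0 and t == 1:
            fn += 1
        else:
            tn += 1

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # Dice written directly from the counts; algebraically equal to F1.
    dice = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    total = tp + fp + fn + tn
    error_rate = (fp + fn) / total if total else 0.0  # assumed definition
    return {"precision": precision, "recall": recall, "f1": f1,
            "dice": dice, "error_rate": error_rate}


if __name__ == "__main__":
    probs = [0.9, 0.8, 0.7, 0.2, 0.1, 0.6]   # toy predictions
    mask = [1, 1, 0, 0, 0, 1]                # toy ground truth
    for name, value in binary_metrics(probs, mask).items():
        print(f"{name}: {value:.4f}")
```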

Architecture

                    ┌──────────────────────────────────┐
   Input            │         DMSA-CloudNet            │
  (3×H×W) ─────────►                                  │
                    │  Encoder (6-ch DS-Conv × 3)      │
                    │    ├─ s1 ──► SESA ──────────┐    │
                    │    ├─ s2 ──► SESA ───────┐  │    │
                    │    └─ s3 ──► SESA ────┐  │  │    │
                    │                       │  │  │    │
                    │  Adapt  (6 → 8 ch)    │  │  │    │
                    │                       │  │  │    │
                    │  DMS Bottleneck       │  │  │    │
                    │   ├─ d=1             │  │  │    │
                    │   ├─ d=2             │  │  │    │
                    │   ├─ d=4             │  │  │    │
                    │   └─ d=8             │  │  │    │
                    │   └──cat──►merge     │  │  │    │
                    │                       │  │  │    │
                    │  Decoder (6-ch DS)    │  │  │    │
                    │   ├─ up3 ◄─────────cat◄─┘  │    │
                    │   ├─ up2 ◄──────────cat◄───┘    │
                    │   └─ up1 ◄───────────cat◄────────┘
                    │                                  │
                    │  Output conv → Sigmoid           │
                    └─────────────────┬────────────────┘
                                      │
                              Mask (1×H×W)
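
To make the four DMS branches concrete, the sketch below computes the spatial extent covered by a dilated kernel and the padding that keeps the feature-map size unchanged. It assumes 3×3 kernels with stride 1 in every branch, which the diagram does not state; the function names are illustrative.

```python
def effective_kernel(kernel: int, dilation: int) -> int:
    """Spatial extent covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel + (kernel - 1) * (dilation - 1)


def same_padding(kernel: int, dilation: int) -> int:
    """Padding that preserves spatial size for an odd kernel at stride 1."""
    return dilation * (kernel - 1) // 2


if __name__ == "__main__":
    # Dilation rates taken from the DMS bottleneck; kernel=3 is an assumption.
    for d in (1, 2, 4, 8):
        print(d, effective_kernel(3, d), same_padding(3, d))
    # 1 3 1
    # 2 5 2
    # 4 9 4
    # 8 17 8
```

Under that assumption a single branch spans between 3 and 17 pixels per side while each uses the same nine weights per channel, which is how the bottleneck widens context without adding parameters per branch.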

Key Design Decisions

Component	Description	Benefit
Depthwise-Separable Conv	Factorises 3×3 conv into depthwise + pointwise	Fewer parameters: up to ~8× at wide layers, about 3.6× at 6 channels
DMS Bottleneck	4 parallel dilated branches (d=1,2,4,8)	Multi-scale cloud context
SESA Attention	SE-style channel recalibration on skip connections	Sharpens boundary details
HybridDiceBCE Loss	0.5 × Dice + 0.5 × BCE	Stable on class-imbalanced masks
Narrow channels (E=6, B=8)	Ultra-low channel widths throughout	Entire model < 7 KB
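
The parameter saving from the depthwise-separable factorisation can be checked with simple arithmetic. The sketch below counts convolution weights only (no biases or normalisation layers, which is a simplification) and uses illustrative function names.

```python
def standard_conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ds_conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = c_in * k * k      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 conv that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    for c in (6, 64):
        std = standard_conv_params(c, c)
        ds = ds_conv_params(c, c)
        print(f"{c} channels: {std} vs {ds} weights, {std / ds:.1f}x fewer")
    # 6 channels: 324 vs 90 weights, 3.6x fewer
    # 64 channels: 36864 vs 4672 weights, 7.9x fewer
```

The reduction factor for a 3×3 kernel is 9·C_out / (9 + C_out), so it approaches 9× only as the layer gets wide.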

Datasets

Dataset	Split	Images	Setting	GT Format
SWIMSeg	811 train / 202 test	1,013	Daytime	PNG binary
SWINSeg	92 train / 23 test	115	Nighttime	JPG binary
SWINYSeg	5,414 train / 1,354 test	6,768	Day + Night	PNG binary
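
A seeded 80/20 split of this kind can be sketched with the standard library. The repository itself uses a seeded torch.Generator (see Reproducibility), so this stand-in will not reproduce the same index assignment; the function name and the rounding rule for the train size are assumptions.

```python
import random
from typing import List, Tuple


def deterministic_split(n_items: int, train_fraction: float = 0.8,
                        seed: int = 42) -> Tuple[List[int], List[int]]:
    """Shuffle indices with a fixed seed and cut them into train/test parts."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # private RNG, global state untouched
    n_train = int(round(n_items * train_fraction))  # rounding is an assumption
    return indices[:n_train], indices[n_train:]


if __name__ == "__main__":
    train_idx, test_idx = deterministic_split(115)  # SWINSeg has 115 images
    print(len(train_idx), len(test_idx))  # 92 23
    # Same seed, same split: the test set never leaks into training.
    assert deterministic_split(115) == (train_idx, test_idx)
```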

Download and arrange under data/:

data/
├── swimseg/
│   ├── images/          # *.png
│   └── GTmaps/          # *_GT.png
├── swinseg/
│   ├── images/          # *.jpg
│   └── GTmaps/          # *_GT.jpg
└── swinyseg/
    ├── images/          # *.jpg
    └── GTmaps/          # *.png  (same stem, no suffix)
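
Because the three datasets name their ground-truth files differently, a loader has to map each image to its mask. The sketch below encodes the naming rules read off the layout above; it is not the code in utils/data_loader.py, and the helper name, the rule table, and the example file names are hypothetical.

```python
from pathlib import Path

# (suffix appended to the image stem, ground-truth extension) per dataset,
# taken from the comments in the directory layout.
GT_RULES = {
    "swimseg": ("_GT", ".png"),
    "swinseg": ("_GT", ".jpg"),
    "swinyseg": ("", ".png"),   # same stem, no suffix
}


def gt_path_for(image_path: Path, dataset: str) -> Path:
    """Return the expected ground-truth path for an image under data/<dataset>/images/."""
    suffix, ext = GT_RULES[dataset]
    gt_dir = image_path.parent.parent / "GTmaps"
    return gt_dir / f"{image_path.stem}{suffix}{ext}"


if __name__ == "__main__":
    # "0001" is a made-up file name used only for illustration.
    print(gt_path_for(Path("data/swimseg/images/0001.png"), "swimseg").as_posix())
    # data/swimseg/GTmaps/0001_GT.png
    print(gt_path_for(Path("data/swinyseg/images/0001.jpg"), "swinyseg").as_posix())
    # data/swinyseg/GTmaps/0001.png
```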

Repository Structure

DMSA-CloudNet/
│
├── models/
│   ├── __init__.py              # Model registry + get_model()
│   └── dmsa_cloudnet.py         # Full architecture implementation
│
├── utils/
│   ├── data_loader.py           # CloudDataset + deterministic splits
│   ├── losses.py                # DiceLoss + HybridDiceBCELoss
│   ├── metrics.py               # F1, Dice, Precision, Recall, Error Rate
│   └── visualization.py         # Training curves, comparison grids
│
├── assets/
│   └── comparison_figure.png    # Qualitative results figure
│
├── checkpoints/                 # Trained weights (see Pretrained Models)
├── data/                        # Datasets (download separately)
├── results/                     # Auto-generated evaluation outputs
├── logs/                        # Auto-generated training logs
│
├── config.py                    # All hyperparameters and paths
├── train.py                     # Training entry point
├── evaluate.py                  # Evaluation entry point
├── predict.py                   # Inference on arbitrary images
├── DMSA_CloudNet_Demo.ipynb     # End-to-end interactive notebook
│
├── requirements.txt
├── .gitignore
└── README.md

Quickstart

1. Clone & Install

git clone https://github.com/ShihabRezaAdit/DMSA-CloudNet.git
cd DMSA-CloudNet
pip install -r requirements.txt

2. Download Datasets

Download SWIMSeg, SWINSeg, and SWINYSeg and place them under data/ as shown above.

3. Train

# Single dataset
python train.py --dataset swimseg

# All three datasets sequentially
python train.py --dataset all

# Force re-train (override existing checkpoint)
python train.py --dataset swimseg --force

# Custom epoch count
python train.py --dataset swinyseg --epochs 30

4. Evaluate

# Single dataset — prints metrics table
python evaluate.py --dataset swimseg

# All datasets — also saves results/all_datasets_summary.csv
python evaluate.py --dataset all

5. Predict on New Images

# Single image → saves prediction mask
python predict.py --input sky.jpg --dataset swimseg

# Entire folder
python predict.py --input ./my_images/ --dataset swimseg --output ./predictions/

6. Interactive Notebook

jupyter notebook DMSA_CloudNet_Demo.ipynb

The notebook covers: setup → dataset exploration → training → evaluation → visual comparison → inference demo.

Reproducibility

Setting	Value
Random seed	`42`
Train / Test split	80% / 20% (deterministic `torch.Generator`)
Image size	304 × 304
Optimiser	Adam, lr=1e-3
LR schedule	Exponential decay γ=0.95
Loss	HybridDiceBCE (α=0.5)
Max epochs	50
Early stopping	patience=10
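
The schedule and stopping rule in the table can be sketched without any framework. This is an illustration, not the logic in train.py: it assumes the learning rate is decayed once per epoch and that early stopping monitors a loss that should decrease, neither of which is stated in the table.

```python
from typing import Optional, Sequence


def lr_at_epoch(epoch: int, base_lr: float = 1e-3, gamma: float = 0.95) -> float:
    """Learning rate after `epoch` exponential decay steps (epoch 0 = initial)."""
    return base_lr * gamma ** epoch


def stopping_epoch(losses: Sequence[float], patience: int = 10) -> Optional[int]:
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once the monitored loss has failed to improve on its
    best value for `patience` consecutive epochs.
    """
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None


if __name__ == "__main__":
    print(f"{lr_at_epoch(0):.6f}")  # 0.001000
    print(f"{lr_at_epoch(1):.6f}")  # 0.000950
    # Improvement for two epochs, then a plateau: stops ten epochs later.
    print(stopping_epoch([1.0, 0.9] + [0.95] * 10))  # 12
```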

To fully reproduce all reported results from scratch:

python train.py --dataset all
python evaluate.py --dataset all

All hyperparameters are centralised in config.py — no magic numbers elsewhere.

Configuration

Edit config.py to customise any hyperparameter:

class Config:
    IMG_SIZE       = 304      # Input resolution
    TRAIN_SPLIT    = 0.8      # 80/20 train-test split
    EPOCHS         = 50       # Max training epochs
    PATIENCE       = 10       # Early stopping patience
    LR             = 1e-3     # Initial learning rate
    LR_DECAY_GAMMA = 0.95     # Exponential LR decay
    ALPHA_HDBL     = 0.5      # Dice weight in HybridDiceBCELoss
    RANDOM_SEED    = 42       # Global random seed
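
ALPHA_HDBL weights the Dice term against the BCE term. A minimal pure-Python sketch of that combination on flattened probabilities follows. It is not the implementation in utils/losses.py: the smoothing constant, the clamping epsilon, and the function names are assumptions.

```python
import math
from typing import Sequence


def dice_loss(pred: Sequence[float], target: Sequence[float],
              smooth: float = 1.0) -> float:
    """Soft Dice loss; `smooth` (an assumed value) avoids division by zero."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def bce_loss(pred: Sequence[float], target: Sequence[float],
             eps: float = 1e-7) -> float:
    """Mean binary cross-entropy with probabilities clamped to [eps, 1 - eps]."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(pred)


def hybrid_dice_bce(pred: Sequence[float], target: Sequence[float],
                    alpha: float = 0.5) -> float:
    """alpha * Dice + (1 - alpha) * BCE; alpha plays the role of ALPHA_HDBL."""
    return alpha * dice_loss(pred, target) + (1.0 - alpha) * bce_loss(pred, target)


if __name__ == "__main__":
    mask = [1.0, 0.0, 1.0, 0.0]
    good = [0.9, 0.1, 0.8, 0.2]   # close to the mask
    bad = [0.2, 0.8, 0.1, 0.9]    # mostly inverted
    print(hybrid_dice_bce(good, mask) < hybrid_dice_bce(bad, mask))  # True
```

The Dice term depends on overlap rather than on per-pixel counts, so it stays informative when cloud pixels are a small share of the image, while the BCE term supplies a per-pixel gradient.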

Requirements

Python ≥ 3.9
PyTorch ≥ 2.0 (CUDA recommended)
See requirements.txt for full list

pip install -r requirements.txt

Citation

This paper is currently unpublished. If you use this code or build upon this work, please cite:

@misc{reza2026dmsacloudnet,
  title        = {DMSA-CloudNet: Dilated Multi-Scale Attention for Ultra-Lightweight Ground-Based Cloud Segmentation},
  author       = {Reza, Md Shihab and Chayon, Muhammad Hasibur Rashid},
  year         = {2026},
  note         = {Code available at https://github.com/ShihabRezaAdit/DMSA-CloudNet},
  institution  = {North South University and American International University-Bangladesh (AIUB), Dhaka, Bangladesh}
}

License

This project is licensed under the MIT License — see LICENSE for details.

Dataset licenses apply independently:

SWIMSeg / SWINSeg / SWINYSeg: please refer to each dataset's official page for usage terms.

North South University · American International University-Bangladesh (AIUB) · Dhaka, Bangladesh

If you find this work useful, please consider giving it a ⭐
