This repository currently implements a from-scratch (no pretrained weights) semantic segmentation pipeline focused on robustness to class imbalance in off-road scenes.
The active, working code path in this snapshot is under project/:
- Training: `project/train.py`
- Inference helpers: `project/inference_utils.py`
- Before/after benchmarking: `project/before_after_eval.py`
- Streamlit QA apps: `project/streamlit_model_tester.py`, `project/streamlit_before_after.py`

Repository layout:

    project/
      config.py                  # Config defaults, merge, path resolution, validation
      dataset.py                 # Dataset, label mapping, object-focused crop, dataset stats
      sampler.py                 # Class-aware and weighted sampling
      model.py                   # UNet-from-scratch with optional deep supervision
      loss.py                    # Hybrid loss (CE + Dice + Focal + KL distribution alignment)
      train.py                   # End-to-end training loop and validation
      inference_utils.py         # Model loading, prediction, overlays, per-image metrics
      before_after_eval.py       # Batch before/after evaluation with robust pairing
      streamlit_model_tester.py  # Single-image + batch QA dashboard
      streamlit_before_after.py  # Focused before/after dashboard
      scratch_hardfix.yaml       # Full training config
      smoke_hardfix.yaml         # Fast smoke-test config
      outputs/
    README.md
    requirements.txt
    streamlit_app.py             # Legacy UI path (see note below)
    main.py                      # Legacy CLI path (see note below)

- Load YAML config.
- Deep-merge it with `DEFAULT_CONFIG`.
- Resolve all `paths.*` values relative to the config file location.
- Validate:
  - `data.num_classes > 1`
  - `ignore_index` is valid
  - `model.pretrained` is disabled (hard requirement)
  - train/val image and mask folders exist
  - key training hyperparameters are positive
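
The config steps above can be sketched in plain Python. This is a minimal illustration, not the code in `project/config.py`: the contents of `DEFAULT_CONFIG`, the helper names `deep_merge`, `resolve_paths`, and `validate`, and the example keys are all assumptions.

```python
from copy import deepcopy
from pathlib import Path

# Hypothetical defaults; the real DEFAULT_CONFIG lives in project/config.py.
DEFAULT_CONFIG = {
    "data": {"num_classes": 10, "ignore_index": 255},
    "model": {"pretrained": False},
    "paths": {"train_images": "data/train_images"},
}

def deep_merge(base, override):
    """Return a new dict in which nested keys from `override` win over `base`."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged

def resolve_paths(cfg, config_file):
    """Make every relative paths.* entry absolute, relative to the config file's folder."""
    root = Path(config_file).resolve().parent
    out = deepcopy(cfg)
    for key, value in out.get("paths", {}).items():
        p = Path(value)
        out["paths"][key] = str(p if p.is_absolute() else root / p)
    return out

def validate(cfg):
    """Checks that need no filesystem access (folder checks are omitted here)."""
    if cfg["data"]["num_classes"] <= 1:
        raise ValueError("data.num_classes must be > 1")
    if cfg["model"]["pretrained"]:
        raise ValueError("Pretrained models are disabled")

user_cfg = {"data": {"num_classes": 4}, "paths": {"train_images": "imgs/train"}}
cfg = resolve_paths(deep_merge(DEFAULT_CONFIG, user_cfg), "project/scratch_hardfix.yaml")
validate(cfg)
```

Keys missing from the user config fall back to the defaults (`ignore_index` stays 255 here), and the defaults themselves are never mutated.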
- Enumerate image files from `paths.train_images` / `paths.val_images`.
- Align masks by filename stem (image `abc.png` must have mask stem `abc.*`).
- Convert raw mask values to class indices via optional `label_map`.
- Compute dataset statistics:
  - per-class pixel counts
  - per-image class sets
  - class-to-image index map
  - rare classes via quantile threshold
- Training-only logic:
  - object/rare-class focused crop (`object_crop_prob`, `object_crop_size`)
  - Albumentations augmentations (random resized crop, flip, color jitter, blur)
- Validation logic:
  - deterministic resize + normalize only
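
To make the dataset statistics concrete, here is a small pure-Python sketch that computes the four quantities listed above from already-remapped masks. The function name `dataset_stats`, the `rare_quantile` default, and the nearest-rank quantile rule are assumptions; `project/dataset.py` may define "rare" differently.

```python
from collections import defaultdict

def dataset_stats(masks, num_classes, ignore_index=255, rare_quantile=0.25):
    """masks: list of 2D lists of class indices, one per image."""
    pixel_counts = [0] * num_classes
    image_classes = []                   # per-image class sets
    class_to_images = defaultdict(list)  # class index -> image indices
    for idx, mask in enumerate(masks):
        present = set()
        for row in mask:
            for value in row:
                if value == ignore_index:
                    continue
                pixel_counts[value] += 1
                present.add(value)
        image_classes.append(present)
        for c in sorted(present):
            class_to_images[c].append(idx)

    # Rare classes (assumed rule): present classes whose pixel count is at or
    # below the chosen quantile of the non-zero counts.
    nonzero = sorted(n for n in pixel_counts if n > 0)
    rank = max(0, min(len(nonzero) - 1, int(rare_quantile * (len(nonzero) - 1))))
    threshold = nonzero[rank] if nonzero else 0
    rare = [c for c, n in enumerate(pixel_counts) if 0 < n <= threshold]
    return pixel_counts, image_classes, dict(class_to_images), rare

masks = [
    [[0, 0, 0], [0, 1, 1]],
    [[0, 0, 2], [0, 0, 255]],   # 255 is ignored
    [[1, 1, 1], [1, 0, 0]],
]
counts, per_image, index_map, rare = dataset_stats(masks, num_classes=3)
```

On real masks the inner loops would normally be replaced by an array operation; nested lists are used here only to keep the example dependency-free.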
`train.py` chooses one strategy:
- `class_aware`:
  - ensures rare-class samples are included each batch
  - fills remaining slots via inverse-frequency class sampling
- `weighted`:
  - uses image-level weighted random sampling, upweighting rare-class images
- `random`:
  - standard shuffled minibatches
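
The `class_aware` strategy can be illustrated with the following sketch, which builds one batch of image indices. It is an assumption-laden outline rather than the sampler in `project/sampler.py`: the function name, its arguments, and the choice to weight classes by `1 / pixel_count` are hypothetical.

```python
import random

def class_aware_batch(class_to_images, pixel_counts, rare_classes,
                      batch_size, rare_per_batch, rng):
    """Return image indices for one batch (an image may appear more than once)."""
    batch = []
    # 1) Guarantee rare-class coverage: pick a rare class, then an image containing it.
    if rare_classes:
        for _ in range(min(rare_per_batch, batch_size)):
            cls = rng.choice(rare_classes)
            batch.append(rng.choice(class_to_images[cls]))
    # 2) Fill remaining slots: draw a class with probability proportional to
    #    1 / pixel_count, then draw an image that contains that class.
    classes = [c for c in class_to_images if pixel_counts[c] > 0]
    weights = [1.0 / pixel_counts[c] for c in classes]
    while len(batch) < batch_size:
        cls = rng.choices(classes, weights=weights, k=1)[0]
        batch.append(rng.choice(class_to_images[cls]))
    return batch

class_to_images = {0: [0, 1, 2], 1: [0, 2], 2: [1]}
pixel_counts = [10, 6, 1]
batch = class_aware_batch(class_to_images, pixel_counts, rare_classes=[2],
                          batch_size=4, rare_per_batch=1, rng=random.Random(0))
```

Because class 2 appears only in image 1, the first slot of every batch in this toy setup is image 1, which is the property the rare-class coverage assertion would check.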
- Build `UNetScratch` (no pretrained backbone).
- Encoder-decoder UNet blocks with skip connections.
- Main segmentation head outputs `num_classes` logits.
- Optional auxiliary head from intermediate decoder feature map for deep supervision.
- Output is `(main_logits, aux_logits)`.
Per forward pass, HybridSegLoss computes:
- Weighted Cross-Entropy
- Multi-class Dice loss
- Focal loss
- KL divergence between predicted class distribution and GT distribution
Total loss:

    ce_weight * CE + dice_weight * Dice + focal_weight * Focal + dist_kl_weight * KL

If auxiliary logits exist, their loss is added to the total, scaled by `model.aux_weight`.
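
The weighted sum can be worked through on a handful of pixels in plain Python. This is a sketch under stated assumptions, not `HybridSegLoss` itself: the default weights, `gamma`, the Dice smoothing constant, the KL direction (GT distribution against mean predicted distribution), and applying the same loss to the auxiliary logits with a weight of 0.4 are all illustrative choices.

```python
import math

def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def hybrid_loss(logits, targets, class_weights, ce_weight=1.0, dice_weight=1.0,
                focal_weight=1.0, dist_kl_weight=0.1, gamma=2.0, smooth=1.0, eps=1e-6):
    """logits: one list of class logits per pixel; targets: one class index per pixel."""
    num_classes = len(class_weights)
    n = len(targets)
    probs = [softmax(row) for row in logits]
    nll = [-math.log(p[t] + eps) for p, t in zip(probs, targets)]

    # Weighted cross-entropy, normalised by the sum of the applied class weights.
    w = [class_weights[t] for t in targets]
    ce = sum(wi * li for wi, li in zip(w, nll)) / sum(w)

    # Focal loss: down-weights pixels that are already classified confidently.
    focal = sum((1.0 - p[t]) ** gamma * li for p, t, li in zip(probs, targets, nll)) / n

    # Soft multi-class Dice, averaged over classes.
    dice = 0.0
    for c in range(num_classes):
        inter = sum(p[c] for p, t in zip(probs, targets) if t == c)
        pred_mass = sum(p[c] for p in probs)
        gt_mass = sum(1 for t in targets if t == c)
        dice += 1.0 - (2.0 * inter + smooth) / (pred_mass + gt_mass + smooth)
    dice /= num_classes

    # KL between the GT class distribution and the mean predicted distribution.
    gt_dist = [sum(1 for t in targets if t == c) / n for c in range(num_classes)]
    pred_dist = [sum(p[c] for p in probs) / n for c in range(num_classes)]
    kl = sum(g * math.log((g + eps) / (q + eps)) for g, q in zip(gt_dist, pred_dist) if g > 0)

    total = ce_weight * ce + dice_weight * dice + focal_weight * focal + dist_kl_weight * kl
    return {"ce": ce, "dice": dice, "focal": focal, "kl": kl, "total": total}

main_logits = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0]]
aux_logits = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
targets = [0, 1]
main = hybrid_loss(main_logits, targets, class_weights=[1.0, 1.0, 1.0])
aux = hybrid_loss(aux_logits, targets, class_weights=[1.0, 1.0, 1.0])
combined = main["total"] + 0.4 * aux["total"]   # 0.4 stands in for model.aux_weight
```

Returning the individual terms alongside the total mirrors the per-term loss breakdown that the training loop logs.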
- Set seed and logger.
- Build train/val datasets and dataloaders.
- Compute inverse-log class weights from dataset class pixel counts.
- Create model, `AdamW`, cosine LR scheduler, AMP scaler, and hybrid criterion.
- For each epoch:
  - train with mixed precision (if CUDA + `amp=true`)
  - apply grad clipping
  - log loss breakdown (`ce`, `dice`, `focal`, `kl`)
  - run validation and compute confusion-matrix metrics
  - log per-class IoU and GT/pred class distributions
  - save `last.pth`, and update `best.pth` when mIoU improves
- Guardrails:
  - optional assertion that every batch contains at least one rare class
  - optional assertion that all dataset classes are seen each epoch
  - warning when model predicts classes absent from GT above threshold
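
The "inverse-log class weights" step is not spelled out above. One common form, shown here purely as an assumption about what `train.py` might do, is `1 / ln(offset + frequency)` with a small offset such as 1.02, so that rare classes get larger weights without the extreme values plain inverse frequency would produce.

```python
import math

def inverse_log_weights(pixel_counts, offset=1.02):
    """Weight each class by 1 / ln(offset + class_frequency); offset is an assumed constant."""
    total = sum(pixel_counts)
    if total == 0:
        raise ValueError("pixel_counts must contain at least one labelled pixel")
    weights = []
    for count in pixel_counts:
        freq = count / total
        weights.append(1.0 / math.log(offset + freq))
    return weights

# Three classes covering 90%, 9%, and 1% of the labelled pixels.
weights = inverse_log_weights([900_000, 90_000, 10_000])
```

With these counts the rarest class receives the largest weight and the dominant class the smallest, while every weight stays finite and positive.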
From confusion matrix:
- `miou`
- `dice`
- `map50` (fraction of classes with IoU >= 0.5)
- `pixel_acc`
- `per_class_iou`
- `gt_dist` and `pred_dist`
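
As an illustration of how these values follow from a confusion matrix, the sketch below derives each of them from a nested list. The function name and the convention of averaging only over classes that appear in GT or predictions are assumptions, not a description of the repository's implementation.

```python
def metrics_from_confusion(cm):
    """cm[i][j] = number of pixels with ground-truth class i predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    per_class_iou, per_class_dice = [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(cm[r][c] for r in range(n)) - tp
        union = tp + fp + fn
        # Classes absent from both GT and predictions are marked None and skipped.
        per_class_iou.append(tp / union if union else None)
        per_class_dice.append(2 * tp / (2 * tp + fp + fn) if union else None)
    ious = [v for v in per_class_iou if v is not None]
    dices = [v for v in per_class_dice if v is not None]
    return {
        "miou": sum(ious) / len(ious),
        "dice": sum(dices) / len(dices),
        "map50": sum(1 for v in ious if v >= 0.5) / len(ious),
        "pixel_acc": sum(cm[c][c] for c in range(n)) / total,
        "per_class_iou": per_class_iou,
        "gt_dist": [sum(cm[c]) / total for c in range(n)],
        "pred_dist": [sum(cm[r][c] for r in range(n)) / total for c in range(n)],
    }

metrics = metrics_from_confusion([[8, 2],
                                  [1, 9]])
```

For this two-class matrix the per-class IoUs are 8/11 and 9/12, both above 0.5, so `map50` is 1.0 and `pixel_acc` is 17/20.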
Training outputs go to:
- `project/outputs/<run_name>/train.log`
- `project/outputs/<run_name>/checkpoints/best.pth`
- `project/outputs/<run_name>/checkpoints/last.pth`
- Load config + checkpoint bundle.
- Build file maps for before/after/GT directories.
- Pair images robustly (`auto`, `stem`, `normalized`, `numeric`, `hash`, `index`).
- For each matched triplet:
  - predict mask on before and after image
  - compare each prediction against GT
  - record per-image metrics and gains
- Aggregate confusion matrices across all pairs.
- Save `results.json` and optional visualization panels.
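
The pairing modes are only named above, so the following is a guess at how three of them (`stem`, `normalized`, `numeric`) and an `auto` fallback could work. All function names, the tokens stripped during normalization, and the "keep whichever strategy matches the most triplets" rule are assumptions; `hash` and `index` are not covered.

```python
import re
from pathlib import Path

def stem_key(path):
    return Path(path).stem

def normalized_key(path):
    # Lowercase, drop assumed role tokens, then strip non-alphanumeric characters.
    stem = Path(path).stem.lower()
    stem = re.sub(r"(before|after|mask|gt)", "", stem)
    return re.sub(r"[^a-z0-9]", "", stem)

def numeric_key(path):
    # Use the last run of digits in the stem, ignoring leading zeros.
    digits = re.findall(r"\d+", Path(path).stem)
    return str(int(digits[-1])) if digits else None

def pair_files(before, after, gt, key_fn):
    """Return (before, after, gt) triplets whose keys match in all three groups."""
    maps = [{key_fn(p): p for p in group if key_fn(p) is not None}
            for group in (before, after, gt)]
    shared = sorted(set(maps[0]) & set(maps[1]) & set(maps[2]))
    return [(maps[0][k], maps[1][k], maps[2][k]) for k in shared]

def pair_auto(before, after, gt):
    """Try each strategy in order and keep the first one with the most matches."""
    best = []
    for key_fn in (stem_key, normalized_key, numeric_key):
        triplets = pair_files(before, after, gt, key_fn)
        if len(triplets) > len(best):
            best = triplets
    return best

before = ["b/img_001_before.png", "b/img_002_before.png"]
after = ["a/img_001_after.png", "a/img_002_after.png"]
gt = ["g/img_001.png", "g/img_003.png"]
triplets = pair_auto(before, after, gt)
```

Exact stems match nothing in this example, so the fallback to normalized keys is what recovers the single complete triplet.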
- `project/streamlit_model_tester.py`
  - single-image inference with optional GT scoring
  - batch before/after evaluation in one dashboard
- `project/streamlit_before_after.py`
  - focused before/after QA workflow

Both apps use `inference_utils.py` and `before_after_eval.py`.

Install dependencies:

    pip install -r requirements.txt

For each split, image and mask filenames must match by stem:

    train_images/
      0001.png
      0002.png
    train_masks/
      0001.png
      0002.png

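
A quick way to check this layout before training is to pair files by stem and fail on the first image without a mask. The helper below is a hypothetical sketch; only the wording of the error message is taken from this README, and the exception type is an assumption.

```python
from pathlib import Path

def align_by_stem(image_paths, mask_paths):
    """Pair each image with the mask sharing its filename stem (extension may differ)."""
    masks = {Path(m).stem: m for m in mask_paths}
    pairs = []
    for img in sorted(image_paths):
        stem = Path(img).stem
        if stem not in masks:
            raise FileNotFoundError(f"Mask missing for image stem {stem}")
        pairs.append((img, masks[stem]))
    return pairs

pairs = align_by_stem(
    ["train_images/0002.png", "train_images/0001.png"],
    ["train_masks/0001.png", "train_masks/0002.tif"],
)
```

Matching on the stem rather than the full name is what allows `0002.png` to pair with `0002.tif`.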
If raw mask ids are not already [0..num_classes-1], define data.label_map in YAML.
Edit:
- `project/scratch_hardfix.yaml` for full training
- `project/smoke_hardfix.yaml` for quick sanity checks

Full training run:

    python project/train.py --config project/scratch_hardfix.yaml

Quick smoke run:

    python project/train.py --config project/smoke_hardfix.yaml

Before/after evaluation:

    python project/before_after_eval.py ^
      --config project/scratch_hardfix.yaml ^
      --checkpoint project/outputs/scratch_hardfix/checkpoints/best.pth ^
      --before <path-to-before-images> ^
      --after <path-to-after-images> ^
      --gt <path-to-gt-masks> ^
      --output project/outputs/before_after_eval ^
      --pairing auto

On Linux/macOS, replace `^` with `\`.

Launch a Streamlit app:

    streamlit run project/streamlit_model_tester.py

or

    streamlit run project/streamlit_before_after.py

The config maps these raw mask ids into class indices 0..9:

    [100, 200, 300, 500, 550, 600, 700, 800, 7100, 10000]

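
A minimal sketch of that remapping follows. It assumes the raw ids map to indices 0..9 in the order listed, which this README does not state explicitly, and the `remap_mask` helper with its `strict` flag is hypothetical rather than the code in `project/dataset.py`.

```python
RAW_IDS = [100, 200, 300, 500, 550, 600, 700, 800, 7100, 10000]
# Assumed ordering: the i-th raw id becomes class index i.
LABEL_MAP = {raw: idx for idx, raw in enumerate(RAW_IDS)}

def remap_mask(raw_mask, label_map, ignore_index=255, strict=True):
    """Convert a 2D list of raw mask ids into class indices."""
    out = []
    for row in raw_mask:
        new_row = []
        for value in row:
            if value in label_map:
                new_row.append(label_map[value])
            elif strict:
                raise ValueError(f"Invalid class indices found in mask: {value}")
            else:
                new_row.append(ignore_index)  # treat unknown ids as ignored pixels
        out.append(new_row)
    return out

remapped = remap_mask([[100, 10000], [550, 7100]], LABEL_MAP)
```

Failing loudly on unknown ids in strict mode surfaces mislabelled masks early, while the non-strict path shows one way such pixels could be excluded from the loss instead.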
- `Mask missing for image stem ...`: Image and mask names are not aligned.
- `Invalid class indices found in mask ...`: Update `data.label_map` or clean mask ids.
- `Rare-class coverage failed ...`: Lower `sampler.rare_per_batch`, adjust `rare_quantile`, or disable the assertion for debugging.
- `Pretrained models are disabled ...`: Keep `model.pretrained: false`.

For this branch, use the `project/` scripts documented above rather than the legacy `streamlit_app.py` and `main.py` entry points.

- Branch: `test-results-branch`
- Test images: `Offroad_Segmentation_testImages` (`Color_Images` + `Segmentation` GT masks)
- Checkpoint: `project/outputs/scratch_hardfix/checkpoints/best.pth`

Run test inference:

    python project/test_inference.py ^
      --images "C:/Users/ADMIN/Downloads/Offroad_Segmentation_testImages/Offroad_Segmentation_testImages/Color_Images" ^
      --masks "C:/Users/ADMIN/Downloads/Offroad_Segmentation_testImages/Offroad_Segmentation_testImages/Segmentation" ^
      --config project/scratch_hardfix.yaml ^
      --checkpoint project/outputs/scratch_hardfix/checkpoints/best.pth ^
      --output project/outputs/test_results ^
      --device cpu

On Linux/macOS, replace `^` with `\`.

Outputs written to `project/outputs/test_results/`:
- `test_results.json`: per-image mIoU, mAP50, Dice, Pixel Accuracy
- `overlays/`: predicted segmentation overlaid on each test image

Trained on 1204 training images on an NVIDIA GeForce RTX 3060 (CUDA).

| Metric | Score |
|---|---|
| mIoU | 0.6198 |
| mAP50 | 0.7000 |
| Dice | 0.7501 |
| Pixel Accuracy | 0.8548 |

Each row shows: Input Image → Predicted Mask → Overlay

Add sample overlay images from `project/outputs/test_results/overlays/` here after running inference.

| Class Index | Class Name | Color |
|---|---|---|
| 0 | dirt_road | dark gray |
| 1 | gravel | orange |
| 2 | grass | blue |
| 3 | rock | green |
| 4 | water | yellow |
| 5 | mud | purple |
| 6 | sand | teal |
| 7 | vegetation | light orange |
| 8 | obstacle | indigo |
| 9 | sky | brown |