This project implements Denoising Diffusion Probabilistic Models (DDPM) with fine-grained covariate control for generating realistic landscape images. The model allows precise manipulation of image characteristics like brightness, contrast, texture, and noise level during generation.
- Covariate Control: Control image attributes (brightness, contrast, texture, noise) during generation
- Classifier-Free Guidance: Enables conditional and unconditional generation with smooth interpolation
- Preset Landscape Styles: Pre-defined covariate combinations for different landscape types
- Interactive Generation: Real-time control over image characteristics
- Comprehensive Analysis Tools: Dataset analysis and covariate visualization
- Flexible Training: Support for both extracted and synthetic covariates
- Python 3.8+
- PyTorch 1.9+
- CUDA-capable GPU (recommended)
```shell
pip install -r requirements.txt
```

Train a covariate-controlled diffusion model on your landscape dataset:

```shell
python usage.py --mode train --dataset_path ./landscape_dataset --epochs 200 --batch_size 8
```

Generate landscape samples using a trained model:

```shell
python usage.py --mode generate --model_path models/trained_model.pt --num_samples 16
```

Launch interactive mode for real-time control:

```shell
python usage.py --mode interactive --model_path models/trained_model.pt
```

Analyze covariate distributions in your dataset:

```shell
python usage.py --mode analyze --dataset_path ./landscape_dataset
```

The project extends the standard DDPM UNet architecture with:
- Covariate Embedding: MLP that encodes continuous covariates into the diffusion process
- Classifier-Free Guidance: Supports both conditional and unconditional generation
- Self-Attention Layers: Improved spatial relationships in generated images
- Flexible Conditioning: Compatible with both class labels and continuous covariates
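The classifier-free guidance step can be illustrated with a minimal numeric sketch. This is not the project's actual code (the real implementation lives in `ddpm.py`); the function name `cfg_combine` is hypothetical, but the combination rule shown — interpolating between the unconditional and conditional noise predictions — is the standard classifier-free guidance formulation, and `cfg_scale` corresponds to the `--cfg_scale` flag used later:

```python
def cfg_combine(eps_uncond, eps_cond, cfg_scale):
    """Classifier-free guidance: move the prediction from the
    unconditional estimate toward (and past) the conditional one.
    cfg_scale = 0 -> purely unconditional, 1 -> purely conditional,
    > 1 -> conditioning amplified."""
    return [u + cfg_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]

# With scale 3.0 the conditional signal is amplified threefold
print(cfg_combine([0.0, 0.0], [1.0, -1.0], 3.0))  # [3.0, -3.0]
```

Because one network produces both predictions (conditioning is randomly dropped during training), no separate classifier is needed.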
- UNet_Covariate: Base model with covariate conditioning
- UNet_Conditional_Covariate: Supports both class labels and covariates
- CovariateAwareDiffusion: Enhanced sampling with covariate control
- EnhancedDataset: Automated covariate extraction and augmentation
The model controls four image attributes:
- Brightness (-1.0 to 1.0): Overall image luminance
- Contrast (0.5 to 1.5): Difference between dark and light areas
- Texture (0.0 to 1.0): Surface roughness and detail level
- Noise (0.0 to 0.2): Amount of high-frequency noise
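When supplying covariates by hand, it helps to keep each value inside its documented range. A small sketch of such a guard (the `RANGES` table and `clamp_covariates` helper are hypothetical illustrations, not part of the project's API):

```python
# Documented covariate ranges from the section above
RANGES = {
    "brightness": (-1.0, 1.0),
    "contrast":   (0.5, 1.5),
    "texture":    (0.0, 1.0),
    "noise":      (0.0, 0.2),
}

def clamp_covariates(cov):
    """Clamp each covariate to its valid range before conditioning."""
    return {k: min(max(v, RANGES[k][0]), RANGES[k][1]) for k, v in cov.items()}

print(clamp_covariates({"brightness": 1.7, "contrast": 0.2,
                        "texture": 0.5, "noise": 0.5}))
```

Out-of-range values are silently pulled to the nearest valid bound rather than raising an error.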
The model includes pre-defined covariate combinations for various landscape types:
- Lighting: `golden_hour`, `blue_hour`, `midday_sun`, `overcast`
- Terrain: `smooth_plains`, `rocky_mountains`, `dense_forest`, `desert_dunes`
- Weather: `clear_skies`, `stormy_weather`, `misty_morning`, `foggy_valley`
- Style: `high_contrast`, `soft_dreamy`, `dramatic_dark`, `vibrant_colors`
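A preset is simply a fixed combination of the four covariates. The sketch below shows the shape such entries might take; the numeric values here are illustrative guesses, not the project's actual presets (those are defined in `utils.py`):

```python
# Hypothetical preset values -- the real table lives in utils.py.
presets = {
    "golden_hour": {"brightness": 0.4,  "contrast": 1.1, "texture": 0.5, "noise": 0.02},
    "blue_hour":   {"brightness": -0.5, "contrast": 0.9, "texture": 0.4, "noise": 0.02},
}

# A preset can be used anywhere a covariate dict is expected
print(presets["golden_hour"]["brightness"])
```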
Create smooth transitions between different landscape styles:
```python
from utils import interpolate_covariates

start_cov = presets["golden_hour"]
end_cov = presets["blue_hour"]
interpolated = interpolate_covariates(start_cov, end_cov, steps=10)
```

```python
from modules import CovariateUtils

# Sample random covariates
covariates = CovariateUtils.sample_covariates(batch_size=4, device=device)

# Create specific covariate values
covariates = CovariateUtils.create_specific_covariates(
    batch_size=4,
    brightness=0.5,
    contrast=1.2,
    texture=0.7,
    noise_level=0.05
)
```

- `ddpm.py`: Main implementation of covariate-controlled diffusion with training loops, sampling algorithms, and utility functions.
- `modules.py`: Neural network architectures, including covariate-conditioned UNet variants and attention mechanisms.
- `utils.py`: Dataset handling, covariate analysis, visualization utilities, and preset management.
- `usage.py`: Comprehensive example script demonstrating training, generation, analysis, and interactive modes.
```shell
# Example training configuration
python usage.py --mode train \
    --dataset_path ./datasets/landscapes \
    --epochs 500 \
    --batch_size 8 \
    --image_size 64 \
    --learning_rate 3e-4 \
    --use_real_covariates  # Extract covariates from real images
```

```shell
# Generate with specific styles
python usage.py --mode generate \
    --model_path models/final_model.pt \
    --num_samples 8 \
    --cfg_scale 3.0  # Classifier-free guidance scale
```

```shell
# Create covariate control grids
python ddpm.py --model_path models/final_model.pt --mode grid
```

- Folder structure with landscape images
- Common image formats (JPEG, PNG, etc.)
- Recommended minimum size: 64x64 pixels
The dataset handler automatically extracts covariates from images:

```python
from utils import LandscapeDataset

dataset = LandscapeDataset(
    root_dir="./landscapes",
    extract_covariates=True,   # Auto-extract brightness, contrast, etc.
    augment_covariates=False   # Set True to add noise to covariates for robustness
)
```

- Use smaller image sizes (64x64) for faster experimentation
- Enable mixed precision training for GPU acceleration
- Use larger batch sizes when memory permits
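Mixed precision training can be sketched with PyTorch's standard AMP utilities. This is a generic pattern, not the project's training loop; the tiny `Linear` model and SGD optimizer stand in for the real UNet and its optimizer, and the scaler is a no-op when no GPU is present:

```python
import torch

model = torch.nn.Linear(8, 8)                     # stand-in for the UNet
opt = torch.optim.SGD(model.parameters(), lr=1e-3)
use_cuda = torch.cuda.is_available()
scaler = torch.cuda.amp.GradScaler(enabled=use_cuda)  # no-op on CPU

x = torch.randn(4, 8)
# Forward pass runs in reduced precision inside autocast
with torch.autocast(device_type="cuda" if use_cuda else "cpu",
                    dtype=torch.float16 if use_cuda else torch.bfloat16):
    loss = model(x).pow(2).mean()

# Scale the loss to avoid fp16 gradient underflow, then step
scaler.scale(loss).backward()
scaler.step(opt)
scaler.update()
```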
- Higher CFG scales (3.0-7.0) for more conditional control
- More diffusion steps (1000) for better quality
- Experiment with different covariate combinations
- CUDA Out of Memory: Reduce batch size or image dimensions
- Slow Training: Enable mixed precision or reduce model complexity
- Poor Generation Quality: Increase training epochs or adjust learning rate
- Covariate Overfitting: Increase CFG drop probability during training
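The CFG drop probability mentioned above refers to randomly hiding the covariates during training so the model also learns an unconditional distribution. A minimal sketch of this conditioning dropout (the function name `maybe_drop_conditioning` is hypothetical; how the project actually encodes the null condition is internal to `ddpm.py`):

```python
import random

def maybe_drop_conditioning(covariates, drop_prob=0.1):
    """With probability drop_prob, return None so the model is trained
    on an unconditional example (the null-conditioning token)."""
    if random.random() < drop_prob:
        return None
    return covariates

# Over many batches, roughly drop_prob of examples are unconditional
random.seed(0)
trials = 10000
dropped = sum(maybe_drop_conditioning({"brightness": 0.5}, drop_prob=0.3) is None
              for _ in range(trials))
print(dropped / trials)  # close to drop_prob
```

Raising `drop_prob` forces the model to rely less on the covariates, which counteracts covariate overfitting at the cost of weaker conditioning.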
Use TensorBoard to monitor training progress:
```shell
tensorboard --logdir runs/
```