# Emoji Stable Diffusion

Emoji Stable Diffusion is a research project dedicated to building an Emoji dataset and a Stable Diffusion training pipeline enhanced with a Curriculum Learning strategy. The primary objective is to improve the quality of generated images despite limited data availability.
## Table of Contents

- Introduction
- System Requirements
- Installation
- Project Structure
- Configuration and Training
- Execution of Training and Evaluation Scripts
- Testing
## Introduction

The Emoji Stable Diffusion project focuses on researching and developing a model that generates emoji-style images with Stable Diffusion. It integrates a Curriculum Learning approach to enhance the training process and thereby improve the quality of generated images, even when constrained by a limited training dataset.
## System Requirements

- Programming Language: Python 3.11.9
- Environment Management: Anaconda (or equivalent tools)
- Hardware: GPU recommended for training
- Dependencies: all required packages are listed in `requirements.txt`
## Installation

To set up a new environment and install the necessary dependencies, execute the following commands in your terminal:

```bash
conda env create -f environment.yml
conda activate esd_env
```

## Project Structure

The repository is organized as follows:
```text
Emoji_SD/
├── config/
│   ├── eval.yaml
│   ├── train.yaml
│   └── visualize.yaml
├── data/
│   ├── processed/
│   │   ├── resized_val_images/
│   │   ├── train_images/
│   │   ├── val_images/
│   │   ├── train.csv
│   │   └── val.csv
│   └── raw/
├── experiments/
│   └── 10042025_2024/
│       ├── gen_images/
│       ├── best_model.pth
│       ├── config.yaml
│       ├── losses.png
│       └── losses.txt
├── notebooks/
├── scripts/
│   ├── eval.sh
│   └── train.sh
├── src/
│   ├── data/
│   ├── models/
│   ├── training/
│   ├── utils/
│   └── __init__.py
├── weights/
├── .env
├── .gitignore
├── eval.py
├── README.md
├── requirements.txt
├── train.py
├── visualize.ipynb
└── visualize.py
```
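Each training run appears to be stored under `experiments/` in a folder named after a run timestamp (the example folder `10042025_2024` suggests a `DDMMYYYY_HHMM` convention). A minimal sketch of producing such a name, assuming that convention (the helper name is hypothetical):

```python
from datetime import datetime

def experiment_folder_name(now: datetime) -> str:
    # Assumed convention inferred from the example folder "10042025_2024":
    # day, month, year, underscore, hour, minute.
    return now.strftime("%d%m%Y_%H%M")

# experiment_folder_name(datetime(2025, 4, 10, 20, 24)) -> "10042025_2024"
```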
## Configuration and Training

The training and evaluation parameters are specified in YAML configuration files located in the config/ directory.

An example configuration in train.yaml is provided below:

```yaml
os:
  seed: 42

model:
  vae_id: "stabilityai/sd-vae-ft-mse"              # VAE model identifier on Hugging Face
  text_encoder_id: "openai/clip-vit-base-patch32"  # Text encoder model identifier on Hugging Face
  unet_dim: 256      # Dimensionality of the UNet blocks
  unet_heads: 8      # Number of heads in the multi-head attention mechanism
  step_dim: 128      # Dimensionality of the step representation
  context_dim: 512   # Dimensionality of the text embedding from the text encoder

dataset:
  train_csv_file: "data/processed/train.csv"         # Path to the training CSV file
  train_image_folder: "data/processed/train_images"  # Directory containing training images
  val_csv_file: "data/processed/val.csv"             # Path to the validation CSV file
  val_image_folder: "data/processed/val_images"      # Directory containing validation images
  batch_size: 512
  resolution: 64     # Image resolution after resizing

training:
  learning_rate: 1e-4
  eta_min: 1e-6      # Minimum learning rate for the cosine annealing scheduler
  num_epochs: 500    # Total number of training epochs
  save_best: True    # Save weights of the best-performing model
  save_after: 0.75   # Save model weights after reaching 75% of the total epochs
  weights_folder: "experiments/"
  weight_name: "best_model.pth"
  experiments_folder: "experiments/"
  gen_images_folder_name: "gen_images"  # Directory for generated images used to compute the FID score
```

An example configuration in eval.yaml is outlined below:
```yaml
os:
  seed: 42

data:
  path_real_images: "data/processed/resized_val_images"          # Directory containing real validation images
  path_generated_images: "experiments/10042025_2024/gen_images"  # Directory containing images generated post-training

inference:
  dims: 2048
  batch_size: 128
```

## Execution of Training and Evaluation Scripts

The scripts in the scripts/ directory facilitate the training and evaluation processes. Execute the following commands in your terminal:
```bash
bash scripts/train.sh  # Initiates model training
bash scripts/eval.sh   # Performs model evaluation
```

- Training: Upon completion, the training process generates:
  - the best model weights file (`.pth`);
  - a set of generated images for FID score assessment;
  - the training configuration file, a loss-curve image, and a corresponding log file.
- Evaluation: Ensure the evaluation configuration points to the correct generated-images directory; the FID score is then displayed in the terminal output.
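The FID score compares Gaussian statistics (mean and covariance) of real and generated image features; the `dims: 2048` setting in eval.yaml matches the usual Inception-v3 feature dimensionality. Purely as an illustration of the underlying formula, here is the Fréchet distance specialized to diagonal covariances; the actual evaluation presumably uses a full-covariance implementation over Inception features, and the function name here is hypothetical:

```python
import math

def fid_diagonal(mu1, var1, mu2, var2):
    # Fréchet distance between two Gaussians, simplified to diagonal
    # covariances: ||mu1 - mu2||^2 + Tr(S1 + S2 - 2*(S1*S2)^(1/2)).
    # With diagonal S1, S2, the matrix square root reduces to an
    # elementwise sqrt of the products of the variances.
    mean_term = sum((a - b) ** 2 for a, b in zip(mu1, mu2))
    cov_term = sum(v1 + v2 - 2.0 * math.sqrt(v1 * v2)
                   for v1, v2 in zip(var1, var2))
    return mean_term + cov_term

# Identical distributions give a score of 0; lower is better.
```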
## Testing

To verify that the configurations are properly set, modify the respective YAML configuration files under the config/ directory as needed and follow the steps in the Execution of Training and Evaluation Scripts section.
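As a sketch of how such a configuration might be read before a run (assuming PyYAML is available; the loading code below is illustrative and not part of this repository), note that PyYAML parses scientific-notation values like `1e-4` as strings, so numeric casts are advisable:

```python
import yaml  # PyYAML, assumed to be installed via the project's dependencies

# A fragment mirroring the train.yaml excerpt above.
config_text = """
training:
  learning_rate: 1e-4
  num_epochs: 500
  save_best: True
"""

cfg = yaml.safe_load(config_text)
num_epochs = cfg["training"]["num_epochs"]
# PyYAML's YAML 1.1 resolver reads "1e-4" as a string (no decimal point),
# so cast explicitly before handing it to an optimizer.
learning_rate = float(cfg["training"]["learning_rate"])
```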