[ECCV 2024 Oral] ConceptExpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction

haoosz/ConceptExpress

ConceptExpress

This is the official PyTorch code for the paper:

ConceptExpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction
Shaozhe Hao, Kai Han, Zhengyao Lv, Shihao Zhao, Kwan-Yee K. Wong
The University of Hong Kong
ECCV 2024 (Oral)

We present Unsupervised Concept Extraction (UCE), a task that focuses on extracting multiple concepts from a single image without supervision.

Project Page

The dataset of input images used in our paper is now available at this link. All images in this dataset are sourced from Unsplash under a license that allows free download and use!

Set-up

Create a conda environment named uce using:

conda env create -f environment.yml
conda activate uce
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia
pip install -r requirements.txt

Training

Create a new folder that contains an image named img.jpg. For example, download our dataset and put it under the root path. Then set --instance_data_dir in the bash script scripts/train.sh to uce_images/XX or to any other folder that contains an img.jpg. You can also set --output_dir to choose where the checkpoints are saved.
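If your image is stored under a different name, the folder layout described above can be prepared with a few lines of Python. This is only a minimal sketch and not part of the repository: the helper name prepare_instance_dir and the example paths are hypothetical, and it assumes the source file is already a JPEG, since it copies bytes without converting formats.

```python
import shutil
from pathlib import Path


def prepare_instance_dir(image_path, instance_dir):
    """Copy one image into instance_dir as img.jpg and return the new path.

    Hypothetical helper: it only reproduces the layout the README asks for
    (a folder containing img.jpg) and does not convert image formats.
    """
    src = Path(image_path)
    if not src.is_file():
        raise FileNotFoundError(f"input image not found: {src}")
    dst_dir = Path(instance_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    dst = dst_dir / "img.jpg"
    # Plain byte copy: the source is assumed to already be a JPEG file.
    shutil.copyfile(src, dst)
    return dst
```

For example, prepare_instance_dir("my_photo.jpg", "my_images/example") creates my_images/example/img.jpg, and that folder can then be passed as --instance_data_dir.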

When the above is ready, run the following to start training:

bash scripts/train.sh

The learned token embeddings of all concepts are saved to .bin files under your --output_dir.
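To locate the saved embeddings after training, something like the following sketch may help. It is not part of the repository: the function name find_embedding_files is hypothetical, and the recursive search is an assumption, because this README does not say how the .bin files are named or whether they sit in subfolders of --output_dir.

```python
from pathlib import Path


def find_embedding_files(output_dir):
    """Return all .bin files under output_dir, sorted by path.

    Assumption: the search is recursive because the README does not state
    whether the files are saved at the top level or in subfolders.
    """
    root = Path(output_dir)
    if not root.is_dir():
        raise NotADirectoryError(f"not a directory: {root}")
    return sorted(p for p in root.rglob("*.bin") if p.is_file())
```

Any path returned here is a candidate value for $CKPT_BIN_FILE at inference time.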

Inference

Once trained, the i-th concept is represented as <asset$i> in the tokenizer. We can then freely generate images using any concept token <asset$i> (replace $i with a valid concept index):

python infer.py \
  --embed_path $CKPT_BIN_FILE \
  --prompt "a photo of <asset$i> in the snow" \
  --save_path $SAVE_FOLDER \
  --seed 0

Set $CKPT_BIN_FILE to the path of the .bin file that holds your learned token embeddings, and $SAVE_FOLDER to the folder where the generated images should be saved. You can also find inference examples in scripts/infer.sh.
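To generate images for several concepts in one go, the command above can be assembled programmatically. The sketch below is not part of the repository and only builds the argument list; it uses the four flags shown above and nothing else. The helper names, the {concept} placeholder in the prompt template, the example paths, and the indices 0 and 1 are all assumptions for illustration, since the valid indices depend on how many concepts were extracted from your image.

```python
import shlex


def concept_token(index):
    """Return the tokenizer token for the concept with the given index."""
    return f"<asset{index}>"


def build_infer_command(embed_path, index, template, save_path, seed=0):
    """Build the infer.py argument list for one concept.

    template is a prompt containing the hypothetical placeholder {concept},
    which is replaced by the concept token.
    """
    prompt = template.format(concept=concept_token(index))
    return [
        "python", "infer.py",
        "--embed_path", str(embed_path),
        "--prompt", prompt,
        "--save_path", str(save_path),
        "--seed", str(seed),
    ]


if __name__ == "__main__":
    # Indices 0 and 1 and both paths are placeholders; substitute your own.
    for i in (0, 1):
        cmd = build_infer_command(
            "path/to/embeds.bin", i,
            "a photo of {concept} in the snow", "outputs", seed=0,
        )
        # shlex.join quotes the prompt so the line can be pasted into a shell.
        print(shlex.join(cmd))
```

Each returned list can be printed as shown or passed directly to subprocess.run from the repository root.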

Citation

If you use this code in your research, please consider citing our paper:

@InProceedings{hao2024conceptexpress,
    title={Concept{E}xpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction}, 
    author={Shaozhe Hao and Kai Han and Zhengyao Lv and Shihao Zhao and Kwan-Yee~K. Wong},
    booktitle={ECCV},
    year={2024},
}

Acknowledgements

This code repository is based on the great work of Break-A-Scene. Thanks!
