dxlabskku/CLIP-SegFusion


CLIP-SegFusion: An Attention-Guided Feature Fusion Framework for Multi-Level Detection on AI-Generated Artworks

This repository provides the PyTorch implementation of the paper:

CLIP-SegFusion: An Attention-Guided Feature Fusion Framework for Multi-Level Detection on AI-Generated Artworks
(Implementation of a multi-level detection framework for distinguishing AI-generated artworks)


Architecture


Dependencies

Ensure that you have the following environment:

| Package | Version |
| --- | --- |
| CUDA | 11.8 |
| Python | 3.8.20 |
| PyTorch | 2.4.1 |
| Torchvision | 0.19.1 |
| NumPy | 1.24.4 |
| pandas | 2.0.3 |
| tqdm | 4.67.1 |
| Pillow | 10.4.0 |
| scikit-learn | 1.3.2 |
| opencv-python | 4.11.0.86 |
| albumentations | 1.4.18 |
| imgaug | 0.4.0 |
| transformers | 4.46.3 |
| open-clip-torch | 2.30.0 |
| ftfy | 6.2.3 |
| regex | 2024.11.6 |
| packaging | 24.2 |
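
To confirm that an existing environment matches the versions listed above, a small check along these lines may help. This is a minimal sketch, not part of the repository: the helper names `installed_version` and `find_mismatches` are hypothetical, the keys are the usual PyPI distribution names (for example `torch` for PyTorch), and only a subset of the packages is shown.

```python
from importlib import metadata

# Subset of the pinned versions, keyed by PyPI distribution name.
# Extend this dict with the remaining packages as needed.
REQUIRED = {
    "torch": "2.4.1",
    "torchvision": "0.19.1",
    "numpy": "1.24.4",
    "pandas": "2.0.3",
    "scikit-learn": "1.3.2",
    "transformers": "4.46.3",
    "open-clip-torch": "2.30.0",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(required, lookup=installed_version):
    """Map each package that deviates from its pin to (wanted, found)."""
    mismatches = {}
    for name, wanted in required.items():
        found = lookup(name)
        # Local build tags such as "+cu118" are ignored when comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(REQUIRED).items():
        print(f"{name}: expected {wanted}, found {found}")
```

Running the script prints one line per package that is missing or installed at a different version, and prints nothing when everything matches.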

Usage

Training

Run the following command to train the model:

python train.py \
  --dataset_path /path/to/dataset \
  --csv_path /path/to/metadata.csv \
  --output_path /path/to/save_dir
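
Before launching a long training run, it can be worth checking that every image referenced in the metadata CSV actually exists under the dataset root. The sketch below assumes the CSV stores paths relative to `--dataset_path` in a column named `image_path`; both the column name and the helper `missing_images` are hypothetical, so adjust them to the real header of your file.

```python
import csv
from pathlib import Path


def missing_images(dataset_path, csv_path, path_column="image_path"):
    """List CSV entries whose image file is not found under dataset_path.

    `path_column` is a hypothetical column name, not taken from the repository.
    """
    root = Path(dataset_path)
    missing = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        if path_column not in (reader.fieldnames or []):
            raise KeyError(f"column {path_column!r} not found in {csv_path}")
        for row in reader:
            # Paths are resolved relative to the dataset root.
            if not (root / row[path_column]).is_file():
                missing.append(row[path_column])
    return missing
```

An empty return value means every listed image was found; otherwise the list contains the relative paths that could not be resolved.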

Evaluation

Run the following command to evaluate a trained model:

python evaluate.py \
  --dataset_path /path/to/dataset \
  --csv_path /path/to/metadata.csv \
  --model_weight_path /path/to/model_weights.pth \
  --results_file /path/to/save_results.csv

Arguments

| Argument | Description |
| --- | --- |
| --dataset_path | Root directory of dataset images. |
| --csv_path | Path to the metadata CSV file containing image paths and labels. |
| --output_path | Directory to save checkpoints. |
| --model_weight_path | Path to a trained .pth model file (required only for evaluation). |
| --results_file | Path to save the evaluation metrics as a CSV file (only for evaluation). |
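
For orientation, the arguments above could be declared with `argparse` roughly as follows. This is an illustrative sketch only, not the parser used by `train.py` or `evaluate.py`: the function `build_parser` is hypothetical, and marking every flag as required is an assumption.

```python
import argparse


def build_parser(mode):
    """Build a parser for 'train' or 'evaluate' exposing the documented flags."""
    parser = argparse.ArgumentParser(
        description=f"CLIP-SegFusion {mode} (illustrative sketch)"
    )
    # Flags shared by both scripts.
    parser.add_argument("--dataset_path", required=True,
                        help="Root directory of dataset images.")
    parser.add_argument("--csv_path", required=True,
                        help="Metadata CSV with image paths and labels.")
    if mode == "train":
        parser.add_argument("--output_path", required=True,
                            help="Directory to save checkpoints.")
    elif mode == "evaluate":
        parser.add_argument("--model_weight_path", required=True,
                            help="Path to a trained .pth model file.")
        parser.add_argument("--results_file", required=True,
                            help="CSV file for the evaluation metrics.")
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return parser


if __name__ == "__main__":
    args = build_parser("evaluate").parse_args([
        "--dataset_path", "/path/to/dataset",
        "--csv_path", "/path/to/metadata.csv",
        "--model_weight_path", "/path/to/model_weights.pth",
        "--results_file", "/path/to/save_results.csv",
    ])
    print(args.results_file)
```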

Evaluation Results (%) on the In-Distribution (ID) Dataset (Stable Diffusion)

LA_Inpainting

| Model | Accuracy (%) | Precision (%) | Recall (%) | F1 (%) | AUC (%) |
| --- | --- | --- | --- | --- | --- |
| AutoGAN | 83.15 | 83.15 | 83.15 | 83.15 | 90.63 |
| DIRE | 91.35 | 91.35 | 91.35 | 91.35 | 97.32 |
| De-Fake | 89.90 | 89.92 | 89.90 | 89.90 | 96.18 |
| ZeroFake | 50.30 | 50.43 | 50.30 | 46.17 | 51.49 |
| LaRE | 95.50 | 95.50 | 95.50 | 95.50 | 99.08 |
| CLIPping (PT) | 94.05 | 94.43 | 94.05 | 94.04 | 98.84 |
| CLIPping (LP) | 92.50 | 92.98 | 92.50 | 92.48 | 98.07 |
| LOTA | 67.35 | 67.90 | 67.35 | 67.10 | 73.05 |
| SAFE | 77.40 | 77.56 | 77.40 | 77.37 | 85.83 |
| CLIP-SegFusion (Ours) | 94.80 | 94.80 | 94.80 | 94.80 | 98.77 |

Inpainting

| Model | Accuracy (%) | Precision (%) | Recall (%) | F1 (%) | AUC (%) |
| --- | --- | --- | --- | --- | --- |
| AutoGAN | 73.98 | 74.58 | 73.98 | 73.82 | 82.78 |
| DIRE | 85.09 | 85.09 | 85.09 | 85.09 | 93.03 |
| De-Fake | 85.53 | 85.56 | 85.53 | 85.53 | 92.58 |
| ZeroFake | 65.83 | 67.33 | 65.83 | 65.08 | 74.70 |
| LaRE | 90.42 | 90.47 | 90.42 | 90.42 | 97.00 |
| CLIPping (PT) | 87.01 | 87.46 | 87.02 | 86.97 | 95.11 |
| CLIPping (LP) | 86.42 | 87.94 | 86.42 | 86.28 | 95.78 |
| LOTA | 52.89 | 53.93 | 52.88 | 49.49 | 55.88 |
| SAFE | 65.33 | 65.59 | 65.34 | 65.19 | 72.07 |
| CLIP-SegFusion (Ours) | 90.47 | 90.55 | 90.47 | 90.46 | 96.88 |
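
In the tables above, Recall matches Accuracy in nearly every row. That pattern is consistent with support-weighted averaging over classes, where weighted recall always equals accuracy, although class-balanced macro averaging would look the same. How `evaluate.py` averages its metrics is not stated here, so the following is only a sketch of the weighted variant; the function name `weighted_metrics` is hypothetical.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall and F1, in percent."""
    n = len(y_true)
    support = Counter(y_true)  # number of true samples per class
    precision = recall = f1 = 0.0
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        predicted_c = sum(1 for p in y_pred if p == c)
        p_c = tp / predicted_c if predicted_c else 0.0
        r_c = tp / support[c] if support[c] else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = support[c] / n  # each class counts by its share of samples
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    scores = {"accuracy": accuracy, "precision": precision,
              "recall": recall, "f1": f1}
    return {name: round(100 * value, 2) for name, value in scores.items()}


if __name__ == "__main__":
    # Toy labels: 0 = real, 1 = AI-generated (label encoding is an assumption).
    print(weighted_metrics([0, 0, 1, 1], [0, 1, 1, 1]))
```

On the toy labels the weighted recall and the accuracy coincide, while precision and F1 differ from them.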

Evaluation Results on Out-of-Distribution (OOD) Datasets: F1 (%) / AUC (%)

| Model | Latent Diffusion | Midjourney | DALLE | Imagen | Janus | StyleGAN2 |
| --- | --- | --- | --- | --- | --- | --- |
| AutoGAN | 60.69 / 70.38 | 44.94 / 52.52 | 41.92 / 49.54 | 43.65 / 52.91 | 72.18 / 80.92 | 44.86 / 56.69 |
| DIRE | 51.03 / 57.51 | 66.02 / 84.53 | 61.08 / 81.12 | 37.84 / 60.11 | 68.08 / 85.65 | 33.78 / 44.64 |
| De-Fake | 53.74 / 54.69 | 61.95 / 80.40 | 53.52 / 70.39 | 39.81 / 56.50 | 63.81 / 82.10 | 72.10 / 84.78 |
| ZeroFake | 45.46 / 46.62 | 37.05 / 34.89 | 35.66 / 30.86 | 44.31 / 34.29 | 68.77 / 82.30 | 31.24 / 20.39 |
| LaRE | 68.43 / 78.37 | 82.02 / 90.17 | 71.29 / 79.68 | 92.40 / 97.76 | 77.20 / 84.98 | 55.00 / 70.45 |
| CLIPping (PT) | 69.12 / 77.62 | 52.53 / 79.01 | 43.45 / 71.85 | 33.33 / 39.18 | 74.95 / 94.67 | 64.01 / 91.57 |
| CLIPping (LP) | 76.42 / 84.37 | 82.45 / 94.38 | 71.21 / 89.53 | 61.29 / 87.45 | 84.02 / 96.15 | 91.74 / 98.11 |
| LOTA | 47.59 / 53.43 | 62.30 / 67.57 | 59.57 / 64.56 | 58.61 / 64.58 | 68.70 / 73.92 | 72.26 / 76.25 |
| SAFE | 63.66 / 68.66 | 68.17 / 78.20 | 64.20 / 73.50 | 83.44 / 90.57 | 81.20 / 88.56 | 58.02 / 67.67 |
| CLIP-SegFusion (Ours) | 77.22 / 89.80 | 87.77 / 95.03 | 82.96 / 92.48 | 68.61 / 85.99 | 91.50 / 97.27 | 93.93 / 98.51 |
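
Several rows above pair a modest F1 with a much higher AUC. The two metrics can diverge because AUC depends only on how scores rank fake above real samples, whereas F1 depends on where the decision threshold falls. The sketch below illustrates this with a pairwise (Mann-Whitney) AUC in plain Python; it is not the repository's evaluation code, and the function names, the 0.5 threshold, and the toy scores are assumptions made for illustration.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in the number of samples,
    which is fine for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_at(y_true, scores, threshold=0.5):
    """Accuracy after turning scores into labels with a fixed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(1 for t, p in zip(y_true, preds) if t == p) / len(y_true)


if __name__ == "__main__":
    y_true = [0, 0, 1, 1]
    # Scores are perfectly ordered but all sit above the 0.5 threshold,
    # as might happen when a detector is miscalibrated on unseen generators.
    shifted = [0.6, 0.7, 0.8, 0.9]
    print(roc_auc(y_true, shifted))      # ranking quality
    print(accuracy_at(y_true, shifted))  # thresholded quality
```

In this toy case the ranking is perfect while the thresholded predictions are no better than chance, which is one way a high AUC can coexist with a low F1.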
