HSIC Bottleneck for Cross-Generator and Domain-Incremental Synthetic Image Detection
Chin-Chia Yang, Yung-Yu Chuang, Hwann-Tzong Chen, Tyng-Luh Liu
ICLR 2026
[Paper]
- Clone this repository

  ```bash
  git clone https://github.com/jamesyoung0623/HSIC.git
  cd HSIC
  ```

- Create the conda environment from the exported file

  ```bash
  conda env create -f environment.yml
  conda activate HSIC
  ```

This repository trains on pre-extracted CLIP features rather than raw images. The expected workflow is:
- Register the image roots in `dataset_paths_train.py` and `dataset_paths_test.py`.
- Run `extract_features_ddp.py` to extract CLIP features for each dataset.
- Run `pack_features.py` to concatenate the extracted per-layer features into a single `packed_features.npy` file for each split.
The 3DGS benchmark data is hosted on Hugging Face:
https://huggingface.co/datasets/jamesyoung0623/3DGS_Synthetic_Image_Benchmark
One way to download it is with the Hugging Face CLI:

```bash
pip install -U "huggingface_hub[cli]"
hf download jamesyoung0623/3DGS_Synthetic_Image_Benchmark \
    --repo-type dataset \
    --local-dir ./datasets
```

If the dataset requires authentication, log in first:

```bash
hf auth login
```

After download, organize the extracted folders under `datasets/train/...` and
`datasets/test/...` so they match the layout below and the paths you register in
`dataset_paths_train.py` and `dataset_paths_test.py`.
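After organizing the data, a small helper (a hypothetical sketch, not part of this repository) can confirm that each dataset folder contains the expected `0_real`/`1_fake` subfolders before you run extraction:

```python
from pathlib import Path


def has_wang2020_layout(dataset_dir):
    """Return True if the directory contains both 0_real/ and 1_fake/ subfolders."""
    root = Path(dataset_dir)
    return (root / "0_real").is_dir() and (root / "1_fake").is_dir()


def check_splits(datasets_root):
    """List every dataset directory under train/ and test/ missing the layout."""
    bad = []
    for split in ("train", "test"):
        split_dir = Path(datasets_root) / split
        if not split_dir.is_dir():
            continue
        for dataset in sorted(p for p in split_dir.iterdir() if p.is_dir()):
            if not has_wang2020_layout(dataset):
                bad.append(str(dataset))
    return bad  # an empty list means the layout is correct
```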
For GAN datasets, please refer to CNNDetection.
For diffusion datasets, please refer to GenImage.
The raw image directories referenced in `dataset_paths_train.py` and
`dataset_paths_test.py` should follow the wang2020 layout used by the
extractor:

```
datasets/
  train/
    <dataset>/
      0_real/
      1_fake/
  test/
    <dataset>/
      0_real/
      1_fake/
```
Add each dataset to `DATASET_PATHS` as a dictionary with `real_path`,
`fake_path`, `data_mode`, and `key`. For datasets already organized with
`0_real` and `1_fake` folders, use `data_mode='wang2020'`.
Example entry:

```python
dict(
    real_path='/path/to/datasets/train/sdv4',
    fake_path='/path/to/datasets/train/sdv4',
    data_mode='wang2020',
    key='sdv4'
)
```

To extract CLIP ViT-L/14 features with distributed inference:

```bash
PYTHONNOUSERSITE=1 python -m torch.distributed.run --standalone --nproc_per_node=2 \
    extract_features_ddp.py \
    --arch "CLIP:ViT-L/14" \
    --batch_size 128 \
    --flush_every 4096
```

This writes one file per layer into each dataset directory, for example:
```
datasets/train/sdv4/0_real/feature_layer0.npy
datasets/train/sdv4/0_real/feature_layer1.npy
...
datasets/train/sdv4/0_real/feature_before_projection.npy
```
`pack_features.py` then concatenates the layer files into the final packed
feature matrix:

```bash
python pack_features.py ./datasets/train/sdv4/0_real
python pack_features.py ./datasets/train/sdv4/1_fake
python pack_features.py ./datasets/test/<dataset>/0_real
python pack_features.py ./datasets/test/<dataset>/1_fake
```

`pack_features.py` concatenates the default layer set (`layer0` to `layer23`
plus `before_projection`) and writes:

```
datasets/<split>/<dataset>/0_real/packed_features.npy
datasets/<split>/<dataset>/1_fake/packed_features.npy
```
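Conceptually, packing concatenates the per-layer matrices along the feature dimension. The sketch below illustrates that idea only; the exact axis, ordering, and file handling in the repo's `pack_features.py` may differ:

```python
import numpy as np

# Default layer set described above: layer0..layer23 plus before_projection.
LAYER_FILES = [f"feature_layer{i}.npy" for i in range(24)] + [
    "feature_before_projection.npy"
]


def pack_features(feature_dir, layer_files=LAYER_FILES):
    """Concatenate per-layer feature matrices along the feature axis.

    Assumes each file holds a (num_images, dim) array with the same number
    of rows; writes packed_features.npy next to the inputs.
    """
    layers = [np.load(f"{feature_dir}/{name}") for name in layer_files]
    packed = np.concatenate(layers, axis=1)
    np.save(f"{feature_dir}/packed_features.npy", packed)
    return packed
```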
These `packed_features.npy` files are the inputs consumed by `train_HSIC.py`
and `train_HGR.py`.
`train_HSIC.py` trains the HSIC detector from pre-extracted CLIP features stored
as `packed_features.npy`. Before launching training, make sure the training
split contains:

```
datasets/
  train/
    <dataset>/
      0_real/packed_features.npy
      1_fake/packed_features.npy
```
The script loads `datasets/train/<dataset>/0_real/packed_features.npy` and
`datasets/train/<dataset>/1_fake/packed_features.npy`, trains a single model
with the built-in dataset-specific HSIC weights, saves checkpoints after each
epoch, and then evaluates all saved checkpoints on every available packed test
dataset under `datasets/test/*/{0_real,1_fake}/packed_features.npy`.
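A minimal sketch of how the real/fake features and binary labels might be assembled from these files (assuming one feature row per image; the actual loading code in `train_HSIC.py` may differ):

```python
import numpy as np


def load_split(dataset_dir):
    """Load packed real/fake features and build labels (0 = real, 1 = fake)."""
    real = np.load(f"{dataset_dir}/0_real/packed_features.npy")
    fake = np.load(f"{dataset_dir}/1_fake/packed_features.npy")
    X = np.concatenate([real, fake], axis=0)
    y = np.concatenate([np.zeros(len(real)), np.ones(len(fake))])
    return X, y
```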
Example:

```bash
CUDA_VISIBLE_DEVICES=0 python train_HSIC.py \
    --name sdv4 \
    --dataset sdv4
```

Outputs are written to `checkpoints/<name>/` and include:

- `opt.txt`: parsed command-line options.
- `hparams.txt`: dataset name and the selected `lambda_x` / `lambda_y`.
- `model_epoch_*.pth`: one checkpoint per epoch.
- `train/`: TensorBoard logs.
- `results_<checkpoint>.txt` and `summary.csv`: post-training evaluation across all available test datasets.

`--dataset` currently accepts `progan` and `sdv4`.
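As background on the objective these `lambda_x`/`lambda_y` weights scale (this is the textbook estimator, not the repo's training code), the standard biased HSIC estimator with Gaussian kernels can be sketched as:

```python
import numpy as np


def gaussian_kernel(X, sigma=1.0):
    """Pairwise Gaussian (RBF) kernel matrix for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-d2 / (2.0 * sigma ** 2))


def hsic(X, Y, sigma=1.0):
    """Biased HSIC estimator: tr(K H L H) / (n - 1)^2."""
    n = X.shape[0]
    K = gaussian_kernel(X, sigma)
    L = gaussian_kernel(Y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2
```

HSIC is zero (in expectation) for independent variables and grows with statistical dependence, which is what lets the bottleneck trade off dependence on the input against dependence on the label.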
`train_HGR.py` runs the domain-incremental HGR training pipeline with replay
selection based on k-center coverage and HSIC centrality. It expects packed
features for train, val, and test splits:

```
datasets/
  train/<dataset>/{0_real,1_fake}/packed_features.npy
  val/<dataset>/{0_real,1_fake}/packed_features.npy
  test/<dataset>/{0_real,1_fake}/packed_features.npy
```
The script preloads all required datasets, initializes from the base
checkpoint, and then evaluates all permutations of the tail-domain order using
the hyperparameters defined in `train_HGR.py`.

The current setup uses base domains `sdv4` and `progan`, tail domains
`nersemble`, `NHA`, and `GAGAvatar`, and base checkpoints at:

```
./checkpoints/sdv4/model_best.pth
./checkpoints/progan/model_best.pth
```
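With three tail domains, "all permutations of the tail-domain order" amounts to six incremental training sequences:

```python
from itertools import permutations

TAIL_DOMAINS = ["nersemble", "NHA", "GAGAvatar"]

# Every order in which the tail domains can be visited incrementally.
orders = list(permutations(TAIL_DOMAINS))
print(len(orders))  # 3! = 6 sequences
```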
Example:

```bash
CUDA_VISIBLE_DEVICES=0 python train_HGR.py \
    --name HGR \
    --base-select progan \
    --keep_frac 0.01
```

Outputs are written under `checkpoints/<name>__bo_<base>__<sequence>__.../`
and include model checkpoints, TensorBoard logs, and `log.txt` with evaluation
results. Per-sequence results are written under `seq_logs/`.
If you find our work helpful in your research, please cite it using the following:
```bibtex
@inproceedings{
yang2026hsic,
title={{HSIC} Bottleneck for Cross-Generator and Domain-Incremental Synthetic Image Detection},
author={Chin-Chia Yang and Yung-Yu Chuang and Hwann-Tzong Chen and Tyng-Luh Liu},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=msLnKDvhBx}
}
```