WM_encoder_decoder with Revisited HiDDeN

Colab demo (for using the pre-trained networks for traditional image watermarking).

This repository is heavily based on the paper HiDDeN: Hiding Data With Deep Networks, with some differences from the original implementation.

Another implementation is available at ando-khachatryan/HiDDeN.

Setup

Requirements

This codebase was developed with Python 3.8, PyTorch 1.12.0, and CUDA 11.3. PyTorch can be installed with:

conda install -c pytorch torchvision pytorch==1.12.0 cudatoolkit=11.3

To install the remaining dependencies with pip, run:

pip install -r requirements.txt

Data

The paper uses the COCO dataset.

Usage

Training

The main script is in main.py. It can be used to train the encoder and decoder networks.

To run it on one GPU, use the following command:

torchrun --nproc_per_node=1 main.py --dist False

To run it on multiple GPUs, use the following command:

torchrun --nproc_per_node=$GPUS main.py --local_rank 0

Options

Experiment Parameters
  • --train_dir: Path to the directory containing the training data. Default: "path/to/train"
  • --val_dir: Path to the directory containing the validation data. Default: "path/to/val"
  • --output_dir: Output directory for logs and images. Default: "output/"
  • --verbose: Verbosity level for output during training. Default: 1
  • --seed: Random seed. Default: 0
Marking Parameters
  • --num_bits: Number of bits in the watermark. Default: 32
  • --redundancy: Redundancy of the watermark in the decoder (each output bit is the sum of redundancy decoded bits; see the sketch after this list). Default: 1
  • --img_size: Image size during training. Having a fixed image size during training improves efficiency thanks to batching. The network can generalize (to a certain extent) to arbitrary resolution at test time. Default: 128
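For illustration, here is a minimal sketch of how a redundant decoder output could be collapsed into num_bits decisions. The shape convention (one group of redundancy soft bits per message bit) is an assumption for the example, not the repository's actual code.

import torch

def collapse_redundancy(logits: torch.Tensor, num_bits: int, redundancy: int) -> torch.Tensor:
    # Collapse (batch, num_bits * redundancy) soft bits into (batch, num_bits)
    # hard decisions by summing the redundant copies, as described for --redundancy.
    batch = logits.shape[0]
    grouped = logits.view(batch, num_bits, redundancy)  # one group per message bit (assumed layout)
    scores = grouped.sum(dim=-1)                        # sum over the redundant copies
    return (scores > 0).long()                          # sign of the summed score -> bit

# Example: 2 images, 32-bit message, redundancy 1 (the defaults above)
bits = collapse_redundancy(torch.randn(2, 32 * 1), num_bits=32, redundancy=1)
print(bits.shape)  # torch.Size([2, 32])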
Encoder Parameters
  • --encoder: Encoder type (e.g., "hidden", "dvmark", "vit"). Default: "hidden"
  • --encoder_depth: Number of blocks in the encoder. Default: 4
  • --encoder_channels: Number of channels in the encoder. Default: 64
  • --use_tanh: Use tanh scaling. Default: True
Decoder Parameters
  • --decoder: Decoder type (e.g., "hidden"). Default: "hidden"
  • --decoder_depth: Number of blocks in the decoder. Default: 8
  • --decoder_channels: Number of channels in the decoder. Default: 64
Training Parameters
  • --bn_momentum: Momentum of the batch normalization layer. Default: 0.01
  • --eval_freq: Frequency of evaluation during training (in epochs). Default: 1
  • --saveckp_freq: Frequency of saving checkpoints (in epochs). Default: 100
  • --saveimg_freq: Frequency of saving images (in epochs). Default: 10
  • --resume_from: Checkpoint path to resume training from.
  • --scaling_w: Scaling of the watermark signal (see the blending sketch after this list). Default: 1.0
  • --scaling_i: Scaling of the original image. Default: 1.0
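As a rough illustration of how --scaling_i and --scaling_w presumably combine (the function and variable names below are assumptions, not the repository's code), the watermarked image is a weighted blend of the original image and the watermark signal produced by the encoder:

import torch

def blend(img: torch.Tensor, w_signal: torch.Tensor, scaling_i: float = 1.0, scaling_w: float = 1.0) -> torch.Tensor:
    # Assumed convention: img_w = scaling_i * img + scaling_w * w_signal,
    # so lowering --scaling_w (e.g. 0.3 in the example runs below) makes the
    # watermark less visible at the cost of some decoding robustness.
    return scaling_i * img + scaling_w * w_signal

img_w = blend(torch.rand(1, 3, 128, 128), 0.01 * torch.randn(1, 3, 128, 128), scaling_i=1.0, scaling_w=0.3)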
Optimization Parameters
  • --epochs: Number of epochs for optimization. Default: 400
  • --optimizer: Optimizer to use (e.g., "Adam"). Default: "Adam"
  • --scheduler: Learning rate scheduler to use (e.g., "CosineLRScheduler,lr_min=1e-6,t_initial=400,warmup_lr_init=1e-6,warmup_t=5"; the string format is sketched after this list). Default: None
  • --lambda_w: Weight of the watermark loss. Default: 1.0
  • --lambda_i: Weight of the image loss. Default: 0.0
  • --loss_margin: Margin of the Hinge loss or temperature of the sigmoid of the BCE loss. Default: 1.0
  • --loss_i_type: Loss type for image loss ("mse" or "l1"). Default: 'mse'
  • --loss_w_type: Loss type for watermark loss ("bce" or "cossim"). Default: 'bce'
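The --scheduler and --optimizer values are comma-separated "Name,key=value,..." strings. A minimal parser sketch (the helper name is hypothetical, and numeric-only values are assumed, as in the examples in this README):

def parse_params(s: str):
    # Split a "Name,key=value,..." string into a name and a kwargs dict, e.g.
    # "CosineLRScheduler,lr_min=1e-6,t_initial=400,warmup_lr_init=1e-6,warmup_t=5"
    # -> ("CosineLRScheduler", {"lr_min": 1e-06, "t_initial": 400.0, ...})
    name, *pairs = s.split(",")
    kwargs = {}
    for pair in pairs:
        key, value = pair.split("=")
        kwargs[key] = float(value)  # all values in the README examples are numeric
    return name, kwargs

print(parse_params("Lamb,lr=2e-2"))  # ('Lamb', {'lr': 0.02})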
Loader Parameters
  • --batch_size: Batch size for training. Default: 16
  • --batch_size_eval: Batch size for evaluation. Default: 64
  • --workers: Number of workers for data loading. Default: 8
Attenuation Parameters

Additionally, the codebase can train with a just-noticeable-difference (JND) map that attenuates the watermark signal in the perceptually sensitive regions of the image. The attenuation can also be applied at test time only, at the cost of some accuracy; a minimal sketch of the idea follows the parameters below.

  • --attenuation: Attenuation type. Default: None
  • --scale_channels: Use channel scaling. Default: True
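The sketch below only illustrates the idea (the crude local-activity map stands in for the codebase's actual JND model): the watermark distortion is scaled down where the image is flat and the eye is most sensitive, and kept stronger in textured regions.

import torch
import torch.nn.functional as F

def attenuate(img: torch.Tensor, delta: torch.Tensor) -> torch.Tensor:
    # Scale the watermark distortion `delta` by a rough local-activity map:
    # flat regions get a smaller watermark, textured regions keep more of it.
    gray = img.mean(dim=1, keepdim=True)                        # (B, 1, H, W)
    kernel = torch.ones(1, 1, 3, 3, device=img.device) / 9.0
    local_mean = F.conv2d(gray, kernel, padding=1)
    activity = (gray - local_mean).abs()                        # crude texture measure
    heatmap = activity / (activity.amax(dim=(-2, -1), keepdim=True) + 1e-8)
    return img + heatmap * delta                                # attenuated watermarked image

img_w = attenuate(torch.rand(1, 3, 128, 128), 0.1 * torch.randn(1, 3, 128, 128))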
Data Augmentation Parameters
  • --data_augmentation: Type of data augmentation to use at marking time ("combined", "kornia", "none"); the p_* probabilities below control how often each augmentation is applied (a rough sketch follows this list). Default: "combined"
  • --p_crop: Probability of the crop augmentation. Default: 0.5
  • --p_res: Probability of the resize augmentation. Default: 0.5
  • --p_blur: Probability of the blur augmentation. Default: 0.5
  • --p_jpeg: Probability of the JPEG compression augmentation. Default: 0.5
  • --p_rot: Probability of the rotation augmentation. Default: 0.5
  • --p_color_jitter: Probability of the color jitter augmentation. Default: 0.5
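For intuition only, here is a sketch of how per-augmentation probabilities like these are typically used. The specific transforms are generic torchvision ops, and applying them independently of each other is an assumption about this codebase, not a description of it.

import random
import torch
import torchvision.transforms.functional as TF

def maybe_augment(img: torch.Tensor, p_crop: float = 0.5, p_res: float = 0.5, p_rot: float = 0.5) -> torch.Tensor:
    # Each augmentation fires independently with its own probability (assumed behavior).
    if random.random() < p_crop:
        h, w = img.shape[-2:]
        img = TF.center_crop(img, [int(0.8 * h), int(0.8 * w)])
    if random.random() < p_res:
        img = TF.resize(img, [img.shape[-2] // 2, img.shape[-1] // 2])
    if random.random() < p_rot:
        img = TF.rotate(img, angle=10.0)
    return img

aug = maybe_augment(torch.rand(1, 3, 256, 256), p_crop=1.0, p_res=1.0, p_rot=0.0)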
Distributed Training Parameters
  • --debug_slurm: Enable debugging for SLURM.
  • --local_rank: Local rank for distributed training. Default: -1
  • --master_port: Port for the master process. Default: -1
  • --dist: Enable distributed training. Default: True

Example

For instance, the following command reproduces the hidden extractor with the same parameters as in the paper:

torchrun --nproc_per_node=8 main.py \
  --val_dir path/to/coco/test2014/ --train_dir path/to/coco/train2014/ --output_dir output --eval_freq 5 \
  --img_size 256 --num_bits 48  --batch_size 16 --epochs 300 \
  --scheduler CosineLRScheduler,lr_min=1e-6,t_initial=300,warmup_lr_init=1e-6,warmup_t=5  --optimizer Lamb,lr=2e-2 \
  --p_color_jitter 0.0 --p_blur 0.0 --p_rot 0.0 --p_crop 1.0 --p_res 1.0 --p_jpeg 1.0 \
  --scaling_w 0.3 --scale_channels False --attenuation none \
  --loss_w_type bce --loss_margin 1 

The following single-GPU variant runs the same training on one device:

CUDA_VISIBLE_DEVICES=1 torchrun --nproc_per_node=1 main.py \
  --val_dir datasets/coco_dataset/test2014 --train_dir datasets/coco_dataset/train2014 --output_dir output/output_bit/7 --eval_freq 5 \
  --img_size 256 --num_bits 48 --batch_size 16 --epochs 300 \
  --scheduler CosineLRScheduler,lr_min=1e-6,t_initial=300,warmup_lr_init=1e-6,warmup_t=5 --optimizer Lamb,lr=2e-2 \
  --p_color_jitter 0.0 --p_blur 0.0 --p_rot 0.0 --p_crop 1.0 --p_res 1.0 --p_jpeg 1.0 \
  --scaling_w 0.3 --scale_channels False --attenuation none \
  --loss_w_type bce --loss_margin 1 \
  --local_rank 0 --dist False --workers 2

And a two-GPU run (note that it calls main4.py) with custom data directories and all augmentations disabled:

CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node=2 main4.py \
  --val_dir datasets/val_dir --train_dir datasets/train_dir --output_dir output/output_bit/9_mytest --eval_freq 5 \
  --img_size 256 --num_bits 48 --batch_size 16 --epochs 300 \
  --scheduler CosineLRScheduler,lr_min=1e-5,t_initial=300,warmup_lr_init=1e-4,warmup_t=5 --optimizer Lamb,lr=2e-2 \
  --p_color_jitter 0.0 --p_blur 0.0 --p_rot 0.0 --p_crop 0.0 --p_res 0.0 --p_jpeg 0.0 \
  --scaling_w 0.3 --scale_channels False --attenuation none \
  --loss_w_type bce --loss_margin 1 \
  --local_rank 0 --dist True --workers 2

This should create an output folder containing the checkpoints, logs, and images. The training log, logs.txt, mirrors the console stdout.

The resulting checkpoints (hidden_replicate.pth and hidden_replicate_whit.torchscript.pth) have approximately the same performance as in the paper. Robustness to JPEG is somewhat worse because the augmentation implementation differs slightly from the paper's; this could be fixed by increasing the JPEG augmentation probability.
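As a usage sketch, a torchscript checkpoint such as hidden_replicate_whit.torchscript.pth can presumably be loaded with torch.jit.load and run on a normalized image batch. The input normalization, the output convention (one score per message bit), and the file names are assumptions here; the Colab demo is the authoritative reference.

import torch
from PIL import Image
from torchvision import transforms

decoder = torch.jit.load("hidden_replicate_whit.torchscript.pth").eval()  # assumed checkpoint path

to_tensor = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),  # assumed normalization
])

img = to_tensor(Image.open("watermarked.png").convert("RGB")).unsqueeze(0)  # hypothetical input image
with torch.no_grad():
    soft_bits = decoder(img)                 # assumed output: (1, num_bits) scores
bits = (soft_bits > 0).long().squeeze(0)     # threshold scores into message bits
print(bits)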

