Skip to content

gmontana/DecodingViewerEmotions

Repository files navigation

Decoding Viewer Emotions in Video Ads

TSAM Architecture

Paper · Dataset · Model Weights

Code and pre-trained model for "Decoding Viewer Emotions in Video Ads" (Antonov et al., Nature Scientific Reports, 2024). The Temporal Shift Augmented Module (TSAM) predicts viewers' emotional reactions to video advertisements from short 5-second excerpts, processing both video frames and audio.

Quick Start

1. Install dependencies

pip install -r requirements.txt

ffmpeg is also required for preprocessing.

2. Download data and weights

from huggingface_hub import snapshot_download

# Download dataset (video clips and CSV splits)
snapshot_download(
    repo_id="dnamodel/adcumen-viewer-emotions",
    repo_type="dataset",
    local_dir="./adcumen-data"
)

# Download model weights
snapshot_download(
    repo_id="dnamodel/tsam-viewer-emotions",
    local_dir="./adcumen-data"
)

3. Preprocess

python setup_data.py --input ./adcumen-data --workers 8

This extracts video frames, audio, and model weights into the expected directory structure. Run python setup_data.py --help for all options.

4. Run inference

python predict.py \
    --data config/default.json \
    --model weights \
    --type test \
    --id test_run

Predictions are saved to ./data/predicted/test_run/.

Dataset

The dataset contains 26,635 five-second video clips from video advertisements, annotated for eight emotional categories:

Emotion Total Train Validation Test
Anger 2,894 2,282 404 208
Contempt 3,317 2,581 367 369
Disgust 3,061 2,564 254 243
Fear 3,166 2,549 317 300
Happiness 3,577 2,918 383 276
Neutral 3,491 2,771 398 322
Sadness 3,576 2,886 346 344
Surprise 3,553 2,841 387 325
Total 26,635 21,392 2,856 2,387

HuggingFace Repositories

Dataset: huggingface.co/datasets/dnamodel/adcumen-viewer-emotions

  • training.csv, validation.csv, testing.csv -- dataset splits with columns: Video_Name, Start_Second, Label, Clips_Name
  • 5-second_MP4_Clips.zip -- the 26,635 five-second video clips (MP4)

Model weights: huggingface.co/dnamodel/tsam-viewer-emotions

  • backbone_weights.tar -- ResNet50 backbone pre-trained on ImageNet-21K
  • tsam_weights.tar -- TSAM model checkpoint (best balanced accuracy)

Project Structure

.
├── setup_data.py              # Preprocesses HuggingFace download
├── predict.py                 # Run inference with trained model
├── train.py                   # Train TSAM model
├── config/
│   └── default.json           # Default config (relative paths)
├── lib/
│   ├── dataset/               # Data loading (video + audio)
│   ├── model/                 # TSAM architecture
│   └── utils/                 # Training utilities
├── mvlib/                     # Video processing library
├── DataAdcumen/               # Split files and VDB
├── requirements.txt
└── LICENCE

Requirements

  • Python 3.10+
  • PyTorch 2.5+
  • ffmpeg (system install required for both preprocessing and audio loading)
  • CUDA-capable GPU (for inference)

See requirements.txt for Python packages.

Training

python train.py \
    --config config/default.json \
    --cuda_ids 0 \
    --run_id my_experiment

Dataset Access Disclaimer

The dataset leverages System1's proprietary "Test Your Ad" tool for public, educational, and illustrative use. The advertisements and excerpts, while derived from System1's tool, remain the property of their original owners. Usage beyond this study's scope requires explicit permission from those owners. By accessing the dataset, you agree to these conditions.

License

The TSAM software and associated documentation are made available under a custom license that permits use solely for academic research and non-commercial evaluation. See LICENCE for full terms. For commercial use inquiries, contact Warwick Ventures at ventures@warwick.ac.uk.

Citation

@article{antonov2024decoding,
  title={Decoding viewer emotions in video ads},
  author={Antonov, Alexey and Kumar, Shravan Sampath and Wei, Jiefei and Headley, William and Wood, Orlando and Montana, Giovanni},
  journal={Scientific Reports},
  volume={14},
  pages={25680},
  year={2024},
  publisher={Nature Publishing Group}
}

Contact

For questions, suggestions, or collaborations, please contact Giovanni Montana at g.montana@warwick.ac.uk.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages