ALACen: Automatic Language-level Adjustment for Video Censorship

About the Project

This project was developed as the course project for EE474: Introduction to Multimedia at KAIST (Spring 2024).

ALACen is a pipeline for Automatic Language-Level Adjustment for Video Censorship. It uses deep-learning techniques to censor videos containing violent speech while preserving immersion. ALACen consists of four stages: Speech Recognition, Paraphrase Generation, Text-to-Speech Synthesis, and Lip Synchronization.
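
To make the data flow between the four stages concrete, here is a minimal sketch of how such a pipeline could be chained. It is not the ALACen implementation: `run_pipeline`, `PipelineResult`, the stage signatures, and the file names are all hypothetical placeholders, and the stages are plain stand-in functions rather than real models.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PipelineResult:
    """Intermediate and final outputs of one run (hypothetical structure)."""
    transcript: str
    paraphrase: str
    audio_path: str
    video_path: str


def run_pipeline(
    video_path: str,
    recognize: Callable[[str], str],        # video path -> transcript
    paraphrase: Callable[[str], str],       # transcript -> softened text
    synthesize: Callable[[str, str], str],  # (text, source video) -> audio path
    lipsync: Callable[[str, str], str],     # (source video, audio) -> output video path
) -> PipelineResult:
    # 1. Speech Recognition: extract what is being said.
    transcript = recognize(video_path)
    # 2. Paraphrase Generation: rewrite the transcript in milder language.
    softened = paraphrase(transcript)
    # 3. Text-to-Speech Synthesis: speak the rewritten text.
    audio = synthesize(softened, video_path)
    # 4. Lip Synchronization: align the speaker's lips with the new audio.
    output = lipsync(video_path, audio)
    return PipelineResult(transcript, softened, audio, output)


if __name__ == "__main__":
    # Stand-in stages with made-up values, only to show the call order.
    result = run_pipeline(
        "clip.mp4",
        recognize=lambda video: "placeholder transcript",
        paraphrase=lambda text: text.replace("placeholder", "softened"),
        synthesize=lambda text, video: "clip_tts.wav",
        lipsync=lambda video, audio: "clip_censored.mp4",
    )
    print(result)
```

Passing the stages in as callables keeps the sketch independent of any particular model, so each stage can be swapped or tested in isolation.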

Installation

The instructions below assume that you already have Conda installed. If not, follow the Conda installation guide before you proceed.

Create a new Conda environment with Python 3.9 and activate it. For example:

conda create -n alacen python=3.9 && conda activate alacen

Install Mamba using the following command, then restart your terminal once the installation finishes. Mamba is needed because installing the dependencies with conda through the conda-forge channel often hangs. Note: You can skip this step if you already have Mamba installed.

bash install_mamba.sh

Run the following command to install the dependencies and download the pre-trained models. If it fails with a connection error, try running it again. Note: You may have to activate your environment before running the command.

bash setup.sh

Usage

We provide four options for running ALACen.

Execute the Python module. Replace <path-to-your-video> with the correct path.

python -m src.alacen -v --video <path-to-your-video> --num-gpus 3 --device cuda:3

Run the run.ipynb notebook. This gives you an interactive execution of ALACen. Put your configuration parameters in the Configuration cell and run all the cells. If you are prompted that files already exist, remove those files and rerun the cell.
Run the evaluation.ipynb notebook. This shows how ALACen's output videos for the user study were generated.
Run the Gradio demo application with the following command. You can then open the application in your browser at the URL printed in the terminal.

python app.py -v -s --num-gpus 3 --device cuda:3

Note that the current version is configured for 4 NVIDIA GeForce GTX 1070 GPUs. If you have fewer GPUs with more memory, set the number of GPUs and the default device in the command-line arguments accordingly.
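
When adapting the command to different hardware, it can help to assemble the arguments programmatically. The sketch below only reuses the flags shown in the commands above (`-v`, `--video`, `--num-gpus`, `--device`); the helper name `build_alacen_command` and the example values are hypothetical, and the exact meaning of each flag should be checked against the project's argument parser.

```python
import shlex
from typing import List


def build_alacen_command(video: str, num_gpus: int, device: str) -> List[str]:
    """Assemble the module invocation from the Usage section.

    Hypothetical helper: it only arranges the flags that appear in this
    README and does not validate them against the real argument parser.
    """
    return [
        "python", "-m", "src.alacen",
        "-v",
        "--video", video,
        "--num-gpus", str(num_gpus),
        "--device", device,
    ]


if __name__ == "__main__":
    # Example values are assumptions; substitute your own video and devices.
    cmd = build_alacen_command("input.mp4", 1, "cuda:0")
    # shlex.join produces a shell-safe string you can copy into a terminal.
    print(shlex.join(cmd))
```

On a single-GPU machine, `--num-gpus 1 --device cuda:0` is a plausible starting point, but this has not been verified against the project.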

Project Structure

The project is organized as follows.

ALACen/
├─install_mamba.sh                  # Script for installing Mamba
├─app.py                            # Gradio demo app
├─README.md                         # README file
├─run.ipynb                         # Notebook for running ALACen
├─src/                              # Source directory
│ ├─datasets/                       # Python scripts for constructing the Violent Speech dataset
│ └─alacen/                         # ALACen implementation
│   ├─asr/                          # Speech Recognition
│   ├─config.py                     # Configuration file
│   ├─tts/                          # Text-to-Speech Synthesis
│   ├─lipsync/                      # Lip Synchronization
│   ├─paraphrase/                   # Paraphrase Generation
│   ├─__main__.py                   # Main ALACen script
│   ├─alacen.py                     # ALACen class
│   └─...                           # Other helper and utility functions
├─setup.sh                          # Script for setting up the environment
├─demo/                             # Demo videos
├─assets/                           # Images and other assets
├─datasets/                         # Datasets
│ ├─violent_speech_list.txt         # Extracted violent speeches
│ └─violent_speech_dataset.json     # Violent Speech Dataset
├─requirements.txt                  # Dependencies
├─finetune_lm.ipynb                 # Notebook for fine-tuning the LM
└─evaluation.ipynb                  # Notebook for generating videos for the user study

Acknowledgements

This repository contains code from the diff2lip and VoiceCraft repositories. Visit them for more details.
