vsep is a high-performance audio stem separator that splits music into vocals, drums, bass, and other instruments using state-of-the-art AI models from UVR (Ultimate Vocal Remover).
- ⚡ Fast Downloads - Parallel model downloads (4-8x faster than standard)
- 🔄 Resume Support - Automatically resumes interrupted downloads
- 🎯 Multiple Architectures - Support for MDX, VR, Demucs, and MDXC models
- 🎛️ Ensemble Mode - Combine multiple models for better quality
- 💻 GPU/CPU/DirectML - Works with NVIDIA GPU, Apple MPS, AMD DirectML, or CPU
- 🔧 Configurable - Easy-to-use configuration system for custom settings
- 🌐 Remote API - Deploy to Cloud Run or Modal for cloud processing
Clone and install dependencies:

```bash
git clone https://github.com/BF667-IDLE/vsep.git
cd vsep
pip install -r requirements.txt
```

For development (includes testing tools):

```bash
pip install -r requirements-dev.txt
```

See INSTALL.md for detailed platform-specific instructions (Windows GPU, macOS, Linux).
Separate vocals from instrumentation:

```bash
python utils/cli.py your_song.mp3
```

Use a specific model:

```bash
python utils/cli.py your_song.mp3 -m UVR-MDX-NET-Inst_1.onnx
```

List available models:

```bash
python utils/cli.py --list_models
```

Download a model:

```bash
python utils/cli.py --download_model_only UVR-MDX-NET-Inst_1.onnx
```

```python
from separator import Separator

# Initialize
separator = Separator()

# Separate audio
output_files = separator.separate("your_song.mp3")
print(f"Separated files: {output_files}")
```

Advanced usage with custom settings:
```python
from separator import Separator
import config.variables as cfg

# Customize download settings
cfg.MAX_DOWNLOAD_WORKERS = 8      # More parallel downloads
cfg.DOWNLOAD_CHUNK_SIZE = 524288  # 512KB chunks

separator = Separator(
    model_file_dir="./models",
    sample_rate=44100,
    use_soundfile=True,
)

output_files = separator.separate("your_song.mp3")
```

vsep supports 100+ models from UVR. Here are some popular ones:
| Model | Architecture | Stems | Quality |
|---|---|---|---|
| `ht-demucs-ft.yaml` | Demucs v4 | vocals, drums, bass, other | ★★★★★ |
| `UVR-MDX-NET-Inst_1.onnx` | MDX-Net | vocals, instrumental | ★★★★ |
| `BS-Roformer-Viperx-1297.ckpt` | Roformer | vocals, instrumental | ★★★★★ |
| `Mel-Roformer-Viperx-1053.ckpt` | Roformer | vocals, instrumental | ★★★★★ |
List all available models:

```bash
audio-separator --list_models
```

Download a specific model:

```bash
audio-separator --download_model_only UVR-MDX-NET-Inst_1.onnx
```

Run vsep in Google Colab with free GPU access:
The Colab notebook includes:
- Interactive audio upload
- Model selection dropdown
- Audio playback for results
- Download separated stems
All configuration is centralized in `config/variables.py`:

```python
import config.variables as cfg

# Use mirror repository
cfg.UVR_PUBLIC_REPO_URL = "https://your-mirror.com/models"

# Adjust for your connection
cfg.MAX_DOWNLOAD_WORKERS = 8      # Parallel downloads (default: 4)
cfg.DOWNLOAD_CHUNK_SIZE = 524288  # Chunk size (default: 256KB)
cfg.DOWNLOAD_TIMEOUT = 600        # Timeout in seconds (default: 300)
```

See config/README.md for full documentation.
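The parallel download settings above translate into concurrent HTTP Range requests. As an illustrative sketch (not vsep's actual code), this is how a file could be partitioned into `DOWNLOAD_CHUNK_SIZE`-sized ranges for a pool of `MAX_DOWNLOAD_WORKERS` workers to fetch:

```python
def byte_ranges(total_size: int, chunk_size: int) -> list[tuple[int, int]]:
    """Split total_size bytes into inclusive (start, end) ranges of at
    most chunk_size bytes, one per "Range: bytes=start-end" request."""
    ranges = []
    start = 0
    while start < total_size:
        end = min(start + chunk_size, total_size) - 1
        ranges.append((start, end))
        start = end + 1
    return ranges

# A 1 MB file with the default 256 KB chunk size yields four ranges:
print(byte_ranges(1_048_576, 262_144))
# -> [(0, 262143), (262144, 524287), (524288, 786431), (786432, 1048575)]
```

Each range can then be fetched by a worker (e.g. a `concurrent.futures.ThreadPoolExecutor` with `MAX_DOWNLOAD_WORKERS` threads) and written to the output file at its offset.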
Download Speed Comparison:
| Method | Time (100MB model) |
|---|---|
| Standard | ~60 seconds |
| vsep (parallel) | ~15 seconds |
| vsep + mirror | ~8 seconds |
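Resume support can be sketched the same way (illustrative only, not vsep's implementation): the size of the partial file already on disk is exactly the next byte to request with an HTTP Range header.

```python
import os
import tempfile

def resume_offset(partial_path: str) -> int:
    """Byte offset to resume a download from: bytes already on disk, or 0."""
    return os.path.getsize(partial_path) if os.path.exists(partial_path) else 0

# Simulate a download interrupted after 1 KB:
with tempfile.TemporaryDirectory() as tmp:
    part = os.path.join(tmp, "model.onnx.part")
    with open(part, "wb") as f:
        f.write(b"\x00" * 1024)
    offset = resume_offset(part)
    print(f"Range: bytes={offset}-")  # -> Range: bytes=1024-
```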
Separation Speed:
| Model | CPU | GPU (RTX 3060) |
|---|---|---|
| Demucs v4 | ~30 seconds | ~8 seconds |
| MDX-Net | ~45 seconds | ~12 seconds |
| Roformer | ~60 seconds | ~15 seconds |
Combine multiple models for superior quality:

```bash
# Use built-in preset
audio-separator song.mp3 --ensemble_preset vocals_ensemble

# Custom ensemble
audio-separator song.mp3 --model_filename model1.onnx --extra_models model2.onnx model3.onnx --ensemble_algorithm median_wave
```

Available ensemble algorithms:

- `avg_wave` - Average waveforms (default)
- `median_wave` - Median waveform (removes artifacts)
- `max_wave` - Maximum amplitude
- `avg_fft` - Average in frequency domain
- `median_fft` - Median in frequency domain
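As a rough, pure-Python illustration of the difference (not vsep's implementation), the waveform-domain algorithms reduce the same per-sample values differently: a single model's outlier artifact leaks into the average but is discarded by the median.

```python
from statistics import mean, median

def combine(waveforms, algorithm="avg_wave"):
    """Combine equal-length waveforms sample by sample (toy version of
    the waveform-domain ensemble algorithms listed above)."""
    reducers = {
        "avg_wave": mean,
        "median_wave": median,
        "max_wave": lambda xs: max(xs, key=abs),  # sample with max amplitude
    }
    reducer = reducers[algorithm]
    return [reducer(samples) for samples in zip(*waveforms)]

# Two models agree on a sample (0.2); a third emits an artifact (0.9):
outputs = [[0.2, 0.0], [0.2, 0.0], [0.9, 0.0]]
print(combine(outputs, "median_wave"))  # -> [0.2, 0.0]
print(combine(outputs, "avg_wave"))     # the 0.9 artifact pulls the average up
```

The `*_fft` variants apply the analogous reductions to frequency-domain bins rather than raw samples, per the list above.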
Process long audio files in chunks to reduce memory usage:

```python
separator = Separator(chunk_duration=60)  # Process in 60-second chunks
output_files = separator.separate("long_mix.mp3")
```

Deploy vsep as a cloud API:

Modal (GPU):

```bash
python remote/deploy_modal.py deploy
```

Google Cloud Run:

```bash
python remote/deploy_cloudrun.py deploy
```

See remote/README.md for deployment details.
```
vsep/
├── config/               # Configuration and settings
│   ├── variables.py      # Centralized config
│   ├── __init__.py
│   └── README.md
├── separator/            # Core separation logic
│   ├── separator.py      # Main Separator class
│   ├── architectures/    # Model architectures (MDX, VR, Demucs)
│   └── uvr_lib_v5/       # UVR library code
├── remote/               # Cloud deployment
│   ├── deploy_modal.py
│   ├── deploy_cloudrun.py
│   └── api_client.py
├── utils/                # Utilities
│   └── cli.py            # Command-line interface
├── notebooks/            # Jupyter/Colab demos
└── tools/                # Development tools
```
Run tests:

```bash
pytest tests/ -v
```

Format code:

```bash
black . --line-length 140
```

Build the package:

```bash
poetry build
```

Contributions are welcome! Please:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- UVR Team - For the amazing models and training data
- Anjok07 - Primary model trainer and UVR developer
- TRvlvr - For the model repository
- NomadKaraoke - For the python-audio-separator project
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Colab Demo: Try it free
Made with ❤️ by the audio separation community