This project provides an automated workflow for creating clean, single-instrument datasets from a collection of mixed audio tracks. It uses the Lalal.ai API for source separation and `sox` for audio manipulation, so that specific instruments can be isolated from complex musical pieces without manual editing.
- Recursive Audio Discovery: Automatically finds all audio files in a specified directory.
- Silence Removal: Uses a two-pass `sox` process to remove silence from the beginning, middle, and end of audio files.
- Audio Segmentation: Splits long audio files into smaller, manageable segments.
- Source Separation: Integrates with the Lalal.ai API to extract specific instrumental stems (e.g., wind, strings, piano).
- Clear Logging: Provides detailed logs for easy debugging and progress tracking.
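
The discovery and silence-removal steps listed above could be sketched roughly as follows. This is an illustration only, not the project's actual code: the function names `find_audio_files` and `silence_removal_commands`, the extension set, and the duration and threshold values are all assumptions. The commands rely on sox's `silence` and `reverse` effects and are only built here, not executed.

```python
from pathlib import Path

# Assumed set of extensions; the real script may recognise others.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg", ".m4a"}


def find_audio_files(root):
    """Return every audio file under `root`, searching subfolders too."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )


def silence_removal_commands(src, tmp, dst,
                             min_sound="0.1", max_gap="0.5", threshold="1%"):
    """Build two sox invocations (all parameter values are hypothetical).

    Pass 1 trims leading silence, drops silent gaps in the middle,
    then reverses the audio.
    Pass 2 trims what is now the leading silence (originally the
    trailing silence) and reverses the audio back.
    """
    pass_one = ["sox", str(src), str(tmp),
                "silence", "1", min_sound, threshold,
                "-1", max_gap, threshold,
                "reverse"]
    pass_two = ["sox", str(tmp), str(dst),
                "silence", "1", min_sound, threshold,
                "reverse"]
    return [pass_one, pass_two]
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` once `sox` is installed and on the `PATH`.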
- Create and activate a virtual environment:

      python -m venv venv
      venv\Scripts\activate

- Install the required dependencies:

      pip install -r requirements.txt

Run the script from your terminal with the following command, providing the path to your dataset, the instruments you want to extract, and your Lalal.ai API key.

    python process_audio.py --dataset_path /path/to/your/audio/files --instruments wind strings --api_key YOUR_LALALAI_API_KEY

- `--dataset_path`: Path to the dataset folder.
- `--segment_duration`: Duration of audio segments in minutes.
- `--instruments`: List of instruments to extract (e.g., wind, strings). The supported stems are: wind, strings, piano, bass, electric_guitar, acoustic_guitar, synthesizer, synth, drum, voice, accompaniment, ensemble, percussion, and effects.
- `--api_key`: Lalal.ai API key.
- `--output_path`: Path to save the processed files. If not provided, the processed files are saved in the same folder as the input files.
- `--no_stem_separation`: Skip source separation. If provided, the audio files are not uploaded to Lalal.ai.
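
For reference, the options above could be declared with `argparse` along these lines. This is a sketch based only on the option descriptions in this README: the defaults, which options are required, and the helpers `build_parser`, `segment_seconds`, `resolve_output_dir`, and `split_command` are assumptions, not the script's actual code. The split command uses sox's `trim ... : newfile : restart` idiom for cutting a file into equal-length pieces.

```python
import argparse
from pathlib import Path


def build_parser():
    parser = argparse.ArgumentParser(
        description="Build single-instrument datasets from mixed audio.")
    parser.add_argument("--dataset_path", required=True,
                        help="Path to the dataset folder.")
    # The default of 5 minutes is an assumption, not a documented value.
    parser.add_argument("--segment_duration", type=float, default=5.0,
                        help="Duration of audio segments in minutes.")
    parser.add_argument("--instruments", nargs="+", default=[],
                        help="Stems to extract, e.g. wind strings.")
    parser.add_argument("--api_key", default=None,
                        help="Lalal.ai API key.")
    parser.add_argument("--output_path", default=None,
                        help="Where to save processed files.")
    parser.add_argument("--no_stem_separation", action="store_true",
                        help="Skip the upload to Lalal.ai.")
    return parser


def segment_seconds(args):
    """Convert the segment duration from minutes to seconds."""
    return args.segment_duration * 60


def resolve_output_dir(args, input_file):
    """Fall back to the input file's folder when no output path is given."""
    if args.output_path:
        return Path(args.output_path)
    return Path(input_file).parent


def split_command(src, dst, seconds):
    """Build a sox command that writes consecutive fixed-length segments."""
    return ["sox", str(src), str(dst),
            "trim", "0", str(seconds), ":", "newfile", ":", "restart"]
```

Keeping `--api_key` optional in this sketch reflects the fact that it is not needed when `--no_stem_separation` is set; the real script may enforce this differently.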