Preprocessing++ is Side-Step's dataset-aware adapter targeting system. It uses Fisher Information Matrix (FIM) analysis to determine which parts of the model are most important for your specific audio data.
Instead of training every layer with the same capacity (flat rank), Preprocessing++:
- Auto-targets modules: Finds which attention blocks and MLPs react most to your audio.
- Assigns adaptive ranks: Allocates more parameters (higher rank) to important layers and fewer to less relevant ones.
- Saves a map: Writes
fisher_map.jsonto your dataset folder, which training automatically detects and uses.
- Efficiency: Puts parameters where they matter most.
- Quality: Often captures texture and timbre better than flat-rank training.
- Stability: Can reduce "catastrophic forgetting" by leaving irrelevant model parts alone.
[!WARNING] Power User Tool Preprocessing++ is extremely effective at adaptation, which means it can overfit faster than standard training.
- Lower your learning rate (e.g., if you usually use
1e-4, try5e-5).- Monitor closely (check samples early).
- Turbo Users: Be careful. Turbo models are fragile. Preprocessing++ can destabilize Turbo training. Base models are recommended for this workflow.
- Prerequisite: You must have already run standard Preprocessing to generate
.pttensor files in a dataset directory. - Analysis: You run Preprocessing++ on that dataset directory.
- Output: It generates a
fisher_map.jsonfile inside the dataset directory. - Training: When you later run training on that dataset, Side-Step detects the map and automatically switches to "Adaptive Rank" mode.
- Select Preprocessing++ (auto-targeting + adaptive ranks) from the main menu.
- Model Selection: Pick the model you intend to train on (usually
base). - Dataset: Point to your folder of
.ptfiles. - Timestep Focus: Choose how the analysis should "listen" to your audio:
- Balanced (Default): Full timestep range. Recommended for most use cases.
- Texture: Focuses on timbre and sound design. Best for style transfer only.
- Structure: Focuses on rhythm, beat grids, and composition.
- Rank Budget:
- Base Rank: The median target (e.g., 64).
- Min/Max: The floor and ceiling for adaptive ranks (e.g., 16 to 128).
- Confirm:
- The wizard shows a final confirmation summary (dataset size, focus, rank budget).
- If Turbo is selected, the Turbo fragility warning appears here with explicit opt-in.
- Run: The process takes a few minutes depending on dataset size.
You can optionally load preset-derived rank defaults before stepping through the flow (only overlapping fields are applied).
Use the analyze subcommand:
sidestep analyze \
--dataset-dir ./my_dataset_tensors \
--rank 64 \
--rank-min 16 \
--rank-max 128 \
--timestep-focus balancedThe checkpoint directory and model variant are read from your settings. Override with --checkpoint-dir and --model if needed.
| Argument | Description | Default |
|---|---|---|
--dataset-dir |
Path to folder with .pt files (Required) |
- |
--timestep-focus |
balanced, texture, structure |
balanced |
--rank |
Target median rank | 64 |
--rank-min |
Minimum rank for any layer | 16 |
--rank-max |
Maximum rank for any layer | 128 |
--convergence-patience |
Stop analysis when stable for N batches | 5 |
If you see this during training start-up, it means the system found fisher_map.json. It will print details about the adaptive ranks.
To force standard flat-rank training on a dataset that has a map, use:
--ignore-fisher-map (on the train subcommand).
If training collapses (loss spikes, output is noise) on Turbo with Preprocessing++, delete fisher_map.json and retry with standard flat ranks, or switch to a Base model.