Optimize ACE-Step: NLC VAE, compiled decode, LoRA support #498
Open
fspecii wants to merge 1 commit into Blaizzy:pc/add-ace from
Conversation
- Rewrite VAE to native NLC format with nn.Conv1d and FastConvTranspose1d, fusing weight-norm (g * v / ||v||) at load time instead of every forward pass
- Replace manual attention with mx.fast.scaled_dot_product_attention
- Simplify turbo diffusion to single-pass (no CFG/APG), matching upstream behavior
- Add compiled VAE decode (mx.compile) with auto-conversion to mlx_weights.safetensors
- Add LoRA adapter support (load/unload with weight fusion)
- Add quantized 5Hz LM variants (0.6B-8bit, 0.6B-4bit)
- Add music metadata params (bpm, keyscale, timesignature)
- Add acestep model remapping and custom_loading support in utils
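The weight-norm fusion in the first bullet can be illustrated with plain NumPy. This is a sketch of the math only, not the PR's MLX code: it assumes a Conv1d weight of shape (out_channels, kernel, in_channels) with the norm taken per output channel, so g * v / ||v|| can be folded into a single weight once at load time.

```python
import numpy as np

def fuse_weight_norm(weight_g, weight_v, axis=(1, 2)):
    """Fold weight-norm parameters into one weight: w = g * v / ||v||.

    The axis choice is an assumption for a Conv1d weight of shape
    (out_channels, kernel, in_channels); the norm is per output channel.
    """
    norm = np.sqrt((weight_v ** 2).sum(axis=axis, keepdims=True))
    return weight_g * weight_v / norm

rng = np.random.default_rng(0)
v = rng.standard_normal((8, 3, 4))   # (out, kernel, in)
g = rng.standard_normal((8, 1, 1))

# Fused once at load time; the forward pass then uses a plain conv weight.
w = fuse_weight_norm(g, v)

# Sanity check: each fused output channel has norm |g| for that channel,
# exactly what per-forward weight normalization would produce.
norms = np.sqrt((w ** 2).sum(axis=(1, 2)))
assert np.allclose(norms, np.abs(g).squeeze())
```

Since the fused weight is mathematically identical to normalizing on every forward pass, the fusion is purely a speed optimization with no change in outputs.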
Collaborator
I don't see the weight norm conv changes -- did those get excluded? Also, it looks like you removed cfg entirely? Was it not needed even for existing models? If you could explain the intent/goal of the changes vs just the mechanical pieces, it would be helpful context.
Summary
- Replace WeightNormConv1d / WeightNormConvTranspose1d with nn.Conv1d and FastConvTranspose1d. Weight-norm parameters (weight_g, weight_v) are fused into regular weights at load time via sanitize(), eliminating per-forward-pass normalization overhead.
- Compiled VAE decode: mx.compile(model.vae.decode), with auto-conversion to mlx_weights.safetensors on first load (skips PT→MLX conversion on subsequent runs).
- mx.fast.scaled_dot_product_attention: replaces manual attention (matmul → mask → softmax → matmul) with MLX's fused kernel.
- LoRA support: load_lora() / unload_lora() with weight fusion (W + scale * (alpha/r) * B @ A) and base-weight backup/restore for hot-swapping adapters.
- Quantized LM variants: 0.6B-8bit and 0.6B-4bit model IDs for lower-memory language model inference.
- Music metadata: bpm, keyscale, timesignature parameters forwarded to prompt formatting.
- Loader integration: custom_loading class attribute + acestep remapping for clean integration with mlx_audio.utils.base_load_model.

Test plan
- --model ACE-Step/ACE-Step1.5 — verified output WAV
- mlx_audio.tts.load() API path works with custom_loading
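For reviewers, the LoRA fuse/unfuse described in the summary can be sketched for a single weight matrix. The class and method names below are illustrative, not the PR's actual API; only the fusion formula W + scale * (alpha/r) * B @ A and the backup/restore behavior come from the description.

```python
import numpy as np

class LoraFuser:
    """Illustrative sketch of LoRA weight fusion with hot-swap support."""

    def __init__(self, weight):
        self.weight = weight
        self._backup = None

    def load_lora(self, A, B, alpha, rank, scale=1.0):
        # Back up the base weight so the adapter can be unloaded later,
        # then fold the low-rank update directly into the weight.
        self._backup = self.weight.copy()
        self.weight = self.weight + scale * (alpha / rank) * (B @ A)

    def unload_lora(self):
        # Restore the original (pre-fusion) weight.
        self.weight = self._backup
        self._backup = None

rng = np.random.default_rng(1)
W = rng.standard_normal((16, 16))
r = 4
A = rng.standard_normal((r, 16))   # (rank, in_features)
B = rng.standard_normal((16, r))   # (out_features, rank)

layer = LoraFuser(W)
layer.load_lora(A, B, alpha=8, rank=r)
assert not np.allclose(layer.weight, W)   # adapter applied

layer.unload_lora()
assert np.allclose(layer.weight, W)       # base weights restored exactly
```

Fusing into the base weight means inference pays no per-step adapter cost; the trade-off is that swapping adapters requires the backup copy, as the sketch shows.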