MLX-compatible REAP for pruning MoE models on Apple Silicon
Updated Jun 17, 2026 - Python
KL-divergence + blind task eval of GLM-4.7-Flash and its REAP-pruned variant vs gpt-oss-20b on llama.cpp