MetaPAC is a meta-learning-based framework for predictive adaptive compression of transformer models. It combines pruning and quantization with meta-learned predictions to choose compression configurations that reduce model size while preserving task performance.
## Features

- **Meta-learning based compression**: Predict optimal compression configurations using learned meta-models
- **Hybrid compression**: Combine structured/unstructured pruning with variable-bit quantization
- **Flexible pipeline**: Modular architecture supporting various compression strategies
- **Fine-tuning integration**: Post-compression fine-tuning with knowledge distillation
- **CLI & Python API**: Easy-to-use command-line interface and programmatic access
## Installation

```bash
pip install metapac
```

Or install from source:

```bash
git clone https://github.com/alenoi/MetaPAC.git
cd MetaPAC
pip install -e .
```
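To sanity-check the install, the module entry point should respond to `--help` (assuming the CLI is built on a standard argument parser):

```bash
# assumes an argparse-style CLI that prints usage and exits
python -m metapac --help
```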
## Quick Start

Full automatic pipeline:

```bash
python -m metapac --mode auto
```

Compress a model with a configuration file:
```bash
python -m metapac --mode compress --config examples/configs/compress_distilbert_sst2.yaml
```

Extract features for meta-learning:
```bash
python -m metapac --mode feature_extract --config examples/configs/feature_extraction.yaml
```

Train the meta-predictor:
```bash
python -m metapac --mode train_meta --config examples/configs/meta_distilbert_sst2.yaml
```

### Python API

```python
from metapac import build_meta_dataset, TorchMetaPredictor, TorchModelWrapper

# Build meta-dataset from a model
config = {
    "model_name": "distilbert-base-uncased",
    "dataset": "glue",
    "dataset_config": "sst2",
}
meta_dataset = build_meta_dataset(config)

# Train meta-predictor
predictor = TorchMetaPredictor()
predictor.train(meta_dataset)

# Use for compression prediction
wrapper = TorchModelWrapper(model)  # `model`: a previously loaded transformer model
predictions = predictor.predict(wrapper)
```

## Pipeline Modes

- `auto`: Run the full pipeline (baseline fine-tuning → feature extraction → meta-training → compression)
- `auto:feature_extract`: Run from feature extraction onwards (skip baseline fine-tuning)
- `baseline_finetune`: Fine-tune the baseline model only
- `feature_extract`: Extract features for meta-learning
- `train_meta`: Train the meta-predictor
- `compress`: Compress the model with optional fine-tuning
## Command-Line Options

- `--config PATH`: Path to the YAML configuration file
- `--mode MODE`: Pipeline mode to run
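For example, the two flags combine to resume a partially completed run (pairing this mode with the feature-extraction config is an assumption; check your own config layout):

```bash
# resume the pipeline from feature extraction, skipping baseline fine-tuning
python -m metapac --mode auto:feature_extract --config examples/configs/feature_extraction.yaml
```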
## Example Configurations

See `examples/configs/` for example configuration files:

- `compress_distilbert_sst2.yaml`: Basic compression configuration
- `compress_with_finetuning.yaml`: Compression with post-compression fine-tuning
- `feature_extraction.yaml`: Feature extraction configuration
- `meta_distilbert_sst2.yaml`: Meta-predictor training configuration
Pre-configured compression scenarios live in `examples/configs/scenarios/` (used in the ablation studies):

- `prune_magnitude_logical_30.yaml`: Pruning-only baseline (30% magnitude-based)
- `quant_vb_headroom_on.yaml`: Quantization-only baseline (variable-bit, 2–8 bits)
- `compress_finetune_no_kd.yaml`: Combined pruning + quantization (no fine-tuning)
- `compress_finetune_kd.yaml`: Full pipeline with knowledge distillation (recommended)
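Assuming the scenario files follow the same schema as the other compress configs, a scenario run is an ordinary `compress` invocation:

```bash
# full-pipeline ablation: pruning + quantization + KD fine-tuning
python -m metapac --mode compress --config examples/configs/scenarios/compress_finetune_kd.yaml
```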
## Configuration Format

A complete `compress` configuration:

```yaml
mode: compress

model:
  name: "distilbert-base-uncased"
  task: "sequence-classification"

dataset:
  name: "glue"
  config: "sst2"

compression:
  pruning:
    enabled: true
    ratio: 0.3
    method: "magnitude"
  quantization:
    enabled: true
    method: "variable_bit"
    bits: [2, 4, 6, 8]

fine_tuning:
  enabled: true
  epochs: 3
  learning_rate: 2e-5
  use_kd: true  # knowledge distillation
```

See the example configurations for advanced options, including:
- Custom pruning strategies (magnitude, gradient-based, meta-predicted)
- Variable-bit quantization with headroom optimization
- Fine-tuning with knowledge distillation
- Custom meta-predictor architectures
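Because configs are plain YAML, ablation variants can be generated programmatically. A minimal sketch using PyYAML, assuming `compress_distilbert_sst2.yaml` follows the schema shown above (the output filename is illustrative):

```python
import yaml

# Load the base config and derive a 50%-pruning variant
with open("examples/configs/compress_distilbert_sst2.yaml") as f:
    config = yaml.safe_load(f)

config["compression"]["pruning"]["ratio"] = 0.5  # sweep the pruning ratio

# Write the variant next to the original (hypothetical filename)
with open("examples/configs/compress_distilbert_sst2_p50.yaml", "w") as f:
    yaml.safe_dump(config, f)
```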
## Pipeline Stages

1. **Feature extraction**: Extract layer-level and parameter-level features from the model for meta-learning.
2. **Meta-training**: Train a meta-model to predict optimal compression configurations from the extracted features.
3. **Compression**: Apply pruning and/or quantization based on meta-predictions or predefined strategies.
4. **Fine-tuning**: Fine-tune the compressed model, optionally with knowledge distillation from the original model.
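For intuition on the compression stage, the sketch below shows what 30% magnitude pruning (as in the `prune_magnitude_logical_30` scenario) does to a single linear layer, using PyTorch's built-in pruning utilities rather than MetaPAC's own implementation:

```python
import torch
import torch.nn.utils.prune as prune

layer = torch.nn.Linear(768, 768)

# Zero the 30% of weights with the smallest absolute value (L1 magnitude)
prune.l1_unstructured(layer, name="weight", amount=0.3)

sparsity = (layer.weight == 0).float().mean().item()
print(f"sparsity: {sparsity:.1%}")  # ~30.0%

# Bake the mask into the weight tensor and drop the re-parametrization
prune.remove(layer, "weight")
```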
## Output Structure

```
targets/<model_name>/models/experiments/<experiment_name>/
├── pruned_before_quant/   # Model after pruning
├── quantized_before_ft/   # Model after quantization (fake-quant)
├── finetuned/             # Model after fine-tuning
├── compressed/            # Final compressed model
│   ├── pytorch_model.bin        # Fake-quant FP32 weights
│   ├── model_packed.bin         # Packed variable-bit weights
│   └── compression_config.json
└── logs/                  # Training and compression logs
```
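The fake-quant checkpoint stores FP32 tensors whose values have been snapped to a low-bit grid but not yet bit-packed. A minimal sketch of symmetric uniform k-bit fake quantization (illustrative only; MetaPAC's variable-bit scheme may differ):

```python
import torch

def fake_quantize(w: torch.Tensor, bits: int) -> torch.Tensor:
    """Round w onto a symmetric uniform k-bit grid, then return FP32 values."""
    qmax = 2 ** (bits - 1) - 1  # e.g. 7 levels each side for 4 bits
    scale = w.abs().max() / qmax
    q = torch.clamp(torch.round(w / scale), -qmax, qmax)
    return q * scale            # still FP32, but only 2**bits - 1 distinct values

w = torch.randn(768, 768)
for bits in (2, 4, 6, 8):       # the variable-bit range from the example config
    err = (w - fake_quantize(w, bits)).abs().mean().item()
    print(f"{bits}-bit mean abs error: {err:.4f}")
```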
## Requirements

- Python >= 3.8
- PyTorch >= 2.0
- Transformers >= 4.44
- See `requirements.txt` for the full dependency list
## Citation

If you use MetaPAC in your research, please cite:
```bibtex
@software{metapac2025,
  title       = {MetaPAC: Meta-learning based Predictive Adaptive Compression},
  author      = {Panyi, Tamás},
  year        = {2025},
  version     = {0.1.0},
  institution = {Óbudai Egyetem},
  url         = {https://github.com/alenoi/metapac}
}
```

## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## Acknowledgments

This research was conducted at Óbudai Egyetem (Óbuda University).

## Contact

For questions, issues, or feature requests, please open an issue on GitHub.