Benchmark study comparing feature importance ranking methods through feature selection evaluation. Implements and evaluates 30+ methods (TM-based, filter, wrapper, explainability) across multiple datasets using 5 evaluation protocols (Top-K, Deletion, Insertion, ROAR, ROAD-Mask).
Paper (Springer): https://link.springer.com/chapter/10.1007/978-3-032-11402-0_15
Preprint (arXiv): https://arxiv.org/abs/2508.06991
This project implements and evaluates multiple feature selection approaches:
- Traditional methods: Mutual Information, Chi-squared, Variance
- Tsetlin Machine-based: Clause analysis and feature importance from TM training
- ROAR/ROAD: Feature importance assessed through iterative removal (ROAR, RemOve And Retrain) or masking (ROAD-Mask), followed by retraining
- Wrapper methods: Dropout, Permutation Importance
- Sklearn models: BernoulliNB, DecisionTree, LogisticRegression, SVM, KNN
The study follows a structured workflow:
- Hyperparameter Optimization → Find best parameters for each dataset
- Feature Selection Experiments → Run comprehensive feature selection analysis
- Visualization & Analysis → Generate plots and comparisons
Feature Selection/
├── src/ # Core source code
│   ├── utils/ # Shared utilities
│   │   ├── data_loading.py # Dataset loading and preprocessing
│   │   ├── synthetic_datasets.py # Synthetic dataset generators
│   │   ├── tm_utils.py # Tsetlin Machine utilities
│   │   ├── sklearn_utils.py # Sklearn model utilities
│   │   ├── serialization.py # JSON serialization helpers
│   │   └── paths.py # Centralized output path configuration
│   ├── hyperparameter_optimization/ # Step 1: HPO
│   │   └── optuna_tm_search.py # TM hyperparameter optimization
│   ├── experiments/ # Step 2: Main experiments
│   │   └── (to be refactored)
│   └── visualization/ # Step 3: Plotting & analysis
│       └── (to be refactored)
│
├── outputs/ # All generated outputs
│   ├── figures/ # Visualization outputs
│   │   ├── correlations/ # Score correlation heatmaps
│   │   ├── top_k/ # Top-K performance plots
│   │   ├── pruning_curves/ # Deletion/insertion/ROAR/ROAD curves
│   │   ├── heatmaps/ # Feature importance heatmaps
│   │   └── misc/ # Other plots
│   ├── params/ # Hyperparameters
│   │   ├── best_params/ # Per-dataset optimized parameters
│   │   ├── aggregated/ # Aggregated parameter files
│   │   └── road/ # ROAD-specific results
│   ├── results/ # Experiment results
│   │   └── experiments/ # Feature selection experiment JSONs
│   └── examples/ # Curated subset for GitHub
│       ├── figures/ # Example visualizations
│       └── params/ # Example parameter files
│
├── _local_only/ # Bulk artifacts (not tracked by git)
│
├── legacy/ # Legacy scripts (preserved for reference)
│   ├── FS_3_types/ # Original FS_3_types scripts
│   ├── ROAD_Models/ # Original ROAD experiment scripts
│   └── tm_feature_selection/ # Original TM feature selection
│
├── 1_optimize_hyperparameters.py # Step 1: Run hyperparameter optimization
├── 2_run_road_experiments.py # Step 2: Run ROAD experiments
├── 3_run_ktop_experiments.py # Step 3: Run Top-K experiments (planned)
├── 4_generate_plots.py # Step 4: Generate visualizations (planned)
│
└── requirements.txt
- Install dependencies:
    pip install -r requirements.txt
Required packages:
- numpy
- scikit-learn
- optuna
- tmu (Tsetlin Machine implementation)
- matplotlib, seaborn
- shap, lime (optional, for explainability methods)
First, find optimal hyperparameters for Tsetlin Machines on each dataset:
    python 1_optimize_hyperparameters.py
This will:
- Run Optuna optimization for each dataset
- Save best parameters to outputs/params/best_params/
- Aggregate results to outputs/params/aggregated/all_best_tm_params.json
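The aggregation step can be pictured roughly as follows. This is a minimal sketch, not the project's code: it assumes one JSON file per dataset named after the dataset, and the function name `aggregate_best_params` and the parameter values are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def aggregate_best_params(best_dir: Path, out_file: Path) -> dict:
    """Merge every per-dataset JSON in best_dir into one dict keyed by file stem."""
    merged = {}
    for path in sorted(best_dir.glob("*.json")):
        with path.open() as fh:
            merged[path.stem] = json.load(fh)
    out_file.parent.mkdir(parents=True, exist_ok=True)
    with out_file.open("w") as fh:
        json.dump(merged, fh, indent=2, sort_keys=True)
    return merged


# Demo in a temporary directory so the real outputs/ tree is not touched.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "outputs" / "params"
    best_dir = root / "best_params"
    best_dir.mkdir(parents=True)
    # Made-up parameter values, for illustration only.
    (best_dir / "iris.json").write_text(json.dumps({"clauses": 100, "T": 50, "s": 3.0}))
    (best_dir / "wine.json").write_text(json.dumps({"clauses": 200, "T": 80, "s": 5.0}))
    out_file = root / "aggregated" / "all_best_tm_params.json"
    aggregated = aggregate_best_params(best_dir, out_file)
    aggregated_keys = sorted(aggregated)
    written = json.loads(out_file.read_text())
```

Keying the aggregate by dataset name keeps later lookups to a single dictionary access per dataset.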
Configuration: Edit the script to modify:
- Datasets to optimize
- Number of Optuna trials
- TM hyperparameters (clauses, epochs, patience)
Run the main feature selection experiments using optimized parameters:
    python 2_run_road_experiments.py
Note: This script currently uses the legacy scripts. Full refactoring into src/experiments/ is in progress.
For now, you can run the legacy scripts directly:
    python legacy/ROAD_Models/FSB_ROAD.py
    python legacy/FS_3_types/FS_KTop.py
Generate plots from experiment results:
    python legacy/FS_3_types/Replot_png_figures.py
The study uses the following datasets:
UCI/OpenML datasets:
- Iris, Wine, Breast Cancer, Digits
- Heart Disease, Pima Indians Diabetes
- Ionosphere, Sonar, Glass, Vehicle
- Steel Plates Fault, Spambase, Ecoli
- Balance Scale, Banknote Authentication
- Blood Transfusion Service Center
Synthetic datasets:
- Increasing Parity Complexity
- Hierarchical Boolean Rules
- Progressive Feature Interaction
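As an illustration of what a parity-style synthetic dataset could look like, here is a minimal sketch. It is not taken from src/utils/synthetic_datasets.py; the function name, arguments, and defaults are assumptions. The label is the XOR of the first few binary features, so the relevant features are known by construction.

```python
import numpy as np


def make_parity_dataset(n_samples=1000, n_features=10, parity_bits=3, seed=42):
    """Binary features; the label is the parity (XOR) of the first `parity_bits` columns."""
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 2, size=(n_samples, n_features), dtype=np.uint8)
    y = (X[:, :parity_bits].sum(axis=1) % 2).astype(np.uint8)
    return X, y


X, y = make_parity_dataset()
relevant = list(range(3))  # ground-truth informative feature indices
```

Raising `parity_bits` makes the target depend on a higher-order interaction, which is one way to increase difficulty for methods that score features individually.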
All outputs are organized in the outputs/ directory:
- Figures: outputs/figures/{category}/
- Parameters: outputs/params/{type}/
- Results: outputs/results/experiments/
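A centralized path helper along these lines would keep the layout above in one place. This is a sketch only: the function names are hypothetical and not the actual contents of src/utils/paths.py; the category and type names are taken from the directory tree.

```python
from pathlib import Path

OUTPUT_ROOT = Path("outputs")  # assumed to be relative to the project root

FIGURE_CATEGORIES = {"correlations", "top_k", "pruning_curves", "heatmaps", "misc"}
PARAM_TYPES = {"best_params", "aggregated", "road"}


def _resolve(path: Path, create: bool) -> Path:
    """Optionally create the directory, then return it."""
    if create:
        path.mkdir(parents=True, exist_ok=True)
    return path


def figure_dir(category: str, create: bool = False) -> Path:
    if category not in FIGURE_CATEGORIES:
        raise ValueError(f"unknown figure category: {category!r}")
    return _resolve(OUTPUT_ROOT / "figures" / category, create)


def params_dir(kind: str, create: bool = False) -> Path:
    if kind not in PARAM_TYPES:
        raise ValueError(f"unknown parameter type: {kind!r}")
    return _resolve(OUTPUT_ROOT / "params" / kind, create)


def results_dir(create: bool = False) -> Path:
    return _resolve(OUTPUT_ROOT / "results" / "experiments", create)
```

For example, `figure_dir("top_k")` resolves to `outputs/figures/top_k`, and an unknown category raises `ValueError` instead of silently creating a stray directory.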
Filter Methods:
- Mutual Information
- Chi-squared
- Variance
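The three filter scores can be computed with scikit-learn and NumPy as sketched below. The dataset (Iris) and the dictionary keys are chosen for illustration and are not the project's own identifiers. Note that `chi2` requires non-negative feature values, which holds for Iris.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import chi2, mutual_info_classif

X, y = load_iris(return_X_y=True)

# One score per feature for each filter method; higher means more important.
filter_scores = {
    "mutual_information": mutual_info_classif(X, y, random_state=42),
    "chi_squared": chi2(X, y)[0],  # chi2 returns (statistics, p-values)
    "variance": X.var(axis=0),
}

# Feature indices sorted from most to least important under each method.
filter_rankings = {name: np.argsort(-s) for name, s in filter_scores.items()}
```

Variance ignores the labels entirely, so its ranking can disagree with the two supervised scores.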
Embedded Methods (TM-specific):
- TM-Weight, TM-Weight-PosNeg
- CW-Sum, CW-Sum-PosNeg (Class-weighted)
- CW-Feat, CW-Feat-PosNeg
- Support-CW-Sum, Support-CW-Sum-PosNeg
- Margin, Margin-PosNeg
- Gini, Gini-PosNeg
Wrapper Methods:
- Dropout
- Permutation Importance
- SHAP (optional)
- LIME (optional)
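Permutation importance is available directly in scikit-learn. The sketch below uses a decision tree on the Wine dataset purely for illustration; the model, split ratio, and number of repeats are assumptions rather than the study's settings.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

model = DecisionTreeClassifier(random_state=42).fit(X_tr, y_tr)

# Shuffle each feature column on held-out data and record the drop in accuracy.
result = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=42)
perm_ranking = np.argsort(-result.importances_mean)
```

`result.importances_std` gives the spread across repeats, which helps judge whether two features are meaningfully different in rank.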
Sklearn Embedded:
- Feature importances from DecisionTree, RandomForest
- Coefficients from LogisticRegression, LinearSVM
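For the scikit-learn models, importances can be read off fitted estimators roughly as follows. Averaging absolute coefficients across classes is one common convention and an assumption here, as are the dataset and the helper name `coef_importance`.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)
# Standardize so that linear-model coefficients are comparable across features.
X_std = StandardScaler().fit_transform(X)


def coef_importance(model):
    """Mean absolute coefficient per feature, averaged over classes."""
    return np.abs(model.coef_).mean(axis=0)


embedded_scores = {
    "DecisionTree": DecisionTreeClassifier(random_state=42).fit(X, y).feature_importances_,
    "RandomForest": RandomForestClassifier(n_estimators=100, random_state=42)
    .fit(X, y)
    .feature_importances_,
    "LogisticRegression": coef_importance(LogisticRegression(max_iter=1000).fit(X_std, y)),
    "LinearSVM": coef_importance(LinearSVC(max_iter=10000, random_state=42).fit(X_std, y)),
}
```

Tree-based importances sum to one, while coefficient magnitudes do not, so scores should be compared by rank rather than by raw value.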
Evaluation protocols:
- Top-K Performance: Accuracy vs. number of top features selected
- Deletion Curve: Performance when removing top features
- Insertion Curve: Performance when adding features incrementally
- ROAR Curve: Performance after removing and retraining
- ROAD-Mask Curve: Performance after masking and retraining
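The protocols above can be sketched on a single ranking as follows. This is an illustrative reading of the one-line descriptions, not the study's implementation: the ranking method (mutual information), the classifier, and the use of the training mean as the mask value are all assumptions.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)
n_features = X.shape[1]

# Any importance ranking works here; mutual information is used as a stand-in.
ranking = np.argsort(-mutual_info_classif(X_tr, y_tr, random_state=42))


def fit_score(cols):
    """Retrain on the given columns and return test accuracy."""
    model = DecisionTreeClassifier(random_state=42).fit(X_tr[:, cols], y_tr)
    return model.score(X_te[:, cols], y_te)


# Top-K: retrain using only the k highest-ranked features.
top_k_curve = [fit_score(ranking[:k]) for k in range(1, n_features + 1)]

# ROAR: drop the k highest-ranked features and retrain on the rest.
roar_curve = [fit_score(ranking[k:]) for k in range(n_features)]

# Deletion and insertion keep one model trained on all features fixed.
full_model = DecisionTreeClassifier(random_state=42).fit(X_tr, y_tr)
train_mean = X_tr.mean(axis=0)  # assumed mask value

deletion_curve, insertion_curve = [], []
for k in range(n_features + 1):
    top = ranking[:k]
    # Deletion: overwrite the top-k features with the mask value.
    X_del = X_te.copy()
    X_del[:, top] = train_mean[top]
    deletion_curve.append(full_model.score(X_del, y_te))
    # Insertion: start fully masked and reveal only the top-k features.
    X_ins = np.tile(train_mean, (len(X_te), 1))
    X_ins[:, top] = X_te[:, top]
    insertion_curve.append(full_model.score(X_ins, y_te))
```

A good ranking should make the deletion and ROAR curves fall quickly and the Top-K and insertion curves rise quickly. A masked-and-retrained variant would combine the masking step with `fit_score`-style retraining.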
Results are saved as JSON files containing:
- Feature importance scores per method
- Evaluation metrics (accuracy, F1, balanced accuracy, MCC)
- Timing information
- Correlation matrices between methods
- Top-K performance curves
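Because scores and metrics are typically NumPy values, they need converting before `json.dump` will accept them. The helper below is a hypothetical sketch of that conversion, not the actual src/utils/serialization.py; the keys and values in the example record are made up.

```python
import json

import numpy as np


def to_jsonable(obj):
    """Recursively convert NumPy arrays and scalars to plain Python types."""
    if isinstance(obj, dict):
        return {str(k): to_jsonable(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_jsonable(v) for v in obj]
    if isinstance(obj, np.ndarray):
        return obj.tolist()
    if isinstance(obj, np.generic):
        return obj.item()
    return obj


# Illustrative record; field names and numbers are placeholders.
record = {
    "dataset": "iris",
    "scores": {"variance": np.array([0.5, 0.25, 2.0, 0.75])},
    "metrics": {"accuracy": np.float64(0.5), "n_test": np.int64(45)},
    "timing_seconds": 0.25,
}

encoded = json.dumps(to_jsonable(record), indent=2)
decoded = json.loads(encoded)
```

Converting up front keeps the saved files readable by any JSON consumer, without a custom decoder.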
Visualizations include:
- Score correlation heatmaps
- Top-K performance comparisons
- Pruning curves (deletion/insertion/ROAR/ROAD)
- Feature importance visualizations
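The matrix behind a score correlation heatmap can be computed as sketched below. Spearman rank correlation is assumed here because importance scores from different methods live on different scales; the source does not state which correlation the study uses, and the method names and scores are made up.

```python
import numpy as np
from scipy.stats import rankdata

# Made-up importance scores for four features from three hypothetical methods.
method_scores = {
    "method_a": np.array([0.9, 0.1, 0.5, 0.3]),
    "method_b": np.array([9.0, 1.0, 5.0, 3.0]),  # same ordering as method_a
    "method_c": np.array([0.1, 0.9, 0.5, 0.7]),  # reversed ordering
}

names = list(method_scores)
# Spearman correlation is the Pearson correlation of the ranks.
ranks = np.vstack([rankdata(method_scores[n]) for n in names])
corr = np.corrcoef(ranks)
```

Here `corr[0, 1]` is 1 and `corr[0, 2]` is -1, since the second method agrees with the first on ordering and the third reverses it. The resulting matrix can be passed to a heatmap function such as `seaborn.heatmap`.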
Experiments use fixed random seeds for reproducibility where supported:
- Dataset splits: random_state=42
- Model training: random_state=42 (where applicable)
- Optuna studies: use Optuna's default random state (no explicit seed), so the trial sequence may differ between runs
Hyperparameters are saved and can be reloaded to reproduce exact model configurations.
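The effect of a fixed seed on dataset splits can be checked with a short sketch like the one below. The test size and stratification are assumptions, not the study's settings. Optuna's samplers also accept a seed (for example `optuna.samplers.TPESampler(seed=42)`), which this sketch does not exercise.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

SEED = 42
X, y = load_iris(return_X_y=True)


def split(seed):
    """Return (X_train, X_test, y_train, y_test) for the given seed."""
    return train_test_split(X, y, test_size=0.3, random_state=seed, stratify=y)


first = split(SEED)
second = split(SEED)

# With the same seed, every returned array is identical across calls.
same_split = all(np.array_equal(a, b) for a, b in zip(first, second))
```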
When adding new experiments:
- Use utilities from src/utils/ for data loading and preprocessing
- Use src/utils/paths.py for output paths
- Follow the workflow: HPO → Experiments → Visualization