EofK/Benchmark-15-ML-Models-Using-159-Binary-Class-Tabular-Datasets

Benchmarking Classical Machine Learning Models on Tabular Binary Classification

This repository contains the reproducible code and results for a benchmarking study evaluating the performance of 15 classical machine learning model families across 159 tabular binary‑classification datasets. The project emphasizes methodological transparency, reproducibility, and clear documentation.

This repository includes the final report and supporting Python notebooks for the analysis and comparison of 15 machine learning models on a supervised classification task. The project addresses four core questions:

  1. Which models perform best overall?
  2. What makes datasets difficult to classify?
  3. Which models handle specific complexity types most effectively?
  4. How do accuracy and speed trade off across models?

The final report documents model performance and throughput, dataset characteristics, and tradeoffs between performance and efficiency. Using the Lorena et al. framework, the report characterizes each of the 159 datasets across 22 complexity dimensions spanning feature discriminability, class separability, geometric structure, and neighborhood cohesion. By correlating these complexity measures with the performance of 15 diverse models, the analysis identifies not only which models perform best overall, but also which algorithms are best suited to handle specific types of dataset difficulty.
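One way to read the accuracy versus speed trade-off is as a Pareto front: a model is kept only if no other model is at least as good on both axes and strictly better on one. The sketch below is illustrative only and is not taken from the report or the notebooks; the `ModelResult` type, the `pareto_front` helper, and the numbers are hypothetical placeholders.

```python
from typing import List, NamedTuple


class ModelResult(NamedTuple):
    name: str
    accuracy: float    # higher is better
    throughput: float  # predictions per second, higher is better


def pareto_front(results: List[ModelResult]) -> List[ModelResult]:
    """Return the results that no other result dominates on both metrics."""
    front = []
    for r in results:
        dominated = any(
            o.accuracy >= r.accuracy
            and o.throughput >= r.throughput
            and (o.accuracy > r.accuracy or o.throughput > r.throughput)
            for o in results
        )
        if not dominated:
            front.append(r)
    return front


# Placeholder values for illustration only, not results from the study.
demo = [
    ModelResult("model_a", 0.90, 1_000.0),
    ModelResult("model_b", 0.85, 50_000.0),
    ModelResult("model_c", 0.84, 20_000.0),  # worse than model_b on both axes
]
front_names = [r.name for r in pareto_front(demo)]
print(front_names)
```

Models on the front represent different compromises between accuracy and speed; models off the front are beaten on both counts by at least one alternative.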

Repository Structure

• notebooks

     Cleaned Jupyter notebooks used for data preparation, feature processing, model training, and model evaluation. Each notebook is numbered to reflect the recommended execution order.

• Binary Classification Final Report

     Complete research paper presenting the benchmarking study of 15 ML models on 159 binary classification tabular datasets. Includes methodology, statistical analyses, performance rankings, dataset complexity analysis, and key findings on what makes classification problems difficult.

• LICENSE

     MIT license information governing use, distribution, and modification of this repository.

• Consolidated results Excel file of model performance and dataset complexity

     Master data file containing performance results for all 2,384 model-dataset combinations. Includes accuracy/F1/AUC metrics, throughput, dataset complexity scores, and computational statistics.

• README

     This document

• requirements.txt

     Python package dependencies required to reproduce the environment used for all experiments.

Project Overview

The goal of this project is to systematically evaluate how well different classical machine learning model families perform on heterogeneous tabular datasets. The study includes:

     • 22 dataset complexity measures

     • 8 model families

     • 2,384 model–dataset evaluations

     • Accuracy and throughput metrics

     • Correlation analyses between complexity and performance

The analysis is designed to support reproducible benchmarking and to provide insight into how dataset characteristics relate to model performance.
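As a rough illustration of the correlation step listed above, the sketch below computes a Spearman rank correlation between one complexity measure and one performance metric across datasets. Spearman is an assumption here; this README does not state which correlation coefficient the report uses. The helper names and the input values are hypothetical.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg_rank
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if an input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Placeholder values for illustration only, not results from the study.
complexity = [0.1, 0.4, 0.7, 0.9]    # one complexity score per dataset
accuracy = [0.95, 0.90, 0.70, 0.60]  # one model's accuracy per dataset
rho = spearman(complexity, accuracy)
print(rho)
```

A strongly negative value would indicate that the model's accuracy tends to fall as that complexity measure rises.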

Complexity Analysis

Dataset complexity was measured using the problexity library implementation of Lorena et al. (2019) measures. See notebooks/08_calculate_dataset_complexity.ipynb for the complete calculation workflow.

The problexity library requires specific data formatting. Refer to the notebook for a working example.
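To give a sense of what a feature-based complexity measure captures, here is a minimal pure-Python sketch of F1 (maximum Fisher's discriminant ratio), assuming the definition F1 = 1 / (1 + max_i r_i), where r_i is the ratio of between-class to within-class scatter for feature i. This is not the problexity implementation and is not the code used in the notebook; the `fisher_f1` function and the toy data are hypothetical, and the handling of zero within-class scatter is a choice made for this sketch.

```python
def fisher_f1(X, y):
    """F1 complexity: values near 0 mean at least one feature separates the
    classes well; values near 1 mean every feature overlaps heavily."""
    classes = sorted(set(y))
    best_ratio = 0.0
    for i in range(len(X[0])):
        column = [row[i] for row in X]
        overall_mean = sum(column) / len(column)
        between = 0.0  # class-size-weighted spread of class means
        within = 0.0   # spread of samples around their own class mean
        for c in classes:
            vals = [v for v, label in zip(column, y) if label == c]
            class_mean = sum(vals) / len(vals)
            between += len(vals) * (class_mean - overall_mean) ** 2
            within += sum((v - class_mean) ** 2 for v in vals)
        if within == 0.0:
            if between > 0.0:
                return 0.0  # sketch-specific choice: perfectly separating feature
            continue        # constant feature carries no information
        best_ratio = max(best_ratio, between / within)
    return 1.0 / (1.0 + best_ratio)


# Toy data for illustration only: feature 0 separates the classes, feature 1 is constant.
X_demo = [[0.0, 5.0], [1.0, 5.0], [10.0, 5.0], [11.0, 5.0]]
y_demo = [0, 0, 1, 1]
f1_value = fisher_f1(X_demo, y_demo)
print(f1_value)
```

In the toy data the between-class scatter of feature 0 is 100 and its within-class scatter is 1, so the score is 1/101, which is close to 0 and indicates an easy problem under this measure.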

How to Use This Repository

  1. Clone the repository: git clone https://github.com//.git

  2. Install dependencies: pip install -r requirements.txt

  3. Open the notebooks: launch Jupyter or VS Code and run the notebooks in the order listed in the notebooks/ directory.

  4. Explore results: the Excel file in results/ contains all model–dataset outcomes used in the analysis.

Reproducibility Notes

• All experiments were run using a fixed Python environment defined in requirements.txt.

• Random seeds were set where applicable to support reproducibility.

• The 159 raw datasets used for this project are listed in Appendix D of the Final Report, including a link to each dataset. The datasets are not included in this repository.
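The seeding practice noted above can be illustrated with a small standard-library sketch. It is not the code used in the notebooks; the `seeded_split` helper, the seed value 42, and the 20% test fraction are assumptions chosen for the example. Libraries such as scikit-learn expose an analogous `random_state` parameter.

```python
import random


def seeded_split(n_rows, test_fraction=0.2, seed=42):
    """Return (train_indices, test_indices) that are identical on every run
    for the same arguments."""
    rng = random.Random(seed)  # local generator; global random state is untouched
    indices = list(range(n_rows))
    rng.shuffle(indices)
    n_test = int(round(n_rows * test_fraction))
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = seeded_split(100)
print(len(train_idx), len(test_idx))
```

Using a local `random.Random` instance rather than the module-level functions keeps the split reproducible even if other code draws random numbers in between.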

Datasets

This benchmark uses 159 tabular binary classification datasets from public repositories (UCI, OpenML, Kaggle).

A complete catalog of all 159 datasets, including source links, is provided in Appendix B of the accompanying Final Report.

Note: Datasets are not included in this repository. Users must download datasets individually from their respective sources.

Citation

If you use this repository or its results in academic work, please cite it appropriately: Ed Kaempf, Benchmarking 15 Machine Learning Models for Binary Classification: Accuracy, Complexity, and Speed, December 2025.

License

     This project is licensed under the MIT License. Refer to the “LICENSE” file for details.

Contact Information

For questions or suggestions, feel free to contact:

     • Name: Ed Kaempf

     • Email: edkaempf@gmail.com

     • GitHub: github.com/EofK

     • LinkedIn: https://www.linkedin.com/in/ed-kaempf-4887839b/
