Predictive Maintenance with Custom Decision Trees

Overview

This repository demonstrates the application of supervised machine learning techniques, focusing on decision trees and ensemble methods, to predict equipment failures using historical sensor data.

Companion Article: Decision Trees and Predictive Maintenance

Repository Structure

sklearn_trees.ipynb – Implements standard machine learning models using scikit-learn, including Decision Tree, Random Forest, and Gradient Boosting classifiers.

custom_trees.ipynb – Develops custom decision tree classifiers from scratch, utilizing entropy-based splits and information gain.

data/maintenance.csv – Contains 500 records of machine sensor readings and operational metrics, labeled with failure outcomes.

Methodology

Data Preprocessing:

Transformed failure outcomes into a binary variable indicating the occurrence of failure within a future window.

Applied one-hot encoding for categorical variables and standardized numerical features.

Addressed class imbalance using SMOTE (Synthetic Minority Oversampling Technique) to generate synthetic samples of the minority failure class.
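The encoding, scaling, and oversampling steps above can be sketched with pandas and NumPy alone. This is a minimal illustration, not the notebooks' code: the column names machine_type and torque, the toy values, and the helpers preprocess and smote_like are all hypothetical, and smote_like is a simplified stand-in for SMOTE (in practice imbalanced-learn's SMOTE class would normally be used).

```python
import numpy as np
import pandas as pd


def preprocess(df, categorical, numerical):
    """One-hot encode the categorical columns and z-score the numerical ones."""
    encoded = pd.get_dummies(df[categorical], dtype=float)
    numeric = df[numerical].astype(float)
    scaled = (numeric - numeric.mean()) / numeric.std(ddof=0)
    return pd.concat([scaled, encoded], axis=1)


def smote_like(X_minority, n_new, k=2, seed=0):
    """Simplified SMOTE-style oversampling: each synthetic row lies on the
    segment between a minority row and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X_minority, dtype=float)
    # Pairwise Euclidean distances; a row is never its own neighbour.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbours = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, len(X), size=n_new)
    partner = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation factor in [0, 1)
    return X[base] + gap * (X[partner] - X[base])


# Hypothetical toy data; the real columns in data/maintenance.csv may differ.
df = pd.DataFrame({
    "machine_type": ["L", "M", "H", "L", "M", "H", "L", "M"],
    "torque": [40.0, 42.5, 39.0, 61.0, 41.0, 63.5, 38.5, 60.0],
    "failure": [0, 0, 0, 1, 0, 1, 0, 1],
})
X = preprocess(df, categorical=["machine_type"], numerical=["torque"])
y = df["failure"].to_numpy()

# Three failures plus two synthetic rows balance the five non-failures.
minority = X[y == 1].to_numpy()
synthetic = smote_like(minority, n_new=2)
X_balanced = np.vstack([X.to_numpy(), synthetic])
y_balanced = np.concatenate([y, np.ones(len(synthetic), dtype=int)])
print(X_balanced.shape, y_balanced.mean())
```

Two caveats apply to this sketch. Interpolating one-hot columns can produce fractional category values, which is why imbalanced-learn offers SMOTENC for mixed data. Scaling statistics and synthetic samples should also be derived from the training split only, so that the test split stays untouched.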

Model Training and Evaluation:

Trained various classifiers, including:

Decision Tree (scikit-learn)
Random Forest
Gradient Boosting
AdaBoost
XGBoost
LightGBM
CatBoost
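A training loop over several of these classifiers might look like the sketch below. It covers only the four scikit-learn models and runs on synthetic data from make_classification as a stand-in for the sensor records; the hyperparameters are illustrative assumptions, not the settings used in the notebooks. XGBoost, LightGBM, and CatBoost expose scikit-learn-style fit/predict estimators and could be added to the same dictionary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the sensor data (about 10% failures).
X, y = make_classification(n_samples=500, n_features=6, n_informative=4,
                           weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Illustrative settings only.
models = {
    "Decision Tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
}

# Fit each model on the training split and record test-set accuracy.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

for name, accuracy in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name:18s} accuracy={accuracy:.2f}")
```

The stratify=y argument keeps the failure rate similar in both splits, which matters when the positive class is rare.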

Evaluated models using metrics such as accuracy, precision, recall, and F1-score.
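As a reminder of how these four metrics relate, the sketch below computes them from confusion-matrix counts for the failure class. The counts are made up for illustration and are not results from this repository.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 for the positive (failure) class."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts: 12 real failures among 100 machines, 8 of them caught.
metrics = classification_metrics(tp=8, fp=2, fn=4, tn=86)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
# accuracy: 0.940
# precision: 0.800
# recall: 0.667
# f1: 0.727
```

In this example accuracy is 94% even though a third of the failures are missed, which is why precision, recall, and F1 on the minority class are worth reporting alongside accuracy.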

Custom Model Development:

Built decision tree classifiers from the ground up in custom_trees.ipynb, focusing on interpretability and domain-specific insights.

Benchmarked the custom models against the scikit-learn implementations on the same evaluation metrics to compare their performance.
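The entropy-based splitting used by the custom trees can be illustrated with a few lines of plain Python. This is a sketch of the standard definitions, not the code in custom_trees.ipynb; the function names and the operating-hours toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting on values <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty gains nothing
    n = len(labels)
    children = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - children


def best_threshold(values, labels):
    """Midpoint between sorted distinct values with the highest gain.
    Assumes the feature has at least two distinct values."""
    distinct = sorted(set(values))
    midpoints = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(midpoints, key=lambda t: information_gain(values, labels, t))


# Hypothetical toy feature: operating hours and whether the machine failed.
hours = [120, 340, 560, 900, 1500, 2100]
failed = [0, 0, 0, 1, 1, 1]

threshold = best_threshold(hours, failed)
print(threshold, information_gain(hours, failed, threshold))  # 730.0 1.0
```

A tree builder applies this search to every feature at each node, splits on the feature and threshold with the highest gain, and recurses on the two child subsets until a stopping rule such as maximum depth is met.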

Results

Model Accuracy

Model	Accuracy
Custom Gradient Boosting	0.96
Decision Tree (scikit-learn)	0.92
CatBoost	0.86
XGBoost	0.85
LightGBM	0.84
Gradient Boosting	0.83
Random Forest	0.81
AdaBoost	0.79

The custom gradient boosting model outperformed all other models, achieving an accuracy of 96%. Notably, it demonstrated superior precision and recall for the minority failure class, indicating its effectiveness in detecting rare but critical failure events.

Feature Importance

The custom gradient boosting model identified the following features as most influential in predicting equipment failure:

Feature	Importance
Operational Hours	0.5427
Air Temperature	0.1312
Rotational Speed (rpm)	0.1034
Process Temperature (K)	0.1022
Torque (Nm)	0.0575
Vibration Levels	0.0562

Understanding feature importance aids in pinpointing key factors contributing to equipment failures, facilitating targeted maintenance strategies.
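For reference, a ranking like the one above can be produced from any fitted scikit-learn tree ensemble through its feature_importances_ attribute. The sketch below uses scikit-learn's GradientBoostingClassifier on synthetic data with placeholder feature names; how the custom model computes its importances is not stated here, so this is an assumption about the general approach rather than a reproduction of it.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic stand-in data; the feature names are placeholders.
X, y = make_classification(n_samples=500, n_features=6, n_informative=3,
                           n_redundant=0, random_state=0)
names = [f"sensor_{i}" for i in range(X.shape[1])]

model = GradientBoostingClassifier(random_state=0).fit(X, y)

# Impurity-based importances, normalized by scikit-learn to sum to 1.
ranking = sorted(zip(names, model.feature_importances_),
                 key=lambda pair: pair[1], reverse=True)
for name, importance in ranking:
    print(f"{name:10s} {importance:.4f}")
```

Impurity-based importances reflect how much each feature reduced the training loss inside the trees, so they describe the fitted model rather than a causal effect on failures.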

Interpretability and Deployment

Visualization: Utilized plot_tree() from scikit-learn to visualize decision paths in the custom decision tree, enhancing interpretability.

SHAP Analysis: Applied SHAP (SHapley Additive exPlanations) to the custom gradient boosting model to elucidate feature contributions on a per-instance basis.

Deployment: While not covered in this repository, models can be integrated into maintenance systems using platforms like Streamlit or Gradio for interactive user interfaces.
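For the visualization step, a text rendering of the decision paths is often a useful complement to the plot. The sketch below uses scikit-learn's export_text on a small tree trained on synthetic data, with placeholder feature names. Both export_text and plot_tree expect a fitted scikit-learn tree estimator, so a from-scratch tree would need its own traversal code to produce similar output.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in data; the feature names are placeholders.
X, y = make_classification(n_samples=200, n_features=4, n_informative=2,
                           n_redundant=0, random_state=0)
names = ["feature_a", "feature_b", "feature_c", "feature_d"]

# A shallow tree keeps the printed rules short enough to read.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

rules = export_text(tree, feature_names=names)
print(rules)
```

Passing the same fitted estimator to sklearn.tree.plot_tree, with matplotlib installed, draws the equivalent diagram.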

Getting Started

Clone the repository:

git clone https://github.com/4CDA/predictive_maintenance_decisiontrees.git
cd predictive_maintenance_decisiontrees

Install dependencies:

pip install -r requirements.txt

Run notebooks:

Open sklearn_trees.ipynb or custom_trees.ipynb in Jupyter Notebook or JupyterLab.
