rukeshsg/Predictive-Modeling-Using-Machine-Learning


🚀 Predictive Modeling Using Machine Learning


📌 Project Overview

This project builds six supervised machine learning classifiers, each trained on a different dataset, to predict outcomes such as passenger survival, loan approval, and spam. It demonstrates the complete supervised learning workflow: data preprocessing, model training, evaluation, and visualization.


🎯 Objectives

  • Apply various machine learning algorithms
  • Train and test models on real-world datasets
  • Evaluate performance using metrics and visualizations
  • Gain hands-on experience in supervised learning

🧠 Algorithms Used

| Algorithm           | Dataset       | Task                  |
|---------------------|---------------|-----------------------|
| Logistic Regression | Titanic       | Survival Prediction   |
| Decision Tree       | Heart Disease | Disease Detection     |
| Random Forest       | Loan Dataset  | Loan Approval         |
| SVM                 | Diabetes      | Diabetes Prediction   |
| KNN                 | Iris          | Flower Classification |
| Naive Bayes         | Spam          | Spam Detection        |

📁 Project Structure

ML-Predictive-Modeling/
│
├── notebooks/
│   ├── 01_logistic_regression_titanic.ipynb
│   ├── 02_decision_tree_heart.ipynb
│   ├── 03_random_forest_loan.ipynb
│   ├── 04_svm_diabetes.ipynb
│   ├── 05_knn_iris.ipynb
│   ├── 06_naive_bayes_spam.ipynb
│
├── datasets/
│   ├── Titanic-Dataset.csv
│   ├── heart.csv
│   ├── diabetes.csv
│   ├── spam.csv
│   ├── train_u6lujuX_CVtuZ9i.csv
│
├── README.md
└── requirements.txt

⚙️ Technologies Used

  • Python
  • Pandas
  • NumPy
  • Matplotlib
  • Seaborn
  • Scikit-learn

🔄 Workflow

  1. Data Loading
  2. Data Cleaning & Preprocessing
  3. Feature Selection & Encoding
  4. Train-Test Split
  5. Model Training
  6. Prediction
  7. Evaluation
  8. Visualization
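The steps above can be sketched end to end in a few lines. This is a minimal illustration, not the notebooks' actual code: it assumes the Iris data bundled with scikit-learn (so no CSV path is needed), a KNN classifier with `n_neighbors=5`, and an 80/20 stratified split. The visualization step is left out to keep the sketch short.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# 1. Data loading (assumption: the copy of Iris that ships with scikit-learn)
iris = load_iris(as_frame=True)
df = iris.frame  # four feature columns plus a "target" column

# 2. Cleaning: drop rows with missing values (Iris has none; shown for completeness)
df = df.dropna()

# 3. Feature selection: every column except the label
X = df.drop(columns="target")
y = df["target"]

# 4. Train-test split, stratified so each class keeps its proportion
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scale with statistics learned from the training split only
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# 5. Model training
model = KNeighborsClassifier(n_neighbors=5).fit(X_train_scaled, y_train)

# 6. Prediction
y_pred = model.predict(X_test_scaled)

# 7. Evaluation
accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {accuracy:.3f}")
```

Fitting the scaler on the training split alone keeps information from the test rows out of training.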

📊 Model Evaluation

Each model is evaluated using:

  • Accuracy Score
  • Confusion Matrix
  • Classification Report
  • ROC Curve (true-positive rate vs. false-positive rate across decision thresholds)
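The four metrics listed above are all available in `sklearn.metrics`. The sketch below is illustrative only: it assumes a synthetic binary dataset from `make_classification` and a logistic regression model, neither of which comes from the notebooks.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    confusion_matrix,
    roc_auc_score,
    roc_curve,
)
from sklearn.model_selection import train_test_split

# Assumption: synthetic binary data stands in for a real dataset
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = clf.predict(X_test)                 # hard class labels
y_score = clf.predict_proba(X_test)[:, 1]    # probability of the positive class

acc = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)        # rows: actual, columns: predicted
report = classification_report(y_test, y_pred)

# The ROC curve needs scores, not hard labels
fpr, tpr, thresholds = roc_curve(y_test, y_score)
auc = roc_auc_score(y_test, y_score)

print(f"Accuracy: {acc:.3f}")
print("Confusion matrix:\n", cm)
print(report)
print(f"ROC AUC: {auc:.3f}")
```

The ROC curve as computed here applies to binary problems; a multiclass task such as Iris would need a one-vs-rest treatment.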

📈 Visualizations

🔥 Heatmap (Correlation Matrix)

Used to understand relationships between features.
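A correlation heatmap can be sketched as follows. This assumes the Iris features bundled with scikit-learn and draws with plain Matplotlib. The README lists Seaborn, where `seaborn.heatmap(corr, annot=True)` would be the usual shorter form.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

# Assumption: Iris features stand in for any numeric DataFrame
df = load_iris(as_frame=True).data
corr = df.corr()  # pairwise Pearson correlation between columns

fig, ax = plt.subplots(figsize=(5, 4))
im = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)

# Label both axes with the feature names
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns, rotation=45, ha="right")
ax.set_yticks(range(len(corr.columns)))
ax.set_yticklabels(corr.columns)

fig.colorbar(im, ax=ax, label="Pearson correlation")
fig.tight_layout()
```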

📊 Confusion Matrix

Shows predicted vs actual classification results.

📈 ROC Curve

Evaluates model performance across thresholds.

🌲 Feature Importance (Random Forest)

Displays the most important features affecting predictions.
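As a rough illustration, a fitted `RandomForestClassifier` exposes impurity-based importances through its `feature_importances_` attribute. The data and the feature names (`feature_0` … `feature_5`) below are hypothetical placeholders, not columns from the loan dataset.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Assumption: synthetic data with hypothetical feature names
X, y = make_classification(
    n_samples=400, n_features=6, n_informative=3, n_redundant=0, random_state=1
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

forest = RandomForestClassifier(n_estimators=200, random_state=1).fit(X, y)

# Importances sum to 1; sort so the most influential features come first
importances = pd.Series(
    forest.feature_importances_, index=feature_names
).sort_values(ascending=False)
print(importances)
```

Calling `importances.plot.barh()` would turn the sorted series into a bar chart.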


🔮 Sample Predictions

Each notebook ends with sample predictions on new inputs, showing:

  • Input data
  • Model output
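A sample prediction might look like the sketch below. It assumes a KNN model trained on the Iris data bundled with scikit-learn. The input row is a hypothetical measurement, and the notebooks may present their predictions differently.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris(as_frame=True)
model = KNeighborsClassifier(n_neighbors=5).fit(iris.data, iris.target)

# Input data: one flower, using the same column names the model was trained on
sample = pd.DataFrame([[5.1, 3.5, 1.4, 0.2]], columns=iris.data.columns)

# Model output: map the predicted class index back to a species name
predicted_class = model.predict(sample)[0]
predicted_name = iris.target_names[predicted_class]

print("Input data:", sample.iloc[0].to_dict())
print("Model output:", predicted_name)
```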

💡 Key Learnings

  • Handling missing data and preprocessing techniques
  • Encoding categorical variables
  • Understanding different ML algorithms
  • Model evaluation and performance comparison
  • Working with real-world datasets
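The first two points, handling missing data and encoding categorical variables, can be combined in one scikit-learn preprocessing step. The small DataFrame and its column names below are hypothetical, chosen only to resemble passenger-style data. Median and most-frequent imputation are one reasonable choice, not necessarily what the notebooks use.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data with gaps in both numeric and categorical columns
df = pd.DataFrame({
    "age": [22.0, np.nan, 35.0, 58.0],
    "fare": [7.25, 71.28, np.nan, 26.55],
    "sex": ["male", "female", "female", "male"],
    "port": ["S", "C", np.nan, "S"],
})
numeric_cols = ["age", "fare"]
categorical_cols = ["sex", "port"]

preprocess = ColumnTransformer(
    [
        # Numeric gaps are filled with the column median
        ("num", SimpleImputer(strategy="median"), numeric_cols),
        # Categorical gaps are filled with the most frequent value, then one-hot encoded
        ("cat", Pipeline([
            ("impute", SimpleImputer(strategy="most_frequent")),
            ("encode", OneHotEncoder(handle_unknown="ignore")),
        ]), categorical_cols),
    ],
    sparse_threshold=0,  # always return a dense array
)

features = preprocess.fit_transform(df)
print(features.shape)
```

Wrapping these steps in a `ColumnTransformer` means the same imputation values and category set learned from the training data are reused when transforming new rows.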

🚀 How to Run

  1. Clone the repository
  2. Install dependencies:
     pip install -r requirements.txt
  3. Open the notebooks:
     jupyter notebook

📌 Conclusion

This project walks through the supervised machine learning workflow end to end and applies six algorithms to six different datasets.


📄 License

This project is licensed under the Apache 2.0 License. Developed as part of a learning and internship experience.
