This project focuses on predicting natural disasters using machine learning techniques. It includes data collection, preprocessing, model development, evaluation, tuning, and ensemble techniques.
This machine learning project, entitled "Sustainable Futures through Natural Disaster Prediction," aims to promote resilience and sustainable practices globally. Its primary goal is to support sustainability: by employing predictive techniques, we strive to reduce the detrimental impact that disasters have on ecosystems, natural resources, and the environment. This objective aligns with worldwide efforts to address climate change. Machine learning models such as K-Nearest Neighbors (K-NN), Random Forest, Support Vector Machine (SVM), and Naive Bayes are used as early warning indicators, facilitating proactive disaster preparedness and response strategies. By analyzing historical data on diverse calamities such as earthquakes, hurricanes, floods, and wildfires, and by applying feature engineering and machine learning algorithms, we aim to develop precise prediction models that offer practical insights for effective action.
Languages: Python
Technology: Machine Learning
Libraries: pandas, scikit-learn, NumPy, Seaborn, Matplotlib
Tools: Jupyter Notebook, GitHub
Data Sources: Historical disaster data
Project_Report.pdf
Industry_CaseStudy.pd
Significant_paper_slides.pdf
natural_disasters_dataset.csv
preprocessed_data.csv
Complete_code.ipynb
Complete_code.py
new_unseen_dataset.csv
Testing_the_saved_model.ipynb
Testing_the_saved_model.py
https://drive.google.com/drive/u/1/folders/1ND3XnuUrvSIkmtFcO-zI6ovv2ykQyWp3
1. Data: Obtained a diverse dataset of historical natural disasters from Kaggle. The metadata provides information such as timestamps, location details, and event characteristics.
2. Exploratory Data Analysis (EDA): Visualized and interpreted the data using histograms, bar graphs, and time series analyses in a collaborative EDA. The goal was to identify distribution patterns, correlations, outliers, and other details related to probable disaster situations. This EDA was performed before data preprocessing.
3. Data Preprocessing: Checked for missing values and handled them by imputing numerical values with the mean and categorical values with the mode, then encoded the categorical variables. Performed EDA again after preprocessing to gain further insights into the data.
4. Feature Selection: Selected features based on mutual information and domain knowledge, and saved them to the preprocessed_data.csv file. Six features were selected, along with one target variable, Disaster Type.
5. Model Development: Chose four machine learning models to match the distinctive characteristics of natural disasters and predict them accurately: Random Forest, SVM, K-NN, and Naive Bayes.
6. Model Evaluation: Evaluated the models using accuracy, precision, recall, and F1 score.
7. Balancing the Data: Balanced the data, then repeated model development and evaluation to check the performance of the chosen models. Implemented soft and hard voting ensemble techniques.
8. Hyperparameter Tuning: Conducted hyperparameter tuning using GridSearchCV and visualized the performance metrics with a grouped bar chart.
9. Model Deployment: Deployed the model using Joblib and tested it on new, unseen data.
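The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code (see Complete_code.ipynb for that). It runs on synthetic data, and the column names, the choice of `GaussianNB` as the Naive Bayes variant, the parameter grid, and the file name `model.joblib` are all assumptions made for the example. The balancing step is left out because the method used is not stated here.

```python
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_selection import mutual_info_classif
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn.svm import SVC

# Synthetic stand-in for the disaster dataset (hypothetical columns).
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "magnitude": rng.normal(5.0, 1.5, n),
    "duration_days": rng.integers(1, 30, n).astype(float),
    "region": rng.choice(["Asia", "Europe", "Americas"], n),
})
df["disaster_type"] = np.where(df["magnitude"] > 5.0, "Earthquake", "Flood")
df.loc[::25, "magnitude"] = np.nan  # inject missing numerical values
df.loc[::40, "region"] = np.nan     # inject missing categorical values

# Step 3: impute numerical columns with the mean and categorical columns
# with the mode, then label-encode the categorical columns.
num_cols, cat_cols = ["magnitude", "duration_days"], ["region"]
df[num_cols] = df[num_cols].fillna(df[num_cols].mean())
for col in cat_cols:
    df[col] = df[col].fillna(df[col].mode()[0])
    df[col] = LabelEncoder().fit_transform(df[col])

# Step 4: rank features by mutual information with the target.
X, y = df[num_cols + cat_cols], df["disaster_type"]
mi = pd.Series(mutual_info_classif(X, y, random_state=0), index=X.columns)
print(mi.sort_values(ascending=False))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Step 5: the four model families named in the README.
models = {
    "rf": RandomForestClassifier(n_estimators=50, random_state=0),
    "svm": SVC(probability=True, random_state=0),  # probabilities for soft voting
    "knn": KNeighborsClassifier(),
    "nb": GaussianNB(),  # assumed Naive Bayes variant
}
# Step 7 (ensembles only): soft and hard voting over the four models.
estimators = list(models.items())
models["soft_vote"] = VotingClassifier(estimators, voting="soft")
models["hard_vote"] = VotingClassifier(estimators, voting="hard")


def evaluate(model):
    """Step 6: fit on the training split and score on the test split."""
    pred = model.fit(X_train, y_train).predict(X_test)
    kw = {"average": "weighted", "zero_division": 0}
    return {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred, **kw),
        "recall": recall_score(y_test, pred, **kw),
        "f1": f1_score(y_test, pred, **kw),
    }


scores = {name: evaluate(model) for name, model in models.items()}
print(pd.DataFrame(scores).T.round(3))

# Step 8: grid search over a small, illustrative parameter grid.
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    {"n_estimators": [25, 50], "max_depth": [None, 5]},
    cv=3,
    scoring="f1_weighted",
)
grid.fit(X_train, y_train)
best = grid.best_estimator_

# Step 9: persist the tuned model with joblib and reload it for prediction.
model_path = os.path.join(tempfile.mkdtemp(), "model.joblib")
joblib.dump(best, model_path)
reloaded = joblib.load(model_path)
print(reloaded.predict(X_test[:5]))
```

The table built from `scores` has one row per model and one column per metric, so calling `.plot.bar()` on that DataFrame would give a grouped bar chart similar to the one described in step 8.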

