GitHub - Isacapps/Predictive-Modeling-Student-Depression-Risk-Factors.ipynb: Predictive modeling of student depression determinants. Features an end-to-end pipeline: rigorous data preprocessing, statistical visualization (EDA), and classification via supervised Machine Learning (Scikit-Learn).

Predictive Analysis of Mental Health Determinants in Student Populations

Abstract
This research project implements a supervised machine learning framework to identify and classify depression risk factors among students. By analyzing a dataset of over 27,000 observations, the study explores the intersection of academic pressure, socio-economic status, and lifestyle habits to provide a data-driven perspective on student well-being.

Research Objectives

Risk Classification: Develop reliable classifiers that categorize individuals based on depressive symptom indicators.
Feature Criticality: Quantify the impact of variables such as 'Financial Stress', 'Academic Pressure', and 'Study Satisfaction'.
Lifestyle Correlation: Evaluate the statistical significance of sleep patterns and dietary habits in predicting mental health outcomes.
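
The README does not say which statistical test the notebook uses for the Lifestyle Correlation objective. One common choice for a categorical habit against a binary outcome is a chi-square test of independence. The sketch below is a plain-Python version for a 2x2 table, without continuity correction. The function name `chi_square_2x2` and the counts are hypothetical and do not come from the project's data.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Invented counts for illustration only (NOT taken from the project's dataset).
# Rows: short sleep, adequate sleep. Columns: depressed, not depressed.
toy_table = [[30, 20], [20, 30]]
stat, p_value = chi_square_2x2(toy_table)
print(f"chi2 = {stat:.2f}, p = {p_value:.4f}")
```

Features with more than two categories, such as the full set of sleep-duration levels, need the general form of the test with more degrees of freedom, so this closed-form p-value applies only to the 2x2 case.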

Dataset Taxonomy
The study utilizes a multi-dimensional dataset comprising 18 features across four primary domains:

Demographics: Age, Gender, City, and Profession.
Academic Metrics: Degree type, CGPA, Academic Pressure, and Study Satisfaction.
Lifestyle Factors: Sleep Duration, Dietary Habits, and Daily Study/Work Hours.
Clinical Indicators: Family History of Mental Illness, Suicidal Thoughts, and Financial Stress levels.

Methodology: Data Preprocessing
To ensure the integrity of the statistical inference, a comprehensive preprocessing pipeline was executed:

Missing Value Management: Systematic imputation or removal of incomplete records to keep the dataset complete and consistent.
Categorical Encoding: Implementation of One-Hot Encoding and Ordinal Mapping to translate qualitative descriptors (e.g., 'Sleep Duration') into computationally viable formats.
Outlier Mitigation: Statistical filtering of non-physiological values in continuous features (e.g., Age and Study Hours) to reduce model variance.
Feature Scaling: Standardizing numerical inputs to a uniform scale, preventing magnitude bias during algorithmic training.
Target Encoding: Harmonizing the 'Depression' label into a single, consistent binary target for classification.
Model Validation & Diagnostics: Confusion matrices were computed for every classifier to break predictions down into true/false positives and negatives, so that missed at-risk cases (False Negatives) are visible for each model.
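
The preprocessing steps listed above can be illustrated with a minimal, dependency-free sketch. This is not the notebook's code. The records, the sleep-category ordering in `SLEEP_ORDER`, the bounds in `AGE_RANGE`, and the `preprocess` helper are all assumptions made for illustration. In the actual project these steps would more likely rely on Pandas and Scikit-Learn (for example `OneHotEncoder` and `StandardScaler`).

```python
from statistics import mean, pstdev

# Hypothetical raw records: column names follow the README, values are invented.
records = [
    {"Age": 21, "Sleep Duration": "5-6 hours", "Gender": "Female", "Depression": "Yes"},
    {"Age": 24, "Sleep Duration": "7-8 hours", "Gender": "Male", "Depression": "No"},
    {"Age": 19, "Sleep Duration": None, "Gender": "Male", "Depression": "No"},
    {"Age": 230, "Sleep Duration": "Less than 5 hours", "Gender": "Female", "Depression": "Yes"},
    {"Age": 27, "Sleep Duration": "More than 8 hours", "Gender": "Female", "Depression": "No"},
]

# Assumed ordering of sleep categories, used for the ordinal mapping.
SLEEP_ORDER = {
    "Less than 5 hours": 0,
    "5-6 hours": 1,
    "7-8 hours": 2,
    "More than 8 hours": 3,
}
AGE_RANGE = (15, 60)  # assumed plausible bounds for a student population


def preprocess(rows):
    # 1. Missing values: drop records with any empty field (removal strategy).
    complete = [r for r in rows if all(v is not None for v in r.values())]
    # 2. Outliers: discard implausible ages.
    plausible = [r for r in complete if AGE_RANGE[0] <= r["Age"] <= AGE_RANGE[1]]
    # 3. Scaling statistics (population std, the convention StandardScaler uses).
    ages = [r["Age"] for r in plausible]
    mu, sigma = mean(ages), pstdev(ages)
    genders = sorted({r["Gender"] for r in plausible})
    X, y = [], []
    for r in plausible:
        features = [(r["Age"] - mu) / sigma if sigma else 0.0]  # standardized Age
        features.append(SLEEP_ORDER[r["Sleep Duration"]])  # ordinal mapping
        features.extend(1 if r["Gender"] == g else 0 for g in genders)  # one-hot
        X.append(features)
        y.append(1 if r["Depression"] == "Yes" else 0)  # binary target
    return X, y


X, y = preprocess(records)
for row, label in zip(X, y):
    print(row, label)
```

In a real train/test workflow the scaling statistics would be computed on the training split only and then reused on the test split, to avoid leaking information from the test data.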

Technical Framework

Language: Python
Key Libraries: Pandas (Data Manipulation), Scikit-Learn (Machine Learning), Seaborn & Matplotlib (Exploratory Data Analysis).
Evaluation Metrics: While accuracy was monitored, the primary optimization focused on Recall and F1-Score.
- Recall Optimization: Critical for clinical screening to minimize Type II errors (False Negatives), ensuring that students at risk of depression are correctly identified.
- Confusion Matrix Analysis: Used as the primary diagnostic tool to evaluate the trade-off between sensitivity and specificity for each implemented model.
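
To make the metric definitions concrete, here is a small plain-Python sketch that derives recall (sensitivity), specificity, precision, F1 and accuracy from confusion-matrix counts. The labels and predictions are invented, and the helper names `confusion_counts` and `screening_metrics` are hypothetical. Scikit-Learn provides equivalent functions in `sklearn.metrics` (`confusion_matrix`, `recall_score`, `f1_score`), which the notebook presumably uses.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels where 1 = depressed."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def screening_metrics(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    recall = tp / (tp + fn) if tp + fn else 0.0        # sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(y_true)
    return {
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "accuracy": accuracy,
    }


# Invented labels for illustration: 4 at-risk students, 6 not at risk.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

metrics = screening_metrics(y_true, y_pred)
print(confusion_counts(y_true, y_pred))
print(metrics)
```

In this toy case accuracy is 0.80 while recall is 0.75: one of the four at-risk students is missed, which is the Type II error that a screening-oriented evaluation tries to minimize.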

Project Deliverables

Analysis Notebook: Python implementation of the end-to-end ML pipeline.
Research Presentation: Detailed slide deck summarizing findings and clinical insights.
Visual Assets: High-resolution infographics and correlation matrices.

Authors

Isabella Cappiello
Nathan Dubourg
Romain Sartori

