saad-sharif/Credit-Risk-Dataset

Credit Risk Classification – Predictive Modeling & Evaluation

Project Overview

This project analyzes a credit risk dataset to understand how customer attributes relate to credit outcomes, and builds classification models that predict a binary credit risk target.

The analysis is structured in two main stages:

  1. Exploratory Data Analysis (EDA) to examine variable distributions, relationships, and data quality
  2. Supervised classification modeling to assess predictive performance using multiple machine learning algorithms

Objectives

  • Perform exploratory data analysis to understand feature distributions and relationships
  • Handle missing values and mixed data types
  • Build and compare multiple classification models:
    • Logistic Regression
    • Random Forest
    • Support Vector Machine (SVM)
    • Linear SVM
  • Evaluate model performance using appropriate classification metrics
  • Identify the most effective model for credit risk prediction

Dataset

Data Source

  • credit_risk_dataset_classification.csv

The dataset contains customer-level financial and demographic features along with a binary target variable representing credit risk.
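
A first look at data quality might resemble the sketch below, which reports dtype, missing values, and cardinality per column. The toy frame and its column names (`income`, `home_ownership`, `default`) are hypothetical stand-ins, since the real schema is not documented here; in practice the frame would come from reading `credit_risk_dataset_classification.csv`.

```python
import numpy as np
import pandas as pd


def summarize_quality(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column: dtype, missing count, missing share, cardinality."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "n_missing": df.isna().sum(),
        "pct_missing": df.isna().mean().round(3),
        "n_unique": df.nunique(dropna=True),
    })


# Hypothetical toy data; column names are assumptions, not the real schema.
# With the actual file: df = pd.read_csv("credit_risk_dataset_classification.csv")
toy = pd.DataFrame({
    "income": [52000.0, np.nan, 61000.0, 38000.0],
    "home_ownership": ["RENT", "OWN", None, "RENT"],
    "default": [0, 1, 0, 1],
})

report = summarize_quality(toy)
print(report)
```

A summary like this makes it easier to decide which columns need imputation and which should be treated as categorical.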


Technologies & Libraries Used

This project is implemented in Python using the following libraries:

Core Data Handling

  • numpy
  • pandas

Visualization

  • matplotlib
  • seaborn

Preprocessing & Pipelines

  • scikit-learn
    • ColumnTransformer
    • Pipeline
    • OneHotEncoder
    • StandardScaler
    • SimpleImputer
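
One plausible way to combine the components listed above is sketched here: numeric columns are median-imputed and scaled, categorical columns are mode-imputed and one-hot encoded. The imputation strategies, the helper name `build_preprocessor`, and the column names are assumptions for illustration, not necessarily the choices made in this project.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_preprocessor(numeric_cols, categorical_cols):
    """Build a ColumnTransformer that imputes and encodes mixed-type columns."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),   # assumed strategy
        ("scale", StandardScaler()),
    ])
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),  # assumed strategy
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ])
    return ColumnTransformer([
        ("num", numeric, numeric_cols),
        ("cat", categorical, categorical_cols),
    ])


# Hypothetical toy data with missing values in both column types.
toy = pd.DataFrame({
    "income": [52000.0, np.nan, 61000.0, 38000.0],
    "home_ownership": pd.Series(["RENT", "OWN", np.nan, "RENT"], dtype="object"),
})

pre = build_preprocessor(["income"], ["home_ownership"])
X = pre.fit_transform(toy)
# Densify in case the transformer returns a sparse matrix.
X_dense = X.toarray() if hasattr(X, "toarray") else np.asarray(X)
print(X_dense.shape)
```

Keeping preprocessing inside a pipeline means the imputation and scaling statistics are learned only from training folds during cross-validation, which avoids leakage.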

Modeling

  • LogisticRegression
  • RandomForestClassifier
  • SVC
  • LinearSVC
  • DecisionTreeClassifier
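
A comparison across these estimators could look roughly like the following sketch. It uses synthetic data from `make_classification` in place of the real dataset, and `cross_val_score` with ROC-AUC scoring as an assumed comparison procedure; the hyperparameters shown are placeholders rather than tuned values.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the credit risk data (assumption).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

# Scale inputs for the linear and kernel models; tree models do not need it.
models = {
    "logreg": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "svc_rbf": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "linear_svc": make_pipeline(StandardScaler(), LinearSVC(max_iter=5000)),
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=0),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = {
    name: float(cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean())
    for name, model in models.items()
}

for name, auc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:15s} mean ROC-AUC = {auc:.3f}")
```

Using the same stratified folds for every model keeps the comparison fair, since each estimator is scored on identical train/validation splits.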

Model Selection & Evaluation

  • train_test_split
  • StratifiedKFold
  • GridSearchCV
  • ROC-AUC
  • Precision-Recall AUC
  • F1-score
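
The selection and evaluation tools above might fit together as in this sketch: a stratified hold-out split, a grid search over stratified folds, and the three metrics computed on the held-out set. The data is synthetic, the parameter grid and 0.5 threshold are assumptions, and `average_precision_score` is used as a common summary of the precision-recall curve, which may differ slightly from a trapezoidal PR-AUC.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, f1_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for the credit risk data (assumption).
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=4,
    weights=[0.8, 0.2], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])
grid = GridSearchCV(
    pipe,
    param_grid={"clf__C": [0.01, 0.1, 1.0, 10.0]},  # illustrative grid
    scoring="roc_auc",
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
)
grid.fit(X_train, y_train)

# Score the refit best estimator on data it never saw during the search.
proba = grid.predict_proba(X_test)[:, 1]
pred = (proba >= 0.5).astype(int)  # assumed decision threshold
metrics = {
    "roc_auc": float(roc_auc_score(y_test, proba)),
    "pr_auc": float(average_precision_score(y_test, proba)),
    "f1": float(f1_score(y_test, pred)),
}
print(grid.best_params_, metrics)
```

On imbalanced targets such as default status, the precision-recall summary and F1 tend to be more informative than ROC-AUC alone, because they focus on the minority class.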

Statistical & Plotting Utilities

  • scipy.stats (loguniform)
  • matplotlib.ticker
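
As a brief illustration of why `loguniform` is useful for hyperparameters that span several orders of magnitude, the sketch below draws candidate values for a regularization strength. The bounds `1e-3` and `1e2` and the idea of sampling `C` this way are assumptions; how the project actually uses the distribution is not stated here.

```python
import numpy as np
from scipy.stats import loguniform

# Log-uniform over five decades: each decade is equally likely to be sampled.
dist = loguniform(1e-3, 1e2)  # assumed bounds for a parameter such as C
samples = dist.rvs(size=1000, random_state=0)

# Three of the five decades lie below 1.0, so roughly 60% of draws should too.
share_below_one = float(np.mean(samples < 1.0))
print(f"min={samples.min():.4g} max={samples.max():.4g} "
      f"share below 1.0={share_below_one:.2f}")
```

A plain uniform distribution over the same range would place almost all draws above 1.0, leaving small values essentially unexplored.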

About

An analysis of a credit risk dataset that applies several supervised learning models to predict a binary default-status variable.
