Skip to content

Gowri-88/Transaction-Fraud-Risk-Intelligence-Analytics

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Transaction Fraud Risk Intelligence Analytics

An end-to-end fraud detection pipeline that turns raw transaction data into actionable risk intelligence — built with SQL and Power BI.


What This Project Does

Financial institutions deal with millions of transactions daily, and manual fraud review doesn't scale. This project builds a rule-based risk scoring system on top of the BankSim synthetic dataset to:

  • Flag suspicious transactions using behavioral signals
  • Segment them into risk tiers (High / Medium / Low)
  • Measure how well the model performs
  • Quantify the financial value of catching fraud early

The end result is a 4-page interactive Power BI dashboard covering model performance, operational prioritization, and ROI.


Dashboard Preview

Executive Overview Model Performance
Executive Overview Model Performance
Operational Queue Financial Impact
Operational Queue Financial Impact

Key Metrics

Metric Value
Total Transactions 21,051
Fraud Transactions 226
Fraud Rate 1.1%
Estimated Fraud Loss $137.84K
Fraud Capture Rate 55.8%
Net Benefit (after review cost) $135.57K
ROI 5959%

Tech Stack

  • Database: MySQL
  • Feature Engineering & Scoring: SQL
  • Visualization: Power BI
  • Dataset: BankSim (synthetic transaction data)

How It Works

Pipeline Overview

CSV Dataset
    ↓
MySQL (banksim_raw)
    ↓
Feature Engineering
    ├── customer_velocity
    ├── customer_spend_baseline
    └── category_risk
    ↓
Risk Scoring Table (transaction_risk_score)
    ↓
Power BI Dashboard

Risk Scoring Logic

Each transaction gets scored across three dimensions:

Velocity Risk — how often is this customer transacting?

  • 5+ tx/day → 30 pts
  • 3+ tx/day → 20 pts
  • Otherwise → 5 pts

Amount Deviation Risk — is this spend unusual for this customer?

  • ≥ 2× their average → 40 pts
  • ≥ 1.5× their average → 25 pts
  • Otherwise → 5 pts

Category Risk — is this a high-fraud category?

  • Fraud rate ≥ 10% → 30 pts
  • Fraud rate ≥ 5% → 20 pts
  • Otherwise → 5 pts

Final Score = velocity + amount + category

Risk Tiers:

  • 🔴 High → Score ≥ 85
  • 🟡 Medium → Score ≥ 55
  • 🟢 Low → Everything else

Model Performance

Evaluated entirely in SQL:

  • Fraud Capture Rate: 55.8%
  • Precision (High Risk): 27.69%
  • False Positive Rate: 72.3%

The High Risk tier accounts for a small share of total transactions but drives 86% of preventable fraud loss.


Repository Structure

├── Images/
│   ├── Executive Overview.png
│   ├── Financial Impact.png
│   ├── Model Performance.png
│   └── Operational Queue.png
├── PowerBI/
├── SQL_script/
└── README.md

Dataset

BankSim is a synthetic payments dataset generated to simulate realistic fraud patterns. No real customer data is used.


About

End-to-end fraud risk scoring and financial impact analytics using SQL feature engineering and Power BI modeling to prioritize investigation workflows and quantify loss prevention.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors