Transaction Fraud Risk Intelligence Analytics

An end-to-end fraud detection pipeline that turns raw transaction data into actionable risk intelligence — built with SQL and Power BI.

What This Project Does

Financial institutions deal with millions of transactions daily, and manual fraud review doesn't scale. This project builds a rule-based risk scoring system on top of the BankSim synthetic dataset to:

Flag suspicious transactions using behavioral signals
Segment them into risk tiers (High / Medium / Low)
Measure how well the model performs
Quantify the financial value of catching fraud early

The end result is a 4-page interactive Power BI dashboard covering model performance, operational prioritization, and ROI.

Dashboard Preview

Executive Overview	Model Performance

Operational Queue	Financial Impact

Key Metrics

Metric	Value
Total Transactions	21,051
Fraud Transactions	226
Fraud Rate	1.1%
Estimated Fraud Loss	$137.84K
Fraud Capture Rate	55.8%
Net Benefit (after review cost)	$135.57K
ROI	5959%

Tech Stack

Database: MySQL
Feature Engineering & Scoring: SQL
Visualization: Power BI
Dataset: BankSim (synthetic transaction data)

How It Works

Pipeline Overview

CSV Dataset
    ↓
MySQL (banksim_raw)
    ↓
Feature Engineering
    ├── customer_velocity
    ├── customer_spend_baseline
    └── category_risk
    ↓
Risk Scoring Table (transaction_risk_score)
    ↓
Power BI Dashboard

Risk Scoring Logic

Each transaction gets scored across three dimensions:

Velocity Risk — how often is this customer transacting?

5+ tx/day → 30 pts
3+ tx/day → 20 pts
Otherwise → 5 pts

Amount Deviation Risk — is this spend unusual for this customer?

≥ 2× their average → 40 pts
≥ 1.5× their average → 25 pts
Otherwise → 5 pts

Category Risk — is this a high-fraud category?

Fraud rate ≥ 10% → 30 pts
Fraud rate ≥ 5% → 20 pts
Otherwise → 5 pts

Final Score = velocity + amount + category

Risk Tiers:

🔴 High → Score ≥ 85
🟡 Medium → Score ≥ 55
🟢 Low → Everything else

Model Performance

Evaluated entirely in SQL:

Fraud Capture Rate: 55.8%
Precision (High Risk): 27.69%
False Positive Rate: 72.3%

The High Risk tier accounts for a small share of total transactions but drives 86% of preventable fraud loss.

Repository Structure

├── Images/
│   ├── Executive Overview.png
│   ├── Financial Impact.png
│   ├── Model Performance.png
│   └── Operational Queue.png
├── PowerBI/
├── SQL_script/
└── README.md

Dataset

BankSim is a synthetic payments dataset generated to simulate realistic fraud patterns. No real customer data is used.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Transaction Fraud Risk Intelligence Analytics

What This Project Does

Dashboard Preview

Key Metrics

Tech Stack

How It Works

Pipeline Overview

Risk Scoring Logic

Model Performance

Repository Structure

Dataset

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
Images		Images
PowerBI		PowerBI
SQL_script		SQL_script
README.md		README.md

Folders and files

Latest commit

History

Repository files navigation

Transaction Fraud Risk Intelligence Analytics

What This Project Does

Dashboard Preview

Key Metrics

Tech Stack

How It Works

Pipeline Overview

Risk Scoring Logic

Model Performance

Repository Structure

Dataset

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Packages