Marketing Campaign A/B Testing Analysis — SQL & Python


Overview

An end-to-end A/B testing framework that evaluates the performance of two marketing campaigns using SQL-based funnel metric computation and Python-based statistical hypothesis testing. The analysis determines whether the observed difference in conversion rates between the control and test campaigns is statistically significant — supporting data-driven budget and strategy decisions.


Business Problem

Running two campaign variants costs money. Without statistical validation, there is no way to know if performance differences are real or just random noise. This project answers:

  • Which campaign drives more conversions?
  • Is the difference statistically significant — not just by chance?
  • Where in the funnel does the test campaign outperform control?
  • Which device, region, and age group responds best to the test campaign?
  • Should we scale the test campaign or stick with control?

Dataset

Column           Description
---------------  -----------------------------------------
Campaign Name    Control Campaign / Test Campaign
Date             Date of campaign activity
Spend (USD)      Amount spent on the campaign
Impressions      Number of times the ad was shown
Reach            Unique users who saw the ad
Website Clicks   Users who clicked through to the website
Searches         Number of searches performed
Content Views    Product/content pages viewed
Add to Cart      Items added to cart
Purchases        Completed transactions

Funnel Architecture

Impressions
     ↓
Website Clicks    → CTR            = Clicks / Impressions
     ↓
Content Views     → View Rate      = Views / Clicks
     ↓
Add to Cart       → Cart Rate      = Cart / Views
     ↓
Purchases         → Conversion Rate = Purchases / Clicks
                  → Revenue/User    = Revenue / Reach
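
The same stage rates can be computed in pandas once the dataset is loaded. A minimal sketch, assuming a DataFrame with snake_case versions of the dataset columns (matching the SQL schema below):

import pandas as pd

def funnel_rates(df: pd.DataFrame) -> pd.DataFrame:
    """Per-campaign funnel rates, mirroring the formulas above."""
    totals = df.groupby("campaign_name")[
        ["impressions", "website_clicks", "content_views", "add_to_cart", "purchases"]
    ].sum()
    return pd.DataFrame({
        "ctr_pct":       100 * totals["website_clicks"] / totals["impressions"],
        "view_rate_pct": 100 * totals["content_views"]  / totals["website_clicks"],
        "cart_rate_pct": 100 * totals["add_to_cart"]    / totals["content_views"],
        "cvr_pct":       100 * totals["purchases"]      / totals["website_clicks"],
    }).round(2)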

Project Structure

A-B-Testing-Analysis-using-SQL-and-Python/
│
├── marketing_campaign_ab_testing_analysis.ipynb   # Python EDA + statistical testing
│
├── sql/
│   ├── schema.sql           # Database schema — 3 tables + indexes
│   ├── sample_data.sql      # Data — 60 campaign rows, 10K users, 1K+ funnel events
│   ├── ab_test_queries.sql  # 16 production SQL queries
│   └── setup_and_run.py     # Python script — creates SQLite DB + runs all queries
│
└── README.md

SQL Database Schema

Three tables capturing campaign, user, and event-level data:

-- Daily campaign metrics (60 rows — 30 days × 2 campaigns)
campaign_performance (
    campaign_name, campaign_date, spend_usd,
    impressions, reach, website_clicks,
    searches, content_views, add_to_cart, purchases
)

-- User-level experiment assignments (10,000 users)
experiment_assignments (
    user_id, variant, assigned_date,
    device_type, region, age_group
)

-- Individual funnel events (impression → click → view → cart → purchase)
funnel_events (
    user_id, event_type, event_date, revenue
)
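
A minimal sketch of how the database can be built from these files with Python's built-in sqlite3 module (sql/setup_and_run.py automates this; the snippet below is illustrative, not the script itself):

import sqlite3

# Build the SQLite database from the schema and sample-data scripts
conn = sqlite3.connect("sql/ab_testing.db")
for script in ("sql/schema.sql", "sql/sample_data.sql"):
    with open(script) as f:
        conn.executescript(f.read())   # executescript runs multi-statement SQL files
conn.commit()
conn.close()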

SQL Queries — 16 Production Queries

Funnel & Conversion Analysis

-- Click-Through Rate (CTR) per campaign
SELECT
    campaign_name,
    ROUND(100.0 * SUM(website_clicks) / NULLIF(SUM(impressions), 0), 2) AS ctr_pct
FROM campaign_performance
GROUP BY campaign_name;

-- Full funnel conversion rates at every stage
SELECT
    campaign_name,
    ROUND(100.0 * SUM(website_clicks)  / NULLIF(SUM(impressions),   0), 2) AS ctr_pct,
    ROUND(100.0 * SUM(content_views)   / NULLIF(SUM(website_clicks), 0), 2) AS view_rate_pct,
    ROUND(100.0 * SUM(add_to_cart)     / NULLIF(SUM(content_views),  0), 2) AS cart_rate_pct,
    ROUND(100.0 * SUM(purchases)       / NULLIF(SUM(website_clicks), 0), 2) AS overall_cvr_pct
FROM campaign_performance
GROUP BY campaign_name;
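
Any of these queries can also be pulled straight into pandas for further analysis. A minimal sketch, assuming the SQLite database has already been built by sql/setup_and_run.py:

import sqlite3
import pandas as pd

conn = sqlite3.connect("sql/ab_testing.db")
ctr_by_campaign = pd.read_sql("""
    SELECT campaign_name,
           ROUND(100.0 * SUM(website_clicks) / NULLIF(SUM(impressions), 0), 2) AS ctr_pct
    FROM campaign_performance
    GROUP BY campaign_name
""", conn)
print(ctr_by_campaign)
conn.close()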

Window Functions

-- Running cumulative CVR over time (SUM OVER)
SELECT campaign_name, campaign_date, purchases,
    SUM(purchases) OVER (PARTITION BY campaign_name ORDER BY campaign_date) AS cum_purchases,
    ROUND(100.0 *
        SUM(purchases) OVER (PARTITION BY campaign_name ORDER BY campaign_date) /
        NULLIF(SUM(website_clicks) OVER (PARTITION BY campaign_name ORDER BY campaign_date), 0),
    2) AS running_cvr_pct
FROM campaign_performance;

-- Day-over-day change using LAG
SELECT campaign_name, campaign_date, purchases,
    LAG(purchases) OVER (PARTITION BY campaign_name ORDER BY campaign_date) AS prev_day,
    purchases - LAG(purchases) OVER (PARTITION BY campaign_name ORDER BY campaign_date) AS daily_change,
    ROUND(100.0 * (purchases - LAG(purchases) OVER (PARTITION BY campaign_name ORDER BY campaign_date))
        / NULLIF(LAG(purchases) OVER (PARTITION BY campaign_name ORDER BY campaign_date), 0), 1) AS dod_pct
FROM campaign_performance;

-- Performance quartiles using NTILE
SELECT campaign_name, campaign_date, purchases,
    NTILE(4) OVER (PARTITION BY campaign_name ORDER BY purchases) AS quartile,
    CASE NTILE(4) OVER (PARTITION BY campaign_name ORDER BY purchases)
        WHEN 4 THEN 'Top 25%'   WHEN 3 THEN 'Above Avg'
        WHEN 2 THEN 'Below Avg' WHEN 1 THEN 'Bottom 25%'
    END AS performance_label
FROM campaign_performance;

CTE & Segment Analysis

-- Funnel drop-off analysis using CTE
WITH totals AS (
    SELECT campaign_name,
        SUM(impressions)    AS imp,
        SUM(website_clicks) AS clk,
        SUM(content_views)  AS vw,
        SUM(add_to_cart)    AS cart,
        SUM(purchases)      AS purch
    FROM campaign_performance GROUP BY campaign_name
)
SELECT campaign_name,
    ROUND(100.0*(imp-clk)/imp,  1) AS drop_imp_to_click_pct,
    ROUND(100.0*(clk-vw)/clk,   1) AS drop_click_to_view_pct,
    ROUND(100.0*(vw-cart)/vw,   1) AS drop_view_to_cart_pct,
    ROUND(100.0*(cart-purch)/cart, 1) AS drop_cart_to_purch_pct
FROM totals;

-- Revenue per user by device type (LEFT JOIN + segment)
SELECT e.variant, e.device_type,
    COUNT(DISTINCT e.user_id) AS users,
    COUNT(DISTINCT f.user_id) AS conversions,   -- distinct purchasers (join is filtered to purchases)
    ROUND(100.0 * COUNT(DISTINCT f.user_id)
        / COUNT(DISTINCT e.user_id), 2) AS cvr_pct,
    ROUND(SUM(f.revenue), 2) AS total_revenue
FROM experiment_assignments e
LEFT JOIN funnel_events f
    ON e.user_id = f.user_id AND f.event_type = 'purchase'
GROUP BY e.variant, e.device_type
ORDER BY e.variant, cvr_pct DESC;

Experiment Decision Query

-- Final CTE-based experiment recommendation
WITH metrics AS (
    SELECT campaign_name,
        ROUND(100.0*SUM(purchases)/NULLIF(SUM(website_clicks),0),4) AS cvr,
        ROUND(SUM(spend_usd)/NULLIF(SUM(purchases),0),2)            AS cpa
    FROM campaign_performance GROUP BY campaign_name
),
ctrl AS (SELECT * FROM metrics WHERE campaign_name = 'Control Campaign'),
tst  AS (SELECT * FROM metrics WHERE campaign_name = 'Test Campaign')
SELECT
    ctrl.cvr                                          AS control_cvr_pct,
    tst.cvr                                           AS test_cvr_pct,
    ROUND(tst.cvr - ctrl.cvr, 4)                      AS absolute_lift,
    ROUND((tst.cvr - ctrl.cvr)/ctrl.cvr*100, 2)       AS relative_lift_pct,
    ctrl.cpa                                          AS control_cpa,
    tst.cpa                                           AS test_cpa,
    CASE WHEN tst.cvr > ctrl.cvr
         THEN 'SCALE TEST CAMPAIGN'
         ELSE 'KEEP CONTROL' END                      AS recommendation
FROM ctrl, tst;

All 16 Queries Summary

 #   Query                                SQL Technique
---  -----------------------------------  -----------------------
  1  Campaign summary totals              GROUP BY, SUM, COUNT
  2  Full funnel conversion rates         NULLIF, division, ROUND
  3  Cost efficiency (CPC, CPA)           Calculated metrics
  4  Daily trend — both campaigns         CASE WHEN pivot
  5  Weekly rollup                        STRFTIME + GROUP BY
  6  Running cumulative CVR               SUM OVER window
  7  Day-over-day change                  LAG window function
  8  Best & worst days                    RANK window function
  9  Statistical summary (avg, min, max)  Aggregations
 10  Funnel drop-off analysis             CTE + subtraction
 11  Revenue per user                     LEFT JOIN + aggregation
 12  Conversion by device type            JOIN + CASE + GROUP BY
 13  Conversion by region                 JOIN + GROUP BY
 14  Conversion by age group              JOIN + GROUP BY
 15  Performance quartiles                NTILE window function
 16  Experiment decision summary          Nested CTE + CASE WHEN

Python Statistical Testing (marketing_campaign_ab_testing_analysis.ipynb)

Tests Implemented

Test                  Metric            Result
--------------------  ----------------  ---------------------------------------------------
Z-test (proportions)  Conversion Rate   Control: 12.0% → Test: 13.8% (+15% lift, p < 0.05)
Welch t-test          Revenue per User  Control: $54.2 → Test: $61.8 (+14% lift, p < 0.05)
Mann-Whitney U        Session Pages     Statistically significant improvement
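
The conversion-rate comparison in the first row is a two-proportion z-test. A minimal hand-rolled sketch using SciPy; the user counts below are illustrative placeholders chosen to match the reported rates, not the project's actual data:

import numpy as np
from scipy import stats

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion proportions."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)           # pooled proportion under H0
    se = np.sqrt(p_pool * (1 - p_pool) * (1/n_a + 1/n_b))
    z = (p_b - p_a) / se
    p_value = 2 * stats.norm.sf(abs(z))                # two-sided p-value
    return z, p_value

# Illustrative counts only (12.0% vs 13.8% conversion on 5,000 users each)
z, p = two_proportion_ztest(conv_a=600, n_a=5000, conv_b=690, n_b=5000)
print(f"z = {z:.3f}, p = {p:.4f}")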

Hypotheses

  • H₀ (Null): No difference in conversion performance between Control and Test campaigns
  • H₁ (Alternative): The Test campaign performs significantly better than Control

from scipy import stats

# Daily purchases per variant (30 observations per campaign)
control = df[df['campaign_name'] == 'Control Campaign']['purchases']
test    = df[df['campaign_name'] == 'Test Campaign']['purchases']

# Welch's t-test: equal_var=False drops the equal-variance assumption
t_stat, p_value = stats.ttest_ind(control, test, equal_var=False)

alpha = 0.05
if p_value < alpha:
    print("REJECT H₀ — Statistically significant difference detected")
else:
    print("FAIL TO REJECT H₀ — No statistically significant difference")

Sample Size Calculator

from scipy import stats
import numpy as np

def required_sample_size(baseline_cvr=0.12, mde=0.02, alpha=0.05, power=0.8):
    z_alpha = stats.norm.ppf(1 - alpha/2)
    z_beta  = stats.norm.ppf(power)
    p1, p2  = baseline_cvr, baseline_cvr + mde
    p_avg   = (p1 + p2) / 2
    n = (z_alpha*np.sqrt(2*p_avg*(1-p_avg)) + z_beta*np.sqrt(p1*(1-p1)+p2*(1-p2)))**2 \
        / (p2 - p1)**2
    return int(np.ceil(n))

# Minimum users per variant to detect a 2-point lift from a 12% baseline
print(required_sample_size())  # → ~4,438 users per variant

Key Results

Metric                    Control Campaign  Test Campaign  Winner
------------------------  ----------------  -------------  ------
CTR (Click-Through Rate)  4.36%             6.02%          Test
View Rate                 ~52%              ~57%           Test
Cart Rate                 ~22%              ~28%           Test
Overall Conversion Rate   12.0%             13.8%          Test
Revenue per User          $54.2             $61.8          Test
Cost per Acquisition      $28.86            $15.90         Test

Decision: SCALE TEST CAMPAIGN — Test outperforms Control at every funnel stage with statistically significant results (p < 0.05)


How to Run

# 1. Clone the repository
git clone https://github.com/ramubattu321/A-B-Testing-Analysis-using-SQL-and-Python.git
cd A-B-Testing-Analysis-using-SQL-and-Python

# 2. Install dependencies
pip install pandas numpy matplotlib scipy jupyter

# 3. Create SQLite database and run all 16 SQL queries
python sql/setup_and_run.py

# 4. Open Python analysis notebook
jupyter notebook marketing_campaign_ab_testing_analysis.ipynb

# 5. Run SQL queries directly (requires SQLite CLI or DB Browser)
sqlite3 sql/ab_testing.db < sql/ab_test_queries.sql

Business Impact

  • Campaign selection — statistical evidence confirms Test campaign should be scaled
  • Budget optimization — reallocate spend from Control to the higher-performing Test variant
  • Funnel diagnosis — SQL queries pinpoint exactly where Test outperforms at each stage
  • Segment insights — device, region, and age breakdowns guide targeted campaign strategy
  • Risk reduction — statistical significance eliminates guesswork from campaign decisions
  • Reusable framework — SQL + Python pipeline applicable to any future A/B experiment

Tools & Technologies

Tool              Purpose
----------------  -------------------------------------------------
SQL (SQLite)      Funnel metric computation, window functions, CTEs
Python            Statistical testing and EDA
Pandas            Data loading, cleaning, aggregation
NumPy             Numerical operations
SciPy             z-test, t-test, Mann-Whitney U
Matplotlib        Funnel and comparison visualizations
Jupyter Notebook  Interactive analysis environment

Author

Ramu Battu
MS in Data Analytics — California State University, Fresno
📧 ramuusa61@gmail.com
