Skip to content

muhssamy/Contoso-Analytics

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

6 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

πŸš€ Contoso Database Setup Guide

A comprehensive guide to setting up the Contoso database with sample data for analysis and testing.


πŸ“‹ Prerequisites

Before you begin, ensure you have:

  • PostgreSQL installed on your system
  • Git installed (to clone the repository)
  • At least 2GB of free disk space
  • 7-Zip or similar tool to extract .7z archives

πŸ”§ Step 1: Add PostgreSQL to Environment Variables

Windows

  1. Locate your PostgreSQL bin directory

    • Default location: C:\Program Files\PostgreSQL\<version>\bin
  2. Add to PATH

    • Right-click on This PC or My Computer β†’ Properties
    • Click Advanced system settings β†’ Environment Variables
    • Under System variables, find and select Path β†’ Click Edit
    • Click New and add your PostgreSQL bin path
    • Click OK on all windows to save
  3. Verify installation

    psql --version

macOS

  1. Add to PATH (if installed via Homebrew)

    echo 'export PATH="/Library/PostgreSQL/18/bin:$PATH"' >> ~/.zshrc
    source ~/.zshrc
  2. For Postgres.app users

    echo 'export PATH="/Applications/Postgres.app/Contents/Versions/latest/bin:$PATH"' >> ~/.zshrc
    source ~/.zshrc
  3. Verify installation

    psql --version

πŸ“₯ Step 2: Clone the Repository

git clone https://github.com/muhssamy/Contoso-Analytics.git
cd Contoso-Analytics

πŸ’Ύ Step 3: Download and Extract CSV Files

  1. Download the dataset

    Download the CSV files from the official release:

    https://github.com/sql-bi/Contoso-Data-Generator-V2-Data/releases/download/ready-to-use-data/csv-10m.7z
    
  2. Extract the archive

    Extract the .7z file using 7-Zip or your preferred extraction tool

  3. Move CSV files to the project

    ⚠️ Important: Copy only the .csv files (not the folder) into your project's csv-10m directory

    your-project/
    └── csv-10m/
        β”œβ”€β”€ customers.csv
        β”œβ”€β”€ products.csv
        β”œβ”€β”€ sales.csv
        β”œβ”€β”€ contoso_load.sql
        └── ... (other CSV files)
    

πŸ—„οΈ Step 4: Create the Contoso Database

Run the following command to create a new database:

psql -U postgres -c "CREATE DATABASE contoso;"

You'll be prompted for your PostgreSQL password.


πŸ“Š Step 5: Load the Data

Execute the SQL script to import all CSV data into the database:

psql -U postgres -d contoso -f csv-10m/contoso_load.sql

This process may take several minutes depending on your system specifications and the size of the dataset.


βœ… Verification

After loading, verify your setup:

-- Connect to the database
psql -U postgres -d contoso

-- Check tables
\dt

-- Check row counts
SELECT 
    schemaname,
    tablename,
    n_live_tup as row_count
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;

🎯 Next Steps

Your Contoso database is now ready! You can:

  • Run analytical queries
  • Build dashboards and reports
  • Practice SQL skills
  • Test ETL pipelines

πŸ› Troubleshooting

Issue: psql command not found

  • Solution: Ensure PostgreSQL bin directory is properly added to PATH and restart your terminal

Issue: Permission denied when creating database

  • Solution: Make sure you're using a PostgreSQL superuser account (default: postgres)

Issue: CSV files not loading

  • Solution: Verify that CSV files are directly in the csv-10m folder, not in a subfolder

πŸ“š Resources


Happy Querying! πŸŽ‰

About

πŸ—„οΈ Contoso Database Setup - PostgreSQL sample database with 10M+ records for data analysis, SQL practice, and BI reporting. Includes automated data loading scripts and comprehensive setup guide.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors