raksh-dev/inventory-data-standardization

InventoryStandardization

A modular AI-powered pipeline designed to standardize, clean, and generate structured product data (SKUs, confidence scores, mappings) from raw multi-category inventory files such as Jewelry, Textiles, Electronics, and more.


Overview

The InventoryStandardization project uses a series of AI and rule-based agents to automate inventory data transformation and SKU generation.

The goal is to:

  • Process messy raw inventory data files (with 60+ columns)
  • Extract and normalize only the relevant fields
  • Generate standardized SKUs
  • Compute confidence scores
  • Route low-confidence records to HITL (Human-in-the-Loop) review

Architecture Overview

The complete pipeline includes the following agents:

| Agent | Purpose | Type |
| --- | --- | --- |
| InputProcessingAgent | Cleans & extracts necessary fields from raw CSVs (e.g. Brand, Category, Description) | AI agent |
| ConfidenceSKUAgent | Generates standardized SKUs & assigns confidence scores (row + column level) | AI-assisted |
| MappingEngineAgent | Classifies data into known internal schemas & identifies missing/ambiguous fields | ML classifier |
| HITL Review Agent | Routes low-confidence mappings (below threshold) to a review interface | Human-in-the-loop |
| StatisticsAgent (optional) | Aggregates logs and generates audit reports for QA | AI |
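The routing rule behind the HITL Review Agent can be summarized in a few lines. This is a minimal sketch: the 0.9 threshold, the `confidence_score` field name, and the `route` function are illustrative assumptions, not the project's actual implementation.

```python
REVIEW_THRESHOLD = 0.9  # assumed cutoff; the real threshold is configurable

def route(record: dict) -> str:
    """Send low-confidence mappings to human review; auto-accept the rest."""
    if record["confidence_score"] < REVIEW_THRESHOLD:
        return "hitl_review"
    return "auto_accept"
```

A record scored 0.94 passes straight through, while one scored 0.5 lands in the review queue.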

Features

  • Automatic SKU generation from cleaned input
  • Confidence scoring for both row-level & attribute-level data
  • Multi-domain support (Jewelry, Textiles, Electronics, etc.)
  • AI-assisted data enrichment and normalization
  • Human-in-the-loop fallback for uncertain predictions
  • Exports final standardized datasets to .csv and .parquet
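The final export step maps directly onto pandas writers. A sketch, assuming a finished DataFrame; note that `to_parquet` additionally requires an engine such as `pyarrow` or `fastparquet` to be installed:

```python
import os

import pandas as pd

os.makedirs("data/output", exist_ok=True)

df = pd.DataFrame(
    {"SKU": ["JW-BR-GLD-NEC-001"], "Brand": ["Bluestone"], "Confidence_Score": [0.94]}
)
df.to_csv("data/output/standardized_inventory.csv", index=False)
# Parquet export needs pyarrow or fastparquet installed:
# df.to_parquet("data/output/standardized_inventory.parquet")
```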


Tech Stack

  • Python 3.10+
  • Pandas for data handling
  • FastAPI (optional) for service orchestration
  • Google Gemini API for AI-powered text interpretation
  • dotenv for key management
  • scikit-learn (planned) for classification and mapping engine

Setup Instructions

1. Generate a Gemini API Key

Visit https://aistudio.google.com/app/apikey, then copy your key and keep it secure.


2. Add API Key to .env

Create a .env file in the project root and paste:

GEMINI_API_KEY="your_gemini_api_key_here"
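At startup, each agent can pull the key from `.env` into the environment. A minimal sketch using `python-dotenv`, falling back to the plain shell environment if the package is missing:

```python
import os

try:
    from dotenv import load_dotenv  # provided by the python-dotenv package
    load_dotenv()  # reads .env from the current working directory
except ImportError:
    pass  # no python-dotenv: rely on variables already in the environment

api_key = os.getenv("GEMINI_API_KEY", "")
```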

3. Create and Activate a Virtual Environment

From the project root:

cd agents
python -m venv venv
# Activate venv
# On Windows:
venv\Scripts\activate
# On Mac/Linux:
source venv/bin/activate

4. Install Dependencies

pip install -r requirements.txt

5. Run the Input Processing Agent

python inputProcessing.py

This step:

  • Reads raw data from data/raw/
  • Cleans and extracts essential fields
  • Saves the intermediate output to data/processed/
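The extraction step above essentially selects a small whitelist of fields from the 60+ raw columns and normalizes them. A sketch, assuming the column names below; the real agent's field list and cleaning rules may differ:

```python
import pandas as pd

RELEVANT_COLUMNS = ["Brand", "Category", "Description"]  # assumed whitelist

def extract_relevant(raw: pd.DataFrame) -> pd.DataFrame:
    """Drop the dozens of unused raw columns and trim whitespace in the rest."""
    present = [c for c in RELEVANT_COLUMNS if c in raw.columns]
    cleaned = raw[present].copy()
    for col in present:
        cleaned[col] = cleaned[col].astype(str).str.strip()
    return cleaned
```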

6. Run the Confidence + SKU Agent

python confidence_sku_agent.py

This step:

  • Loads the processed file from data/processed/
  • Generates standardized SKUs
  • Calculates confidence scores
  • Flags low-confidence rows for review
  • Saves final results to:
data/output/standardized_inventory.csv
data/output/standardized_inventory.parquet
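The SKUs in the sample output (e.g. JW-BR-GLD-NEC-001) combine short category, brand, and attribute codes with a zero-padded serial. The helper below is an assumption about how those pieces join, not the agent's actual algorithm:

```python
def build_sku(category: str, brand: str, material: str, item: str, serial: int) -> str:
    """Join short uppercase codes with a zero-padded serial, e.g. JW-BR-GLD-NEC-001."""
    codes = [p.strip().upper() for p in (category, brand, material, item)]
    return "-".join(codes + [f"{serial:03d}"])
```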

Sample Output

| SKU | Brand | Category | Description | Confidence_Score | Review_Flag |
| --- | --- | --- | --- | --- | --- |
| JW-BR-GLD-NEC-001 | Bluestone | Jewelry | Gold Necklace 18K | 0.94 | No |
| TX-FB-COT-SHR-022 | FabIndia | Textile | Cotton Shirt Men’s | 0.87 | Yes |

Recommended Development Order

1️⃣ inputProcessing.py → Data extraction and normalization
2️⃣ confidence_sku_agent.py → SKU generation + confidence scoring
3️⃣ mapping_engine.py → ML-based attribute classifier
4️⃣ hitl_agent.py → HITL flag routing + review dashboard
5️⃣ audit_reporting.py (optional) → Logging and QA reports


Example Folder Structure

InventoryStandardization/
│
├── agents/
│   ├── inputProcessing.py
│   ├── confidence_sku_agent.py
│   ├── mapping_engine.py
│   ├── hitl_agent.py
│   └── audit_reporting.py
│
├── data/
│   ├── raw/
│   ├── processed/
│   └── output/
│
├── .env
├── requirements.txt
└── README.md
