- Full Name: Muhammad Alber
- Student ID: su92-bscsm-f23-353
- Section: BSCS-6A
This repository contains an n8n automation workflow that streamlines email processing. The workflow pulls incoming emails from a Gmail inbox, evaluates the content and length of each message, processes entire email threads, and uses OpenAI's large language models (LLMs) via LangChain to summarize long email chains. The processed data, together with thread analytics and AI summaries, is aggregated and logged to structured Google Sheets for reporting and business intelligence.
This project demonstrates strong capabilities in:
- Workflow Automation & Orchestration (n8n)
- API Integrations (Google Workspace, OpenAI)
- Data Engineering & ETL (Extract, Transform, Load)
- AI Agent Integration for Natural Language Processing
- n8n: Core workflow automation platform.
- Gmail API: For extracting emails, thread IDs, and parsing message payloads.
- Google Sheets API: Serving as the structured database for final data aggregation.
- OpenAI (gpt-4o-mini): For intelligent text summarization.
- LangChain: Orchestrates the AI agent that returns structured JSON (summarized body and Thread ID).
- JavaScript (Node.js): Custom code nodes for data transformation, header extraction, and length evaluation.
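The header extraction mentioned above could look roughly like the following. This is a minimal sketch, not the code from the workflow's Code nodes: it assumes the raw Gmail API message shape, where `payload.headers` is an array of `{ name, value }` objects, and the helper names `getHeader` and `extractEnvelope` are hypothetical. If the Gmail node is configured to return simplified output, the field names will differ.

```javascript
// Sketch: look up one header in a Gmail API message payload.
// Assumes the raw Gmail format, where headers are { name, value } pairs.
function getHeader(payload, name) {
  const headers = (payload && payload.headers) || [];
  const wanted = name.toLowerCase();
  // Header names are case-insensitive, so compare in lowercase.
  const match = headers.find((h) => h.name.toLowerCase() === wanted);
  return match ? match.value : null;
}

// Hypothetical helper: collect the fields the later steps rely on.
function extractEnvelope(message) {
  return {
    threadId: message.threadId,
    from: getHeader(message.payload, "From"),
    subject: getHeader(message.payload, "Subject"),
    date: getHeader(message.payload, "Date"),
  };
}
```

Returning `null` for a missing header keeps downstream steps from failing on messages that lack an optional field.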
The automation follows a parallel-processing approach:
- Trigger & Fetch: The workflow initializes and fetches unread/inbox emails via the Gmail node.
- Text Analysis & Routing: Custom JavaScript code evaluates the length of the email text.
  - If the text is exceedingly long, it flags the email for AI summarization.
  - It also extracts unique Thread IDs to group conversations.
- AI Summarization Pipeline: An AI agent receives the email content with a strict prompt to summarize the thread, focusing on key decisions and action items, and returns structured JSON.
- Thread Processing Pipeline: Concurrently, the workflow retrieves full thread data, combining messages (up to 500 per thread), calculating character counts, and extracting timestamps.
- Data Merge & Aggregation: The AI summaries and the raw thread data are merged using the `Thread ID` as the primary key.
- Storage: The final, enriched datasets are exported to targeted Google Sheets for analytics and record-keeping.
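The routing, thread aggregation, and merge steps above can be sketched in plain JavaScript. This is an illustration under stated assumptions, not the logic shipped in the workflow JSON: the 2000-character cutoff is invented (this README does not state the real threshold), and the field names `text`, `timestamp` (epoch milliseconds), `needsSummary`, and `summary` are hypothetical. Inside an n8n Code node the input would come from the node's items rather than from plain arrays.

```javascript
// Assumed cutoff; the workflow's actual threshold is not documented here.
const LONG_EMAIL_CHARS = 2000;
// Per-thread cap taken from the workflow description above.
const MAX_MESSAGES_PER_THREAD = 500;

// Flag emails whose text exceeds the cutoff so they can be sent to the AI agent.
function flagLongEmails(emails, limit = LONG_EMAIL_CHARS) {
  return emails.map((e) => ({
    ...e,
    needsSummary: (e.text || "").length > limit,
  }));
}

// Group messages by threadId and compute per-thread analytics.
function buildThreadStats(emails) {
  const threads = new Map();
  for (const e of emails) {
    const messages = threads.get(e.threadId) || [];
    if (messages.length < MAX_MESSAGES_PER_THREAD) messages.push(e);
    threads.set(e.threadId, messages);
  }
  return [...threads.entries()].map(([threadId, messages]) => {
    const times = messages.map((m) => m.timestamp);
    return {
      threadId,
      messageCount: messages.length,
      charCount: messages.reduce((n, m) => n + (m.text || "").length, 0),
      firstTimestamp: Math.min(...times),
      lastTimestamp: Math.max(...times),
    };
  });
}

// Attach AI summaries to thread rows, using threadId as the join key.
// Threads without a summary keep a null placeholder.
function mergeByThreadId(threadRows, summaries) {
  const byId = new Map(summaries.map((s) => [s.threadId, s.summary]));
  return threadRows.map((row) => ({
    ...row,
    summary: byId.has(row.threadId) ? byId.get(row.threadId) : null,
  }));
}
```

Keeping every thread row and filling `summary` with `null` when no match exists behaves like a left join, so short threads that never reach the AI branch would still appear in the sheet.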
n8n-Automations/
├── workflows/
│ ├── Email_Scraping_Automation_Org.json # Main production workflow
│ ├── Email_Scraping_Automation_Testing.json # Staging/Testing environment workflow
│ └── Development_Version.json # Legacy/development iteration
├── assets/
│ ├── workflow-overview.png # High-level view of the n8n nodes
│ ├── successful-execution.png # Execution logs and data flow
│ └── workflow-details.png # Detailed view of AI & Merge logic
├── samples/
│ └── output.json # Sample data output from the automation
└── README.md # Project documentation
An architectural view of the n8n nodes working in harmony.

Real-time execution logs demonstrating the successful parsing and routing of over 1,500 items.

Close-up of the LangChain AI Agent processing text and merging endpoints before exporting to Google Sheets.

This project highlights modern full-stack development practices, focusing on stability, efficient API pathing, and production-first automation design.