A production-pattern agentic customer service chat workflow built on n8n. Users interact via a chat interface and are routed across three distinct capabilities — knowledge base search, account lookup, and refund requests — based on the nature of their question. Refund requests are held for human approval via Slack before the customer is notified of an outcome.
The fictional customer is Kaya Connect — a South African mobile network operator.
Built to demonstrate:
- Tool-calling agent architecture with distinct capabilities and trust boundaries per tool
- RAG retrieval via Pinecone vector store with OpenAI embeddings, exposed as an agent tool
- Human-in-the-loop (HITL) approval flow for financial decisions via Slack
- Deterministic CRM lookup — account data is returned by code, not inferred by the LLM
- Tool usage tracking via execution metadata for post-conversation ROI routing
- Multi-tool agent orchestration in n8n with Claude Sonnet as the LLM
Chat trigger (WhatsApp / Telegram / web)
└── AI Agent (Claude)
    ├── Knowledge Base tool → Pinecone vector search (RAG)
    ├── Customer Lookup tool → CRM mock (Code node)
    └── Refund Request tool → Slack HITL approval
        └── Process Refund (on approval)

Agent output → Route by Tool Used
├── CRM path → Time Saved: Account Lookup (3 min)
├── Refund path → Time Saved: Refund Request (5 min)
└── KB path → Time Saved: Knowledge Base (1 min)
Key n8n features used:
| Feature | Role |
|---|---|
| AI Agent node | Orchestrates tool selection and conversation |
| Vector Store (Pinecone) | RAG over 35 KB articles |
| Slack HITL Tool | Human approval gate for refund requests |
| Time Saved nodes | Per-branch ROI tracking in n8n Insights |
| Switch node + customData | Routes post-agent flow based on which tool fired |
Pinecone vector store exposed as a retrieval tool. The tool description scopes it explicitly to general product questions — plans, pricing, network coverage, troubleshooting, billing policy, roaming, SIM, devices, and FAQ — so the agent knows when to use it versus the CRM. Embeddings are generated via OpenAI.
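To make the retrieval step concrete, here is a toy sketch of what a vector search does conceptually: rank stored article vectors by cosine similarity to the query vector and return the top matches. It does not call Pinecone or OpenAI. The three-dimensional vectors and article IDs below are invented for illustration, and real embeddings have far more dimensions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, index, k=2):
    """Return the IDs of the k stored vectors most similar to the query."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical article IDs with made-up 3-dimensional "embeddings".
TOY_INDEX = {
    "kb-roaming": [0.9, 0.1, 0.0],
    "kb-prepaid-plans": [0.1, 0.9, 0.1],
    "kb-sim-swap": [0.0, 0.2, 0.9],
}

# A query vector that points mostly along the "plans" direction.
print(top_k([0.2, 0.8, 0.0], TOY_INDEX, k=2))
```

In the workflow itself, the Pinecone node performs this ranking server-side and the agent only sees the retrieved article text.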
A JavaScript code tool that accepts an account ID and returns a structured account object (customer name, plan, data remaining, balance due, next billing date, status, loyalty tier). The LLM never infers account data — it retrieves it. Account ID is always requested from the customer as an explicit identity validation step, even when they are already authenticated.
Tool usage is recorded to execution customData on each invocation for downstream routing.
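The deterministic lookup can be sketched as a plain dictionary keyed by account ID. This is a Python illustration of the idea, not the workflow's actual JavaScript Code node. The field names and the not-found response shape are assumptions, and the values are taken from the test accounts described in this README.

```python
# Assumed field names; values mirror the three test accounts in this README.
ACCOUNTS = {
    "1": {"customer_name": "Sarah", "plan": "Kaya Pro 48",
          "data_remaining_gb": 18.4, "data_total_gb": 60,
          "balance_due": 142.30, "loyalty_tier": "Kaya Gold"},
    "2": {"customer_name": "Thabo", "plan": "Kaya Plus",
          "data_remaining_gb": 0.3, "data_total_gb": 5,
          "balance_due": 0.00, "loyalty_tier": "Kaya"},
    "3": {"customer_name": "Lerato", "plan": "Kaya Business",
          "data_remaining_gb": 72.1, "data_total_gb": 100,
          "balance_due": 899.00, "loyalty_tier": "Kaya Platinum"},
}

def lookup_account(account_id):
    """Return a structured account object, or an explicit not-found result.

    Nothing is inferred: an unknown ID yields an error object rather than
    a guess, which is the point of keeping the LLM out of this step.
    """
    key = str(account_id).strip()
    account = ACCOUNTS.get(key)
    if account is None:
        return {"found": False, "error": f"No account found for ID {key!r}"}
    return {"found": True, "account_id": key, **account}

print(lookup_account("2"))
print(lookup_account("42"))
```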
A Slack HITL tool that sends a formatted approval request to the refund approvals channel and waits for human action before returning a result to the agent. The agent gathers the refund reason and approximate amount from the customer, and retrieves the customer name via CRM before calling this tool. The customer is notified only after a human approves or denies.
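As a rough sketch of the data that crosses this approval boundary, the functions below format the request an approver might see and turn the human decision into a customer-facing reply. The message layout, function names, and reply wording are hypothetical. The real Slack node handles delivery and waiting.

```python
def build_approval_request(customer_name, account_id, reason, amount_rand):
    """Format the text a human approver would see in the approval channel."""
    if amount_rand <= 0:
        raise ValueError("refund amount must be positive")
    return "\n".join([
        "Refund approval requested",
        f"Customer: {customer_name} (account {account_id})",
        f"Amount: R{amount_rand:.2f}",
        f"Reason: {reason}",
    ])

def customer_reply(approved):
    """Map the human decision to the message the agent relays afterwards."""
    if approved:
        return "Your refund request has been approved."
    return "Your refund request was not approved."

print(build_approval_request("Thabo", "2", "Charged twice last month", 149))
print(customer_reply(True))
```

Keeping the reply a pure function of the human decision means the agent has no path to confirming a refund on its own.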
- Tool-calling over a monolithic prompt. General knowledge questions, account-specific lookups, and financial transactions have different reliability requirements, different data sources, and different trust boundaries. Separating them into distinct tools makes each capability independently auditable, testable, and replaceable.
- Human approval for financial decisions. The agent handles information gathering and routing autonomously, but no refund is confirmed without a human approving it in Slack. This is a deliberate trust boundary — the agent is good at understanding what the customer wants; the approval decision belongs with a human agent.
- Deterministic CRM over LLM inference. Account data is looked up by code and returned as structured JSON. The LLM receives pre-formatted account summaries, not raw data to reason over. This keeps account responses accurate and eliminates a class of hallucination risk for factual customer data.
- Tool usage tracking for ROI routing. Each tool records its invocation to `$execution.customData` using a shared `toolsUsed` array. After the agent completes, a Switch node reads this metadata and routes to time-saved measurement nodes: 1 minute for knowledge base resolutions, 3 minutes for account lookups, 5 minutes for refund requests. This provides a quantifiable efficiency signal per conversation type.
- Account ID validation as a trust layer. The system prompt instructs the agent to ask for account ID even though the customer is already logged in. This is an explicit identity validation step that prevents one customer from accessing another's account data through conversational ambiguity.
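The tool usage tracking decision above can be sketched as two small functions: one that appends a tool name to a shared `toolsUsed` list, and one that picks a time-saved branch from it. A plain dictionary stands in for `$execution.customData`. The short branch keys and the precedence order (refund, then CRM, then KB) are assumptions, since a refund conversation also calls the CRM tool and the actual Switch rules are not shown here.

```python
# Minutes per branch, as stated in this README.
MINUTES_SAVED = {"kb": 1, "crm": 3, "refund": 5}

def record_tool(custom_data, tool_name):
    """Append tool_name to the shared toolsUsed list, without duplicates."""
    tools = custom_data.setdefault("toolsUsed", [])
    if tool_name not in tools:
        tools.append(tool_name)
    return custom_data

def route(custom_data, precedence=("refund", "crm", "kb")):
    """Pick one branch for the execution; precedence order is an assumption."""
    used = custom_data.get("toolsUsed", [])
    for branch in precedence:
        if branch in used:
            return branch, MINUTES_SAVED[branch]
    return None, 0  # no tool fired, so nothing is logged

custom_data = {}
record_tool(custom_data, "crm")
record_tool(custom_data, "refund")
print(route(custom_data))
```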
- n8n v1.68+
- OpenAI account (embeddings: `text-embedding-3-small`; swap for any supported model)
- Anthropic account (LLM: `claude-sonnet-4-6`; swap for any supported model)
- Pinecone account (free serverless tier is sufficient)
- Slack workspace with a dedicated approval channel

    cd kb
    OPENAI_API_KEY=sk-... PINECONE_API_KEY=pcsk-... python3 generate_kb.py --upload

This creates a serverless Pinecone index named kaya-connect-kb.

    # Options
    --index kaya-connect-kb   # index name (must match n8n credential config)
    --cloud aws               # aws | gcp | azure
    --region us-east-1        # must be valid for the chosen cloud
    --model text-embedding-3-small

- Open your n8n instance
- Go to Workflows > Import from file
- Select workflow/ask-kaya-connect.json
- Wire up credentials (see below)
The workflow uses four credential types. All placeholder IDs in the JSON (YOUR_*_CREDENTIAL_ID) are replaced automatically when you select credentials in the n8n UI after import.
| Credential | Used by | Notes |
|---|---|---|
| Anthropic API | Claude LLM node | Swap for a different LLM of your choice |
| OpenAI API | Embeddings node | Required (or equivalent) for Pinecone vector search |
| Pinecone API | Knowledge Base node | Set index name to kaya-connect-kb |
| Slack OAuth2 | Refund Request (HITL) | Requires a Slack app with the chat:write scope and an incoming webhook; set the channel to your approval channel |
- Activate the workflow
- Open the Chat trigger URL (or use the n8n chat widget)
- Try these test prompts:
- "What prepaid plans do you offer?" (KB path)
- "What's my balance?" then provide account number when prompted (CRM path)
- "I was charged twice last month, I'd like a refund of R149" (HITL path)
The mock CRM contains three hardcoded accounts for testing. Reference them by account number.
Design note: The agent assumes the user is already authenticated via the Kaya Connect support portal. Rather than asking an anonymous session to volunteer an account number, the agent requests it as an identity verification step — confirming the caller matches the authenticated session before acting on any account data.
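The verification step in this design note amounts to comparing the account number the customer states against the one attached to the authenticated session. A minimal sketch follows, assuming the session's account ID is available to the workflow. The function name is hypothetical, and in this workflow the check may be enforced through the system prompt rather than code.

```python
def identity_confirmed(session_account_id, stated_account_id):
    """True only when the stated account number matches the session's own.

    Both values are compared as trimmed strings so that "3" and " 3 " match
    while any other account number is rejected.
    """
    return str(session_account_id).strip() == str(stated_account_id).strip()

# The session belongs to account 3; only that number unlocks account data.
print(identity_confirmed("3", " 3 "))
print(identity_confirmed("3", "1"))
```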
- Account 1 (Sarah) · Plan: Kaya Pro 48 (R599/month) · Loyalty tier: Kaya Gold
- Data: 18.4 GB remaining of 60 GB · Balance due: R142.30
Example session — data usage query:
User: Hi, I'd like to check my remaining data.
Agent: I can help with that. To verify your identity, could you please confirm your account number?
User: It's 1.
Agent: Thank you, Sarah. You're on the Kaya Pro 48 plan and currently have 18.4 GB remaining out of your 60 GB monthly allowance. Your next billing date is 1 April. Is there anything else I can help you with?
- Account 2 (Thabo) · Plan: Kaya Plus (R149/month) · Loyalty tier: Kaya
- Data: 0.3 GB remaining of 5 GB · Balance due: R0.00
Example session — low data / upgrade prompt:
User: I keep running out of data every month.
Agent: I can look into that for you. To verify your identity, could you please confirm your account number?
User: 2.
Agent: Thank you, Thabo. You're on the Kaya Plus plan and you're nearly out of data this month, with only 0.3 GB remaining. I can walk you through our higher-tier plans or help you add a top-up bundle. Which would you prefer?
- Account 3 (Lerato) · Plan: Kaya Business (R899/month) · Loyalty tier: Kaya Platinum
- Data: 72.1 GB remaining of 100 GB · Balance due: R899.00 (overdue)
Example session — general query → KB lookup → refund request:
User: Hi, what's your refund policy?
Agent: (LLM Tool) Happy to help. Generally, Kaya Connect reviews refund requests within 5–7 business days. Would you like me to look up the full policy details?
User: Yes please.
Agent: (KB Tool) According to our policy, refunds are applicable for duplicate charges, billing errors, and unused bundle portions cancelled within the cooling-off period. Approved refunds are credited within 3–5 business days. Is there something specific on your account you'd like to raise?
User: Yes, I think I was double-charged last month and I'd like to request a refund.
Agent: I can process that for you. To verify your identity before accessing your account, could you please confirm your account number?
User: 3.
Agent: (CRM + HITL Tools) Thank you, Lerato. I can see your account shows an outstanding balance of R899.00 and is currently flagged as payment overdue. I've raised a refund request on your behalf, and a human agent will review it shortly. You'll be notified once it's processed. Is there anything else I can help you with?
Swap the LLM: Replace the Claude LLM node with any n8n-supported chat model (OpenAI, Gemini, Mistral, etc.). Update the system prompt accordingly.
Replace the mock CRM: The Customer Lookup tool is a Code node with three hardcoded accounts. Replace the accounts object with a live HTTP Request node pointing at your CRM API.
Add more KB articles: Edit kb/kaya_connect_kb_articles.json and re-run generate_kb.py --upload. The script upserts by ID so existing vectors are overwritten cleanly.
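The upsert-by-ID behaviour can be illustrated with an in-memory stand-in for the index: writing a record whose ID already exists replaces it instead of creating a duplicate. This sketch does not use the Pinecone client, and the article IDs and record fields are invented. How generate_kb.py derives its IDs is not shown here.

```python
def upsert(index, records):
    """Insert or overwrite each record by its ID; return the index size."""
    for record in records:
        index[record["id"]] = record
    return len(index)

index = {}
# First run: two articles are written.
upsert(index, [{"id": "kb-001", "text": "Old roaming rates"},
               {"id": "kb-002", "text": "SIM swap steps"}])
# Second run after editing kb-001: the count stays at 2, the text changes.
size = upsert(index, [{"id": "kb-001", "text": "New roaming rates"}])
print(size, index["kb-001"]["text"])
```

One consequence worth noting: an article removed from the JSON file would not be removed by an upsert alone, so deletions would need separate handling.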
Scale the HITL channel: The Slack HITL node posts to a single channel. For high volume, add a Switch node before the HITL node to route each request to the appropriate channel by refund amount or category.
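The routing rule such a Switch node would encode might look like the sketch below. The channel names, the R500 threshold, and the "fraud" category are all hypothetical placeholders to adapt to your own approval structure.

```python
def approval_channel(amount_rand, category="billing", high_value_threshold=500.0):
    """Choose a Slack channel for a refund request (all names hypothetical)."""
    if category == "fraud":
        # Category takes priority over amount in this sketch.
        return "#refund-approvals-fraud"
    if amount_rand >= high_value_threshold:
        return "#refund-approvals-high-value"
    return "#refund-approvals"

print(approval_channel(149))
print(approval_channel(899))
print(approval_channel(50, category="fraud"))
```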
Persistent memory: The workflow uses an in-memory buffer window for conversation history (suitable for demos). For production, replace the Chat Memory node with the MongoDB or Redis memory node.
The three Time Saved nodes log per-execution time savings to n8n Insights:
| Branch | Minutes saved | Rationale |
|---|---|---|
| Knowledge Base | 1 min | Simple FAQ deflection |
| Account Lookup | 3 min | CRM query + response vs agent call |
| Refund Request | 5 min | Structured intake + async approval vs live call |
Adjust the minutesSaved values in each node to match your baseline handle times. Aggregate ROI is visible in Insights > Time Saved in the n8n dashboard.
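For a quick sanity check of what the aggregate should show, total time saved is the per-branch minutes multiplied by the number of executions on each branch. The minutes below come from the table above, and the execution counts are made-up example numbers.

```python
# Minutes per branch, from the table above.
MINUTES_SAVED = {"Knowledge Base": 1, "Account Lookup": 3, "Refund Request": 5}

def total_minutes_saved(executions_per_branch, minutes=MINUTES_SAVED):
    """Sum minutes saved across branches for the given execution counts."""
    return sum(minutes[branch] * count
               for branch, count in executions_per_branch.items())

# Hypothetical counts: 10 KB answers, 4 account lookups, 2 refund requests.
example_counts = {"Knowledge Base": 10, "Account Lookup": 4, "Refund Request": 2}
print(total_minutes_saved(example_counts))  # 10*1 + 4*3 + 2*5 = 32
```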
MIT
