A FastAPI backend that converts natural language financial questions into SQL, executes them safely, and generates human-readable insights with Claude (Anthropic).
```
User Question (NL)
         │
         ▼
┌───────────────────┐
│    POST /query    │  FastAPI endpoint
└───────────────────┘
         │
         ▼
┌───────────────────┐
│    Cache Check    │  In-memory TTL cache (Redis-optional)
└───────────────────┘
         │ miss
         ▼
┌───────────────────┐
│ NL → SQL (Claude) │  Anthropic API call → JSON {sql, params}
└───────────────────┘
         │
         ▼
┌───────────────────┐
│  SQL Validation   │  Whitelist: SELECT-only, no DDL/DML
└───────────────────┘
         │
         ▼
┌───────────────────┐
│  SQLite Execute   │  Parameterized queries (? placeholders)
└───────────────────┘
         │
         ▼
┌───────────────────┐
│  Insight (Claude) │  LLM summarises raw rows → readable insight
└───────────────────┘
         │
         ▼
┌───────────────────┐
│  History Log +    │  Stored in query_history table
│  Cache Write      │  Cached for 5 minutes
└───────────────────┘
         │
         ▼
    JSON Response
```
| Feature | Implementation |
|---|---|
| NL → SQL | Claude (claude-sonnet-4-20250514) via Anthropic API (call sketched after the table) |
| Database | SQLite (parameterized `?` placeholders → SQL injection safe) |
| SQL Safety | Whitelist validation: only SELECT, blocks INSERT/UPDATE/DELETE/DROP/PRAGMA |
| Caching | In-process TTL dict cache; auto-upgrades to Redis if `REDIS_URL` is set |
| LLM Insights | Second Claude call turns raw rows into a 2-3 sentence insight |
| Query History | All queries logged to the query_history table with success/fail status and row count |
| Analytics | `/analytics/{user_id}` → category breakdown, monthly trends, top merchant |
| Error Handling | Invalid SQL, empty results, LLM failures, bad inputs → all handled gracefully |
| Sample Data | 600 seeded transactions (3 users × 200 rows) across 8 categories |
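The NL → SQL step is a single Anthropic API call that must come back as structured JSON. A minimal sketch of what that call can look like, where the function name, the prompt wording, and the table columns (inferred from the examples later in this README) are all assumptions, not the app's actual code:

```python
import json
import os

import anthropic  # pip install anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

# Illustrative schema prompt; column names are inferred from the sample
# queries in this README, not copied from app/query_engine.py.
SYSTEM_PROMPT = (
    "Translate questions about the SQLite table "
    "transactions(user_id, category, merchant, amount, transaction_date) into SQL. "
    'Respond with JSON only: {"sql": "...", "params": [...]}. '
    "Use ? placeholders, and emit null where the caller's user_id belongs."
)

def nl_to_sql(question: str) -> tuple[str, list]:
    msg = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=512,
        system=SYSTEM_PROMPT,
        messages=[{"role": "user", "content": question}],
    )
    payload = json.loads(msg.content[0].text)
    return payload["sql"], payload["params"]
```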
To run locally:

```bash
git clone <repo-url>
cd query-engine
pip install -r requirements.txt
cp .env.example .env
# Edit .env and set your Anthropic API key:
# ANTHROPIC_API_KEY=sk-ant-...
```

Optional: set `REDIS_URL=redis://localhost:6379/0` for Redis caching.

```bash
python run.py
# Server starts at http://localhost:8000
```

The database is created and seeded automatically on first start.
`POST /query`

Convert a natural language question into SQL, execute it, and return an AI-generated insight.

Request:

```json
{
  "user_id": 1,
  "question": "How much did I spend on food last month?"
}
```

Response:

```json
{
  "success": true,
  "question": "How much did I spend on food last month?",
  "sql": "SELECT SUM(amount) AS total_spent FROM transactions WHERE user_id = ? AND category = ? AND transaction_date >= date('now','start of month','-1 month') AND transaction_date < date('now','start of month')",
  "data": [{ "total_spent": 3240.50 }],
  "row_count": 1,
  "insight": "You spent ₹3,240.50 on Food last month across multiple merchants. This is roughly ₹108 per day – consider meal planning to bring this down.",
  "cached": false,
  "latency_ms": 1842.3
}
```
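With the server from `run.py` listening on port 8000, the endpoint can be exercised in a few lines of Python; this sketch uses `requests`, which is not necessarily part of `requirements.txt`:

```python
import requests  # assumes the server from run.py is running on localhost:8000

resp = requests.post(
    "http://localhost:8000/query",
    json={"user_id": 1, "question": "How much did I spend on food last month?"},
)
body = resp.json()
print(body["sql"])      # the generated, validated SELECT
print(body["insight"])  # the 2-3 sentence Claude summary
```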
Returns spending breakdown and monthly trends (the grouped query behind `by_category` is sketched after the example).

`GET /analytics/1`

```json
{
  "user_id": 1,
  "total_spent": 483920.39,
  "total_transactions": 200,
  "avg_transaction": 2419.60,
  "top_category": "Education",
  "top_merchant": "Coursera",
  "by_category": [
    { "category": "Education", "total": 90946.84, "count": 36 },
    { "category": "Food", "total": 67979.18, "count": 30 }
  ],
  "by_month": [
    { "month": "2026-04", "total": 54449.22, "count": 28 }
  ]
}
```
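A hypothetical shape for that grouped query; the actual SQL lives in `app/database.py` and may differ:

```python
# Hypothetical shape of the query behind "by_category": spend per category
# for one user, parameterized on user_id like everything else in the app.
BY_CATEGORY_SQL = """
    SELECT category, SUM(amount) AS total, COUNT(*) AS count
    FROM transactions
    WHERE user_id = ?
    GROUP BY category
    ORDER BY total DESC
"""
```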
Returns query history for a user.

`GET /history/1?limit=10`

Health check response:

```json
{ "status": "ok", "cache_size": 3 }
```

The following sample questions all work out of the box with the seeded data:
"How much did I spend on food last month?"
"Show me my top 5 most expensive transactions"
"What did I spend on entertainment this year?"
"How many transactions did I make on Swiggy?"
"What is my average transaction amount?"
"Show me my spending by category"
"Which month did I spend the most?"
"How much did I spend on Uber this year?"
- SQL Injection: All user-supplied values go through `?` placeholders. The only strings ever interpolated are category/column names generated by the LLM under a whitelisted schema prompt.
- SQL Whitelist: A `_validate_sql()` guard blocks any statement that doesn't start with `SELECT` or that contains DDL/DML keywords (`INSERT`, `UPDATE`, `DELETE`, `DROP`, `ALTER`, `CREATE`, `PRAGMA`, `ATTACH`); a sketch of this guard follows the list.
- Input validation: Pydantic models enforce `user_id > 0` and `3 ≤ len(question) ≤ 500`.
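A minimal sketch of what such a guard can look like; the real `_validate_sql()` lives in `app/query_engine.py` and may differ in detail:

```python
import re

# Keywords the whitelist rejects, per the list above.
_BLOCKED = ("INSERT", "UPDATE", "DELETE", "DROP", "ALTER", "CREATE", "PRAGMA", "ATTACH")

def _validate_sql(sql: str) -> bool:
    """Allow a single SELECT statement; reject anything carrying DDL/DML keywords."""
    normalized = sql.strip().upper()
    if not normalized.startswith("SELECT"):
        return False
    # Word-boundary match so a column like created_at does not trip CREATE.
    return not any(re.search(rf"\b{kw}\b", normalized) for kw in _BLOCKED)
```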
```bash
pytest tests/ -v
```

17 tests covering:
- Database seeding and parameterized queries
- Analytics endpoint structure
- In-memory cache (get/set/TTL/delete/size)
- SQL whitelist validation (SELECT allowed, DDL/DML blocked; an example test appears after this list)
- User ID injection into parameterized queries
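For flavor, a hypothetical test in the spirit of the whitelist coverage; it reuses the `_validate_sql` sketch above rather than the project's real import path:

```python
# Hypothetical test mirroring the whitelist checks in tests/test_engine.py;
# assumes the _validate_sql sketch above is importable.
def test_validate_sql_whitelist():
    assert _validate_sql("SELECT * FROM transactions WHERE user_id = ?")
    assert not _validate_sql("DROP TABLE transactions")
    assert not _validate_sql("INSERT INTO transactions VALUES (1)")
    assert not _validate_sql("SELECT 1; DELETE FROM transactions")
```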
```
query-engine/
├── app/
│   ├── __init__.py
│   ├── main.py            # FastAPI app, routes, startup
│   ├── models.py          # Pydantic request/response schemas
│   ├── database.py        # SQLite layer, seeding, history, analytics
│   ├── query_engine.py    # NL→SQL, validation, execution, LLM insight
│   └── cache.py           # InMemoryCache + RedisCache with auto-fallback
├── tests/
│   └── test_engine.py     # 17 unit tests
├── run.py                 # Entry point (uvicorn)
├── requirements.txt
├── .env.example
└── README.md
```
Why SQLite? Zero setup for the assignment; the database abstraction layer in `database.py` is easily swapped for PostgreSQL by replacing `sqlite3` with `psycopg2` and changing `?` placeholders to `%s`.
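A sketch of that placeholder swap (the helper is hypothetical, and it assumes validated SQL never contains a literal `?` outside a placeholder):

```python
# Hypothetical helper for a PostgreSQL port: sqlite3 uses "?" placeholders,
# psycopg2 uses "%s". A plain replace suffices only because user values are
# never interpolated into the SQL string itself (see the security notes above).
def to_postgres_placeholders(sql: str) -> str:
    return sql.replace("?", "%s")
```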
Why two LLM calls? The NL→SQL call must return structured JSON (`sql` + `params`). Mixing insight generation into that prompt degrades JSON reliability. Keeping them separate gives cleaner outputs and easier error handling.
Why separate param injection? Claude returns `null` as the `user_id` placeholder, which gets replaced with the real value server-side. This prevents prompt injection: an attacker who manipulates the question cannot override the `user_id` used in the query.
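A sketch of that server-side substitution; the helper name is illustrative, not the app's actual function:

```python
# Claude emits null where user_id belongs; the authenticated user_id is
# swapped in before execution, so the question text cannot change it.
def inject_user_id(params: list, user_id: int) -> list:
    return [user_id if p is None else p for p in params]

# e.g. for the /query example above, params arrive as [null, "Food"]:
inject_user_id([None, "Food"], user_id=1)  # -> [1, "Food"]
```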
Caching strategy: 5-minute TTL on successful query results. Cache key = `query:{user_id}:{question.lower()}`. Redis is used if `REDIS_URL` is set; otherwise it falls back to the in-process dict cache with zero config required.
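A condensed sketch of the in-process side of that design; the real classes live in `app/cache.py`, and the method names here are assumptions:

```python
import time
from typing import Any

class InMemoryCache:
    """Sketch of the TTL dict cache described above."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:  # 5-minute TTL
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, Any]] = {}

    @staticmethod
    def make_key(user_id: int, question: str) -> str:
        return f"query:{user_id}:{question.lower()}"

    def get(self, key: str) -> Any | None:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # lazily evict expired entries
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._store[key] = (time.monotonic() + self.ttl, value)
```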