Built a production-grade AI learning platform with an event-driven architecture, integrating Retrieval-Augmented Generation (RAG), semantic caching, and adaptive learning loops. Designed a resilient backend using PostgreSQL transactions, atomic SQL operations, and retry-safe background jobs (Inngest). Implemented embedding-based semantic caching to reduce API cost and latency, and a feedback-driven remediation system that generates personalized study plans based on user performance. Ensured reliability through fallback API strategies, strict validation pipelines, and non-blocking async workflows.
- Users submit a topic and learning preferences to generate a complete study path.
- The system stores course structure, topic state, generated content, quiz attempts, and subscription status in PostgreSQL.
- Long-running AI jobs move out of the request path, so the API stays responsive.
- The backend focuses on reliability and lifecycle control, not only content generation.
Client
  ↓
API (Next.js)
  ↓
DB + Inngest
  ↓
AI (Gemini)
  ↓
DB
  ↓
Polling → UI
- Next.js route handlers expose:
- course generation
- notes generation
- flashcards/quiz generation
- status polling
- dashboard reads
- user stats
- billing
- Each route has one job.
- Inputs are validated and normalized before DB/AI work starts.
- Gemini generates:
- course outlines
- topic notes
- flashcards
- quizzes
- Each request uses a fresh per-request chat session.
- Prompts request structured output for clean validation/storage.
- Inngest handles user creation and study-content generation as async events.
- Request timing is decoupled from AI execution.
- Workers include identity guards so completed records are not reprocessed on retries.
- Drizzle ORM writes to PostgreSQL tables for:
- users
- study materials
- topics
- study-type content
- quiz attempts
- payments
- Database is the source of truth for state/progress/lifecycle.
- Critical multi-table writes use transactions.
- Sync: course creation, topic reads, dashboard reads, quiz attempt writes, payment webhooks, status queries.
- Async: flashcard/quiz generation (Inngest), Clerk-driven user provisioning.
- Client submits a request.
- API validates auth, input, and quotas.
- API either writes in a transaction or queues an Inngest event.
- Worker/route runs AI generation with a fresh session.
- Result is validated, stored, and exposed through status-aware reads.
- Modular monolith with event-driven processing.
- Not microservices.
- Domain logic is separated while app/database remain shared.
- User Request: Frontend submits a course/content action.
- API Validation: Auth, quota, input shape, and context checks run first. Type values are normalized to lowercase before validation.
- DB Write / Status Update: Records are created/updated with lifecycle status. Multi-table writes run in transactions.
- Inngest Event Trigger: Long-running content generation moves to background workers.
- AI Processing: Gemini generates outline/notes/flashcards/quiz data using fresh sessions.
- Validation + Retry/Fallback: Output parsing runs in a dedicated try/catch. Failures trigger retries or fallback logic.
- DB Update: Record status becomes `completed` or `failed`. `retryCount` increases on failures. Progress fields are recalculated.
- Frontend Polling → Response: UI polls status endpoints until generation completes.
- User submits topic, course type, difficulty, creator identity.
- `POST /api/generate-course-outline` checks daily quota via `UserStatsService.incrementDailyCourseCount()`.
- If quota is exceeded, the route returns `429` before the AI call.
- Route builds a prompt for a JSON outline with exactly three chapters.
- Gemini returns structured data.
- JSON parsing happens in a dedicated try/catch.
- Route validates chapter count and trims/pads to the fixed shape.
- If Gemini fails or returns malformed output, the route uses the manual-outline fallback.
- Course + topic inserts run in one DB transaction.
- If the topic insert fails, the course insert is rolled back.
- Progress counters initialize in `studyMaterial`.
- User activity stats update (streak + daily usage).
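The trim/pad normalization above can be sketched as pure logic. `normalizeChapters`, the `CHAPTER_COUNT` constant, and the placeholder shape are illustrative names, not the project's actual code:

```javascript
// Sketch: normalize a parsed Gemini outline to exactly three chapters,
// trimming extras and padding with placeholders so downstream DB
// inserts always see the fixed shape.
const CHAPTER_COUNT = 3;

function normalizeChapters(chapters) {
  const valid = (Array.isArray(chapters) ? chapters : [])
    // keep only records that look like chapters
    .filter((ch) => ch && typeof ch.title === "string" && ch.title.trim());

  // trim anything beyond the fixed shape
  const trimmed = valid.slice(0, CHAPTER_COUNT);

  // pad with placeholder chapters so topic inserts never break
  while (trimmed.length < CHAPTER_COUNT) {
    trimmed.push({ title: `Chapter ${trimmed.length + 1}`, topics: [] });
  }
  return trimmed;
}
```

Normalizing before the transaction means the course + topic inserts never depend on the model honoring the "exactly three chapters" instruction.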
- User selects a topic and requests notes.
- `POST /api/generate-topic-notes` loads the topic row first.
- If already `completed`, cached content is returned.
- If already `generating`, the API returns `202`.
- If in an unexpected state, the API returns `409 Conflict`.
- The `failed` state supports retry.
- Route marks the topic as `generating` before the Gemini call.
- Prompt requests concise markdown with:
  - explanation
  - key points
  - code example
  - interview questions
- On success: write `topics.notesContent`, set status `completed`.
- Recompute course totals/progress percentage.
- On failure: set status `failed`.
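The status dispatch above can be sketched as a single switch over the lifecycle state. `dispatchNotesRequest` and its return shape are hypothetical, shown only to make the branching concrete:

```javascript
// Sketch: each topic lifecycle state maps to one HTTP outcome,
// decided before any AI work starts.
function dispatchNotesRequest(topic) {
  switch (topic.status) {
    case "completed":
      // cached content is returned; no new generation job
      return { httpStatus: 200, body: topic.notesContent };
    case "generating":
      // a job is already running; the client should keep polling
      return { httpStatus: 202, body: null };
    case "pending":
    case "failed":
      // fresh attempt (or retry): caller marks the row `generating`
      return { httpStatus: 200, body: null, startGeneration: true };
    default:
      // unexpected state
      return { httpStatus: 409, body: null };
  }
}
```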
- User chooses flashcards or quiz.
- `POST /api/study-type-content` normalizes `type` to lowercase before validation.
- Unknown types return `400`.
- API checks for an existing `(courseId, type)` row.
- If the existing row is `generating` or `completed`, return the existing ID. Prevents duplicate jobs.
- If a prior attempt `failed`, reuse the row and set it back to `generating`.
- If no row exists, insert a new `studyTypeContent` record.
- On unique-constraint race, safely fetch the existing row instead of throwing.
- Before calling AI: the system checks the embedding-based semantic cache for recent similar queries and returns cached content on a hit to avoid redundant generation.
- If cache miss: a RAG retrieval step fetches top-K relevant content chunks, which are injected into the prompt to ground generation and reduce hallucination.
- Core pipeline: semantic cache → RAG retrieval → AI generation, minimizing latency and cost while keeping outputs grounded.
- API sends an Inngest event with type, prompt, course ID, and record ID.
- Worker exits early if the row is already `completed`.
- Worker sets `generating` and clears the prior error.
- Flashcard/quiz generation uses a fresh per-request model session.
- On failure, increment `retryCount` and update `error`.
- Frontend polls `/api/study-status` until ready.
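The worker's idempotency guard can be sketched as follows. `runGenerationJob` and its row shape are illustrative; the real worker operates on Drizzle rows inside an Inngest function:

```javascript
// Sketch: completed rows are skipped so Inngest retries never
// reprocess finished work; failures update bookkeeping instead of
// discarding the row.
async function runGenerationJob(row, generate) {
  if (row.status === "completed") {
    return { skipped: true, row }; // already done on a previous attempt
  }
  row.status = "generating";
  row.error = null; // clear stale error from a prior failure
  try {
    row.payload = await generate();
    row.status = "completed";
    return { skipped: false, row };
  } catch (err) {
    row.status = "failed";
    row.retryCount = (row.retryCount ?? 0) + 1;
    row.error = String(err.message ?? err);
    return { skipped: false, row };
  }
}
```

Because the guard runs first, a retried event that finds `completed` is a cheap no-op rather than a duplicate AI call.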
- On Gemini `429`, `503`, or resource exhaustion, switch to the fallback API key.
- If the fallback succeeds, continue and reset to the primary key after success.
- If both fail, the row is marked `failed`.
- The error message is stored in the `error` column.
- JSON parsing errors are handled separately.
- Inngest retries absorb transient failures.
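The key-rollover strategy can be sketched as below. `callWithFallback`, the key labels, and the retryable status set are assumptions for illustration, not the project's actual helper:

```javascript
// Sketch: retry on the fallback key only for overload-style errors,
// and reset to the primary key after any success.
const RETRYABLE = new Set([429, 503]);
let activeKey = "primary";

async function callWithFallback(callModel) {
  try {
    const result = await callModel(activeKey);
    activeKey = "primary"; // reset after success
    return result;
  } catch (err) {
    if (!RETRYABLE.has(err.status)) throw err; // not an overload error
    activeKey = "fallback";
    const result = await callModel(activeKey); // second failure propagates
    activeKey = "primary";
    return result;
  }
}
```

Non-retryable errors (such as a `400` from a malformed request) propagate immediately so they surface in the row's `error` column instead of burning the fallback quota.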
- Task-specific prompts per output type.
- Course generation requests JSON structure.
- Notes generation requests markdown.
- Flashcards/quizzes request JSON with fixed keys.
- Prompts centralized in `configs/prompts.js`.
- Fresh model instance created inside each route/worker.
- Prevents shared chat history and cross-request context bleed.
- JSON parsed in dedicated try/catch.
- Course chapter count normalized to exactly 3.
- Quiz/flashcard payloads filtered to valid records only.
- Valid output stored in `studyMaterial`, `topics`, or `studyTypeContent`.
- Status fields drive polling and lifecycle reads.
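The "filter to valid records only" step can be sketched for flashcards. `filterFlashcards` and the `front`/`back` keys are illustrative; the real payload keys live in the project's prompt contracts:

```javascript
// Sketch: keep only flashcard records carrying the fixed keys the
// prompt requested, so malformed model output never reaches the DB.
function filterFlashcards(payload) {
  if (!Array.isArray(payload)) return []; // model returned wrong shape
  return payload.filter(
    (card) =>
      card &&
      typeof card.front === "string" && card.front.trim() &&
      typeof card.back === "string" && card.back.trim()
  );
}
```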
- Switch from primary Gemini key to fallback key on overload/rate limits.
- Reset back to primary after successful call.
- Course generation also includes manual-outline fallback.
- Malformed model JSON.
- Quota/rate-limit failures.
- AI timeouts/overload.
- Partial generation requiring retryability.
- Implemented a RAG pipeline for flashcards, quizzes, and adaptive remediation.
- Course content is chunked and stored with embeddings in PostgreSQL (pgvector).
- Queries are embedded and matched using cosine similarity.
- Multi-step retrieval strategy:
  - Same-course retrieval (strict threshold)
  - Relaxed threshold retry
  - Cross-course fallback
- Top-K relevant chunks are injected into prompts for grounded generation.
- Prevents hallucination and improves consistency across generated content.
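The multi-step retrieval strategy can be sketched in plain JavaScript over pre-computed embeddings (in production pgvector does the similarity search in SQL). The function names and the 0.8 / 0.6 thresholds are illustrative assumptions, not the real configuration:

```javascript
// Sketch: cosine similarity plus the strict → relaxed → cross-course
// retrieval ladder described above.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function retrieveChunks(queryVec, chunks, courseId, topK = 3) {
  const ranked = (pool, threshold) =>
    pool
      .map((c) => ({ ...c, score: cosine(queryVec, c.embedding) }))
      .filter((c) => c.score >= threshold)
      .sort((x, y) => y.score - x.score)
      .slice(0, topK);

  const sameCourse = chunks.filter((c) => c.courseId === courseId);
  // 1. same-course retrieval with a strict threshold
  let hits = ranked(sameCourse, 0.8);
  // 2. relaxed-threshold retry within the same course
  if (hits.length === 0) hits = ranked(sameCourse, 0.6);
  // 3. cross-course fallback
  if (hits.length === 0) hits = ranked(chunks, 0.8);
  return hits;
}
```

The ladder trades precision for recall only when the stricter steps come back empty, so prompts are grounded in the most course-specific context available.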
- Implemented embedding-based semantic cache to avoid redundant AI calls.
- Cache lookup flow:
  - Embed incoming query
  - Compare with stored embeddings
  - Return cached result if similarity ≥ threshold
- Supports:
  - Exact hit → immediate response
  - Near hit → fallback reuse when AI fails
- Reduces latency and API cost significantly.
- Cache stored in PostgreSQL with vector indexing.
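The exact-hit / near-hit distinction can be sketched as a two-threshold lookup. The 0.95 and 0.85 cutoffs and the `lookupCache` name are illustrative, not the project's real values:

```javascript
// Sketch: a very high similarity short-circuits generation ("hit");
// a moderately high one is kept as fallback content ("near") in case
// the AI call later fails.
function cosineSim(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function lookupCache(queryVec, entries) {
  let best = null;
  for (const entry of entries) {
    const score = cosineSim(queryVec, entry.embedding);
    if (!best || score > best.score) best = { ...entry, score };
  }
  if (!best) return { kind: "miss" };
  if (best.score >= 0.95) return { kind: "hit", content: best.content };
  if (best.score >= 0.85) return { kind: "near", content: best.content };
  return { kind: "miss" };
}
```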
- Built a feedback-driven learning loop based on quiz performance.
- Flow:
  - User completes quiz
  - Score is evaluated against threshold
  - Weak topics are extracted from wrong answers
  - Background job generates targeted remediation plan
- Remediation content includes:
  - topic-specific explanations
  - key points
  - practice questions
  - correct answers
- Stored separately and linked to quiz attempts for traceability.
- Transforms the system into a feedback-driven adaptive learning loop, where user mistakes directly influence future content generation and reinforcement strategy.
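The weak-topic extraction step can be sketched as pure logic. `extractWeakTopics`, the attempt shape, and the 0.7 passing ratio are assumptions for illustration:

```javascript
// Sketch: pull unique weak topics out of a quiz attempt so a
// background job can build a targeted remediation plan.
const PASS_THRESHOLD = 0.7; // assumed passing ratio

function extractWeakTopics(attempt) {
  const score = attempt.correct / attempt.total;
  if (score >= PASS_THRESHOLD) return []; // no remediation needed
  const weak = new Set();
  for (const answer of attempt.answers) {
    if (!answer.isCorrect) weak.add(answer.topic);
  }
  return [...weak];
}
```

Deduplicating by topic keeps the remediation plan focused on concepts rather than individual missed questions.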
- Designed system for active learning, not passive reading:
  - Users attempt answers before seeing solutions
  - Explanation and key points revealed only after answer
- Weak Areas dashboard:
  - Tracks low-score attempts
  - Surfaces targeted remediation plans
- Encourages retention through recall-based practice.
Topic notes lifecycle:
- `pending`: topic exists, notes not generated.
- `generating`: generation in progress.
- `completed`: notes persisted.
- `failed`: generation failed; retries allowed.

Study-type content lifecycle:
- `generating`: async generation running.
- `completed`: flashcards/quiz ready.
- `failed`: `retryCount` incremented and `error` captured.
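The lifecycles above amount to a small explicit state machine. A minimal sketch (the transition table is inferred from the statuses described here, not taken from the codebase):

```javascript
// Sketch: only these transitions are legal, which is what makes
// retries and polling behavior predictable.
const TRANSITIONS = {
  pending: ["generating"],
  generating: ["completed", "failed"],
  failed: ["generating"], // retry path
  completed: [],          // terminal: workers skip these rows
};

function canTransition(from, to) {
  return (TRANSITIONS[from] ?? []).includes(to);
}
```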
- Notes route checks status before work.
- Study-content route reuses existing row.
- Worker skips already-completed rows.
- Returning `202` or a cached payload prevents duplicate AI calls/inserts.
- Repeated requests are safe in practice.
- Existing records reused instead of repeatedly inserting.
- Explicit state transitions make retries predictable.
- Why async (Inngest)? Flashcards/quizzes are long-running; background jobs keep APIs responsive.
- Why DB transactions? Multi-table operations (create/delete course) must stay atomic.
- Why atomic SQL increments for quiz stats? Prevent read-modify-write races via DB-side updates (e.g., `` sql`${col} + 1` ``).
- Why per-request AI model factories? Avoid shared history and cross-user context leakage.
- Why retries? External AI/network/DB errors are often transient.
- Why fallback API key? Improves availability under provider throttling.
- Why JSON validation? Protects database/frontend from malformed AI output.
- Why store progress in DB? Fast dashboard reads with denormalized counters.
- Why centralized fallback logic (DRY)? Less duplication, easier tuning.
- Why frontend string normalization? Avoid glitches from casing/whitespace inconsistencies.
- Why propagate backend AI errors to UI? Better observability and user transparency.
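The read-modify-write race behind the atomic-increment decision can be demonstrated in isolation. This is a self-contained simulation, not app code; the fake in-memory "DB" stands in for PostgreSQL:

```javascript
// Demonstration: two concurrent read-modify-write updates lose one
// increment, while an atomic single-step update (the analogue of
// SET col = col + 1 in SQL) does not.
function makeDb() {
  let value = 0;
  return {
    read: async () => value,
    write: async (v) => { value = v; },
    atomicIncrement: async () => { value += 1; }, // one indivisible step
    get: () => value,
  };
}

async function readModifyWrite(db) {
  const current = await db.read(); // both workers read the same value
  await db.write(current + 1);     // second write clobbers the first
}

async function demo() {
  const racy = makeDb();
  await Promise.all([readModifyWrite(racy), readModifyWrite(racy)]);

  const atomic = makeDb();
  await Promise.all([atomic.atomicIncrement(), atomic.atomicIncrement()]);

  return { racy: racy.get(), atomic: atomic.get() };
}
```

In this scheduling both racy workers read 0 before either writes, so the racy counter ends at 1 while the atomic counter ends at 2, which is exactly why quiz stats use DB-side increments.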
- Event-driven architecture
- Async processing
- Database transactions
- Atomic SQL writes
- DB constraints and normalization
- Retry mechanisms
- Rate limiting and quotas
- Data consistency via explicit state transitions
- API input normalization and validation
- Pollable status endpoints
- Retrieval-Augmented Generation (RAG)
- Semantic caching with vector similarity
- Adaptive learning systems
- Feedback-driven content generation
Identity, membership, streak, study time, completed courses, daily usage, quiz stats.
Generated course outline, type, topic, difficulty, creator, and progress fields.
Chapter/topic indexing, titles, notes content, and generation status.
Flashcards/quiz payload per course/type.
Unique constraint: `(courseId, type)` plus `retryCount` and `error`.
Scores, totals, percentages, timing, and quiz history.
Payment identifiers and subscription state from Stripe webhooks.
- Fallback key on quota/overload/resource exhaustion.
- Separate JSON parse error handling.
- Persist failed state with retryability metadata.
- Course listing retries with exponential backoff.
- Read failures return proper errors (no fake fallback data).
- Multi-table writes protected with transactions.
- Verify affected rows.
- Log missing users clearly.
- Null-check sensitive webhook fields.
- Inngest retries for transient failures.
- Worker skips completed rows.
- Reused rows prevent duplicate content.
- Primary Gemini β fallback key rollover.
- Course generation manual-outline fallback.
- Notes route may return cached completed content.
- Non-blocking APIs for interactive responsiveness
- Background jobs isolate AI latency
- Atomic SQL increments remove counter races
- Transactions ensure write consistency
- Per-request AI sessions avoid shared state
- Quotas prevent cost spikes
- Status polling keeps UI simple
- Denormalized counters speed dashboard reads
- Cached content avoids repeated AI calls
- Semantic caching reduces redundant AI calls and API cost
- RAG improves response accuracy and reduces hallucination in generated content
- Adaptive remediation reduces repeated mistakes by generating targeted study plans that reinforce weak concepts
- Heavy AI work moved to background jobs
- API can handle more concurrent users
- No long-held HTTP connections for generation
- Polling replaces long-lived request lifecycles
- Daily/subscription limits protect cost and throughput
- Unique constraints prevent duplicate writes under load
- IDOR prevention: strict server-side `currentUser()` identity checks
- Client-provided sensitive identifiers are ignored
- Clerk middleware protects private routes
- Authenticated routes use server-side context
- Clerk and Stripe webhooks use signature verification
- DB connection string remains server-only
- Secrets/tokens never exposed to client
- Billing/provisioning only via trusted backend handlers
- Next.js 15 (App Router)
- React 19
- PostgreSQL
- Drizzle ORM
- Inngest
- Google Gemini AI
- Clerk Authentication
- Stripe Billing
- Tailwind CSS
- Radix UI
```bash
npm install
node scripts/enable-pgvector.js
npx drizzle-kit push
npm run dev
npx inngest-cli dev
```