Production-grade multi-tenant AI Agent memory backend built with FastAPI, LangGraph, LangMem, Redis, Qdrant, and PostgreSQL.
┌─────────────────────────────────────────────────────────┐
│ API Layer (FastAPI) │
│ /prompt /llm /memory /emotion /personality │
└─────────────────────┬───────────────────────────────────┘
│ DI (dependency-injector)
┌─────────────────────▼───────────────────────────────────┐
│ Service Layer │
│ PromptService LLMService MemoryService │
│ EmotionService PersonalityService │
└──────┬────────────┬────────────────────────────────────┘
│ │
┌──────▼──────┐ ┌───▼──────────────────────────────────┐
│ LangGraph │ │ Store Layer │
│ Workflow │ │ Redis | Qdrant | PostgreSQL (async) │
└─────────────┘ └──────────────────────────────────────┘
| Layer | Storage | Purpose |
|---|---|---|
| Short-term | Redis | Recent messages, session summary, emotion state |
| Long-term | Qdrant | Semantic vector search over user history |
| Core Memory | PostgreSQL | Stable user facts, personality, preferences |
All features are toggled via environment variables:
| Flag | Default | Effect |
|---|---|---|
| MEMORY_ENABLED | true | Master switch: if false, requests pass directly to the LLM |
| SHORT_TERM_MEMORY_ENABLED | true | Redis short-term memory |
| LONG_TERM_MEMORY_ENABLED | true | Qdrant vector memory |
| CORE_MEMORY_ENABLED | true | PostgreSQL core memory |
| EMOTION_ENABLED | true | Emotion detection and injection |
| PERSONALITY_ENABLED | true | Personality profile injection |
| LLM_ENABLED | true | Whether to call the LLM at all |
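As a rough illustration of how boolean flags like these might be read, here is a minimal sketch using only the standard library. The names `env_flag` and `FeatureFlags` are hypothetical and are not taken from config/settings.py. Only the memory-related flags are shown, and letting the master switch override the per-layer flags is an assumption based on the "master switch" description.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

_TRUE_VALUES = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool = True,
             env: Optional[Mapping[str, str]] = None) -> bool:
    """Read a boolean flag; unset variables fall back to the default."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUE_VALUES


@dataclass(frozen=True)
class FeatureFlags:
    """Hypothetical container for the memory-related flags."""
    memory_enabled: bool
    short_term_enabled: bool
    long_term_enabled: bool
    core_enabled: bool

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "FeatureFlags":
        # Assumption: MEMORY_ENABLED=false disables every memory layer.
        master = env_flag("MEMORY_ENABLED", env=env)
        return cls(
            memory_enabled=master,
            short_term_enabled=master and env_flag("SHORT_TERM_MEMORY_ENABLED", env=env),
            long_term_enabled=master and env_flag("LONG_TERM_MEMORY_ENABLED", env=env),
            core_enabled=master and env_flag("CORE_MEMORY_ENABLED", env=env),
        )
```

Calling `FeatureFlags.from_env()` with no argument reads from the process environment; passing a plain dict makes the behaviour easy to check in tests.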
cp .env.example .env
# Edit .env with your OpenAI API key, DB credentials, etc.
pip install -e ".[dev]"
# or with uv:
uv sync
You need PostgreSQL, Redis, and Qdrant running. Minimal quick-start commands:
# PostgreSQL
docker run -d --name pg -e POSTGRES_PASSWORD=postgres -p 5432:5432 postgres:16
# Redis
docker run -d --name redis -p 6379:6379 redis:7
# Qdrant
docker run -d --name qdrant -p 6333:6333 qdrant/qdrant
# From the project root (where alembic.ini lives):
alembic upgrade head
cd src
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
Navigate to: http://localhost:8000/docs
| Method | Path | Description |
|---|---|---|
| POST | /api/v1/prompt/build | Build prompt from all memory layers |
| POST | /api/v1/llm/invoke | Build prompt + call LLM |
| POST | /api/v1/memory/short-term/message | Write message to Redis |
| GET | /api/v1/memory/short-term/{tenant_id}/{user_id}/{session_id} | Get short-term memory |
| POST | /api/v1/memory/short-term/summary | Update session summary |
| POST | /api/v1/memory/core/upsert | Upsert core memory variable |
| GET | /api/v1/memory/core/{tenant_id}/{user_id} | Get all core memory |
| POST | /api/v1/memory/long-term/upsert | Store fact + embedding in Qdrant |
| POST | /api/v1/memory/long-term/search | Semantic search in Qdrant |
| POST | /api/v1/memory/extract | LangMem memory extraction |
| POST | /api/v1/emotion/analyze | Analyse emotion from text |
| GET | /api/v1/emotion/{tenant_id}/{user_id}/{session_id} | Get emotion state |
| POST | /api/v1/emotion/config | Update emotion feature flags |
| GET | /api/v1/personality/{tenant_id}/{user_id} | Get personality profile |
| POST | /api/v1/personality/upsert | Update personality variables |
| POST | /api/v1/personality/config | Toggle personality injection |
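To make the endpoint table more concrete, the following sketch builds (but does not send) a request to /api/v1/llm/invoke using only the standard library. The three identifiers come from this README; the `message` field, the sample values, and the helper name `build_invoke_request` are assumptions, so check the interactive docs at /docs for the actual request schema.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # matches the uvicorn command in this README


def build_invoke_request(tenant_id: str, user_id: str,
                         session_id: str, message: str) -> Request:
    """Build a POST request for the LLM invoke endpoint without sending it."""
    payload = {
        "tenant_id": tenant_id,
        "user_id": user_id,
        "session_id": session_id,
        "message": message,  # hypothetical field name; verify against /docs
    }
    return Request(
        url=f"{BASE_URL}/api/v1/llm/invoke",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_invoke_request("acme", "user-1", "session-1", "Hello")
# With the server running, the request could be sent with:
#   urllib.request.urlopen(req)
```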
src/
├── app/
│ ├── main.py # FastAPI app + lifespan
│ ├── containers.py # DI container (dependency-injector)
│ ├── constants.py # Env-loaded constants
│ ├── lifecycle.py # @Init / @Destroy decorators
│ └── events.py # Event subscribers
├── config/
│ └── settings.py # All settings and feature flags
├── api/
│ ├── deps.py # FastAPI dependency factories
│ ├── memory_api.py # Memory endpoints
│ ├── prompt_api.py # Prompt build endpoint
│ ├── llm_api.py # LLM invoke endpoint
│ ├── emotion_api.py # Emotion endpoints
│ └── personality_api.py # Personality endpoints
├── models/ # SQLAlchemy ORM models
│ ├── core_memory.py
│ ├── personality.py
│ └── emotion.py
├── schemas/ # Pydantic request/response schemas
│ ├── common.py
│ ├── memory.py
│ ├── emotion.py
│ ├── personality.py
│ └── prompt.py
├── stores/ # Data access layer
│ ├── redis_store.py # Async Redis (short-term memory)
│ ├── qdrant_store.py # Async Qdrant (long-term memory)
│ └── postgres_repo.py # Async SQLAlchemy repos
├── services/ # Business logic
│ ├── memory_service.py
│ ├── prompt_service.py
│ └── llm_service.py
├── workflows/
│ └── prompt_workflow.py # LangGraph 7-node workflow
├── agents/
│ └── memory_agent.py # LangMem extraction agent
├── prompt/
│ └── builder.py # PromptBuilder (assembles blocks)
├── emotion/
│ ├── analyzer.py # LLM + rule-based emotion classifiers
│ └── service.py # Emotion orchestration
├── personality/
│ ├── service.py
│ └── prompt_adapter.py # Personality → prompt block
├── llm/
│ ├── base.py # LLMProvider ABC
│ └── openai_provider.py # OpenAI + Mock implementations
├── embeddings/
│ ├── base.py # EmbeddingProvider ABC
│ └── openai_embeddings.py # OpenAI + Mock implementations
├── db/
│ ├── base.py # SQLAlchemy DeclarativeBase
│ └── session.py # Async session factory
├── migrations/
│ ├── env.py # Alembic env
│ ├── script.py.mako # Migration template
│ └── versions/
│ └── 20240101_001_initial_schema.py
└── utils/ # Existing utilities (unchanged)
Migrations are managed with Alembic. Tables are created in the schema specified by DB_SCHEMA (default public; set to mem in the current .env).
Make sure your .env has the correct schema:
DB_SCHEMA=mem # all tables land here
DB_DATABASE=demo # target database
# From the project root (where alembic.ini lives)
alembic upgrade head
This will:
- Create the schema if it does not exist (handled by search_path in asyncpg connect args)
- Create alembic_version in the target schema to track state
- Apply every pending migration in src/migrations/versions/
# Rollback the last applied migration
alembic downgrade -1
# Rollback all the way to a clean slate
alembic downgrade base
# Auto-generate a migration after changing a model
alembic revision --autogenerate -m "add user_preference table"
# Show applied / pending migration history
alembic history --verbose
# Show current DB revision
alembic current
- Schema isolation: DB_SCHEMA is read at runtime. Changing it in .env and re-running alembic upgrade head will create tables in the new schema without touching the old one.
- asyncpg search_path: The env.py passes server_settings={"search_path": DB_SCHEMA} to asyncpg so the schema is active before Alembic opens its transaction. This is why ?options= in the URL is NOT used.
- Existing tables: All migration scripts include explicit schema=SCHEMA arguments so they are always idempotent and schema-aware regardless of the connection's default search path.
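The search_path note above can be sketched as follows. This is an illustration, not the project's env.py: the helper name `asyncpg_connect_args` is hypothetical, and the engine call is shown only as a comment so the snippet runs without SQLAlchemy or asyncpg installed.

```python
import os


def asyncpg_connect_args(schema: str) -> dict:
    """Build connect args that set search_path at the start of each connection."""
    return {"server_settings": {"search_path": schema}}


# DB_SCHEMA defaults to "public", as described in this README.
schema = os.environ.get("DB_SCHEMA", "public")
connect_args = asyncpg_connect_args(schema)

# With SQLAlchemy's asyncpg dialect, the dict would typically be passed as:
#   from sqlalchemy.ext.asyncio import create_async_engine
#   engine = create_async_engine(database_url, connect_args=connect_args)
```

Setting the path through server settings keeps the database URL free of driver-specific query parameters such as ?options=.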
# Build image
docker build -t mem-backend:latest .
# Run (assumes external services are accessible)
docker run -d \
--name mem-backend \
-p 8000:8000 \
--env-file .env \
mem-backend:latest
Every API request includes tenant_id, user_id, and session_id.
- Redis keys: shortmem:{tenant_id}:{user_id}:{session_id}:{suffix}
- Qdrant points: filtered by tenant_id + user_id in payload
- PostgreSQL rows: all tables have a tenant_id column with unique indexes
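The isolation rules above could look roughly like this in code. The key layout follows the pattern given in this README; the function names, the `messages` suffix used in the usage note, and the rejection of colons inside identifiers are assumptions added for illustration. The filter dict uses Qdrant's JSON filter shape (`must` with `key`/`match` conditions).

```python
def short_term_key(tenant_id: str, user_id: str,
                   session_id: str, suffix: str) -> str:
    """Build a Redis key of the form shortmem:{tenant}:{user}:{session}:{suffix}."""
    parts = (tenant_id, user_id, session_id, suffix)
    for part in parts:
        # Assumption: reject empty parts and colons so keys cannot collide
        # across tenants (e.g. tenant "a:b" vs tenant "a" with user "b").
        if not part or ":" in part:
            raise ValueError(f"invalid key component: {part!r}")
    return "shortmem:" + ":".join(parts)


def qdrant_tenant_filter(tenant_id: str, user_id: str) -> dict:
    """Build a Qdrant payload filter restricting results to one tenant and user."""
    return {
        "must": [
            {"key": "tenant_id", "match": {"value": tenant_id}},
            {"key": "user_id", "match": {"value": user_id}},
        ]
    }
```

For example, `short_term_key("acme", "u1", "s1", "messages")` yields `shortmem:acme:u1:s1:messages`.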
The prompt-building pipeline runs as a 7-node LangGraph directed graph:
load_core_memory → load_short_term → search_long_term
→ load_emotion → load_personality → build_prompt → END
Each node can be disabled independently through its feature flag, without modifying the graph.
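The idea of flag-gated nodes in a fixed graph can be sketched in plain Python. This is a stand-in, not LangGraph code and not the project's prompt_workflow.py: the node order comes from the listing above, while the node-to-flag mapping and the function `run_pipeline` are assumptions for illustration.

```python
from typing import Dict, List, Optional, Tuple

# (node name, gating flag); None means the node always runs.
# Assumed mapping between nodes and the feature flags in this README.
PIPELINE: List[Tuple[str, Optional[str]]] = [
    ("load_core_memory", "CORE_MEMORY_ENABLED"),
    ("load_short_term", "SHORT_TERM_MEMORY_ENABLED"),
    ("search_long_term", "LONG_TERM_MEMORY_ENABLED"),
    ("load_emotion", "EMOTION_ENABLED"),
    ("load_personality", "PERSONALITY_ENABLED"),
    ("build_prompt", None),
]


def run_pipeline(flags: Dict[str, bool]) -> List[str]:
    """Walk the fixed node order and return the nodes that did real work."""
    executed: List[str] = []
    for name, flag in PIPELINE:
        # A disabled node is skipped as a no-op; the node order never changes.
        if flag is not None and not flags.get(flag, True):
            continue
        executed.append(name)
    return executed
```

Because the gating happens inside each step, turning a flag off changes what a node does rather than how the nodes are wired together.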
POST /api/v1/memory/extract runs LangMem extraction against a conversation.
If langmem is not installed, the endpoint falls back to a direct LLM-based extractor.
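One common way to implement this kind of optional-dependency fallback is to probe for the package before choosing an implementation. The sketch below only shows the selection step; the function name `pick_extractor` and the returned labels are hypothetical, and the project's memory_agent.py may do this differently.

```python
import importlib.util


def pick_extractor(module_name: str = "langmem") -> str:
    """Return which extractor to use, based on whether the module is importable."""
    # find_spec returns None for a top-level module that is not installed.
    if importlib.util.find_spec(module_name) is not None:
        return "langmem"
    return "llm_fallback"
```

The `module_name` parameter exists only so the fallback branch can be exercised without uninstalling anything.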
| Component | Strategy |
|---|---|
| PostgreSQL | pg_dump daily, point-in-time recovery (WAL archiving) |
| Redis | BGSAVE + RDB snapshots, or AOF for persistence |
| Qdrant | Built-in snapshot API: POST /collections/{name}/snapshots |
See .env.example for the full list of environment variables with descriptions.