paganini2008/ephemera

AI Agent Memory System — Backend

Production-grade multi-tenant AI Agent memory backend built with FastAPI, LangGraph, LangMem, Redis, Qdrant, and PostgreSQL.


Architecture Overview

┌─────────────────────────────────────────────────────────┐
│                    API Layer (FastAPI)                   │
│  /prompt  /llm  /memory  /emotion  /personality         │
└─────────────────────┬───────────────────────────────────┘
                      │ DI (dependency-injector)
┌─────────────────────▼───────────────────────────────────┐
│               Service Layer                             │
│  PromptService  LLMService  MemoryService               │
│  EmotionService  PersonalityService                     │
└──────┬────────────┬────────────────────────────────────┘
       │            │
┌──────▼──────┐ ┌───▼──────────────────────────────────┐
│  LangGraph  │ │         Store Layer                  │
│  Workflow   │ │  Redis | Qdrant | PostgreSQL (async)  │
└─────────────┘ └──────────────────────────────────────┘

Memory Layers

| Layer | Storage | Purpose |
|---|---|---|
| Short-term | Redis | Recent messages, session summary, emotion state |
| Long-term | Qdrant | Semantic vector search over user history |
| Core Memory | PostgreSQL | Stable user facts, personality, preferences |

Feature Flags

All features are toggled via environment variables:

| Flag | Default | Effect |
|---|---|---|
| MEMORY_ENABLED | true | Master switch: if false, pass directly to LLM |
| SHORT_TERM_MEMORY_ENABLED | true | Redis short-term memory |
| LONG_TERM_MEMORY_ENABLED | true | Qdrant vector memory |
| CORE_MEMORY_ENABLED | true | PostgreSQL core memory |
| EMOTION_ENABLED | true | Emotion detection and injection |
| PERSONALITY_ENABLED | true | Personality profile injection |
| LLM_ENABLED | true | Whether to call the LLM at all |
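
As a rough illustration of how such flags could be read, here is a minimal sketch. The flag names and defaults come from the list above; the parsing rules (which strings count as true) and the helper names `read_flag` and `load_flags` are assumptions, not the project's actual settings code.

```python
import os

# Flag names and defaults as documented above.
FLAG_DEFAULTS = {
    "MEMORY_ENABLED": True,
    "SHORT_TERM_MEMORY_ENABLED": True,
    "LONG_TERM_MEMORY_ENABLED": True,
    "CORE_MEMORY_ENABLED": True,
    "EMOTION_ENABLED": True,
    "PERSONALITY_ENABLED": True,
    "LLM_ENABLED": True,
}

# Assumption: these spellings are treated as "true"; anything else is "false".
_TRUE_VALUES = {"1", "true", "yes", "on"}


def read_flag(name, env=None):
    """Return the boolean value of one flag, falling back to its default."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return FLAG_DEFAULTS[name]
    return raw.strip().lower() in _TRUE_VALUES


def load_flags(env=None):
    """Return a dict of all documented flags resolved against the environment."""
    return {name: read_flag(name, env) for name in FLAG_DEFAULTS}
```

Passing a plain dict as `env` makes the helper easy to exercise without touching the real process environment.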

Quick Start

1. Copy environment config

cp .env.example .env
# Edit .env with your OpenAI API key, DB credentials, etc.

2. Install dependencies

pip install -e ".[dev]"
# or with uv:
uv sync

3. Start external services

You need PostgreSQL, Redis, and Qdrant running. Minimal quick-start commands:

# PostgreSQL
docker run -d --name pg -e POSTGRES_PASSWORD=postgres -p 5432:5432 postgres:16

# Redis
docker run -d --name redis -p 6379:6379 redis:7

# Qdrant
docker run -d --name qdrant -p 6333:6333 qdrant/qdrant

4. Run database migrations

# From the project root (where alembic.ini lives):
alembic upgrade head

5. Start the development server

cd src
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

6. Open Swagger UI

Navigate to: http://localhost:8000/docs


API Endpoints

| Method | Path | Description |
|---|---|---|
| POST | /api/v1/prompt/build | Build prompt from all memory layers |
| POST | /api/v1/llm/invoke | Build prompt + call LLM |
| POST | /api/v1/memory/short-term/message | Write message to Redis |
| GET | /api/v1/memory/short-term/{tenant_id}/{user_id}/{session_id} | Get short-term memory |
| POST | /api/v1/memory/short-term/summary | Update session summary |
| POST | /api/v1/memory/core/upsert | Upsert core memory variable |
| GET | /api/v1/memory/core/{tenant_id}/{user_id} | Get all core memory |
| POST | /api/v1/memory/long-term/upsert | Store fact + embedding in Qdrant |
| POST | /api/v1/memory/long-term/search | Semantic search in Qdrant |
| POST | /api/v1/memory/extract | LangMem memory extraction |
| POST | /api/v1/emotion/analyze | Analyse emotion from text |
| GET | /api/v1/emotion/{tenant_id}/{user_id}/{session_id} | Get emotion state |
| POST | /api/v1/emotion/config | Update emotion feature flags |
| GET | /api/v1/personality/{tenant_id}/{user_id} | Get personality profile |
| POST | /api/v1/personality/upsert | Update personality variables |
| POST | /api/v1/personality/config | Toggle personality injection |
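
The GET endpoints carry their identifiers in the URL path. The sketch below shows one way a client might assemble those paths safely; the helper names are hypothetical, and the request and response bodies are not shown because this README does not document them (see Swagger UI for the schemas).

```python
from urllib.parse import quote

API_PREFIX = "/api/v1"


def short_term_path(tenant_id, user_id, session_id):
    """Path for GET /memory/short-term/{tenant_id}/{user_id}/{session_id}."""
    # Percent-encode each segment so "/" or spaces in an ID cannot alter the route.
    parts = [quote(p, safe="") for p in (tenant_id, user_id, session_id)]
    return f"{API_PREFIX}/memory/short-term/" + "/".join(parts)


def core_memory_path(tenant_id, user_id):
    """Path for GET /memory/core/{tenant_id}/{user_id}."""
    parts = [quote(p, safe="") for p in (tenant_id, user_id)]
    return f"{API_PREFIX}/memory/core/" + "/".join(parts)
```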

Project Structure

src/
├── app/
│   ├── main.py              # FastAPI app + lifespan
│   ├── containers.py        # DI container (dependency-injector)
│   ├── constants.py         # Env-loaded constants
│   ├── lifecycle.py         # @Init / @Destroy decorators
│   └── events.py            # Event subscribers
├── config/
│   └── settings.py          # All settings and feature flags
├── api/
│   ├── deps.py              # FastAPI dependency factories
│   ├── memory_api.py        # Memory endpoints
│   ├── prompt_api.py        # Prompt build endpoint
│   ├── llm_api.py           # LLM invoke endpoint
│   ├── emotion_api.py       # Emotion endpoints
│   └── personality_api.py   # Personality endpoints
├── models/                  # SQLAlchemy ORM models
│   ├── core_memory.py
│   ├── personality.py
│   └── emotion.py
├── schemas/                 # Pydantic request/response schemas
│   ├── common.py
│   ├── memory.py
│   ├── emotion.py
│   ├── personality.py
│   └── prompt.py
├── stores/                  # Data access layer
│   ├── redis_store.py       # Async Redis (short-term memory)
│   ├── qdrant_store.py      # Async Qdrant (long-term memory)
│   └── postgres_repo.py     # Async SQLAlchemy repos
├── services/                # Business logic
│   ├── memory_service.py
│   ├── prompt_service.py
│   └── llm_service.py
├── workflows/
│   └── prompt_workflow.py   # LangGraph 7-node workflow
├── agents/
│   └── memory_agent.py      # LangMem extraction agent
├── prompt/
│   └── builder.py           # PromptBuilder (assembles blocks)
├── emotion/
│   ├── analyzer.py          # LLM + rule-based emotion classifiers
│   └── service.py           # Emotion orchestration
├── personality/
│   ├── service.py
│   └── prompt_adapter.py    # Personality → prompt block
├── llm/
│   ├── base.py              # LLMProvider ABC
│   └── openai_provider.py   # OpenAI + Mock implementations
├── embeddings/
│   ├── base.py              # EmbeddingProvider ABC
│   └── openai_embeddings.py # OpenAI + Mock implementations
├── db/
│   ├── base.py              # SQLAlchemy DeclarativeBase
│   └── session.py           # Async session factory
├── migrations/
│   ├── env.py               # Alembic env
│   ├── script.py.mako       # Migration template
│   └── versions/
│       └── 20240101_001_initial_schema.py
└── utils/                   # Existing utilities (unchanged)

Database Migrations

Migrations are managed with Alembic. Tables are created in the schema specified by DB_SCHEMA (default public; set to mem in the current .env).

Prerequisites

Make sure your .env has the correct schema:

DB_SCHEMA=mem        # all tables land here
DB_DATABASE=demo     # target database

Run all migrations (first deploy or after pulling new code)

# From the project root (where alembic.ini lives)
alembic upgrade head

This will:

  1. Create the schema if it does not exist (handled by search_path in asyncpg connect args)
  2. Create alembic_version in the target schema to track state
  3. Apply every pending migration in src/migrations/versions/

Common commands

# Rollback the last applied migration
alembic downgrade -1

# Rollback all the way to a clean slate
alembic downgrade base

# Auto-generate a migration after changing a model
alembic revision --autogenerate -m "add user_preference table"

# Show applied / pending migration history
alembic history --verbose

# Show current DB revision
alembic current

Notes

  • Schema isolation: DB_SCHEMA is read at runtime. Changing it in .env and re-running alembic upgrade head will create tables in the new schema without touching the old one.
  • asyncpg search_path: The env.py passes server_settings={"search_path": DB_SCHEMA} to asyncpg so the schema is active before Alembic opens its transaction. This is why ?options= in the URL is not used.
  • Existing tables: All migration scripts include explicit schema=SCHEMA arguments so they are always idempotent and schema-aware regardless of the connection's default search path.
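
The search_path note can be illustrated with a small helper that builds the connect arguments. This is a sketch only: the function name and the identifier check are assumptions, and the project's real env.py may assemble these arguments differently.

```python
def asyncpg_connect_args(schema):
    """Build connect args that make `schema` the session's search_path.

    asyncpg accepts a `server_settings` mapping at connect time. With
    SQLAlchemy's asyncpg dialect, a dict like this is typically passed as
    create_async_engine(url, connect_args=asyncpg_connect_args("mem")).
    """
    # Assumption: restrict schema names to simple identifiers, since the
    # value comes from configuration and ends up in a server setting.
    if not schema or not schema.replace("_", "").isalnum():
        raise ValueError(f"unexpected schema name: {schema!r}")
    return {"server_settings": {"search_path": schema}}
```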

Docker Build & Run

# Build image
docker build -t mem-backend:latest .

# Run (assumes external services are accessible)
docker run -d \
  --name mem-backend \
  -p 8000:8000 \
  --env-file .env \
  mem-backend:latest

Multi-tenancy

Every API request includes tenant_id, user_id, and session_id.

  • Redis keys: shortmem:{tenant_id}:{user_id}:{session_id}:{suffix}
  • Qdrant points: filtered by tenant_id + user_id in payload
  • PostgreSQL rows: every table has a tenant_id column, and unique indexes include it
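
A minimal sketch of the first two isolation rules follows. The key layout is the one documented above; the function names and the example suffix are hypothetical. The filter dict uses Qdrant's JSON filter shape (`must` conditions with `key` and `match`), but how the project's store actually builds its filter is not shown in this README.

```python
def short_term_key(tenant_id, user_id, session_id, suffix):
    """Redis key in the documented shortmem:{tenant}:{user}:{session}:{suffix} layout."""
    return f"shortmem:{tenant_id}:{user_id}:{session_id}:{suffix}"


def tenant_user_filter(tenant_id, user_id):
    """Qdrant payload filter that restricts a search to one tenant and user."""
    return {
        "must": [
            {"key": "tenant_id", "match": {"value": tenant_id}},
            {"key": "user_id", "match": {"value": user_id}},
        ]
    }


# "messages" is an illustrative suffix, not one confirmed by the project.
example_key = short_term_key("acme", "u42", "s1", "messages")
```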

LangGraph Workflow

The prompt-building pipeline runs as a 7-node LangGraph directed graph:

load_core_memory → load_short_term → search_long_term
→ load_emotion → load_personality → build_prompt → END

Each node can be disabled independently via a feature flag, without modifying the graph.
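
The idea of flag-gated nodes can be sketched without LangGraph itself. The code below is a plain-Python stand-in, not the project's workflow: the node names follow the pipeline above, while the node-to-flag mapping, the `loaders` argument, and `run_pipeline` are assumptions made for illustration.

```python
# Node order from the pipeline above, paired with the flag assumed to gate each node.
NODES = [
    ("load_core_memory", "CORE_MEMORY_ENABLED"),
    ("load_short_term", "SHORT_TERM_MEMORY_ENABLED"),
    ("search_long_term", "LONG_TERM_MEMORY_ENABLED"),
    ("load_emotion", "EMOTION_ENABLED"),
    ("load_personality", "PERSONALITY_ENABLED"),
]


def run_pipeline(flags, loaders):
    """Run each enabled node in order, then assemble the prompt.

    `flags` maps flag names to booleans (missing flags default to True).
    `loaders` maps node names to zero-argument callables returning a text block.
    """
    state = {"blocks": []}
    for node_name, flag in NODES:
        if not flags.get(flag, True):
            continue  # A disabled node contributes nothing; the order is unchanged.
        state["blocks"].append(loaders[node_name]())
    # Stand-in for the build_prompt node: join whatever blocks were collected.
    state["prompt"] = "\n".join(state["blocks"])
    return state
```

Because the node list itself never changes, toggling a flag only changes which blocks reach the final prompt.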


LangMem Integration

POST /api/v1/memory/extract runs LangMem extraction against a conversation. If langmem is not installed, the endpoint falls back to a direct LLM-based extractor.
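
One common way to implement this kind of optional-dependency fallback is to probe for the package before choosing a backend. The sketch below assumes that approach; the function names and backend labels are hypothetical, and the project's memory agent may detect langmem differently.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def extractor_backend(langmem_installed):
    """Pick the extraction backend label based on whether langmem is present."""
    return "langmem" if langmem_installed else "llm_fallback"


# Resolved once at startup in this sketch.
BACKEND = extractor_backend(module_available("langmem"))
```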


Backup Recommendations

| Component | Strategy |
|---|---|
| PostgreSQL | pg_dump daily, point-in-time recovery (WAL archiving) |
| Redis | BGSAVE + RDB snapshots, or AOF for persistence |
| Qdrant | Built-in snapshot API: POST /collections/{name}/snapshots |
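
For the Qdrant entry, the snapshot call is a plain HTTP POST. The sketch below only constructs the request using the standard library. The base URL reuses port 6333 from the Quick Start; the collection name in the usage comment is a placeholder, since this README does not name the collection.

```python
from urllib.parse import quote
from urllib.request import Request


def snapshot_request(base_url, collection):
    """Build a POST request for /collections/{name}/snapshots (not sent here)."""
    name = quote(collection, safe="")
    url = f"{base_url.rstrip('/')}/collections/{name}/snapshots"
    return Request(url, method="POST")


# To trigger a snapshot, pass the request to urllib.request.urlopen, e.g.:
#   urlopen(snapshot_request("http://localhost:6333", "my_collection"))
```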

Environment Variables Reference

See .env.example for the full list with descriptions.
