dream-memory
Offline memory consolidation for AI agents.
Inspired by how brains dream.
Quickstart • How It Works • API • Adapters • CLI • Examples
AI agents accumulate memories. They store facts, preferences, and context. But they never think about what they know. Memories pile up, go stale, contradict each other, and bloat token budgets.
Biological brains solve this by dreaming: consolidating, merging, and pruning memories offline.
dream-memory gives your AI agents the same ability.
┌─────────────────────────────────────────┐
│ DREAM CYCLE │
│ │
Sessions ───────▶│ ① Orient Survey the landscape │
│ ② Gather Extract new knowledge │
Memories ◀──────▶│ ③ Consolidate Merge & deduplicate │
│ ④ Prune Drop what's stale │
│ │
└─────────────────────────────────────────┘
Every agent framework has memory. None of them have memory maintenance.
| Feature | RAG / Vector DBs | dream-memory |
|---|---|---|
| Stores memories | ✓ | ✓ |
| Retrieves relevant context | ✓ | ✓ (via your existing RAG) |
| Merges redundant memories | ✗ | ✓ |
| Resolves contradictions | ✗ | ✓ |
| Prunes stale knowledge | ✗ | ✓ |
| Discovers patterns across sessions | ✗ | ✓ |
| Gets sharper over time | ✗ | ✓ |
dream-memory doesn't replace your memory store. It maintains it.
pip install dream-memory

from dream_memory import DreamEngine, MemoryStore
from dream_memory.llm import OpenAIAdapter
# 1. Point at a directory (or use SQLiteStore for bigger setups)
store = MemoryStore("./agent_memories")
# 2. Pick any LLM
llm = OpenAIAdapter(model="gpt-4o-mini") # or AnthropicAdapter, CallableLLM, etc.
# 3. Create the engine
engine = DreamEngine(store, llm=llm)
# 4. Log sessions as your agent runs
engine.log_session("User asked about Python best practices. We discussed type hints.")
engine.log_session("User set up CI/CD with GitHub Actions. Prefers pytest over unittest.")
engine.log_session("User mentioned they're switching from Flask to FastAPI.")
# 5. Dream
result = engine.dream()
print(f"Memories: {result.memories_before} → {result.memories_after}")
print(f"Added: {result.total_added}, Pruned: {result.total_pruned}")

That's it. The engine surveys what the agent knows, extracts new knowledge from recent sessions, merges duplicates, and prunes stale facts. Your agent wakes up sharper.
Every dream cycle runs four phases, each doing one job:
┌──────────┐ ┌──────────┐ ┌──────────────┐ ┌──────────┐
│ ORIENT │────▶│ GATHER │────▶│ CONSOLIDATE │────▶│ PRUNE │
│ │ │ │ │ │ │ │
│ Survey │ │ Extract │ │ Merge & │ │ Drop │
│ memories │ │ new info │ │ deduplicate │ │ stale │
│ + recent │ │ from │ │ redundant │ │ low-val │
│ sessions │ │ sessions │ │ memories │ │ memories │
└──────────┘ └──────────┘ └──────────────┘ └──────────┘
① Orient — Scans all existing memories and recent sessions. Identifies stale topics, redundant groups, new signals, and contradictions. Produces a health score.
② Gather — Extracts novel facts, preferences, and patterns from sessions that aren't already captured in memory. Each extracted memory gets a topic and priority level.
③ Consolidate — Merges overlapping memories into single, improved versions. Resolves contradictions by preferring newer information. Removes duplicates.
④ Prune — Removes stale, low-value, and ephemeral memories. Enforces total memory budgets. Respects protected topics and core priorities.
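To make the flow concrete, here is a toy sketch of the four phases over plain dicts. The heuristics below (difflib string similarity, keep-the-longer-wording merging, a flat age cutoff) are illustrative stand-ins, not the library's actual logic:

```python
# Toy four-phase dream cycle over in-memory dicts (illustrative only).
from difflib import SequenceMatcher

memories = [
    {"content": "User prefers pytest to unittest", "age_hours": 2},
    {"content": "User prefers pytest over unittest", "age_hours": 300},
    {"content": "User uses Flask", "age_hours": 500},
]
sessions = ["User mentioned switching from Flask to FastAPI."]

def orient(memories, sessions):
    """Survey the landscape: a crude health snapshot."""
    return {"memories": len(memories), "sessions": len(sessions)}

def gather(sessions):
    """Extract new knowledge: here, each session becomes one candidate memory."""
    return [{"content": s, "age_hours": 0} for s in sessions]

def consolidate(memories, threshold=0.85):
    """Merge near-duplicates, keeping the longer (more specific) wording."""
    kept = []
    for mem in memories:
        dup = next(
            (k for k in kept
             if SequenceMatcher(None, k["content"], mem["content"]).ratio() >= threshold),
            None,
        )
        if dup is None:
            kept.append(dict(mem))
        elif len(mem["content"]) > len(dup["content"]):
            dup["content"] = mem["content"]
    return kept

def prune(memories, max_age_hours=168):
    """Drop anything older than the staleness threshold."""
    return [m for m in memories if m["age_hours"] <= max_age_hours]

report = orient(memories, sessions)   # {'memories': 3, 'sessions': 1}
memories = prune(consolidate(memories + gather(sessions)))
# The two pytest memories merge into one, the stale Flask fact is pruned,
# and the FastAPI session survives as a new memory.
```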
Dreams don't run on every call. The engine uses a gating system:
DreamConfig(
min_sessions_between_dreams=3, # Need 3+ new sessions
min_hours_between_dreams=1.0, # At least 1 hour between dreams
)

Call engine.dream(force=True) to bypass the gate.
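The gate itself can be sketched as two checks that must both pass, mirroring the (ok, reason) shape that store.should_dream() returns; the exact rules here are assumptions:

```python
# Sketch of the dream gate: enough new sessions AND enough elapsed time.
from datetime import datetime, timedelta, timezone

def should_dream(unprocessed_sessions, last_dream_at,
                 min_sessions=3, min_hours=1.0, now=None):
    now = now or datetime.now(timezone.utc)
    if len(unprocessed_sessions) < min_sessions:
        return False, f"only {len(unprocessed_sessions)} unprocessed sessions"
    hours_since = (now - last_dream_at) / timedelta(hours=1)
    if hours_since < min_hours:
        return False, f"last dream was {hours_since:.1f}h ago"
    return True, "ready"

now = datetime.now(timezone.utc)
ok, reason = should_dream(["s1", "s2"], now - timedelta(hours=5), now=now)
# → (False, 'only 2 unprocessed sessions')
```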
Every memory has a priority that affects how it's treated:
| Priority | Value | Behavior |
|---|---|---|
| `EPHEMERAL` | 0 | Auto-pruned after 24h |
| `LOW` | 1 | Auto-pruned after 7 days |
| `NORMAL` | 2 | Standard lifecycle |
| `HIGH` | 3 | Resistant to pruning |
| `CORE` | 4 | Never pruned |
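A minimal sketch of how these levels could drive the age-based rules (the enum name matches the MemoryPriority used elsewhere in this README and the values come from the table above, but the decision logic here is an assumption):

```python
# Illustrative priority-driven auto-prune rule.
from enum import IntEnum

class MemoryPriority(IntEnum):
    EPHEMERAL = 0
    LOW = 1
    NORMAL = 2
    HIGH = 3
    CORE = 4

# Auto-prune ages in hours; absent levels have no age-based auto-prune.
AUTO_PRUNE_AFTER_HOURS = {
    MemoryPriority.EPHEMERAL: 24,
    MemoryPriority.LOW: 168,  # 7 days
}

def is_auto_pruned(priority, age_hours):
    if priority is MemoryPriority.CORE:
        return False  # CORE memories are never pruned
    limit = AUTO_PRUNE_AFTER_HOURS.get(priority)
    return limit is not None and age_hours > limit
```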
No LLM? No problem. dream-memory can run prune-only cycles using deterministic rules:
engine = DreamEngine(store, llm=None) # No LLM
result = engine.dream(force=True)  # Prunes by rules alone

The main interface. Orchestrates dream cycles.
engine = DreamEngine(store, llm=llm, config=config)
# Run a dream cycle
result = engine.dream(force=False)
# Log a session
engine.log_session("conversation transcript", summary="optional summary", tag="v1")
# Manually add a memory
engine.add_memory("User prefers dark mode", topic="preferences", priority=3)
# Check status
status = engine.status()
# {'total_memories': 15, 'topics': ['prefs', 'stack'], 'ready_to_dream': True, ...}

Two built-in stores. Both share the same interface.
# Filesystem (default) — simple, zero deps
from dream_memory import MemoryStore
store = MemoryStore("./memories")
# SQLite — better for 100+ memories, atomic ops
from dream_memory import SQLiteStore
store = SQLiteStore("./memories/dream.db")

Store methods:
store.list_memories(topic=None) # List all or filter by topic
store.get_memory(id) # Get one by ID
store.save_memory(memory) # Save/update
store.delete_memory(id) # Delete
store.log_session(session) # Log a session
store.get_unprocessed_sessions() # Sessions since last dream
store.should_dream()             # (bool, reason) gate check

from dream_memory import Memory, MemoryPriority, Session, DreamConfig, DreamResult
# Memory
mem = Memory(content="fact", topic="general", priority=MemoryPriority.HIGH)
# Session
sess = Session(content="transcript", summary="optional")
# Config
config = DreamConfig(
min_sessions_between_dreams=3,
min_hours_between_dreams=1.0,
max_total_memories=500,
staleness_threshold_hours=168,
protected_topics=["identity", "core"],
)

from dream_memory.llm import OpenAIAdapter
# OpenAI
llm = OpenAIAdapter(model="gpt-4o-mini")
# Any OpenAI-compatible API (Ollama, vLLM, LMStudio, LiteLLM)
llm = OpenAIAdapter(
model="llama3",
base_url="http://localhost:11434/v1",
api_key="not-needed",
)

from dream_memory.llm import AnthropicAdapter
llm = AnthropicAdapter(model="claude-sonnet-4-20250514")

from dream_memory.llm import CallableLLM
def my_llm(prompt: str, max_tokens: int = 2000) -> str:
return my_api.generate(prompt, max_tokens=max_tokens)
llm = CallableLLM(my_llm)

from dream_memory.llm.base import LLMAdapter
class MyAdapter(LLMAdapter):
def complete(self, prompt: str, max_tokens: int = 2000) -> str:
        return my_custom_api(prompt, max_tokens)

# Check status
dream-memory --store ./memories status
# List memories
dream-memory --store ./memories memories
dream-memory --store ./memories memories --topic preferences
# List topics
dream-memory --store ./memories topics
# Log a session from stdin
echo "User asked about deployment" | dream-memory --store ./memories log
# Log from a file
dream-memory --store ./memories log --file session.txt
# Trigger a dream cycle
dream-memory --store ./memories dream --llm openai --force
# Deterministic prune (no LLM needed)
dream-memory --store ./memories dream --deterministic --force

DreamConfig(
# Gate: when to dream
min_sessions_between_dreams=3, # Accumulate before dreaming
min_hours_between_dreams=1.0, # Cooldown between dreams
# Consolidation
max_memories_per_topic=50, # Cap per topic
similarity_threshold=0.85, # Dedup threshold
# Pruning
max_total_memories=500, # Hard cap
staleness_threshold_hours=168, # 7 days
prune_ephemeral_after_hours=24, # Auto-prune ephemeral
prune_low_after_hours=168, # Auto-prune low priority
# Budget
max_tokens=8000, # Total token budget per dream
# Protection
protected_topics=["identity"], # Never prune these
)

See examples/ for working code:
- quickstart.py — 10-line setup with mock LLM
- multi_session.py — Simulate multiple days of agent activity
- custom_adapter.py — Connect Ollama, LiteLLM, or any backend
Zero required dependencies. The core library uses only Python stdlib. LLM adapters are optional extras.
Framework agnostic. Works with LangChain, CrewAI, AutoGen, Hermes, your custom agent, or raw scripts. If it has memory, dream-memory can maintain it.
LLM agnostic. OpenAI, Anthropic, Ollama, vLLM, or any function that takes text and returns text. Swap models without changing your agent code.
Deterministic fallback. No LLM? Prune phase still runs using rule-based logic. Your agent's memory stays bounded even without API access.
Composable phases. Import and run individual phases (orient, gather, consolidate, prune) if you want fine-grained control.
MIT