Lethe is a long-running personal AI assistant with a brain-inspired cognitive architecture: cortex, hippocampus, brainstem, and a default-mode network running as cooperating actors. She has continuous memory across sessions, notices things on her own, delegates to focused subagents, and can read her own source — propose changes to it, restart herself with new logic. Lives on your machine as a systemd service. Persists across reboots, models, hardware upgrades.
Written in Rust as a single ~50 MB static binary. Routes LLM traffic through genai. Intentionally has no web console.
# 1. Build (or download a binary from Releases)
cargo build --release
install -m 755 target/release/lethe ~/.local/bin/lethe
# 2. Set up — interactive prompts for provider, model, API key, workspace
lethe init
# 3. Chat
lethe chat -m "hello"

lethe init writes ~/.lethe/config/.env, seeds the workspace and core memory blocks, and runs a smoke test against the LLM and embedding pipeline before declaring success. If you'd rather configure by hand, copy .env.example and edit. The first turn that uses recall/notes triggers a one-time ~150 MB download of the embedding runtime and model (progress is shown).
To sign in (or re-auth) a single provider without re-running the full wizard, use lethe login:
lethe login openai # asks: ChatGPT Plus/Pro subscription (default) or API key
lethe login anthropic # asks: Claude Pro/Max subscription (default) or API key
lethe login openrouter # API key only

Each command writes credentials to ~/.lethe/credentials/ (subscription) or sets the API key in ~/.lethe/config/.env, flips LLM_PROVIDER, and prompts for LLM_MODEL / LLM_MODEL_AUX (defaults come from the curated catalog; accept with Enter, or type any other model id).
Sanity-check an existing setup any time with lethe check — it pings the model and exercises the embedding pipeline rather than just printing config.
                  Telegram / HTTP API
                          |
                          v
               Cortex: user-facing agent
        memory, tools, delegation, final replies
                          |
         +----------------+----------------+
         |                |                |
         v                v                v
   Hippocampus       Actor System     Notification Pipeline
   recall over       subagents,       scoring, gating,
   notes/archive/    registry, and    proactive
   conversations     event bus        transport output
         |                |
         v                +----------------+
   Memory Stack                            |
   markdown blocks,                        v
   notes, SQLite-vec index,          DMN + heartbeat
   message history                   background thought
         |
         v
   Tool Registry
   files, shell/PTY, browser, web, Telegram/API transport

Core runtime pieces:
| Area | Rust modules | Responsibility |
|---|---|---|
| Agent/cortex | src/agent.rs | Prompt assembly, LLM calls, tool loop, and actor turn execution. |
| LLM routing | src/llm/ | genai client, OAuth (ChatGPT Plus/Pro, Claude Pro/Max) and API-key auth, OpenRouter prompt-cache forwarding via vendored genai patch, model metadata. |
| Memory | src/memory/ | Markdown memory blocks, SQLite-vec recall tables (memory, message_history, plus their *_vec virtual siblings), SQLite todos. |
| Recall | src/hippocampus.rs | Hybrid lexical/vector recall over notes, archival memories, and conversation history. |
| Actors | src/actor.rs, src/background.rs | Resident Kameo actors, supervisor-owned state, mailbox/event routing, autonomous subagent wakeups, persistent DMN. |
| Notifications | src/notification.rs, src/heartbeat.rs, src/runtime.rs | Background candidate gating and proactive output limits. |
| Transports | src/telegram.rs, src/api.rs, src/conversation.rs | Telegram polling, HTTP/SSE API, debounce/cancel handling. |
| Tools | src/tools/ | Filesystem, shell, PTY terminal, browser, image, web, memory, notes, todos, actors, transport tools. |

git clone https://github.com/atemerev/lethe.git
cd lethe
cp .env.example .env
cargo build --release
target/release/lethe check

Native installer:
curl -fsSL https://lethe.gg/install | bash
~/.lethe/bin/lethe check

The installer downloads the latest GitHub binary release for the current platform when available. If no release asset matches, it falls back to a local Cargo build. Force source builds with LETHE_INSTALL_FROM_SOURCE=1.
Run tests:
cargo test
cargo build --release

Browser automation uses the external agent-browser CLI when browser tools are called.
CLI check is the default when LETHE_MODE is unset:
target/release/lethe check

Recommended deployment: a single lethe api process hosts the HTTP/SSE
transport and the Telegram poller (when TELEGRAM_BOT_TOKEN is set)
in the same address space, sharing one Agent, one actor registry, and
one Brainstem (the sole source of heartbeats / proactive emissions —
transports just subscribe and forward).
LETHE_API_TOKEN=change-me target/release/lethe api
# or override the port:
target/release/lethe api --port 1373

API mode binds to LETHE_API_HOST (127.0.0.1 by default) on
LETHE_API_PORT (1373). Use a reverse proxy for remote access.
The standalone subcommands still work when you want a single transport:
target/release/lethe telegram run # Telegram poller only
target/release/lethe tui # connect a TUI to a running api

target/release/lethe tui # local API
target/release/lethe tui --url http://host:1373 --token $LETHE_API_TOKEN

The TUI provides inline tool cards, an actors/todos sidebar, streaming assistant text (Anthropic + OpenAI OAuth providers), @-prefix workspace path autocomplete, and slash commands (/help, /clear, /cancel, /todos, /actors, /model, /quit).

Lethe routes chat through genai. The runtime supports both API-key and subscription-OAuth auth, plus OpenAI-compatible local servers:
| Provider | Auth | Example LLM_MODEL |
|---|---|---|
| Anthropic (API key) | ANTHROPIC_API_KEY | claude-opus-4-7 |
| Anthropic (Claude Pro/Max) | lethe login anthropic → token file | claude-opus-4-7 |
| OpenAI (API key) | OPENAI_API_KEY | gpt-5.5 |
| OpenAI (ChatGPT Plus/Pro) | lethe login openai → token file | gpt-5.5 |
| OpenRouter | OPENROUTER_API_KEY | openrouter/moonshotai/kimi-k2.6 |
| Local OpenAI-compatible | LLM_API_BASE + OPENAI_API_KEY=local | openai/gemma-4-31B-it-Q8_0.gguf |

LLM_PROVIDER is optional but useful when a model id does not carry a provider prefix — for example LLM_PROVIDER=openrouter with LLM_MODEL=moonshotai/kimi-k2.6. Subscription auth also requires LLM_PROVIDER=openai or LLM_PROVIDER=anthropic so the router picks the OAuth path instead of looking for an API key (the lethe login commands set this for you).
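
To make the prefix/hint interplay concrete, here is a minimal sketch of how a provider could be derived from LLM_MODEL and LLM_PROVIDER. It is an illustration only, not Lethe's routing code: the function name `resolve_provider`, the `KNOWN_PROVIDERS` tuple, and the choice to strip a matching prefix are all assumptions.

```python
# Hypothetical illustration; not taken from Lethe's source.
KNOWN_PROVIDERS = ("openrouter", "openai", "anthropic")


def resolve_provider(model, provider_hint=None):
    """Return (provider, model_id) for an LLM_MODEL / LLM_PROVIDER pair.

    A provider of None means "no explicit signal"; the router would have to
    infer the provider from the bare model id on its own.
    """
    # An explicit LLM_PROVIDER wins (assumption: "auto" means "no hint").
    if provider_hint and provider_hint != "auto":
        prefix = provider_hint + "/"
        model_id = model[len(prefix):] if model.startswith(prefix) else model
        return provider_hint, model_id

    # Otherwise, look for a recognised provider prefix on the model id.
    head, sep, rest = model.partition("/")
    if sep and head in KNOWN_PROVIDERS:
        return head, rest

    return None, model
```

Under these assumptions, `resolve_provider("moonshotai/kimi-k2.6", "openrouter")` and `resolve_provider("openrouter/moonshotai/kimi-k2.6")` describe the same target, which is why the hint only matters when the prefix is missing.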
LLM_MODEL_AUX defaults to the main model and is used for lightweight/background calls.
lethe login openai runs a device-code flow against auth.openai.com; tokens land in ~/.lethe/credentials/openai_oauth_tokens.json. Calls then go to the Codex Responses API at chatgpt.com/backend-api/codex/responses using your ChatGPT Plus/Pro session — no OPENAI_API_KEY needed. Override the token file with LETHE_OPENAI_OAUTH_TOKENS or supply a raw token via OPENAI_AUTH_TOKEN.
lethe login anthropic runs a PKCE browser flow against claude.ai/oauth/authorize; tokens land in ~/.lethe/credentials/anthropic_oauth_tokens.json. Override with LETHE_ANTHROPIC_OAUTH_TOKENS or ANTHROPIC_AUTH_TOKEN.
Lethe stamps cache breakpoints on the system prompt (1h-TTL persistent prefix + 5min-TTL ephemeral tail) and forwards them through to:
- Anthropic direct and Anthropic OAuth — cache_control is emitted on system blocks.
- OpenRouter — cache_control is emitted on system content parts, which OpenRouter forwards to upstream providers that support explicit caching (Anthropic, Qwen, Gemini explicit). Providers with automatic prefix caching (OpenAI, Grok, Moonshot/Kimi, Groq, DeepSeek, Gemini implicit) ignore the field but benefit from the stable structured shape.
Both genai's native OpenAI adapter and our vendored fork now carry the patch — see vendor/genai/LETHE_FORK.md for the patch surface.
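
As a rough picture of what the two breakpoints look like on the wire, the sketch below builds Anthropic-style system blocks with a 1h-TTL prefix and a 5min-TTL tail. The `cache_control` shape follows Anthropic's Messages API; the function name `build_system_blocks` and the two-way split of the prompt are assumptions for illustration, not Lethe's actual prompt assembly.

```python
def build_system_blocks(persistent_prefix, ephemeral_tail):
    """Return system content blocks with one cache breakpoint per segment.

    The stable prefix gets the long TTL so it survives idle periods; the
    frequently changing tail gets the short TTL.
    """
    return [
        {
            "type": "text",
            "text": persistent_prefix,
            "cache_control": {"type": "ephemeral", "ttl": "1h"},
        },
        {
            "type": "text",
            "text": ephemeral_tail,
            "cache_control": {"type": "ephemeral", "ttl": "5m"},
        },
    ]


# Placeholder strings; a real prompt would carry identity, memory blocks, etc.
blocks = build_system_blocks("stable persona and tool docs", "recent recall results")
```

The longer-lived block comes first, so a change in the tail invalidates only the short-TTL segment while the prefix stays cached.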
Configuration is read from process environment, a local .env, and $LETHE_HOME/config/.env.
| Variable | Description | Default |
|---|---|---|
| LETHE_MODE | cli, telegram, or api | cli |
| LETHE_HOME | Runtime root | ~/.lethe |
| WORKSPACE_DIR | Workspace directory | $LETHE_HOME/workspace |
| MEMORY_DIR | Memory data directory | $LETHE_HOME/data/memory |
| DB_PATH | SQLite todo database path | $LETHE_HOME/data/lethe.db |
| LOGS_DIR | Runtime log directory | $LETHE_HOME/logs |
| TELEGRAM_BOT_TOKEN | Bot token from BotFather | required for Telegram |
| TELEGRAM_ALLOWED_USER_IDS | Comma-separated allowlist | all users |
| TELEGRAM_TRANSCRIPTION_ENABLED | Transcribe Telegram audio/voice | true |
| LETHE_API_TOKEN | Bearer or x-lethe-token auth for API mode | required for API |
| LETHE_API_HOST | API bind address | 127.0.0.1 |
| LETHE_API_PORT | API port | 1373 |
| LLM_PROVIDER | Optional provider hint | auto |
| LLM_MODEL | Main model | required for chat |
| LLM_MODEL_AUX | Auxiliary model | main model |
| LLM_API_BASE | Custom OpenAI-compatible base URL | unset |
| LLM_CONTEXT_LIMIT | Context size hint | 100000 |
| OPENROUTER_API_KEY | OpenRouter key | unset |
| ANTHROPIC_API_KEY | Anthropic key | unset |
| ANTHROPIC_AUTH_TOKEN | Optional Anthropic OAuth access token (raw) | unset |
| LETHE_ANTHROPIC_OAUTH_TOKENS | Optional Anthropic OAuth token file | $CREDENTIALS_DIR/anthropic_oauth_tokens.json |
| OPENAI_API_KEY | OpenAI/local-compatible key | unset |
| OPENAI_AUTH_TOKEN | Optional OpenAI OAuth access token (raw) | unset |
| LETHE_OPENAI_OAUTH_TOKENS | Optional OpenAI OAuth token file | $CREDENTIALS_DIR/openai_oauth_tokens.json |
| EXA_API_KEY | Exa search/fetch tools | unset |
| LETHE_SEMANTIC_SEARCH_ENABLED | Enable vector recall (fallback is keyword search) | true |
| LETHE_EMBEDDING_PROVIDER | fastembed or hash | fastembed |
| LETHE_EMBEDDING_MODEL | FastEmbed model id | Snowflake/snowflake-arctic-embed-m-v2.0 |
| ACTORS_ENABLED | Enable actor/subagent system | true |
| HIPPOCAMPUS_ENABLED | Enable associative recall | true |
| CURATOR_ENABLED | Enable memory curator | true |
| HEARTBEAT_ENABLED | Enable proactive heartbeat loop | true |
| HEARTBEAT_INTERVAL | Heartbeat interval seconds | 3600 |
| PROACTIVE_MAX_PER_DAY | Proactive message daily limit | 4 |
| PROACTIVE_COOLDOWN_MINUTES | Minimum spacing for proactive messages | 60 |
| TRANSCRIPTION_PROVIDER | auto, openrouter, openai, or local | auto |
| TRANSCRIPTION_MODEL | STT model override | provider default |
| TRANSCRIPTION_LANGUAGE | Optional language hint | auto |
| TRANSCRIPTION_LOCAL_COMMAND | Local Whisper command | whisper |

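
The three configuration sources can be pictured as a layered lookup. The sketch below is an illustration under a stated assumption: the text lists the sources but not their precedence, so the order used here (process environment over local .env over $LETHE_HOME/config/.env) is a guess, and `parse_env_file` / `load_config` are hypothetical helper names.

```python
import os
from pathlib import Path


def parse_env_file(path):
    """Parse KEY=VALUE lines from a .env file; a missing file yields {}."""
    values = {}
    p = Path(path)
    if not p.is_file():
        return values
    for line in p.read_text().splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_config(environ=None, local_env=".env", lethe_home=None):
    """Merge the three sources; later updates override earlier ones."""
    environ = dict(os.environ if environ is None else environ)
    home = Path(lethe_home or environ.get("LETHE_HOME") or Path.home() / ".lethe")
    merged = parse_env_file(home / "config" / ".env")  # assumed lowest priority
    merged.update(parse_env_file(local_env))           # local .env overrides it
    merged.update(environ)                             # process env wins (assumption)
    return merged
```

With this ordering, a one-off `LLM_MODEL=... lethe chat` on the command line would override both files without editing either.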
Lethe stores runtime state under the workspace and data directories:
- workspace/memory/identity.md -- persona and identity, user-editable.
- workspace/memory/human.md -- facts about the user.
- workspace/memory/project.md -- current project/context.
- workspace/notes/ -- tagged markdown notes.
- $MEMORY_DIR/lethe-memory.db -- SQLite-vec database with memory (archival + notes, with note-<uuid> and mem-<uuid> ids), message_history, and their *_vec virtual siblings for embedding search.
- SQLite database at $DB_PATH -- todos.

Core memory block defaults and prompt templates are embedded into the binary, so lethe check and first startup work without copying prompt files into the workspace.
Upgrading from a pre-0.19 install? See MIGRATION.md for the one-shot lethe-migrate workflow that moves legacy LanceDB data into the new layout.
Pack the workspace, agent state (memory + history), and .env into a single tar.gz archive:
lethe backup # ./lethe-backup-YYYYMMDD-HHMMSS.tar.gz
lethe backup --output ~/backups/lethe.tgz

The archive is written with 0600 permissions because it contains the .env secrets; keep it private.
Restore an archive into the current $LETHE_HOME:
lethe restore lethe-backup-20260525-160522.tar.gz
lethe restore archive.tgz --yes # skip prompts (for scripts / non-TTY)

Restore prompts before overwriting an existing workspace and again before overwriting an existing .env; declining either keeps the local copy intact. Memory and history are restored unconditionally (that is the point of restoring).
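
For readers who script their own backups, the owner-only permission handling described for lethe backup can be approximated as follows. This is a sketch of the general technique (create the file with mode 0600 before any secret bytes are written), not the code behind the command; `write_private_backup` is a hypothetical name and the archive layout is simplified.

```python
import os
import tarfile
import time


def write_private_backup(sources, output=None):
    """Pack existing paths into a tar.gz that only the owner can read."""
    output = output or time.strftime("lethe-backup-%Y%m%d-%H%M%S.tar.gz")
    # Create the file with 0600 up front so secrets are never world-readable.
    fd = os.open(output, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    # If the file already existed, os.open keeps its old mode; tighten it now.
    os.chmod(output, 0o600)
    with os.fdopen(fd, "wb") as raw, tarfile.open(fileobj=raw, mode="w:gz") as tar:
        for src in sources:
            if os.path.exists(src):  # silently skip paths that are absent
                tar.add(src, arcname=os.path.basename(os.path.normpath(src)))
    return output
```

Setting the mode before writing matters more than fixing it afterwards: a chmod after the fact leaves a window in which the .env contents are readable under the default umask.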
Lethe writes structured runtime logs to $LOGS_DIR/lethe.log and mirrors them to stderr. The default level is info; override it with RUST_LOG, for example:
RUST_LOG=debug scripts/lethe-telegram-foreground
tail -f ~/.lethe/logs/lethe.log

Telegram turns, LLM responses, tool calls, tool results, heartbeat failures, and background actor update relay failures are logged for post-mortem debugging.
Full LLM request/response dumps are opt-in because they contain prompts, memory, tool schemas, tool results, and attachments:
LLM_DEBUG=true scripts/lethe-telegram-foreground
ls ~/.lethe/logs/llm/

Override the dump directory with LLM_DEBUG_DIR.
All API routes require Authorization: Bearer <LETHE_API_TOKEN> or x-lethe-token.
| Route | Method | Purpose |
|---|---|---|
| /health | GET | Readiness check. |
| /chat | POST | Send a user message and receive SSE response events. |
| /events | GET | Subscribe to brainstem + actor SSE events. |
| /cancel | POST | Cancel active work for a chat. |
| /configure | POST | Store user metadata in memory. |
| /model | GET/POST | Inspect or update main/aux model ids. |
| /file?path=... | GET | Serve a workspace file. |
| /actors | GET | Snapshot of active and recently terminated actors. |
| /todos | GET | List todos (filters: status, priority, include_completed, limit). |
| /session/history | GET | Last N persisted messages (limit). |

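
A small client-side sketch may help when wiring up the read-only routes. It builds, but does not send, authenticated GET requests using either header form described above. The helper name `lethe_request`, the token value, and the filter value `"pending"` are placeholders rather than documented values; only the route paths, header names, and filter names come from the tables in this document.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Default LETHE_API_HOST / LETHE_API_PORT from the configuration table.
BASE_URL = "http://127.0.0.1:1373"


def lethe_request(path, token, params=None, use_bearer=True):
    """Build (but do not send) an authenticated GET request."""
    url = BASE_URL + path
    if params:
        url += "?" + urlencode(params)
    if use_bearer:
        headers = {"Authorization": f"Bearer {token}"}
    else:
        headers = {"x-lethe-token": token}
    return Request(url, headers=headers, method="GET")


# "pending" is a placeholder status value, not a documented one.
todos = lethe_request("/todos", "change-me", {"status": "pending", "limit": 10})
history = lethe_request("/session/history", "change-me", {"limit": 20}, use_bearer=False)
```

Passing either object to `urllib.request.urlopen` would perform the call against a running `lethe api` process; the response schema is not described here, so treat the body as opaque JSON until you have inspected it.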
SSE event vocabulary on /chat and /events:
| Event | Payload | Meaning |
|---|---|---|
| turn.start | {chat_id} | A new agent turn has begun. |
| assistant.delta | {content} | Streamed assistant token chunk (Anthropic + OpenAI OAuth). |
| text | {content, parse_mode, message_id} | Complete (sub-)message; submessage boundaries follow the --- rule from interfaces/telegram/formatting.rs. |
| tool.start | {call_id, name, args_preview} | Tool execution started. |
| tool.end | {call_id, name, success, output_preview, duration_ms} | Tool execution finished. |
| actor.spawned / actor.state / actor.task / actor.message | {actor_id, payload} | Actor lifecycle events fanned out from ActorEventBus. |
| usage | {prompt_tokens} | Updated context window usage. |
| typing_start / typing_stop | {} | Compatibility hints for chat clients. |
| done | {} | Turn complete; safe to close the stream. |

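
A client consuming this vocabulary needs little more than a line-based SSE parser. The sketch below assumes standard SSE framing, with the event name in an `event:` field and a JSON object in `data:`; that framing is an assumption, since the table documents event names and payload keys but not the wire layout. `parse_sse`, `collect_reply`, and `SAMPLE` are illustrative names.

```python
import json


def parse_sse(lines):
    """Yield (event, payload) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):
            continue  # SSE comment / keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip())
    # An event without a terminating blank line is incomplete and is dropped.


def collect_reply(lines):
    """Concatenate assistant.delta chunks until the done event arrives."""
    chunks = []
    for event, payload in parse_sse(lines):
        if event == "assistant.delta":
            chunks.append(payload["content"])
        elif event == "done":
            break
    return "".join(chunks)


# Hand-written sample stream; not captured from a real server.
SAMPLE = [
    "event: turn.start", 'data: {"chat_id": "demo"}', "",
    "event: assistant.delta", 'data: {"content": "Hel"}', "",
    "event: assistant.delta", 'data: {"content": "lo"}', "",
    "event: done", "data: {}", "",
]
```

Clients on providers without streaming would see no assistant.delta events and should read the complete text events instead; stopping at done mirrors the "safe to close the stream" note in the table.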
Start an OpenAI-compatible server:
./build/bin/llama-server \
--model /path/to/gemma-4-31B-it-Q8_0.gguf \
--host 0.0.0.0 --port 8090 \
--ctx-size 98304 \
--jinja

Configure Lethe:
LLM_PROVIDER=openai
LLM_MODEL=openai/gemma-4-31B-it-Q8_0.gguf
LLM_API_BASE=http://localhost:8090/v1
OPENAI_API_KEY=local
LLM_CONTEXT_LIMIT=96000

cargo fmt --check
cargo test
cargo build --release

Build a local release archive:
cargo build --release
scripts/package-release
ls dist/

Tagged pushes (v*) build GitHub release assets on a four-runner matrix (linux-x86_64, linux-aarch64, macos-x86_64, macos-aarch64), each producing one lethe-<target>.tar.gz plus a sibling lethe-migrate-<target>.tar.gz; install.sh and update.sh consume the lethe-* assets from the latest release. Linux gnu binaries are built on ubuntu-22.04(-arm) for a glibc 2.35 floor; macOS binaries link only against system frameworks.
Useful smoke checks:
target/release/lethe check
target/release/lethe telegram split "hello from lethe"

MIT