From a2b7847299856772e78176eb8cc7a8bc70b16946 Mon Sep 17 00:00:00 2001
From: Viperdk2020 <69140170+Viperdk2020@users.noreply.github.com>
Date: Mon, 1 Sep 2025 05:54:14 +0200
Subject: [PATCH] docs: document memory backends

---
 README.md               |   6 ++-
 docs/config.md          | 111 +++++++++++++++++++++-------------------
 docs/memory-backends.md |  36 +++++++++++++
 3 files changed, 98 insertions(+), 55 deletions(-)
 create mode 100644 docs/memory-backends.md
diff --git a/README.md b/README.md
index 0c4b64538e2..f6b33da20bf 100644
--- a/README.md
+++ b/README.md
@@ -62,7 +62,6 @@ You can also use Codex with an API key, but this requires [additional setup](./d
 
 Codex CLI supports [MCP servers](./docs/advanced.md#model-context-protocol-mcp). Enable by adding an `mcp_servers` section to your `~/.codex/config.toml`.
 
-
 ### Configuration
 
 Codex CLI supports a rich set of configuration options, with preferences stored in `~/.codex/config.toml`. For full configuration options, see [Configuration](./docs/config.md).
@@ -71,9 +70,12 @@ Codex CLI supports a rich set of configuration options, with preferences stored
 
 Codex keeps a lightweight, per‑repository memory of key actions to help you recall decisions and changes in each project. Entries are written locally to `<repo>/.codex/memory/memory.jsonl` after tool use (shell exec, MCP tool calls) and patch application. The file is plain JSONL so you can search, back up, or clear it easily. See [Per‑repo memory](./docs/config.md#per-repo-memory-local) for details.
 
+For backend choices (JSONL vs SQLite) and maintenance commands, see [memory backends](./docs/memory-backends.md).
+
 Toggle per run (TUI or exec): add `--memory off` to disable, or `--memory on` to force‑enable. You can also set `CODEX_PER_REPO_MEMORY=0|1`.
 
 Durable memory and smarter preamble
+
 - Durable items: In addition to action logs (exec, tool, change), Codex records durable items you and Codex can reuse across turns:
   - `pref`: your preferences and decisions (e.g., “Prefer Ruff + Black”).
   - `summary`: short outcomes/facts captured on task completion.
@@ -81,6 +83,7 @@ Durable memory and smarter preamble
 - Smarter preamble: On new sessions, Codex injects a short “project memory” preamble built from recent `pref` and `summary` items. It deduplicates, merges tags, caps counts, and enforces a length limit so your prompt stays concise.
 
 TUI: memory slash commands
+
 - Add preference: `/memory add Use Ruff + Black`
 - List recent durable: `/memory list 10`
 - Search durable: `/memory search ruff`
@@ -89,6 +92,7 @@ TUI: memory slash commands
 - Help: `/memory help`
 
 What gets logged automatically (TUI)
+
 - On task complete: a `summary` durable item with a compact preview of the last assistant message (kept brief).
 - On approval request: a `decision` durable item noting the request (exec/patch).
 
diff --git a/docs/config.md b/docs/config.md
index edb00e51417..0bd34c85db9 100644
--- a/docs/config.md
+++ b/docs/config.md
@@ -1,6 +1,5 @@
 # Config
 
-
 Codex supports several mechanisms for setting config values:
 
 - Config-specific command-line flags, such as `--model o3` (highest precedence).
@@ -388,10 +387,10 @@ set = { CI = "1" }
 include_only = ["PATH", "HOME"]
 ```
 
-| Field                     | Type                       | Default | Description                                                                                                                                     |
-| ------------------------- | -------------------------- | ------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
-| `inherit`                 | string                     | `all`   | Starting template for the environment:<br>`all` (clone full parent env), `core` (`HOME`, `PATH`, `USER`, …), or `none` (start empty).           |
-| `ignore_default_excludes` | boolean                    | `false` | When `false`, Codex removes any var whose **name** contains `KEY`, `SECRET`, or `TOKEN` (case-insensitive) before other rules run.              |
+| Field                     | Type                 | Default | Description                                                                                                                                     |
+| ------------------------- | -------------------- | ------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
+| `inherit`                 | string               | `all`   | Starting template for the environment:<br>`all` (clone full parent env), `core` (`HOME`, `PATH`, `USER`, …), or `none` (start empty).           |
+| `ignore_default_excludes` | boolean              | `false` | When `false`, Codex removes any var whose **name** contains `KEY`, `SECRET`, or `TOKEN` (case-insensitive) before other rules run.              |
 | `exclude`                 | array<string>        | `[]`    | Case-insensitive glob patterns to drop after the default filter.<br>Examples: `"AWS_*"`, `"AZURE_*"`.                                           |
 | `set`                     | table<string,string> | `{}`    | Explicit key/value overrides or additions – always win over inherited values.                                                                   |
 | `include_only`            | array<string>        | `[]`    | If non-empty, a whitelist of patterns; only variables that match _one_ pattern survive the final step. (Generally used with `inherit = "all"`.) |
@@ -566,60 +565,62 @@ Options that are specific to the TUI.
 
 ## Config reference
 
-| Key | Type / Values | Notes |
-| --- | --- | --- |
-| `model` | string | Model to use (e.g., `gpt-5`). |
-| `model_provider` | string | Provider id from `model_providers` (default: `openai`). |
-| `model_context_window` | number | Context window tokens. |
-| `model_max_output_tokens` | number | Max output tokens. |
-| `approval_policy` | `untrusted` | `on-failure` | `on-request` | `never` | When to prompt for approval. |
-| `sandbox_mode` | `read-only` | `workspace-write` | `danger-full-access` | OS sandbox policy. |
-| `sandbox_workspace_write.writable_roots` | array<string> | Extra writable roots in workspace‑write. |
-| `sandbox_workspace_write.network_access` | boolean | Allow network in workspace‑write (default: false). |
-| `sandbox_workspace_write.exclude_tmpdir_env_var` | boolean | Exclude `$TMPDIR` from writable roots (default: false). |
-| `sandbox_workspace_write.exclude_slash_tmp` | boolean | Exclude `/tmp` from writable roots (default: false). |
-| `disable_response_storage` | boolean | Required for ZDR orgs. |
-| `notify` | array<string> | External program for notifications. |
-| `instructions` | string | Currently ignored; use `experimental_instructions_file` or `AGENTS.md`. |
-| `mcp_servers.<id>.command` | string | MCP server launcher command. |
-| `mcp_servers.<id>.args` | array<string> | MCP server args. |
-| `mcp_servers.<id>.env` | map<string,string> | MCP server env vars. |
-| `model_providers.<id>.name` | string | Display name. |
-| `model_providers.<id>.base_url` | string | API base URL. |
-| `model_providers.<id>.env_key` | string | Env var for API key. |
-| `model_providers.<id>.wire_api` | `chat` | `responses` | Protocol used (default: `chat`). |
-| `model_providers.<id>.query_params` | map<string,string> | Extra query params (e.g., Azure `api-version`). |
-| `model_providers.<id>.http_headers` | map<string,string> | Additional static headers. |
-| `model_providers.<id>.env_http_headers` | map<string,string> | Headers sourced from env vars. |
-| `model_providers.<id>.request_max_retries` | number | Per‑provider HTTP retry count (default: 4). |
-| `model_providers.<id>.stream_max_retries` | number | SSE stream retry count (default: 5). |
-| `model_providers.<id>.stream_idle_timeout_ms` | number | SSE idle timeout (ms) (default: 300000). |
-| `project_doc_max_bytes` | number | Max bytes to read from `AGENTS.md`. |
-| `profile` | string | Active profile name. |
-| `profiles.<name>.*` | various | Profile‑scoped overrides of the same keys. |
-| `history.persistence` | `save-all` | `none` | History file persistence (default: `save-all`). |
-| `history.max_bytes` | number | Currently ignored (not enforced). |
-| `file_opener` | `vscode` | `vscode-insiders` | `windsurf` | `cursor` | `none` | URI scheme for clickable citations (default: `vscode`). |
-| `tui` | table | TUI‑specific options (reserved). |
-| `hide_agent_reasoning` | boolean | Hide model reasoning events. |
-| `show_raw_agent_reasoning` | boolean | Show raw reasoning (when available). |
-| `model_reasoning_effort` | `minimal` | `low` | `medium` | `high` | Responses API reasoning effort. |
-| `model_reasoning_summary` | `auto` | `concise` | `detailed` | `none` | Reasoning summaries. |
-| `model_verbosity` | `low` | `medium` | `high` | GPT‑5 text verbosity (Responses API). |
-| `model_supports_reasoning_summaries` | boolean | Force‑enable reasoning summaries. |
-| `chatgpt_base_url` | string | Base URL for ChatGPT auth flow. |
-| `experimental_resume` | string (path) | Resume JSONL path (internal/experimental). |
-| `experimental_instructions_file` | string (path) | Replace built‑in instructions (experimental). |
-| `experimental_use_exec_command_tool` | boolean | Use experimental exec command tool. |
-| `responses_originator_header_internal_override` | string | Override `originator` header value. |
-| `projects.<path>.trust_level` | string | Mark project/worktree as trusted (only `"trusted"` is recognized). |
-| `preferred_auth_method` | `chatgpt` | `apikey` | Select default auth method (default: `chatgpt`). |
-| `tools.web_search` | boolean | Enable web search tool (alias: `web_search_request`) (default: false). |
+| Key                                              | Type / Values                                                     | Notes                                                                   |
+| ------------------------------------------------ | ----------------------------------------------------------------- | ----------------------------------------------------------------------- |
+| `model`                                          | string                                                            | Model to use (e.g., `gpt-5`).                                           |
+| `model_provider`                                 | string                                                            | Provider id from `model_providers` (default: `openai`).                 |
+| `model_context_window`                           | number                                                            | Context window tokens.                                                  |
+| `model_max_output_tokens`                        | number                                                            | Max output tokens.                                                      |
+| `approval_policy`                                | `untrusted` \| `on-failure` \| `on-request` \| `never`            | When to prompt for approval.                                            |
+| `sandbox_mode`                                   | `read-only` \| `workspace-write` \| `danger-full-access`          | OS sandbox policy.                                                      |
+| `sandbox_workspace_write.writable_roots`         | array<string>                                                     | Extra writable roots in workspace‑write.                                |
+| `sandbox_workspace_write.network_access`         | boolean                                                           | Allow network in workspace‑write (default: false).                      |
+| `sandbox_workspace_write.exclude_tmpdir_env_var` | boolean                                                           | Exclude `$TMPDIR` from writable roots (default: false).                 |
+| `sandbox_workspace_write.exclude_slash_tmp`      | boolean                                                           | Exclude `/tmp` from writable roots (default: false).                    |
+| `disable_response_storage`                       | boolean                                                           | Required for ZDR orgs.                                                  |
+| `notify`                                         | array<string>                                                     | External program for notifications.                                     |
+| `instructions`                                   | string                                                            | Currently ignored; use `experimental_instructions_file` or `AGENTS.md`. |
+| `mcp_servers.<id>.command`                       | string                                                            | MCP server launcher command.                                            |
+| `mcp_servers.<id>.args`                          | array<string>                                                     | MCP server args.                                                        |
+| `mcp_servers.<id>.env`                           | map<string,string>                                                | MCP server env vars.                                                    |
+| `model_providers.<id>.name`                      | string                                                            | Display name.                                                           |
+| `model_providers.<id>.base_url`                  | string                                                            | API base URL.                                                           |
+| `model_providers.<id>.env_key`                   | string                                                            | Env var for API key.                                                    |
+| `model_providers.<id>.wire_api`                  | `chat` \| `responses`                                             | Protocol used (default: `chat`).                                        |
+| `model_providers.<id>.query_params`              | map<string,string>                                                | Extra query params (e.g., Azure `api-version`).                         |
+| `model_providers.<id>.http_headers`              | map<string,string>                                                | Additional static headers.                                              |
+| `model_providers.<id>.env_http_headers`          | map<string,string>                                                | Headers sourced from env vars.                                          |
+| `model_providers.<id>.request_max_retries`       | number                                                            | Per‑provider HTTP retry count (default: 4).                             |
+| `model_providers.<id>.stream_max_retries`        | number                                                            | SSE stream retry count (default: 5).                                    |
+| `model_providers.<id>.stream_idle_timeout_ms`    | number                                                            | SSE idle timeout (ms) (default: 300000).                                |
+| `project_doc_max_bytes`                          | number                                                            | Max bytes to read from `AGENTS.md`.                                     |
+| `profile`                                        | string                                                            | Active profile name.                                                    |
+| `profiles.<name>.*`                              | various                                                           | Profile‑scoped overrides of the same keys.                              |
+| `history.persistence`                            | `save-all` \| `none`                                              | History file persistence (default: `save-all`).                         |
+| `history.max_bytes`                              | number                                                            | Currently ignored (not enforced).                                       |
+| `file_opener`                                    | `vscode` \| `vscode-insiders` \| `windsurf` \| `cursor` \| `none` | URI scheme for clickable citations (default: `vscode`).                 |
+| `tui`                                            | table                                                             | TUI‑specific options (reserved).                                        |
+| `hide_agent_reasoning`                           | boolean                                                           | Hide model reasoning events.                                            |
+| `show_raw_agent_reasoning`                       | boolean                                                           | Show raw reasoning (when available).                                    |
+| `model_reasoning_effort`                         | `minimal` \| `low` \| `medium` \| `high`                          | Responses API reasoning effort.                                         |
+| `model_reasoning_summary`                        | `auto` \| `concise` \| `detailed` \| `none`                       | Reasoning summaries.                                                    |
+| `model_verbosity`                                | `low` \| `medium` \| `high`                                       | GPT‑5 text verbosity (Responses API).                                   |
+| `model_supports_reasoning_summaries`             | boolean                                                           | Force‑enable reasoning summaries.                                       |
+| `chatgpt_base_url`                               | string                                                            | Base URL for ChatGPT auth flow.                                         |
+| `experimental_resume`                            | string (path)                                                     | Resume JSONL path (internal/experimental).                              |
+| `experimental_instructions_file`                 | string (path)                                                     | Replace built‑in instructions (experimental).                           |
+| `experimental_use_exec_command_tool`             | boolean                                                           | Use experimental exec command tool.                                     |
+| `responses_originator_header_internal_override`  | string                                                            | Override `originator` header value.                                     |
+| `projects.<path>.trust_level`                    | string                                                            | Mark project/worktree as trusted (only `"trusted"` is recognized).      |
+| `preferred_auth_method`                          | `chatgpt` \| `apikey`                                             | Select default auth method (default: `chatgpt`).                        |
+| `tools.web_search`                               | boolean                                                           | Enable web search tool (alias: `web_search_request`) (default: false).  |
 
 ## Per-repo memory (local)
 
 Codex records a lightweight, per-repository memory of key actions to help you and external tools recall decisions and changes made in a project. This data is written locally to your repo and never leaves your machine.
 
+For backend options and maintenance commands, see [memory backends](./memory-backends.md).
+
 - Location: `<repo>/.codex/memory/memory.jsonl` (one JSON object per line)
 - Scope: Automatically associated with the repository that contains `.git/` or `.codex/`.
 - When entries are added:
@@ -635,11 +636,13 @@ Example (one line):
 ```
 
 Notes
+
 - This store is local-only and intended for project memory/history.
 - Clearing: delete or truncate the file at `<repo>/.codex/memory/memory.jsonl`.
 - Backups/exports: copy the JSONL file anywhere (each line is an entry).
 
 Toggle
+
 - Per run (both TUI and exec): pass `--memory off` to disable or `--memory on` to force-enable.
 - Environment variable (both modes): set `CODEX_PER_REPO_MEMORY=0|1` (also accepts `on|off`, `true|false`). Alias: `CODEX_MEMORY`.
   - Examples:
diff --git a/docs/memory-backends.md b/docs/memory-backends.md
new file mode 100644
index 00000000000..c93fe48f51a
--- /dev/null
+++ b/docs/memory-backends.md
@@ -0,0 +1,36 @@
+# Memory backends
+
+Codex stores per-repo state so you and the CLI can recall decisions across sessions. Two storage backends are available.
+
+## JSONL (default)
+
+- One JSON object per line, easy to inspect and back up.
+- Paths: `<repo>/.codex/memory/memory.jsonl` or `~/.codex/memory/memory.jsonl`.
+- Works out of the box and is diff‑friendly for version control.
+
+## SQLite (optional)
+
+- Adds atomic updates and indexes for faster queries.
+- Paths: `<repo>/.codex/memory/memory.db` or `~/.codex/memory/memory.db`.
+- Requires a build with the SQLite feature (`--features codex-memory/sqlite`).
+- Select at runtime with `CODEX_MEMORY_BACKEND=sqlite` (defaults to `jsonl`).
+
+## Migrating existing data
+
+Convert an existing JSONL store to SQLite:
+
+```bash
+codex memory migrate
+```
+
+After migration, enable SQLite with `CODEX_MEMORY_BACKEND=sqlite`.
+
+## Compacting
+
+Reclaim space and keep the store tidy:
+
+```bash
+codex memory compact
+```
+
+Depending on the active backend, this vacuums the SQLite database or rewrites the JSONL file to drop unused entries.