sopaco · sopaco · Mar 31, 2026 · Mar 22, 2026 · Mar 25, 2026 · Mar 30, 2026
diff --git a/.gitignore b/.gitignore
@@ -11,5 +11,6 @@ __Litho_Summary_Detail__.md
 whisper-ggml.bin
 
 /.cortex
+cortex-data
 /rcproj
 /rcsurvey
diff --git a/README.md b/README.md
@@ -1,9 +1,8 @@
 <p align="center">
     <img src="./assets/intro/TopBanner.webp">
+    <img src="./assets/benchmark/cortex_mem_vs_openclaw_1.png">
 </p>
 
-<h1 align="center">Cortex Memory</h1>
-
 <p align="center">
     <a href="./README.md">English</a>
     |
@@ -18,7 +17,7 @@
 <p align="center">
     <a href="https://github.com/sopaco/cortex-mem/tree/main/litho.docs/en"><img alt="Litho Docs" src="https://img.shields.io/badge/Litho-Docs-green?logo=Gitbook&color=%23008a60"/></a>
     <a href="https://github.com/sopaco/cortex-mem/tree/main/litho.docs/zh"><img alt="Litho Docs" src="https://img.shields.io/badge/Litho-中文-green?logo=Gitbook&color=%23008a60"/></a>
-  <a href="https://raw.githubusercontent.com/sopaco/cortex-mem/refs/heads/main/assets/benchmark/cortex_mem_vs_langmem.png"><img alt="Benchmark" src="https://img.shields.io/badge/Benchmark-Perfect-green?logo=speedtest&labelColor=%231150af&color=%2300b89f"></a>
+  <a href="https://raw.githubusercontent.com/sopaco/cortex-mem/refs/heads/main/assets/benchmark/cortex_mem_vs_openclaw_3.png?raw=true"><img alt="Benchmark" src="https://img.shields.io/badge/Benchmark-Perfect-green?logo=speedtest&labelColor=%231150af&color=%2300b89f"></a>
   <a href="https://github.com/sopaco/cortex-mem/actions/workflows/rust.yml"><img alt="GitHub Actions Workflow Status" src="https://img.shields.io/github/actions/workflow/status/sopaco/cortex-mem/rust.yml?label=Build"></a>
   <a href="./LICENSE"><img alt="MIT" src="https://img.shields.io/badge/license-MIT-blue.svg?label=LICENSE" /></a>
 </p>
@@ -33,7 +32,7 @@ Cortex Memory uses a sophisticated pipeline to process and manage memories, cent
 
 | Blazing Fast **Layered Context Loading** | Context Organization as **Virtual Files** |  **Precision** Memory Retrieval |
 | :--- | :--- | :--- |
-| ![Layered Context Loading](./assets/intro/highlight_style_modern.jpg) |![architecture_style_modern](./assets/intro/architecture_style_modern.jpg) | ![architecture_style_classic](./assets/benchmark/cortex_mem_vs_langmem_thin.jpg) |
+| ![Layered Context Loading](./assets/intro/highlight_style_modern.jpg) |![architecture_style_modern](./assets/intro/architecture_style_modern.jpg) | ![architecture_style_classic](./assets/benchmark/cortex_mem_vs_openclaw_2.png) |
 
 **Cortex Memory** organizes data using a **virtual filesystem** approach with the `cortex://` URI scheme:
 
@@ -367,70 +366,65 @@ Check out the [Cortex TARS README](examples/cortex-mem-tars/README.md) for detai
 
 # 🏆 Benchmark
 
-Cortex Memory has been rigorously evaluated against LangMem using the **LOCOMO dataset** (50 conversations, 150 questions) through a standardized memory system evaluation framework. The results demonstrate Cortex Memory's superior performance across multiple dimensions.
+Cortex Memory was evaluated on the **LoCoMo10 dataset** (conv-26: 152 questions over 19 conversation sessions spanning May to October 2023) using **LLM-as-a-Judge**, the same methodology used by the official OpenViking evaluation. Cortex Memory scored higher than every OpenViking and OpenClaw configuration in the comparison below.
 
 ## Performance Comparison
 
 <p align="center">
-  <img src="./assets/benchmark/cortex_mem_vs_langmem.png" alt="Cortex Memory vs LangMem Benchmark" width="800">
+  <img src="./assets/benchmark/cortex_mem_vs_openclaw_3.png" alt="Cortex Memory vs OpenViking/OpenClaw's Built-in Memory Benchmark" width="800">
 </p>
 
 <p align="center">
-  <em><strong>Overall Performance:</strong> Cortex Memory significantly outperforms LangMem across all key metrics</em>
+  <em><strong>Overall Score:</strong> Cortex Memory v5 achieves <strong>68.42%</strong> — outperforming all OpenViking and OpenClaw configurations</em>
 </p>
 
-### Key Metrics
+### Overall Scores
 
-| Metric | Cortex Memory | LangMem | Improvement |
-|--------|---------------|---------|-------------|
-| **Recall@1** | 93.33% | 26.32% | **+67.02pp** |
-| **Recall@3** | 94.00% | 50.00% | +44.00pp |
-| **Recall@5** | 94.67% | 55.26% | +39.40pp |
-| **Recall@10** | 94.67% | 63.16% | +31.51pp |
-| **Precision@1** | 93.33% | 26.32% | +67.02pp |
-| **MRR** | 93.72% | 38.83% | **+54.90pp** |
-| **NDCG@5** | 80.73% | 18.72% | **+62.01pp** |
-| **NDCG@10** | 79.41% | 16.83% | **+62.58pp** |
+| System | Score | Questions |
+|--------|:-----:|:---------:|
+| **Cortex Memory v5 (Intent ON)** | **68.42%** | 152 |
+| OpenViking + OpenClaw (−memory-core) | 52.08% | 1,540 |
+| OpenViking + OpenClaw (+memory-core) | 51.23% | 1,540 |
+| OpenClaw + LanceDB (−memory-core) | 44.55% | 1,540 |
+| OpenClaw (built-in memory) | 35.65% | 1,540 |
 
-### Detailed Results
+### Category Breakdown (v5)
 
-<div style="text-align: center;">
-  <table style="width: 100%; margin: 0 auto;">
-    <tr>
-        <th style="width: 50%;"><strong>Cortex Memory Evaluation:</strong> Excellent retrieval performance with 93.33% Recall@1 and 93.72% MRR</td>
-        <th style="width: 50%;"><strong>LangMem Evaluation:</strong> Modest performance with 26.32% Recall@1 and 38.83% MRR</td>
-    </tr>
-    <tr>
-      <td style="width: 50%;"><img src="./assets/benchmark/evaluation_cortex_mem.webp" alt="Cortex Memory Evaluation" style="width: 100%; height: auto; display: block;"></td>
-      <td style="width: 50%;"><img src="./assets/benchmark/evaluation_langmem.webp" alt="LangMem Evaluation" style="width: 100%; height: auto; display: block;"></td>
-    </tr>
-  </table>
-</div>
+| Category | Description | Score |
+|:--------:|-------------|:-----:|
+| Cat 1 | Factual Recall | 37.50% (12/32) |
+| Cat 2 | Temporal Reasoning | 62.16% (23/37) |
+| Cat 3 | Commonsense Inference | 76.92% (10/13) |
+| Cat 4 | Multi-hop Reasoning | **84.29%** (59/70) |
+| **Total** | | **68.42%** (104/152) |
 
-### Key Findings
+### Token Efficiency
 
-1. **Significantly Improved Retrieval Accuracy**: Cortex Memory achieves **93.33% Recall@1**, a **67.02 percentage point improvement** over LangMem's 26.32%. This indicates Cortex is far superior at retrieving relevant memories on the first attempt.
+| System | Avg Tokens / Question | Score | Score per 1K Tokens |
+|--------|:---------------------:|:-----:|:-------------------:|
+| **Cortex Memory v5** | **~2,900** | **68.42%** | **23.6** |
+| OpenViking + OpenClaw (−memory-core) | ~2,769 | 52.08% | 18.8 |
+| OpenViking + OpenClaw (+memory-core) | ~1,363 | 51.23% | 37.6 |
+| OpenClaw (built-in memory) | ~15,982 | 35.65% | 2.2 |
+| OpenClaw + LanceDB (−memory-core) | ~33,490 | 44.55% | 1.3 |
 
-2. **Clear Ranking Quality Advantage**: Cortex Memory's **MRR of 93.72%** vs LangMem's **38.83%** shows it not only retrieves accurately but also ranks relevant memories higher in the result list.
+> Cortex Memory uses about **11× fewer tokens** per question than OpenClaw + LanceDB (~2,900 vs ~33,490) and achieves about **18× the score per 1K tokens** (23.6 vs 1.3).
 
-3. **Comprehensive Performance Leadership**: Across all metrics — especially **NDCG@5 (80.73% vs 18.72%)** — Cortex demonstrates consistent, significant advantages in retrieval quality, ranking accuracy, and overall performance.
+### Key Technical Advantages
 
-4. **Technical Advantages**: Cortex Memory's performance is attributed to:
-   - Efficient **Rust-based implementation**
-   - Powerful retrieval capabilities of **Qdrant vector database**
-   - **Three-tier memory hierarchy** (L0/L1/L2) with weighted scoring
-   - Optimized memory management strategies
+- **Intent-Driven Retrieval**: Routing multi-hop queries to entity and relational memory scopes improves Cat 4 accuracy by 18.75pp
+- **Hierarchical L0/L1/L2 Architecture**: Precision retrieval starting from ~100-token abstracts — you only pay for context you actually need
+- **Rust-based Implementation**: High-performance, memory-safe core backed by Qdrant vector database
 
 ### Evaluation Framework
 
-The benchmark uses a professional memory system evaluation framework located in `examples/lomoco-evaluation`, which includes:
+The benchmark script is located in `examples/locomo-evaluation` and implements a three-phase pipeline:
 
-- **Professional Metrics**: Recall@K, Precision@K, MRR, NDCG, and answer quality metrics
-- **Enhanced Dataset**: 50 conversations with 150 questions covering various scenarios
-- **Statistical Analysis**: 95% confidence intervals, standard deviation, and category-based statistics
-- **Cortex-Only Evaluation**: Dedicated evaluation workflow for Cortex Memory using the LoCoMo methodology
+1. **Ingest**: each sample's conversation sessions are ingested into Cortex Memory under a per-sample tenant
+2. **QA**: the 152 questions are answered via semantic retrieval + LLM generation
+3. **Judge**: LLM-as-a-Judge scores each answer as CORRECT / WRONG (binary, identical to the OpenViking methodology)
 
-For more details on running the evaluation, see the [lomoco-evaluation README](examples/lomoco-evaluation/README.md).
+For more details on running the evaluation, see the [locomo-evaluation README](examples/locomo-evaluation/README.md) and the full results in [`examples/locomo-evaluation/BENCHMARK.md`](examples/locomo-evaluation/BENCHMARK.md).
 
 # 🖥 Getting Started
 

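Review note on the Token Efficiency table in the README.md hunk: the "Score per 1K Tokens" column appears to be the score (in percent) divided by the average tokens per question in thousands. The sketch below recomputes the column from the table's own figures under that assumption. The `Row` struct and the helper names are illustrative only and are not part of the repository.

```rust
/// One row of the Token Efficiency table: system name, approximate average
/// tokens per question, and LLM-as-a-Judge score in percent.
struct Row {
    system: &'static str,
    avg_tokens: f64,
    score_pct: f64,
}

/// Assumed definition: score (%) divided by (average tokens / 1000).
fn score_per_1k(row: &Row) -> f64 {
    row.score_pct / (row.avg_tokens / 1000.0)
}

/// Round to one decimal place, the precision used in the table.
fn round1(x: f64) -> f64 {
    (x * 10.0).round() / 10.0
}

/// Figures copied from the table in the diff.
fn table() -> Vec<Row> {
    vec![
        Row { system: "Cortex Memory v5", avg_tokens: 2_900.0, score_pct: 68.42 },
        Row { system: "OpenViking + OpenClaw (-memory-core)", avg_tokens: 2_769.0, score_pct: 52.08 },
        Row { system: "OpenViking + OpenClaw (+memory-core)", avg_tokens: 1_363.0, score_pct: 51.23 },
        Row { system: "OpenClaw (built-in memory)", avg_tokens: 15_982.0, score_pct: 35.65 },
        Row { system: "OpenClaw + LanceDB (-memory-core)", avg_tokens: 33_490.0, score_pct: 44.55 },
    ]
}

fn main() {
    for row in table() {
        println!("{:<40} {:>5.1}", row.system, round1(score_per_1k(&row)));
    }
}
```

Under this definition the recomputed values agree with the published column, and the token ratio between OpenClaw + LanceDB and Cortex Memory v5 falls between 11 and 12.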
diff --git a/README_zh.md b/README_zh.md
@@ -18,7 +18,7 @@
 <p align="center">
     <a href="https://github.com/sopaco/cortex-mem/tree/main/litho.docs/en"><img alt="Litho Docs" src="https://img.shields.io/badge/Litho-Docs-green?logo=Gitbook&color=%23008a60"/></a>
     <a href="https://github.com/sopaco/cortex-mem/tree/main/litho.docs/zh"><img alt="Litho Docs" src="https://img.shields.io/badge/Litho-中文-green?logo=Gitbook&color=%23008a60"/></a>
-  <a href="https://raw.githubusercontent.com/sopaco/cortex-mem/refs/heads/main/assets/benchmark/cortex_mem_vs_langmem.png"><img alt="Benchmark" src="https://img.shields.io/badge/Benchmark-Perfect-green?logo=speedtest&labelColor=%231150af&color=%2300b89f"></a>
+    <a href="https://raw.githubusercontent.com/sopaco/cortex-mem/refs/heads/main/assets/benchmark/cortex_mem_vs_openclaw_3.png?raw=true"><img alt="Benchmark" src="https://img.shields.io/badge/Benchmark-Perfect-green?logo=speedtest&labelColor=%231150af&color=%2300b89f"></a>
   <a href="https://github.com/sopaco/cortex-mem/actions/workflows/rust.yml"><img alt="GitHub Actions Workflow Status" src="https://img.shields.io/github/actions/workflow/status/sopaco/cortex-mem/rust.yml?label=Build"></a>
   <a href="./LICENSE"><img alt="MIT" src="https://img.shields.io/badge/license-MIT-blue.svg?label=LICENSE" /></a>
 </p>
@@ -33,7 +33,7 @@ Cortex Memory 使用复杂的流水线来处理和管理内存，核心是**混
 
 | 高效能 **渐进式记忆披露** 搜索架构 | 基于 **虚拟文件系统** 的记忆架构 |  **高精准** 记忆检索召回能力 |
 | :--- | :--- | :--- |
-| ![Layered Context Loading](./assets/intro/highlight_style_modern.jpg) |![architecture_style_modern](./assets/intro/highlight_style_classic_2.jpg) | ![architecture_style_classic](./assets/benchmark/cortex_mem_vs_langmem_thin.jpg) |
+| ![Layered Context Loading](./assets/intro/highlight_style_modern.jpg) |![architecture_style_modern](./assets/intro/highlight_style_classic_2.jpg) | ![architecture_style_classic](./assets/benchmark/cortex_mem_vs_openclaw_2.png) |
 
 **Cortex Memory** 使用**虚拟文件系统**方法组织数据,采用 `cortex://` URI 方案：
 
@@ -54,6 +54,20 @@ cortex://agent/cases/{case_id}.md
 cortex://resources/{resource_name}/
 ```
 
+**高性能、低Token消耗的Memory解决方案，相比OpenClaw有显著优势，最高节约95%的token费用**
+LoCoMo Benchmark每题平均 Token 成本对比
+
+| 系统 | 每题输入 Tokens |
+|------|:--------------:|
+| **Cortex Memory（intent ON）** | **1,964** |
+| Cortex Memory（intent OFF） | 1,874 |
+| OpenClaw + OpenViking Plugin (-memory-core) | 2,769 |
+| OpenClaw + OpenViking Plugin (+memory-core) | 1,363 |
+| OpenClaw (memory-core 内置) | 15,982 |
+| OpenClaw + LanceDB (-memory-core) | 33,490 |
+
+> Cortex Memory 每题 效果表现与 OpenViking Plugin (+memory-core) 最强配置相当，token 成本远低于 LanceDB 和 memory-core 内置方案。
+
 <hr />
 
 # 😺 为什么使用 Cortex Memory？
@@ -368,70 +382,65 @@ cargo run --release
 
 # 🏆 基准测试
 
-Cortex Memory已使用**LOCOMO数据集**（50个对话，150个问题）通过标准化内存系统评估框架对LangMem进行了严格评估。结果表明Cortex Memory在多个维度上表现出色。
+Cortex Memory 已在 **LoCoMo10 数据集**（conv-26，152 道问题，涵盖 2023 年 5 月至 10 月共 19 个会话）上进行了严格评测，采用与 OpenViking 官方评测完全相同的 **LLM-as-a-Judge** 方法。结果表明 Cortex Memory 在所有对比系统中表现最优。
 
 ## 性能比较
 
 <p align="center">
-  <img src="./assets/benchmark/cortex_mem_vs_langmem.png" alt="Cortex Memory vs LangMem Benchmark" width="800">
+  <img src="./assets/benchmark/cortex_mem_vs_openclaw_3.png" alt="Cortex Memory vs OpenViking/OpenClaw 内置记忆 Benchmark" width="800">
 </p>
 
 <p align="center">
-  <em><strong>整体性能：</strong> Cortex Memory在所有关键指标上显著优于LangMem</em>
+  <em><strong>综合得分：</strong> Cortex Memory v5 达到 <strong>68.42%</strong> — 超越所有 OpenViking 和 OpenClaw 配置</em>
 </p>
 
-### 关键指标
+### 综合得分
 
-| 指标 | Cortex Memory | LangMem | 提升 |
-|--------|---------------|---------|-------------|
-| **Recall@1** | 93.33% | 26.32% | **+67.02pp** |
-| **Recall@3** | 94.00% | 50.00% | +44.00pp |
-| **Recall@5** | 94.67% | 55.26% | +39.40pp |
-| **Recall@10** | 94.67% | 63.16% | +31.51pp |
-| **Precision@1** | 93.33% | 26.32% | +67.02pp |
-| **MRR** | 93.72% | 38.83% | **+54.90pp** |
-| **NDCG@5** | 80.73% | 18.72% | **+62.01pp** |
-| **NDCG@10** | 79.41% | 16.83% | **+62.58pp** |
+| 系统 | 得分 | 问题数 |
+|------|:----:|:------:|
+| **Cortex Memory v5（Intent ON）** | **68.42%** | 152 |
+| OpenViking + OpenClaw（−memory-core） | 52.08% | 1,540 |
+| OpenViking + OpenClaw（+memory-core） | 51.23% | 1,540 |
+| OpenClaw + LanceDB（−memory-core） | 44.55% | 1,540 |
+| OpenClaw（内置记忆） | 35.65% | 1,540 |
 
-### 详细结果
+### v5 分类得分详情
 
-<div style="text-align: center;">
-  <table style="width: 100%; margin: 0 auto;">
-    <tr>
-        <th style="width: 50%;"><strong>Cortex Memory评估：</strong> 出色的检索性能，93.33% Recall@1和93.72% MRR</td>
-        <th style="width: 50%;"><strong>LangMem评估：</strong> 适中的性能，26.32% Recall@1和38.83% MRR</td>
-    </tr>
-    <tr>
-      <td style="width: 50%;"><img src="./assets/benchmark/evaluation_cortex_mem.webp" alt="Cortex Memory Evaluation" style="width: 100%; height: auto; display: block;"></td>
-      <td style="width: 50%;"><img src="./assets/benchmark/evaluation_langmem.webp" alt="LangMem Evaluation" style="width: 100%; height: auto; display: block;"></td>
-    </tr>
-  </table>
-</div>
+| 分类 | 说明 | 得分 |
+|:----:|------|:----:|
+| Cat 1 | 事实召回 | 37.50%（12/32） |
+| Cat 2 | 时序推理 | 62.16%（23/37） |
+| Cat 3 | 常识推断 | 76.92%（10/13） |
+| Cat 4 | 多跳推理 | **84.29%**（59/70） |
+| **合计** | | **68.42%**（104/152） |
 
-### 主要发现
+### Token 效率
 
-1. **显著提高检索准确性**：Cortex Memory实现**93.33% Recall@1**，比LangMem的26.32%**提高了67.02个百分点**。这表明Cortex在第一次尝试时就检索相关内存方面远胜于LangMem。
+| 系统 | 平均每题 Tokens | 得分 | 每千 Token 得分 |
+|------|:--------------:|:----:|:--------------:|
+| **Cortex Memory v5** | **~2,900** | **68.42%** | **23.6** |
+| OpenViking + OpenClaw（−memory-core） | ~2,769 | 52.08% | 18.8 |
+| OpenViking + OpenClaw（+memory-core） | ~1,363 | 51.23% | 37.6 |
+| OpenClaw（内置记忆） | ~15,982 | 35.65% | 2.2 |
+| OpenClaw + LanceDB（−memory-core） | ~33,490 | 44.55% | 1.3 |
 
-2. **明显的排序质量优势**：Cortex Memory的**MRR为93.72%**，而LangMem为**38.83%**，表明它不仅检索准确，而且在结果列表中更高效地排列相关内存。
+> Cortex Memory 比 OpenClaw+LanceDB **节省 11 倍 Token**，每千 Token 得分比率**高出 18 倍**。
 
-3. **全面的性能领先**：在所有指标上 - 特别是**NDCG@5（80.73% vs 18.72%）** - Cortex在检索质量、排序准确性和整体性能上显示出持续的、显著的优势。
+### 核心技术优势
 
-4. **技术优势**：Cortex Memory的性能归因于：
-   - 高效的**基于Rust的实现**
-   - **Qdrant向量数据库**的强大检索能力
-   - **三级内存层次结构**（L0/L1/L2）与加权评分
-   - 优化的内存管理策略
+- **意图驱动检索**：将多跳查询路由至实体和关联记忆范围，Cat 4 精度提升 +18.75pp
+- **L0/L1/L2 分层架构**：从约 100 Token 的精简摘要出发进行精准检索 — 只为真正需要的上下文付费
+- **Rust 实现**：高性能、内存安全的核心，以 Qdrant 向量数据库为后端
 
-### 评估框架
+### 评测框架
 
-基准测试使用位于`examples/lomoco-evaluation`的专业内存系统评估框架，包括：
+评测脚本位于 `examples/locomo-evaluation`，实现了三阶段流水线：
 
-- **专业指标**：Recall@K、Precision@K、MRR、NDCG和答案质量指标
-- **增强数据集**：50个对话，150个问题，涵盖各种场景
-- **统计分析**：95%置信区间、标准差和基于类别的统计
-- **Cortex 专用评测**：基于 LoCoMo 方法的 Cortex Memory 专用评测流程
+1. **Ingest** — 按 sample 将对话会话写入 Cortex Memory 独立租户
+2. **QA** — 通过语义检索 + LLM 生成回答 152 道问题
+3. **Judge** — LLM-as-a-Judge 对每个答案评分（CORRECT / WRONG 二值，与 OpenViking 评测方法相同）
 
-有关运行评估的更多详细信息，请参阅[lomoco-evaluation README](examples/lomoco-evaluation/README.md)。
+有关运行评测的详细说明，请参阅 [locomo-evaluation README](examples/locomo-evaluation/README.md) 以及完整结果 [`examples/locomo-evaluation/BENCHMARK.md`](examples/locomo-evaluation/BENCHMARK.md)。
 
 # 🖥 入门指南
 

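Review note on the Judge phase described in both READMEs: each answer receives a binary CORRECT / WRONG verdict, and the category table reports correct/total per category. The sketch below shows one way such verdicts could be tallied into the reported percentages. The `Verdict` and `Judged` types and the `tally` function are hypothetical and are not taken from `examples/locomo-evaluation`; the verdict list is rebuilt from the per-category counts in the table rather than from real judge output.

```rust
use std::collections::BTreeMap;

/// Binary verdict assigned by the judge (hypothetical type).
#[derive(Clone, Copy, PartialEq, Debug)]
enum Verdict {
    Correct,
    Wrong,
}

/// One judged answer, tagged with its question category (1 to 4).
struct Judged {
    category: u8,
    verdict: Verdict,
}

/// Aggregate verdicts into (correct, total) counts per category.
fn tally(results: &[Judged]) -> BTreeMap<u8, (u32, u32)> {
    let mut out = BTreeMap::new();
    for r in results {
        let entry = out.entry(r.category).or_insert((0u32, 0u32));
        if r.verdict == Verdict::Correct {
            entry.0 += 1;
        }
        entry.1 += 1;
    }
    out
}

/// Accuracy in percent; returns 0.0 for an empty category.
fn accuracy_pct(correct: u32, total: u32) -> f64 {
    if total == 0 {
        0.0
    } else {
        100.0 * f64::from(correct) / f64::from(total)
    }
}

/// Rebuild a verdict list from the counts reported in the README table.
fn reported_results() -> Vec<Judged> {
    let counts = [(1u8, 12u32, 32u32), (2, 23, 37), (3, 10, 13), (4, 59, 70)];
    let mut results = Vec::new();
    for (category, correct, total) in counts {
        for i in 0..total {
            let verdict = if i < correct { Verdict::Correct } else { Verdict::Wrong };
            results.push(Judged { category, verdict });
        }
    }
    results
}

fn main() {
    let per_cat = tally(&reported_results());
    let (mut all_correct, mut all_total) = (0u32, 0u32);
    for (cat, (correct, total)) in &per_cat {
        println!("Cat {}: {:.2}% ({}/{})", cat, accuracy_pct(*correct, *total), correct, total);
        all_correct += *correct;
        all_total += *total;
    }
    println!("Total: {:.2}% ({}/{})", accuracy_pct(all_correct, all_total), all_correct, all_total);
}
```

The per-category counts sum to 104 of 152, which matches the reported total.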
diff --git a/assets/benchmark/cortex_mem_vs_openclaw_1.png b/assets/benchmark/cortex_mem_vs_openclaw_1.png
diff --git a/assets/benchmark/cortex_mem_vs_openclaw_2.png b/assets/benchmark/cortex_mem_vs_openclaw_2.png
diff --git a/assets/benchmark/cortex_mem_vs_openclaw_3.png b/assets/benchmark/cortex_mem_vs_openclaw_3.png
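Review note on the `cortex-mem-core/src/builder.rs` change in this diff: it replaces `EmbeddingClient::new(cfg)` with `EmbeddingClient::new_with_global_limiter(cfg).await`, and the updated comment mentions a global rate limiter. The diff does not show how that limiter works, so the following is only a std-only, synchronous sketch of the general idea, in which every client shares one process-wide limiter. `Limiter`, `global_limiter`, `SketchClient`, and the 50 ms spacing are all assumptions made for illustration; the real constructor is async and may behave differently.

```rust
use std::sync::{Arc, Mutex, OnceLock};
use std::time::{Duration, Instant};

/// Hypothetical limiter that spaces calls at least `min_interval` apart.
struct Limiter {
    min_interval: Duration,
    next_allowed: Mutex<Instant>,
}

impl Limiter {
    fn new(min_interval: Duration) -> Self {
        Self { min_interval, next_allowed: Mutex::new(Instant::now()) }
    }

    /// Reserve the next free slot and return how long the caller should wait.
    fn reserve(&self) -> Duration {
        let mut next = self.next_allowed.lock().unwrap();
        let now = Instant::now();
        let slot = if *next > now { *next } else { now };
        *next = slot + self.min_interval;
        slot.duration_since(now)
    }
}

/// Process-wide limiter, created on first use and shared afterwards.
static GLOBAL_LIMITER: OnceLock<Arc<Limiter>> = OnceLock::new();

fn global_limiter() -> Arc<Limiter> {
    GLOBAL_LIMITER
        .get_or_init(|| Arc::new(Limiter::new(Duration::from_millis(50))))
        .clone()
}

/// Stand-in for an embedding client (not the real `EmbeddingClient`).
struct SketchClient {
    limiter: Arc<Limiter>,
}

impl SketchClient {
    fn new_with_global_limiter() -> Self {
        Self { limiter: global_limiter() }
    }

    /// Wait for a slot, then return a placeholder result instead of a real embedding.
    fn embed(&self, text: &str) -> usize {
        std::thread::sleep(self.limiter.reserve());
        text.len()
    }
}

fn main() {
    let a = SketchClient::new_with_global_limiter();
    let b = SketchClient::new_with_global_limiter();
    // Both clients draw from the same limiter, so their calls are spaced out together.
    println!("{} {}", a.embed("hello"), b.embed("world"));
}
```

The point of sharing one limiter is that clients built in different places cannot collectively exceed the provider's rate limit, which per-client limiters would not guarantee.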
diff --git a/cortex-mem-core/src/builder.rs b/cortex-mem-core/src/builder.rs
@@ -81,9 +81,9 @@ impl CortexMemBuilder {
         filesystem.initialize().await?;
         info!("Filesystem initialized at: {:?}", self.data_dir);
 
-        // 2. Initialize the embedding client (optional)
+        // 2. Initialize the embedding client (optional, using the global rate limiter)
         let embedding = if let Some(cfg) = self.embedding_config {
-            match EmbeddingClient::new(cfg) {
+            match EmbeddingClient::new_with_global_limiter(cfg).await {
                 Ok(client) => Some(Arc::new(client)),
                 Err(e) => {
                     warn!("Failed to create embedding client: {}", e);