diff --git a/.gitignore b/.gitignore
index 2ca5a8c..ccfcc66 100644
--- a/.gitignore
+++ b/.gitignore
@@ -11,5 +11,6 @@ __Litho_Summary_Detail__.md
 whisper-ggml.bin
 /.cortex
+cortex-data
 /rcproj
 /rcsurvey
diff --git a/README.md b/README.md
index a1ab939..9b66345 100644
--- a/README.md
+++ b/README.md
@@ -1,9 +1,8 @@
+
 English |
@@ -18,7 +17,7 @@
@@ -33,7 +32,7 @@ Cortex Memory uses a sophisticated pipeline to process and manage memories, cent
 | Blazing Fast **Layered Context Loading** | Context Organization as **Virtual Files** | **Precision** Memory Retrieval |
 | :--- | :--- | :--- |
-|  | |  |
+|  | |  |
 **Cortex Memory** organizes data using a **virtual filesystem** approach with the `cortex://` URI scheme:
@@ -367,70 +366,65 @@ Check out the [Cortex TARS README](examples/cortex-mem-tars/README.md) for detai
 # 🏆 Benchmark
-Cortex Memory has been rigorously evaluated against LangMem using the **LOCOMO dataset** (50 conversations, 150 questions) through a standardized memory system evaluation framework. The results demonstrate Cortex Memory's superior performance across multiple dimensions.
+Cortex Memory has been rigorously evaluated on the **LoCoMo10 dataset** (conv-26, 152 questions, 19 conversation sessions spanning May–October 2023) using **LLM-as-a-Judge** — the same methodology used by the OpenViking official evaluation. The results demonstrate Cortex Memory's superior performance against all other systems.
 ## Performance Comparison
-
+
- Overall Performance: Cortex Memory significantly outperforms LangMem across all key metrics
+ Overall Score: Cortex Memory v5 achieves 68.42% — outperforming all OpenViking and OpenClaw configurations
-### Key Metrics
+### Overall Scores
-| Metric | Cortex Memory | LangMem | Improvement |
-|--------|---------------|---------|-------------|
-| **Recall@1** | 93.33% | 26.32% | **+67.02pp** |
-| **Recall@3** | 94.00% | 50.00% | +44.00pp |
-| **Recall@5** | 94.67% | 55.26% | +39.40pp |
-| **Recall@10** | 94.67% | 63.16% | +31.51pp |
-| **Precision@1** | 93.33% | 26.32% | +67.02pp |
-| **MRR** | 93.72% | 38.83% | **+54.90pp** |
-| **NDCG@5** | 80.73% | 18.72% | **+62.01pp** |
-| **NDCG@10** | 79.41% | 16.83% | **+62.58pp** |
+| System | Score | Questions |
+|--------|:-----:|:---------:|
+| **Cortex Memory v5 (Intent ON)** | **68.42%** | 152 |
+| OpenViking + OpenClaw (−memory-core) | 52.08% | 1,540 |
+| OpenViking + OpenClaw (+memory-core) | 51.23% | 1,540 |
+| OpenClaw + LanceDB (−memory-core) | 44.55% | 1,540 |
+| OpenClaw (built-in memory) | 35.65% | 1,540 |
-### Detailed Results
+### Category Breakdown (v5)
-| Cortex Memory Evaluation: Excellent retrieval performance with 93.33% Recall@1 and 93.72% MRR | LangMem Evaluation: Modest performance with 26.32% Recall@1 and 38.83% MRR |
-|---|---|
-| ![]() | ![]() |
-
-
+
- 整体性能: Cortex Memory在所有关键指标上显著优于LangMem
+ 综合得分: Cortex Memory v5 达到 68.42% — 超越所有 OpenViking 和 OpenClaw 配置
-### 关键指标
+### 综合得分
-| 指标 | Cortex Memory | LangMem | 提升 |
-|--------|---------------|---------|-------------|
-| **Recall@1** | 93.33% | 26.32% | **+67.02pp** |
-| **Recall@3** | 94.00% | 50.00% | +44.00pp |
-| **Recall@5** | 94.67% | 55.26% | +39.40pp |
-| **Recall@10** | 94.67% | 63.16% | +31.51pp |
-| **Precision@1** | 93.33% | 26.32% | +67.02pp |
-| **MRR** | 93.72% | 38.83% | **+54.90pp** |
-| **NDCG@5** | 80.73% | 18.72% | **+62.01pp** |
-| **NDCG@10** | 79.41% | 16.83% | **+62.58pp** |
+| 系统 | 得分 | 问题数 |
+|------|:----:|:------:|
+| **Cortex Memory v5(Intent ON)** | **68.42%** | 152 |
+| OpenViking + OpenClaw(−memory-core) | 52.08% | 1,540 |
+| OpenViking + OpenClaw(+memory-core) | 51.23% | 1,540 |
+| OpenClaw + LanceDB(−memory-core) | 44.55% | 1,540 |
+| OpenClaw(内置记忆) | 35.65% | 1,540 |
-### 详细结果
+### v5 分类得分详情
-| Cortex Memory评估: 出色的检索性能,93.33% Recall@1和93.72% MRR | LangMem评估: 适中的性能,26.32% Recall@1和38.83% MRR |
-|---|---|
-| ![]() | ![]() |
-