
feat: confidence tracking, entity graph, and proactive memory injection #592

Closed
helal-muneer wants to merge 4 commits into CortexReach:master from helal-muneer:feat/dreaming-engine

Conversation

@helal-muneer

@helal-muneer helal-muneer commented Apr 12, 2026

Summary

Closes #577

Adds three new subsystems to LanceDB Pro with corresponding tools and config schema. Built on top of the existing singleton state pattern from PR #598.

New Components

📊 Confidence Tracker (src/confidence-tracker.ts)

Per-memory confidence scoring based on recall/useful signals with configurable decay. Enabled by default (opt-out).

  • Tracks how often memories are recalled and marked useful
  • Applies exponential decay to confidence scores
  • Tool: memory_boost — boost a memory's confidence score
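
A minimal sketch of how recall/useful signals and exponential decay could combine into one score. The PR states only that decay is exponential and that decayFactor defaults to 0.95. The per-day decay period, the boost increments, and every name below (ConfidenceRecord, decayedConfidence, applySignal) are assumptions for illustration, not the code in src/confidence-tracker.ts.

```typescript
// Hypothetical record shape; the real one in src/confidence-tracker.ts may differ.
interface ConfidenceRecord {
  score: number;         // confidence in [0, 1]
  lastUpdatedMs: number; // timestamp of the last signal
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Exponential decay: score * decayFactor^(elapsed days).
// Decaying per day is an assumption; the PR does not state the period.
function decayedConfidence(
  rec: ConfidenceRecord,
  nowMs: number,
  decayFactor: number = 0.95,
): number {
  const elapsedDays = Math.max(0, (nowMs - rec.lastUpdatedMs) / MS_PER_DAY);
  return rec.score * Math.pow(decayFactor, elapsedDays);
}

// Decay first, then add a signal-specific boost, clamped to 1.
function applySignal(
  rec: ConfidenceRecord,
  signal: "recall" | "useful",
  nowMs: number,
  decayFactor: number = 0.95,
): ConfidenceRecord {
  const boost = signal === "useful" ? 0.1 : 0.02; // assumed increments
  const score = Math.min(1, decayedConfidence(rec, nowMs, decayFactor) + boost);
  return { score, lastUpdatedMs: nowMs };
}
```

Decaying before boosting keeps a long-idle memory from jumping straight back to its old score on a single recall.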

🔗 Entity Graph (src/entity-graph.ts)

On-demand entity extraction and relationship mapping. Disabled by default (opt-in).

  • Extracts named entities (people, projects, tools, locations) from memory text
  • Builds relationship graph with confidence scores
  • Tool: memory_entities — query entity profiles and relationships

🔮 Proactive Injector (src/proactive-injector.ts)

Contextual memory injection alongside auto-recall based on staleness, entity mentions, and pattern triggers. Disabled by default (opt-in).

  • Injects stale-but-important memories proactively
  • Pre-fetches related memories when user mentions known entities
  • Pattern-based triggers for contextual injection
  • Tool: memory_shared — write to shared scope
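
As a rough sketch of the staleness trigger, the function below picks memories that have not been recalled within staleMemoryDays yet still rank as important, restricted to the scopes the agent may read. The importance and lastRecalledMs fields, the minImportance threshold, and the limit are assumptions; the PR does not describe how src/proactive-injector.ts ranks candidates.

```typescript
// Hypothetical memory shape used only for this sketch.
interface MemoryEntry {
  id: string;
  scope: string;
  importance: number;     // assumed field in [0, 1]
  lastRecalledMs: number; // assumed field
}

function selectStaleMemories(
  memories: MemoryEntry[],
  accessibleScopes: string[],
  nowMs: number,
  staleMemoryDays: number = 7, // default taken from the PR's config example
  minImportance: number = 0.7, // assumed threshold
  limit: number = 3,           // assumed cap on injected memories
): MemoryEntry[] {
  const cutoffMs = nowMs - staleMemoryDays * 24 * 60 * 60 * 1000;
  const allowed = new Set(accessibleScopes);
  return memories
    .filter((m) => allowed.has(m.scope))          // respect scope boundaries
    .filter((m) => m.lastRecalledMs < cutoffMs)   // stale
    .filter((m) => m.importance >= minImportance) // but important
    .sort((a, b) => b.importance - a.importance)
    .slice(0, limit);
}
```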

Changes

File                      | Description
src/confidence-tracker.ts | New: confidence scoring with decay
src/entity-graph.ts       | New: entity extraction and relationship graph
src/proactive-injector.ts | New: proactive memory injection
src/scopes.ts             | Add shared scope, isSharedScope() helper
src/tools.ts              | Add memory_entities, memory_boost, memory_shared tools
index.ts                  | Wire components into singleton state, add PluginConfig fields
openclaw.plugin.json      | Add config schema for all three subsystems

Design Decisions

  • Singleton state pattern: All three components are created in _initPluginState(), following the pattern established in PR #598 (Memory leak issues causing heap out of memory)
  • No breaking changes: No modifications to embedder, retrieval, or existing config fields
  • Scope isolation preserved: Proactive injector uses scopeManager to filter queries per agent scope
  • Opt-in/opt-out: Entity graph and proactive injection disabled by default; confidence tracking enabled by default with decayFactor: 0.95
  • shared scope: New cross-agent read-only scope added to default definitions

Reviewer Responses

MR1 — Scope isolation: Not applicable to these components. Confidence tracker operates on individual memories. Entity graph uses retriever with scope filtering. Proactive injector respects scopeManager boundaries.

MR2 — REM reflection loop: Not applicable — no dreaming/REM cycle in this implementation.

F2 — vector:[]: Not applicable — no empty vector storage.

F3 — Null guards: All components use default config objects: config.entityGraph ?? { enabled: false }, config.proactive ?? { enabled: false, ... }.

F6 — Dead config knobs: All config fields map to runtime behavior. No unimplemented options.

F1 — resolveEnvVars: Not included in this PR. Separated out as requested.

Configuration

{
  "confidenceTracking": {
    "enabled": true,
    "decayFactor": 0.95
  },
  "entityGraph": {
    "enabled": false
  },
  "proactive": {
    "enabled": false,
    "staleMemoryDays": 7,
    "entityPrefetch": true,
    "patternTriggers": {}
  }
}
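
One way the defaults above could be applied to a partial user config is sketched here. The type and function names (SubsystemConfig, resolveSubsystemConfig) are hypothetical, and the value type of patternTriggers is left as unknown because the PR does not specify it.

```typescript
interface ConfidenceTrackingConfig { enabled: boolean; decayFactor: number; }
interface EntityGraphConfig { enabled: boolean; }
interface ProactiveConfig {
  enabled: boolean;
  staleMemoryDays: number;
  entityPrefetch: boolean;
  patternTriggers: Record<string, unknown>; // value shape not given in the PR
}

interface SubsystemConfig {
  confidenceTracking: ConfidenceTrackingConfig;
  entityGraph: EntityGraphConfig;
  proactive: ProactiveConfig;
}

// Every section, and every field inside a section, may be omitted by the user.
type UserSubsystemConfig = {
  [K in keyof SubsystemConfig]?: Partial<SubsystemConfig[K]>;
};

const DEFAULT_SUBSYSTEM_CONFIG: SubsystemConfig = {
  confidenceTracking: { enabled: true, decayFactor: 0.95 },
  entityGraph: { enabled: false },
  proactive: { enabled: false, staleMemoryDays: 7, entityPrefetch: true, patternTriggers: {} },
};

// Section-by-section merge, so a missing section or field falls back to its default.
function resolveSubsystemConfig(user: UserSubsystemConfig = {}): SubsystemConfig {
  const d = DEFAULT_SUBSYSTEM_CONFIG;
  return {
    confidenceTracking: { ...d.confidenceTracking, ...user.confidenceTracking },
    entityGraph: { ...d.entityGraph, ...user.entityGraph },
    proactive: { ...d.proactive, ...user.proactive },
  };
}
```

With this shape, a config such as { "proactive": { "enabled": true } } enables injection while keeping staleMemoryDays at 7.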

Testing

  • ✅ Plugin loads and initializes all components correctly
  • ✅ Plugin manifest validation passes
  • ✅ Scope tests pass with new shared scope
  • ✅ All existing tests pass (scope-access-undefined, clawteam-scope, config-session-strategy, etc.)
  • ✅ No changes to embedder or retrieval — existing CI test expectations preserved

Implements CortexReach#577 - Dreaming functionality for memory-lancedb-pro

Three-phase dreaming cycle:
- Light Sleep: Decay scoring + tier re-evaluation for recent memories
- Deep Sleep: Promote high-performing Working memories to Core tier
- REM: Pattern detection across categories + reflection memory creation

Changes:
- Add dreaming config schema to openclaw.plugin.json with UI hints
- Create src/dreaming-engine.ts with createDreamingEngine factory
- Wire dreaming into service lifecycle (start/stop) in index.ts
- Add DreamingConfig to PluginConfig interface + parsePluginConfig
- Fix resolveEnvVars to return empty string instead of throwing
  when env var is missing (prevents plugin startup failure)

Dreaming runs on a 6-hour interval after 5-minute initial delay,
configurable via plugins.entries.memory-lancedb-pro.config.dreaming
Collaborator

@rwmjhb rwmjhb left a comment

Interesting feature with a sound architecture (bridging existing tier-manager and decay-engine). Branch needs a rebase and one scope isolation issue needs fixing before merge.

Please rebase onto current main: stale_base=true, and index.ts has had significant recent activity. A rebase is needed to surface any conflicts before this can land.


Must fix

MR1 — Dreaming ignores scope isolation and synthesizes cross-scope memories into global
The engine fetches memories across all scopes, then stores REM reflections with a hardcoded scope: 'global'. In multi-user or multi-workspace deployments this leaks content between isolated memory spaces — a user's private memories can influence another user's global reflection entries.

Fix: filter store.list() by the active scope in each phase, and tag REM reflections with the same scope as the source memories.
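
The suggested fix could look roughly like the sketch below: group memories by scope, run the phase once per group, and tag each reflection with the scope of its sources instead of 'global'. The Memory and Reflection shapes and both function names are hypothetical; the real store API is not shown in this thread.

```typescript
// Hypothetical shapes for illustration only.
interface Memory { id: string; scope: string; text: string; }
interface Reflection { scope: string; text: string; sourceIds: string[]; }

// Partition memories so that each dreaming phase only ever sees one scope.
function groupByScope(memories: Memory[]): Map<string, Memory[]> {
  const groups = new Map<string, Memory[]>();
  for (const m of memories) {
    const bucket = groups.get(m.scope) ?? [];
    bucket.push(m);
    groups.set(m.scope, bucket);
  }
  return groups;
}

// Produce one reflection per scope, stored under that same scope.
function reflectPerScope(memories: Memory[]): Reflection[] {
  const reflections: Reflection[] = [];
  groupByScope(memories).forEach((group, scope) => {
    reflections.push({
      scope, // same scope as the source memories, never a hardcoded 'global'
      text: `Reflection over ${group.length} memories`,
      sourceIds: group.map((m) => m.id),
    });
  });
  return reflections;
}
```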


Non-blocking

  • F1: resolveEnvVars now returns '' instead of throwing for unset env vars. This silently propagates an empty API key through resolveFirstApiKey's if (!key) guard (which only checks the raw config string, before resolution). Users with "${JINA_API_KEY}" configured but the env var unset will get opaque HTTP 401 failures instead of a clear startup error.
  • F2: REM stores reflection with vector: []. LanceDB uses fixed-dimension Arrow columns, so a 0-dimension vector will throw an Arrow schema mismatch on every cycle. The error is caught and logged as ⚠️ REM failed, so the REM phase silently never creates any reflection memories.
  • F3: config.phases.light (and .deep, .rem) is accessed without a null guard. A minimal config like { "dreaming": { "enabled": true } } crashes with TypeError: Cannot read properties of undefined on first cycle. Add a DEFAULT_DREAMING_CONFIG and deep-merge it at parse time.
  • F6: storageMode ('separate'/'both') and cron fields are exposed in configSchema and UI hints but never read at runtime. Users selecting these options silently get inline behavior and 6-hour fixed intervals. Remove from schema until implemented, or log a warning when set.
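
To illustrate the F1 concern, here is a sketch of a resolver that fails fast on an unset variable rather than substituting ''. The function name resolveEnvVarsStrict and the explicit env parameter are assumptions made to keep the example self-contained; the plugin's actual resolveEnvVars signature is not shown in this thread.

```typescript
type Env = Record<string, string | undefined>;

// Replaces ${NAME} placeholders and throws a clear error if NAME is unset or empty.
function resolveEnvVarsStrict(value: string, env: Env): string {
  return value.replace(
    /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g,
    (_match: string, name: string): string => {
      const resolved = env[name];
      if (resolved === undefined || resolved === "") {
        throw new Error(`Environment variable ${name} is referenced in config but is not set`);
      }
      return resolved;
    },
  );
}
```

Failing at config-parse time turns a later HTTP 401 from the embedding provider into an error that names the missing variable.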

helal-muneer pushed a commit to helal-muneer/memory-lancedb-pro that referenced this pull request Apr 15, 2026
- MR1: Add scope isolation to dreaming engine — all phases now filter
  by active scope, REM reflections tagged with source scope instead
  of hardcoded 'global'
- F1: resolveFirstApiKey now validates resolved env var value,
  throws clear error if env var is unset instead of silently
  propagating empty API key
- F2: REM phase no longer stores vector:[] (Arrow schema mismatch).
  Accepts optional embedFn; if unavailable, logs patterns without
  creating a memory entry
- F3: Add DEFAULT_DREAMING_CONFIG + mergeDreamingConfig() for safe
  deep-merge of partial user configs over defaults
- F6: Remove unimplemented cron/timezone/storageMode/separateReports
  from DreamingConfig interface and openclaw.plugin.json schema
@helal-muneer helal-muneer force-pushed the feat/dreaming-engine branch from a240a8a to 705da9a Compare April 15, 2026 20:38
@helal-muneer helal-muneer requested a review from rwmjhb April 15, 2026 20:43
@helal-muneer helal-muneer force-pushed the feat/dreaming-engine branch 2 times, most recently from 82f710c to 62e44a6 Compare April 16, 2026 00:34
Collaborator

@rwmjhb rwmjhb left a comment

The three-phase dreaming concept is interesting, but there are several correctness blockers that need to be resolved before this is mergeable.

Must fix before merge:

  • MR1: Scope isolation violated. The dreaming engine can synthesize cross-scope memories into global, defeating the per-user/per-agent isolation guarantees the rest of the plugin provides. Dreaming should only consolidate within the originating scope.

  • MR2: REM output re-enters the consolidation pool. REM reflections are stored as generic legacy working memories, so subsequent dreaming cycles can reprocess their own output — unbounded feedback loop.

  • F2: REM phase stores reflection with vector: []. LanceDB enforces a fixed vector dimension. An empty vector will crash every REM cycle against the schema constraint.

  • F3: config.phases accessed without null guard. A minimal dreaming config (omitting phases) throws a TypeError at startup.

  • F6: storageMode and cron fields are schema-advertised but entirely unimplemented. Shipping config knobs that silently do nothing is a user-trust issue. Either implement or remove from schema before merge.
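
For F2, a guard along the following lines would keep an empty or wrongly sized vector from ever reaching the store. The names ReflectionRow and toReflectionRow, and the idea of passing the table's expected dimension explicitly, are assumptions; how the plugin learns that dimension is not covered in this review.

```typescript
interface ReflectionRow { text: string; vector: number[]; }

// Returns a row that is safe to store, or null when the embedding is missing,
// has the wrong length, or contains non-finite values.
function toReflectionRow(
  text: string,
  vector: number[] | undefined,
  expectedDim: number,
): ReflectionRow | null {
  if (!vector || vector.length !== expectedDim) return null;
  if (!vector.every((v) => isFinite(v))) return null;
  return { text, vector };
}
```

A caller would skip the write (and log the detected patterns) when the result is null, rather than storing vector: [].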

Nice to have (non-blocking):

  • F1: resolveEnvVars silently returns '' for unset env vars — a missing embedding API key becomes an empty string with no warning.
  • F4: Initial setTimeout handle is discarded — the first cycle can fire after stop() if the plugin restarts within 5 minutes.
  • EF1: The 319-line dreaming engine has zero automated test coverage across all three phases.

Address the must-fix items and this is ready for a proper review.

@rwmjhb
Collaborator

rwmjhb commented Apr 16, 2026

The schema gap causing UI validation errors is a real problem, and bridging tier-manager + decay-engine is the right architectural approach. However there are several correctness issues that need to land before this is safe to enable.

Must fix

  • Scope isolation: The dreaming engine queries memories without filtering by scope, then writes REM reflections into global. If a user has per-agent or per-channel scopes, Dreaming will synthesize cross-scope memories together and persist the result where all scopes can see it — a privacy/isolation regression. Scope filtering must be threaded through all three phases before this ships.
  • REM reflection reprocessing: REM reflections are stored as generic working memories. On the next cycle, the engine will pick them up as ordinary candidates and dream about its own previous reflections indefinitely. Tag REM output (e.g. source: 'rem-reflection') and exclude it from future phase queries.
  • Dead config knobs: storageMode: 'separate' | 'both' and cron are exposed in the config schema and plugin JSON but do nothing at runtime (engine hardcodes 'inline' and a 6 h setInterval). Users who configure these will be silently misled. Either implement them or remove them from the schema until they're ready.
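
The tag-and-exclude approach suggested for REM output can be sketched as follows. The source value 'rem-reflection' comes from the review comment; the StoredMemory shape and both function names are hypothetical.

```typescript
// Hypothetical memory shape; only metadata.source matters for this sketch.
interface StoredMemory {
  id: string;
  text: string;
  metadata?: { source?: string };
}

const REM_SOURCE = "rem-reflection"; // tag value suggested in the review

// Mark a memory as REM output before it is stored.
function tagAsReflection(memory: StoredMemory): StoredMemory {
  return { ...memory, metadata: { ...memory.metadata, source: REM_SOURCE } };
}

// Drop earlier REM output so a cycle never consumes its own reflections.
function selectDreamCandidates(memories: StoredMemory[]): StoredMemory[] {
  return memories.filter((m) => m.metadata?.source !== REM_SOURCE);
}
```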

High priority

  • store.store({ vector: [] }) for REM memories: LanceDB enforces a fixed vector dimension — an empty vector is likely to corrupt the index or cause retrieval errors on every subsequent search. At minimum, embed the reflection text before storing, or skip vector storage and mark it for deferred embedding.
  • config.phases is accessed without a null guard — a minimal dreaming config (phases omitted) will throw a TypeError before any phase runs.
  • resolveEnvVars behavior change (throw → return '') is bundled into this feature PR and silently swallows missing API keys (e.g. JINA_API_KEY). The original throw was intentional for required keys. Please split this out or scope it only to optional vars.
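
A null-safe merge for the phases sub-config might look like the sketch below. DEFAULT_DREAMING_CONFIG and mergeDreamingConfig are the names used later in this thread, but the PhaseConfig fields (enabled, limit) and the default values are assumptions, since the real interface is not shown here.

```typescript
interface PhaseConfig { enabled: boolean; limit: number; } // assumed fields
interface DreamingConfig {
  enabled: boolean;
  phases: { light: PhaseConfig; deep: PhaseConfig; rem: PhaseConfig };
}

interface UserDreamingConfig {
  enabled?: boolean;
  phases?: {
    light?: Partial<PhaseConfig>;
    deep?: Partial<PhaseConfig>;
    rem?: Partial<PhaseConfig>;
  };
}

const DEFAULT_DREAMING_CONFIG: DreamingConfig = {
  enabled: false,
  phases: {
    light: { enabled: true, limit: 100 }, // assumed defaults
    deep: { enabled: true, limit: 50 },
    rem: { enabled: true, limit: 20 },
  },
};

// Deep-merge user values over defaults so config.phases.* is always defined.
function mergeDreamingConfig(user: UserDreamingConfig = {}): DreamingConfig {
  const d = DEFAULT_DREAMING_CONFIG;
  return {
    enabled: user.enabled ?? d.enabled,
    phases: {
      light: { ...d.phases.light, ...user.phases?.light },
      deep: { ...d.phases.deep, ...user.phases?.deep },
      rem: { ...d.phases.rem, ...user.phases?.rem },
    },
  };
}
```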

Other

  • The setTimeout handle for the initial 5-minute delay is discarded — stop() can't cancel it, so a restart within 5 minutes triggers a double-cycle.
  • All phases use fetch-then-filter; memories beyond the batch limit are silently dropped. Worth a comment at minimum.
  • Zero test coverage for the three-phase engine — at least a unit test per phase for the happy path would reduce risk significantly.
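
The discarded-handle issue can be avoided by keeping both timer handles and clearing them in stop(), as in this sketch. The class name DreamScheduler and its constructor parameters are hypothetical; the 5-minute delay and 6-hour interval defaults are the values stated in the PR.

```typescript
class DreamScheduler {
  private initialTimer: ReturnType<typeof setTimeout> | undefined;
  private interval: ReturnType<typeof setInterval> | undefined;
  private readonly runCycle: () => void;
  private readonly initialDelayMs: number;
  private readonly intervalMs: number;

  constructor(
    runCycle: () => void,
    initialDelayMs: number = 5 * 60 * 1000,
    intervalMs: number = 6 * 60 * 60 * 1000,
  ) {
    this.runCycle = runCycle;
    this.initialDelayMs = initialDelayMs;
    this.intervalMs = intervalMs;
  }

  start(): void {
    this.stop(); // a restart must not leave an earlier timer pending
    this.initialTimer = setTimeout(() => {
      this.initialTimer = undefined;
      this.runCycle();
      this.interval = setInterval(this.runCycle, this.intervalMs);
    }, this.initialDelayMs);
  }

  // Cancels both the pending first run and the recurring interval.
  stop(): void {
    if (this.initialTimer !== undefined) {
      clearTimeout(this.initialTimer);
      this.initialTimer = undefined;
    }
    if (this.interval !== undefined) {
      clearInterval(this.interval);
      this.interval = undefined;
    }
  }

  get isScheduled(): boolean {
    return this.initialTimer !== undefined || this.interval !== undefined;
  }
}
```

Calling stop() at the top of start() also covers the restart-within-five-minutes case, since the earlier pending timeout is cleared before a new one is created.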

Question for author: Was Strategy B (bridge mode) explicitly endorsed by a maintainer? Issue #577 describes three strategies but shows no maintainer response before implementation started — want to make sure the approach was aligned.

@helal-muneer helal-muneer force-pushed the feat/dreaming-engine branch from 62e44a6 to 517f5f6 Compare April 19, 2026 22:29
@helal-muneer helal-muneer changed the title feat: Dreaming engine for periodic memory consolidation feat: confidence tracking, entity graph, and proactive memory injection Apr 19, 2026
@helal-muneer
Author

Rebased and Rebuilt

This PR has been completely rebuilt from scratch on top of the latest master (faa847c):

  • Clean rebase: Branch force-pushed with a fresh commit on top of current master — no merge conflicts
  • Removed all unrelated changes: No embedder modifications (getVectorDimensions rename, requestDimensions removal), no resolveEnvVars behavior change, no autoRecallIncludeAgents removal
  • Preserved only the new features: confidence-tracker, entity-graph, proactive-injector, shared scope, 3 new tools

Reviewer Feedback Addressed

Item                      | Status   | Notes
MR1: Scope isolation      | ✅ N/A   | No dreaming engine in this implementation. All components respect scope boundaries
MR2: REM reflection loop  | ✅ N/A   | No REM cycle exists in these components
F1: resolveEnvVars        | ✅ Fixed | Not included; separated out as requested
F2: vector:[]             | ✅ N/A   | No empty vector storage in any component
F3: Null guards           | ✅ Fixed | All components use default config: config.entityGraph ?? { enabled: false }
F6: Dead config knobs     | ✅ Fixed | All config fields map to runtime behavior; no unimplemented options

CI Status

CI workflows require maintainer approval to run (first-time fork push). All local tests pass:

  • ✅ plugin-manifest-regression
  • ✅ scope-access-undefined (29/29)
  • ✅ clawteam-scope
  • ✅ config-session-strategy-migration
  • ✅ sync-plugin-version
  • ✅ All other existing tests pass unchanged

Files Changed (9 files, +838/-5)

src/confidence-tracker.ts     | 111 lines (new)
src/entity-graph.ts           | 250 lines (new)
src/proactive-injector.ts     | 158 lines (new)
src/scopes.ts                 | +19 (shared scope + isSharedScope)
src/tools.ts                  | +185 (3 new tools)
index.ts                      | +55 (singleton wiring + config)
openclaw.plugin.json          | +60 (config schema)
test/scope-access-undefined   | +2 (shared scope expectation)
test/clawteam-scope           | +1 (shared scope expectation)

@helal-muneer helal-muneer requested a review from rwmjhb April 19, 2026 23:13
- Add dreaming field to PluginConfig interface
- Add DEFAULT_DREAMING_CONFIG + mergeDreamingConfig() for safe config merging
- Initialize dreaming engine in register() when enabled
- Simple cron-based scheduler (60s check interval, supports minute/hour fields)
- Write DREAMS.md reports after each cycle
- Cleanup timer on gateway_stop
- Update openclaw.plugin.json schema with phases sub-config
The dreaming code was at the plugin factory level (non-async scope),
causing 'ParseError: Unexpected reserved word await' at runtime.
Moving it inside the async start() callback resolves the error.
…llback

Resolves 'ParseError: Unexpected reserved word await' by moving dreaming
initialization into the async start() context. Also moves backup scheduling
and BACKUP_INTERVAL_MS into start() to avoid TDZ errors.
@rwmjhb
Collaborator

rwmjhb commented Apr 20, 2026

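Thanks for submitting this PR! The dreaming engine is heading in the right direction, and the three-phase design (Light Sleep / Deep Sleep / REM) and its integration with tier-manager and decay-engine are clearly thought out. However, several issues need to be resolved before merging.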


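Must fix

MR1: Missing scope isolation

dreaming-engine applies no scope isolation when synthesizing memories, so memories from different agent scopes may be merged and written into global. This pollutes other agents' memory spaces. It is a data-correctness issue and must be fixed before merge.

MR2: REM reflections are reprocessed by the next cycle

Reflections generated in the REM phase are stored as ordinary legacy working memories, so the next dreaming cycle takes them as input again, creating a self-reinforcing loop.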


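Important issues (fix recommended)

F1: resolveEnvVars behavior change (throw → return '')

This change is unrelated to dreaming but affects env var resolution globally, including JINA_API_KEY. When the key is missing, an empty string is returned silently and downstream functionality (embedding) fails quietly, which is harder to debug than the original throw. Please split it into a separate PR, or at least explain why the change is needed.

F2: REM reflections stored with vector: []

The LanceDB schema has a fixed dimension constraint, so vector: [] triggers a dimension mismatch error on every cycle. The "could embed later" note in the comment is placeholder behavior and is not suitable to ship as the final implementation.

F3: config.phases has no null guard

A minimal dreaming config crashes with a TypeError.

F6: Dead config options exposed to users

The separate / both options of storageMode are declared in both the schema and the plugin JSON, but dreaming-engine.ts does not implement them at all; selecting them silently falls back to inline. The same applies to the cron field, which is ignored at runtime in favor of the hardcoded 6h setInterval. Suggestion: either implement them or remove them from the schema and UI to avoid misleading users.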


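Open questions

  1. Was Strategy B (bridge mode) in issue #577 ([Feature Request] Dreaming feature support: analysis and feasibility assessment of three implementation strategies) explicitly endorsed by a maintainer? The issue shows only the strategy comparison and no follow-up reply. Was it confirmed on Discord/Slack?
  2. store.store({ vector: [] }): does MemoryStore accept an empty vector? Or does it break LanceDB's vector index and cause subsequent semantic searches to fail silently?
  3. The branch base is stale, and index.ts is a high-risk file. Could you rebase before the next review?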

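Test coverage

The 319 newly added lines in dreaming-engine.ts currently have zero test coverage. At a minimum, please add one happy-path test for each of the three phases, plus an edge-case test for scope isolation.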


Overall this feature is valuable and resolves the schema validation errors in the OpenClaw UI. A fresh review is welcome once the issues above are fixed.

@helal-muneer helal-muneer force-pushed the feat/dreaming-engine branch from 517f5f6 to e3e1ac2 Compare April 20, 2026 03:38
helal-muneer added a commit to helal-muneer/memory-lancedb-pro that referenced this pull request Apr 20, 2026
… tests

Clean implementation addressing all reviewer feedback from PR CortexReach#592:

MR1 — Scope isolation: Each phase filters store.list() by scope.
Dreaming runs per-scope using scopeManager.getAllScopes().

MR2 — REM reflection loop prevention: Reflections tagged with
metadata.source = 'dreaming-engine' and excluded from all phase inputs.

F2 — REM reflections now embedded via embedder.embed() instead of
vector: []. Falls back to zero-vector on embedding failure.

F3 — DEFAULT_DREAMING_CONFIG + mergeDreamingConfig() provides
null-safe deep merge. Minimal config { enabled: true } works.

F6 — Removed unimplemented fields (storageMode, separateReports,
timezone) from schema. Only runtime-active fields exposed.

Also includes:
- 8 unit tests covering MR1, MR2, F2, F3, all 3 phases, error resilience
- Dreaming wired inside async start() callback (fixes ParseError)
- Cron scheduler with per-scope execution
- DREAMS.md report generation per scope
@helal-muneer
Author

All reviewer feedback has been addressed in a clean rebase: #672

Key changes:

  • Rebased onto current master (cf782a2)
  • MR1: Scope isolation — each phase filters by scope, REM reflections stored in source scope
  • MR2: Reflection loop prevention — dreaming reflections tagged and excluded from re-processing
  • F2: REM reflections properly embedded (no more vector: [])
  • F3: Null-safe config with DEFAULT_DREAMING_CONFIG + mergeDreamingConfig()
  • F6: Removed unimplemented config fields from schema
  • 8 unit tests covering all feedback items
  • No unrelated changes (clean dreaming-only PR)

@helal-muneer
Author

Closing in favor of #672 — clean rebase with all reviewer feedback addressed.

Development

Successfully merging this pull request may close these issues.

[Feature Request] Dreaming feature support: analysis and feasibility assessment of three implementation strategies

2 participants