
Optimize LLM call volume with gated batched extraction #61

Open

zhuyankarl wants to merge 2 commits into adoresever:main from zhuyankarl:contrib/llm-call-control-20260425

Conversation


@zhuyankarl zhuyankarl commented Apr 25, 2026

## Summary

This PR adds call-count control for graph-memory's LLM-backed extraction/summary flow, aimed at reducing unnecessary LLM calls while keeping high-signal memory updates timely.

Key changes:

- Add an extraction gate that skips trivial acknowledgements and immediately extracts high-signal turns such as errors, user corrections, and explicit completion messages.
- Add debounce, max-batch, interval-flush, and session_end forced-flush behavior for normal extraction batches.
- Add a monthly virtual LLM call budget with a dynamic daily allowance based on remaining calls and remaining days in the month.
- Expose the new controls in openclaw.plugin.json and document them in README / README_CN.
- Keep token limits optional; defaults focus on call-count control.
- Add tests for extraction gating and monthly budget behavior.

## Context

In OpenClaw/CodePlan-style usage, memory extraction can otherwise run once per turn, which makes the call count grow very quickly during active coding sessions. The new default flow is:

- error / user correction / explicit completion: extract immediately
- normal messages: debounce first
- extractMaxBatchMessages: force extract
- extractFlushIntervalMs: fallback flush
- session_end: force flush pending messages

This branch also includes the preceding session-end/community-maintenance improvements, because the call-control patch builds on those touched areas and applying only the latest commit conflicts with current upstream main.

## Validation

- openclaw.plugin.json parsed successfully
- npm test -> 91 tests passed
- npm run build -> passed
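The gate-then-batch flow described above can be sketched roughly as follows. This is an illustrative sketch only: `classifyTurn`, `ExtractBatcher`, and all patterns and defaults are assumptions for exposition, not the PR's actual API or configuration keys.

```typescript
// Hypothetical sketch of gated batched extraction. High-signal turns
// (errors, corrections, completions) bypass the batch; trivial
// acknowledgements are dropped; everything else is debounced.
type Signal = "immediate" | "skip" | "batch";

function classifyTurn(text: string): Signal {
  const t = text.trim().toLowerCase();
  if (/\b(error|exception|traceback)\b/.test(t)) return "immediate";      // errors
  if (/^(no,|actually,|that's wrong|i meant)/.test(t)) return "immediate"; // user corrections
  if (/\b(done|completed|finished)\b/.test(t)) return "immediate";         // explicit completion
  if (/^(ok|okay|thanks|got it|sure)[.!]*$/.test(t)) return "skip";        // trivial ack
  return "batch";
}

class ExtractBatcher {
  private pending: string[] = [];
  private timer: ReturnType<typeof setTimeout> | null = null;
  private lastFlush = Date.now();

  constructor(
    private flush: (msgs: string[]) => void,
    private debounceMs = 5_000,          // debounce window for normal messages
    private maxBatch = 10,               // extractMaxBatchMessages analogue
    private flushIntervalMs = 120_000,   // extractFlushIntervalMs analogue
  ) {}

  push(msg: string): void {
    this.pending.push(msg);
    // Force-extract when the batch is full or the fallback interval elapsed.
    if (
      this.pending.length >= this.maxBatch ||
      Date.now() - this.lastFlush >= this.flushIntervalMs
    ) {
      this.forceFlush();
      return;
    }
    // Otherwise restart the debounce timer.
    if (this.timer) clearTimeout(this.timer);
    this.timer = setTimeout(() => this.forceFlush(), this.debounceMs);
  }

  // A session_end handler would call this to flush whatever is pending.
  forceFlush(): void {
    if (this.timer) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    if (this.pending.length === 0) return;
    const batch = this.pending;
    this.pending = [];
    this.lastFlush = Date.now();
    this.flush(batch);
  }
}
```

The point of the split is that only `"batch"` turns ever wait: a single error report still reaches memory immediately, while ten routine messages cost one extraction call instead of ten.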

Follow-up Validation Plan

We are also running this patch locally in an OpenClaw/CodePlan setup for a longer real-world validation window. After collecting enough usage data, we plan to follow up with before/after call-volume comparisons and any tuning changes needed based on actual deployment behavior.
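The monthly budget described in the summary derives a dynamic daily allowance from the calls remaining in the month and the days remaining in it. A minimal sketch, assuming even spreading with floor division (the function name and signature are illustrative, not the plugin's actual API):

```typescript
// Hypothetical daily-allowance computation: spread the remaining monthly
// budget evenly over the remaining days of the month, including today.
function dailyAllowance(
  monthlyBudget: number,
  usedThisMonth: number,
  today: Date,
): number {
  // Day 0 of the next month is the last day of the current month.
  const daysInMonth = new Date(today.getFullYear(), today.getMonth() + 1, 0).getDate();
  const remainingDays = daysInMonth - today.getDate() + 1; // count today
  const remainingCalls = Math.max(0, monthlyBudget - usedThisMonth);
  return Math.floor(remainingCalls / remainingDays);
}
```

A scheme like this self-corrects: a quiet week raises the allowance for the rest of the month, while a burst of heavy usage lowers it, so the budget is exhausted gradually rather than all at once.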

root and others added 2 commits April 25, 2026 11:56
Problems:
- session_end runs finalize + maintenance for every session (including short-lived subagents); 645 of 1928 short sessions wasted large numbers of LLM calls for nothing
- community summaries fully refresh every community on each maintenance run, which is costly in LLM calls
- no observability data such as maintenance trigger counts

Fixes:
1. index.ts: two defense lines (skip sessions with fewer than 3 messages + skip maintenance when no nodes were created)
2. community.ts: incremental summary mode (only process new communities and those with membership changes), with an automatic full refresh every 24h as a fallback
3. maintenance.ts: gm_meta trigger counter + result recording
4. db.ts: new m7_meta migration (gm_meta table)
5. store.ts: getMeta/setMeta/getIncrementalCommunities helper functions
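The two defense lines and the incremental-refresh selection in the commit above can be sketched as follows. All names (`shouldFinalize`, `communitiesToRefresh`, the 3-message threshold as a parameter) are illustrative stand-ins for the actual code in index.ts and community.ts:

```typescript
// Hypothetical session_end guards mirroring the commit's two defense lines.
interface SessionStats {
  messageCount: number; // turns seen in this session
  newNodeCount: number; // graph nodes the session actually produced
}

// Defense line 1: short-lived (e.g. subagent) sessions skip finalize entirely.
function shouldFinalize(s: SessionStats, minMessages = 3): boolean {
  return s.messageCount >= minMessages;
}

// Defense line 2: maintenance only runs when new nodes exist to maintain.
function shouldRunMaintenance(s: SessionStats): boolean {
  return shouldFinalize(s) && s.newNodeCount > 0;
}

// Hypothetical incremental community selection: summarize only new or
// membership-changed communities, falling back to a full refresh every 24h.
interface CommunityState {
  id: string;
  changed: boolean; // new community or membership changed since last summary
}

const FULL_REFRESH_INTERVAL_MS = 24 * 60 * 60 * 1000;

function communitiesToRefresh(
  all: CommunityState[],
  lastFullRefreshMs: number,
  nowMs: number,
): string[] {
  if (nowMs - lastFullRefreshMs >= FULL_REFRESH_INTERVAL_MS) {
    return all.map((c) => c.id); // periodic full refresh as a safety net
  }
  return all.filter((c) => c.changed).map((c) => c.id);
}
```

Under this shape, a two-message subagent session triggers no finalize and no maintenance at all, and a maintenance run over 100 communities with 3 changed costs 3 summary calls instead of 100.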
