Optimize LLM call volume with gated batched extraction #61
Open
zhuyankarl wants to merge 2 commits into adoresever:main from
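Problem:
- session_end runs finalize + maintenance for every session (including short-lived subagents); 645/1928 short sessions consumed a large number of LLM calls for nothing
- Community summaries are fully refreshed for all communities on every maintenance run, which is expensive in LLM calls
- There is no observability data such as maintenance trigger counts

Fix:
1. index.ts: two guards (skip when a session has fewer than 3 messages + skip maintenance when there are no nodes)
2. community.ts: incremental summary mode (only new communities and communities whose membership changed are processed), with an automatic full refresh every 24h as a fallback
3. maintenance.ts: gm_meta trigger counter + result recording
4. db.ts: add the m7_meta migration (gm_meta table)
5. store.ts: getMeta/setMeta/getIncrementalCommunities helper functions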
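The two guards and the incremental community selection listed in the commit message might look roughly like the sketch below. Every name in it (`SessionStats`, `shouldFinalize`, `shouldRunMaintenance`, `Community`, `selectCommunitiesToSummarize`) and the fingerprint-based change detection are hypothetical and chosen for illustration; the commit message only names `getIncrementalCommunities` in store.ts, and its real signature is not shown here.

```typescript
// Hypothetical shapes for illustration; not the plugin's actual types.
interface SessionStats {
  messageCount: number;
  nodeCount: number;
}

const MIN_MESSAGES = 3; // sessions with fewer than 3 messages are skipped
const FULL_REFRESH_MS = 24 * 60 * 60 * 1000; // 24h full-refresh fallback

// Guard 1: skip finalize entirely for very short sessions.
function shouldFinalize(s: SessionStats): boolean {
  return s.messageCount >= MIN_MESSAGES;
}

// Guard 2: skip maintenance when the session produced no graph nodes.
// (Assumption: guard 2 only applies to sessions that passed guard 1.)
function shouldRunMaintenance(s: SessionStats): boolean {
  return shouldFinalize(s) && s.nodeCount > 0;
}

interface Community {
  id: string;
  memberIds: string[];
}

// Order-independent fingerprint of a community's membership.
function fingerprint(c: Community): string {
  return [...c.memberIds].sort().join(",");
}

// Returns the communities that need a new summary: all of them when the
// last full refresh is at least 24h old, otherwise only new or changed ones.
function selectCommunitiesToSummarize(
  current: Community[],
  previous: Map<string, string>, // community id -> fingerprint at last summary
  lastFullRefreshMs: number,
  nowMs: number,
): Community[] {
  if (nowMs - lastFullRefreshMs >= FULL_REFRESH_MS) return current;
  return current.filter((c) => previous.get(c.id) !== fingerprint(c));
}
```

Comparing a membership fingerprint is only one way to detect "membership changed"; a version counter or an updated-at timestamp per community would serve the same purpose.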
## Summary

This PR adds call-count control for graph-memory's LLM-backed extraction/summary flow, aimed at reducing unnecessary LLM calls while keeping high-signal memory updates timely.

Key changes:

- Add an extraction gate that skips trivial acknowledgements and immediately extracts high-signal turns such as errors, user corrections, and explicit completion messages.
- Add debounce, max-batch, interval flush, and `session_end` forced flush behavior for normal extraction batches.
- Add a monthly virtual LLM call budget with dynamic daily allowance based on remaining calls and remaining days in the month.
- Expose the new controls in `openclaw.plugin.json` and document them in README / README_CN.
- Keep token limits optional; defaults focus on call-count control.
- Add tests for extraction gating and monthly budget behavior.

## Context

In OpenClaw/CodePlan-style usage, memory extraction can otherwise run once per turn, which makes call count grow very quickly during active coding sessions. The new default flow is:

- error / user correction / explicit completion: extract immediately
- normal messages: debounce first
- `extractMaxBatchMessages`: force extract
- `extractFlushIntervalMs`: fallback flush
- `session_end`: force flush pending messages

This branch also includes the preceding session-end/community-maintenance improvements because the call-control patch builds on those touched areas and applying only the latest commit conflicts with current upstream `main`.

## Validation

- `openclaw.plugin.json` parsed successfully
- `npm test` -> 91 tests passed
- `npm run build` -> passed

## Follow-up Validation Plan
We are also running this patch locally in an OpenClaw/CodePlan setup for a longer real-world validation window. After collecting enough usage data, we plan to follow up with before/after call-volume comparisons and any tuning changes needed based on actual deployment behavior.
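
For readers who want a concrete picture of the default flow described under Context and of the dynamic daily allowance, here is a minimal sketch. It is not the code in this PR: `classifyTurn`, `shouldFlush`, `dailyAllowance`, the regex patterns, and the "remaining calls divided by remaining days" formula are all assumptions made for illustration, and the debounce timer itself is left out.

```typescript
type GateDecision = "skip" | "immediate" | "batch";

// Illustrative patterns only; the PR's real heuristics are not shown here.
const TRIVIAL = /^(ok|okay|thanks|thank you|got it|yes|no)[.!]?$/i;
const HIGH_SIGNAL = /\b(error|exception|failed|wrong|incorrect|done|completed)\b/i;

// Gate: skip trivial acknowledgements, extract high-signal turns at once,
// and queue everything else for a batched extraction.
function classifyTurn(text: string): GateDecision {
  const t = text.trim();
  if (t.length === 0 || TRIVIAL.test(t)) return "skip";
  if (HIGH_SIGNAL.test(t)) return "immediate";
  return "batch";
}

interface BatchState {
  pending: number; // messages waiting for extraction
  lastFlushMs: number; // time of the previous flush
}

// Decides whether the pending batch should be flushed now.
function shouldFlush(
  state: BatchState,
  nowMs: number,
  maxBatch: number, // plays the role of extractMaxBatchMessages
  flushIntervalMs: number, // plays the role of extractFlushIntervalMs
  sessionEnded: boolean,
): boolean {
  if (state.pending === 0) return false;
  if (sessionEnded) return true; // session_end forces a flush
  if (state.pending >= maxBatch) return true; // max-batch forces an extract
  return nowMs - state.lastFlushMs >= flushIntervalMs; // interval fallback
}

// Assumed formula: spread the remaining monthly calls evenly over the
// remaining days of the month, counting today.
function dailyAllowance(monthlyBudget: number, usedThisMonth: number, now: Date): number {
  const remainingCalls = Math.max(0, monthlyBudget - usedThisMonth);
  // Day 0 of the next month is the last day of the current month.
  const daysInMonth = new Date(now.getFullYear(), now.getMonth() + 1, 0).getDate();
  const remainingDays = daysInMonth - now.getDate() + 1;
  return Math.floor(remainingCalls / remainingDays);
}
```

With this formula, unused calls from quiet days roll forward automatically: the allowance is recomputed from what is actually left, so a light first half of the month raises the daily ceiling for the second half.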