
Optimize LLM call volume with gated batched extraction #61

Open

zhuyankarl wants to merge 2 commits into adoresever:main from zhuyankarl:contrib/llm-call-control-20260425

Conversation


@zhuyankarl zhuyankarl commented Apr 25, 2026

## Summary

This PR adds call-count control for graph-memory's LLM-backed extraction/summary flow, aimed at reducing unnecessary LLM calls while keeping high-signal memory updates timely.

Key changes:

- Add an extraction gate that skips trivial acknowledgements and immediately extracts high-signal turns such as errors, user corrections, and explicit completion messages.
- Add debounce, max-batch, interval-flush, and session_end forced-flush behavior for normal extraction batches.
- Add a monthly virtual LLM call budget with a dynamic daily allowance based on remaining calls and remaining days in the month.
- Expose the new controls in openclaw.plugin.json and document them in README / README_CN.
- Keep token limits optional; defaults focus on call-count control.
- Add tests for extraction gating and monthly budget behavior.

## Context

In OpenClaw/CodePlan-style usage, memory extraction can otherwise run once per turn, which makes the call count grow very quickly during active coding sessions. The new default flow is:

- error / user correction / explicit completion: extract immediately
- normal messages: debounce first
- extractMaxBatchMessages: force extract
- extractFlushIntervalMs: fallback flush
- session_end: force flush pending messages

This branch also includes the preceding session-end/community-maintenance improvements, because the call-control patch builds on those touched areas and applying only the latest commit conflicts with current upstream main.

## Validation

- openclaw.plugin.json parsed successfully
- npm test -> 91 tests passed
- npm run build -> passed
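The gate-then-batch flow described above can be sketched roughly as follows. This is an illustrative sketch only: `classifyTurn`, `ExtractBatcher`, and all patterns and defaults are assumptions for exposition, not the PR's actual API or configuration keys.

```typescript
// Hypothetical sketch of gated batched extraction. High-signal turns
// (errors, corrections, completions) bypass the batch; trivial
// acknowledgements are dropped; everything else is debounced.
type Signal = "immediate" | "skip" | "batch";

function classifyTurn(text: string): Signal {
  const t = text.trim().toLowerCase();
  if (/\b(error|exception|traceback)\b/.test(t)) return "immediate";      // errors
  if (/^(no,|actually,|that's wrong|i meant)/.test(t)) return "immediate"; // user corrections
  if (/\b(done|completed|finished)\b/.test(t)) return "immediate";         // explicit completion
  if (/^(ok|okay|thanks|got it|sure)[.!]*$/.test(t)) return "skip";        // trivial ack
  return "batch";
}

class ExtractBatcher {
  private pending: string[] = [];
  private timer: ReturnType<typeof setTimeout> | null = null;
  private lastFlush = Date.now();

  constructor(
    private flush: (msgs: string[]) => void,
    private debounceMs = 5_000,          // debounce window for normal messages
    private maxBatch = 10,               // extractMaxBatchMessages analogue
    private flushIntervalMs = 120_000,   // extractFlushIntervalMs analogue
  ) {}

  push(msg: string): void {
    this.pending.push(msg);
    // Force-extract when the batch is full or the fallback interval elapsed.
    if (
      this.pending.length >= this.maxBatch ||
      Date.now() - this.lastFlush >= this.flushIntervalMs
    ) {
      this.forceFlush();
      return;
    }
    // Otherwise restart the debounce timer.
    if (this.timer) clearTimeout(this.timer);
    this.timer = setTimeout(() => this.forceFlush(), this.debounceMs);
  }

  // A session_end handler would call this to flush whatever is pending.
  forceFlush(): void {
    if (this.timer) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    if (this.pending.length === 0) return;
    const batch = this.pending;
    this.pending = [];
    this.lastFlush = Date.now();
    this.flush(batch);
  }
}
```

The point of the split is that only `"batch"` turns ever wait: a single error report still reaches memory immediately, while ten routine messages cost one extraction call instead of ten.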

Follow-up Validation Plan

We are also running this patch locally in an OpenClaw/CodePlan setup for a longer real-world validation window. After collecting enough usage data, we plan to follow up with before/after call-volume comparisons and any tuning changes needed based on actual deployment behavior.
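The monthly budget described in the summary derives a dynamic daily allowance from the calls remaining in the month and the days remaining in it. A minimal sketch, assuming even spreading with floor division (the function name and signature are illustrative, not the plugin's actual API):

```typescript
// Hypothetical daily-allowance computation: spread the remaining monthly
// budget evenly over the remaining days of the month, including today.
function dailyAllowance(
  monthlyBudget: number,
  usedThisMonth: number,
  today: Date,
): number {
  // Day 0 of the next month is the last day of the current month.
  const daysInMonth = new Date(today.getFullYear(), today.getMonth() + 1, 0).getDate();
  const remainingDays = daysInMonth - today.getDate() + 1; // count today
  const remainingCalls = Math.max(0, monthlyBudget - usedThisMonth);
  return Math.floor(remainingCalls / remainingDays);
}
```

A scheme like this self-corrects: a quiet week raises the allowance for the rest of the month, while a burst of heavy usage lowers it, so the budget is exhausted gradually rather than all at once.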

root and others added 2 commits April 25, 2026 11:56
Problems:
- session_end runs finalize + maintenance for every session (including short-lived subagents); 645 of 1928 short sessions wasted large numbers of LLM calls for nothing
- community summaries fully refresh every community on each maintenance run, which is costly in LLM calls
- no observability data such as maintenance trigger counts

Fixes:
1. index.ts: two defense lines (skip sessions with fewer than 3 messages + skip maintenance when no nodes were created)
2. community.ts: incremental summary mode (only process new communities and those with membership changes), with an automatic full refresh every 24h as a fallback
3. maintenance.ts: gm_meta trigger counter + result recording
4. db.ts: new m7_meta migration (gm_meta table)
5. store.ts: getMeta/setMeta/getIncrementalCommunities helper functions
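The two defense lines and the incremental-refresh selection in the commit above can be sketched as follows. All names (`shouldFinalize`, `communitiesToRefresh`, the 3-message threshold as a parameter) are illustrative stand-ins for the actual code in index.ts and community.ts:

```typescript
// Hypothetical session_end guards mirroring the commit's two defense lines.
interface SessionStats {
  messageCount: number; // turns seen in this session
  newNodeCount: number; // graph nodes the session actually produced
}

// Defense line 1: short-lived (e.g. subagent) sessions skip finalize entirely.
function shouldFinalize(s: SessionStats, minMessages = 3): boolean {
  return s.messageCount >= minMessages;
}

// Defense line 2: maintenance only runs when new nodes exist to maintain.
function shouldRunMaintenance(s: SessionStats): boolean {
  return shouldFinalize(s) && s.newNodeCount > 0;
}

// Hypothetical incremental community selection: summarize only new or
// membership-changed communities, falling back to a full refresh every 24h.
interface CommunityState {
  id: string;
  changed: boolean; // new community or membership changed since last summary
}

const FULL_REFRESH_INTERVAL_MS = 24 * 60 * 60 * 1000;

function communitiesToRefresh(
  all: CommunityState[],
  lastFullRefreshMs: number,
  nowMs: number,
): string[] {
  if (nowMs - lastFullRefreshMs >= FULL_REFRESH_INTERVAL_MS) {
    return all.map((c) => c.id); // periodic full refresh as a safety net
  }
  return all.filter((c) => c.changed).map((c) => c.id);
}
```

Under this shape, a two-message subagent session triggers no finalize and no maintenance at all, and a maintenance run over 100 communities with 3 changed costs 3 summary calls instead of 100.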
