fix: agent lacks reality checks (on hold) by Nomikfk1215 · Pull Request #390 · 1024XEngineer/bytemind

Nomikfk1215 · 2026-05-06T09:20:16Z

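Resolves #389. The changes are kept here but will not be merged for now.
The problem from #389 has been folded into new issue #394, where it is awaiting a fix.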

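What this addresses:

When a user question clearly needs real-world, up-to-date, or external information, the model can no longer give a final answer straight from memory. Scenarios such as "Does gpt-5.4-mini exist?", "Is this URL's page reasonable?", or "check the official site / look it up online" are now classified as requiring web_search or web_fetch first.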

How other products handle this

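Agent/Platform	Approach	Key point
OpenAI Responses/Agents	Hosted web_search with citations/sources; tool_choice: required can be used when retrieval is mandatory	The official docs also note that search is optional under auto, so mandatory retrieval has to be forced: OpenAI Web Search
Claude	Server-side web_search/web_fetch; search results carry citations by default; max_uses and domains can be restricted; fetch can only retrieve URLs that appeared in user input or search results	Closer to "retrieval is part of the model's flow", with safety boundaries: Claude Web Search, Claude Web Fetch
Gemini	Grounding with Google Search; returns groundingMetadata containing the queries, results, and citations	Closes the loop on "search, process, cite" as grounding metadata: Gemini Grounding
Cursor	Explicit @Web/@Docs/MCP to bring external material into context; web search is controllable by default	Leans toward "the user explicitly pulls in context", so the model acts on its own initiative less often
GitHub Copilot coding agent	Web search supplements library changes after the training cutoff, unusual errors, and best practices	Explicitly positions web access as compensation for the cutoff: GitHub changelog
Perplexity/Sonar	The product/API is itself web-grounded and outputs cited answers by default	Retrieval is not an add-on but the main path for generating answers: Perplexity API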

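Changes

Classify the input

Added EvaluateWebLookupRequirement in internal/policy/web_lookup.go.
Requests are classified as none/should/must; for now the focus is on must. Triggers include URLs, explicit requests to go online or check the official site or GitHub, "latest"/"current" wording, and questions about whether a model ID really exists.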
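
The classification described above could look roughly like the following sketch. It is not the PR's implementation: the Requirement type, the classifyWebLookup name, and every pattern and keyword list below are assumptions chosen for illustration, and the real trigger set in internal/policy/web_lookup.go is not shown on this page.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Requirement is a hypothetical three-level result mirroring none/should/must.
type Requirement string

const (
	RequirementNone   Requirement = "none"
	RequirementShould Requirement = "should" // defined for completeness; unused in this sketch
	RequirementMust   Requirement = "must"
)

// Illustrative triggers only (assumed, not taken from the PR).
var (
	urlPattern     = regexp.MustCompile(`https?://\S+`)
	modelIDPattern = regexp.MustCompile(`(?i)\b(gpt|claude|gemini)-[0-9][\w.-]*`)
	explicitWeb    = []string{"search the web", "official site", "github", "联网", "官网"}
	freshness      = []string{"latest", "current", "最新", "当前"}
)

// containsAny reports whether text contains at least one of the needles.
func containsAny(text string, needles []string) bool {
	for _, n := range needles {
		if strings.Contains(text, n) {
			return true
		}
	}
	return false
}

// classifyWebLookup is a simplified stand-in for EvaluateWebLookupRequirement.
func classifyWebLookup(prompt string) Requirement {
	text := strings.ToLower(prompt)
	switch {
	case urlPattern.MatchString(text):
		return RequirementMust // a URL was supplied
	case containsAny(text, explicitWeb):
		return RequirementMust // explicit request to go online
	case containsAny(text, freshness):
		return RequirementMust // freshness wording
	case modelIDPattern.MatchString(text):
		return RequirementMust // model-ID reality question
	default:
		return RequirementNone
	}
}

func main() {
	prompts := []string{
		"Does gpt-5.4-mini exist?",
		"Is https://example.com/pricing a reasonable page?",
		"Rename this variable to camelCase",
	}
	for _, p := range prompts {
		fmt.Printf("%q -> %s\n", p, classifyWebLookup(p))
	}
}
```

Keyword matching this coarse will over-trigger (the word "current" alone is enough here), which is the kind of trade-off a real classifier has to tune.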

Enforce the constraint at execution time
internal/policy/decision.go puts the requirement into PromptHintResult, which is then passed through engine_run_setup.go and run_setup.go to every turn of the run.

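Add a gate before the final answer
internal/agent/turn_processing.go adds a check before the "no tool calls, so finalize" branch: if the current turn is marked must web but neither web_search nor web_fetch has been executed yet, finalize is not allowed.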
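
A minimal sketch of such a finalize gate follows. The turnState struct, its field names, and canFinalize are hypothetical; only the tool names web_search and web_fetch come from the description above.

```go
package main

import "fmt"

// turnState is a hypothetical per-turn record; the field names are assumptions.
type turnState struct {
	webLookupRequired bool            // true when the policy marked the request "must"
	executedTools     map[string]bool // names of tools that already ran in this run
	pendingToolCalls  int             // tool calls the model requested in this turn
}

// usedWebTool reports whether either web tool has run. Reading a nil map is
// safe in Go and yields false.
func usedWebTool(s turnState) bool {
	return s.executedTools["web_search"] || s.executedTools["web_fetch"]
}

// canFinalize reports whether the "no tool calls" branch may emit a final answer.
func canFinalize(s turnState) bool {
	if s.pendingToolCalls > 0 {
		return false // tools still have to run; not the finalize branch
	}
	if s.webLookupRequired && !usedWebTool(s) {
		return false // must-web request with no web evidence yet
	}
	return true
}

func main() {
	blocked := turnState{webLookupRequired: true}
	allowed := turnState{
		webLookupRequired: true,
		executedTools:     map[string]bool{"web_search": true},
	}
	fmt.Println(canFinalize(blocked), canFinalize(allowed))
}
```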

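Add a repair mechanism
Added internal/agent/web_lookup_guard.go. When the model tries to skip the web lookup and answer directly, a control prompt is injected so that the next turn must call a web tool first. If no web tool is available, the run pauses and explains that the current tool policy does not allow this request to be completed.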
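
The repair decision could be sketched as a small three-way choice. Everything named here (guardAction, guardDecision, decideRepair, and the message texts) is an assumption for illustration; the actual control prompt injected by web_lookup_guard.go is not shown on this page.

```go
package main

import "fmt"

// guardAction is a hypothetical outcome of the web-lookup guard.
type guardAction string

const (
	actionFinalize guardAction = "finalize"
	actionRetry    guardAction = "retry_with_hint"
	actionPause    guardAction = "pause"
)

// guardDecision pairs the action with the text to inject or show.
type guardDecision struct {
	Action  guardAction
	Message string
}

// hasWebTool reports whether the tool policy exposes a web tool.
func hasWebTool(allowed []string) bool {
	for _, name := range allowed {
		if name == "web_search" || name == "web_fetch" {
			return true
		}
	}
	return false
}

// decideRepair chooses what happens when the model tries to answer directly.
func decideRepair(mustWeb, webToolUsed bool, allowedTools []string) guardDecision {
	switch {
	case !mustWeb || webToolUsed:
		return guardDecision{Action: actionFinalize}
	case hasWebTool(allowedTools):
		// Assumed wording for the injected control prompt.
		return guardDecision{
			Action:  actionRetry,
			Message: "This request needs current external information. Call web_search or web_fetch before answering.",
		}
	default:
		return guardDecision{
			Action:  actionPause,
			Message: "The current tool policy does not allow web_search or web_fetch, so this request cannot be completed.",
		}
	}
}

func main() {
	fmt.Println(decideRepair(true, false, []string{"read_file", "web_search"}).Action)
	fmt.Println(decideRepair(true, false, []string{"read_file"}).Action)
}
```

Pausing rather than answering from memory in the last case keeps the failure visible to the user instead of silently degrading the answer.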

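Tests were added as well:

policy classification tests
For a prompt such as "Does gpt-5.4-mini exist?": the direct answer is blocked first, web_search is then forced, and only after that is finalize allowed
Verified passing: go test ./internal/agent, go test ./internal/policy, go test ./internal/context -v.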
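
The "block, then search, then finalize" sequence exercised by the agent test can be illustrated with a scripted fake model. This is only a sketch of the flow under a must-web requirement; the step type and runGuarded function are hypothetical and do not correspond to the repository's test helpers.

```go
package main

import "fmt"

// step is what a scripted fake model returns on one turn.
type step struct {
	toolCall string // empty means the model wants to give a final answer
	answer   string
}

// runGuarded replays scripted steps under a must-web-lookup guard and returns
// the event trace plus the final answer ("" if the run never finalized).
func runGuarded(script []step) (trace []string, final string) {
	webUsed := false
	for _, s := range script {
		switch {
		case s.toolCall == "web_search" || s.toolCall == "web_fetch":
			webUsed = true
			trace = append(trace, "tool:"+s.toolCall)
		case s.toolCall != "":
			trace = append(trace, "tool:"+s.toolCall)
		case !webUsed:
			// Direct answer attempted before any web lookup: reject it.
			trace = append(trace, "blocked")
		default:
			trace = append(trace, "finalized")
			return trace, s.answer
		}
	}
	return trace, ""
}

func main() {
	script := []step{
		{answer: "answer from memory"},
		{toolCall: "web_search"},
		{answer: "answer backed by search results"},
	}
	trace, final := runGuarded(script)
	fmt.Println(trace, final)
}
```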

codecov · 2026-05-06T09:21:34Z

Codecov Report

❌ Patch coverage is 75.73529% with 33 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
internal/agent/turn_processing.go	42.10%	20 Missing and 2 partials ⚠️
internal/agent/web_lookup_guard.go	83.33%	3 Missing and 3 partials ⚠️
internal/policy/web_lookup.go	90.19%	3 Missing and 2 partials ⚠️


fennoai

One behavioral regression worth addressing: the new web-lookup classifier now forces remote browsing for some prompts that are explicitly about the local workspace.

fennoai · 2026-05-06T09:21:59Z

+		return requiredWebLookup("user explicitly requested a remote repository lookup")
+	}
+
+	if looksLocalWorkspaceOnly(text) && !llmModelIDPattern.MatchString(text) {


looksLocalWorkspaceOnly is only honored when the prompt does not mention a model ID. That means a local-repo question like "Does the current workspace support gpt-5.4?" now falls through to the model-reality branch below and gets marked must, which pushes the runner into web_search/web_fetch instead of inspecting the checked-out code. This is a behavior regression for repo-review tasks; the local-workspace short-circuit should win even when model names appear in the prompt.
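
To make the ordering issue concrete, here is a sketch that contrasts the condition from the diff with the suggested one. The names looksLocalWorkspaceOnly and llmModelIDPattern come from the diff, but their bodies below are assumptions, as is what the guarded branch returns; the two mustWeb functions are hypothetical reductions of the classifier to just these two checks.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Name taken from the diff; the pattern itself is assumed.
var llmModelIDPattern = regexp.MustCompile(`(?i)\b(gpt|claude|gemini)-[0-9][\w.-]*`)

// Name taken from the diff; this marker list is an assumed stand-in.
func looksLocalWorkspaceOnly(text string) bool {
	lower := strings.ToLower(text)
	for _, marker := range []string{"current workspace", "this repo", "local repo", "checked-out code"} {
		if strings.Contains(lower, marker) {
			return true
		}
	}
	return false
}

// mustWebCurrent mirrors the condition in the diff: the local short-circuit
// is skipped whenever a model ID appears in the prompt.
func mustWebCurrent(text string) bool {
	if looksLocalWorkspaceOnly(text) && !llmModelIDPattern.MatchString(text) {
		return false
	}
	return llmModelIDPattern.MatchString(text)
}

// mustWebSuggested lets the local-workspace check win unconditionally.
func mustWebSuggested(text string) bool {
	if looksLocalWorkspaceOnly(text) {
		return false
	}
	return llmModelIDPattern.MatchString(text)
}

func main() {
	local := "Does the current workspace support gpt-5.4?"
	remote := "Does gpt-5.4-mini exist?"
	fmt.Println(mustWebCurrent(local), mustWebSuggested(local))
	fmt.Println(mustWebCurrent(remote), mustWebSuggested(remote))
}
```

With the suggested ordering, the workspace question stays local while the plain model-existence question still requires a web lookup.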

fix: fix the agent's lack of mandatory reality checks, which led to outdated or inaccurate output

1a99d17

Nomikfk1215 changed the title ~~fix： agent lacks reality checks~~ fix: agent lacks reality checks May 6, 2026

fennoai Bot reviewed May 6, 2026


Nomikfk1215 changed the title ~~fix: agent lacks reality checks~~ fix: agent lacks reality checks (on hold) May 6, 2026

Nomikfk1215 wants to merge 1 commit into 1024XEngineer:main from Nomikfk1215:pr-389
