fix: agent lacks reality checks (on hold) #390

Open
Nomikfk1215 wants to merge 1 commit into 1024XEngineer:main from Nomikfk1215:pr-389

Nomikfk1215 (Collaborator) commented May 6, 2026

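Resolves #389. The changes are kept here for now and should not be merged yet.
The problem from #389 has been folded into the new issue #394 and is still pending a fix.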

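What this solves:

When a user's question clearly needs real-world, up-to-date, or external information, the model can no longer give a final answer straight from memory. Prompts such as "Does gpt-5.4-mini exist?", "Is the page at this URL plausible?", or "check the official site / look it up online" are now classified as requiring web_search or web_fetch first.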

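How other products handle this

| Agent/Platform | Approach | Key point |
| --- | --- | --- |
| OpenAI Responses/Agents | Hosted web_search with citations/sources; tool_choice: required can be used when retrieval is mandatory | The official docs also note that search is optional under auto and has to be forced when retrieval is mandatory: OpenAI Web Search |
| Claude | Server-side web_search/web_fetch; search results carry citations by default; max_uses and domains can be restricted; fetch can only retrieve URLs that came from the user or from search results | Closer to "retrieval is part of the model's workflow", with safety boundaries: Claude Web Search, Claude Web Fetch |
| Gemini | Grounding with Google Search; returns groundingMetadata containing the queries, results, and citations | Treats "search, process, cite" as a closed loop recorded in grounding metadata: Gemini Grounding |
| Cursor | Explicit @Web/@Docs/MCP bring external material in as context; Web search is controllable by default | Leans toward "the user explicitly pulls in context", which reduces unprompted decisions by the model |
| GitHub Copilot coding agent | Web search supplements library changes made after the training cutoff, unusual errors, and best practices | Explicitly positions web access as compensation for the cutoff: GitHub changelog |
| Perplexity/Sonar | The product/API is itself web-grounded and outputs cited answers by default | Retrieval is not an add-on but the main path for generating answers: Perplexity API |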

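Changes

Classify the input

Add EvaluateWebLookupRequirement in internal/policy/web_lookup.go.
Requests are classified as none/should/must; for now the focus is on must. Triggers include URLs, explicit web/official-site/GitHub requests, questions about the latest or current state of something, and questions about whether a model ID really exists.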
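
The classification described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: only the function name EvaluateWebLookupRequirement and the none/should/must levels come from the description, while the Requirement type, the regular expressions, and the keyword list are assumptions made for the example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Requirement is a hypothetical type for the none/should/must levels.
type Requirement int

const (
	RequirementNone   Requirement = iota
	RequirementShould             // defined for completeness; this sketch never returns it
	RequirementMust
)

func (r Requirement) String() string {
	switch r {
	case RequirementMust:
		return "must"
	case RequirementShould:
		return "should"
	default:
		return "none"
	}
}

// Illustrative trigger rules only; the real ones live in internal/policy/web_lookup.go.
var (
	urlPattern     = regexp.MustCompile(`https?://\S+`)
	modelIDPattern = regexp.MustCompile(`\b(gpt|claude|gemini)-[0-9a-z.\-]+`)
	webKeywords    = []string{"search the web", "official site", "github", "latest", "联网", "官网", "最新"}
)

// EvaluateWebLookupRequirement returns must when the prompt contains a URL,
// an explicit web/freshness keyword, or something that looks like a model ID.
func EvaluateWebLookupRequirement(prompt string) Requirement {
	text := strings.ToLower(prompt)
	if urlPattern.MatchString(text) || modelIDPattern.MatchString(text) {
		return RequirementMust
	}
	for _, keyword := range webKeywords {
		if strings.Contains(text, keyword) {
			return RequirementMust
		}
	}
	return RequirementNone
}

func main() {
	prompts := []string{
		"gpt-5.4-mini 是否存在?",
		"Is https://example.com/pricing a plausible page?",
		"rename this variable",
	}
	for _, p := range prompts {
		fmt.Printf("%q -> %s\n", p, EvaluateWebLookupRequirement(p))
	}
}
```

Keyword matching of this kind is cheap but prone to false positives, which is one reason a three-level result (rather than a boolean) leaves room for a softer should level later.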

Enforce the constraint at the execution layer
internal/policy/decision.go puts the requirement into PromptHintResult, which is then passed to every turn of execution through engine_run_setup.go and run_setup.go.

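Add a gate before the final answer
internal/agent/turn_processing.go adds a check in front of the "no tool calls, so finalize" branch: if the current turn is classified as must for web lookup but neither web_search nor web_fetch has been executed yet, finalize is not allowed.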
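
A minimal sketch of that gate, under assumptions: the turnState struct, its fields, and canFinalize are hypothetical names invented for this example, and only the tool names web_search and web_fetch come from the description.

```go
package main

import "fmt"

// turnState is a hypothetical view of what the turn processor knows.
type turnState struct {
	webLookupRequired bool            // true when the policy result is "must"
	executedTools     map[string]bool // names of tools that have already run
}

// webLookupSatisfied reports whether any web tool has run so far.
// Reading from a nil map is safe in Go and yields false.
func (s turnState) webLookupSatisfied() bool {
	return s.executedTools["web_search"] || s.executedTools["web_fetch"]
}

// canFinalize decides whether a turn may end the run with a final answer.
func canFinalize(s turnState, pendingToolCalls int) bool {
	if pendingToolCalls > 0 {
		return false // the model asked for tools, so this is not a finalize candidate
	}
	if s.webLookupRequired && !s.webLookupSatisfied() {
		return false // a must-web turn may not finalize before a web tool has run
	}
	return true
}

func main() {
	blocked := turnState{webLookupRequired: true}
	allowed := turnState{
		webLookupRequired: true,
		executedTools:     map[string]bool{"web_search": true},
	}
	fmt.Println("before web_search:", canFinalize(blocked, 0))
	fmt.Println("after web_search:", canFinalize(allowed, 0))
}
```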

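Add a repair mechanism
Add internal/agent/web_lookup_guard.go. When the model tries to skip the web lookup and answer directly, a control prompt is injected so that the next turn must call a web tool first. If no web tool is available, the run pauses and explains that the current tool policy does not allow this request to be completed.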
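
The repair decision could look something like the sketch below. Everything here is an assumption for illustration (guardAction, guardResult, guardWebLookup, the message texts, and the read_file tool name); the description only states the three outcomes: finalize normally, inject a control prompt, or pause when no web tool is available.

```go
package main

import "fmt"

// guardAction is a hypothetical enum for the guard's three outcomes.
type guardAction int

const (
	actionFinalize   guardAction = iota // nothing to repair
	actionInjectHint                    // force a web tool call on the next turn
	actionPause                         // no web tool is allowed, so stop and explain
)

// guardResult carries the outcome plus the text to inject or show.
type guardResult struct {
	Action  guardAction
	Message string
}

// guardWebLookup decides what to do when the model wants to answer directly.
func guardWebLookup(required, lookupDone bool, availableTools []string) guardResult {
	if !required || lookupDone {
		return guardResult{Action: actionFinalize}
	}
	for _, tool := range availableTools {
		if tool == "web_search" || tool == "web_fetch" {
			return guardResult{
				Action:  actionInjectHint,
				Message: "Call web_search or web_fetch before giving a final answer.",
			}
		}
	}
	return guardResult{
		Action:  actionPause,
		Message: "This request needs a web lookup, but the current tool policy does not allow web tools.",
	}
}

func main() {
	// read_file is a made-up tool name standing in for any non-web tool.
	withWeb := guardWebLookup(true, false, []string{"read_file", "web_search"})
	withoutWeb := guardWebLookup(true, false, []string{"read_file"})
	fmt.Println(withWeb.Action == actionInjectHint, withWeb.Message)
	fmt.Println(withoutWeb.Action == actionPause, withoutWeb.Message)
}
```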

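Tests were added as well:

Policy classification tests
For a prompt such as "Does gpt-5.4-mini exist?", the direct answer is blocked first, web_search is then forced, and only after that is finalize allowed
Verification passed: go test ./internal/agent, go test ./internal/policy, go test ./internal/context -v.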

Nomikfk1215 changed the title "fix: agent lacks reality checks" to "fix: agent lacks reality checks" May 6, 2026
codecov Bot commented May 6, 2026

Codecov Report

❌ Patch coverage is 75.73529% with 33 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| internal/agent/turn_processing.go | 42.10% | 20 Missing and 2 partials ⚠️ |
| internal/agent/web_lookup_guard.go | 83.33% | 3 Missing and 3 partials ⚠️ |
| internal/policy/web_lookup.go | 90.19% | 3 Missing and 2 partials ⚠️ |

fennoai Bot left a comment

One behavioral regression worth addressing: the new web-lookup classifier now forces remote browsing for some prompts that are explicitly about the local workspace.

return requiredWebLookup("user explicitly requested a remote repository lookup")
}

if looksLocalWorkspaceOnly(text) && !llmModelIDPattern.MatchString(text) {

looksLocalWorkspaceOnly is only honored when the prompt does not mention a model ID. That means a local-repo question like Does the current workspace support gpt-5.4? now falls through to the model-reality branch below and gets marked must, which will push the runner into web_search/web_fetch instead of inspecting the checked-out code. This is a behavior regression for repo-review tasks; the local-workspace short-circuit should win even when model names appear in the prompt.
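
To make the ordering issue concrete, here is a minimal sketch that contrasts the condition quoted above with the suggested ordering. The names looksLocalWorkspaceOnly and llmModelIDPattern come from the diff, but their bodies here are stand-ins invented for the example, as are classifyCurrent and classifySuggested; the real classifier has more branches.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Stand-in pattern; the real llmModelIDPattern is defined in the PR.
var llmModelIDPattern = regexp.MustCompile(`(?i)\bgpt-[0-9][0-9a-z.\-]*`)

// Stand-in heuristic; the real looksLocalWorkspaceOnly is defined in the PR.
func looksLocalWorkspaceOnly(text string) bool {
	lower := strings.ToLower(text)
	for _, marker := range []string{"current workspace", "this repo", "checked-out code"} {
		if strings.Contains(lower, marker) {
			return true
		}
	}
	return false
}

// classifyCurrent mirrors the quoted condition: the local short-circuit is
// skipped whenever a model ID appears in the prompt.
func classifyCurrent(text string) string {
	if looksLocalWorkspaceOnly(text) && !llmModelIDPattern.MatchString(text) {
		return "none"
	}
	if llmModelIDPattern.MatchString(text) {
		return "must"
	}
	return "none"
}

// classifySuggested lets the local-workspace check win regardless of model IDs.
func classifySuggested(text string) string {
	if looksLocalWorkspaceOnly(text) {
		return "none"
	}
	if llmModelIDPattern.MatchString(text) {
		return "must"
	}
	return "none"
}

func main() {
	prompt := "Does the current workspace support gpt-5.4?"
	fmt.Println("current:", classifyCurrent(prompt))
	fmt.Println("suggested:", classifySuggested(prompt))
}
```

With the suggested ordering, a prompt that is not about the local workspace, such as "Does gpt-5.4-mini exist?", is still classified as must.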

Nomikfk1215 changed the title "fix: agent lacks reality checks" to "fix: agent lacks reality checks (on hold)" May 6, 2026