fix: agent lacks reality checks (on hold for now) #390
Open
Nomikfk1215 wants to merge 1 commit into 1024XEngineer:main from
		return requiredWebLookup("user explicitly requested a remote repository lookup")
	}

	if looksLocalWorkspaceOnly(text) && !llmModelIDPattern.MatchString(text) {
looksLocalWorkspaceOnly is only honored when the prompt does not mention a model ID. That means a local-repo question like "Does the current workspace support gpt-5.4?" now falls through to the model-reality branch below and gets marked must, which will push the runner into web_search/web_fetch instead of inspecting the checked-out code. This is a behavior regression for repo-review tasks; the local-workspace short-circuit should win even when model names appear in the prompt.
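One way to read the suggestion above is to check the local-workspace condition on its own, before any model-ID logic runs. The sketch below is only an illustration: the bodies of looksLocalWorkspaceOnly and llmModelIDPattern are hypothetical stand-ins (the PR's real heuristics and pattern are not shown), and needsWebLookup is an invented helper name.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Hypothetical stand-in for the PR's llmModelIDPattern; the real pattern is not shown.
var llmModelIDPattern = regexp.MustCompile(`(?i)\b(gpt|claude|gemini)-[0-9][\w.-]*`)

// Hypothetical stand-in for looksLocalWorkspaceOnly; the real heuristics are not shown.
func looksLocalWorkspaceOnly(text string) bool {
	lower := strings.ToLower(text)
	for _, marker := range []string{"current workspace", "this repo", "checked-out code"} {
		if strings.Contains(lower, marker) {
			return true
		}
	}
	return false
}

// needsWebLookup (invented name) reports whether a prompt should be sent to the web tools.
// The local-workspace check runs first and wins even when a model ID is mentioned.
func needsWebLookup(text string) bool {
	if looksLocalWorkspaceOnly(text) {
		return false
	}
	return llmModelIDPattern.MatchString(text)
}

func main() {
	fmt.Println(needsWebLookup("Does the current workspace support gpt-5.4?")) // false
	fmt.Println(needsWebLookup("Does gpt-5.4-mini exist?"))                    // true
}
```

With this ordering, a repo-review prompt that happens to name a model stays on the local inspection path, while a bare model-existence question still requires a lookup.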
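Resolves #389. Keeping the changes for now; do not merge yet.
Issue #389 has been folded into the new issue #394 and is pending a fix.
What this addresses:
When a user's question clearly needs real-world, up-to-date, or external information, the model may no longer give a final answer straight from memory. For example, prompts such as "Does gpt-5.4-mini exist?", "Is the page at this URL reasonable?", or "Check the official site / look it up online" are now classified as requiring web_search or web_fetch first.
The behavior is modeled on how other products handle this.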
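Changes
Classify the input
Added EvaluateWebLookupRequirement in internal/policy/web_lookup.go.
Requests are classified as none/should/must; for now the focus is on must. Triggers include URLs, mentions of going online / the official site / GitHub, "latest" or "current" wording, and questions about whether a model ID really exists.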
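The classification step could be sketched roughly as follows. This is not the PR's code: classifyWebLookup is a hypothetical stand-in for EvaluateWebLookupRequirement, and the Requirement type, the regular expressions, and the keyword list are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Requirement mirrors the none/should/must levels from the description.
// The Go type and constant names are assumptions.
type Requirement int

const (
	None Requirement = iota
	Should // declared for completeness; this sketch never returns it
	Must
)

func (r Requirement) String() string {
	return [...]string{"none", "should", "must"}[r]
}

var (
	// Illustrative patterns only; the real triggers are not shown in the PR text.
	urlPattern     = regexp.MustCompile(`https?://\S+`)
	modelIDPattern = regexp.MustCompile(`(?i)\b(gpt|claude|gemini)-[0-9][\w.-]*`)
	// Hypothetical keyword list covering online/official-site/GitHub/latest/current wording.
	mustKeywords = []string{"联网", "官网", "最新", "当前", "github", "latest", "official site"}
)

// classifyWebLookup is a hypothetical stand-in for EvaluateWebLookupRequirement.
func classifyWebLookup(text string) Requirement {
	if urlPattern.MatchString(text) || modelIDPattern.MatchString(text) {
		return Must
	}
	lower := strings.ToLower(text)
	for _, kw := range mustKeywords {
		if strings.Contains(lower, kw) {
			return Must
		}
	}
	return None
}

func main() {
	fmt.Println(classifyWebLookup("gpt-5.4-mini 是否存在?"))                     // must
	fmt.Println(classifyWebLookup("Is https://example.com/page reasonable?")) // must
	fmt.Println(classifyWebLookup("Rename this variable"))                    // none
}
```

A simple keyword match like this over-triggers easily, which is why an explicit local-workspace exemption matters in the real classifier.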
Enforce the requirement at the execution layer
internal/policy/decision.go stores the requirement in PromptHintResult, which is then passed to every turn through engine_run_setup.go and run_setup.go.
Gate the final answer
internal/agent/turn_processing.go adds a check before the "no tool calls, so finalize" branch: if the current turn is must-web but neither web_search nor web_fetch has run yet, finalize is not allowed.
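The gate described above amounts to a small predicate evaluated before finalizing. The following is a minimal sketch under assumptions: turnState, webLookupSatisfied, and canFinalize are hypothetical names, and the real per-turn state in turn_processing.go is not shown in this PR text.

```go
package main

import "fmt"

// turnState is a hypothetical per-run record, not the PR's actual type.
type turnState struct {
	webLookupMust bool     // the request was classified as "must"
	toolsExecuted []string // names of tools that have already run
}

// webLookupSatisfied reports whether web_search or web_fetch has already run.
func webLookupSatisfied(s turnState) bool {
	for _, name := range s.toolsExecuted {
		if name == "web_search" || name == "web_fetch" {
			return true
		}
	}
	return false
}

// canFinalize is the check placed before the "no tool calls, so finalize" branch.
func canFinalize(s turnState) bool {
	return !s.webLookupMust || webLookupSatisfied(s)
}

func main() {
	fmt.Println(canFinalize(turnState{webLookupMust: true}))                                        // false
	fmt.Println(canFinalize(turnState{webLookupMust: true, toolsExecuted: []string{"web_search"}})) // true
	fmt.Println(canFinalize(turnState{webLookupMust: false}))                                       // true
}
```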
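Add a repair mechanism
Added internal/agent/web_lookup_guard.go. When the model tries to skip the web lookup and answer directly, a control hint is injected so that the next turn must call a web tool first. If no web tool is available, the run pauses and explains that the current tool policy does not allow this request to be completed.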
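The repair behavior can be pictured as a three-way decision. The sketch below is an assumption-laden illustration, not the contents of web_lookup_guard.go: the action type, the decide function, the allowedTools map, and both message strings are invented for the example.

```go
package main

import "fmt"

// action is a hypothetical enum for what the guard does next.
type action int

const (
	finalize         action = iota // let the model's answer through
	injectRepairHint               // add a control hint and run another turn
	pauseRun                       // stop and explain why the request cannot complete
)

// decide is a hypothetical guard: must says whether a web lookup is required,
// lookedUp whether one has already happened, allowedTools which tools the policy permits.
func decide(must, lookedUp bool, allowedTools map[string]bool) (action, string) {
	if !must || lookedUp {
		return finalize, ""
	}
	if allowedTools["web_search"] || allowedTools["web_fetch"] {
		return injectRepairHint, "Call web_search or web_fetch before giving a final answer."
	}
	return pauseRun, "The current tool policy does not allow web lookups, so this request cannot be completed."
}

func main() {
	act, msg := decide(true, false, map[string]bool{"web_search": true})
	fmt.Println(act == injectRepairHint, msg)

	act, msg = decide(true, false, map[string]bool{})
	fmt.Println(act == pauseRun, msg)

	act, _ = decide(true, true, map[string]bool{"web_search": true})
	fmt.Println(act == finalize)
}
```

Pausing rather than looping when no web tool is permitted avoids re-injecting a hint the model can never satisfy.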
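Tests were added as well:
Policy classification tests.
For a prompt like "Does gpt-5.4-mini exist?", the direct answer is intercepted first, web_search is then forced, and only after that is finalize allowed.
Verified passing: go test ./internal/agent, go test ./internal/policy, go test ./internal/context -v.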