fix(baidu-cli): implement filterOrganic, add EvaluateJSON + retry, google-cli fixes #9
jay77721 wants to merge 1 commit into
- Implement filterOrganic: drop empty titles, recommend_list, ai_agent_distribute; keep www_index, sg_kg_ (Baike), se_com_default
- Change extractorJS to return JSON.stringify for robust EvaluateJSON parsing
- Add EvaluateJSON, evaluateWithRetry, isTransientContextError to browser client (mirrors google-cli pattern; handles CDP context-not-ready race)
- fix(google-cli): Navigate uses newTab:false to avoid tab accumulation
- fix(google-cli): drop search results with empty URLs
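
The filtering rule in the first bullet could be sketched roughly as follows. This is not the PR's code: the `Result` struct and its `Tpl` field are hypothetical names, and keeping every template that is not explicitly dropped is an assumption, since the description does not say what happens to templates outside the two lists.

```go
package main

import "strings"

// Result is a hypothetical shape for one extracted search result;
// the real struct in baidu-cli may differ.
type Result struct {
	Title string
	URL   string
	Tpl   string // Baidu result template name, e.g. "www_index"
}

// filterOrganic drops entries with empty titles and the two template
// types named in the PR description. Everything else is kept (an
// assumption), which includes www_index, sg_kg_* and se_com_default.
func filterOrganic(in []Result) []Result {
	out := make([]Result, 0, len(in))
	for _, r := range in {
		if strings.TrimSpace(r.Title) == "" {
			continue // no title: nothing useful to show
		}
		switch r.Tpl {
		case "recommend_list", "ai_agent_distribute":
			continue // skips to the next result in the loop
		}
		out = append(out, r)
	}
	return out
}
```

If the keep list is meant to be exhaustive, the `switch` would instead match the kept templates (with a `strings.HasPrefix(r.Tpl, "sg_kg_")` check) and drop everything else.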
Review: two issues in the baidu-cli changes

Thanks for the fixes! The google-cli part (Bug 4 + 5) looks correct. However, the baidu-cli part has two issues:

Issue 1: EvaluateJSON will double-unwrap in baidu-cli

In baidu-cli, Evaluate() already unwraps the {type, value} envelope and returns the inner value:

// current Evaluate() in baidu-cli
func (c *Client) Evaluate(code string) (json.RawMessage, error) {
	raw, err := c.Call("evaluate", ...)
	// ... unwraps {type, value} envelope here ...
	return env.Value, nil // returns the inner value directly
}

But the new EvaluateJSON() calls Evaluate() and then tries to unwrap the envelope a second time:

// new EvaluateJSON() in this PR
func (c *Client) EvaluateJSON(code string, v any) error {
	raw, err := c.Evaluate(code) // already unwrapped!
	var env struct {
		Type  string `json:"type"`
		Value string `json:"value"`
	}
	json.Unmarshal(raw, &env) // this will fail: raw is already the inner value, not the envelope

This works in
Fix options:
Issue 2: evaluateWithRetry is added but never called

The new evaluateWithRetry is defined, but the search code still calls EvaluateJSON directly:

// baidu/search.go after this PR's changes
if err := client.EvaluateJSON(extractorJS, &results); err != nil {
	return nil, fmt.Errorf("extract results: %w", err)
}

If retry logic is intended,
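Thanks @jay77721 for the contribution, and special thanks to @RachelXiaolan for the very thorough review above! A few additional points from me as maintainer:
Once these are fixed, push again and we will do another round of review. Looking forward to merging this PR!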
Recognizing @RachelXiaolan for code contributions (PR #17 — Status() health check + SilenceUsage for baidu-cli/google-cli) and pull-request reviews (detailed technical review on PR #9, plus follow-up PR #16). Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
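Fixed 3 bugs in baidu-cli

Bug 1: filterOrganic() was an empty implementation
- Drops recommend_list (related-search stub) and ai_agent_distribute (AI hallucination card); keeps www_index (organic results), the sg_kg_ family (Baidu Baike cards), and se_com_default (fallback results)

Bug 2: extractorJS returned a raw JS array
- Evaluate() fetched the raw JS return value, so parsing was inconsistent
- Now wrapped in JSON.stringify() and parsed via EvaluateJSON(), consistent with google-cli

Bug 3: evaluateWithRetry retry logic was missing
- Adds EvaluateJSON, evaluateWithRetry, isTransientContextError, and related functions

Fixed 2 bugs in google-cli

Bug 4: Navigate opened a new tab every time
- newTab: true made every search open a new tab, piling up empty tabs
- Now newTab: false, reusing the current tab

Bug 5: results with empty URLs were not filtered
- Drops entries with URL == ""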