feat: add Exa AI-powered search tool #81
Adds a general-purpose `web_search` tool backed by Exa so the agent can ground answers in current web content when HF docs, papers, and GitHub search aren't enough.

- New `agent/tools/web_search_tool.py` with a typed `WebSearchResult` model and a snippet cascade (summary > highlights > text).
- Registered through `create_builtin_tools` only when `EXA_API_KEY` is set, so the tool stays dark for users who don't configure it.
- Added to the research sub-agent's read-only tool allow-list.
- Surfaces Exa search types, category, domain filters, date filters, and content modes (text / highlights / summary).
- Tests cover response parsing, snippet fallbacks, argument validation, router gating, and the integration-attribution header.
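The key-gated registration described above could look roughly like the sketch below. It is illustrative only: `create_builtin_tools` and `EXA_API_KEY` are named in this PR, but the signature, the `env` parameter, the `web_search_enabled` helper, and the placeholder handlers are assumptions made so the example runs on its own.

```python
import os
from typing import Callable, Dict, Mapping, Optional

# Hypothetical handler type; the repo's real tool-spec objects are not shown in this PR text.
ToolHandler = Callable[[dict], str]


def web_search_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when a non-empty EXA_API_KEY is configured."""
    source = os.environ if env is None else env
    return bool(source.get("EXA_API_KEY", "").strip())


def create_builtin_tools(env: Optional[Mapping[str, str]] = None) -> Dict[str, ToolHandler]:
    """Build the tool registry; web_search is added only when the key is present."""
    # Placeholder handler standing in for the existing built-in tools.
    tools: Dict[str, ToolHandler] = {"hf_papers": lambda args: "papers result"}
    if web_search_enabled(env):
        # Placeholder handler standing in for the real Exa-backed one.
        tools["web_search"] = lambda args: "web result"
    return tools
```

Passing the environment in as a mapping is a choice made for this sketch so the gating can be exercised without mutating `os.environ`.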
@claude please review this PR. Focus on correctness, security, LiteLLM/Bedrock routing impact, and whether it's safe to merge against current main. Keep it concise and prefer inline comments where it matters.
Claude finished @akseljoonas's task in 2m 25s

PR Review: feat: add Exa AI-powered search tool

Verdict: NEEDS-CHANGES before merge (2 items) · Logic/tests are solid

Correctness

🔴 1.
# current (wrong)
dependencies = [
...
"exa-py>=2.0.0", # ← installs on every `uv sync`, key or not
]

Every deployment pays the install cost even when

🔴 2.
🟡 3.
🟡 4. Per-call
🟢 5. Arg validation is thorough and consistent
Security

🟡 Exception message forwarded to agent
🟢 No SSRF / injection surface

All HTTP calls go through the Exa SDK to the Exa API. No user-controlled URLs are fetched directly. Domain filter strings (

🟢 API key never logged or surfaced in output

The key is read from env at call time, passed only to the

LiteLLM / Bedrock Routing Impact

No impact. The web search tool doesn't call

Prompt caching (

Bedrock: no active Bedrock path exists in

Research sub-agent:

Summary
Fix the two 🔴 items and this is good to merge. The handler logic, test coverage (19 tests, all isolation patterns correct), and integration pattern with the research sub-agent are well done.
Summary

Adds a general-purpose `web_search` tool backed by Exa so the agent can ground answers in current web content when HF docs, papers, and GitHub search aren't enough, e.g. recent blog posts, announcements, non-arxiv references, or cross-domain context.

- New `agent/tools/web_search_tool.py` with a typed `WebSearchResult` model and a snippet cascade (summary → highlights → text).
- Registered through `create_builtin_tools` only when `EXA_API_KEY` is set, so the tool stays dark for users who don't configure it.
- Added to the research sub-agent's read-only tool allow-list alongside `hf_papers` and GitHub tools.
- Surfaces Exa search types (`auto`/`neural`/`fast`), `category`, include/exclude domain filters, date filters, and content modes (`text`/`highlights`/`summary`).

Usage
The agent can then call `web_search` like any other tool:

    {
      "query": "state-of-the-art flash attention kernels 2025",
      "num_results": 5,
      "type": "auto",
      "category": "research paper",
      "include_domains": ["arxiv.org", "github.com"],
      "start_published_date": "2024-01-01"
    }

Output format is consistent with the other docs/search tools: title, URL, published date, relevance score, and a snippet that falls back through summary → highlights → text.
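A minimal sketch of the summary → highlights → text fallback follows. Only the `WebSearchResult` name and the cascade order come from this PR; the use of a dataclass, the field names, the `snippet` method, the `" ... "` separator, and the 300-character cap are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WebSearchResult:
    """Illustrative result type; the PR's real model may differ in fields and base class."""
    title: str
    url: str
    published_date: Optional[str] = None
    score: Optional[float] = None
    summary: Optional[str] = None
    highlights: List[str] = field(default_factory=list)
    text: Optional[str] = None

    def snippet(self, max_chars: int = 300) -> str:
        """Pick the first non-empty source in the order summary, highlights, text."""
        if self.summary and self.summary.strip():
            chosen = self.summary
        elif any(h.strip() for h in self.highlights):
            # Join the non-blank highlights into one line.
            chosen = " ... ".join(h.strip() for h in self.highlights if h.strip())
        elif self.text and self.text.strip():
            chosen = self.text
        else:
            return ""
        return chosen.strip()[:max_chars]
```

For example, a result carrying both `highlights` and `text` but no `summary` would yield the joined highlights, and a result with none of the three yields an empty string.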
Files changed

- `agent/tools/web_search_tool.py`: new tool module (handler + spec + result type).
- `agent/tools/__init__.py`: export the tool spec / handler / enabled-check.
- `agent/core/tools.py`: conditionally register the tool when `EXA_API_KEY` is set.
- `agent/tools/research_tool.py`: allow `web_search` from the research sub-agent.
- `pyproject.toml`: add `exa-py>=2.0.0` dependency; move `pytest-asyncio` into the `dev` extras so tests run out of the box.
- `README.md`: document `EXA_API_KEY` as an optional env var.
- `tests/unit/test_web_search_tool.py`: unit tests (19 new).

Test plan
- `uv run python -m pytest tests/`: 30 passed (11 existing + 19 new).
- Tool is not registered when `EXA_API_KEY` is unset.
- Tool is registered when `EXA_API_KEY` is set (verified via `create_builtin_tools`).
- `x-exa-integration` header is set on every Exa client instance (asserted in `test_handler_happy_path_sets_integration_header`).
- Argument validation: invalid `type`, invalid `category`, non-list domain filters, missing query.
- `num_results` is clamped to `[1, 25]`.
- `search` is used when no content modes are requested; `search_and_contents` when any are.
- End-to-end check with `EXA_API_KEY` in a full CLI session (not run here; requires a key).
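The validation, clamping, and method-selection behaviour listed in the test plan could be sketched as below. This is not the PR's handler: `prepare_call`, the default of 5 results, the default `type` of `"auto"`, the boolean `text`/`highlights`/`summary` flags, and the `exclude_domains` key are assumptions. `category` validation is left out because the accepted values are not listed here, and no Exa client is called; the function only returns which SDK method name would be used.

```python
from typing import Any, Dict, Tuple

ALLOWED_TYPES = {"auto", "neural", "fast"}        # search types named in the PR
CONTENT_MODES = ("text", "highlights", "summary")  # content modes named in the PR
MIN_RESULTS, MAX_RESULTS = 1, 25                   # clamp range from the test plan
DEFAULT_RESULTS = 5                                # assumption for this sketch


def prepare_call(args: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Validate tool arguments and return (sdk_method_name, normalized_params)."""
    query = args.get("query")
    if not isinstance(query, str) or not query.strip():
        raise ValueError("query is required")

    search_type = args.get("type", "auto")
    if search_type not in ALLOWED_TYPES:
        raise ValueError(f"type must be one of {sorted(ALLOWED_TYPES)}")

    # Domain filters must be lists when present.
    for key in ("include_domains", "exclude_domains"):
        if key in args and not isinstance(args[key], list):
            raise ValueError(f"{key} must be a list")

    # Clamp num_results into [MIN_RESULTS, MAX_RESULTS].
    requested = int(args.get("num_results", DEFAULT_RESULTS))
    num_results = max(MIN_RESULTS, min(MAX_RESULTS, requested))

    params: Dict[str, Any] = {
        "query": query.strip(),
        "type": search_type,
        "num_results": num_results,
    }
    for mode in CONTENT_MODES:
        if args.get(mode):
            params[mode] = True

    # Plain search when no content is requested, search_and_contents otherwise.
    wants_content = any(mode in params for mode in CONTENT_MODES)
    method = "search_and_contents" if wants_content else "search"
    return method, params
```

Returning the method name instead of calling the SDK keeps the sketch testable without a key, in the same spirit as the unit tests described above.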