fix: validate URL origin in fetch_hf_docs to prevent SSRF#83
fix: validate URL origin in fetch_hf_docs to prevent SSRF#83JasonOA888 wants to merge 2 commits intohuggingface:mainfrom
Conversation
The fetch_hf_docs tool accepted arbitrary URLs from LLM-generated arguments without origin validation. Since the request carries the user's HF Bearer token and follows redirects, a crafted prompt could trick the LLM into exfiltrating the token to an attacker-controlled server. Add _is_allowed_doc_url() which restricts fetch requests to huggingface.co, hf.co, and gradio.app origins (matching the tool's intended purpose of fetching HF/Gradio documentation).
23 test cases covering: - Allowed origins (exact + subdomain) - Blocked schemes (HTTP, FTP) - Disallowed hosts (evil.com, metadata endpoint, prefix attacks) - SSRF payloads (127.0.0.1, 0.0.0.0, ::1, localhost) - Edge cases (empty, garbage, port numbers)
|
@claude please review this PR. Focus on correctness, security, LiteLLM/Bedrock routing impact, and whether it's safe to merge against current main. Keep it concise and prefer inline comments where it matters. |
|
Claude finished @akseljoonas's task in 1m 37s —— View job PR Review: fix SSRF in
|
Summary
fetch_hf_docsaccepted arbitrary URLs from LLM-generated arguments without origin validation. Since the request carries the user's HF Bearer token (Authorization: Bearer {hf_token}) and usesfollow_redirects=True, a crafted prompt could trick the LLM into sending the token to an attacker-controlled server.Fix
Add
_is_allowed_doc_url()which restricts requests tohuggingface.co,hf.co, andgradio.app— the three origins this tool is actually designed to fetch from.The check runs before the HTTP request is made, so no auth headers ever leave allowed origins.
Before
After
Testing
ast.parseclean)host.endswith()check