docs(examples): add regex-allowlist credential-leak example + index-page link #23
Merged
…age link

Adds a new worked example explaining why hand-rolled regexes are the wrong tool for hostname allowlists, why URLPattern is the textbook fix, and how to spell the common "host or any subdomain" policy as a component-aware pattern rather than a regex tweak.

The canonical case is invoke-ai/InvokeAI#7518: a configuration field where each trusted upstream gets a regex paired with a credential. The naive entry ``url_regex: 'private.example'`` leaks the secret when the client visits either of two attacker-controlled URL shapes:

- https://malicious.example/private.example/theft.safetensors (path-segment fallthrough; re.search finds the literal anywhere)
- https://private.example.malicious.example/theft.safetensors (subdomain shadowing; the legitimate label sits inside the attacker's host)

A component-aware URLPattern matches the hostname *as the hostname*; it cannot be tricked into accepting a path segment or a label-of-attacker's-host that happens to spell the same text.

Wires the new page into:

- docs/examples/index.md under a new "Security" section
- properdocs.yml nav (alongside the webhook-shape validator)
- docs/index.md home page: a sly inline note that hand-rolled regex URL allowlists are routinely error-prone, linking to the seed InvokeAI issue. The differentiator paragraph now closes with one extra clause that points security-curious readers at the example.

No code changes; docs only. ``just docs`` builds clean in strict mode.
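For reviewers who want to see the two attack shapes concretely, here is a minimal sketch using only the Python standard library. It is not the example page's code and does not use the URLPattern API: `regex_allows`, `host_allows`, and the `URLS` table are hypothetical names, the legitimate URL is made up for illustration, and the parsed-hostname comparison is only a stand-in for what a component-aware pattern would do.

```python
import re
from urllib.parse import urlsplit

# Naive allowlist entry from the commit message (the dot is also unescaped).
URL_REGEX = "private.example"

# One assumed legitimate URL plus the two attacker-controlled shapes.
URLS = {
    "legitimate": "https://private.example/model.safetensors",
    "path-segment fallthrough": "https://malicious.example/private.example/theft.safetensors",
    "subdomain shadowing": "https://private.example.malicious.example/theft.safetensors",
}


def regex_allows(url: str) -> bool:
    """Naive check: the pattern may match anywhere in the flat URL string."""
    return re.search(URL_REGEX, url) is not None


def host_allows(url: str, trusted_host: str = "private.example") -> bool:
    """Stand-in for a component-aware check: inspect the parsed hostname only.

    Accepts the trusted host itself or any subdomain of it.
    """
    host = (urlsplit(url).hostname or "").lower()
    return host == trusted_host or host.endswith("." + trusted_host)


for label, url in URLS.items():
    print(f"{label}: regex={regex_allows(url)} host={host_allows(url)}")
```

Under these assumptions the regex check accepts all three URLs, while the hostname check accepts only the legitimate one, because the trusted label has to appear as the host (or as its suffix after a dot) rather than anywhere in the string.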
Summary

- New worked example (`docs/examples/avoid-regex-hostname-allowlist-vulns.md`), explaining the textbook URL-allowlist regex bug and the URLPattern fix.
- Inline note on the home page (`docs/index.md`) pointing security-curious readers at the seed invoke-ai/InvokeAI#7518.
- Links from the examples index (`docs/examples/index.md`, new "Security" section) and the site nav (`properdocs.yml`).

Why this example matters
invoke-ai/InvokeAI#7518 is an open issue on a ~26k★ Python AI project that surfaces a concrete credential-leak scenario: a configuration entry pairs the regex `url_regex: 'private.example'` with a credential.

The intent reads cleanly: "attach `secret` when calling `private.example`." But `re.search` matches the pattern anywhere in the URL string, and a URL is not a flat character sequence, so two attacker-controlled URL shapes also match:

- `https://malicious.example/private.example/theft.safetensors` · path-segment fallthrough
- `https://private.example.malicious.example/theft.safetensors` · subdomain shadowing

The example walks through both attack shapes, the component-aware URLPattern fix, and the "host or any subdomain" variant via `{:subdomain.}*private.example`.

What this PR changes
- `docs/examples/avoid-regex-hostname-allowlist-vulns.md`
- `docs/examples/index.md`
- `properdocs.yml`
- `docs/index.md`: `[hand-rolled regexes for URL allowlists are routinely error-prone](...)` linking to the InvokeAI issue

Test plan

- `just docs`: strict-mode build green (no broken links, nav consistent)
- `just lint`: all tools green (ruff, mypy, pyright, ty, semgrep, shellcheck, rumdl, codespell, interrogate, validate-pyproject)

No code changes; docs only.