Render inline HTML in markdown + fix garbled PDF export #74
Merged
Adds rehype-raw + rehype-sanitize so raw HTML embedded in markdown renders like it does on GitHub, while stripping XSS vectors (script, event handlers, javascript: URLs, iframe, etc.). Plugin order is raw -> sanitize -> trusted generators (katex, highlight) -> source-line stamper so sanitize never strips KaTeX/highlight output. Sanitize schema = GitHub defaultSchema plus two narrow allowances: math marker classes (so remark-math survives to rehype-katex) and the internal wikilink: href protocol. Also fixes a latent bug: react-markdown's default urlTransform stripped wikilink: hrefs to empty, so wikilink clicks never fired. A custom urlTransform now passes that scheme through (everything else still defaults, so javascript: etc. stay blocked).
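The two schema allowances described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `SchemaLike` and `extendSchema` are hypothetical names, `SchemaLike` is a local stand-in for the relevant parts of rehype-sanitize's schema type, and the math class names (`math-inline`, `math-display` on `code`) are an assumption based on what recent remark-math versions emit. In the app, `base` would presumably be rehype-sanitize's `defaultSchema`.

```typescript
// Local stand-in for the parts of a rehype-sanitize schema touched here.
type AttributeRule = string | [string, ...Array<string | RegExp>];

interface SchemaLike {
  attributes?: Record<string, AttributeRule[]>;
  protocols?: Record<string, string[]>;
  [key: string]: unknown;
}

// Hypothetical helper: returns a copy of `base` with two narrow allowances.
function extendSchema(base: SchemaLike): SchemaLike {
  return {
    ...base,
    attributes: {
      ...base.attributes,
      // Assumption: math nodes reach rehype-katex as
      // <code class="language-math math-inline | math-display">.
      code: [["className", /^language-./, "math-inline", "math-display"]],
    },
    protocols: {
      ...base.protocols,
      // Keep every protocol the base schema allows for href, then add wikilink.
      href: [...(base.protocols?.href ?? []), "wikilink"],
    },
  };
}
```

Because the base schema is copied rather than mutated, everything it already blocks (script, event handlers, `javascript:` URLs) stays blocked; only the two listed allowances are new.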
The old PDF path drew text with jsPDF's standard fonts, which only support
single-byte WinAnsi (Latin-1). Anything outside that range — emoji, curly
quotes, em dashes, arrows — was emitted as raw bytes and rendered as garbage
(e.g. '&'-interleaved text and mangled headings). It also hand-rebuilt layout
from parsed HTML, so the PDF never matched the preview.
Replace it with the webview's own print pipeline: render the same standalone
HTML used for HTML export inside an isolated off-screen iframe and call
print() ("Save as PDF"). The result matches the preview exactly with real
Unicode + color emoji, selectable text and working links, at zero added bundle
weight. PDF always renders the light theme so it stays legible on white paper.
- Remove jsPDF dependency and all dead jsPDF layout code (parseHTMLForPDF,
pdfFontSizes, hexToRgb, PDFElement).
- Harden the print path: wait for fonts + inlined images, clean up the iframe
on afterprint with a fallback timer.
- Enhance @media print CSS: @page margins, print-color-adjust for code/table/
blockquote fills, sensible page-break rules.
- Refresh now-stale jsPDF comments in App.tsx / ExportMenu.tsx.
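The "clean up on afterprint with a fallback timer" step can be sketched as a small run-once helper. This is an illustrative sketch under stated assumptions: `cleanupOnceWithFallback`, its parameters, and the 60-second default are hypothetical and not taken from the PR. The subscription is passed in as a function so the helper itself has no DOM dependency.

```typescript
type Unsubscribe = () => void;

// Hypothetical helper: runs `cleanup` exactly once, either when the
// afterprint-style event fires or when the fallback timer expires.
function cleanupOnceWithFallback(
  onAfterPrint: (handler: () => void) => Unsubscribe,
  cleanup: () => void,
  fallbackMs: number = 60000, // assumed default, not from the PR
): () => void {
  let done = false;
  let timer: ReturnType<typeof setTimeout> | undefined;
  let unsubscribe: Unsubscribe = () => {};

  const finish = (): void => {
    if (done) return; // guard against event + timer both firing
    done = true;
    if (timer !== undefined) clearTimeout(timer);
    unsubscribe();
    cleanup();
  };

  unsubscribe = onAfterPrint(finish);
  timer = setTimeout(finish, fallbackMs);
  return finish; // callers may also trigger cleanup manually
}
```

In a webview, `onAfterPrint` would plausibly wrap `addEventListener("afterprint", ...)` / `removeEventListener` on the iframe's `contentWindow`, and `cleanup` would remove the iframe from the document. The timer covers environments where `afterprint` never fires.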
What & why
Two fixes plus one latent-bug fix discovered along the way.
1. PDF export was garbling text (emoji, smart quotes, em dashes)
The old path drew text with jsPDF's standard fonts, which only support single-byte WinAnsi (Latin-1). Anything outside that range was emitted as raw bytes and rendered as garbage (the &-interleaved text and mangled Ø=ßâ& headings users saw). It also hand-rebuilt layout from parsed HTML, so the PDF never matched the preview.
Now: we render the same standalone HTML we already produce for HTML export inside an isolated off-screen iframe and drive the webview's own print pipeline ("Save as PDF" / "Microsoft Print to PDF"). The result matches the preview exactly (real Unicode + color emoji, selectable/searchable text, working links) at zero added bundle weight. PDF always renders the light theme so it stays legible on white paper.
- Removed the jspdf dependency and all dead jsPDF layout code (parseHTMLForPDF, pdfFontSizes, hexToRgb, PDFElement): net −500+ lines.
- Hardened the print path: wait for fonts + inlined images, clean up the iframe on afterprint with a fallback timer.
- Enhanced @media print CSS: @page margins, print-color-adjust for code/table/blockquote fills, sensible page-break rules.
2. Inline HTML in markdown (GitHub-style)
Adds rehype-raw + rehype-sanitize so raw HTML embedded in markdown renders like it does on GitHub, while stripping XSS vectors. Plugin order is raw → sanitize → katex → highlight → source-line, so sanitize runs right after the only unsafe step and never strips KaTeX/highlight output. Sanitize schema = GitHub's defaultSchema plus two narrow allowances (math marker classes; the internal wikilink: protocol).
3. Latent bug: wikilinks never fired
react-markdown's default urlTransform stripped wikilink: hrefs to "", so the wikilink click handler could never run. A custom urlTransform now passes that scheme through (everything else still defaults, so javascript: etc. stay blocked).
Verification
- Inline HTML (details/summary/kbd/sub/sup/div) renders, wikilinks/task lists/math/syntax highlighting/mermaid all preserved, and script/onerror/javascript:/iframe/onclick all stripped.
- tsc --noEmit clean; all 122 unit tests pass.
🤖 Generated with Claude Code
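For the wikilink fix in item 3, the pass-through behavior can be sketched as a wrapper around whatever transform is already in use. This is a hedged sketch, not the PR's code: `makeUrlTransform` and `UrlTransform` are hypothetical names, and the fallback is injected so the snippet stays self-contained. In an app on react-markdown 9, the fallback would presumably be the library's exported `defaultUrlTransform`.

```typescript
type UrlTransform = (url: string) => string;

// Hypothetical factory: lets the internal wikilink: scheme through untouched
// and defers every other URL to the supplied fallback transform.
function makeUrlTransform(fallback: UrlTransform): UrlTransform {
  return (url: string): string =>
    /^wikilink:/i.test(url) ? url : fallback(url);
}

// Illustrative stand-in for a default transform that only allows a few
// safe protocols and blanks everything else (an assumption for this demo).
const demoFallback: UrlTransform = (url) =>
  /^(https?|mailto):/i.test(url) ? url : "";

const urlTransform = makeUrlTransform(demoFallback);
```

Only the one scheme is special-cased, so anything the fallback rejects (such as `javascript:` URLs) is still rejected.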