Describe the bug
flexmark-java's default SUPPRESSED_LINKS pattern (javascript:.*) can be bypassed in two ways, both resulting in dangerous URLs being rendered as clickable <a href="..."> links in the HTML output:
HTML entity bypass: When a link URL in the source Markdown contains HTML entities (e.g. &#106; for j), isSuppressedLinkPrefix() checks the raw, un-decoded URL string, which does not match the javascript:.* pattern. However, resolvedLink.getUrl() returns the decoded URL, which is placed in the href attribute. The browser then decodes the entity and executes the script. This was previously noted in issue #672 ("XSS via HTML entities in javascript: URLs").
Missing scheme coverage: The default blocklist only covers javascript:. The data: and vbscript: URI schemes are not blocked by default, so they can be used as-is without any encoding tricks.
Affected component:
HtmlRenderer
To Reproduce
import com.vladsch.flexmark.html.HtmlRenderer;
import com.vladsch.flexmark.parser.Parser;
import com.vladsch.flexmark.util.ast.Node;
import com.vladsch.flexmark.util.data.MutableDataSet;

public class XssPoC {
    public static void main(String[] args) {
        MutableDataSet options = new MutableDataSet();
        Parser parser = Parser.builder(options).build();
        HtmlRenderer renderer = HtmlRenderer.builder(options).build();

        // Bypass 1: HTML entity encoding of 'j' evades the javascript: filter
        String md1 = "[click me](&#106;avascript:alert(document.domain))";
        Node doc1 = parser.parse(md1);
        System.out.println(renderer.render(doc1));
        // Output: <p><a href="javascript:alert(document.domain)">click me</a></p>

        // Bypass 2: data: URI scheme is not in the default blocklist
        String md2 = "[click me](data:text/html,<script>alert(document.domain)</script>)";
        Node doc2 = parser.parse(md2);
        System.out.println(renderer.render(doc2));
        // Output: <p><a href="data:text/html,<script>alert(document.domain)</script>">click me</a></p>
        // Browser decodes HTML entities in href attribute value → executes script

        // Bypass 3: vbscript: URI scheme is not in the default blocklist
        String md3 = "[click me](vbscript:msgbox(1))";
        Node doc3 = parser.parse(md3);
        System.out.println(renderer.render(doc3));
        // Output: <p><a href="vbscript:msgbox(1)">click me</a></p>
    }
}
Root cause (Bypass 1)
In CoreNodeRenderer, the suppression check uses node.getUrl(), which returns the raw URL from the parsed Markdown source; HTML entities are not decoded at this point. The pattern javascript:.* therefore does not match &#106;avascript:.... After the check passes, resolvedLink.getUrl() (with entities decoded) is written into the href attribute.
Root cause (Bypasses 2 & 3)
HtmlRenderer.SUPPRESSED_LINKS defaults to "javascript:.*" only. data: and vbscript: URIs pass the filter without any modification.
// HtmlRenderer.java
final public static DataKey<String> SUPPRESSED_LINKS =
        new DataKey<>("SUPPRESSED_LINKS", "javascript:.*"); // data: and vbscript: not covered
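Both root causes can be seen without flexmark at all, using only the default pattern string quoted above. The following is a minimal sketch, not the library's code: the class name DefaultPatternGaps and the helper isSuppressed are hypothetical, the check is assumed to be a full-string regex match against the raw URL, and &#106; is used as one example entity encoding of j.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for the suppression check; not flexmark's implementation.
final class DefaultPatternGaps {
    // Default SUPPRESSED_LINKS value as quoted in this report.
    static final Pattern DEFAULT = Pattern.compile("javascript:.*");

    // Assumption: the check is a full-string match on the raw (un-decoded) URL.
    static boolean isSuppressed(String rawUrl) {
        return DEFAULT.matcher(rawUrl).matches();
    }

    public static void main(String[] args) {
        String[] urls = {
            "javascript:alert(1)",        // plain scheme
            "&#106;avascript:alert(1)",   // entity-encoded 'j' (Bypass 1)
            "data:text/html,x",           // Bypass 2
            "vbscript:msgbox(1)"          // Bypass 3
        };
        for (String url : urls) {
            System.out.println(isSuppressed(url) + "  " + url);
        }
    }
}
```

Under these assumptions only the first URL is reported as suppressed; the entity-encoded form and the two other schemes pass through.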
Expected behavior
All three inputs should render as plain text (no <a> tag), or the href should be replaced with a safe fallback such as #. The suppression check should operate on the decoded URL, and the default blocklist should cover all commonly dangerous URI schemes.
Resulting Output
All three inputs render as clickable links with dangerous URIs (flexmark 0.64.8, JDK 17). A user who clicks any of these in an application that renders user-supplied Markdown can trigger script execution in their browser.
Additional context
Tested against com.vladsch.flexmark:flexmark:0.64.8 (current Maven Central release) with default renderer options on JDK 17.
Suggested fix:
Decode HTML entities in the URL before applying isSuppressedLinkPrefix(), so that entity-encoded schemes are detected correctly.
Expand the default SUPPRESSED_LINKS pattern to also cover data: and vbscript:, for example: "(?i)(javascript|data|vbscript):.*".
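The two suggestions above could be combined roughly as follows. This is a standalone sketch under stated assumptions, not a patch against flexmark: SuppressedLinkSketch, decodeNumericEntities and isSuppressed are hypothetical names, and only numeric character references are decoded here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of "decode first, then match an expanded blocklist".
final class SuppressedLinkSketch {
    // Expanded pattern proposed in this report.
    static final Pattern SUPPRESSED =
            Pattern.compile("(?i)(javascript|data|vbscript):.*");

    // Numeric character references: &#106; or &#x6A; (trailing ';' optional).
    private static final Pattern NUMERIC_ENTITY =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));?");

    // Replaces numeric character references with the characters they denote.
    static String decodeNumericEntities(String s) {
        Matcher m = NUMERIC_ENTITY.matcher(s);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(s, last, m.start());
            int cp = m.group(1) != null
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2));
            if (Character.isValidCodePoint(cp)) {
                out.appendCodePoint(cp);
            } else {
                out.append(m.group()); // leave out-of-range references untouched
            }
            last = m.end();
        }
        out.append(s, last, s.length());
        return out.toString();
    }

    // Decode, trim surrounding whitespace, then apply the blocklist.
    static boolean isSuppressed(String rawUrl) {
        return SUPPRESSED.matcher(decodeNumericEntities(rawUrl).trim()).matches();
    }
}
```

A real fix would more likely reuse the library's own unescaping routine so that the check sees exactly the string written to href; this sketch does not handle named entities such as &colon; or whitespace embedded inside the scheme.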
Related: issue #672 (HTML entity bypass was previously reported; this report adds the missing-scheme vectors and provides a consolidated reproducer).