Parser: fix exponential parse time on speculative NOT prefix parsing #2351

Closed
LucaCappelletti94 wants to merge 1 commit into apache:main from LucaCappelletti94:pathological4

Conversation

@LucaCappelletti94
Contributor

`parse_prefix` wraps `parse_expr_prefix_by_reserved_word` in `try_parse`, which speculatively dispatches `parse_not` for the `NOT` keyword. The inner `parse_subexpr(UnaryNot)` in `parse_not` infix-loops through `-`, hits the next `NOT`, and recurses. On chains like `SELECT x-not-b.x-not-b...` that end in a parse error, every `NOT` segment therefore re-walks the rest of the chain, and `parse_not` is called 2^N - 1 times, where N is the number of `NOT` segments.
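
The 2^N - 1 count is consistent with a simple recurrence. This is a sketch under an assumed call pattern, not a trace of the real parser: let T(k) be the number of `parse_not` calls triggered with k `NOT` segments remaining, and assume each segment costs one call, one failed speculative walk of the remaining chain, and one fallback walk of that same chain.

```latex
T(0) = 0, \qquad T(k) = 1 + 2\,T(k-1)
\quad\Longrightarrow\quad
T(N) = \sum_{i=0}^{N-1} 2^{i} = 2^{N} - 1
```

Under that assumption, each extra `NOT` segment roughly doubles the work, which fits a 22-segment chain taking seconds while a 30-segment chain is impractical to time.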

The fix is a per-parse `BTreeSet<usize>` of token positions where the speculative `parse_not` has already failed. Subsequent visits to the same position skip the speculation and fall back to the identifier interpretation.
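
A minimal sketch of the idea, as a toy model rather than the code in this PR. `ToyParser`, `parse_from`, and `count_parse_not_calls` are hypothetical names, and the input is modelled as `n` `NOT` segments followed by a guaranteed parse error. The only part taken from the description above is the `BTreeSet<usize>` of positions where the speculative parse already failed.

```rust
use std::collections::BTreeSet;

/// Toy stand-in for the parser (hypothetical, not sqlparser's real types).
/// The input is `n` NOT segments followed by a parse error.
struct ToyParser {
    n: usize,
    /// `Some(set)` enables the cache of positions where speculation failed.
    failed_not: Option<BTreeSet<usize>>,
    parse_not_calls: u64,
}

impl ToyParser {
    fn new(n: usize, memoise: bool) -> Self {
        ToyParser {
            n,
            failed_not: if memoise { Some(BTreeSet::new()) } else { None },
            parse_not_calls: 0,
        }
    }

    /// Parses from segment `pos` to the end of the chain.
    /// Always fails eventually, because the chain ends in a parse error.
    fn parse_from(&mut self, pos: usize) -> Result<(), ()> {
        if pos == self.n {
            return Err(()); // the trailing parse error
        }
        let already_failed = self
            .failed_not
            .as_ref()
            .map_or(false, |set| set.contains(&pos));
        if !already_failed {
            // Speculative branch: treat NOT as a unary operator and
            // parse the rest of the chain as its operand.
            self.parse_not_calls += 1;
            if self.parse_from(pos + 1).is_ok() {
                return Ok(());
            }
            // Remember the failure so later visits skip the speculation.
            if let Some(set) = self.failed_not.as_mut() {
                set.insert(pos);
            }
        }
        // Fallback branch: treat NOT as an identifier and keep going,
        // which walks the rest of the chain again.
        self.parse_from(pos + 1)
    }
}

/// Counts speculative `parse_not` calls for a chain of `n` segments.
fn count_parse_not_calls(n: usize, memoise: bool) -> u64 {
    let mut parser = ToyParser::new(n, memoise);
    let _ = parser.parse_from(0);
    parser.parse_not_calls
}
```

In this model, `count_parse_not_calls(10, false)` yields 1023 (2^10 - 1), while `count_parse_not_calls(10, true)` yields 10: each position is speculated on at most once. The fallback walks are still repeated, so the toy remains quadratic in steps, but the exponential term is gone.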

| Input | Before | After |
| --- | --- | --- |
| 903 B cursed libFuzzer artifact | ~12 s | ~2.5 ms |
| SELECT x-not-b. * 22 | ~11 s | ~1 ms |
| SELECT x-not-b. * 30 | (too slow to measure) | ~1.5 ms |
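
For reproducing the synthetic rows, a small helper can build the input string. This assumes that "* N" means `SELECT ` followed by N copies of `x-not-b.`, matching the `SELECT x-not-b.x-not-b...` shape in the description; `pathological_input` is a hypothetical name, not something from the PR.

```rust
/// Builds `SELECT ` followed by `n` copies of `x-not-b.`.
/// The repetition scheme is an assumed reading of "* n" in the table.
fn pathological_input(n: usize) -> String {
    format!("SELECT {}", "x-not-b.".repeat(n))
}
```

The resulting string can then be fed to the parser under test and timed on each branch.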

@LucaCappelletti94
Contributor Author

Closing as superseded by #2352. The new per-arm `try_parse(reserved_word)` cache in #2352 generalises this PR's NOT-specific cache to all reserved-word prefixes: the regression input parses in 948 µs on the amended branch vs >60 s on apache main.
