Optimize _WS_EXT_RE backtracking on Python 3.11+#12346
Optimize _WS_EXT_RE backtracking on Python 3.11+#12346HarshithReddy01 wants to merge 7 commits intoaio-libs:masterfrom
Conversation
Use an atomic group for Python 3.11+ in websocket extension parsing and add focused tests to validate behavior and guard against worst-case backtracking regressions.
Include contributor attribution and a misc changelog note for the websocket extension regex optimization change.
for more information, see https://pre-commit.ci
Merging this PR will not alter performance
Comparing Footnotes
|
❌ 3 Tests Failed:
View the top 1 failed test(s) by shortest run time
View the full list of 2 ❄️ flaky test(s)
To view more test analytics, go to the Test Analytics Dashboard |
tests/test_ws_ext_helpers.py
Outdated
| def test_permessage_deflate_only(self) -> None: | ||
| compress, notakeover = ws_ext_parse("permessage-deflate") | ||
| assert compress == 15 | ||
| assert notakeover is False | ||
|
|
||
| def test_server_no_context_takeover(self) -> None: | ||
| compress, notakeover = ws_ext_parse( | ||
| "permessage-deflate; server_no_context_takeover", isserver=True | ||
| ) | ||
| assert compress == 15 | ||
| assert notakeover is True | ||
|
|
||
| def test_client_no_context_takeover(self) -> None: | ||
| compress, notakeover = ws_ext_parse( | ||
| "permessage-deflate; client_no_context_takeover", isserver=False | ||
| ) | ||
| assert compress == 15 | ||
| assert notakeover is True | ||
|
|
||
| def test_server_max_window_bits(self) -> None: | ||
| compress, notakeover = ws_ext_parse( | ||
| "permessage-deflate; server_max_window_bits=12", isserver=True | ||
| ) | ||
| assert compress == 12 | ||
| assert notakeover is False |
There was a problem hiding this comment.
Let's parametrize these. So, one test that looks something like:
@pytest.mark.parametrize(
("msg", "server", "expected"),
(
("permessage-defalt", False, (15, False)),
...
)
)
def test_ws_ext_parser(msg: str, server: bool, expected: tuple[int, bool]) -> None:
assert ws_ext_parser(msg, isserver=server) == expected
tests/test_ws_ext_helpers.py
Outdated
| # Without the atomic group fix this causes exponential backtracking. | ||
| evil = "permessage-deflate" + ("; server_no_context_takeover" * 30) + ";INVALID" | ||
| start = time.perf_counter() | ||
| try: |
There was a problem hiding this comment.
Rather than ignoring an exception, we can verify that it occurs:
| try: | |
| with pytest.raises(WSHandShakeError): |
|
The Test (ubuntu, 3.14t, false) failure is pre-existing on master and unrelated to this PR, it's test_parse_set_cookie_headers_uses_unquote_with_octal failing due to Python 3.14t's new strict control character check in http.cookies.Morsel. The fail-fast strategy cancelled all other jobs. Happy to rerun CI once confirmed. |
Yeah, it's a security patch to 3.14 that's broken it. We'll need to sort a separate PR for that shortly. |
What do these changes do?
This change optimizes WebSocket extension parsing in
aiohttp/_websocket/helpers.pyby using an atomic outer group for Python 3.11+ in_WS_EXT_RE.The previous pattern used a repeating outer non-atomic group:
(?:;\s*(?:...))*.With many valid extension tokens followed by an invalid suffix, regex matching could spend extra time backtracking across prior iterations before failing.
On Python 3.11+, the outer group is changed to atomic
(?>...)*, which prevents that backtracking path while preserving the same matching intent.On Python 3.10 and lower, behavior is unchanged because atomic groups are not supported there.
Are there changes in behavior for the user?
No user-facing behavior change is intended.
Accepted/rejected extension strings remain the same for valid inputs.
This is a performance-oriented change focused on reducing worst-case backtracking on failing matches.
Is it a substantial burden for the maintainers to support this?
No.
The implementation is a small version-gated regex compile-time branch, with both variants kept structurally identical except for the atomic-group syntax on 3.11+.
Related issue number
Discussed in GHSA-qhr8-wxhx-9q9w (closed).
This PR is submitted as a performance improvement, not a security fix/CVE claim.
Checklist
CONTRIBUTORS.txtCHANGES/folder