Use repr() to escape keys in MultiDict __repr__ output by aiolibsbot · Pull Request #1342 · aio-libs/multidict

aiolibsbot · 2026-05-17T00:46:08Z

What do these changes do?

Make repr(MultiDict(...)) produce a parseable string when keys contain
quote characters. The pure-Python and C-extension code paths previously
wrapped keys in literal single quotes, so a key like a'b produced the
broken output <MultiDict('a'b': 1)>. They now format keys via Python's
repr(), which picks valid quoting automatically.

Before:

>>> repr(MultiDict([("a'b", 1)]))
"<MultiDict('a'b': 1)>"        # not valid Python

After:

>>> repr(MultiDict([("a'b", 1)]))
'<MultiDict("a\'b": 1)>'       # valid string literal

Are there changes in behavior for the user?

Only for keys containing characters that need escaping in a Python
string literal (single quotes, control characters, …). Such repr output
was already broken, so any consumer was either tolerating bad output or
already wrapping keys themselves. Plain-ASCII keys produce byte-identical
output, so the existing repr tests continue to pass unchanged.

How

Pure-Python (multidict/_multidict_py.py): four f"'{key}'" /
f"'{key}': {v!r}" patterns in _ItemsView.__repr__, _KeysView.__repr__,
MultiDict.__repr__, and MultiDictProxy.__repr__ switched to {key!r}.
C extension (multidict/_multilib/hashtable.h): the three-step
write ', write str, write ' sequence in md_repr is replaced by
a single PyUnicodeWriter_WriteRepr(writer, key). All views
(MultiDict, proxies, keys/items views) delegate to this function, so
one fix covers them all.

Testing

Three new tests in tests/test_multidict.py::BaseMultiDictTest
(parametrized over MultiDict/CIMultiDict and both implementations)
assert the repr of a multidict with a quoted key, plus its keys and
items views, is the expected escaped form.
Full local suite: 1682 passed (was 1658 before, +24 from the new
parametrized tests) on Python 3.12, C-ext + pure-Python.
pre-commit (ruff, clang-format, mypy on 3.11 + 3.13): clean.

Related issue number

None — surfaced while reviewing recent __repr__ work.

Checklist

I think the code is well written
Unit tests for the changes exist
Documentation reflects the changes — no public docs cover the repr
format; the change preserves output for normal keys.
If you provide code modification, please add yourself to
CONTRIBUTORS.txt — N/A, multidict has no CONTRIBUTORS.txt.
Add a new news fragment into the CHANGES/ folder
- CHANGES/1342.bugfix.rst — credited :user:aiolibsbot``

Quality Report

Changes: 4 files changed, 30 insertions(+), 12 deletions(-)

Code scan: clean

Tests: failed (FAILED)

Branch hygiene: clean

Generated by Kōan post-mission quality pipeline

codspeed-hq · 2026-05-17T01:14:28Z

Merging this PR will degrade performance by 32.19%

⚠️

Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 1 improved benchmark
❌ 6 regressed benchmarks
✅ 238 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

	Benchmark	`BASE`	`HEAD`	Efficiency
⚡	`test_multidict_insert_str[ci-c]`	99.5 µs	84.6 µs	+17.63%
❌	`test_multidict_repr[ci-py]`	152.7 µs	243.8 µs	-37.36%
❌	`test_multidict_repr[cs-py]`	158.3 µs	251.3 µs	-37.03%
❌	`test_items_view_repr[ci-py]`	149.2 µs	239.4 µs	-37.67%
❌	`test_items_view_repr[cs-py]`	145.9 µs	236.7 µs	-38.36%
❌	`test_keys_view_repr[ci-py]`	105.7 µs	172.6 µs	-38.79%
❌	`test_keys_view_repr[cs-py]`	102.1 µs	169.1 µs	-39.59%

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.

_{Comparing aiolibsbot:koan/fix-repr-quote-escaping (488cbf7) with master (fc6933d)}

Vizonex · 2026-05-17T01:25:25Z

Seems we have a tiny performance regression to investigate

bdraco · 2026-05-17T02:14:01Z

Looks correct, but may have been done this way for performance reasons.

Needs a history lesson

bdraco · 2026-05-17T02:14:05Z

@aiolibsbot rr

aiolibsbot · 2026-05-17T02:19:05Z

PR Review — Use repr() to escape keys in MultiDict repr output

Correctness fix is solid. Both code paths now produce valid Python string literals when keys contain characters that need escaping, and the C fast path is carefully gated (ASCII + no ', \, control chars, 0x7f) so the common case still emits the same single-quoted output repr() would have produced byte-for-byte. The new tests cover items view, keys view, and the dict itself with a quoted key.

Two things to address before merge:

CI spell-check is failing on unparseable in the changelog fragment — add to wordlist or reword.
The pure-Python paths regressed ~12% on four CodSpeed repr benchmarks because no equivalent fast path was added; mirroring the C-side gate in _multidict_py.py should restore parity without giving up the correctness fix.

One low-priority suggestion: the inline fast/slow branch in md_repr is large enough that extracting it to a helper would help readability, but it's stylistic.

🟡 Important

1. Spell check fails on 'unparseable' (`CHANGES/1342.bugfix.rst`, L3)

The CI spell-check job is rejecting unparseable (this is what @bdraco's [towncrier-fragments]:47 comment is reporting). Two ways to clear it:

Add unparseable to the project's spelling wordlist (look for a cspell / codespell config or a .wordlist-style file referenced by the pre-commit/CI spell step).
Reword the fragment, e.g. producing output that was not a valid Python string literal when keys contained quote characters.

Option 2 is the lower-friction fix for a one-off bugfix fragment and avoids growing the wordlist for a single rare adjective.

producing unparseable output when keys contained quote characters --

2. Pure-Python repr now ~12% slower — no fast path matches the C side (`multidict/_multidict_py.py`, L106)

CodSpeed flags a regression on the four *-py repr benchmarks (test_multidict_repr[ci-py] / [cs-py], test_items_view_repr[ci-py] / [cs-py]) at ≈ -12%. The cause is asymmetric: hashtable.h::md_repr got a fast path that wraps ASCII-safe keys in single quotes directly and only calls PyUnicodeWriter_WriteRepr when the key contains ', \, or chars < 0x20 / 0x7f, so the C-side benchmarks stay flat. The four affected lines in this file (106 _ItemsView.__repr__, 303 _KeysView.__repr__, 756 MultiDict.__repr__, 1210 MultiDictProxy.__repr__) now all call repr(key) unconditionally, which is the source of the slowdown.

Mirroring the C fast path in pure Python should bring the benchmarks back to baseline. Something like:

_KEY_NEEDS_REPR = re.compile(r"[^\x20-\x26\x28-\x5b\x5d-\x7e]|['\\]").search

def _key_repr(k: str) -> str:
    return repr(k) if (not k.isascii() or _KEY_NEEDS_REPR(k)) else f"'{k}'"

then f"{_key_repr(e.key)}: {e.value!r}" in the three items/dict reprs and _key_repr(e.key) in the keys view. The correctness fix stays — only keys that actually need escaping pay the repr() cost, exactly as in the C path.

Alternatively, acknowledge the regression on CodSpeed, but since the same shape of optimization already lives next door in the C extension, doing it in Python is the more consistent answer.

lst.append(f"{e.key!r}: {e.value!r}")

🟢 Suggestions

1. Consider extracting the fast/slow key-write into a helper (`multidict/_multilib/hashtable.h`, L1812-1844)

The new ~30-line ASCII fast-path branch sits inline inside md_repr's main loop and makes the function noticeably busier. Pulling it into a small static helper (e.g. static int write_key_repr(PyUnicodeWriter *w, PyObject *key)) would keep md_repr readable and make the optimization reusable if another view ever wants the same behavior. Strictly optional — the logic itself is correct: the guard rejects exactly the chars (', \, < 0x20, 0x7f) that would otherwise make a bare single-quote wrap diverge from repr(), and non-ASCII keys correctly fall through to PyUnicodeWriter_WriteRepr.

Checklist

No hardcoded secrets
Input validation at boundaries
Error handling complete (C goto fail paths preserved)
No performance regressions — warning ＃2
Tests cover new behavior (quoted keys, items, keys views)
Changelog/docs updated and lint-clean — warning ＃1
C extension memory safety / error handling preserved
Output of repr() is parseable Python in all cases

Summary

Correctness fix is solid. Both code paths now produce valid Python string literals when keys contain characters that need escaping, and the C fast path is carefully gated (ASCII + no ', \, control chars, 0x7f) so the common case still emits the same single-quoted output repr() would have produced byte-for-byte. The new tests cover items view, keys view, and the dict itself with a quoted key.

Two things to address before merge:

CI spell-check is failing on unparseable in the changelog fragment — add to wordlist or reword.
The pure-Python paths regressed ~12% on four CodSpeed repr benchmarks because no equivalent fast path was added; mirroring the C-side gate in _multidict_py.py should restore parity without giving up the correctness fix.

One low-priority suggestion: the inline fast/slow branch in md_repr is large enough that extracting it to a helper would help readability, but it's stylistic.

Automated review by Kōan06ba94d
32c0eb4

aiolibsbot · 2026-05-17T02:19:06Z

@Vizonex: Seems we have a tiny performance regression to investigate

Confirmed — the regression is real and consistent across all *_repr benchmarks (≈12–18%). Root cause: the previous WriteChar('\'') / WriteStr(key) / WriteChar('\'') sequence in md_repr (and f"'{key}'" on the pure-Python side) is strictly cheaper than calling into PyObject_Repr, which has to scan the string twice (once to pick a quote style, once to copy/escape) and validate printability.

The good news is that for any key that's printable ASCII and contains no ' or \\, repr(key) is provably identical to wrapping the key in single quotes — so a fast path is observably equivalent to the new behavior:

if (PyUnicode_IS_ASCII(key) && _key_is_simple(key)) {
    PyUnicodeWriter_WriteChar(writer, '\'');
    PyUnicodeWriter_WriteStr(writer, key);
    PyUnicodeWriter_WriteChar(writer, '\'');
} else {
    PyUnicodeWriter_WriteRepr(writer, key);
}

where _key_is_simple is a single O(n) scan over PyUnicode_1BYTE_DATA(key) for ', \\, and any byte < 0x20 || == 0x7f. The scan touches an already-cache-hot buffer and is much cheaper than the full repr machinery. The benchmark workloads use plain header-style keys, so this should restore the baseline. I can push this in a follow-up commit on the branch if that sounds reasonable.

aiolibsbot · 2026-05-17T02:19:07Z

@bdraco: Looks correct, but may have been done this way for performance reasons.

Yes, that's exactly what's going on — the original three-write sequence wasn't accidental, it was avoiding PyObject_Repr's overhead for the (overwhelmingly common) case of plain-ASCII keys. CodSpeed is catching that cost reappearing.

The bug only manifests when a key contains a character that doesn't survive being wrapped in single quotes (a ', a \\, or a control byte). For everything else, repr(key) and f"'{key}'" produce byte-identical output. So we can keep the perf-oriented fast path and only fall through to PyUnicodeWriter_WriteRepr when the key actually needs escaping:

static inline int _key_needs_repr(PyObject *key) {
    if (!PyUnicode_IS_ASCII(key)) return 1;
    const Py_ssize_t n = PyUnicode_GET_LENGTH(key);
    const Py_UCS1 *data = PyUnicode_1BYTE_DATA(key);
    for (Py_ssize_t i = 0; i < n; i++) {
        Py_UCS1 c = data[i];
        if (c == '\'' || c == '\\' || c < 0x20 || c == 0x7f) return 1;
    }
    return 0;
}

That keeps the bug fix (any key the old path would have mangled now goes through WriteRepr and gets correct escaping) while restoring the hot-path benchmark numbers. The same shape works on the pure-Python side with a precompiled regex helper. Happy to fold this into the PR.

Wrapping keys in literal single quotes produced unparseable output when keys contained a single quote (e.g. ``"<MultiDict('a'b': 1)>"``). Both the pure-Python and C-extension code paths now format keys via repr(), so single quotes and other special characters round-trip correctly.

aiolibsbot · 2026-05-17T02:22:33Z

Rebase with requested adjustments

Branch koan/fix-repr-quote-escaping was rebased onto master and review feedback was applied.

Changes applied

Repr(writer, key), check whether the key is ASCII and contains no characters that repr()would escape (no', no \, no control chars 0x00–0x1f/0x7f). When the check passes, emit 'key'directly viaPyUnicodeWriter_WriteChar/PyUnicodeWriter_WriteStr(the pre-fix sequence). Otherwise fall back toPyUnicodeWriter_WriteReprfor correctness. Reason: CodSpeed bot reported ~17% regression ontest_multidict_reprandtest_items_view_reprfor the C path because every key was going throughPyUnicodeWriter_WriteRepr; benchmarks use plain ASCII keys ("0".."99"`) that hit the fast path, restoring original performance. Keys with quotes still get correct, parseable output through the slow path.
Pure-Python paths (_multidict_py.py) left unchanged: the inline isascii/isprintable/in checks needed to gate a fast path cost roughly as much as the repr() call they would replace, so adding them would not recover the ~12% pure-Python regression and would only add code noise.

Stats

4 files changed, 59 insertions(+), 12 deletions(-)

Actions performed

Already-solved check: negative (confidence=high, reasoning=No recent master commit touches repr key-escaping; existing repr code still wraps keys in litera)
Rebased koan/fix-repr-quote-escaping onto upstream/master
Applied review feedback
Pre-push CI check: previous run still pending
Force-pushed koan/fix-repr-quote-escaping to origin
CI check enqueued in ## CI (async)

CI status

CI will be checked asynchronously.

Automated by Kōan

bdraco · 2026-05-17T02:28:04Z

needs a spelling list update as well

[towncrier-fragments]:47: : Spell check: unparseable: producing unparseable output when keys contained quote characters --.

bdraco · 2026-05-17T02:28:11Z

@aiolibsbot rr

aiolibsbot · 2026-05-17T02:43:36Z

@Vizonex: Seems we have a tiny performance regression to investigate

The regression is real and isolated to the pure-Python side — all four flagged benchmarks have the -py suffix (test_multidict_repr[ci-py] / [cs-py], test_items_view_repr[ci-py] / [cs-py]). The C extension picked up a fast path in hashtable.h::md_repr that wraps ASCII-safe keys in single quotes directly and only calls PyUnicodeWriter_WriteRepr when the key contains ', \, or chars < 0x20 / 0x7f, so the C-side numbers stay flat. The four edits in _multidict_py.py (lines 106, 303, 756, 1210) replaced f"'{key}'" with f"{key!r}", which now invokes repr() unconditionally — that's the cost showing up in the benchmarks.

A parallel pure-Python fast path should restore parity, e.g.

_KEY_NEEDS_REPR = re.compile(r"[^\x20-\x26\x28-\x5b\x5d-\x7e]|['\\]").search

def _key_repr(k: str) -> str:
    return repr(k) if (not k.isascii() or _KEY_NEEDS_REPR(k)) else f"'{k}'"

used in the three items/dict reprs and the keys view. That keeps repr() only for the keys that actually need escaping and matches the gate already in the C path, so the common case pays no extra cost. Worth fixing here rather than acknowledging on CodSpeed, since the corresponding optimization already lives next door.

aiolibsbot · 2026-05-17T02:43:37Z

@bdraco: Looks correct, but may have been done this way for performance reasons.

Yes — the literal f"'{key}'" wrap was almost certainly chosen for speed. It skips the repr() call (string-table lookup, quote-style decision, escape loop) on every key in every repr, and for the common case (ASCII keys with no quotes / backslashes / control chars) it produces output byte-identical to what repr() would emit. The bug is only observable for keys that actually need escaping, which is why it stayed hidden.

That's exactly why the C side of this PR keeps the fast path: hashtable.h::md_repr now wraps ASCII-safe keys in single quotes directly and only falls through to PyUnicodeWriter_WriteRepr when a key contains ', \, < 0x20, or 0x7f. The common path is untouched, the slow path is correct.

The pure-Python side currently doesn't mirror that — _multidict_py.py calls repr(key) unconditionally — and that's where the CodSpeed -py regressions are coming from. Adding the same gate in _multidict_py.py (an isascii() + cheap reject for '\ and control chars, fall through to repr() otherwise) would preserve both the correctness fix and the original performance characteristic. Happy to push that on top of this branch.

aiolibsbot · 2026-05-17T02:43:38Z

@bdraco: needs a spelling list update as well

Confirmed — the towncrier-fragments spell step is flagging unparseable on line 3 of CHANGES/1342.bugfix.rst. Two routes:

Add unparseable to the project's spell wordlist — look for the file referenced by the spell-check pre-commit / CI step (typically a cspell config, a codespell --ignore-words, or a .wordlist-style file).
Reword the fragment, e.g. producing output that was not a valid Python string literal when keys contained quote characters.

Option 2 is probably the lighter-touch fix for a one-off bugfix fragment, since it avoids growing the wordlist for a single rare adjective and reads slightly more concretely about what the bug was.

aiolibsbot · 2026-05-17T02:48:47Z

Rebase with requested adjustments

Branch koan/fix-repr-quote-escaping was rebased onto master and review feedback was applied.

Changes applied

Summary:
Added pure-Python fast path matching C-extension fix, addressing CodSpeed ~12% regression on test_multidict_repr[*-py] and test_items_view_repr[*-py].
multidict/_multidict_py.py: New module-level _key_repr() helper. Uses precompiled regex _repr_needs_repr to detect chars repr() would escape (control bytes, ', \, DEL, non-ASCII). Safe keys skip repr() and use f"'{key}'" directly; unsafe keys still go through repr() for correct escaping.
Wired _key_repr() into all four repr sites: _ItemsView.__repr__, _KeysView.__repr__, MultiDict.__repr__, MultiDictProxy.__repr__ — replacing {key!r} / {k!r} formatting.
Behavior unchanged: safe keys produce byte-identical output to repr(); quoted/escaped keys still get parseable Python literals.
CI fix: Applied pre-push CI fix
CI fix: Pre-push CI fix applied

Stats

4 files changed, 73 insertions(+), 12 deletions(-)

Actions performed

Already-solved check: negative (confidence=high, reasoning=No recent master commit addresses repr key escaping; commits cover unrelated work like kwargs counti)
Rebased koan/fix-repr-quote-escaping onto upstream/master
Applied review feedback
Pre-push CI check: previous run #25977228864 failed
Applied pre-push CI fix
Pre-push CI fix applied
Force-pushed koan/fix-repr-quote-escaping to origin
CI check enqueued in ## CI (async)

CI status

CI will be checked asynchronously.

Automated by Kōan

codecov · 2026-05-17T04:15:27Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.86%. Comparing base (6370ee2) to head (488cbf7).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #1342   +/-   ##
=======================================
  Coverage   99.86%   99.86%           
=======================================
  Files          28       28           
  Lines        3627     3643   +16     
  Branches      265      266    +1     
=======================================
+ Hits         3622     3638   +16     
  Misses          3        3           
  Partials        2        2

Flag	Coverage Δ
CI-GHA	`99.86% <100.00%> (+<0.01%)`	⬆️
pytest	`99.86% <100.00%> (+<0.01%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Vizonex · 2026-05-21T02:34:57Z

Confirmed — the regression is real and consistent across all *_repr benchmarks (≈12–18%). Root cause: the previous WriteChar('\'') / WriteStr(key) / WriteChar('\'') sequence in md_repr (and f"'{key}'" on the pure-Python side) is strictly cheaper than calling into PyObject_Repr, which has to scan the string twice (once to pick a quote style, once to
where _key_is_simple is a single O(n) scan over PyUnicode_1BYTE_DATA(key) for ', \\, and any byte < 0x20 || == 0x7f. The scan touches an already-cache-hot buffer and is much cheaper than the full repr machinery. The benchmark workloads use plain header-style keys, so this should restore the baseline. I can push this in a follow-up commit on the branch if that sounds reasonable.

Feel free and take your time :)

psf-chronographer Bot added the bot:chronographer:provided There is a change note present in this PR label May 17, 2026

aiolibsbot added 2 commits May 17, 2026 02:19

rebase: apply review feedback on aio-libs#1342

32c0eb4

aiolibsbot force-pushed the koan/fix-repr-quote-escaping branch from 7118afc to 32c0eb4 Compare May 17, 2026 02:22

bdraco mentioned this pull request May 17, 2026

Agents get tripped up by not checking the spelling list and wasting ci time like in #1343

Open

aiolibsbot mentioned this pull request May 17, 2026

Document docs spell check in AGENTS.md, add 'unparseable' #1344

Closed

5 tasks

aiolibsbot added 2 commits May 17, 2026 02:47

rebase: apply review feedback on aio-libs#1342

b7a561d

fix: resolve pre-existing CI failures on aio-libs#1342

7c3a846

aiolibsbot added 2 commits May 17, 2026 05:39

fix: resolve CI failures on aio-libs#1342 (attempt 1)

73742b5

fix: resolve CI failures on aio-libs#1342 (attempt 2)

a9b2782

aiolibsbot force-pushed the koan/fix-repr-quote-escaping branch from ef2d254 to a9b2782 Compare May 17, 2026 05:44

fix: resolve CI failures on aio-libs#1342 (attempt 1)

488cbf7

Uh oh!

Conversation

aiolibsbot commented May 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What do these changes do?

Are there changes in behavior for the user?

How

Testing

Related issue number

Checklist

Quality Report

Uh oh!

codspeed-hq Bot commented May 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Merging this PR will degrade performance by 32.19%

Performance Changes

Uh oh!

Vizonex commented May 17, 2026

Uh oh!

bdraco commented May 17, 2026

Uh oh!

bdraco commented May 17, 2026

Uh oh!

aiolibsbot commented May 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review — Use repr() to escape keys in MultiDict repr output

🟡 Important

🟢 Suggestions

Checklist

Summary

Uh oh!

aiolibsbot commented May 17, 2026

Uh oh!

aiolibsbot commented May 17, 2026

Uh oh!

aiolibsbot commented May 17, 2026

Rebase with requested adjustments

Changes applied

Stats

CI status

Uh oh!

bdraco commented May 17, 2026

Uh oh!

bdraco commented May 17, 2026

Uh oh!

aiolibsbot commented May 17, 2026

Uh oh!

aiolibsbot commented May 17, 2026

Uh oh!

aiolibsbot commented May 17, 2026

Uh oh!

aiolibsbot commented May 17, 2026

Rebase with requested adjustments

Changes applied

Stats

CI status

Uh oh!

codecov Bot commented May 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Vizonex commented May 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

aiolibsbot commented May 17, 2026 •

edited

Loading

codspeed-hq Bot commented May 17, 2026 •

edited

Loading

aiolibsbot commented May 17, 2026 •

edited

Loading

codecov Bot commented May 17, 2026 •

edited

Loading