from json.encoder import py_encode_basestring, JSONEncoder
s = '\x7f'
encoder = JSONEncoder(ensure_ascii=True)
expected = encoder.encode(s)
actual = py_encode_basestring(s)
assert actual == expected, f"py_encode_basestring({s!r}) = {actual!r}, but JSONEncoder().encode({s!r}) = {expected!r}"
Traceback (most recent call last):
  File "/data/test.py", line 10, in <module>
    assert actual == expected, f"py_encode_basestring({s!r}) = {actual!r}, but JSONEncoder().encode({s!r}) = {expected!r}"
           ^^^^^^^^^^^^^^^^^^
AssertionError: py_encode_basestring('\x7f') = '"\x7f"', but JSONEncoder().encode('\x7f') = '"\\u007f"'
Bug report
Bug description:
There is an inconsistency in Python's json module between py_encode_basestring and JSONEncoder.encode() when handling the DEL character (U+007F, \x7f) with ensure_ascii=True.
The DEL character (ASCII code 127) is being incorrectly escaped as \u007f by JSONEncoder.encode() when ensure_ascii=True, while py_encode_basestring() correctly outputs it as a literal character.
CPython versions tested on:
3.12
Operating systems tested on:
Linux
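To help narrow down where the difference described above comes from, here is a minimal sketch that prints what each string-encoding path produces for the DEL character. It assumes that json.encoder also exposes py_encode_basestring_ascii, the pure-Python counterpart of the ensure_ascii=True path, and that JSONEncoder.encode() picks its string encoder based on ensure_ascii. The helper name compare_encoders is hypothetical and only used for illustration. The exact escaped forms may differ between CPython versions, so no expected output is shown.

```python
import json
from json.encoder import (
    JSONEncoder,
    py_encode_basestring,
    py_encode_basestring_ascii,
)


def compare_encoders(s):
    """Return the output of each string-encoding path for s (hypothetical helper)."""
    return {
        # Pure-Python encoder that leaves non-ASCII text unescaped.
        "py_encode_basestring": py_encode_basestring(s),
        # Pure-Python encoder that produces ASCII-only output.
        "py_encode_basestring_ascii": py_encode_basestring_ascii(s),
        # Public API with each ensure_ascii setting.
        "JSONEncoder(ensure_ascii=False)": JSONEncoder(ensure_ascii=False).encode(s),
        "JSONEncoder(ensure_ascii=True)": JSONEncoder(ensure_ascii=True).encode(s),
    }


results = compare_encoders("\x7f")
for name, value in results.items():
    # Whatever form is emitted, it should decode back to the original character.
    round_trips = json.loads(value) == "\x7f"
    print(f"{name:<34} {value!r}  round-trips: {round_trips}")
```

Comparing the ensure_ascii=True result with py_encode_basestring_ascii, and the ensure_ascii=False result with py_encode_basestring, gives a like-for-like view of each mode, which may make it easier to tell whether the mismatch is between modes or between implementations.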
Linked PRs
\x7f handling consistent across JSON #140794