Commit f47f4bd
gh-145980: Add support for alternative alphabets in the binascii module

* Add the alphabet parameter in functions b2a_base64(), a2b_base64(), b2a_base85() and a2b_base85().
* And a number of "*_ALPHABET" constants.
* Remove b2a_z85() and a2b_z85().

1 parent e167e06 commit f47f4bd

7 files changed (+451 -305 lines changed)
Doc/library/binascii.rst

Lines changed: 56 additions & 36 deletions
@@ -48,12 +48,15 @@ The :mod:`!binascii` module defines the following functions:
       Added the *backtick* parameter.


-.. function:: a2b_base64(string, /, *, strict_mode=False)
-              a2b_base64(string, /, *, strict_mode=True, ignorechars)
+.. function:: a2b_base64(string, /, *, alphabet=BASE64_ALPHABET, strict_mode=False)
+              a2b_base64(string, /, *, ignorechars, alphabet=BASE64_ALPHABET, strict_mode=True)

    Convert a block of base64 data back to binary and return the binary data. More
    than one line may be passed at a time.

+   Optional *alphabet* must be a :class:`bytes` object of length 64 which
+   specifies an alternative alphabet.
+
    If *ignorechars* is specified, it should be a :term:`bytes-like object`
    containing characters to ignore from the input when *strict_mode* is true.
    If *ignorechars* contains the pad character ``'='``, the pad characters
@@ -76,10 +79,10 @@ The :mod:`!binascii` module defines the following functions:
       Added the *strict_mode* parameter.

    .. versionchanged:: 3.15
-      Added the *ignorechars* parameter.
+      Added the *alphabet* and *ignorechars* parameters.


-.. function:: b2a_base64(data, *, wrapcol=0, newline=True)
+.. function:: b2a_base64(data, *, alphabet=BASE64_ALPHABET, wrapcol=0, newline=True)

    Convert binary data to a line(s) of ASCII characters in base64 coding,
    as specified in :rfc:`4648`.
@@ -95,7 +98,7 @@ The :mod:`!binascii` module defines the following functions:
       Added the *newline* parameter.

    .. versionchanged:: 3.15
-      Added the *wrapcol* parameter.
+      Added the *alphabet* and *wrapcol* parameters.


 .. function:: a2b_ascii85(string, /, *, foldspaces=False, adobe=False, ignorechars=b"")
@@ -148,7 +151,7 @@ The :mod:`!binascii` module defines the following functions:
    .. versionadded:: 3.15


-.. function:: a2b_base85(string, /)
+.. function:: a2b_base85(string, /, *, alphabet=BASE85_ALPHABET)

    Convert Base85 data back to binary and return the binary data.
    More than one line may be passed at a time.
@@ -158,49 +161,25 @@ The :mod:`!binascii` module defines the following functions:
    characters). Each group encodes 32 bits of binary data in the range from
    ``0`` to ``2 ** 32 - 1``, inclusive.

+   Optional *alphabet* must be a :class:`bytes` object of length 85 which
+   specifies an alternative alphabet.
+
    Invalid Base85 data will raise :exc:`binascii.Error`.

    .. versionadded:: 3.15


-.. function:: b2a_base85(data, /, *, pad=False)
+.. function:: b2a_base85(data, /, *, alphabet=BASE85_ALPHABET, pad=False)

    Convert binary data to a line of ASCII characters in Base85 coding.
    The return value is the converted line.

-   If *pad* is true, the input is padded with ``b'\0'`` so its length is a
-   multiple of 4 bytes before encoding.
-
-   .. versionadded:: 3.15
-
-
-.. function:: a2b_z85(string, /)
-
-   Convert Z85 data back to binary and return the binary data.
-   More than one line may be passed at a time.
-
-   Valid Z85 data contains characters from the Z85 alphabet in groups
-   of five (except for the final group, which may have from two to five
-   characters). Each group encodes 32 bits of binary data in the range from
-   ``0`` to ``2 ** 32 - 1``, inclusive.
-
-   See `Z85 specification <https://rfc.zeromq.org/spec/32/>`_ for more information.
-
-   Invalid Z85 data will raise :exc:`binascii.Error`.
-
-   .. versionadded:: 3.15
-
-
-.. function:: b2a_z85(data, /, *, pad=False)
-
-   Convert binary data to a line of ASCII characters in Z85 coding.
-   The return value is the converted line.
+   Optional *alphabet* must be a :term:`bytes-like object` of length 85 which
+   specifies an alternative alphabet.

    If *pad* is true, the input is padded with ``b'\0'`` so its length is a
    multiple of 4 bytes before encoding.

-   See `Z85 specification <https://rfc.zeromq.org/spec/32/>`_ for more information.
-
    .. versionadded:: 3.15


@@ -300,6 +279,47 @@ The :mod:`!binascii` module defines the following functions:
    but may be handled by reading a little more data and trying again.


+.. data:: BASE64_ALPHABET
+
+   The Base 64 alphabet according to :rfc:`4648`.
+
+.. data:: URLSAFE_BASE64_ALPHABET
+
+   The "URL and filename safe" Base 64 alphabet according to :rfc:`4648`.
+
+.. data:: CRYPT_ALPHABET
+
+   The Base 64 alphabet used in the :manpage:`crypt(3)` routine and in the GEDCOM format.
+
+.. data:: BCRYPT_ALPHABET
+
+   The Base 64 alphabet used in the ``bcrypt`` hashing function.
+
+.. data:: UU_ALPHABET
+
+   The Uuencoding alphabet.
+
+.. data:: XX_ALPHABET
+
+   The Xxencoding alphabet.
+
+.. data:: BINHEX_ALPHABET
+
+   The Base 64 alphabet used in BinHex 4 (HQX) within the classic Mac OS.
+
+.. data:: BASE85_ALPHABET
+
+   The Base85 alphabet.
+
+.. data:: ASCII85_ALPHABET
+
+   The Ascii85 alphabet.
+
+.. data:: Z85_ALPHABET
+
+   The `Z85 <https://rfc.zeromq.org/spec/32/>`_ alphabet.
+
+
 .. seealso::

    Module :mod:`base64`
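
The *alphabet* parameter documented in the hunks above is only available on interpreters that include this commit. A minimal sketch of the same idea for older interpreters follows: encode with the standard alphabet, then translate byte for byte. The helper names `encode_with_alphabet` and `decode_with_alphabet` are hypothetical, and the two alphabets are written out locally instead of being read from `binascii`, so nothing here relies on the new constants.

```python
import base64
import binascii

# RFC 4648 standard alphabet, spelled out so the sketch does not depend on
# binascii.BASE64_ALPHABET being present.
STANDARD = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
# RFC 4648 "URL and filename safe" alphabet: '-' and '_' replace '+' and '/'.
URLSAFE = STANDARD[:-2] + b"-_"


def encode_with_alphabet(data, alphabet):
    """Hypothetical stand-in for b2a_base64(data, alphabet=..., newline=False)."""
    if len(alphabet) != 64:
        raise ValueError("alphabet must be 64 bytes long")
    encoded = binascii.b2a_base64(data, newline=False)
    # '=' is not in STANDARD, so padding passes through translate unchanged.
    return encoded.translate(bytes.maketrans(STANDARD, alphabet))


def decode_with_alphabet(text, alphabet):
    """Hypothetical stand-in for a2b_base64(text, alphabet=...).

    Simplification: bytes outside `alphabet` are passed through untranslated,
    so this is more lenient than a real alphabet-aware decoder would be.
    """
    if len(alphabet) != 64:
        raise ValueError("alphabet must be 64 bytes long")
    return binascii.a2b_base64(text.translate(bytes.maketrans(alphabet, STANDARD)))


payload = b"\xfb\xff\xfe"
token = encode_with_alphabet(payload, URLSAFE)
print(token)                                        # b'-__-'
print(decode_with_alphabet(token, URLSAFE) == payload)  # True
```

Doing the mapping inside the codec, as the commit does, avoids the extra pass over the output and lets strict-mode decoding reject characters that belong only to the standard alphabet, which this translation-based sketch cannot do.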

Doc/whatsnew/3.15.rst

Lines changed: 4 additions & 1 deletion
@@ -644,13 +644,16 @@ binascii

   - :func:`~binascii.b2a_ascii85` and :func:`~binascii.a2b_ascii85`
   - :func:`~binascii.b2a_base85` and :func:`~binascii.a2b_base85`
-  - :func:`~binascii.b2a_z85` and :func:`~binascii.a2b_z85`

   (Contributed by James Seo and Serhiy Storchaka in :gh:`101178`.)

 * Added the *wrapcol* parameter in :func:`~binascii.b2a_base64`.
   (Contributed by Serhiy Storchaka in :gh:`143214`.)

+* Added the *alphabet* parameter in :func:`~binascii.b2a_base64` and
+  :func:`~binascii.a2b_base64`.
+  (Contributed by Serhiy Storchaka in :gh:`145980`.)
+
 * Added the *ignorechars* parameter in :func:`~binascii.a2b_base64`.
   (Contributed by Serhiy Storchaka in :gh:`144001`.)

Lib/base64.py

Lines changed: 15 additions & 11 deletions
@@ -56,11 +56,13 @@ def b64encode(s, altchars=None, *, wrapcol=0):
     If wrapcol is non-zero, insert a newline (b'\\n') character after at most
     every wrapcol characters.
     """
-    encoded = binascii.b2a_base64(s, wrapcol=wrapcol, newline=False)
     if altchars is not None:
-        assert len(altchars) == 2, repr(altchars)
-        return encoded.translate(bytes.maketrans(b'+/', altchars))
-    return encoded
+        if len(altchars) != 2:
+            raise ValueError(f'invalid altchars: {altchars!r}')
+        alphabet = binascii.BASE64_ALPHABET[:-2] + altchars
+        return binascii.b2a_base64(s, wrapcol=wrapcol, newline=False,
+                                   alphabet=alphabet)
+    return binascii.b2a_base64(s, wrapcol=wrapcol, newline=False)


 def b64decode(s, altchars=None, validate=_NOT_SPECIFIED, *, ignorechars=_NOT_SPECIFIED):
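
The hunk above replaces an `assert` on the length of `altchars` with an explicit `ValueError`, and derives the full alphabet by swapping the last two symbols of the standard one. The sketch below illustrates both points under the assumption that the new `binascii` API is not available: `STANDARD`, `alphabet_from_altchars` and `b64encode_sketch` are hypothetical names, and the final encoding step falls back to translation in place of the `alphabet=` keyword.

```python
import base64
import binascii

# RFC 4648 standard alphabet; the last two symbols are the ones altchars replaces.
STANDARD = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def alphabet_from_altchars(altchars):
    """Build a 64-byte alphabet the way the patched b64encode() does."""
    # An explicit check still runs under `python -O`, unlike an assert.
    if len(altchars) != 2:
        raise ValueError(f"invalid altchars: {altchars!r}")
    return STANDARD[:-2] + altchars


def b64encode_sketch(s, altchars=None):
    """Hypothetical equivalent of base64.b64encode() without wrapcol support."""
    encoded = binascii.b2a_base64(s, newline=False)
    if altchars is None:
        return encoded
    alphabet = alphabet_from_altchars(altchars)
    # Stand-in for passing alphabet=... to binascii.b2a_base64().
    return encoded.translate(bytes.maketrans(STANDARD, alphabet))


print(b64encode_sketch(b"\xfb\xff\xfe", b"-_"))  # b'-__-'
try:
    b64encode_sketch(b"\xfb\xff\xfe", b"-")
except ValueError as exc:
    print(exc)  # invalid altchars: b'-'
```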
@@ -100,9 +102,11 @@ def b64decode(s, altchars=None, validate=_NOT_SPECIFIED, *, ignorechars=_NOT_SPE
                     break
             s = s.translate(bytes.maketrans(altchars, b'+/'))
         else:
-            trans = bytes.maketrans(b'+/' + altchars, altchars + b'+/')
-            s = s.translate(trans)
-            ignorechars = ignorechars.translate(trans)
+            alphabet = binascii.BASE64_ALPHABET[:-2] + altchars
+            return binascii.a2b_base64(s, strict_mode=validate,
+                                       alphabet=alphabet,
+                                       ignorechars=ignorechars)
+
     if ignorechars is _NOT_SPECIFIED:
         ignorechars = b''
     result = binascii.a2b_base64(s, strict_mode=validate,
@@ -140,7 +144,6 @@ def standard_b64decode(s):
     return b64decode(s)


-_urlsafe_encode_translation = bytes.maketrans(b'+/', b'-_')
 _urlsafe_decode_translation = bytes.maketrans(b'-_', b'+/')

 def urlsafe_b64encode(s):
@@ -150,7 +153,8 @@ def urlsafe_b64encode(s):
     bytes object. The alphabet uses '-' instead of '+' and '_' instead of
     '/'.
     """
-    return b64encode(s).translate(_urlsafe_encode_translation)
+    return binascii.b2a_base64(s, newline=False,
+                               alphabet=binascii.URLSAFE_BASE64_ALPHABET)

 def urlsafe_b64decode(s):
     """Decode bytes using the URL- and filesystem-safe Base64 alphabet.
@@ -393,14 +397,14 @@ def b85decode(b):

 def z85encode(s, pad=False):
     """Encode bytes-like object b in z85 format and return a bytes object."""
-    return binascii.b2a_z85(s, pad=pad)
+    return binascii.b2a_base85(s, pad=pad, alphabet=binascii.Z85_ALPHABET)

 def z85decode(s):
     """Decode the z85-encoded bytes-like object or ASCII string b

     The result is returned as a bytes object.
     """
-    return binascii.a2b_z85(s)
+    return binascii.a2b_base85(s, alphabet=binascii.Z85_ALPHABET)

 # Legacy interface. This code could be cleaned up since I don't believe
 # binascii has any line length limitations. It just doesn't seem worth it
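
The `z85encode()` and `z85decode()` changes above treat Z85 as ordinary Base85 with a different 85-symbol alphabet. As a rough illustration of why that works, the sketch below gets Z85 from `base64.b85encode()` by pure symbol substitution. The helper names are hypothetical, and it is an assumption that the locally written `B85` table matches what the commit exposes as `binascii.BASE85_ALPHABET`. The `Z85` table follows the ZeroMQ specification linked in the documentation hunk.

```python
import base64

# Alphabet used by base64.b85encode() (assumed to equal BASE85_ALPHABET).
B85 = (b"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
       b"abcdefghijklmnopqrstuvwxyz!#$%&()*+-;<=>?@^_`{|}~")
# Z85 alphabet from the ZeroMQ RFC 32 specification.
Z85 = (b"0123456789abcdefghijklmnopqrstuvwxyz"
       b"ABCDEFGHIJKLMNOPQRSTUVWXYZ.-:+=^!/*?&<>()[]{}@%$#")

_TO_Z85 = bytes.maketrans(B85, Z85)
_FROM_Z85 = bytes.maketrans(Z85, B85)


def z85encode_sketch(data):
    """Encode as Base85, then swap each symbol for its Z85 counterpart."""
    return base64.b85encode(data).translate(_TO_Z85)


def z85decode_sketch(text):
    """Swap Z85 symbols back to the Base85 alphabet, then decode.

    Simplification: bytes outside the Z85 alphabet are not rejected here.
    """
    return base64.b85decode(text.translate(_FROM_Z85))


# Test vector from the Z85 specification.
frame = bytes([0x86, 0x4F, 0xD2, 0x6F, 0xB5, 0x59, 0xF7, 0x5B])
print(z85encode_sketch(frame))                     # b'HelloWorld'
print(z85decode_sketch(b"HelloWorld") == frame)    # True
```

Because the two encodings differ only in their symbol tables, a single alphabet-aware codec can serve both, which is presumably why the commit removes the dedicated `b2a_z85()` and `a2b_z85()` functions.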
