Commit 6479dce

1 parent 9d3b53c commit 6479dce

2 files changed

+52
-28
lines changed

Doc/library/codecs.rst

Lines changed: 2 additions & 0 deletions
@@ -350,6 +350,8 @@ error handling schemes by accepting the *errors* string argument:
 The following error handlers can be used with all Python
 :ref:`standard-encodings` codecs:
 
+.. The following tables are reproduced on the library/functions page under open.
+
 .. tabularcolumns:: |l|L|
 
 +-------------------------+-----------------------------------------------+

Doc/library/functions.rst

Lines changed: 50 additions & 28 deletions
Original file line numberDiff line numberDiff line change
@@ -1423,37 +1423,59 @@ are always available. They are listed here in alphabetical order.
14231423
*errors* is an optional string that specifies how encoding and decoding
14241424
errors are to be handled—this cannot be used in binary mode.
14251425
A variety of standard error handlers are available
1426-
(listed under :ref:`error-handlers`), though any
1427-
error handling name that has been registered with
1426+
(listed under :ref:`error-handlers`, and reproduced below for convenience),
1427+
though any error handling name that has been registered with
14281428
:func:`codecs.register_error` is also valid. The standard names
14291429
include:
14301430

1431-
* ``'strict'`` to raise a :exc:`ValueError` exception if there is
1432-
an encoding error. The default value of ``None`` has the same
1433-
effect.
1434-
1435-
* ``'ignore'`` ignores errors. Note that ignoring encoding errors
1436-
can lead to data loss.
1437-
1438-
* ``'replace'`` causes a replacement marker (such as ``'?'``) to be inserted
1439-
where there is malformed data.
1440-
1441-
* ``'surrogateescape'`` will represent any incorrect bytes as low
1442-
surrogate code units ranging from U+DC80 to U+DCFF.
1443-
These surrogate code units will then be turned back into
1444-
the same bytes when the ``surrogateescape`` error handler is used
1445-
when writing data. This is useful for processing files in an
1446-
unknown encoding.
1447-
1448-
* ``'xmlcharrefreplace'`` is only supported when writing to a file.
1449-
Characters not supported by the encoding are replaced with the
1450-
appropriate XML character reference :samp:`&#{nnn};`.
1451-
1452-
* ``'backslashreplace'`` replaces malformed data by Python's backslashed
1453-
escape sequences.
1454-
1455-
* ``'namereplace'`` (also only supported when writing)
1456-
replaces unsupported characters with ``\N{...}`` escape sequences.
1431+
.. list-table::
1432+
:header-rows: 1
1433+
1434+
* - Error handler
1435+
- Description
1436+
* - ``'strict'``
1437+
- Raise a :exc:`UnicodeError` (or a subclass) exception if there is
1438+
an error. The default value of ``None`` has the same effect.
1439+
* - ``'ignore'``
1440+
- Ignore the malformed data and continue without further notice.
1441+
Note that ignoring encoding errors can lead to data loss.
1442+
* - ``'replace'``
1443+
- Replace malformed data with a replacement marker.
1444+
On encoding, use ``?`` (ASCII character).
1445+
On decoding, use ```` (U+FFFD, the official REPLACEMENT CHARACTER)
1446+
* - ``'backslashreplace'``
1447+
- Replace malformed data with backslashed escape sequences.
1448+
On encoding, use hexadecimal form of Unicode code point with formats
1449+
:samp:`\\x{hh}` :samp:`\\u{xxxx}` :samp:`\\U{xxxxxxxx}`.
1450+
On decoding, use hexadecimal form of byte value with format :samp:`\\x{hh}`.
1451+
* - ``'surrogateescape'``
1452+
- Will represent any incorrect bytes as low
1453+
surrogate code units ranging from ``U+DC80`` to ``U+DCFF``.
1454+
These surrogate code units will then be turned back into
1455+
the same bytes when the ``'surrogateescape'`` error handler is used
1456+
when writing data. This is useful for processing files in an
1457+
unknown encoding.
1458+
* - ``'surrogatepass'``
1459+
- Only available for Unicode codecs.
1460+
Allow encoding and decoding surrogate code point
1461+
(``U+D800`` - ``U+DFFF``) as normal code point. Otherwise these codecs
1462+
treat the presence of surrogate code point in :class:`str` as an error.
1463+
1464+
The following error handlers are only applicable to encoding (within
1465+
:term:`text encodings <text encoding>`):
1466+
1467+
.. list-table::
1468+
:header-rows: 1
1469+
1470+
* - Error handler
1471+
- Description
1472+
* - ``'xmlcharrefreplace'``
1473+
- Only supported when writing to a file.
1474+
Characters not supported by the encoding are replaced with the
1475+
appropriate XML character reference :samp:`&#{nnn};`.
1476+
* - ``'namereplace'``
1477+
- Only supported when writing. Replaces unsupported characters with
1478+
``\N{...}`` escape sequences.
14571479

14581480
.. index::
14591481
single: universal newlines; open() built-in function

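The ``'surrogateescape'`` and ``'surrogatepass'`` rows describe behavior that is easiest to see as a round trip. The sketch below assumes a temporary directory and two hypothetical file names (``in.bin``, ``out.bin``); it passes ``newline=""`` so that newline translation does not alter the bytes being compared.

```python
import os
import tempfile

# Bytes that are not valid UTF-8 (0xe9 lacks continuation bytes, 0xff is never valid).
original = b"name=\xe9\xff\n"

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "in.bin")   # hypothetical file names
    dst = os.path.join(tmp, "out.bin")

    with open(src, "wb") as f:
        f.write(original)

    # Reading: each undecodable byte becomes a low surrogate in U+DC80..U+DCFF.
    with open(src, encoding="utf-8", errors="surrogateescape", newline="") as f:
        text = f.read()

    # Writing with the same handler turns the surrogates back into the bytes.
    with open(dst, "w", encoding="utf-8", errors="surrogateescape", newline="") as f:
        f.write(text)

    with open(dst, "rb") as f:
        round_tripped = f.read()

# 'surrogatepass': a lone surrogate is an error under 'strict' for UTF-8,
# but is encoded and decoded like any other code point with 'surrogatepass'.
lone = "\ud800"
try:
    lone.encode("utf-8")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

lone_encoded = lone.encode("utf-8", errors="surrogatepass")
lone_decoded = lone_encoded.decode("utf-8", errors="surrogatepass")

print(ascii(text), round_tripped == original)
print(strict_failed, lone_encoded, ascii(lone_decoded))
```

Text read this way should not be printed or written with the default ``'strict'`` handler, since the escaped surrogates cannot be encoded by it.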