Commit f5776fd

[3.13] Regex HOWTO: invalid string literals result in SyntaxWarning (GH-148092) (#148098)
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
1 parent df89a70 commit f5776fd
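
The warning category this change documents can be checked with a short script. What follows is a minimal sketch, not part of the commit: the names `cooked_source`, `raw_source`, and `warnings_from_compiling` are illustrative. It compiles a one-line module whose string literal contains `\d` and records whatever warning the compiler emits. On Python 3.12 and later the category should be `SyntaxWarning`; Python 3.6 through 3.11 report `DeprecationWarning` instead.

```python
import warnings

# Source text of a one-line module. The doubled backslash here yields the
# literal text  pattern = '\d+'  which contains \d: a valid regex escape,
# but not a valid Python string escape.
cooked_source = "pattern = '\\d+'"

# Same pattern written as a raw string literal, which needs no escaping.
raw_source = "pattern = r'\\d+'"


def warnings_from_compiling(source):
    """Compile source and return the categories of any warnings emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "exec")
    return [w.category for w in caught]


cooked_categories = warnings_from_compiling(cooked_source)
raw_categories = warnings_from_compiling(raw_source)

print("cooked literal:", [c.__name__ for c in cooked_categories])
print("raw literal:   ", [c.__name__ for c in raw_categories])
```

The raw string version should compile without any warning, which is why the HOWTO recommends the `r` prefix for patterns.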

1 file changed: +28 -28 lines changed

Doc/howto/regex.rst

Lines changed: 28 additions & 28 deletions
@@ -1,7 +1,7 @@
 .. _regex-howto:

 ****************************
-Regular Expression HOWTO
+Regular expression HOWTO
 ****************************

 :Author: A.M. Kuchling <amk@amk.ca>
@@ -47,7 +47,7 @@ Python code to do the processing; while Python code will be slower than an
 elaborate regular expression, it will also probably be more understandable.


-Simple Patterns
+Simple patterns
 ===============

 We'll start by learning about the simplest possible regular expressions. Since
@@ -59,7 +59,7 @@ expressions (deterministic and non-deterministic finite automata), you can refer
 to almost any textbook on writing compilers.


-Matching Characters
+Matching characters
 -------------------

 Most letters and characters will simply match themselves. For example, the
@@ -159,7 +159,7 @@ match even a newline. ``.`` is often used where you want to match "any
 character".


-Repeating Things
+Repeating things
 ----------------

 Being able to match varying sets of characters is the first thing regular
@@ -210,7 +210,7 @@ this RE against the string ``'abcbd'``.
 |      |           | ``[bcd]*`` is only matching     |
 |      |           | ``bc``.                         |
 +------+-----------+---------------------------------+
-| 6    | ``abcb``  | Try ``b`` again. This time      |
+| 7    | ``abcb``  | Try ``b`` again. This time      |
 |      |           | the character at the            |
 |      |           | current position is ``'b'``, so |
 |      |           | it succeeds.                    |
@@ -255,7 +255,7 @@ is equivalent to ``+``, and ``{0,1}`` is the same as ``?``. It's better to use
 to read.


-Using Regular Expressions
+Using regular expressions
 =========================

 Now that we've looked at some simple regular expressions, how do we actually use
@@ -264,7 +264,7 @@ expression engine, allowing you to compile REs into objects and then perform
 matches with them.


-Compiling Regular Expressions
+Compiling regular expressions
 -----------------------------

 Regular expressions are compiled into pattern objects, which have
@@ -295,7 +295,7 @@ disadvantage which is the topic of the next section.

 .. _the-backslash-plague:

-The Backslash Plague
+The backslash plague
 --------------------

 As stated earlier, regular expressions use the backslash character (``'\'``) to
@@ -335,7 +335,7 @@ expressions will often be written in Python code using this raw string notation.

 In addition, special escape sequences that are valid in regular expressions,
 but not valid as Python string literals, now result in a
-:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`,
+:exc:`SyntaxWarning` and will eventually become a :exc:`SyntaxError`,
 which means the sequences will be invalid if raw string notation or escaping
 the backslashes isn't used.

@@ -351,7 +351,7 @@ the backslashes isn't used.
 +-------------------+------------------+


-Performing Matches
+Performing matches
 ------------------

 Once you have an object representing a compiled regular expression, what do you
@@ -369,10 +369,10 @@ for a complete listing.
 |                  | location where this RE matches.               |
 +------------------+-----------------------------------------------+
 | ``findall()``    | Find all substrings where the RE matches, and |
-|                  | returns them as a list.                       |
+|                  | return them as a list.                        |
 +------------------+-----------------------------------------------+
 | ``finditer()``   | Find all substrings where the RE matches, and |
-|                  | returns them as an :term:`iterator`.          |
+|                  | return them as an :term:`iterator`.           |
 +------------------+-----------------------------------------------+

 :meth:`~re.Pattern.match` and :meth:`~re.Pattern.search` return ``None`` if no match can be found. If
@@ -473,7 +473,7 @@ Two pattern methods return all of the matches for a pattern.
 The ``r`` prefix, making the literal a raw string literal, is needed in this
 example because escape sequences in a normal "cooked" string literal that are
 not recognized by Python, as opposed to regular expressions, now result in a
-:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`. See
+:exc:`SyntaxWarning` and will eventually become a :exc:`SyntaxError`. See
 :ref:`the-backslash-plague`.

 :meth:`~re.Pattern.findall` has to create the entire list before it can be returned as the
@@ -491,7 +491,7 @@ result. The :meth:`~re.Pattern.finditer` method returns a sequence of
 (29, 31)


-Module-Level Functions
+Module-level functions
 ----------------------

 You don't have to create a pattern object and call its methods; the
@@ -518,7 +518,7 @@ Outside of loops, there's not much difference thanks to the internal
 cache.


-Compilation Flags
+Compilation flags
 -----------------

 .. currentmodule:: re
@@ -642,7 +642,7 @@ of each one.
 whitespace is in a character class or preceded by an unescaped backslash; this
 lets you organize and indent the RE more clearly. This flag also lets you put
 comments within a RE that will be ignored by the engine; comments are marked by
-a ``'#'`` that's neither in a character class or preceded by an unescaped
+a ``'#'`` that's neither in a character class nor preceded by an unescaped
 backslash.

 For example, here's a RE that uses :const:`re.VERBOSE`; see how much easier it
@@ -669,7 +669,7 @@ of each one.
 to understand than the version using :const:`re.VERBOSE`.


-More Pattern Power
+More pattern power
 ==================

 So far we've only covered a part of the features of regular expressions. In
@@ -679,7 +679,7 @@ retrieve portions of the text that was matched.

 .. _more-metacharacters:

-More Metacharacters
+More metacharacters
 -------------------

 There are some metacharacters that we haven't covered yet. Most of them will be
@@ -872,7 +872,7 @@ Backreferences like this aren't often useful for just searching through a string
 find out that they're *very* useful when performing string substitutions.


-Non-capturing and Named Groups
+Non-capturing and named groups
 ------------------------------

 Elaborate REs may use many groups, both to capture substrings of interest, and
@@ -976,7 +976,7 @@ current point. The regular expression for finding doubled words,
 'the the'


-Lookahead Assertions
+Lookahead assertions
 --------------------

 Another zero-width assertion is the lookahead assertion. Lookahead assertions
@@ -1058,7 +1058,7 @@ end in either ``bat`` or ``exe``:
 ``.*[.](?!bat$|exe$)[^.]*$``


-Modifying Strings
+Modifying strings
 =================

 Up to this point, we've simply performed searches against a static string.
@@ -1080,7 +1080,7 @@ using the following pattern methods:
 +------------------+-----------------------------------------------+


-Splitting Strings
+Splitting strings
 -----------------

 The :meth:`~re.Pattern.split` method of a pattern splits a string apart
@@ -1134,7 +1134,7 @@ argument, but is otherwise the same. ::
 ['Words', 'words, words.']


-Search and Replace
+Search and replace
 ------------------

 Another common task is to find all the matches for a pattern, and replace them
@@ -1233,15 +1233,15 @@ pattern object as the first parameter, or use embedded modifiers in the
 pattern string, e.g. ``sub("(?i)b+", "x", "bbbb BBBB")`` returns ``'x x'``.


-Common Problems
+Common problems
 ===============

 Regular expressions are a powerful tool for some applications, but in some ways
 their behaviour isn't intuitive and at times they don't behave the way you may
 expect them to. This section will point out some of the most common pitfalls.


-Use String Methods
+Use string methods
 ------------------

 Sometimes using the :mod:`re` module is a mistake. If you're matching a fixed
@@ -1307,7 +1307,7 @@ string and then backtracking to find a match for the rest of the RE. Use
 :func:`re.search` instead.


-Greedy versus Non-Greedy
+Greedy versus non-greedy
 ------------------------

 When repeating a regular expression, as in ``a*``, the resulting action is to
@@ -1385,9 +1385,9 @@ Feedback
 ========

 Regular expressions are a complicated topic. Did this document help you
-understand them? Were there parts that were unclear, or Problems you
+understand them? Were there parts that were unclear, or problems you
 encountered that weren't covered here? If so, please send suggestions for
-improvements to the author.
+improvements to the :ref:`issue tracker <using-the-tracker>`.

 The most complete book on regular expressions is almost certainly Jeffrey
 Friedl's Mastering Regular Expressions, published by O'Reilly. Unfortunately,
