From f82c1fc23d5a93e5c8795bdde388ccd982c1abb4 Mon Sep 17 00:00:00 2001 From: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com> Date: Sat, 4 Apr 2026 18:18:32 +0300 Subject: [PATCH 01/11] Update regex HOWTO for re.prefixmatch --- Doc/howto/regex.rst | 81 +++++++++++++++++++++++---------------------- 1 file changed, 41 insertions(+), 40 deletions(-) diff --git a/Doc/howto/regex.rst b/Doc/howto/regex.rst index 7486a378dbb06f..5eaac214aacf5b 100644 --- a/Doc/howto/regex.rst +++ b/Doc/howto/regex.rst @@ -362,12 +362,12 @@ for a complete listing. +------------------+-----------------------------------------------+ | Method/Attribute | Purpose | +==================+===============================================+ -| ``match()`` | Determine if the RE matches at the beginning | -| | of the string. | -+------------------+-----------------------------------------------+ | ``search()`` | Scan through a string, looking for any | | | location where this RE matches. | +------------------+-----------------------------------------------+ +| ``prefixmatch()``| Determine if the RE matches at the beginning | +| | of the string. | ++------------------+-----------------------------------------------+ | ``findall()`` | Find all substrings where the RE matches, and | | | returns them as a list. | +------------------+-----------------------------------------------+ @@ -375,7 +375,7 @@ for a complete listing. | | returns them as an :term:`iterator`. | +------------------+-----------------------------------------------+ -:meth:`~re.Pattern.match` and :meth:`~re.Pattern.search` return ``None`` if no match can be found. If +:meth:`~re.Pattern.search` and :meth:`~re.Pattern.prefixmatch` return ``None`` if no match can be found. If they're successful, a :ref:`match object ` instance is returned, containing information about the match: where it starts and ends, the substring it matched, and more. 
@@ -391,21 +391,21 @@ Python interpreter, import the :mod:`re` module, and compile a RE:: >>> p re.compile('[a-z]+') -Now, you can try matching various strings against the RE ``[a-z]+``. An empty +Now, you can try searching various strings against the RE ``[a-z]+``. An empty string shouldn't match at all, since ``+`` means 'one or more repetitions'. -:meth:`~re.Pattern.match` should return ``None`` in this case, which will cause the +:meth:`~re.Pattern.search` should return ``None`` in this case, which will cause the interpreter to print no output. You can explicitly print the result of -:meth:`!match` to make this clear. :: +:meth:`!search` to make this clear. :: - >>> p.match("") - >>> print(p.match("")) + >>> p.search("") + >>> print(p.search("")) None Now, let's try it on a string that it should match, such as ``tempo``. In this -case, :meth:`~re.Pattern.match` will return a :ref:`match object `, so you +case, :meth:`~re.Pattern.search` will return a :ref:`match object `, so you should store the result in a variable for later use. :: - >>> m = p.match('tempo') + >>> m = p.search('tempo') >>> m @@ -437,27 +437,28 @@ Trying these methods will soon clarify their meaning:: :meth:`~re.Match.group` returns the substring that was matched by the RE. :meth:`~re.Match.start` and :meth:`~re.Match.end` return the starting and ending index of the match. :meth:`~re.Match.span` -returns both start and end indexes in a single tuple. Since the :meth:`~re.Pattern.match` -method only checks if the RE matches at the start of a string, :meth:`!start` -will always be zero. However, the :meth:`~re.Pattern.search` method of patterns -scans through the string, so the match may not start at zero in that -case. :: +returns both start and end indexes in a single tuple. +The :meth:`~re.Pattern.search` method of patterns +scans through the string, so the match may not start at zero. 
+However, the :meth:`~re.Pattern.prefixmatch` +method only checks if the RE matches at the start of a string, so :meth:`!start` +will always be zero in that case. :: - >>> print(p.match('::: message')) - None >>> m = p.search('::: message'); print(m) >>> m.group() 'message' >>> m.span() (4, 11) + >>> print(p.prefixmatch('::: message')) + None In actual programs, the most common style is to store the :ref:`match object ` in a variable, and then check if it was ``None``. This usually looks like:: p = re.compile( ... ) - m = p.match( 'string goes here' ) + m = p.search( 'string goes here' ) if m: print('Match found: ', m.group()) else: @@ -495,15 +496,15 @@ Module-Level Functions ---------------------- You don't have to create a pattern object and call its methods; the -:mod:`re` module also provides top-level functions called :func:`~re.match`, -:func:`~re.search`, :func:`~re.findall`, :func:`~re.sub`, and so forth. These functions +:mod:`re` module also provides top-level functions called :func:`~re.search`, +:func:`~re.prefixmatch`, :func:`~re.findall`, :func:`~re.sub`, and so forth. These functions take the same arguments as the corresponding pattern method with the RE string added as the first argument, and still return either ``None`` or a :ref:`match object ` instance. :: - >>> print(re.match(r'From\s+', 'Fromage amk')) + >>> print(re.prefixmatch(r'From\s+', 'Fromage amk')) None - >>> re.match(r'From\s+', 'From amk Thu May 14 19:12:10 1998') #doctest: +ELLIPSIS + >>> re.prefixmatch(r'From\s+', 'From amk Thu May 14 19:12:10 1998') #doctest: +ELLIPSIS Under the hood, these functions simply create a pattern object for you @@ -812,7 +813,7 @@ of a group with a quantifier, such as ``*``, ``+``, ``?``, or ``ab``. :: >>> p = re.compile('(ab)*') - >>> print(p.match('ababababab').span()) + >>> print(p.search('ababababab').span()) (0, 10) Groups indicated with ``'('``, ``')'`` also capture the starting and ending @@ -825,7 +826,7 @@ argument. 
Later we'll see how to express groups that don't capture the span of text that they match. :: >>> p = re.compile('(a)b') - >>> m = p.match('ab') + >>> m = p.search('ab') >>> m.group() 'ab' >>> m.group(0) @@ -836,7 +837,7 @@ to determine the number, just count the opening parenthesis characters, going from left to right. :: >>> p = re.compile('(a(b)c)d') - >>> m = p.match('abcd') + >>> m = p.search('abcd') >>> m.group(0) 'abcd' >>> m.group(1) @@ -912,10 +913,10 @@ but aren't interested in retrieving the group's contents. You can make this fact explicit by using a non-capturing group: ``(?:...)``, where you can replace the ``...`` with any other regular expression. :: - >>> m = re.match("([abc])+", "abc") + >>> m = re.search("([abc])+", "abc") >>> m.groups() ('c',) - >>> m = re.match("(?:[abc])+", "abc") + >>> m = re.search("(?:[abc])+", "abc") >>> m.groups() () @@ -949,7 +950,7 @@ given numbers, so you can retrieve information about a group in two ways:: Additionally, you can retrieve named groups as a dictionary with :meth:`~re.Match.groupdict`:: - >>> m = re.match(r'(?P<first>\w+) (?P<last>\w+)', 'Jane Doe') + >>> m = re.search(r'(?P<first>\w+) (?P<last>\w+)', 'Jane Doe') >>> m.groupdict() {'first': 'Jane', 'last': 'Doe'} @@ -1274,18 +1275,18 @@ In short, before turning to the :mod:`re` module, consider whether your problem can be solved with a faster and simpler string method. -match() versus search() ------------------------ +prefixmatch() versus search() +----------------------------- -The :func:`~re.match` function only checks if the RE matches at the beginning of the +The :func:`~re.prefixmatch` function only checks if the RE matches at the beginning of the string while :func:`~re.search` will scan forward through the string for a match. -It's important to keep this distinction in mind. Remember, :func:`!match` will +It's important to keep this distinction in mind. 
Remember, :func:`!prefixmatch` will only report a successful match which will start at 0; if the match wouldn't -start at zero, :func:`!match` will *not* report it. :: +start at zero, :func:`!prefixmatch` will *not* report it. :: - >>> print(re.match('super', 'superstition').span()) + >>> print(re.prefixmatch('super', 'superstition').span()) (0, 5) - >>> print(re.match('super', 'insuperable')) + >>> print(re.prefixmatch('super', 'insuperable')) None On the other hand, :func:`~re.search` will scan forward through the string, @@ -1296,7 +1297,7 @@ reporting the first match it finds. :: >>> print(re.search('super', 'insuperable').span()) (2, 7) -Sometimes you'll be tempted to keep using :func:`re.match`, and just add ``.*`` +Sometimes you'll be tempted to keep using :func:`re.prefixmatch`, and just add ``.*`` to the front of your RE. Resist this temptation and use :func:`re.search` instead. The regular expression compiler does some analysis of REs in order to speed up the process of looking for a match. One such analysis figures out what @@ -1322,9 +1323,9 @@ doesn't work because of the greedy nature of ``.*``. :: >>> s = '<html><head><title>Title</title>' >>> len(s) 32 - >>> print(re.match('<.*>', s).span()) + >>> print(re.prefixmatch('<.*>', s).span()) (0, 32) - >>> print(re.match('<.*>', s).group()) + >>> print(re.prefixmatch('<.*>', s).group()) <html><head><title>Title</title> The RE matches the ``'<'`` in ``'<html>'``, and the ``.*`` consumes the rest of @@ -1340,7 +1341,7 @@ example, the ``'>'`` is tried immediately after the first ``'<'`` matches, and when it fails, the engine advances a character at a time, retrying the ``'>'`` at every step. This produces just the right result:: - >>> print(re.match('<.*?>', s).group()) + >>> print(re.prefixmatch('<.*?>', s).group()) <html> (Note that parsing HTML or XML with regular expressions is painful. 
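Reviewer note on the greedy versus non-greedy hunks at the end of this patch: the behaviour they document can be checked with a short sketch. The sample string below is an assumption (an HTML-like string of length 32 whose first tag is ``<html>``). ``re.search`` is used so that the snippet also runs on interpreters without ``re.prefixmatch``; the two calls should agree here because the match starts at index 0.

```python
import re

# Assumed sample input: 32 characters, beginning with the tag '<html>'.
s = '<html><head><title>Title</title>'

# Greedy: '.*' first consumes as much as it can, then backtracks
# until the final '>' in the string can match.
greedy = re.search('<.*>', s)

# Non-greedy: '.*?' consumes as little as possible, so the match
# ends at the first '>' after the opening '<'.
lazy = re.search('<.*?>', s)

print(len(s))         # 32
print(greedy.span())  # (0, 32)
print(lazy.group())   # <html>
```

The only difference between the two patterns is the ``?`` after ``*``, which is why the greedy form swallows the whole string while the non-greedy form stops at the first tag.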
From 33621934544d06619035c4c3138e291a630174c6 Mon Sep 17 00:00:00 2001 From: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com> Date: Sat, 4 Apr 2026 18:42:23 +0300 Subject: [PATCH 02/11] Update fnmatch to prefixmatch --- Doc/library/fnmatch.rst | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/Doc/library/fnmatch.rst b/Doc/library/fnmatch.rst index ee654b7a83e203..a213679e4e2dd7 100644 --- a/Doc/library/fnmatch.rst +++ b/Doc/library/fnmatch.rst @@ -103,7 +103,8 @@ functions: :func:`fnmatch`, :func:`fnmatchcase`, :func:`.filter`, :func:`.filter .. function:: translate(pat) Return the shell-style pattern *pat* converted to a regular expression for - using with :func:`re.match`. The pattern is expected to be a :class:`str`. + using with :func:`re.prefixmatch`. The pattern is expected to be a + :class:`str`. Example: @@ -113,7 +114,7 @@ functions: :func:`fnmatch`, :func:`fnmatchcase`, :func:`.filter`, :func:`.filter >>> regex '(?s:.*\\.txt)\\z' >>> reobj = re.compile(regex) - >>> reobj.match('foobar.txt') + >>> reobj.prefixmatch('foobar.txt') From 8d0571ffd5f805b56b0154a4f95d8c43222a87b1 Mon Sep 17 00:00:00 2001 From: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com> Date: Sat, 4 Apr 2026 18:43:29 +0300 Subject: [PATCH 03/11] Update glob to prefixmatch --- Doc/library/glob.rst | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/Doc/library/glob.rst b/Doc/library/glob.rst index 52c44928153337..9fd9caca34e8d3 100644 --- a/Doc/library/glob.rst +++ b/Doc/library/glob.rst @@ -130,7 +130,8 @@ The :mod:`!glob` module defines the following functions: .. function:: translate(pathname, *, recursive=False, include_hidden=False, seps=None) Convert the given path specification to a regular expression for use with - :func:`re.match`. The path specification can contain shell-style wildcards. + :func:`re.prefixmatch`. The path specification can contain shell-style + wildcards. 
For example: @@ -140,7 +141,7 @@ The :mod:`!glob` module defines the following functions: >>> regex '(?s:(?:.+/)?[^/]*\\.txt)\\z' >>> reobj = re.compile(regex) - >>> reobj.match('foo/bar/baz.txt') + >>> reobj.prefixmatch('foo/bar/baz.txt') Path separators and segments are meaningful to this function, unlike From 2fd623670bd34adb920cf9d9288bd3fd3a80c861 Mon Sep 17 00:00:00 2001 From: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com> Date: Sat, 4 Apr 2026 18:44:55 +0300 Subject: [PATCH 04/11] Update typing to search --- Doc/library/typing.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Doc/library/typing.rst b/Doc/library/typing.rst index 09e9103e1b80d0..2ce868cf84da9d 100644 --- a/Doc/library/typing.rst +++ b/Doc/library/typing.rst @@ -3797,7 +3797,7 @@ Aliases to other concrete types Match Deprecated aliases corresponding to the return types from - :func:`re.compile` and :func:`re.match`. + :func:`re.compile` and :func:`re.search`. These types (and the corresponding functions) are generic over :data:`AnyStr`. 
``Pattern`` can be specialised as ``Pattern[str]`` or From ca8b57186c4deccf5bc1a4f6a50379809d318a6e Mon Sep 17 00:00:00 2001 From: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com> Date: Sat, 4 Apr 2026 18:45:48 +0300 Subject: [PATCH 05/11] Update logging-cookbook to prefixmatch --- Doc/howto/logging-cookbook.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Doc/howto/logging-cookbook.rst b/Doc/howto/logging-cookbook.rst index 21df6ba858d8c2..0ee4c0086dd98c 100644 --- a/Doc/howto/logging-cookbook.rst +++ b/Doc/howto/logging-cookbook.rst @@ -3877,7 +3877,7 @@ subclassed handler which looks something like this:: def format(self, record): version = 1 asctime = dt.datetime.fromtimestamp(record.created).isoformat() - m = self.tz_offset.match(time.strftime('%z')) + m = self.tz_offset.prefixmatch(time.strftime('%z')) has_offset = False if m and time.timezone: hrs, mins = m.groups() From 1f263e944c67a4211c4c08e054a5d4e1a6bde2ac Mon Sep 17 00:00:00 2001 From: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com> Date: Sat, 4 Apr 2026 18:47:41 +0300 Subject: [PATCH 06/11] Preserve HTML ID --- Doc/howto/regex.rst | 2 ++ 1 file changed, 2 insertions(+) diff --git a/Doc/howto/regex.rst b/Doc/howto/regex.rst index 5eaac214aacf5b..1d3c2ae73ac5af 100644 --- a/Doc/howto/regex.rst +++ b/Doc/howto/regex.rst @@ -1275,6 +1275,8 @@ In short, before turning to the :mod:`re` module, consider whether your problem can be solved with a faster and simpler string method. +.. 
_match-versus-search: + prefixmatch() versus search() ----------------------------- From 4090e690c8f702ad254d5a68f3dbce7ebd405771 Mon Sep 17 00:00:00 2001 From: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com> Date: Sat, 4 Apr 2026 22:07:01 +0300 Subject: [PATCH 07/11] Mention previous name --- Doc/howto/regex.rst | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/Doc/howto/regex.rst b/Doc/howto/regex.rst index 1d3c2ae73ac5af..fceda4e609da38 100644 --- a/Doc/howto/regex.rst +++ b/Doc/howto/regex.rst @@ -366,7 +366,8 @@ for a complete listing. | | location where this RE matches. | +------------------+-----------------------------------------------+ | ``prefixmatch()``| Determine if the RE matches at the beginning | -| | of the string. | +| | of the string. Previously named :ref:`match() | +| | `. | +------------------+-----------------------------------------------+ | ``findall()`` | Find all substrings where the RE matches, and | | | returns them as a list. | From ff22eb27390bf4c0412f5129ee3f6093f9063ca0 Mon Sep 17 00:00:00 2001 From: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com> Date: Sat, 4 Apr 2026 22:17:09 +0300 Subject: [PATCH 08/11] Revise 'prefixmatch() versus search()' --- Doc/howto/regex.rst | 31 +++++++++++++------------------ 1 file changed, 13 insertions(+), 18 deletions(-) diff --git a/Doc/howto/regex.rst b/Doc/howto/regex.rst index fceda4e609da38..6f5ebaf706af90 100644 --- a/Doc/howto/regex.rst +++ b/Doc/howto/regex.rst @@ -1281,18 +1281,23 @@ can be solved with a faster and simpler string method. prefixmatch() versus search() ----------------------------- -The :func:`~re.prefixmatch` function only checks if the RE matches at the beginning of the -string while :func:`~re.search` will scan forward through the string for a match. -It's important to keep this distinction in mind. 
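Reviewer note on the table entry above, which records that ``prefixmatch()`` was previously named ``match()``: code that has to run on both older and newer interpreters could pick whichever name is available. This is a minimal sketch, assuming ``re.prefixmatch`` behaves exactly like ``re.match`` wherever it exists; the module-level ``prefixmatch`` alias and the ``starts_with_word`` helper are hypothetical names used only for illustration.

```python
import re

# Assumption: newer Pythons expose re.prefixmatch; older ones only have
# re.match. getattr() falls back to the old name when the new one is absent.
prefixmatch = getattr(re, "prefixmatch", re.match)


def starts_with_word(text):
    """Return True if text begins with one or more lowercase letters."""
    return prefixmatch(r"[a-z]+", text) is not None


# Anchored at the start: succeeds only when the match begins at index 0.
print(prefixmatch("super", "superstition").span())  # (0, 5)
print(prefixmatch("super", "insuperable"))          # None

# search() scans forward and reports the first match anywhere.
print(re.search("super", "insuperable").span())     # (2, 7)

print(starts_with_word("tempo"))        # True
print(starts_with_word("::: message"))  # False
```

Resolving the name once at import time keeps call sites identical on every supported version.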
Remember, :func:`!prefixmatch` will -only report a successful match which will start at 0; if the match wouldn't -start at zero, :func:`!prefixmatch` will *not* report it. :: +:func:`~re.prefixmatch` was added in Python 3.15 as the :ref:`preferred name +` for :func:`~re.match`. Before this, it was only known +as :func:`!match` and the distinction with :func:`~re.search` was often +misunderstood. + +:func:`!prefixmatch` aka :func:`!match` only checks if the RE matches at the +beginning of the string while :func:`!search` scans forward through the +string for a match. :func:`!prefixmatch` only reports a successful match which +starts at zero; if the match wouldn't start at zero, :func:`!prefixmatch` will +*not* report it. :: >>> print(re.prefixmatch('super', 'superstition').span()) (0, 5) >>> print(re.prefixmatch('super', 'insuperable')) None -On the other hand, :func:`~re.search` will scan forward through the string, +On the other hand, :func:`~re.search` scans forward through the string, reporting the first match it finds. :: >>> print(re.search('super', 'superstition').span()) @@ -1300,18 +1305,8 @@ reporting the first match it finds. :: >>> print(re.search('super', 'insuperable').span()) (2, 7) -Sometimes you'll be tempted to keep using :func:`re.prefixmatch`, and just add ``.*`` -to the front of your RE. Resist this temptation and use :func:`re.search` -instead. The regular expression compiler does some analysis of REs in order to -speed up the process of looking for a match. One such analysis figures out what -the first character of a match must be; for example, a pattern starting with -``Crow`` must match starting with a ``'C'``. The analysis lets the engine -quickly scan through the string looking for the starting character, only trying -the full match if a ``'C'`` is found. - -Adding ``.*`` defeats this optimization, requiring scanning to the end of the -string and then backtracking to find a match for the rest of the RE. Use -:func:`re.search` instead. 
+This distinction is important to remember when using the old :func:`~re.match` +name in code requiring compatibility with older Python versions. Greedy versus Non-Greedy From 920a69581a0981035bc9de4a560483e9515a2fde Mon Sep 17 00:00:00 2001 From: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com> Date: Sat, 4 Apr 2026 22:22:45 +0300 Subject: [PATCH 09/11] Trim a bit more from 'prefixmatch() versus search()' --- Doc/howto/regex.rst | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/Doc/howto/regex.rst b/Doc/howto/regex.rst index 6f5ebaf706af90..b7777f027cafab 100644 --- a/Doc/howto/regex.rst +++ b/Doc/howto/regex.rst @@ -1288,9 +1288,7 @@ misunderstood. :func:`!prefixmatch` aka :func:`!match` only checks if the RE matches at the beginning of the string while :func:`!search` scans forward through the -string for a match. :func:`!prefixmatch` only reports a successful match which -starts at zero; if the match wouldn't start at zero, :func:`!prefixmatch` will -*not* report it. :: +string for a match. :: >>> print(re.prefixmatch('super', 'superstition').span()) (0, 5) From 96d80f484bae588d1972b5b95685868626bfe37f Mon Sep 17 00:00:00 2001 From: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com> Date: Sat, 4 Apr 2026 23:13:41 +0300 Subject: [PATCH 10/11] Improve title Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com> --- Doc/howto/regex.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Doc/howto/regex.rst b/Doc/howto/regex.rst index b7777f027cafab..be9bb11211d923 100644 --- a/Doc/howto/regex.rst +++ b/Doc/howto/regex.rst @@ -1278,7 +1278,7 @@ can be solved with a faster and simpler string method. .. 
_match-versus-search: -prefixmatch() versus search() +prefixmatch() (aka match) versus search() ----------------------------- :func:`~re.prefixmatch` was added in Python 3.15 as the :ref:`preferred name From d2bdc8ea199a0076d0b22c85423f9a9e83da19aa Mon Sep 17 00:00:00 2001 From: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com> Date: Sat, 4 Apr 2026 23:29:59 +0300 Subject: [PATCH 11/11] Fix underline length to fix docs build --- Doc/howto/regex.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Doc/howto/regex.rst b/Doc/howto/regex.rst index be9bb11211d923..66a5fc6d053354 100644 --- a/Doc/howto/regex.rst +++ b/Doc/howto/regex.rst @@ -1279,7 +1279,7 @@ can be solved with a faster and simpler string method. .. _match-versus-search: prefixmatch() (aka match) versus search() ------------------------------ +----------------------------------------- :func:`~re.prefixmatch` was added in Python 3.15 as the :ref:`preferred name ` for :func:`~re.match`. Before this, it was only known