Skip to content

fix: unescape escaped regex elements#145

Closed
cbachhuber wants to merge 1 commit into
masterfrom
feat/escape-regex-elements
Closed

fix: unescape escaped regex elements#145
cbachhuber wants to merge 1 commit into
masterfrom
feat/escape-regex-elements

Conversation

@cbachhuber

@cbachhuber cbachhuber commented Jun 2, 2026

Copy link
Copy Markdown
Collaborator

Otherwise, the hook raises paths like foo/bar\.txt as nonexistent, but this is accepted by pre-commit/prek.

@cbachhuber cbachhuber changed the title fix: escape regex elements fix: unescape regex elements Jun 2, 2026
@cbachhuber cbachhuber changed the title fix: unescape regex elements fix: unescape escaped regex elements Jun 2, 2026


def _unescape_literal_regex_elements(text: str) -> str:
return REGEX_ESCAPE_SEQUENCE.sub(r"\1", text)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is basically the solution from https://mentaljetsam.wordpress.com/2007/04/13/unescape-a-python-escaped-string/ for unescaping

@cbachhuber cbachhuber marked this pull request as ready for review June 2, 2026 12:57

@hofbi hofbi left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far I always used this as a chance to cleanup. Without the escape it is much more readable.

@cbachhuber

Copy link
Copy Markdown
Collaborator Author

That's fair. Such slightly overly general regexes never hurt us. Nobody will accidentally create foo?txt instead of foo.txt. Other hooks would probably fail as well.

Still, this behavior is implicit and users wonder why this hook supports less than pre-commit/prek does. In the words of copilot:

pre-commit/prek applies exclude patterns as Python regexes via re.search() against file paths. In Python regex, \. matches a literal dot — so at1\.dbc correctly excluded at1.dbc, maven_install\.json correctly excluded maven_install.json, etc.

The problem was only in the meta-hook (check-non-existing-and-duplicate-excludes), which has its own naive path-extraction logic that:

  1. Strips (?x)^(, removes ALL ), splits on |
  2. Takes remaining fragments and checks them as literal filesystem paths with Path(entry).resolve().exists()

Should we make this more obvious, for example by documenting?

@cbachhuber

Copy link
Copy Markdown
Collaborator Author

We'll continue with this in #146

@cbachhuber cbachhuber closed this Jun 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants