fix(parser): process escape sequences inside \Q...\E quotemeta regions #553
Merged
Previously, StringDoubleQuoted used an `inQuotemeta` flag that disabled
normal escape-sequence handling once \Q was encountered. As a result,
sequences like \t, \n, \\, \x41, \041, \cA, \$ and \@ were not decoded
inside \Q...\E — they were passed through as literal backslash + char
before quotemeta() ran, producing extra backslashes.
Real Perl first applies string escapes, then quotemeta. For example:
    "\Q\\\E"   -> real: len=2 (\\)       jperl (old): len=4
    "\Q\t\E"   -> real: len=2 (\<tab>)   jperl (old): len=3 (\\t)
    "\Q\x41\E" -> real: len=1 (A)        jperl (old): len=5
This also fixed a related bug where nested \Q\Q...\E\E was incorrectly
treated as idempotent instead of applying quotemeta twice.
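
The decode-then-quotemeta ordering described above can be modelled in a few lines. The Python sketch below is illustrative only and is not the jperl implementation: `quotemeta` is an ASCII-only approximation of Perl's built-in, and `decode_escapes`, `old_region`, and `new_region` are hypothetical helper names that cover only the escapes listed in this message (no variable interpolation).

```python
import re

def quotemeta(s):
    # ASCII-only model of Perl's quotemeta (assumption): put a backslash
    # before every character outside [A-Za-z0-9_].
    return re.sub(r"([^A-Za-z0-9_])", r"\\\1", s)

# Single-character escapes handled by this sketch.
SIMPLE = {"t": "\t", "n": "\n", "\\": "\\", "$": "$", "@": "@"}

def decode_escapes(body):
    # Decode \t, \n, \\, \$, \@, \xHH, \cX and \OOO in the raw source text.
    out, i = [], 0
    while i < len(body):
        if body[i] != "\\" or i + 1 == len(body):
            out.append(body[i])
            i += 1
            continue
        c = body[i + 1]
        if c == "x":                      # \x41 -> "A" (two hex digits assumed)
            out.append(chr(int(body[i + 2:i + 4], 16)))
            i += 4
        elif c == "c":                    # \cA -> chr(1)
            out.append(chr(ord(body[i + 2].upper()) ^ 64))
            i += 3
        elif c in "01234567":             # \041 -> "!" (up to three octal digits)
            digits = re.match(r"[0-7]{1,3}", body[i + 1:]).group()
            out.append(chr(int(digits, 8)))
            i += 1 + len(digits)
        else:                             # simple escape, or drop the backslash
            out.append(SIMPLE.get(c, c))
            i += 2
    return "".join(out)

def old_region(body):
    # Flag-based behaviour: the raw source text goes straight to quotemeta.
    return quotemeta(body)

def new_region(body):
    # Fixed behaviour: decode escapes first, then quotemeta.
    return quotemeta(decode_escapes(body))

# Each body is the raw source text between \Q and \E.
for body in [r"\\", r"\t", r"\x41", r"\041", r"\cA"]:
    print(repr(body), "new:", len(new_region(body)), "old:", len(old_region(body)))
```

For the bodies `\\`, `\t`, and `\x41`, the lengths this model prints agree with the examples above (2 vs 4, 2 vs 3, and 1 vs 5).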
Fix: remove the `inQuotemeta` flag entirely. \Q simply pushes a "Q" case
modifier onto the existing stack; \E pops and wraps the accumulated
content in quotemeta(). Escape processing and variable interpolation
continue to work normally inside the region, exactly matching Perl.
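
The stack-based approach can be sketched as follows. This is a simplified Python model, not the actual `StringDoubleQuoted` code: `interpolate` and its helpers are hypothetical names, only the "Q" modifier is modelled (no \U/\L, no variable interpolation), `quotemeta` is an ASCII-only approximation, and the handling of a stray \E or an unterminated \Q is an assumption of the sketch.

```python
import re

def quotemeta(s):
    # ASCII-only model of Perl's quotemeta (assumption): put a backslash
    # before every character outside [A-Za-z0-9_].
    return re.sub(r"([^A-Za-z0-9_])", r"\\\1", s)

# Small subset of escapes; anything else just loses its backslash.
SIMPLE = {"t": "\t", "n": "\n", "\\": "\\", "$": "$", "@": "@"}

def interpolate(src):
    # Process the raw body of a double-quoted string.
    out = []      # top-level output pieces
    stack = []    # open modifiers as (name, buffer) pairs, innermost last

    def target():
        # Text goes to the innermost open region, else to the top level.
        return stack[-1][1] if stack else out

    def close():
        # Pop one modifier and wrap what it accumulated in quotemeta.
        _name, buf = stack.pop()
        target().append(quotemeta("".join(buf)))

    i = 0
    while i < len(src):
        if src[i] == "\\" and i + 1 < len(src):
            nxt = src[i + 1]
            i += 2
            if nxt == "Q":
                stack.append(("Q", []))   # \Q only pushes; no special flag
            elif nxt == "E":
                if stack:                 # a stray \E is ignored in this sketch
                    close()
            else:
                # Escapes are decoded normally, inside or outside \Q...\E.
                target().append(SIMPLE.get(nxt, nxt))
        else:
            target().append(src[i])
            i += 1
    while stack:                          # unterminated \Q runs to end of string
        close()
    return "".join(out)

print(interpolate(r"\Qa.b\E"))
print(interpolate(r"\Q\Qa.b\E\E"))
```

Because each \E wraps only the innermost region, the nested case applies `quotemeta` twice rather than once, and escapes inside the region are decoded before any quoting happens.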
Impact:
- Text::Reform 1.20: t/reform.t test 37 now passes (was failing on
`form("<<<<<\Q<[^|>]\\\E",123)`); module now installs cleanly via
jcpan, unblocking Text::Autoformat and WWW::Wikipedia.
- WWW::Wikipedia: with Text::Reform installed, the 11 tests that were
failing on `Can't locate Text/Reform.pm` now load the module. The
remaining failures are live network calls over http:// (Wikipedia
redirects to https://), unrelated to this change.
Verified all 11 escape edge cases now match real Perl byte-for-byte.
All unit tests pass.
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Summary
Fixes escape-sequence handling inside `\Q...\E` quotemeta regions in double-quoted strings.

Previously, `StringDoubleQuoted` used an `inQuotemeta` flag that disabled normal escape-sequence processing once `\Q` was encountered. Sequences like `\t`, `\n`, `\\`, `\x41`, `\041`, `\cA`, `\$`, `\@` were passed through as a literal backslash + following char before `quotemeta()` ran, producing extra backslashes. Real Perl first decodes string escapes, then applies quotemeta.

Verified against real Perl byte-for-byte (result length in parentheses):

| Input | Real Perl | jperl (old) | jperl (new) |
|---|---|---|---|
| `"\Q\t\E"` | `\<tab>` (2) | `\\t` (3) | `\<tab>` (2) |
| `"\Q\\\E"` | `\\` (2) | `\\\\` (4) | `\\` (2) |
| `"\Q\x41\E"` | `A` (1) | `\\x41` (5) | `A` (1) |
| `"\Q\041\E"` | `\!` (2) | `\\041` (5) | `\!` (2) |
| `"\Q\$x\E"` | `\$x` (3) | `\\` (2) | `\$x` (3) |
| `"\Q\@a\E"` | `\@a` (3) | `\\` (2) | `\@a` (3) |
| `"\Q\cA\E"` | `\<0x01>` (2) | `\\cA` (4) | `\<0x01>` (2) |
| `"\Q\Qa.b\E\E"` | `a\\\.b` (double-escaped) | `a\.b` (idempotent, wrong) | `a\\\.b` |

Fix
Remove the `inQuotemeta` flag entirely. `\Q` simply pushes a `"Q"` case modifier onto the existing case-modifier stack (like `\U`/`\L`); escape processing and variable interpolation continue normally inside the region; `\E` pops and wraps the accumulated content in `quotemeta()`. This also fixes the related bug where nested `\Q\Q...\E\E` was incorrectly treated as idempotent; real Perl applies `quotemeta` twice.

Net change: -37 / +10 lines; the flag-based special case is gone.
Impact
Investigated via `jcpan -t WWW::Wikipedia`:

- `Text::Reform` 1.20: `t/reform.t` test 37 (`form("<<<<<\Q<[^|>]\\\E",123)`) now passes. Module now installs cleanly via `jcpan`, unblocking `Text::Autoformat` and `WWW::Wikipedia`. Result: PASS (was FAIL).
- `WWW::Wikipedia` 2.05: With `Text::Reform` installed, the 11 test files that were dying on `Can't locate Text/Reform.pm` now load the module. Remaining failures are live network calls using `http://` URLs (Wikipedia now only serves HTTPS; port 80 connect fails through our LWP stack), unrelated to this change.

Test plan
- `make`: all unit tests pass
- `./jcpan -t Text::Reform` → Result: PASS
- `./jcpan -t WWW::Wikipedia` → went from 11/13 test files failing to 6/13; all remaining failures are network-related (not parser)

Generated with Devin