
fix(parser): process escape sequences inside \Q...\E quotemeta regions #553

Merged
fglock merged 1 commit into master from fix/quotemeta-escape-processing
Apr 24, 2026


@fglock fglock commented Apr 24, 2026

Summary

Fixes escape-sequence handling inside \Q...\E quotemeta regions in double-quoted strings.

Previously, StringDoubleQuoted used an inQuotemeta flag that disabled normal escape-sequence processing once \Q was encountered. Sequences like \t, \n, \\, \x41, \041, \cA, \$, \@ were passed through as a literal backslash + following char before quotemeta() ran, producing extra backslashes. Real Perl first decodes string escapes, then applies quotemeta.
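The decode-then-quote order described above can be modeled in a few lines of Python. This is only a sketch, not jperl's implementation: `quotemeta` here is a hypothetical ASCII-only stand-in for Perl's built-in (it backslashes every character outside `[A-Za-z0-9_]`), and the `decoded` table lists what each region contains once its string escapes have been processed.

```python
import re


def quotemeta(s: str) -> str:
    """ASCII-only model of Perl's quotemeta (assumption: no non-ASCII input)."""
    # Put a backslash in front of every character that is not a word character.
    return re.sub(r"([^A-Za-z0-9_])", r"\\\1", s)


# Perl source literal -> region content AFTER escape decoding.
decoded = {
    r'"\Q\t\E"': "\t",      # tab
    r'"\Q\\\E"': "\\",      # one backslash
    r'"\Q\x41\E"': "A",     # hex escape
    r'"\Q\041\E"': "!",     # octal escape
    r'"\Q\cA\E"': "\x01",   # control-A
    r'"\Q\$x\E"': "$x",     # escaped sigil, no interpolation
    r'"\Q\@a\E"': "@a",
}

# quotemeta runs on the decoded text, never on the raw escape sequence.
results = {src: quotemeta(text) for src, text in decoded.items()}

for src, out in results.items():
    print(f"{src:14} -> {out!r} (len {len(out)})")
```

Applying `quotemeta` twice to `a.b` in this model gives `a\\\.b`, which is the nested `\Q\Q...\E\E` behaviour the PR attributes to real Perl.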

Verified against real Perl byte-for-byte:

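Numbers in parentheses are the length of the resulting string.

| input | real Perl | jperl (before) | jperl (after) |
| --- | --- | --- | --- |
| `"\Q\t\E"` | `\<tab>` (2) | `\\t` (3) | `\<tab>` (2) |
| `"\Q\\\E"` | `\\` (2) | `\\\\` (4) | `\\` (2) |
| `"\Q\x41\E"` | `A` (1) | `\\x41` (5) | `A` (1) |
| `"\Q\041\E"` | `\!` (2) | `\\041` (5) | `\!` (2) |
| `"\Q\$x\E"` | `\$x` (3) | `\\` (2) | `\$x` (3) |
| `"\Q\@a\E"` | `\@a` (3) | `\\` (2) | `\@a` (3) |
| `"\Q\cA\E"` | `\<0x01>` (2) | `\\cA` (4) | `\<0x01>` (2) |
| `"\Q\Qa.b\E\E"` | `a\\\.b` (double-escaped) | `a\.b` (idempotent, wrong) | `a\\\.b` |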

Fix

Remove the inQuotemeta flag entirely. \Q simply pushes a "Q" case modifier onto the existing case-modifier stack (like \U/\L); escape processing and variable interpolation continue normally inside the region; \E pops and wraps the accumulated content in quotemeta(). This also fixes the related bug where nested \Q\Q...\E\E was incorrectly treated as idempotent — real Perl applies quotemeta twice.
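The stack-based approach can be sketched as follows. This is a simplified Python model under stated assumptions, not the actual StringDoubleQuoted code: the names `interpolate` and `SIMPLE` are hypothetical, only `\Q`, `\E`, `\xHH` (exactly two hex digits) and a handful of single-character escapes are handled, variable interpolation is not modeled, and a region left open at the end of the string is closed implicitly.

```python
import re


def quotemeta(s: str) -> str:
    """ASCII-only model of Perl's quotemeta."""
    return re.sub(r"([^A-Za-z0-9_])", r"\\\1", s)


# Single-character escapes understood by this sketch (hypothetical subset).
SIMPLE = {"t": "\t", "n": "\n", "\\": "\\", "$": "$", "@": "@"}


def interpolate(src: str) -> str:
    """Process the body of a double-quoted string (tiny subset)."""
    stack = [[]]  # stack[0] is the top-level buffer; each open \Q adds one
    i = 0
    while i < len(src):
        ch = src[i]
        if ch != "\\" or i + 1 == len(src):
            stack[-1].append(ch)  # ordinary character (or trailing backslash)
            i += 1
            continue
        nxt = src[i + 1]
        i += 2
        if nxt == "Q":
            stack.append([])  # open a region; escapes keep working inside it
        elif nxt == "E":
            if len(stack) > 1:  # a stray \E is ignored in this sketch
                region = "".join(stack.pop())
                stack[-1].append(quotemeta(region))
        elif nxt == "x":
            stack[-1].append(chr(int(src[i:i + 2], 16)))
            i += 2
        else:
            stack[-1].append(SIMPLE.get(nxt, nxt))
    while len(stack) > 1:  # close any region left open at end of string
        region = "".join(stack.pop())
        stack[-1].append(quotemeta(region))
    return "".join(stack[0])


print(repr(interpolate(r"\Q\Qa.b\E\E")))
```

Because quoting happens only when a region is popped, there is no separate "inside quotemeta" code path for escapes, and nesting falls out naturally: the inner region is quoted once on its own `\E` and again when the outer region closes.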

Net change: -37 / +10 lines; the flag-based special case is gone.

Impact

Investigated via jcpan -t WWW::Wikipedia:

  • Text::Reform 1.20: t/reform.t test 37 (form("<<<<<\Q<[^|>]\\\E",123)) now passes. Module now installs cleanly via jcpan, unblocking Text::Autoformat and WWW::Wikipedia. Result: PASS (was FAIL).
  • WWW::Wikipedia 2.05: With Text::Reform installed, the 11 test files that were dying on Can't locate Text/Reform.pm now load the module. Remaining failures are live network calls using http:// URLs (Wikipedia now only serves HTTPS; port 80 connect fails through our LWP stack) — unrelated to this change.
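As a worked check of the Text::Reform case, the value of the string literal `"<<<<<\Q<[^|>]\\\E"` can be computed by hand under the decode-then-quote rule. The Python below is a sketch with a hypothetical ASCII-only `quotemeta` helper; it only reproduces the string value and does not run Text::Reform or its test suite.

```python
import re


def quotemeta(s: str) -> str:
    """ASCII-only model of Perl's quotemeta."""
    return re.sub(r"([^A-Za-z0-9_])", r"\\\1", s)


prefix = "<<<<<"            # outside the \Q...\E region, left untouched
region = "<[^|>]" + "\\"    # region content after \\ decodes to one backslash

template = prefix + quotemeta(region)
print(template)  # <<<<<\<\[\^\|\>\]\\
```

Every character in the region is a non-word character, so each one gains exactly one backslash: 7 characters become 14, and the full string is 19 characters long.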

Test plan

  • Verified all 11 escape edge cases match real Perl byte-for-byte
  • make — all unit tests pass
  • ./jcpan -t Text::Reform → Result: PASS
  • ./jcpan -t WWW::Wikipedia → went from 11/13 test files failing to 6/13; all remaining failures are network-related, not parser-related

Generated with Devin

Previously, StringDoubleQuoted used an `inQuotemeta` flag that disabled
normal escape-sequence handling once \Q was encountered. As a result,
sequences like \t, \n, \\, \x41, \041, \cA, \$ and \@ were not decoded
inside \Q...\E — they were passed through as literal backslash + char
before quotemeta() ran, producing extra backslashes.

Real Perl first applies string escapes, then quotemeta. For example:

    "\Q\\\E"   -> real: len=2 (\\)       jperl (old): len=4
    "\Q\t\E"   -> real: len=2 (\<tab>)   jperl (old): len=3 (\\t)
    "\Q\x41\E" -> real: len=1 (A)        jperl (old): len=5

This also fixed a related bug where nested \Q\Q...\E\E was incorrectly
treated as idempotent instead of applying quotemeta twice.

Fix: remove the `inQuotemeta` flag entirely. \Q simply pushes a "Q" case
modifier onto the existing stack; \E pops and wraps the accumulated
content in quotemeta(). Escape processing and variable interpolation
continue to work normally inside the region, exactly matching Perl.

Impact:
- Text::Reform 1.20: t/reform.t test 37 now passes (was failing on
  `form("<<<<<\Q<[^|>]\\\E",123)`); module now installs cleanly via
  jcpan, unblocking Text::Autoformat and WWW::Wikipedia.
- WWW::Wikipedia: with Text::Reform installed, the 11 tests that were
  failing on `Can't locate Text/Reform.pm` now load the module. The
  remaining failures are live network calls over http:// (Wikipedia
  redirects to https://), unrelated to this change.

Verified all 11 escape edge cases now match real Perl byte-for-byte.
All unit tests pass.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
@fglock fglock merged commit 6f96f1c into master Apr 24, 2026
2 checks passed
@fglock fglock deleted the fix/quotemeta-escape-processing branch April 24, 2026 11:39