Bug
The GNU case-folding escapes inside the s/// replacement (\U, \L, \u,
\l, \E) are not supported. Today they are emitted literally: \U comes
out as the letter U, while \l is passed through with its backslash intact.
Reproduction
$ echo abcDEF | /usr/bin/sed 's/.*/\U&/'
ABCDEF
$ echo abcDEF | ./target/release/sed 's/.*/\U&/'
UabcDEF # \U treated as literal 'U'
$ echo ABCDEF | /usr/bin/sed 's/.*/\l&/'
aBCDEF
$ echo ABCDEF | ./target/release/sed 's/.*/\l&/'
\lABCDEF # \l passed through with the backslash
The same issue affects \L (lowercase everything until \E or the end of
the replacement), \u (uppercase the next character only), and \E (end
the preceding \U or \L).
What it should do
GNU semantics:
| Escape | Meaning |
|--------|---------|
| \U | uppercase all output until \E or end of replacement |
| \u | uppercase the very next output character |
| \L | lowercase all output until \E or end |
| \l | lowercase the very next output character |
| \E | terminate the effect of the most recent \U/\L |
These nest and apply to text from backreferences (\1, &) as well as
literal text in the replacement.
Suspected place to add it
src/sed/compiler.rs:649 — compile_replacement builds the
replacement template (the structure that processor.rs renders).
Add new template-element variants for each case-folding state, then
have the renderer track a small "case mode" stack while writing
output.
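A minimal sketch of what the renderer side could look like, written as a standalone model rather than against the real code. The names Part, Mode, and render are hypothetical and do not come from compiler.rs or processor.rs; group index 0 is assumed to stand for &. The sketch keeps one range mode plus one pending one-shot mode instead of a full stack, which is a simplification that may or may not be enough for the nesting cases in the testsuite.

```rust
/// Hypothetical template element; the real type in compiler.rs may differ.
#[derive(Debug, Clone, PartialEq)]
enum Part {
    Literal(String),
    Group(usize), // \1..\9; index 0 is assumed to stand for &
    Upper,        // \U
    Lower,        // \L
    UpperNext,    // \u
    LowerNext,    // \l
    End,          // \E
}

/// Case conversion currently in force.
#[derive(Clone, Copy, PartialEq)]
enum Mode {
    Keep,
    Upper,
    Lower,
}

/// Append `c` to `out`, converted according to `mode`.
fn apply(mode: Mode, c: char, out: &mut String) {
    match mode {
        Mode::Keep => out.push(c),
        Mode::Upper => out.extend(c.to_uppercase()),
        Mode::Lower => out.extend(c.to_lowercase()),
    }
}

/// Render a replacement template against the captured groups.
fn render(parts: &[Part], groups: &[&str]) -> String {
    let mut out = String::new();
    let mut range = Mode::Keep; // set by \U or \L, cleared by \E
    let mut next = Mode::Keep; // set by \u or \l, consumed by one character

    for part in parts {
        let text: &str = match part {
            Part::Literal(s) => s.as_str(),
            // A missing group renders as empty text in this sketch.
            Part::Group(i) => groups.get(*i).copied().unwrap_or(""),
            Part::Upper => { range = Mode::Upper; continue; }
            Part::Lower => { range = Mode::Lower; continue; }
            Part::UpperNext => { next = Mode::Upper; continue; }
            Part::LowerNext => { next = Mode::Lower; continue; }
            Part::End => { range = Mode::Keep; continue; }
        };
        for c in text.chars() {
            // A pending one-shot wins over the range mode for one character.
            let mode = if next != Mode::Keep {
                std::mem::replace(&mut next, Mode::Keep)
            } else {
                range
            };
            apply(mode, c, &mut out);
        }
    }
    out
}
```

With this model, render(&[Part::Upper, Part::Group(0)], &["abcDEF"]) yields "ABCDEF" and render(&[Part::LowerNext, Part::Group(0)], &["ABCDEF"]) yields "aBCDEF", matching the /usr/bin/sed output in the reproduction.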
Look at how \1/& are handled (around compiler.rs:1144 and the
test at compile_replacement_backrefs_and_literal) — that's the
pattern to mirror.
Note: the test subst-replacement.sh also checks that \E ends a
range, that \U\1\E\2\3 only uppercases the first group, etc.
Affected GNU testsuite tests
subst-replacement and posix-mode-s. In posix-mode-s, the absence of \l
handling is the expected POSIX behavior, so the GNU branch must produce
different output from the POSIX branch. The fix should therefore be
gated on "not --posix" so that the existing POSIX-mode pass-through is
preserved.
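One way the gating could be expressed is a small classifier that the replacement parser calls after seeing a backslash. This is a sketch under assumptions: CaseOp and case_escape are hypothetical names, and the posix parameter stands in for however the --posix setting actually reaches compile_replacement. Returning None in POSIX mode lets the caller fall through to whatever it does today, so the existing pass-through stays untouched.

```rust
/// Hypothetical marker for a recognised case-folding escape.
#[derive(Debug, Clone, Copy, PartialEq)]
enum CaseOp {
    Upper,     // \U
    Lower,     // \L
    UpperNext, // \u
    LowerNext, // \l
    End,       // \E
}

/// Classify the character that follows a backslash in the replacement.
/// In POSIX mode nothing is recognised, so existing handling applies.
fn case_escape(c: char, posix: bool) -> Option<CaseOp> {
    if posix {
        return None;
    }
    match c {
        'U' => Some(CaseOp::Upper),
        'L' => Some(CaseOp::Lower),
        'u' => Some(CaseOp::UpperNext),
        'l' => Some(CaseOp::LowerNext),
        'E' => Some(CaseOp::End),
        _ => None, // not a case-folding escape
    }
}
```

Keeping the check in one function means the POSIX and GNU branches differ in exactly one place, which should make the posix-mode-s expectation easy to verify.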