
fix(require): preserve $@/$! across on_scope_end callbacks#569

Merged
fglock merged 4 commits into master from
fix/onscope-end-preserve-error-vars
Apr 27, 2026

fglock commented Apr 27, 2026

Summary

When a use Foo; inside a module being loaded throws "Can't locate
Foo.pm in @INC", ModuleOperators.doFile's catch block stores that
message in $@. The finally block then calls
BHooksEndOfScope.endFileLoad, which fires the on_scope_end
callbacks registered for the file. Some of those callbacks (notably
namespace::autoclean's cleanup routine) internally use eval { ... }
blocks, and Perl resets $@ to "" when an eval succeeds, clobbering
the inner cause. By the time the outer require reads $@ it is empty
and only $! (ENOENT) is left, so the outer require falls into the
"file not found" branch and rebuilds an error message naming the
outer module.

This makes the reported failure point at the wrong module. For example,
jcpan -t Text::WordCounter reports

Can't locate Text/WordCounter.pm in @INC (you may need to install
the Text::WordCounter module) (@INC entries checked: blib/lib …)

even though blib/lib/Text/WordCounter.pm exists; the real failure
is Can't locate Lingua/ZH/MMSEG.pm in @INC from a use inside
Text/WordCounter.pm.

Reproducer

# /tmp/MyWC.pm
package MyWC;
use namespace::autoclean;
use Moose;
use NotExistsBogus;
1;

$ jperl -I/tmp -e 'eval { require MyWC; }; print $@'
# before: Can't locate MyWC.pm in @INC ...   (misleading)
# after : Can't locate NotExistsBogus.pm in @INC ...  (correct)

Fix

BHooksEndOfScope.endFileLoad now saves $@/$! before invoking the
on_scope_end callback loop and restores them in a finally, so the
inner failure message survives.
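
The save-and-restore pattern described above can be sketched in Java as follows. This is a simplified model, not the project's code: the class `ScopeEndDemo`, its two static string fields standing in for `$@` and `$!`, and the methods `endFileLoadUnprotected`, `endFileLoadPreserving` and `requireMessage` are all hypothetical names introduced for illustration. The callback simply clears the modelled `$@`, as a successful `eval { ... }` would.

```java
import java.util.List;

// Hypothetical stand-in for the runtime state; NOT PerlOnJava's real classes.
final class ScopeEndDemo {
    // Simplified models of Perl's $@ (eval error) and $! (OS error text).
    static String evalError = "";
    static String osError = "";

    // Models a callback that runs a successful eval { ... }, which resets $@ to "".
    static final Runnable AUTOCLEAN_LIKE = () -> evalError = "";

    // Behaviour before the fix: callbacks run with no protection.
    static void endFileLoadUnprotected(List<Runnable> callbacks) {
        for (Runnable cb : callbacks) cb.run();
    }

    // Behaviour after the fix: snapshot both variables, restore them in finally.
    static void endFileLoadPreserving(List<Runnable> callbacks) {
        String savedEvalError = evalError;
        String savedOsError = osError;
        try {
            for (Runnable cb : callbacks) cb.run();
        } finally {
            evalError = savedEvalError;
            osError = savedOsError;
        }
    }

    // Returns the message the outer require would end up reporting.
    static String requireMessage(String outerFile, boolean preserve) {
        // The inner `use` failed: the catch block has stored the real cause.
        evalError = "Can't locate NotExistsBogus.pm in @INC";
        osError = "No such file or directory";
        List<Runnable> callbacks = List.of(AUTOCLEAN_LIKE);
        if (preserve) endFileLoadPreserving(callbacks);
        else endFileLoadUnprotected(callbacks);
        // Outer require: an empty $@ sends it down the "file not found" branch.
        return evalError.isEmpty()
                ? "Can't locate " + outerFile + " in @INC"
                : evalError;
    }

    public static void main(String[] args) {
        System.out.println("before: " + requireMessage("MyWC.pm", false));
        System.out.println("after : " + requireMessage("MyWC.pm", true));
    }
}
```

Restoring in a `finally` means the saved values come back even if a callback itself throws, which is the property the fix relies on.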

Test plan

  • make passes (all unit-test shards green)
  • Reproducer above shows the corrected error message
  • jcpan -t Text::WordCounter now reports the real underlying
    cause (Can't locate Lingua/ZH/MMSEG.pm in @INC) instead of a
    misleading "Text/WordCounter.pm" message
  • Configuration.java regenerated by injectGitInfo is included

Generated with Devin

fglock added a commit that referenced this pull request Apr 27, 2026
Mark Work items 1, 2, 3 complete (encoding.pm stub, $SIG handler
parity), document the "real" cause of the URI::Find failure (it
wasn't a regex-parity bug at all), and add Work item 5 covering
the newly-surfaced Unicode::UCD-from-JAR loader bug that's the
remaining blocker for `jcpan -t Text::WordCounter`.

Generated with [Devin](https://devin.ai)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
fglock commented Apr 27, 2026

Update — phases 2 and 3 of dev/modules/text_wordcounter.md

This branch now carries three bug-fix commits, each fixing a distinct
PerlOnJava parity bug surfaced by jcpan -t Text::WordCounter, plus a
docs commit:

1. 4850e9a83 — preserve $@/$! across on_scope_end callbacks (original)

Stops misleading Can't locate <outer-module>.pm errors when an
inner use Foo; fails inside a module that uses
namespace::autoclean + Moose.

2. 88f63c8c9 — bundle no-op encoding.pm pragma

The deprecated encoding pragma (removed from core in 5.26+) is
still loaded by older CJK CPAN modules. PerlOnJava parses sources
as UTF-8 unconditionally, so a no-op stub that honours explicit
filehandle-layer arguments is sufficient. Includes a unit test.
Result: Lingua::ZH::MMSEG 3/3 tests pass.

3. cd93247e4 — $SIG{__DIE__|__WARN__} = 'DEFAULT'/'IGNORE'

Real Perl 5 treats those two literal strings in %SIG as reserved
("no handler" / "ignore"). PerlOnJava was invoking them as
subroutine names — when URI::Find::find does
local $SIG{__DIE__} = 'DEFAULT'; and URI.pm then does
eval "require URI::git", the failure dispatched through a
bogus &main::DEFAULT() and clobbered $@, making git://
and svn+ssh:// URIs undetectable.
Result: URI::Find 578/578 subtests pass.

4. 7819bbe00 — plan docs

dev/modules/text_wordcounter.md updated to reflect what shipped
here, including a corrected explanation of the URI::Find failure
(not a regex-parity bug after all) and a new Work item 5 covering
the remaining blocker:

  • Loading Unicode::UCD from the shaded jar via the module-name
    form materialises only 3 of 44 subs, so
    Text::WordCounter::split_scripts dies with
    Undefined subroutine &Unicode::UCD::charinfo called. Loading
    the same file via an absolute path (or via do) materialises
    all 44 subs. Diagnosis is left for a follow-up PR.

Cumulative effect on jcpan -t Text::WordCounter

  • Lingua::Stem::Snowball::No: PASS
  • Lingua::Stem::Snowball::Se: PASS
  • Lingua::Stem: PASS
  • Lingua::ZH::MMSEG: PASS (was: FAIL)
  • URI::Find: PASS 578/578 (was: 576/578, 2 subtests failing)
  • Text::WordCounter: still fails — blocked on Work item 5

Test plan

  • make passes (all unit-test shards green)
  • New encoding_pragma.t unit test passes (6/6 subtests)
  • Lingua::ZH::MMSEG t/000-load.t, t/002-mmseg.t, t/003-fmm.t all pass
  • URI::Find t/Find.t passes 578/578 (previously 576/578)
  • Configuration.java regenerated by injectGitInfo is included

fglock and others added 4 commits April 27, 2026 14:27
When a `use Foo;` inside a module being loaded throws "Can't locate
Foo.pm in @INC", doFile's catch block stores that message in $@.
The finally block then fires on_scope_end callbacks registered for
the file (e.g. by namespace::autoclean). Those callbacks internally
use `eval { ... }` blocks, and Perl resets $@ to "" when an eval
succeeds, clobbering the inner cause. By the time the outer `require`
checks $@, only $! remains, so it falls into the "file not found"
branch and reports a misleading "Can't locate <outer-file>.pm in @INC"
instead of the real inner failure.

Save and restore $@/$! around the callback execution in
BHooksEndOfScope.endFileLoad so the inner failure message survives.

Reproducible with `jcpan -t Text::WordCounter` (Text::WordCounter
uses namespace::autoclean + Moose and pulls in Lingua::ZH::MMSEG;
the latter fails because PerlOnJava lacks the `encoding` pragma,
but the error was attributed to Text/WordCounter.pm itself).

Generated with [Devin](https://devin.ai)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Adds `src/main/perl/lib/encoding.pm` so older CPAN modules that still
write `use encoding 'utf8';` can load. The `encoding` pragma was
deprecated in Perl 5.18 and removed from core in 5.26+; PerlOnJava
parses sources as UTF-8 unconditionally, so the source-encoding
effects of the pragma are unnecessary.

The stub:
- accepts the historical import forms (`use encoding;`,
  `use encoding 'utf8';`, `use encoding 'utf8', STDIN => 'utf8'`);
- applies binmode :encoding(LAYER) for filehandles named in the
  import list (matching the historical pragma's filehandle behaviour);
- intentionally does NOT emulate chr/ord/length overrides;
- exposes `encoding::name()` returning "utf8" so old probes work.
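
How the import list maps to filehandle layers can be sketched in Java. This is an illustrative model only: the stub itself is a Perl file, and the class `EncodingImportDemo` and method `layersFor` are hypothetical names. The sketch assumes the first import argument is the source-encoding name (ignored here, since sources are always parsed as UTF-8) and that the remaining arguments are filehandle/encoding pairs.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the stub's import handling; not the shipped encoding.pm.
final class EncodingImportDemo {
    // Maps each filehandle named in the import list to the layer passed to binmode.
    static Map<String, String> layersFor(List<String> importArgs) {
        Map<String, String> layers = new LinkedHashMap<>();
        // Index 0 is the source-encoding name; it has no effect in this model.
        // The rest are NAME => ENCODING pairs, e.g. STDIN => 'utf8'.
        for (int i = 1; i + 1 < importArgs.size(); i += 2) {
            String handle = importArgs.get(i);
            String encoding = importArgs.get(i + 1);
            layers.put(handle, ":encoding(" + encoding + ")");
        }
        return layers;
    }

    public static void main(String[] args) {
        // Models: use encoding 'utf8', STDIN => 'utf8';
        System.out.println(layersFor(List.of("utf8", "STDIN", "utf8")));
    }
}
```

The bare forms `use encoding;` and `use encoding 'utf8';` yield an empty map in this model, which corresponds to the no-op case.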

Includes `src/test/resources/unit/encoding_pragma.t` covering all
historical import forms and the real-world `use utf8 + use Encode +
use encoding` combination from Lingua::ZH::MMSEG.

Result: `jcpan -t Lingua::ZH::MMSEG` now passes (3/3 tests).
Tracked in dev/modules/text_wordcounter.md (Work item 1).

Generated with [Devin](https://devin.ai)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Real Perl 5 treats two literal strings in %SIG as reserved:

  'DEFAULT' - use the OS / default disposition (for __DIE__ /
              __WARN__, equivalent to "no handler installed")
  'IGNORE'  - ignore the signal entirely (effective for __WARN__,
              ineffective + warns for __DIE__)

PerlOnJava's WarnDie.catchEval, WarnDie.die and WarnDie.warn were
all unconditionally invoking RuntimeCode.apply() on whatever was in
$SIG{__DIE__} / $SIG{__WARN__}. With the literal string 'DEFAULT'
that became `&main::DEFAULT()`, which croaked with "Undefined
subroutine &main::DEFAULT".

Real-world repro: URI::Find::find() does

    local $SIG{__DIE__} = 'DEFAULT';

then calls URI->new() which internally does `eval "require URI::git"`
to look up implementor classes for unknown schemes. URI::git doesn't
exist, so the eval-string failure dispatched through the bogus
'DEFAULT' "handler" and clobbered $@ inside _is_uri, making URIs
with non-standard schemes (git://, svn+ssh://) undetectable. This
caused two of URI-Find's t/Find.t subtests (355, 364) to fail.

Fix: introduce WarnDie.isReservedSigString() and gate the three
handler invocations on !isReservedSigString(sig). For __WARN__,
additionally honour 'IGNORE' by suppressing the STDERR write.
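
A minimal Java sketch of that gate follows. Only the name `isReservedSigString` comes from the commit message; its signature here, the class `SigGateDemo`, and the helpers `dieMessage` and `warnReachesStderr` are assumptions made for illustration, and every non-reserved handler string is modelled as an undefined subroutine. The sketch also treats 'IGNORE' on `__DIE__` as "no handler" and omits the warning mentioned above.

```java
// Hypothetical model of the handler gate; not PerlOnJava's WarnDie class.
final class SigGateDemo {
    // The two literal strings that must never be called as subroutine names.
    static boolean isReservedSigString(String handler) {
        return "DEFAULT".equals(handler) || "IGNORE".equals(handler);
    }

    // Returns what $@ would hold after a die with the given message.
    // `gated` selects the behaviour after the fix (true) or before it (false).
    static String dieMessage(String handler, String message, boolean gated) {
        boolean noHandler = handler == null
                || (gated && isReservedSigString(handler));
        if (noHandler) return message; // the original error survives
        // In this model every handler name is an undefined sub in package main.
        return "Undefined subroutine &main::" + handler + " called";
    }

    // For __WARN__, 'IGNORE' additionally suppresses the STDERR write.
    static boolean warnReachesStderr(String handler) {
        return !"IGNORE".equals(handler);
    }

    public static void main(String[] args) {
        String msg = "Can't locate NoSuchModule.pm in @INC";
        System.out.println("before: " + dieMessage("DEFAULT", msg, false));
        System.out.println("after : " + dieMessage("DEFAULT", msg, true));
    }
}
```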

Reproducer:

  $ jperl -e 'local $SIG{__DIE__} = "DEFAULT";
              eval q{ require NoSuchModule };
              print "[\$\@=$@]"'
  # before: [$@=Undefined subroutine &main::DEFAULT called ...]
  # after : [$@=Can't locate NoSuchModule.pm in @INC ...]

Result: URI::Find passes 578/578 subtests.
Tracked in dev/modules/text_wordcounter.md (Work item 3).

Generated with [Devin](https://devin.ai)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Mark Work items 1, 2, 3 complete (encoding.pm stub, $SIG handler
parity), document the "real" cause of the URI::Find failure (it
wasn't a regex-parity bug at all), and add Work item 5 covering
the newly-surfaced Unicode::UCD-from-JAR loader bug that's the
remaining blocker for `jcpan -t Text::WordCounter`.

Generated with [Devin](https://devin.ai)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
fglock force-pushed the fix/onscope-end-preserve-error-vars branch from 7819bbe to 70495a3 April 27, 2026 12:27
fglock merged commit 144f6f3 into master Apr 27, 2026
2 checks passed
fglock deleted the fix/onscope-end-preserve-error-vars branch April 27, 2026 13:52