diff --git a/dev/design/nested-eval-string-lexicals.md b/dev/design/nested-eval-string-lexicals.md new file mode 100644 index 000000000..e8411fc61 --- /dev/null +++ b/dev/design/nested-eval-string-lexicals.md @@ -0,0 +1,303 @@ +# Fix: nested `eval STRING` cannot see outer `my` lexicals (interpreter backend) + +## Problem + +In standard Perl, a string eval's compile-time lexical scope includes every +`my`/`our`/`state` visible at the call site — including variables declared +inside an enclosing `eval STRING`. + +In PerlOnJava (default interpreter backend, i.e. not +`JPERL_EVAL_NO_INTERPRETER=1`), this works for most forms but fails for a +specific combination: **a named subroutine defined inside a nested `eval STRING` +that references a `my` declared in the outer `eval STRING`**. + +Minimal reproducer (discovered while fixing `Geo::IP`): + +```perl +eval q{ + use strict; + my $y = 99; + eval q{ sub bar { return $y } 1 }; + print $@; # PerlOnJava: Global symbol "$y" requires explicit package name... + # Real Perl: (prints nothing) +}; +``` + +Concrete impact: Geo::IP's pure-Perl path wraps v6 subs inside an inner +`eval <<'__IPV6__'` that references outer-eval `my @countries`, `@code3s`, +`@names`. All v6 methods silently fail to compile, causing `country_v6.t` +and half of `org.t` to die with "Can't locate object method …". + +## Root cause (verified) + +The bug is **interpreter-backend specific**. Setting +`JPERL_EVAL_NO_INTERPRETER=1` makes the reproducer work, because the JVM +backend (`backend/jvm/EmitEval.java` → `runtime/runtimetypes/RuntimeCode.java#evalStringHelper`) +correctly threads the outer eval's `capturedSymbolTable` through to the inner +eval's snapshot. When `handleMyOperator` sees `my $y` at the outer eval's +codegen, it calls `capturedSymbolTable.addVariable("$y", …)`. The inner eval's +`handleEvalOperator` then snapshots that table, so `$y` is present at the +inner parse. + +The **interpreter backend** behaves differently: + +1. 
At compile time, `backend/bytecode/CompileOperator.java:1097` captures the + caller's lexical pad as a `Map` via + `bytecodeCompiler.symbolTable.getVisibleVariableRegistry()` and stores it in + `BytecodeCompiler.evalSiteRegistries`. That registry is emitted as part of + the `EVAL` opcode operand and delivered at runtime. + +2. At runtime, `backend/bytecode/EvalStringHandler.java:120` + (`evalStringList`) creates a **fresh, empty** `ScopedSymbolTable` and + seeds it with only three entries: + + ```java + symbolTable.enterScope(); + symbolTable.addVariable("this", "", null); + symbolTable.addVariable("@_", "our", null); + symbolTable.addVariable("wantarray", "", null); + ``` + +3. The `siteRegistry` received from the caller (step 1) is used purely to + **build runtime captured-value arrays** (lines 182–246) — the variable + *names* never make it into `symbolTable`. The fresh symbol table is what + the parser sees. + +4. Consequence: `frontend/parser/Variable.java#checkStrictVarsAtParseTime` + (line 285) — which only fires inside named sub bodies — does a + `symbolTable.getSymbolEntry("$y")`, finds nothing, and raises + "Global symbol requires explicit package name". + +Verified via instrumentation (`DEBUG_STRICT=1`): + +``` +[EVAL-INT] tag=eval14 capturedVars=[this, @_, wantarray] +[STRICT] missing $countries in sub=main::foo visible=[this, @_, wantarray] +``` + +The direct-expression case (`eval q{ $y + 1 }` inside outer eval) *works* +only because it bypasses the strict-at-parse-time check (no named sub body) +and because `BytecodeCompiler` resolves references via the +`adjustedRegistry`, not `symbolTable`. + +The anonymous-sub case (`eval q{ sub { $y } }`) works for the same reason +(`checkStrictVarsAtParseTime` is gated on named subs only). 
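The lookup failure traced above can be reproduced in miniature. The sketch below is a toy model, not PerlOnJava code: `SeedingSketch`, `freshTable`, `seededTable`, and `passesStrictCheck` are hypothetical names, and a plain `Set<String>` stands in for `ScopedSymbolTable`. It assumes only what the steps above state: the registry maps variable names to register slots, slots 0 to 2 are reserved, and the strict-vars check is a name lookup in the parse-time table.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the parse-time symbol table; NOT PerlOnJava's ScopedSymbolTable.
final class SeedingSketch {

    // The fresh table holds only the three reserved names listed in step 2.
    static Set<String> freshTable() {
        Set<String> names = new LinkedHashSet<>();
        names.add("this");
        names.add("@_");
        names.add("wantarray");
        return names;
    }

    // Same table, plus every caller-visible name whose slot lies outside the reserved range 0..2.
    static Set<String> seededTable(Map<String, Integer> siteRegistry) {
        Set<String> names = freshTable();
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(siteRegistry.entrySet());
        sorted.sort(Map.Entry.comparingByValue());
        for (Map.Entry<String, Integer> e : sorted) {
            if (e.getValue() < 3) continue; // reserved slot, already present
            names.add(e.getKey());
        }
        return names;
    }

    // Simplified model of the strict-vars check: the name must be resolvable in the table.
    static boolean passesStrictCheck(Set<String> table, String name) {
        return table.contains(name);
    }

    public static void main(String[] args) {
        Map<String, Integer> registry = new LinkedHashMap<>();
        registry.put("this", 0);
        registry.put("@_", 1);
        registry.put("wantarray", 2);
        registry.put("$y", 3); // the outer eval's `my $y`

        System.out.println(passesStrictCheck(freshTable(), "$y"));          // false
        System.out.println(passesStrictCheck(seededTable(registry), "$y")); // true
    }
}
```

The first lookup corresponds to the `visible=[this, @_, wantarray]` trace: the name is in the registry but never reaches the table the parser consults.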
+ +## Goal + +An inner `eval STRING` must see the caller's visible `my`/`our`/`state` +lexicals at parse time, for all parse paths (direct references, anonymous +subs, named subs, nested evals of any depth) — under both the interpreter and +JVM backends — while preserving existing closure/runtime-capture semantics. + +## Proposed fix + +Treat `siteRegistry` as authoritative lexical information: when +`EvalStringHandler` prepares the fresh `ScopedSymbolTable`, seed it with +placeholder `my`-declared entries for every name in the registry. These +entries only need to be *present* (so parse-time name lookups and +`checkStrictVarsAtParseTime` succeed); the actual storage location is +handled separately by `BytecodeCompiler` via `adjustedRegistry` (runtime +captured-var registers). + +This mirrors what the JVM backend's `RuntimeCode.evalStringHelper` already +does implicitly (by reusing `capturedSymbolTable`). + +### Phase 1 — fix the interpreter `EvalStringHandler` + +In `backend/bytecode/EvalStringHandler.java`, both in `evalStringList` +(around line 126) and `evalString` (around line 340): + +```java +symbolTable.enterScope(); +symbolTable.addVariable("this", "", null); +symbolTable.addVariable("@_", "our", null); +symbolTable.addVariable("wantarray", "", null); + +// NEW: seed the symbol table with outer lexicals so the parser can see +// them (e.g. for strict-vars checks inside named sub bodies). The +// runtime values for these variables are captured separately via +// adjustedRegistry; here we only need the names to be resolvable. +if (siteRegistry != null) { + List<Map.Entry<String, Integer>> sorted = + new ArrayList<>(siteRegistry.entrySet()); + sorted.sort(Map.Entry.comparingByValue()); + for (Map.Entry<String, Integer> e : sorted) { + String name = e.getKey(); + // Skip reserved slots and names already added. 
+ if (e.getValue() < 3) continue; + if (symbolTable.getSymbolEntry(name) != null) continue; + // "my" is the right decl: these variables will not leak back + // into the caller's scope (eval scope is discarded on return), + // and "my" is what strict-vars looks for. + symbolTable.addVariable(name, "my", null); + } +} +``` + +Considerations: + +- **Declaration kind**: use `"my"` for everything (not `"our"`) so the + existing `checkStrictVarsAtParseTime` bypass at line 285 applies. If any + of the names were originally `our`, that's fine — we lose the "our" + distinction inside eval, but that only affects extremely edge-case + diagnostics (e.g. re-declaration warnings) and can be refined later by + carrying the decl kind alongside the index in the registry. +- **Index preservation**: don't carry over the registry's slot indices into + `symbolTable`. `ScopedSymbolTable.addVariable` will pick fresh indices + for the eval's own pad, and `BytecodeCompiler` already uses + `adjustedRegistry` (independent of `symbolTable` indices) to map + captured variables into runtime registers 3…N. +- **`@_` collision**: `@_` is pre-added at reserved slot 1 and will not be + re-added (guarded by the `getSymbolEntry != null` check). +- **`strict`/`feature` flags**: already inherited at lines 139–148. No + change needed. + +### Phase 2 — instrument + prove via Perl-level tests + +Add `src/test/resources/unit/eval_nested_lexicals.t` with one subtest per +failing shape: + +1. `my $x` in outer eval, direct `$x` in inner eval. (already passes) +2. `my $x` in outer eval, anonymous sub in inner eval. (already passes) +3. `my $x` in outer eval, **named sub** in inner eval reading `$x`. +4. `my @arr` in outer eval, **named sub** in inner eval reading `$arr[0]`. +5. Three-deep nesting: `eval q{ my $a = …; eval q{ my $b = …; eval q{ + sub f { $a + $b } }; f() }; }`. +6. `our $x`, `state $x`, and `local $x` variants. +7. 
Write-through: inner eval assigns to outer `my` variable, outer checks + the value. +8. Compile-time pragma propagation across eval boundary (warnings, strict, + features). + +Run under both backends: + +```bash +./jperl src/test/resources/unit/eval_nested_lexicals.t +JPERL_EVAL_NO_INTERPRETER=1 ./jperl src/test/resources/unit/eval_nested_lexicals.t +./jperl --interpreter src/test/resources/unit/eval_nested_lexicals.t +``` + +### Phase 3 — Perl 5 core eval.t baseline + +Run `perl dev/tools/perl_test_runner.pl perl5_t/t/op/eval.t` on master +and on the fix branch. Key subtests (per subagent report): + +- Lines 105–132: "check navigation of multiple eval boundaries to find lexicals" +- Lines 121–132: "calls from within `eval''` should clone outer lexicals" +- Lines 254–312: explicit `eval q{ my $r; sub fred3 { ...inner eval '$yyy'... } }` +- Lines 186–189: "lexical search terminates correctly at subroutine boundary" + +Target: at least all the above subtests passing. Avoid regressions +elsewhere. + +### Phase 4 — real-world validation + +1. **Geo::IP**: re-run `./jcpan -t Geo::IP`. Expected: all 8 test files + pass, up from the current 6/8 (with the `fix/geo-ip-dynaloader-socket-v6` + branch already merged). +2. **Full unit suite**: `make` must be clean. +3. **Bundled-modules suite**: `make test-bundled-modules` must not regress. +4. **Performance smoke test**: `./jperl -e 'for (1..1000) { eval q{my $x=1; eval q{my $y=2} } }'` + — the fix adds N variable additions to the eval's fresh symbol table; + for large outer pads this is O(N) per inner eval. Measure under + ExifTool's startup to ensure no measurable regression. + +### Phase 5 — JVM-backend audit + +Although `RuntimeCode.evalStringHelper` already handles the inner-eval +case correctly, write an explicit regression test that runs under both +backends so we can't silently diverge in the future. 
In particular, the +`filteredSnapshot` logic in `frontend/parser/SubroutineParser.java` +(lines 1188–1254) that runs when a named sub inside an eval is being +compiled should be checked against the new test cases — if there's a +path that rebuilds the snapshot off a pre-mutated symbol table, it +might drop outer eval vars. + +## Alternatives considered + +- **Extend `siteRegistry` to carry decl kind ("my"/"our"/"state") and + source `ScopedSymbolTable`**: more invasive, helps edge cases but not + needed for the bug. Revisit if Phase 2 finds sub-suite failures that + care about decl kind. +- **Reuse `RuntimeCode.evalStringHelper` logic inside + `EvalStringHandler`**: biggest refactor, would unify both backends. Not + advisable under schedule pressure; both paths have grown their own + nuance (BEGIN-block aliasing, hint-hash restore, etc.) and need to + converge via a shared helper later. +- **Disable interpreter backend**: regression in startup time; not + acceptable. + +## Risks / open questions + +1. **Name collisions**: if two nested scopes of the caller declared vars + with the same name, only the innermost is in `siteRegistry`. OK — + standard Perl does the same. +2. **`@_` / reserved-slot interactions**: guarded by the + `getSymbolEntry != null` check. Cross-check that the caller's registry + cannot contain entries mapped to slots 0/1/2 that we'd miss. +3. **Named subs that *do* capture outer lexicals**: in real Perl these + warn "Variable `$x` will not stay shared at …". We don't emit that + warning today. Add to a follow-up ticket, not blocking. +4. **BEGIN blocks**: `RuntimeCode.evalStringHelper` has a complex + PersistentVariable aliasing path for BEGIN. The interpreter path may + need a parallel mechanism. Out-of-scope for this fix, but add to open + questions. +5. **`eval STRING` inside BEGIN**: confirm `evalSiteRegistries` is + correctly populated when the outer code runs at BEGIN time. 
+ +## Files likely to change + +| File | Change | +|---|---| +| `src/main/java/org/perlonjava/backend/bytecode/EvalStringHandler.java` | Seed symbol table from `siteRegistry` (both overloads). | +| `src/test/resources/unit/eval_nested_lexicals.t` | New test file. | +| `dev/design/nested-eval-string-lexicals.md` | This document (progress tracking). | +| `AGENTS.md` or similar | Only if new debug env var or workflow is added. | + +No changes planned to: +- `backend/jvm/EmitEval.java` (already correct) +- `runtime/runtimetypes/RuntimeCode.java` (JVM path) +- `frontend/parser/Variable.java` (the strict check is correct; we're + fixing the symbol table it consults) +- `frontend/parser/SubroutineParser.java` (may need inspection, not + change) + +## Progress tracking + +### Current status +Plan drafted; implementation not started. + +### Completed +- [x] Reproduce the bug and isolate to interpreter backend (2026-04-20) +- [x] Instrument `EmitEval`, `RuntimeCode.evalStringHelper`, + `EmitVariable.handleMyOperator`, and + `Variable.checkStrictVarsAtParseTime`; confirm fresh symbol table + in `EvalStringHandler` is the root cause (2026-04-20) +- [x] Confirm JVM backend (`JPERL_EVAL_NO_INTERPRETER=1`) works correctly + on the reproducer (2026-04-20) + +### Next steps +1. Phase 1: implement the symbol-table seeding in `EvalStringHandler` (both + overloads). +2. Phase 2: write and run `eval_nested_lexicals.t`. +3. Phase 3: baseline + compare `perl5_t/t/op/eval.t`. +4. Phase 4: re-run `./jcpan -t Geo::IP`; `make`; `make test-bundled-modules`. +5. Phase 5: add cross-backend parity tests. + +### Open questions +- Should named subs inside eval warn "Variable will not stay shared"? + (separate ticket) +- Do BEGIN blocks in the interpreter path need a parallel to the JVM's + `PersistentVariable` aliasing? (file a follow-up after Phase 4 tests.) 
+ +## References + +- Subagent investigation transcript: + `/var/folders/r9/9y2qm0t12bxc10jbthrttn8h0000gn/T/devin-overflows-501/f1a337c9/content.txt` + (full technical walkthrough of both backends, ~200 lines). +- Related doc: `dev/custom_bytecode/EVAL_STRING_SPEC.md`. +- Related doc: `dev/custom_bytecode/CLOSURE_IMPLEMENTATION_STATUS.md`. +- PR that surfaced the bug: + https://github.com/fglock/PerlOnJava/pull/511 + (Geo::IP fix — 2 of the 8 test files still fail due to this issue.) diff --git a/src/main/java/org/perlonjava/backend/bytecode/EvalStringHandler.java b/src/main/java/org/perlonjava/backend/bytecode/EvalStringHandler.java index 0172c88b7..f519a29aa 100644 --- a/src/main/java/org/perlonjava/backend/bytecode/EvalStringHandler.java +++ b/src/main/java/org/perlonjava/backend/bytecode/EvalStringHandler.java @@ -2,6 +2,7 @@ import org.perlonjava.app.cli.CompilerOptions; import org.perlonjava.backend.jvm.EmitterContext; +import org.perlonjava.backend.jvm.EmitterMethodCreator; import org.perlonjava.backend.jvm.JavaClassInfo; import org.perlonjava.frontend.astnode.Node; import org.perlonjava.frontend.lexer.Lexer; @@ -135,6 +136,109 @@ public static RuntimeList evalStringList(String perlCode, symbolTable.addVariable("@_", "our", null); symbolTable.addVariable("wantarray", "", null); + // Seed the symbol table with the caller's visible lexical variables so + // that parse-time name resolution inside the eval body can find them. + // + // Without this, named subs inside the eval that reference outer `my` + // variables would fail with "Global symbol requires explicit package + // name" (parse-time strict-vars check in Variable.java:285), and if + // they got past parse, their JVM-compiled closure would capture the + // wrong thing because `SubroutineParser.handleNamedSub` relies on + // `parser.ctx.symbolTable` to decide what to capture. 
+ // + // We use the same "BEGIN package alias" trick that the JVM backend + // uses in `RuntimeCode.evalStringHelper` (search for + // `PersistentVariable.beginPackage`): each captured `my` variable is + // aliased into a fresh package global under + // `PerlOnJava::_BEGIN_::`, and seeded into the parser's + // symbol table as `our` with that package. When a named sub inside + // the eval is later compiled by `SubroutineParser.handleNamedSub`, + // it goes through the `decl == "our"` branch (line 1153), resolves + // the global by the aliased name, and picks up the real runtime + // value shared with the outer scope. + // + // Parsing flow is unaffected for direct references: the interpreter + // (BytecodeCompiler) uses its OWN parentRegistry-populated symbol + // table for variable resolution, so direct `$y` in the eval body + // still resolves to the captured-register path. + // + // We compute the capturedVars/adjustedRegistry up-front so the + // seeding step sees the final, filtered set of variables. + // + // See dev/design/nested-eval-string-lexicals.md for full background. + RuntimeBase[] capturedVars = new RuntimeBase[0]; + Map<String, Integer> adjustedRegistry = null; + Map<String, Integer> registry = siteRegistry != null ? siteRegistry + : (currentCode != null ? currentCode.variableRegistry : null); + if (registry != null && registers != null) { + List<Map.Entry<String, Integer>> sortedVars = new ArrayList<>(registry.entrySet()); + sortedVars.sort(Map.Entry.comparingByValue()); + List<RuntimeBase> capturedList = new ArrayList<>(); + adjustedRegistry = new HashMap<>(); + adjustedRegistry.put("this", 0); + adjustedRegistry.put("@_", 1); + adjustedRegistry.put("wantarray", 2); + // Per-eval-invocation unique alias namespace for seeded lexicals. 
+ int seedBeginId = EmitterMethodCreator.classCounter++; + String seedPkg = PersistentVariable.beginPackage(seedBeginId); + int captureIndex = 0; + for (Map.Entry<String, Integer> entry : sortedVars) { + String varName = entry.getKey(); + int parentRegIndex = entry.getValue(); + if (parentRegIndex < 3) continue; + if (parentRegIndex >= registers.length) continue; + RuntimeBase value = registers[parentRegIndex]; + // Skip non-Perl values (like Iterator objects from for loops). + if (value == null) { + // Null is fine — capture it. + } else if (value instanceof RuntimeScalar scalar) { + if (scalar.value instanceof java.util.Iterator) continue; + } else if (!(value instanceof RuntimeArray || + value instanceof RuntimeHash || + value instanceof RuntimeCode)) { + continue; + } + capturedList.add(value); + int newRegIndex = 3 + captureIndex; + adjustedRegistry.put(varName, newRegIndex); + captureIndex++; + + // Alias this variable into the seed package's globals AND + // declare it as `our` in the parser's symbol table so named + // subs inside the eval body capture it correctly via the + // JVM subroutine-compilation path. + if (varName.length() < 2) continue; + char sigil = varName.charAt(0); + if (sigil != '$' && sigil != '@' && sigil != '%') continue; + String bareName = varName.substring(1); + String fullName = seedPkg + "::" + bareName; + if (sigil == '$' && value instanceof RuntimeScalar rs) { + GlobalVariable.globalVariables.put(fullName, rs); + } else if (sigil == '@' && value instanceof RuntimeArray ra) { + GlobalVariable.globalArrays.put(fullName, ra); + } else if (sigil == '%' && value instanceof RuntimeHash rh) { + GlobalVariable.globalHashes.put(fullName, rh); + } else { + // Sigil / value-type mismatch (e.g. captured as null). + // Skip the alias but still proceed to the symbol-table + // seeding below — it keeps parse-time checks happy + // even when the runtime capture is missing. 
+ } + if (symbolTable.getSymbolEntry(varName) == null) { + symbolTable.addVariable(varName, "our", seedPkg, null); + } + } + capturedVars = capturedList.toArray(new RuntimeBase[0]); + if (EVAL_TRACE) { + evalTrace("EvalStringHandler varRegistry keys=" + registry.keySet()); + evalTrace("EvalStringHandler adjustedRegistry=" + adjustedRegistry); + evalTrace("EvalStringHandler seedPkg=" + seedPkg); + for (int ci = 0; ci < capturedVars.length; ci++) { + evalTrace("EvalStringHandler captured[" + ci + "]=" + (capturedVars[ci] != null ? capturedVars[ci].getClass().getSimpleName() + ":" + capturedVars[ci] : "null")); + } + } + } + // Inherit lexical pragma flags from parent if available if (currentCode != null) { int strictOpts = (siteStrictOptions >= 0) ? siteStrictOptions : currentCode.strictOptions; @@ -173,77 +277,9 @@ public static RuntimeList evalStringList(String perlCode, Parser parser = new Parser(ctx, tokens); Node ast = parser.parse(); - // Step 3: Build captured variables and adjusted registry for eval context - // Collect all parent scope variables (except reserved registers 0-2) - RuntimeBase[] capturedVars = new RuntimeBase[0]; - Map<String, Integer> adjustedRegistry = null; - - // Use per-eval-site registry if available, otherwise fall back to global registry - Map<String, Integer> registry = siteRegistry != null ? siteRegistry - : (currentCode != null ? 
currentCode.variableRegistry : null); - - if (registry != null && registers != null) { - - List<Map.Entry<String, Integer>> sortedVars = new ArrayList<>( - registry.entrySet() - ); - sortedVars.sort(Map.Entry.comparingByValue()); - - // Build capturedVars array and adjusted registry - // Captured variables will be placed at registers 3+ in eval'd code - List<RuntimeBase> capturedList = new ArrayList<>(); - adjustedRegistry = new HashMap<>(); - - // Always include reserved registers in adjusted registry - adjustedRegistry.put("this", 0); - adjustedRegistry.put("@_", 1); - adjustedRegistry.put("wantarray", 2); - - int captureIndex = 0; - for (Map.Entry<String, Integer> entry : sortedVars) { - String varName = entry.getKey(); - int parentRegIndex = entry.getValue(); - - // Skip reserved registers (they're handled separately in interpreter) - if (parentRegIndex < 3) { - continue; - } - - if (parentRegIndex < registers.length) { - RuntimeBase value = registers[parentRegIndex]; - - // Skip non-Perl values (like Iterator objects from for loops) - // Only capture actual Perl variables: Scalar, Array, Hash, Code - if (value == null) { - // Null is fine - capture it - } else if (value instanceof RuntimeScalar scalar) { - // Check if the scalar contains an Iterator (used by for loops) - if (scalar.value instanceof java.util.Iterator) { - // Skip - this is a for loop iterator, not a user variable - continue; - } - } else if (!(value instanceof RuntimeArray || - value instanceof RuntimeHash || - value instanceof RuntimeCode)) { - // Skip this register - it contains an internal object - continue; - } - - capturedList.add(value); - // Map to new register index starting at 3 - adjustedRegistry.put(varName, 3 + captureIndex); - captureIndex++; - } - } - capturedVars = capturedList.toArray(new RuntimeBase[0]); - if (EVAL_TRACE) { - evalTrace("EvalStringHandler varRegistry keys=" + registry.keySet()); - evalTrace("EvalStringHandler adjustedRegistry=" + adjustedRegistry); - for (int ci = 0; ci < capturedVars.length; ci++) { - 
evalTrace("EvalStringHandler captured[" + ci + "]=" + (capturedVars[ci] != null ? capturedVars[ci].getClass().getSimpleName() + ":" + capturedVars[ci] : "null")); - } - } - } + // (Captured variables and adjustedRegistry were computed above, + // before parsing, so the parser's symbol table could be seeded + // with consistent register indices.) // Step 4: Compile AST to interpreter bytecode with adjusted variable registry. // The compile-time package is already propagated via ctx.symbolTable. diff --git a/src/main/java/org/perlonjava/core/Configuration.java b/src/main/java/org/perlonjava/core/Configuration.java index 594805953..e910d1012 100644 --- a/src/main/java/org/perlonjava/core/Configuration.java +++ b/src/main/java/org/perlonjava/core/Configuration.java @@ -33,7 +33,7 @@ public final class Configuration { * Automatically populated by Gradle/Maven during build. * DO NOT EDIT MANUALLY - this value is replaced at build time. */ - public static final String gitCommitId = "077ce69bd"; + public static final String gitCommitId = "c92620f4d"; /** * Git commit date of the build (ISO format: YYYY-MM-DD). @@ -48,7 +48,7 @@ public final class Configuration { * Parsed by App::perlbrew and other tools via: perl -V | grep "Compiled at" * DO NOT EDIT MANUALLY - this value is replaced at build time. 
*/ - public static final String buildTimestamp = "Apr 20 2026 15:31:29"; + public static final String buildTimestamp = "Apr 20 2026 15:59:13"; // Prevent instantiation private Configuration() { diff --git a/src/main/java/org/perlonjava/runtime/perlmodule/DynaLoader.java b/src/main/java/org/perlonjava/runtime/perlmodule/DynaLoader.java index ebe20deda..da5cb605c 100644 --- a/src/main/java/org/perlonjava/runtime/perlmodule/DynaLoader.java +++ b/src/main/java/org/perlonjava/runtime/perlmodule/DynaLoader.java @@ -22,6 +22,20 @@ public static void initialize() { dynaLoader.registerMethod("bootstrap", null); dynaLoader.registerMethod("boot_DynaLoader", null); + // PerlOnJava has no shared-library loading support. Some CPAN + // Makefile.PL files (e.g. Geo::IP's) probe for native C libraries + // via DynaLoader::dl_findfile/dl_load_file/dl_find_symbol to + // decide between XS and pure-Perl (PP) code paths. We register + // "not found" stubs so those probes succeed (returning empty/undef) + // and the modules fall through to their pure-Perl implementations. + dynaLoader.registerMethod("dl_findfile", "dl_empty", null); + dynaLoader.registerMethod("dl_load_file", "dl_empty", null); + dynaLoader.registerMethod("dl_find_symbol", "dl_empty", null); + dynaLoader.registerMethod("dl_find_symbol_anywhere", "dl_empty", null); + dynaLoader.registerMethod("dl_install_xsub", "dl_empty", null); + dynaLoader.registerMethod("dl_undef_symbols", "dl_empty", null); + dynaLoader.registerMethod("dl_error", "dl_error", null); + // Set $DynaLoader::VERSION so CPAN dependency checking works GlobalVariable.getGlobalVariable("DynaLoader::VERSION").set("1.56"); } catch (NoSuchMethodException e) { @@ -52,4 +66,16 @@ public static RuntimeList bootstrap(RuntimeArray args, int ctx) { public static RuntimeList boot_DynaLoader(RuntimeArray args, int ctx) { return new RuntimeList(); } + + /** + * No-op stub used for all DynaLoader dl_* probe functions. Returns an + * empty list (undef in scalar context). 
See initialize() for why. + */ + public static RuntimeList dl_empty(RuntimeArray args, int ctx) { + return new RuntimeList(); + } + + public static RuntimeList dl_error(RuntimeArray args, int ctx) { + return new RuntimeScalar("DynaLoader is not supported in PerlOnJava").getList(); + } } diff --git a/src/main/java/org/perlonjava/runtime/perlmodule/Socket.java b/src/main/java/org/perlonjava/runtime/perlmodule/Socket.java index a6f8657f7..2b084c57a 100644 --- a/src/main/java/org/perlonjava/runtime/perlmodule/Socket.java +++ b/src/main/java/org/perlonjava/runtime/perlmodule/Socket.java @@ -89,6 +89,8 @@ public static void initialize() { // Register socket functions socket.registerMethod("pack_sockaddr_in", null); socket.registerMethod("unpack_sockaddr_in", null); + socket.registerMethod("pack_sockaddr_in6", null); + socket.registerMethod("unpack_sockaddr_in6", null); socket.registerMethod("inet_aton", null); socket.registerMethod("inet_ntoa", null); socket.registerMethod("inet_pton", null); @@ -262,6 +264,80 @@ public static RuntimeList unpack_sockaddr_in(RuntimeArray args, int ctx) { } } + /** + * pack_sockaddr_in6(PORT, IP6_ADDRESS [, SCOPE_ID [, FLOWINFO]]) + * Packs a port and 16-byte IPv6 address into a sockaddr_in6 structure. + */ + public static RuntimeList pack_sockaddr_in6(RuntimeArray args, int ctx) { + if (args.size() < 2) { + throw new IllegalArgumentException("Not enough arguments for pack_sockaddr_in6"); + } + int port = args.get(0).getInt(); + String addrStr = args.get(1).toString(); + int scopeId = args.size() > 2 ? args.get(2).getInt() : 0; + int flowinfo = args.size() > 3 ? 
args.get(3).getInt() : 0; + + byte[] addrBytes = addrStr.getBytes(StandardCharsets.ISO_8859_1); + if (addrBytes.length != 16) { + throw new IllegalArgumentException("pack_sockaddr_in6: address must be 16 bytes, got " + addrBytes.length); + } + + // sockaddr_in6: family(2) + port(2) + flowinfo(4) + addr(16) + scope_id(4) = 28 bytes + byte[] sockaddr = new byte[28]; + // Family in native byte order - use big-endian matching pack_sockaddr_in convention + sockaddr[0] = 0; + sockaddr[1] = (byte) AF_INET6; + // Port (network byte order) + sockaddr[2] = (byte) ((port >> 8) & 0xFF); + sockaddr[3] = (byte) (port & 0xFF); + // Flowinfo (network byte order) + sockaddr[4] = (byte) ((flowinfo >> 24) & 0xFF); + sockaddr[5] = (byte) ((flowinfo >> 16) & 0xFF); + sockaddr[6] = (byte) ((flowinfo >> 8) & 0xFF); + sockaddr[7] = (byte) (flowinfo & 0xFF); + // Address (16 bytes) + System.arraycopy(addrBytes, 0, sockaddr, 8, 16); + // Scope ID (native byte order — emit as big-endian for round-trip consistency) + sockaddr[24] = (byte) ((scopeId >> 24) & 0xFF); + sockaddr[25] = (byte) ((scopeId >> 16) & 0xFF); + sockaddr[26] = (byte) ((scopeId >> 8) & 0xFF); + sockaddr[27] = (byte) (scopeId & 0xFF); + + return new RuntimeScalar(new String(sockaddr, StandardCharsets.ISO_8859_1)).getList(); + } + + /** + * unpack_sockaddr_in6(SOCKADDR) + * Returns (port, addr, scope_id, flowinfo) in list context, + * or just port in scalar context. 
+ */ + public static RuntimeList unpack_sockaddr_in6(RuntimeArray args, int ctx) { + if (args.size() < 1) { + throw new IllegalArgumentException("Not enough arguments for unpack_sockaddr_in6"); + } + byte[] sockaddr = args.get(0).toString().getBytes(StandardCharsets.ISO_8859_1); + if (sockaddr.length < 28) { + throw new IllegalArgumentException("Invalid sockaddr_in6 structure (length " + sockaddr.length + ")"); + } + int port = ((sockaddr[2] & 0xFF) << 8) | (sockaddr[3] & 0xFF); + int flowinfo = ((sockaddr[4] & 0xFF) << 24) | ((sockaddr[5] & 0xFF) << 16) + | ((sockaddr[6] & 0xFF) << 8) | (sockaddr[7] & 0xFF); + byte[] addrBytes = new byte[16]; + System.arraycopy(sockaddr, 8, addrBytes, 0, 16); + int scopeId = ((sockaddr[24] & 0xFF) << 24) | ((sockaddr[25] & 0xFF) << 16) + | ((sockaddr[26] & 0xFF) << 8) | (sockaddr[27] & 0xFF); + + if (ctx == RuntimeContextType.LIST) { + RuntimeList result = new RuntimeList(); + result.add(new RuntimeScalar(port)); + result.add(new RuntimeScalar(new String(addrBytes, StandardCharsets.ISO_8859_1))); + result.add(new RuntimeScalar(scopeId)); + result.add(new RuntimeScalar(flowinfo)); + return result; + } + return new RuntimeScalar(port).getList(); + } + /** * inet_aton(HOSTNAME) * Converts a hostname or IP address to a 4-byte binary string diff --git a/src/main/perl/lib/DynaLoader.pm b/src/main/perl/lib/DynaLoader.pm index 86018865b..0ff227b34 100644 --- a/src/main/perl/lib/DynaLoader.pm +++ b/src/main/perl/lib/DynaLoader.pm @@ -30,6 +30,38 @@ BEGIN { unless (defined &boot_DynaLoader) { *boot_DynaLoader = sub { return }; } + + # Stubs for CPAN Makefile.PL files that probe for native C libraries + # (e.g. Geo::IP's Makefile.PL calls DynaLoader::dl_findfile('GeoIP')). + # PerlOnJava has no shared-library support, so these return a "not found" + # result, letting modules fall through to their pure-Perl (PP) code paths. 
+ unless (defined &dl_findfile) { + *dl_findfile = sub { return }; + } + unless (defined &dl_load_file) { + *dl_load_file = sub { return }; + } + unless (defined &dl_find_symbol) { + *dl_find_symbol = sub { return }; + } + unless (defined &dl_find_symbol_anywhere) { + *dl_find_symbol_anywhere = sub { return }; + } + unless (defined &dl_install_xsub) { + *dl_install_xsub = sub { return }; + } + unless (defined &dl_error) { + *dl_error = sub { return "DynaLoader is not supported in PerlOnJava" }; + } + unless (defined &dl_undef_symbols) { + *dl_undef_symbols = sub { return () }; + } + our @dl_library_path = (); + our @dl_resolve_using = (); + our @dl_require_symbols = (); + our @dl_librefs = (); + our @dl_modules = (); + our @dl_shared_objects = (); } 1; diff --git a/src/main/perl/lib/Socket.pm b/src/main/perl/lib/Socket.pm index 86b6a7b61..1f3d256e1 100644 --- a/src/main/perl/lib/Socket.pm +++ b/src/main/perl/lib/Socket.pm @@ -22,6 +22,7 @@ XSLoader::load('Socket'); our @EXPORT = qw( pack_sockaddr_in unpack_sockaddr_in + pack_sockaddr_in6 unpack_sockaddr_in6 pack_sockaddr_un unpack_sockaddr_un inet_aton inet_ntoa inet_pton inet_ntop getnameinfo getaddrinfo sockaddr_in sockaddr_un sockaddr_family diff --git a/src/test/resources/unit/eval_nested_lexicals.t b/src/test/resources/unit/eval_nested_lexicals.t new file mode 100644 index 000000000..6c6b6c417 --- /dev/null +++ b/src/test/resources/unit/eval_nested_lexicals.t @@ -0,0 +1,102 @@ +#!/usr/bin/env perl +# Regression tests for nested eval STRING lexical scoping. +# +# In standard Perl, a string-eval's compile-time lexical scope includes every +# `my` visible at the call site — including variables declared inside an +# enclosing eval STRING. PerlOnJava's interpreter backend previously broke +# this for named subs defined inside nested evals: strict-vars fired at +# parse time, or (once parsing was fixed) the closure captured the wrong +# thing at runtime. +# +# See dev/design/nested-eval-string-lexicals.md. 
+ +use strict; +use warnings; +use Test::More; + +# --- Case A: direct reference in nested eval -------------------------------- +{ + my $r = eval q{ + my $x = 41; + eval q{ $x + 1 }; + }; + is( $r, 42, 'direct scalar ref in nested eval' ); +} + +# --- Case B: named sub inside nested eval reads outer my scalar ------------- +{ + my $r = eval q{ + my $y = 99; + eval q{ sub ned_bar { return $y } 1 }; + die $@ if $@; + ned_bar(); + }; + is( $r, 99, 'named sub in nested eval captures outer my scalar' ); +} + +# --- Case C: anonymous sub inside nested eval ------------------------------ +{ + my $r = eval q{ + my $z = 77; + my $code = eval q{ sub { return $z } }; + die $@ if $@; + $code->(); + }; + is( $r, 77, 'anon sub in nested eval captures outer my scalar' ); +} + +# --- Case D: array subscript inside nested eval named sub ------------------- +{ + my $r = eval q{ + my @countries = ('', 'JP', 'US'); + eval q{ sub ned_country_for { return $countries[$_[0]] } 1 }; + die $@ if $@; + ned_country_for(1); + }; + is( $r, 'JP', 'named sub in nested eval reads @countries[idx]' ); +} + +# --- Case E: three-deep nesting -------------------------------------------- +{ + my $r = eval q{ + my $a = 10; + eval q{ + my $b = 20; + eval q{ + sub ned_three { return $a + $b } + 1; + }; + die $@ if $@; + ned_three(); + }; + }; + is( $r, 30, 'three-deep nested eval with named sub + 2 outer my vars' ); +} + +# --- Case F: plain compiled outer (not eval) still works -------------------- +{ + my $u = 11; + eval q{ sub ned_e { return $u } 1 }; + is( ned_e(), 11, 'single-eval named sub captures compiled outer my' ); +} + +# --- Case G: my declared and used inside same eval (no outer) --------------- +my $g = eval q{ + my $v = 22; + sub ned_g { return $v } + ned_g(); +}; +is( $g, 22, 'named sub inside single eval sees sibling my' ); + +# --- Case H: hash lookup in nested eval named sub --------------------------- +{ + my $r = eval q{ + my %map = ( a => 1, b => 2 ); + eval q{ sub ned_h { return 
$map{$_[0]} } 1 }; + die $@ if $@; + ned_h('b'); + }; + is( $r, 2, 'named sub in nested eval reads %map{key}' ); +} + +done_testing();
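
The 28-byte layout used by `pack_sockaddr_in6` / `unpack_sockaddr_in6` in the `Socket.java` hunk of this patch (family, port, flowinfo, 16-byte address, scope id, all written big-endian) can be checked in isolation with a round trip. The sketch below is illustrative rather than the patch's code: it swaps the manual shifts for `java.nio.ByteBuffer`, which is big-endian by default, and `SockaddrIn6Sketch` and `FAMILY` are hypothetical names. The family value `10` is an assumption for the demo; the real `AF_INET6` constant is platform-specific.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Illustrative only; not the Socket.java implementation from this patch.
final class SockaddrIn6Sketch {
    // Assumption: stand-in address-family value; the real AF_INET6 varies by platform.
    static final int FAMILY = 10;
    static final int LENGTH = 28; // family(2) + port(2) + flowinfo(4) + addr(16) + scope_id(4)

    static byte[] pack(int port, byte[] addr16, int scopeId, int flowinfo) {
        if (addr16.length != 16) {
            throw new IllegalArgumentException("address must be 16 bytes, got " + addr16.length);
        }
        // ByteBuffer writes big-endian by default, matching the byte order of the shifts in the patch.
        return ByteBuffer.allocate(LENGTH)
                .putShort((short) FAMILY)
                .putShort((short) port)
                .putInt(flowinfo)
                .put(addr16)
                .putInt(scopeId)
                .array();
    }

    // Port is unsigned 16-bit, so mask after reading the signed short.
    static int port(byte[] sockaddr) {
        return ByteBuffer.wrap(sockaddr).getShort(2) & 0xFFFF;
    }

    static int flowinfo(byte[] sockaddr) {
        return ByteBuffer.wrap(sockaddr).getInt(4);
    }

    static byte[] address(byte[] sockaddr) {
        return Arrays.copyOfRange(sockaddr, 8, 24);
    }

    static int scopeId(byte[] sockaddr) {
        return ByteBuffer.wrap(sockaddr).getInt(24);
    }

    public static void main(String[] args) {
        byte[] loopback = new byte[16];
        loopback[15] = 1; // ::1
        byte[] packed = pack(8080, loopback, 3, 7);
        // prints "28 8080 3 7"
        System.out.println(packed.length + " " + port(packed) + " "
                + scopeId(packed) + " " + flowinfo(packed));
    }
}
```

A round trip like this only confirms that pack and unpack agree with each other; it says nothing about whether the bytes match what a native `sockaddr_in6` would look like on a given platform.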