diff --git a/dev/modules/perlio_via.md b/dev/modules/perlio_via.md new file mode 100644 index 000000000..1e95ce8d3 --- /dev/null +++ b/dev/modules/perlio_via.md @@ -0,0 +1,331 @@ +# PerlIO::via — functional implementation plan + +## Motivation + +`./jcpan -t Redis` cascades into a dependency chain: + +``` +Redis → IO::Socket::Timeout → PerlIO::via::Timeout → PerlIO::via +``` + +`PerlIO::via` in upstream perl is an XS bootstrap (just +`XSLoader::load()`). The real work lives in `ext/PerlIO-via/via.xs`, +which teaches the C layer-dispatch core to route IO operations through +Perl methods (`PUSHED`, `POPPED`, `OPEN`, `FDOPEN`, `SYSOPEN`, +`FILENO`, `READ`, `WRITE`, `FILL`, `CLOSE`, `SEEK`, `TELL`, `UNREAD`, +`FLUSH`, `SETLINEBUF`, `CLEARERR`, `ERROR`, `EOF`, `BINMODE`, `UTF8`) +on the class named inside `:via(Foo)`. + +PerlOnJava does not ship `PerlIO::via` at all today. CPAN's +resolver therefore tries to "install" it from +`SHAY/perl-5.42.2.tar.gz`, fails, marks the whole chain `NA`, and +`Redis`'s `t/00-compile.t` can't even `require` itself. + +**Near-term (separate PR)** — ship a stub `src/main/perl/lib/PerlIO/via.pm` +(mirroring the existing `PerlIO::encoding` stub) so the dependency +chain resolves. At the same time, make the layer parser in Java throw +a clear error when `:via(Foo)` is actually used at `open`/`binmode` +time. That gives us a loud failure when a real call site needs the +layer, without breaking modules that only `use PerlIO::via` at compile +time. + +This document is about the **follow-up**: a real, functional +`PerlIO::via` that actually dispatches IO through a user-supplied Perl +class. + +## Current layer infrastructure in PerlOnJava + +Relevant files: + +``` +src/main/java/org/perlonjava/runtime/io/ + IOLayer.java interface: processInput / processOutput / reset + LayeredIOHandle.java parseAndSetLayers / splitLayers / addLayer + EncodingLayer.java :encoding(...) / :utf8 + CrlfLayer.java :crlf + IOHandle.java read / write / eof / tell / seek / close / flush / sync + RuntimeIO.java ties a LayeredIOHandle to a filehandle +``` + +Key properties of the existing design: + +1. **Layers are pure string transforms.** `IOLayer` exposes + `processInput(String)` / `processOutput(String)` where each char is + one byte (0-255). Layers are composed into two `Function` pipelines (`inputPipeline`, `outputPipeline`) inside + `LayeredIOHandle`. +2. **Layers see bytes only after the delegate returns them.** They + cannot intercept `open`, `seek`, `tell`, `eof`, `close`, or + `fileno`. Those calls go straight through to the delegate handle. +3. **Unknown layer names currently throw** + `IllegalArgumentException("Unknown layer: " + layerSpec)` in + `addLayer` — *but* `splitLayers` only special-cases `encoding(...)`, + so `via(Foo)` is parsed as a single token and reaches `addLayer`, + where it falls into the default arm and (should) throw. In practice + an `open(..., "<:via(Foo)", ...)` today returns success because of a + separate path in open-mode parsing; the near-term loud-fail work + plugs that hole. + +## What upstream PerlIO::via expects + +A layer class implements some subset of: + +| Method | Direction | Return | Notes | +|--------|-----------|--------|-------| +| `PUSHED($class, $mode, $fh)` | on layer push | blessed obj or `$class` or `-1` | always; gets called before open | +| `POPPED($obj, $fh)` | on layer pop | ignored | cleanup | +| `OPEN($obj, $path, $mode, $fh)` | on open | truthy = we opened it | if absent, lower layer opens | +| `FDOPEN($obj, $fd, $fh)` | on fdopen | truthy | optional | +| `SYSOPEN($obj, $path, $imode, $perm, $fh)` | | truthy | optional | +| `BINMODE($obj, $fh)` | on binmode | 0 / -1 / undef | undef = pop me | +| `UTF8($obj, $belowFlag, $fh)` | just after PUSHED | bool | | +| `FILENO($obj, $fh)` | | int | default = `fileno($fh)` | +| `READ($obj, $buffer, $len, $fh)` | read | octets placed | default = use FILL | +| `WRITE($obj, $buffer, $fh)` | write | octets written | required for writers | +| `FILL($obj, $fh)` | read | string or undef | default read path | +| `CLOSE($obj, $fh)` | close | 0 / -1 | | +| `SEEK($obj, $posn, $whence, $fh)` | seek | 0 / -1 | | +| `TELL($obj, $fh)` | tell | pos | | +| `UNREAD($obj, $buffer, $fh)` | | octets | default = temp push-back layer | +| `FLUSH($obj, $fh)` | flush | 0 / -1 | | +| `SETLINEBUF($obj, $fh)` | | — | | +| `CLEARERR($obj, $fh)` | | — | | +| `ERROR($obj, $fh)` | | bool | | +| `EOF($obj, $fh)` | | bool | default derived from FILL/READ | + +There are two important semantic rules: + +1. **`$fh` is the handle *below* this layer**, given as a glob. The + callback reads/writes through that glob to reach the next layer + down. This implies layers form a linked list at runtime. +2. **`READ`/`WRITE` return octet counts, not transformed strings.** The + callback mutates `$buffer` in place via an aliased argument for + `READ`. The existing PerlOnJava `IOLayer.processInput/Output` + pipeline model does not match this shape. + +## Gap analysis + +Mapping upstream semantics onto PerlOnJava: + +| Concern | Current state | Gap for `:via` | +|---------|---------------|---------------| +| Name lookup `via(Foo)` → class | Not parsed | Add `splitLayers` case for `via(...)`; resolve class name (prefixing `PerlIO::via::` if bare class not loaded, matching upstream). | +| Lifecycle (PUSHED/POPPED) | `IOLayer.reset()` only | Need a Java class that holds the layer's Perl object, invokes PUSHED on creation, POPPED on removal. | +| Layer-below handle | `IOLayer` has no access to delegate | Need to expose an "inner handle" glob to the Perl callback. Requires turning a `LayeredIOHandle` slice into a Perl `GLOB` on demand. | +| READ via FILL | `processInput(String)` is a pure transform | Need a `ViaLayer` that repeatedly calls `FILL` (or `READ`) on the Perl side and feeds the result into the pipeline's byte stream. This is a *pull* model, whereas existing layers are *push* transforms. | +| WRITE | `processOutput(String)` returns transformed bytes | Rework so `ViaLayer.processOutput` calls `WRITE($obj, $buf, $fh_below)` and returns `""` (since the callback itself writes downward), or short-circuit to bypass the downstream pipeline entirely. | +| SEEK/TELL/EOF/CLOSE/FLUSH | Pass-through to delegate | Need per-op hooks on `IOLayer` (new interface methods with default no-op implementations) so `LayeredIOHandle` can consult the topmost `ViaLayer` before delegating. | +| FILENO | `IOHandle.fileno()` goes straight to delegate | Add optional layer override. | +| BINMODE | `binmode` reparses layers | On `:raw` or pop, must call `POPPED` on the Via layer. | +| UTF8 flag | Not modeled | Probably skip for first cut; document as known gap. | +| Error propagation | Layers don't have errno | Map `die` inside callbacks to `$!`/`$@` on the outer op, with a reasonable default (ETIMEDOUT / EIO) on `-1` returns. | +| Push-back (UNREAD) | No support | Document as unimplemented; default upstream behavior uses a temp layer — out of scope for v1. | + +## Proposed design + +### 1. Perl side — `src/main/perl/lib/PerlIO/via.pm` + +Minimal module. `use PerlIO::via;` has no args in upstream; the +package's job is just to exist so the `:via(...)` layer parser can +lazy-load classes. Keep the stub from the near-term PR, bump VERSION +to match upstream (`0.19`). All real logic lives in Java. + +### 2. Java side — `ViaLayer` + +New file `src/main/java/org/perlonjava/runtime/io/ViaLayer.java` +implementing `IOLayer` *plus* new optional hooks (see §3). Holds: + +``` +RuntimeScalar perlObject; // blessed ref returned by PUSHED +String className; // e.g. "PerlIO::via::Timeout" +RuntimeScalar belowGlob; // tied to the layer below, passed as $fh +EnumSet implemented;// which callbacks the class defines +``` + +Construction flow inside `LayeredIOHandle.addLayer("via(Foo)")`: + +1. Resolve class — if `Foo::` has no symbol table, try + `PerlIO::via::Foo`; if both fail, throw. +2. `require` the class via the existing module-loader entry point. +3. Introspect which methods exist (`can(...)`) and cache the + `EnumSet` so hot paths don't re-lookup. +4. Build a `belowGlob` that mirrors the current inner handle (a + `BorrowedIOHandle` wrapped in a `RuntimeIO` wrapped in a `GLOB`). +5. Call `Foo->PUSHED($mode, $belowGlob)` and stash the returned + blessed ref as `perlObject`. `-1` return must propagate as an + `open` failure with `$!` set. + +### 3. Extend `IOLayer` with optional hooks + +Add default methods so existing `CrlfLayer` / `EncodingLayer` don't +need to change: + +```java +default boolean isViaLayer() { return false; } +default RuntimeScalar onOpen(String path, String mode) { /* let + LayeredIOHandle open normally */ return null; } +default int onRead(byte[] buf, int len) { return -2; } // -2 = "not + handled, use processInput pipeline" +default int onWrite(byte[] buf, int len) { return -2; } +default int onSeek(long off, int whence) { return -2; } +default long onTell() { return -2; } +default boolean onEof() { return false; } +default int onClose() { return -2; } +default int onFlush() { return -2; } +default int onFileno() { return -2; } +default void onBinmode() {} +default void onPopped() {} +``` + +`LayeredIOHandle.read / write / seek / tell / eof / close / flush / +fileno / binmode` check the topmost layer's hook first; `-2` means +"fall through to the existing pipeline", any other value is the +result. + +### 4. Below-handle wrapper + +The callback's `$fh` argument must let the user call +`sysread($fh, ...)` / `syswrite($fh, ...)` / `sysseek($fh, ...)` and +have those go to the layer *below* the Via layer. Introduce a new +`BelowLayerIOHandle` (thin adapter over `LayeredIOHandle` that skips +the topmost N layers) and expose it through a glob created at layer- +push time. `BorrowedIOHandle` already exists — extend it, or add a +sibling, to carry the "start at layer N" offset. + +### 5. Open / read / write flow + +``` +open(FH, "<:via(Foo)", $path) + 1. parse layers → [":raw", ":via(Foo)"] + 2. open raw handle at bottom + 3. wrap in LayeredIOHandle + 4. push ViaLayer(Foo): + require Foo + $obj = Foo->PUSHED("<", $fh_below) + if ($obj == -1) return open failure + if (Foo->can("OPEN")) { + $obj->OPEN($path, "<", $fh_below) or return open failure + } else { + // lower layer already opened it in step 2 + } + if (Foo->can("UTF8") && $obj->UTF8($below_utf8, $fh_below)) { + push :utf8 on top + } + +read: prefer $obj->READ → fallback $obj->FILL → fallback delegate +write: prefer $obj->WRITE → fallback delegate +seek: prefer $obj->SEEK → fallback delegate +close: call $obj->CLOSE if present, then POPPED, then close delegate +``` + +### 6. Hot-path cost + +Every `read` / `write` becomes a Perl method call. That is +intrinsically slow. Two mitigations: + +- Cache `can(...)` lookups at push time — no repeated symbol-table + probes. +- For layers that implement `FILL` (the common case), read in larger + chunks (e.g. 8 KiB) and amortize the call across the pipeline's + `processInput` consumers. + +This is acceptable because `:via` users are opting into a +Perl-implemented layer on purpose. + +### 7. Errors + +- Any `die` in a callback is caught inside the `ViaLayer` adapter, + stashed into `$@`, and turned into the documented failure mode + (`-1` / `undef` / false depending on which method). +- An un-caught `die` in PUSHED is a propagated error; the `open` call + returns false and `$!` is set to `EIO` (matching perl's behavior + when PUSHED returns `-1` from an XS layer). + +## Test strategy + +1. **PerlIO-via's own tests** — once functional, enable + `perl5/ext/PerlIO-via/t/via.t` under `perl5_t/ext/PerlIO-via/`. + Fork-heavy parts skip. +2. **PerlIO::via::QuotedPrint** (core since 5.8) — smallest realistic + user. Round-trip a fixture file. +3. **PerlIO::via::Timeout** — what Redis actually loads. Without a + live Redis, exercise `t/00-compile.t` and the `setsockopt`-based + path, which doesn't require the layer to *do* anything — but does + require it to load. +4. **`jcpan -t Redis`** — the motivating end-to-end. Target is + `Result: PASS` on the compile-only tests; live-server tests will + still skip because PerlOnJava has no fork-based test harness for + spinning up a Redis instance. +5. Add a unit test under `src/test/perl/` that writes a tiny Perl + layer class and asserts PUSHED/READ/WRITE/CLOSE all fire in order. + +## Scope boundaries (v1 explicitly excludes) + +- `UNREAD` (push-back layer synthesis) +- `UTF8` flag propagation into the PerlIO core's SvUTF8 state +- `FDOPEN` / `SYSOPEN` (return false → caller falls back to lower + layer, which works fine for the common cases) +- Stackable `:via(...)` (N>1) in a single open — legal in upstream + but rare; first cut may reject if it's non-trivial in the pipeline + model +- Binmode-induced re-push (`binmode $fh, ":via(Foo)"` on an already- + open handle) — acceptable to defer if the push path is only wired + into `open` + +Each of these should either fall back cleanly or throw a clear error +with JPERL_UNIMPLEMENTED honored. + +## Effort estimate + +Rough sizing, assuming the near-term stub + Java loud-fail PR has +already landed: + +| Piece | Size | +|-------|------| +| `IOLayer` interface extension + existing layers adjusting to defaults | XS | +| `ViaLayer.java` (PUSHED/POPPED, READ/FILL, WRITE, CLOSE) | M | +| `BelowLayerIOHandle` / glob wrapper | S | +| `LayeredIOHandle` hook dispatch (read/write/seek/tell/eof/close/flush/fileno/binmode) | M | +| `splitLayers` + addLayer plumbing for `via(...)` | S | +| Error mapping / `$!` integration | S | +| Tests (unit + `perl5_t` enablement + `jcpan -t Redis` check) | M | + +Expect ~1-2 days of focused work, the biggest risk being the +glob-wrapping "below handle" — that has to interoperate cleanly with +`sysread` / `syswrite` inside the Perl callback. + +## Progress tracking + +### Current status: planning only + +### Completed phases + +_none yet_ + +### Next steps + +1. Land the near-term PR: stub `PerlIO/via.pm` + Java loud-fail for + `:via(...)` layer-parse (separate, small PR — unblocks `jcpan -t + Redis` compile phase). +2. Spike `ViaLayer` against `PerlIO::via::QuotedPrint` to validate + the hook shape before doing the full `LayeredIOHandle` surgery. +3. Full implementation per §3-§5. +4. Re-run `jcpan -t Redis`, capture result in this doc. + +### Open questions + +- Should the below-handle be a real `GLOB` (so `readline($fh)` works + inside callbacks), or a minimal `IO::Handle`-shaped thing? Upstream + passes a glob; mimicking it keeps existing CPAN layers happy but + requires a bit more plumbing. +- How do we surface `$!` from Java-level IO failures to a Perl-level + callback that uses `die`? Probably the same mechanism the rest of + PerlOnJava's IO uses (look at `IOOperator.java` for precedent). + +## Related + +- Near-term stub: `src/main/perl/lib/PerlIO/via.pm` (TBD) +- Similar stub precedent: `src/main/perl/lib/PerlIO/encoding.pm` +- Motivating module: `dev/modules/...` (Redis — no doc yet; add one + alongside this when the near-term PR lands) +- Layer plumbing: `src/main/java/org/perlonjava/runtime/io/LayeredIOHandle.java` diff --git a/src/main/java/org/perlonjava/backend/jvm/Dereference.java b/src/main/java/org/perlonjava/backend/jvm/Dereference.java index a57f6ee82..f6e53455a 100644 --- a/src/main/java/org/perlonjava/backend/jvm/Dereference.java +++ b/src/main/java/org/perlonjava/backend/jvm/Dereference.java @@ -1284,7 +1284,17 @@ public static void handleArrowHashDeref(EmitterVisitor emitterVisitor, BinaryOpe } if (CompilerOptions.DEBUG_ENABLED) emitterVisitor.ctx.logDebug("visit -> (HashLiteralNode) autoquote " + node.right); - nodeRight.accept(emitterVisitor.with(RuntimeContextType.SCALAR)); + if (nodeRight.elements.size() > 1) { + // Multiple elements: join them with $; (SUBSEP), like $h{a,b,c} + emitterVisitor.ctx.mv.visitLdcInsn("main::;"); + emitterVisitor.ctx.mv.visitMethodInsn(Opcodes.INVOKESTATIC, "org/perlonjava/runtime/runtimetypes/GlobalVariable", + "getGlobalVariable", "(Ljava/lang/String;)Lorg/perlonjava/runtime/runtimetypes/RuntimeScalar;", false); + nodeRight.accept(emitterVisitor.with(RuntimeContextType.LIST)); + emitterVisitor.ctx.mv.visitMethodInsn(Opcodes.INVOKESTATIC, "org/perlonjava/runtime/operators/StringOperators", + "join", "(Lorg/perlonjava/runtime/runtimetypes/RuntimeScalar;Lorg/perlonjava/runtime/runtimetypes/RuntimeBase;)Lorg/perlonjava/runtime/runtimetypes/RuntimeScalar;", false); + } else { + nodeRight.accept(emitterVisitor.with(RuntimeContextType.SCALAR)); + } int keySlot = emitterVisitor.ctx.javaClassInfo.acquireSpillSlot(); boolean pooledKey = keySlot >= 0; diff --git a/src/main/java/org/perlonjava/core/Configuration.java b/src/main/java/org/perlonjava/core/Configuration.java index 7cf7384ed..f54332d68 100644 --- a/src/main/java/org/perlonjava/core/Configuration.java +++ b/src/main/java/org/perlonjava/core/Configuration.java @@ -33,7 +33,7 @@ public final class Configuration { * Automatically populated by Gradle/Maven during build. * DO NOT EDIT MANUALLY - this value is replaced at build time. */ - public static final String gitCommitId = "6f96f1c74"; + public static final String gitCommitId = "8a43b2cec"; /** * Git commit date of the build (ISO format: YYYY-MM-DD). @@ -48,7 +48,7 @@ public final class Configuration { * Parsed by App::perlbrew and other tools via: perl -V | grep "Compiled at" * DO NOT EDIT MANUALLY - this value is replaced at build time. */ - public static final String buildTimestamp = "Apr 24 2026 13:47:17"; + public static final String buildTimestamp = "Apr 24 2026 17:28:46"; // Prevent instantiation private Configuration() { diff --git a/src/main/java/org/perlonjava/frontend/parser/OperatorParser.java b/src/main/java/org/perlonjava/frontend/parser/OperatorParser.java index 4646b1818..154c89301 100644 --- a/src/main/java/org/perlonjava/frontend/parser/OperatorParser.java +++ b/src/main/java/org/perlonjava/frontend/parser/OperatorParser.java @@ -1072,7 +1072,19 @@ static OperatorNode parseLast(Parser parser, LexerToken token, int currentIndex) static OperatorNode parseReturn(Parser parser, int currentIndex) { Node operand; - // Handle 'return' keyword as a unary operator with an operand + // Handle 'return' keyword as a unary operator with an operand. + // + // Special case: `return ~~ EXPR` — `~~` here is the prefix + // double-bitwise-complement (numeric-scalar idiom), not binary + // smartmatch. parseZeroOrMoreList's looksLikeEmptyList sees `~~` + // as an infix operator and would treat the list as empty, + // silently dropping EXPR. Force a prefix parse in that case. + if (TokenUtils.peek(parser).text.equals("~~")) { + Node expr = parser.parseExpression(parser.getPrecedence(",") + 1); + ListNode list = new ListNode(currentIndex); + list.elements.add(expr); + return new OperatorNode("return", list, currentIndex); + } operand = ListParser.parseZeroOrMoreList(parser, 0, false, false, false, false); return new OperatorNode("return", operand, currentIndex); } diff --git a/src/main/java/org/perlonjava/frontend/parser/ParsePrimary.java b/src/main/java/org/perlonjava/frontend/parser/ParsePrimary.java index cd31f8070..74e342631 100644 --- a/src/main/java/org/perlonjava/frontend/parser/ParsePrimary.java +++ b/src/main/java/org/perlonjava/frontend/parser/ParsePrimary.java @@ -391,21 +391,21 @@ static Node parseOperator(Parser parser, LexerToken token, String operator) { return new OperatorNode(operator, operand, parser.tokenIndex); case "~~": - // Handle ~~ as two separate ~ operators - // First, handle it as a single ~ operator - String firstOperator = "~"; + // Handle prefix ~~ as double bitwise complement: ~(~EXPR) + // This forces numeric scalar context (commonly used as ~~@array to count). + String tildeOp = "~"; if (parser.ctx.symbolTable.isFeatureCategoryEnabled("bitwise")) { - firstOperator = "binary~"; + tildeOp = "binary~"; } - // Put back a single ~ token for the next parse - parser.tokenIndex--; // Back up - LexerToken currentToken = parser.tokens.get(parser.tokenIndex); - currentToken.text = "~"; - - // Parse the operand (which will start with the second ~) + // Parse the operand at the same precedence as a single ~ operand = parser.parseExpression(parser.getPrecedence("~") + 1); - return new OperatorNode(firstOperator, operand, parser.tokenIndex); + if (operand == null) { + parser.throwError("syntax error"); + } + return new OperatorNode(tildeOp, + new OperatorNode(tildeOp, operand, parser.tokenIndex), + parser.tokenIndex); case "--": case "++": diff --git a/src/main/java/org/perlonjava/frontend/parser/SubroutineParser.java b/src/main/java/org/perlonjava/frontend/parser/SubroutineParser.java index 1f1f3acd4..612d236ff 100644 --- a/src/main/java/org/perlonjava/frontend/parser/SubroutineParser.java +++ b/src/main/java/org/perlonjava/frontend/parser/SubroutineParser.java @@ -202,7 +202,15 @@ static Node parseSubroutineCall(Parser parser, boolean isMethod) { // If a package name follows, then it looks like a indirect method // Unless the subName looks like an operator // Unless the subName has a prototype with `*` - if (peek(parser).type == LexerTokenType.IDENTIFIER && isValidIndirectMethod(subName) && !prototypeHasGlob) { + // + // Note: feature-gated core keywords (`try`, `catch`, `finally`) should + // participate in indirect-object parsing when their feature is *off* — + // this is how Error.pm's classic + // try { ... } catch Error::Simple with { ... } + // idiom is recognised (parses as `Error::Simple->catch(with {...})`). + if (peek(parser).type == LexerTokenType.IDENTIFIER + && isValidIndirectMethod(subName, parser) + && !prototypeHasGlob) { int currentIndex2 = parser.tokenIndex; String packageName = IdentifierParser.parseSubroutineIdentifier(parser); // System.out.println("maybe indirect object: " + packageName + "->" + subName); @@ -484,7 +492,21 @@ static Node parseSubroutineCall(Parser parser, boolean isMethod) { } private static boolean isValidIndirectMethod(String subName) { - return !CORE_PROTOTYPES.containsKey(subName) && !subName.startsWith("CORE::"); + return isValidIndirectMethod(subName, null); + } + + private static boolean isValidIndirectMethod(String subName, Parser parser) { + if (subName.startsWith("CORE::")) return false; + if (!CORE_PROTOTYPES.containsKey(subName)) return true; + // `try`, `catch`, `finally` are feature-gated. When the `try` + // feature is *off* they are not reserved and can participate in + // indirect-object parsing (Error.pm's `catch CLASS with {...}` idiom). + if (parser != null + && (subName.equals("try") || subName.equals("catch") || subName.equals("finally")) + && !parser.ctx.symbolTable.isFeatureCategoryEnabled("try")) { + return true; + } + return false; } private static Node parseIndirectMethodCall(Parser parser, IdentifierNode nameNode) { diff --git a/src/main/java/org/perlonjava/runtime/io/LayeredIOHandle.java b/src/main/java/org/perlonjava/runtime/io/LayeredIOHandle.java index 108378923..11b30810c 100644 --- a/src/main/java/org/perlonjava/runtime/io/LayeredIOHandle.java +++ b/src/main/java/org/perlonjava/runtime/io/LayeredIOHandle.java @@ -1,5 +1,6 @@ package org.perlonjava.runtime.io; +import org.perlonjava.runtime.runtimetypes.PerlJavaUnimplementedException; import org.perlonjava.runtime.runtimetypes.RuntimeScalar; import java.nio.charset.Charset; @@ -248,6 +249,15 @@ public RuntimeScalar binmode(String modeStr) { // Parse and apply new layers parseAndSetLayers(modeStr); return new RuntimeScalar(1); + } catch (PerlJavaUnimplementedException e) { + // Loud-fail for unimplemented layers (e.g. :via(Foo)). + // Matches upstream behavior of returning false from binmode on + // layer push failure, but also surfaces the reason via a warning + // so users don't silently lose their layer configuration. + org.perlonjava.runtime.operators.WarnDie.warn( + new RuntimeScalar(e.getMessage() + "\n"), + new RuntimeScalar("")); + return new RuntimeScalar(0); } catch (Exception e) { return new RuntimeScalar(0); } @@ -303,19 +313,21 @@ private String[] splitLayers(String modeStr) { } start = i + 1; i++; - } else if (modeStr.startsWith("encoding(", i)) { - // Handle encoding(...) specially to preserve parentheses + } else if (modeStr.startsWith("encoding(", i) || modeStr.startsWith("via(", i)) { + // Handle encoding(...) / via(...) specially to preserve parentheses. + // Without this, ":via(Foo::Bar)" would be split at the "::" inside + // the class name because ":" is the layer separator. int closeIdx = modeStr.indexOf(')', i); if (closeIdx != -1) { - // Extract everything before encoding() if any + // Extract everything before the layer() call if any if (i > start) { result.add(modeStr.substring(start, i)); } - // Extract the complete encoding(...) specification + // Extract the complete layer(...) specification result.add(modeStr.substring(i, closeIdx + 1)); i = closeIdx + 1; start = i; - // Skip separator if present after encoding() + // Skip separator if present after the layer() if (i < modeStr.length() && modeStr.charAt(i) == ':') { start++; i++; @@ -389,6 +401,18 @@ private void addLayer(String layerSpec) { } catch (Exception e) { throw new IllegalArgumentException("Unknown encoding: " + charsetName); } + } else if (layerSpec.startsWith("via(") && layerSpec.endsWith(")")) { + // :via(Foo) invokes a Perl-implemented PerlIO layer. PerlOnJava + // does not yet bridge the :via(...) layer dispatch back into + // Perl callbacks (PUSHED / FILL / READ / WRITE / CLOSE ...). + // Fail loudly so users don't get a silent no-op; see + // dev/modules/perlio_via.md for the plan to make this + // functional. Under JPERL_UNIMPLEMENTED=warn this is still + // caught by binmode()/open() and surfaced via $!. + String className = layerSpec.substring(4, layerSpec.length() - 1); + throw new PerlJavaUnimplementedException( + "PerlIO layer :via(" + className + ") not implemented " + + "in PerlOnJava (see dev/modules/perlio_via.md)"); } else { throw new IllegalArgumentException("Unknown layer: " + layerSpec); } diff --git a/src/main/java/org/perlonjava/runtime/operators/IOOperator.java b/src/main/java/org/perlonjava/runtime/operators/IOOperator.java index 79234c533..84600de81 100644 --- a/src/main/java/org/perlonjava/runtime/operators/IOOperator.java +++ b/src/main/java/org/perlonjava/runtime/operators/IOOperator.java @@ -636,10 +636,10 @@ public static RuntimeScalar open(int ctx, RuntimeBase... args) { } // Check if it's a numeric file descriptor - if (argStr.matches("^\\d+$")) { + if (argStr.matches("^-?\\d+$")) { int fd = Integer.parseInt(argStr); // Handle numeric file descriptor duplication - RuntimeIO sourceHandle = findFileHandleByDescriptor(fd); + RuntimeIO sourceHandle = fd >= 0 ? findFileHandleByDescriptor(fd) : null; if (sourceHandle != null && sourceHandle.ioHandle != null) { if (isParsimonious) { // &= mode: non-owning wrapper sharing the same fd @@ -649,7 +649,14 @@ public static RuntimeScalar open(int ctx, RuntimeBase... args) { fh = duplicateFileHandle(sourceHandle); } } else { - throw new PerlCompilerException("Bad file descriptor: " + fd); + // Match real Perl: negative fd -> return undef with empty $!, + // unknown non-negative fd -> return undef with $! = EBADF + if (fd >= 0) { + GlobalVariable.getGlobalVariable("main::!").set(9); + } else { + GlobalVariable.getGlobalVariable("main::!").set(""); + } + fh = null; } } // Check if it's a GLOB or GLOBREFERENCE diff --git a/src/main/java/org/perlonjava/runtime/runtimetypes/RuntimeGlob.java b/src/main/java/org/perlonjava/runtime/runtimetypes/RuntimeGlob.java index 42e3e58d3..35446fcfc 100644 --- a/src/main/java/org/perlonjava/runtime/runtimetypes/RuntimeGlob.java +++ b/src/main/java/org/perlonjava/runtime/runtimetypes/RuntimeGlob.java @@ -415,21 +415,33 @@ public RuntimeScalar set(RuntimeGlob value) { // Alias the IO slot: both names point to the same IO object // Must update BOTH this.IO (for detached copies) AND the global glob's IO + // + // NOTE: Prefer `value.IO` (the RuntimeGlob instance that was actually + // passed in) over `getGlobalIO(globName).IO`. The two may differ when + // the source glob has been removed from its stash, which is exactly + // what `Symbol::gensym` does (it creates `*Symbol::GEN`, stashes a + // reference, and then `delete $Symbol::{GEN}` to make the glob + // "anonymous"). A subsequent `tie *$ref, ...` stores the TieHandle + // on that original RuntimeGlob — not in any stash — so + // `getGlobalIO(globName)` would materialise a *fresh* empty glob and + // the tie would be lost during `*STDERR = $fh` / `*STDERR = *$fh`. RuntimeGlob sourceIO = GlobalVariable.getGlobalIO(globName); RuntimeGlob targetIO = GlobalVariable.getGlobalIO(this.globName); + RuntimeScalar ioSource = (value.IO != null) ? value.IO : sourceIO.IO; + // Save old IO for selectedHandle check (needed for local *STDOUT = *OTHER) RuntimeIO oldRuntimeIO = null; if (this.IO != null && this.IO.value instanceof RuntimeIO rio) { oldRuntimeIO = rio; } - this.IO = sourceIO.IO; - targetIO.IO = sourceIO.IO; + this.IO = ioSource; + targetIO.IO = ioSource; // Update selectedHandle if the old IO was the currently selected output handle if (oldRuntimeIO != null && oldRuntimeIO == RuntimeIO.selectedHandle - && sourceIO.IO != null && sourceIO.IO.value instanceof RuntimeIO newRIO) { + && ioSource != null && ioSource.value instanceof RuntimeIO newRIO) { RuntimeIO.selectedHandle = newRIO; } diff --git a/src/main/perl/lib/PerlIO/via.pm b/src/main/perl/lib/PerlIO/via.pm new file mode 100644 index 000000000..3b75c4694 --- /dev/null +++ b/src/main/perl/lib/PerlIO/via.pm @@ -0,0 +1,42 @@ +package PerlIO::via; + +use strict; +use warnings; + +our $VERSION = '0.19'; + +# In standard Perl, PerlIO::via is an XS module that lets you implement +# PerlIO layers in pure Perl via callbacks (PUSHED, FILL, READ, WRITE, +# CLOSE, ...). In PerlOnJava, layered I/O is dispatched in Java +# (LayeredIOHandle), and there is currently no bridge from a :via(Foo) +# layer back into user-supplied Perl callbacks. +# +# This stub exists so that: +# * `use PerlIO::via;` loads successfully. +# * CPAN modules whose prerequisite chain lists PerlIO::via (e.g. +# PerlIO::via::Timeout -> IO::Socket::Timeout -> Redis) can be +# installed and loaded. +# +# Actually opening a handle with `:via(Foo)` is a separate concern: the +# Java-side layer parser throws a clear error when it sees `:via(...)`, +# so the lack of dispatch does not fail silently. See +# dev/modules/perlio_via.md for the plan to make this functional. + +1; +__END__ + +=head1 NAME + +PerlIO::via - stub module for PerlOnJava + +=head1 DESCRIPTION + +Loading-only stub. The real XS implementation of the C<:via(...)> +PerlIO layer is not yet bridged to PerlOnJava's Java-side layered +I/O. An C that includes C<:via(Foo)> will raise an explicit +error from the layer parser rather than silently ignoring the layer. + +See C for the plan to make this module +functional. + +=cut