parser & IO-layer grab-bag: prefix ~~ in return, chained hash subscripts, PerlIO::via stub #555

Merged
fglock merged 5 commits into master from fix/prefix-double-tilde
Apr 24, 2026

@fglock fglock commented Apr 24, 2026

Summary

Grab-bag PR picked up while running jcpan -t against CPAN modules. Three independent fixes, each motivated by a different module; unified here per request.

1. Prefix ~~ at start of return arguments

Fixes prefix ~~ (double bitwise complement, idiomatic for "force numeric scalar context", e.g. ~~@array to get a count) in return statements.

Two bugs, both triggered by the same Perl idiom in SQL::Beautify::_is_keyword:

    return ~~ grep { $_ eq uc($token) } @{$self->{keywords}};
  • In return ~~ EXPR, parseZeroOrMoreList's looksLikeEmptyList saw ~~ (registered as an INFIX_OP for smartmatch) as an infix operator and treated the list as empty, silently dropping EXPR. parseReturn now detects a leading ~~ and parses the operand as a normal expression. We can't change looksLikeEmptyList globally, because constructs like undef ~~ undef (where undef is a ;$-prototype op) legitimately parse as undef() followed by binary smartmatch — an earlier version of this fix broke that and regressed op/smartmatch.t.
  • ParsePrimary case "~~" tried to split ~~ into two ~ tokens by rewriting the single ~~ lexer token to ~ and re-parsing — but the second ~ token never existed (the lexer emits ~~ as a single token), so ~~@x was effectively ~@x. Rewrote to directly build ~(~EXPR).

Found via jcpan -t SQL::Beautify, which failed 5 subtests because _is_keyword always returned undef, so uc_keywords never uppercased anything.
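
The second bug can be illustrated with a minimal Java sketch. The `Node`, `ArrayCount`, and `Complement` types and the two builder methods are hypothetical stand-ins, not PerlOnJava's actual parser classes. Java's `long` is signed, whereas Perl's `~` operates on unsigned integers, so only the double-complement result is directly comparable.

```java
import java.util.List;

// Hypothetical mini-AST; none of these names are PerlOnJava's real classes.
class DoubleTildeSketch {
    interface Node { long eval(); }

    // Stands in for an array evaluated in scalar context (its element count).
    static final class ArrayCount implements Node {
        private final List<?> items;
        ArrayCount(List<?> items) { this.items = items; }
        public long eval() { return items.size(); }
    }

    // One application of bitwise complement.
    static final class Complement implements Node {
        private final Node operand;
        Complement(Node operand) { this.operand = operand; }
        public long eval() { return ~operand.eval(); }
    }

    // Old behaviour as described above: only one ~ survives the re-parse.
    static Node buildSingle(Node operand) {
        return new Complement(operand);
    }

    // Fixed behaviour: build ~(~EXPR) directly from the single ~~ token.
    static Node buildDouble(Node operand) {
        return new Complement(new Complement(operand));
    }

    public static void main(String[] args) {
        Node count = new ArrayCount(List.of("SELECT", "FROM", "WHERE"));
        System.out.println(buildSingle(count).eval()); // -4 (signed view of ~3)
        System.out.println(buildDouble(count).eval()); // 3
    }
}
```

Building the nested node directly avoids depending on a second `~` token that the lexer never emits.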

2. Multi-element subscripts in chained hash deref

Previously, chained hash access like $h{a}{-word => 'ou'} (implicit arrow deref) evaluated the multi-element subscript in scalar context, keeping only the last element ('ou'). The initial (non-deref) level already joined keys with $; (SUBSEP) to form 'a$;b'-style keys, so the two paths disagreed:

  $h{-word => 'ou'}       -> FETCH("-word\x1cou")  OK
  $h{a}{-word => 'ou'}    -> FETCH("ou")            wrong

Fixed the chained-deref path to join multi-element subscripts with $; too, matching both upstream Perl semantics and PerlOnJava's own behavior on the non-deref case.
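
As a rough illustration of the two behaviours, here is a small Java sketch. The class and method names are hypothetical and do not correspond to PerlOnJava's code generator; the only fact taken from Perl itself is that the default `$;` is the byte 0x1C.

```java
import java.util.List;

// Hypothetical helper; not PerlOnJava's actual emitter code.
class SubsepSketch {
    // Perl's default $; (SUBSEP) is "\034", i.e. byte 0x1C.
    static final String SUBSEP = "\u001c";

    // Behaviour described as the bug: scalar context keeps only the last element.
    static String scalarContextKey(List<String> subscript) {
        return subscript.get(subscript.size() - 1);
    }

    // Fixed behaviour: a multi-element subscript is joined with SUBSEP.
    static String joinedKey(List<String> subscript) {
        return String.join(SUBSEP, subscript);
    }

    public static void main(String[] args) {
        List<String> subscript = List.of("-word", "ou");
        System.out.println(scalarContextKey(subscript));                    // ou
        System.out.println(joinedKey(subscript).replace(SUBSEP, "\\x1c")); // -word\x1cou
    }
}
```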

Found via jcpan -t Regexp::Common.

3. PerlIO::via stub + loud-fail for :via(...) layer

Unblocks jcpan -t Redis (and the wider PerlIO::via::Timeout -> IO::Socket::Timeout -> Redis dependency chain). Before this change, PerlIO::via was missing entirely, so CPAN flagged it as a failed prerequisite and the whole chain was marked NA, causing the compile tests of the downstream modules to error with "Can't locate ...".

  • src/main/perl/lib/PerlIO/via.pm — new stub (same pattern as the existing PerlIO::encoding stub). Lets use PerlIO::via; succeed so CPAN's prerequisite resolver is happy and downstream modules can be installed. No layer dispatch happens here — that's out of scope for this change.
  • LayeredIOHandle — teach splitLayers that via(...) is paren-grouped (same treatment as encoding(...)), so a spec like :via(Foo::Bar) is not split at the :: inside the class name. addLayer now recognizes via(...) and throws PerlJavaUnimplementedException. binmode catches that exception type specifically and emits a Perl-level warning instead of silently swallowing it — so open($fh, "<:via(Foo)", ...) now says "PerlIO layer :via(Foo) not implemented" rather than returning a silently layer-less handle.
  • dev/modules/perlio_via.md — design doc covering the full plan for a functional PerlIO::via implementation (bridging layer dispatch into user-supplied Perl callbacks). Not attempted here; tracked separately.
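
The LayeredIOHandle behaviour described above might look roughly like the following sketch. It is an assumption-laden illustration, not the code in this PR: the method bodies are invented, `UnsupportedOperationException` stands in for PerlJavaUnimplementedException, warnings are collected in a list rather than emitted through Perl's warning machinery, and only colon-separated specs are handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names mirror the description above but the bodies are invented.
class LayerSpecSketch {
    // Split ":a:b(c::d)" at colons, except colons nested inside parentheses.
    static List<String> splitLayers(String spec) {
        List<String> layers = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;
        for (int i = 0; i < spec.length(); i++) {
            char c = spec.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')' && depth > 0) {
                depth--;
            }
            if (c == ':' && depth == 0) {
                if (current.length() > 0) {
                    layers.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            layers.add(current.toString());
        }
        return layers;
    }

    // via(...) is recognised but rejected loudly; other layers are accepted here.
    static void addLayer(String layer) {
        if (layer.startsWith("via(") && layer.endsWith(")")) {
            throw new UnsupportedOperationException(
                    "PerlIO layer :" + layer + " not implemented");
        }
    }

    // Stand-in for the binmode entry point: report the failure instead of swallowing it.
    static List<String> applyLayers(String spec) {
        List<String> warnings = new ArrayList<>();
        for (String layer : splitLayers(spec)) {
            try {
                addLayer(layer);
            } catch (UnsupportedOperationException e) {
                warnings.add(e.getMessage());
            }
        }
        return warnings;
    }

    public static void main(String[] args) {
        System.out.println(splitLayers(":via(Foo::Bar):crlf")); // [via(Foo::Bar), crlf]
        System.out.println(applyLayers(":via(Foo)"));           // [PerlIO layer :via(Foo) not implemented]
    }
}
```

Tracking parenthesis depth keeps the `::` in a class name from being treated as two layer separators, which is the same reason `encoding(...)` needs grouping.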

Result:

    ./jcpan -t Redis
    ...
    Result: PASS  (PerlIO::via::Timeout)
    Result: PASS  (IO::Socket::Timeout)
    Result: PASS  (Redis)

All Redis tests that can run without fork / a live redis-server now pass.

Test plan

  • make passes
  • jcpan -t SQL::Beautify passes
  • jcpan -t Regexp::Common passes
  • jcpan -t Redis passes (all three modules in the chain: PerlIO::via::Timeout, IO::Socket::Timeout, Redis)
  • Existing layer specs (:encoding(UTF-8):crlf, etc.) still work — regression-checked with a small round-trip

Generated with Devin

The unary prefix `~~` (double bitwise complement, idiomatically used
to force numeric scalar context, e.g. `~~@array` to get a count) was
broken in two ways:

1. In `return ~~ EXPR`, `parseZeroOrMoreList`'s `looksLikeEmptyList`
   saw `~~` (registered as an INFIX_OP for smartmatch) as an infix
   operator and treated the list as empty, silently dropping EXPR.
   `parseReturn` now detects a leading `~~` and parses the operand
   as a normal expression. We can't change `looksLikeEmptyList` in
   general, because constructs like `undef ~~ undef` (where `undef`
   is a ;$-prototype op) legitimately parse as `undef()` followed by
   binary smartmatch.

2. `ParsePrimary` case `"~~"` tried to handle `~~` as two `~` tokens
   by rewriting the single `~~` lexer token to `~` and re-parsing,
   but the second `~` token never existed (the lexer emits `~~` as a
   single token). As a result `~~@x` was effectively `~@x`. Rewrote
   it to directly build `~(~EXPR)`.

Found via `jcpan -t SQL::Beautify`, where `SQL::Beautify::_is_keyword`
uses `return ~~ grep { ... } @kws` and always returned undef, so
`uc_keywords` never uppercased anything. All 46 SQL::Beautify tests
now pass, and op/smartmatch.t failure count matches master.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
@fglock fglock force-pushed the fix/prefix-double-tilde branch from 1438e81 to bb9e3d9 April 24, 2026 13:01
Previously, chained hash access like $h{a}{-word => 'ou'} (implicit
arrow deref) evaluated the multi-element subscript in scalar context,
keeping only the last element ('ou'). The initial (non-deref) level
already joined keys with $; (SUBSEP) to form 'a$;b'-style keys, so the
two paths disagreed:

  $h{-word => 'ou'}       -> FETCH("-word\x1cou")  OK
  $h{a}{-word => 'ou'}    -> FETCH("ou")            BUG

Fix: in handleArrowHashDeref, when the HashLiteralNode subscript has
more than one element, emit it as a list and join with $; just like
the top-level case does.

Found while investigating `jcpan -t Regexp::Common`. Regexp::Common's
FETCH-chaining tied hash relies on this semantic (e.g.
$RE{list}{conj}{-word => 'ou'}). With the fix:
  - Failed test files: 22/73 -> 11/73
  - Completed tests:   116146 -> 140752 (previously aborted tests
    such as t/test_list.t, t/URI/http.t, t/number/decimal.t,
    t/zip/us.t now run to completion)

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
@fglock fglock changed the title from "fix(parser): prefix ~~ at list-operator start" to "fix(parser): prefix ~~ in return + multi-element subscripts in chained hash deref" Apr 24, 2026
Unblocks `jcpan -t Redis` (and the wider PerlIO::via::Timeout ->
IO::Socket::Timeout -> Redis dependency chain). Before this change,
`PerlIO::via` was missing entirely, so CPAN flagged it as a failed
prerequisite and the whole chain was marked NA, causing the compile
tests of the downstream modules to error with "Can't locate ...".

Changes:

* src/main/perl/lib/PerlIO/via.pm — new stub (same pattern as the
  existing PerlIO::encoding stub). Lets `use PerlIO::via;` succeed so
  CPAN's prerequisite resolver is happy and downstream modules can be
  installed. No layer dispatch happens here — that's out of scope for
  this change.

* src/main/java/org/perlonjava/runtime/io/LayeredIOHandle.java —
  teach `splitLayers` that `via(...)` is paren-grouped (same treatment
  as `encoding(...)`), so a spec like `:via(Foo::Bar)` is not split at
  the `::` inside the class name. Then in `addLayer`, recognize
  `via(...)` and throw PerlJavaUnimplementedException with a message
  pointing at `dev/modules/perlio_via.md`. The `binmode` entry point
  catches this exception type specifically and emits a Perl-level
  warning (instead of silently swallowing it), so a user who writes
  `open($fh, "<:via(Foo)", ...)` sees a clear "PerlIO layer :via(Foo)
  not implemented" message rather than getting a silently layer-less
  handle.

* dev/modules/perlio_via.md — design doc covering the full plan for a
  functional `PerlIO::via` implementation (bridging layer dispatch
  into Perl callbacks). Not attempted here; tracked separately.

Result:

    ./jcpan -t Redis
    ...
    Result: PASS (PerlIO::via::Timeout)
    Result: PASS (IO::Socket::Timeout)
    Result: PASS (Redis)

All Redis tests that can run without fork / a live `redis-server` now
pass.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
@fglock fglock changed the title from "fix(parser): prefix ~~ in return + multi-element subscripts in chained hash deref" to "parser & IO-layer grab-bag: prefix ~~ in return, chained hash subscripts, PerlIO::via stub" Apr 24, 2026
fglock and others added 2 commits April 24, 2026 15:56
`open($fh, "<&=", $fd)` now matches real Perl when the file descriptor
cannot be duplicated:

- Negative fd (e.g. `fileno($in_memory_fh)` returns -1): return undef
  with `$!` left empty.
- Unknown non-negative fd: return undef with `$!` set to EBADF (9).

Previously both cases threw a fatal "Bad filehandle: N" /
"Bad file descriptor: N" compile exception, which broke idioms like:

    open my $fh, '<', \'' or die $!;
    my $io = IO::Handle->new;
    $io->fdopen(fileno($fh), "r");   # fileno == -1

Fixes `jcpan -t Protocol::WebSocket` (t/client.t and
t/draft-ietf-hybi-17/request_psgi.t); the full Protocol-WebSocket test
suite is now `Result: PASS` (685 tests).
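
The two failure cases above can be sketched as a small decision function. This is an illustrative Java sketch only: `FdDupSketch`, `Result`, and the set of open descriptors are hypothetical, and an `errno` of 0 is used here to model "`$!` left empty".

```java
import java.util.Set;

// Hypothetical model of the "<&=" outcome; not PerlOnJava's actual open() code.
class FdDupSketch {
    static final int EBADF = 9;

    static final class Result {
        final boolean ok;   // false models "open returns undef"
        final int errno;    // 0 models "$! left empty"
        Result(boolean ok, int errno) {
            this.ok = ok;
            this.errno = errno;
        }
    }

    static Result dupFd(int fd, Set<Integer> openFds) {
        if (fd < 0) {
            // e.g. fileno() of an in-memory handle returns -1
            return new Result(false, 0);
        }
        if (!openFds.contains(fd)) {
            // unknown non-negative descriptor
            return new Result(false, EBADF);
        }
        return new Result(true, 0);
    }

    public static void main(String[] args) {
        Set<Integer> open = Set.of(0, 1, 2);
        System.out.println(dupFd(-1, open).errno); // 0
        System.out.println(dupFd(42, open).errno); // 9
        System.out.println(dupFd(1, open).ok);     // true
    }
}
```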

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Two independent fixes uncovered while investigating `jcpan -t Pipeline`.

1. parser: indirect-method parsing of `catch`/`try`/`finally`

   Error.pm's classic syntax

       try { ... } catch Error::Simple with { ... } finally { ... };

   is parsed by Perl as

       'Error::Simple'->catch(&with(sub { ... })); ...

   i.e. the `catch` here is an indirect-method invocation, not a call
   to a sub named `catch`.  Real Perl only treats `try`/`catch`/`finally`
   as reserved when the `try` feature is enabled; without it, they are
   regular identifiers and the Error.pm idiom works.

   PerlOnJava unconditionally listed these names in `CORE_PROTOTYPES`,
   which caused `isValidIndirectMethod` to reject them regardless of
   feature state — producing `syntax error near "::Simple with "` on
   every Error.pm test.

   Fix: make `isValidIndirectMethod` feature-aware.  When the `try`
   feature is *off*, `try`/`catch`/`finally` participate in
   indirect-object parsing, matching real Perl.

2. runtime: preserve tied IO across `*A = *B` when the source glob
   is anonymous (Symbol::gensym).

   `Symbol::gensym()` creates `*Symbol::GEN<n>` and then
   `delete $Symbol::{GEN<n>}` — the returned glob reference is the
   only live handle to the underlying RuntimeGlob.  Modules like
   `IO::String` then `tie *$ref, ...`, storing a `TieHandle` on that
   specific RuntimeGlob instance.

   `RuntimeGlob.set(RuntimeGlob)` was copying the IO slot via
   `GlobalVariable.getGlobalIO(globName)`, which for a deleted
   stash entry materialises a fresh empty glob.  Result: `*STDERR = $fh`
   (or `*STDERR = *$fh`) silently dropped the tie, so `print STDERR`
   never invoked the `PRINT` method.

   Fix: prefer `value.IO` (the RuntimeGlob instance the caller
   actually handed us) over the stash lookup.  Fall back to the
   stash only if the caller's instance has no IO slot.
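
The glob-assignment fix in item 2 can be modelled with a toy stash. The sketch below is a simplified assumption of how the lookup behaves, with hypothetical names (`Glob`, `stashIO`, `assignViaStash`, `assignPreferInstance`); it is not the real RuntimeGlob or GlobalVariable code.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of glob assignment; all names here are illustrative.
class GlobAssignSketch {
    static final class Glob {
        final String name;
        Object io; // null models "no IO slot"
        Glob(String name) { this.name = name; }
    }

    // Toy stash: glob name -> glob.
    static final Map<String, Glob> stash = new HashMap<>();

    // Looking up a name that was deleted from the stash materialises a fresh, empty glob.
    static Object stashIO(String name) {
        return stash.computeIfAbsent(name, Glob::new).io;
    }

    // Old behaviour: always go through the stash, losing the anonymous glob's IO.
    static void assignViaStash(Glob dest, Glob src) {
        dest.io = stashIO(src.name);
    }

    // Fixed behaviour: prefer the instance the caller handed over, then fall back.
    static void assignPreferInstance(Glob dest, Glob src) {
        dest.io = (src.io != null) ? src.io : stashIO(src.name);
    }

    public static void main(String[] args) {
        Glob gensym = new Glob("Symbol::GEN0"); // never registered in the stash
        gensym.io = "tied handle";
        Glob stderr = new Glob("main::STDERR");

        assignViaStash(stderr, gensym);
        System.out.println(stderr.io); // null

        assignPreferInstance(stderr, gensym);
        System.out.println(stderr.io); // tied handle
    }
}
```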

Impact
------

* Error.pm test suite: 10/15 failing → 1/15 failing (the remaining one
  is `t/08warndie.t`, which needs `fork()` — pre-existing).
* Pipeline test suite: nearly all files failing at compile → 16/19
  passing.  The newly-fixed `t/13emit.t` exercises exactly the
  `*STDERR = $fh` + IO::String pattern.
* `make` passes all unit tests on both backends.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
@fglock fglock merged commit 31cbeec into master Apr 24, 2026
2 checks passed
@fglock fglock deleted the fix/prefix-double-tilde branch April 24, 2026 20:51