fix(parser): join multi-element subscripts with $; in chained hash deref #556

Closed
fglock wants to merge 1 commit into master from fix/chained-hash-multi-subscript

fglock commented Apr 24, 2026

Summary

Fixes a bug where chained hash dereference with multi-element subscripts kept only the last element instead of joining with $; (SUBSEP).

$h{-word => 'ou'}       # FETCH("-word\x1cou")  — already correct
$h{a}{-word => 'ou'}    # FETCH("ou")           — BUG, now fixed

Under the hood, $h{a}{b} goes through handleArrowHashDeref (implicit arrow deref). That path evaluated the subscript list in scalar context, which drops all but the last element. The non-deref top-level path already joined multi-element subscripts with $;, so the two disagreed.

Fix: in handleArrowHashDeref, when the HashLiteralNode subscript has more than one element, emit it as a list and wrap it in join($;, ...), matching the top-level case.
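For reference, the Perl-level behavior that both paths should agree on can be sketched with a plain (untied) hash. This is an illustration of stock perl semantics, not of the PerlOnJava parser internals; the variable names are made up for the example.

```perl
use strict;
use warnings;

my %h;

# Top-level multi-element subscript: the elements are joined with $;
# (SUBSEP, "\x1c" by default) to form a single key.
$h{-word => 'ou'} = 1;

# Chained (implicit arrow) multi-element subscript: the same joining
# rule applies to the inner hash.
$h{a}{-word => 'ou'} = 2;

# The key both lookups are expected to use.
my $expected = join($;, '-word', 'ou');

print exists $h{$expected}    ? "top-level ok\n" : "top-level mismatch\n";
print exists $h{a}{$expected} ? "chained ok\n"   : "chained mismatch\n";
print "key length: ", length($expected), "\n";
```

Under stock perl both checks should report ok, and the joined key is 8 characters long (5 for -word, 1 for the separator, 2 for ou), the same length the repro reports after the fix.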

Discovery

Found while investigating jcpan -t Regexp::Common. Its tied hash chains FETCH calls, and patterns like $RE{list}{conj}{-word => 'ou'} rely on -word reaching FETCH.

Test plan

  • make (full unit test suite) passes
  • Repro script shows FETCH now receives the full joined key at every level
  • Regexp::Common test suite improves from 22/73 failed test files to 11/73; completed subtests jump from 116,146 to 140,752 as previously-aborted files (t/test_list.t, t/URI/http.t, t/number/decimal.t, t/zip/us.t, etc.) now run to completion
  • t/test_list.t in Regexp::Common: 50/50 pass (was aborting at test 33)

Repro

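package T;
# Tied-hash class; the tied object is an array ref of the keys seen so far.
sub TIEHASH { my ($c, @d) = @_; bless \@d, $c }
# Report the key length, then return a fresh tied hash so lookups can chain.
sub FETCH { my ($s, $k) = @_; print "FETCH len=", length($k), "\n"; bless ref($s)->new(@$s, $k), ref($s) }
sub new { my ($c, @d) = @_; my %h; tie %h, $c, @d; \%h }
package main;
my %RE; tie %RE, 'T';
# The last subscript has two elements, so FETCH should see "-word" . $; . "ou".
my $x = $RE{list}{conj}{-word => 'ou'};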

Before: last FETCH reported len=2. After: len=8, matching real Perl.
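The same check could be turned into a regression test along these lines. This is only a sketch: the KeyLog class and the test names are hypothetical and not part of this PR, and it records the keys instead of printing their lengths.

```perl
use strict;
use warnings;
use Test::More tests => 3;

package KeyLog;    # hypothetical helper class, not part of the PR

our @seen;         # every key passed to FETCH, in call order

sub TIEHASH { my ($class, @path) = @_; return bless [@path], $class }

sub FETCH {
    my ($self, $key) = @_;
    push @seen, $key;
    # Return a reference to a fresh tied hash so the next subscript chains.
    my %next;
    tie %next, ref($self), @$self, $key;
    return \%next;
}

package main;

tie my %RE, 'KeyLog';
my $leaf = $RE{list}{conj}{-word => 'ou'};

# Count how many recorded keys equal 'conj' (grep in scalar context).
my $saw_conj = grep { $_ eq 'conj' } @KeyLog::seen;

is($KeyLog::seen[0], 'list', 'first level sees its plain key');
ok($saw_conj, 'second level sees its plain key');
is($KeyLog::seen[-1], join($;, '-word', 'ou'), 'last level sees the $;-joined key');
```

The last assertion is the one that distinguishes the two behaviors described above: with the bug, the final key would be just ou.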

Generated with Devin

Previously, chained hash access like $h{a}{-word => 'ou'} (implicit
arrow deref) evaluated the multi-element subscript in scalar context,
keeping only the last element ('ou'). The initial (non-deref) level
already joined keys with $; (SUBSEP) to form 'a$;b'-style keys, so the
two paths disagreed:

  $h{-word => 'ou'}       -> FETCH("-word\x1cou")  OK
  $h{a}{-word => 'ou'}    -> FETCH("ou")            BUG

Fix: in handleArrowHashDeref, when the HashLiteralNode subscript has
more than one element, emit it as a list and join with $; just like
the top-level case does.

Found while investigating `jcpan -t Regexp::Common`. Regexp::Common's
FETCH-chaining tied hash relies on this semantic (e.g.
$RE{list}{conj}{-word => 'ou'}). With the fix:
  - Failed test files: 22/73 -> 11/73
  - Completed tests:   116146 -> 140752 (previously aborted tests
    such as t/test_list.t, t/URI/http.t, t/number/decimal.t,
    t/zip/us.t now run to completion)

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>

fglock commented Apr 24, 2026

Superseded by #555 — that PR now includes this commit.
