feat(pod): bundle Pod::Html + regex parity fix; module porting plans #557
Merged
Plan-only design doc in dev/modules/math_int64.md covering:
- Why no Maven dep is needed (java.lang.Long, ByteBuffer, SecureRandom,
Math.*Exact cover the entire Int64.xs surface, signed and unsigned).
- Three XS MODULE blocks (miu64_, mi64, mu64) mapped to a single
MathInt64.java with an Int64Holder pattern matching dev/modules/bit_vector.md.
- Reuse of upstream lib/Math/Int64.pm and the two pure-Perl pragmas
(die_on_overflow, native_if_available) unchanged.
- Six implementation phases tied to the upstream .t files.
- Phase 0 prerequisite (separate PR): make ExtUtils::CBuilder fail
loudly when Config{cc}=javac and fix the relative archlibexp /
empty obj_ext issues uncovered while investigating
`jcpan -t Math::Int64`.
No implementation yet.
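
As a rough illustration of the "no Maven dep" point above, the following sketch shows how plain JDK calls can cover checked signed arithmetic, unsigned arithmetic on a `long`, and fixed-width byte encoding. The class and method names (`Int64Sketch`, `addChecked`, `udiv`, and so on) are hypothetical and are not taken from the plan document or from Int64.xs.

```java
import java.nio.ByteBuffer;

// Hypothetical helper names; only the JDK calls themselves are real API.
final class Int64Sketch {
    /** Signed add that throws ArithmeticException on overflow. */
    static long addChecked(long a, long b) {
        return Math.addExact(a, b);
    }

    /** Unsigned 64-bit division on values carried in a signed long. */
    static long udiv(long a, long b) {
        return Long.divideUnsigned(a, b);
    }

    /** Decimal string of a long interpreted as an unsigned 64-bit value. */
    static String utoString(long a) {
        return Long.toUnsignedString(a);
    }

    /** 8-byte big-endian encoding (ByteBuffer's default byte order). */
    static byte[] toBigEndian(long a) {
        return ByteBuffer.allocate(Long.BYTES).putLong(a).array();
    }

    /** Inverse of toBigEndian; expects at least 8 bytes. */
    static long fromBigEndian(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong();
    }
}
```

For example, `Int64Sketch.utoString(-1L)` returns `18446744073709551615`, the largest unsigned 64-bit value, without any third-party big-integer library.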
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Plan-only design doc in dev/modules/pod_html.md covering two phases:
Phase 0 — Regex `^/m/g` fix (general infrastructure):
- Diagnoses a bug in RuntimeRegex.matchRegexDirect where, in LIST
context global matches, matcher.region(startPos, ...) is called
after every non-zero-length match (the `startPos > matchStart`
predicate is always true, despite a comment claiming otherwise).
- Java's Matcher.region defaults to useAnchoringBounds(true), making
^ match at the artificial region boundary even when that offset
is not actually preceded by \n. Result:
"ab\ncd\n" =~ /^(.*)/mg yields 4 matches in jperl vs 2 in perl.
- Verified the fix with a direct Java repro:
matcher.useAnchoringBounds(false) after each region() call restores
Perl-compatible ^/$ semantics.
- Includes a reduced unit-test outline.
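
The behaviour diagnosed above can be reproduced outside PerlOnJava with a few lines of plain `java.util.regex`. This is a minimal sketch, not the repro from the plan document: the class name `AnchoringBoundsRepro` and the helper `matchAll` are hypothetical, and the loop deliberately re-seeks with `region()` after every match to mimic the old code path.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical repro class; mimics "call region() after every match".
final class AnchoringBoundsRepro {
    /** Collects group(1) of successive /^(.*)/m matches, re-seeking via region(). */
    static List<String> matchAll(String input, boolean anchoringBounds) {
        Pattern p = Pattern.compile("^(.*)", Pattern.MULTILINE);
        Matcher m = p.matcher(input);
        List<String> out = new ArrayList<>();
        int pos = 0;
        while (pos <= input.length()) {
            m.region(pos, input.length());
            // With anchoring bounds on, ^ matches at the region start
            // even when the previous character is not a newline.
            m.useAnchoringBounds(anchoringBounds);
            if (!m.find()) {
                break;
            }
            out.add(m.group(1));
            // Step past zero-length matches so the loop always advances.
            pos = (m.end() > m.start()) ? m.end() : m.end() + 1;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(matchAll("ab\ncd\n", true).size());   // 4 (buggy)
        System.out.println(matchAll("ab\ncd\n", false).size());  // 2 (Perl-compatible)
    }
}
```

With anchoring bounds enabled, the extra two matches are empty strings found at offsets 2 and 5, which are region starts rather than real line starts.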
Phase 1 — Bundle Pod::Html:
- Pod::Html is dual-life and only shipped on CPAN inside the full
perl source tarball, so `jcpan -t Pod::Html` is structurally a
dead end. Plan to add it via dev/import-perl5/sync.pl.
- All dependencies (Pod::Simple{,::XHTML,::SimpleTree,::Search},
Text::Tabs, etc.) already work in PerlOnJava.
- 13 of 16 substantive upstream tests already pass against the
in-tree code; the 3 failures all trace back to the Phase 0 regex
bug via Pod::Html::Util::trim_leading_whitespace.
No implementation yet.
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Implements dev/modules/pod_html.md.
Phase 0 — Regex ^ in /m mode under /g.
In LIST-context global matches, matcher.region(startPos, ...) was
called after every non-zero-length match. A Java Matcher defaults to
useAnchoringBounds(true), making ^ match at the artificial region
boundary even when that offset is not actually preceded by \n. Result:
"ab\ncd\n" =~ /^(.*)/mg yielded 4 matches in jperl, 2 in perl.
This silently corrupted any line-walking idiom that combines ^/$
under /m with /g — including Pod::Html::Util::trim_leading_whitespace,
which is why Pod::Html's verbatim-block dedenting was broken.
Fix in RuntimeRegex.matchRegexDirect:
- Tighten the predicate so matcher.region(...) is only called when
the engine forcibly advanced past a zero-length match (the
matchEnd = matchStart + 1 path); in every other case Java's
find() already continues from end() naturally.
- Add matcher.useAnchoringBounds(false) at the remaining
region() call sites (the initial pos()-based seek and the
zero-length-advance redirect), restoring Perl's ^/$ semantics.
New unit test src/test/resources/unit/regex/regex_caret_multiline_global.t
covers the canonical line-walking forms (15 subtests).
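
The shape of the fix described above can be sketched as a standalone loop. This is an illustration under simplifying assumptions, not the code in `RuntimeRegex.matchRegexDirect`: the class `GlobalMatchSketch` and method `findAll` are hypothetical, and Perl's finer rules around zero-length matches are not modelled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a global-match loop with the tightened predicate.
final class GlobalMatchSketch {
    /** Returns {start, end} pairs for every match, scanning left to right. */
    static List<int[]> findAll(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        List<int[]> spans = new ArrayList<>();
        while (m.find()) {
            spans.add(new int[] { m.start(), m.end() });
            if (m.end() == m.start()) {
                // Zero-length match: force the search one position forward.
                int next = m.end() + 1;
                if (next > input.length()) {
                    break;
                }
                m.region(next, input.length());
                // Keep ^ and $ tied to real line boundaries, not the region edge.
                m.useAnchoringBounds(false);
            }
            // Non-zero-length match: no region() call; find() resumes at end().
        }
        return spans;
    }
}
```

Calling `findAll(Pattern.compile("^(.*)", Pattern.MULTILINE), "ab\ncd\n")` never touches `region()`, since both matches are non-empty, and yields two spans. The `region()` branch is exercised by patterns such as a bare `^` under `MULTILINE`.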
Phase 1 — Bundle Pod::Html.
Pod::Html is dual-life and CPAN ships it only inside the full perl
source tarball, so jcpan -t Pod::Html is structurally a dead end.
Bundle it via dev/import-perl5/sync.pl instead:
- Add perl5/ext/Pod-Html/lib/Pod entry to dev/import-perl5/config.yaml
(imports Pod/Html.pm 1.36 and Pod/Html/Util.pm).
- Copy upstream t/ and corpus/ into src/test/resources/module/Pod-Html/.
- All 18 upstream tests pass under `make test-bundled-modules`.
Cosmetic Config fix folded in (needed for feature2.t):
- Config{perladmin}, Config{cf_email}, Config{cf_by}, Config{myhostname}
are now populated from the running JVM's user.name + Sys::Hostname
(real perl gets these from Configure-time autoconf probing). They
show up in pod2html's <link rev="made" href="mailto:..."> tag and
in test fixtures that interpolate $Config{perladmin}.
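
One way such values could be derived on the JVM is sketched below. Everything here is an assumption for illustration: the class `ConfigIdentitySketch`, the `user@host` layout for `perladmin` and `cf_email`, and the use of `InetAddress` as a stand-in for `Sys::Hostname` are not taken from the PR.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; the key-to-value mapping is assumed, not from the PR.
final class ConfigIdentitySketch {
    /** Builds the four identity entries from an already-known user and host. */
    static Map<String, String> identity(String user, String host) {
        Map<String, String> config = new LinkedHashMap<>();
        config.put("cf_by", user);
        config.put("myhostname", host);
        config.put("cf_email", user + "@" + host);
        config.put("perladmin", user + "@" + host);
        return config;
    }

    /** Same, with user and host read from the running JVM. */
    static Map<String, String> identityFromJvm() {
        String user = System.getProperty("user.name", "unknown");
        String host;
        try {
            host = InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            host = "localhost"; // fallback when the host name cannot be resolved
        }
        return identity(user, host);
    }
}
```

Keeping the pure `identity(user, host)` function separate from the JVM lookup makes the mapping testable without depending on the machine's host name.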
All unit tests pass. All bundled module tests pass.
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Summary
Module-port plans plus a first implementation: this PR bundles `Pod::Html`, fixes the regex bug it depends on, and lands accompanying `%Config` improvements. `Math::Int64` remains plan-only for a future PR.

What's implemented in this PR

Phase 0: Regex `^/m/g` parity fix (general infrastructure)

In `RuntimeRegex.matchRegexDirect`, the LIST-context global-match loop was calling `matcher.region(startPos, ...)` after every non-zero-length match. A Java `Matcher` defaults to `useAnchoringBounds(true)`, which made `^` match at the artificial region boundary even when that offset wasn't actually preceded by `\n`. Result: `"ab\ncd\n" =~ /^(.*)/mg` yielded 4 matches in jperl vs 2 in perl.

This silently corrupted any line-walking idiom under `/^...$/mg`, including the `Pod::Html::Util::trim_leading_whitespace` dedent that broke verbatim block rendering.

Fix:
- Tighten the predicate so `matcher.region(...)` only runs when the engine forcibly advances past a zero-length match (the `matchEnd = matchStart + 1` path).
- Add `matcher.useAnchoringBounds(false)` at the remaining `region()` call sites so `^`/`$` only anchor at real `\n` line boundaries in the input string.

New unit test (`src/test/resources/unit/regex/regex_caret_multiline_global.t`, 15 subtests) covers the canonical idioms.

Phase 1: Bundle `Pod::Html`

`Pod::Html` is dual-life: CPAN ships it only inside the full perl source tarball, so `jcpan -t Pod::Html` is structurally a dead end (it tries to run perl's `Configure` shell script). Bundled via `dev/import-perl5/sync.pl` instead:
- Add `perl5/ext/Pod-Html/lib/Pod` entry to `dev/import-perl5/config.yaml` → imports `Pod/Html.pm` (1.36) and `Pod/Html/Util.pm`.
- Copy upstream `t/` and `corpus/` into `src/test/resources/module/Pod-Html/`.
- All 18 upstream tests pass under `make test-bundled-modules`.

Config cosmetic fix (folded in)

`$Config{perladmin}`, `$Config{cf_email}`, `$Config{cf_by}`, `$Config{myhostname}` are now populated from the running JVM's `user.name` + `Sys::Hostname` (real perl gets these from Configure-time autoconf). They show up in `pod2html`'s `<link rev="made" href="mailto:user@host">` tag and in test fixtures that interpolate `$Config{perladmin}`. This was originally tracked as Phase 3 of the Pod::Html plan; it is needed in this PR for `feature2.t` to pass without spurious "Use of uninitialized value" warnings.

Plan documents
- `dev/modules/math_int64.md`
- `dev/modules/pod_html.md`

Test plan
- `make` (full unit suite): green.
- `make test-bundled-modules`: green (all bundled modules including new Pod-Html subtree).
- `JPERL_TEST_FILTER=Pod-Html make test-bundled-modules`: 18/18 tests pass.
- `./jperl -e 'use Pod::Html; print "v=$Pod::Html::VERSION ok\n"'` → `v=1.36 ok`.
- Run under both `./jperl` and system `perl` (locks in cross-engine behaviour).

Generated with Devin