feat: CPAN smoke test, HTML::Parser XS backend, and MakeMaker deferred installation #412
Merged
…_sprintf562)
Devel::Cover Makefile.PL imports $VERSION from ExtUtils::MakeMaker, which
was not in @EXPORT_OK. Also adds $Verbose variable and _sprintf562
positional sprintf variant used by ExtUtils::MM_Any and other internal
modules.
- Add $VERSION, $Verbose, _sprintf562 to @EXPORT_OK
- Add our $Verbose = 0 declaration
- Add _sprintf562() subroutine (positional %1$s format)
- Add dev/modules/devel_cover.md fix plan
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Adds a tool to run jcpan -t on a curated registry of CPAN modules and
report installation/test status, XS detection, and regressions.
Features:
- Registry of 27 modules with category (known-good/partial/blocked)
- XS status tracking (pure-perl/java-xs/xs-with-pp-fallback/xs-required)
- Parses Test::Harness output for pass/fail counts
- Isolates target module results from dependency test output
- Regression detection via --compare with previous .dat files
- --quick mode for known-good regression checks
Usage: perl dev/tools/cpan_smoke_test.pl [--quick|--list] [Module...]
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Fork-based job pool allows running multiple jcpan -t processes
concurrently. Each child writes results to a temp file; parent collects
and reports as they finish. Default remains sequential (--jobs 1) for
safety since parallel jcpan runs share ~/.perlonjava/lib/.
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
… to smoke test
All three are pure-Perl modules. Parse::RecDescent depends on the
bundled Text::Balanced. Spreadsheet::WriteExcel would unlock
t/46_save_parser.t, which ParseExcel currently skips. Image::ExifTool
already passes 590/600 tests via its dedicated runner.
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Add 9 modules from CPAN top-20 (by favorites + reverse deps):
- Partial: JSON, Type::Tiny, List::MoreUtils, Template, Mojolicious
- Blocked: Plack, LWP::UserAgent, DBIx::Class, DBI
Create dev/modules/smoke_test_investigation.md documenting:
- Shared root causes (Clone::PP, MIME::Base64 $VERSION, Encode::Locale,
  PerlIO::encoding, exit codes)
- Per-module failure analysis for all 39 registered modules
- Prioritized fix order by impact
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
P1: Create Clone::PP using Storable::dclone (unblocks HTTP::Message chain)
P2: Set MIME::Base64 $VERSION=3.16 in both .pm and Java backend
P4: Create PerlIO::encoding stub (helps IO::HTML and encoding-aware modules)
Additional fixes:
- Add $VERSION to bundled JSON.pm (fixes JSON CONFIG_FAIL in smoke test)
- Create Template::Stash::XS shim inheriting from Template::Stash
  (pure Perl fallback; Template still blocked by P6 regex bug)
Update investigation plan with P6 (regex engine \| alternation bug) and
progress tracking for completed fixes.
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
…ipes
Two bugs fixed in RegexPreprocessor.java:
1. Quantifier validation used raw StringBuilder inspection to detect
"quantifier follows nothing" after |. This failed for escaped \|
(literal pipe) which also ends with | in the output buffer.
Replaced with the existing lastWasQuantifiable flag which already
correctly distinguishes alternation | (sets false) from escape
sequences like \| (sets true).
2. Lookahead (?=), lookbehind (?<=, (?<!), negative lookahead (?!),
and atomic groups (?>) were routed through handleRegularParentheses
which only appended '(' and started recursive parsing at the '?'.
The recursive handleRegex then treated '?' as a quantifier, causing
"Quantifier follows nothing" errors. Fixed by appending the full
group opener (e.g., "(?=") and starting recursive parsing after it.
This also fixes incorrect capture group counting - these non-capturing
constructs were being counted as capturing groups.
Patterns that now work:
- qr/\|\|?/ (escaped pipe with quantifier)
- /(a)(?=b)/ (lookahead after capture)
- /(.*)(?:::|')(?=.)/ (constant.pm pattern used by Template Toolkit)
- /(?<=a)b/ (lookbehind)
- /(?>a+)b/ (atomic group)
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
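The target behaviour described in this commit can be checked against java.util.regex directly. The following is a minimal sketch, not code from RegexPreprocessor.java: the class name `LookaroundDemo` and its helpers are hypothetical, and it only shows that the listed constructs compile in Java and that lookaround groups do not count as capturing groups.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the PR.
class LookaroundDemo {
    // Number of capturing groups java.util.regex assigns to a pattern.
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // True if the pattern matches somewhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Lookahead after a capture: only (a) is a capturing group.
        System.out.println(groupCount("(a)(?=b)"));
        // The constant.pm-style pattern: one capture, one (?:...), one lookahead.
        System.out.println(groupCount("(.*)(?:::|')(?=.)"));
        // Escaped pipe followed by a quantifier is a valid pattern.
        System.out.println(found("\\|\\|?", "a|b"));
        // Lookbehind and atomic group.
        System.out.println(found("(?<=a)b", "ab"));
        System.out.println(found("(?>a+)b", "aaab"));
    }
}
```

If the preprocessor passes the full group opener through unchanged, as the commit message describes, these are the group counts the Java engine would report for the translated patterns.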
… load
Previously Encode was pre-loaded at startup (GlobalContext.initialize),
which set %INC and prevented Encode.pm from executing. This meant any
Perl-level wrapper code in Encode.pm was dead.
Changes:
- Defer Encode initialization to XSLoader::load (like TimeHiRes,
  UnicodeNormalize). Encode.pm now runs normally when `use Encode` is
  called.
- Set Encode constructor to setInc=false so %INC is managed by require,
  not the Java module.
- Add a Perl-level find_encoding wrapper in Encode.pm that falls back to
  Encode::Alias::find_alias when the Java charset lookup fails. This
  enables coderef/regex/string aliases registered by modules like
  Encode::Locale (e.g. "locale" -> "UTF-8").
- Add resolve_alias() implementation.
- Per-name recursion guard prevents circular alias chains.
Unblocks: Encode::Locale, HTTP::Message chain (partially)
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
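The lookup order described above (native charset first, then alias table, with a per-name recursion guard) can be sketched in Java. This is an illustration under stated assumptions, not the PR's code: the real wrapper lives in Perl inside Encode.pm and also supports regex and coderef aliases, while the hypothetical `EncodingLookupSketch` below handles only plain string aliases.

```java
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; names are illustrative, not from the PR.
class EncodingLookupSketch {
    private final Map<String, String> aliases = new HashMap<>();
    private final Set<String> inProgress = new HashSet<>();

    // Register a string alias, e.g. "locale" -> "UTF-8".
    void defineAlias(String alias, String target) {
        aliases.put(alias.toLowerCase(Locale.ROOT), target);
    }

    // Try the JVM charset lookup first, then fall back to the alias table.
    // Returns null when nothing resolves or when an alias chain is circular.
    Charset findEncoding(String name) {
        try {
            if (Charset.isSupported(name)) {
                return Charset.forName(name);
            }
        } catch (IllegalArgumentException e) {
            // Syntactically illegal charset name: fall through to aliases.
        }
        String key = name.toLowerCase(Locale.ROOT);
        if (!inProgress.add(key)) {
            return null; // already resolving this name: circular chain
        }
        try {
            String target = aliases.get(key);
            return target == null ? null : findEncoding(target);
        } finally {
            inProgress.remove(key);
        }
    }

    public static void main(String[] args) {
        EncodingLookupSketch lookup = new EncodingLookupSketch();
        lookup.defineAlias("locale", "UTF-8");
        System.out.println(lookup.findEncoding("locale"));
    }
}
```

The guard is keyed per name rather than being a single global flag, so an alias may legitimately resolve through another alias while a cycle still terminates.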
Two bugs fixed:
1. exit() ignored $? modifications by END blocks (Perl 5 perlvar
semantics). Now sets $? to the exit code before running END blocks,
calls runEndBlocks(false) to preserve it, and reads $? back as the
final exit code. This fixes Test::Needs (200/227 -> 227/227) where
Test2's END block needs to override exit(0) with a failure code.
2. require error messages were missing "Compilation failed in require".
Now builds the full Perl 5-compatible error (original + newline +
"Compilation failed in require") and sets $@ before throwing, so
eval{} sees the complete message.
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
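The ordering in fix 1 (store the requested code in the status variable, run END blocks, then read the status back) can be modelled as follows. This is a simplified sketch with hypothetical names; it does not reproduce `runEndBlocks(false)` or any other PerlOnJava internals, and it models `$?` as a plain int passed through each block.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.IntUnaryOperator;

// Hypothetical model of exit() and END-block interaction.
class ExitStatusSketch {
    // END blocks run in reverse order of registration (last in, first out).
    private final Deque<IntUnaryOperator> endBlocks = new ArrayDeque<>();

    // Each block receives the current status and returns the (possibly changed) status.
    void addEndBlock(IntUnaryOperator block) {
        endBlocks.push(block);
    }

    // Returns the final exit code after all END blocks have seen the status.
    int exit(int requested) {
        int status = requested; // corresponds to setting $? before END blocks
        for (IntUnaryOperator block : endBlocks) {
            status = block.applyAsInt(status);
        }
        return status; // corresponds to reading $? back
    }

    public static void main(String[] args) {
        ExitStatusSketch runtime = new ExitStatusSketch();
        // A test-framework-style END block that turns exit(0) into a failure code.
        runtime.addEndBlock(status -> status == 0 ? 1 : status);
        System.out.println(runtime.exit(0));
    }
}
```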
Implements HTMLParser.java providing:
- Full HTML::Entities decode_entities() and _decode_entities() with
  numeric (decimal/hex), named entity, surrogate pair, and prefix
  expansion support, ported from util.c
- UNICODE_SUPPORT() and _probably_utf8_chunk() stubs
- HTML::Parser construction (_alloc_pstate), 13 boolean accessors,
  handler registration, and basic event-driven HTML parsing
- Cross-package registration matching original Parser.xs layout
Unblocks: HTTP::Message (now PASS), Devel::Cover (now PASS),
HTML::Parser 190/415 tests passing.
Also comments out Mojolicious/Moose/Plack from smoke tests (timeout).
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
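To illustrate the entity decoding described above, here is a small Java sketch covering decimal, hexadecimal, and a handful of named entities. It is not the code in HTMLParser.java: `EntityDecodeSketch` is hypothetical, its named-entity table is deliberately tiny, and it requires a terminating semicolon, so prefix expansion is not modelled. Surrogate pairs written as two numeric entities combine naturally here because Java strings are UTF-16.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, simplified decoder; not the PR's implementation.
class EntityDecodeSketch {
    // Tiny illustrative subset of the named entities.
    private static final Map<String, String> NAMED =
            Map.of("amp", "&", "lt", "<", "gt", ">", "quot", "\"");

    // Group 1: decimal digits, group 2: hex digits, group 3: entity name.
    private static final Pattern ENTITY =
            Pattern.compile("&(?:#(\\d+)|#[xX]([0-9a-fA-F]+)|([A-Za-z][A-Za-z0-9]*));");

    static String decode(String text) {
        Matcher m = ENTITY.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(text, last, m.start());
            String replacement = resolve(m);
            // Unknown or invalid entities are left untouched.
            out.append(replacement != null ? replacement : m.group());
            last = m.end();
        }
        out.append(text, last, text.length());
        return out.toString();
    }

    private static String resolve(Matcher m) {
        if (m.group(3) != null) {
            return NAMED.get(m.group(3));
        }
        try {
            int codePoint = m.group(1) != null
                    ? Integer.parseInt(m.group(1))
                    : Integer.parseInt(m.group(2), 16);
            if (!Character.isValidCodePoint(codePoint)) {
                return null;
            }
            return new String(Character.toChars(codePoint));
        } catch (NumberFormatException e) {
            return null; // value does not fit in an int
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("&lt;b&gt; &amp; &#65;&#x42;"));
    }
}
```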
All priority fixes P1-P7 now marked as DONE:
- P3: Encode::Alias find_encoding wrapper
- P5: exit() END block $? handling
- P6: Regex preprocessor lookaheads/escaped pipes
- P7: HTML::Parser Java XS backend Phase 1
Updated module status: HTTP::Message PASS, Devel::Cover PASS,
HTML::Parser 190/415, Test::Needs 227/227 PASS. Added latest smoke test
results table.
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
… resolution
Previously, WriteMakefile() installed .pm files immediately during
Makefile.PL execution, before CPAN.pm could detect and install missing
dependencies. This meant modules were installed in a broken state when
deps were missing.
Now WriteMakefile() only writes MYMETA.yml (for dep detection) and a
Makefile with real cp commands. Actual file installation happens when
CPAN.pm runs 'make', after it has resolved and installed all
dependencies from MYMETA.yml.
Flow: Makefile.PL → MYMETA.yml written → CPAN detects deps → installs
deps → runs 'make' → files installed.
Also fixes IO::Socket::INET loading by replacing exists(&Errno::EINVAL)
(unsupported dynamic pattern) with eval { Errno::EINVAL() }.
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
…lone
Storable::dclone does not preserve code references: it turns them into
broken strings where ref() returns empty and calling them fails with
Undefined subroutine errors. This broke DateTime (via Specio which uses
Clone::PP to clone attribute definitions containing inline_generator
coderefs).
The new implementation handles hashes, arrays, scalar refs, and circular
references. Code refs, globs, and regexps are returned as-is (shared)
since they are immutable.
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
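The cloning strategy described above (recurse into containers, track visited objects for cycles, share everything else) looks roughly like this when expressed in Java. This is a sketch of the technique only: the real Clone::PP is Perl, `DeepCloneSketch` is a hypothetical name, and Java maps and lists stand in for Perl hashes and arrays while a `Supplier` stands in for a code ref.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch of a cycle-safe deep clone.
class DeepCloneSketch {
    static Object deepClone(Object value) {
        return deepClone(value, new IdentityHashMap<>());
    }

    // 'seen' maps each original container to its copy so cycles are preserved.
    private static Object deepClone(Object value, Map<Object, Object> seen) {
        if (seen.containsKey(value)) {
            return seen.get(value);
        }
        if (value instanceof Map) {
            Map<Object, Object> copy = new LinkedHashMap<>();
            seen.put(value, copy); // register before recursing
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                copy.put(e.getKey(), deepClone(e.getValue(), seen)); // keys are shared
            }
            return copy;
        }
        if (value instanceof List) {
            List<Object> copy = new ArrayList<>();
            seen.put(value, copy);
            for (Object item : (List<?>) value) {
                copy.add(deepClone(item, seen));
            }
            return copy;
        }
        // Anything else (callables, strings, numbers) is returned as-is.
        return value;
    }

    public static void main(String[] args) {
        Supplier<String> generator = () -> "generated";
        Map<String, Object> attribute = new LinkedHashMap<>();
        attribute.put("inline_generator", generator);
        Map<?, ?> copy = (Map<?, ?>) deepClone(attribute);
        // The callable survives cloning and is still invocable.
        System.out.println(((Supplier<?>) copy.get("inline_generator")).get());
    }
}
```

Registering the copy in `seen` before descending into its children is what keeps a self-referencing structure from recursing forever.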
Two fixes in RegexPreprocessor.handleRegex():
1. When a quantifier follows a non-quantifiable item and the previous
character in the output buffer is itself a quantifier (*, +, ?, }),
emit "Nested quantifiers" instead of "Quantifier follows nothing".
The lastWasQuantifiable check was short-circuiting before the nested
quantifier detection could run, producing the wrong error message
for patterns like a**, .{1}??, .{1}?+.
2. When stripping \G from the beginning of a group, also strip any
following quantifier (?, *, +). Since \G is removed for Java regex
compilation, its quantifier would be left dangling. For example,
(\G?[ac])? is now correctly preprocessed to ([ac])?.
Fixes re/regexp.t (-4 across 6 variant files) and re/reg_mesg.t (-2)
regressions introduced in fb776af.
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
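Fix 2 can be illustrated with a deliberately simplified Java sketch. It assumes `\G` appears either at the very start of the pattern or directly after a plain `(`, and it uses a string replacement instead of the character-by-character parsing that handleRegex performs; the class and method names are hypothetical, and escaped parentheses or `(?:` openers are not handled.

```java
// Hypothetical, simplified illustration of dropping \G together with its quantifier.
class GAnchorSketch {
    // Removes a \G that opens the pattern or a plain group, together with
    // one directly following quantifier (?, * or +), so nothing dangles.
    static String stripLeadingG(String perlRegex) {
        return perlRegex.replaceAll("(^|\\()\\\\G[?*+]?", "$1");
    }

    public static void main(String[] args) {
        // The example from the commit message.
        System.out.println(stripLeadingG("(\\G?[ac])?"));
    }
}
```

Stripping only `\G` would leave `(?[ac])?`, where the orphaned `?` no longer follows a quantifiable item; removing both together avoids that.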
Summary
MakeMaker deferred installation (key change)
- `make` phase: Previously `WriteMakefile()` installed .pm files immediately during `Makefile.PL`, before CPAN.pm could detect and install dependencies. Now it writes MYMETA.yml + a Makefile with real `cp` commands, and files are only installed when `make` runs, after CPAN.pm has resolved all dependencies.
HTML::Parser/HTML::Entities Java XS backend
- `HTML::Parser` (basic event-driven parsing) and `HTML::Entities` (full decode support including numeric, named, surrogate pairs)
Other fixes
- `exists(&Errno::EINVAL)` (unsupported dynamic pattern) → `eval { Errno::EINVAL() }`
- … `find_encoding()` via XSLoader deferred load
- … `$?`; require shows full error messages
- … `$VERSION`, `$Verbose`, `_sprintf562` to `@EXPORT_OK`
CPAN smoke test infrastructure
- `dev/tools/cpan_smoke_test.pl` with curated module registry, regression detection, `--compare` support
- … `--jobs N`
Smoke test results (post changes)
Test plan
- `make` passes (all unit tests)
- `Data::Compare` → `File::Find::Rule` → `Number::Compare` + `Text::Glob` all auto-resolved
- `perl -c dev/tools/cpan_smoke_test.pl`: syntax OK
Generated with Devin