Skip to content

fix(YAML::PP): stop CPAN suite crashes; add remaining-work plan#525

Merged
fglock merged 3 commits intomasterfrom
fix/yaml-pp-cpan-tests
Apr 21, 2026
Merged

fix(YAML::PP): stop CPAN suite crashes; add remaining-work plan#525
fglock merged 3 commits intomasterfrom
fix/yaml-pp-cpan-tests

Conversation

@fglock
Copy link
Copy Markdown
Owner

@fglock fglock commented Apr 21, 2026

Summary

Investigation of ./jcpan -t YAML::PP uncovered several crashes that were truncating the CPAN test suite via SIGTERM. This PR fixes the crashes and documents the remaining failures.

Fixes (YAMLPP.java)

  • Catch NumberFormatException and YamlEngineException. snakeyaml-engine wraps NFEs from its Int/Float resolvers inside a YamlEngineException whose message is literally "java.lang.NumberFormatException: For input string: ...". Translate these to a clean Perl-level die "YAML::PP: invalid numeric value: ...". Previously these killed whole test files (e.g. !!int .inf, !!int 1_000).
  • !!set tag: Java Set<?> is now converted to a Perl hashref with undef values (matches YAML::PP convention). Previously fell through to toString()"[a, b, c]".
  • !!binary tag: byte[] is now base64-encoded. Previously returned the JVM identity hash ("[B@494c8f29").
  • Cyclic-reference message: now says "Found cyclic reference in YAML structure" so t/32.cyclic-refs.t's regex (?i:found cyclic ref) matches.
  • schema option parsing: secondary list entries are now read. key=value options are validated (empty=null|str only); invalid ones raise "Invalid option: ...". Non-key=value entries (e.g. 'Perl') are accepted to stay compatible with the existing unit test.

Results

Metric Before After
Suite behavior SIGTERM'd partway through Completes in ~76s
Programs failed 32/44 30/44
Subtests failed 73/2275 (counted before timeout) 149/2581
t/32.cyclic-refs.t 4/7 1/7
t/36.debug.t, t/37.schema-catchall.t FAIL PASS

The absolute failure count rose because the whole suite now runs; the pre-fix total only counted failures seen before the crash.

Plan for Remaining Work

Added dev/modules/yaml_pp.md with a per-file status table and groups the outstanding failures by infrastructure change needed:

  • P1 — snakeyaml-engine parser strictness (59 subtests in t/12.load-json.t)
  • B1/B2/B3 — YAML 1.1 boolean handling (~80 subtests across 14/22/31)
  • F1 — load_file/DumpFile on open filehandles
  • B4 & dump-side tag emission — binary/boolean/null/anchor (22/23/34/45)
  • A1/R1/W1/SP1/SP2/CT1 — pure-Perl API surface (Loader/Parser/Emitter/Representer/custom schemas)
  • C1 — dump-side cyclic detection
  • K1 — complex mapping keys (array/hash refs)
  • DI1/PR1/CL1 — directive/style preservation round-trip
  • I1/M1 — Merge and Include schemas
  • DK1/G1/E1/E2/T1/H1 — smaller issues (duplicate_keys, typeglobs, escape handling, !!null, tokenizer, header/footer)

Test plan

  • make (all unit tests pass, including unit/yaml_pp.t)
  • ./jcpan -t YAML::PP runs to completion instead of being killed
  • t/32.cyclic-refs.t regression fixed (4 → 1 failure)
  • New files that now pass: t/36.debug.t, t/37.schema-catchall.t

Generated with Devin

fglock and others added 3 commits April 21, 2026 14:44
…align cyclic-ref message

Investigated failures in `./jcpan -t YAML::PP` (YAML-PP v0.39.0).
Previously the test suite hit SIGTERM partway through because several
load_string() calls threw uncaught Java exceptions and Dubious-failed
entire test programs.

Changes to YAMLPP.java:

* Catch NumberFormatException and YamlEngineException around
  load.loadAllFromString(). SnakeYAML Engine wraps NFEs from its
  Int/Float resolvers in a YamlEngineException whose message is
  "java.lang.NumberFormatException: For input string: ...". Translate
  these to a clean "YAML::PP: invalid numeric value: ..." die so tests
  like `!!int .inf`, `!!int 1_000`, `!!float x` continue instead of
  aborting the whole test file.

* Handle Set<?> results from !!set tags: convert to a Perl hashref with
  undef values (previously fell through to default and stringified to
  "[a, b, c]").

* Handle byte[] results from !!binary tags: base64-encode instead of
  returning the JVM identity hash of the byte array.

* Change cyclic-reference message to "Found cyclic reference in YAML
  structure" so it matches the `(?i:found cyclic ref)` regex used by
  t/32.cyclic-refs.t.

* Parse `schema => ['Name', 'key=value', ...]` options. Only validate
  known key=value pairs (currently `empty=null|str`); non-key=value
  entries (e.g. secondary schema names like `'Perl'`) are accepted for
  backward compatibility with the existing unit test.

Net effect on the CPAN t/ suite:
* Whole suite now runs to completion (previously SIGTERM'd).
* 2 more test files pass (t/36.debug.t, t/37.schema-catchall.t).
* t/32.cyclic-refs.t: 4 failures -> 1.
* t/31.schema.t no longer crashes on `!!int .inf`; now reports real
  subtest failures (bool/YAML1.1 not implemented) instead of dying.

Remaining known limitations (not addressed here): YAML 1.1 boolean
tags (!!bool yes/on/TRUE/...), !!omap ordering, load_file() of open
filehandles, and many of snakeyaml-engine's parser limitations vs
YAML::PP's Perl parser (tabs, %TAG directives, flow-mapping colon
edge cases, etc.).

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Records current CPAN test baseline, what was fixed in the preceding
commit, and groups remaining failures by the infrastructure change
they need (parser strictness, YAML 1.1 schema, dump-side tagging,
Perl-side API surface, etc.).

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
… msg

Follow-up to the previous YAML::PP fix. Picks up the low-hanging
remaining CPAN-test failures:

* load_file/DumpFile now accept an already-open Perl filehandle
  (GLOB or IO::Handle-ish object). Implemented in the Perl wrapper
  src/main/perl/lib/YAML/PP.pm so we can reuse existing Perl-side
  filehandle machinery. DumpFile(path) also opens the file itself so
  it can produce the "Could not open ..." error string that the
  CPAN tests expect.

* load_string in scalar context now returns the first document
  (matches CPAN YAML::PP's contract); list context still yields all
  documents. load_file gets the same scalar/list handling.

* convertYamlToRuntimeScalar splits its identity map into an
  "in-progress" set (real cycle detection) and a "finished" cache
  (DAG memoization). Shared references created by anchor redefinition
  + alias (e.g. &NODEA ... &NODEA ... *NODEA) are no longer
  misidentified as cyclic.

* SnakeYAML's "found duplicate key X" is rewritten to also contain
  "Duplicate key 'X'" so t/57.dup-keys.t's qr{Duplicate key 'a'}
  regex matches.

* cyclic_refs => 'nonsense' now dies with "Invalid value for
  cyclic_refs: ..." instead of silently falling through to FATAL
  (fixes t/32.cyclic-refs.t last subtest).

CPAN suite result for YAML-PP v0.39.0:
* 30/44 -> 26/44 programs failed.
* Newly passing: t/19.file.t, t/30.legacy.t, t/32.cyclic-refs.t,
  t/57.dup-keys.t.
* dev/modules/yaml_pp.md updated with the new status table.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
@fglock fglock force-pushed the fix/yaml-pp-cpan-tests branch from ed2aa19 to eaa7728 Compare April 21, 2026 12:45
@fglock fglock merged commit 5ab2ed2 into master Apr 21, 2026
2 checks passed
@fglock fglock deleted the fix/yaml-pp-cpan-tests branch April 21, 2026 13:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant