|
| 1 | +--- |
| 2 | +name: fix-pat-sprintf |
| 3 | +description: Fix re/pat.t and op/sprintf2.t test regressions on fix-exiftool-cli branch |
| 4 | +argument-hint: "[test-name or specific failure]" |
| 5 | +triggers: |
| 6 | + - user |
| 7 | + - model |
| 8 | +--- |
| 9 | + |
| 10 | +# Fix pat.t and sprintf2.t Regressions |
| 11 | + |
| 12 | +You are fixing test regressions in `re/pat.t` (-17 tests) and `op/sprintf2.t` (-3 tests) on the `fix-exiftool-cli` branch of PerlOnJava. |
| 13 | + |
| 14 | +## Hard Constraints |
| 15 | + |
| 16 | +1. **No AST refactoring fallback.** The `LargeBlockRefactorer` / AST splitter must NOT be restored. This is non-negotiable. |
| 17 | +2. **Fix the interpreter.** The bytecode interpreter must achieve feature parity with the JVM compiler. Both backends must produce identical results for all Perl constructs. |
| 18 | +3. **Match the baseline exactly.** The targets are the master baseline scores, no more and no less: |
| 19 | + - `re/pat.t`: 1056/1296 |
| 20 | + - `op/sprintf2.t`: 1652/1655 |
| 21 | +4. **Do NOT modify shared runtime** (`RuntimeRegex.java`, `RegexFlags.java`, `RegexPreprocessor.java`, etc.). The runtime is shared between both backends. Fixes must be in the interpreter code. |
| 22 | + |
| 23 | +## Why the Interpreter Is Involved |
| 24 | + |
| 25 | +Large subroutines that exceed the JVM 64KB method limit fall back to the bytecode interpreter via `EmitterMethodCreator.createRuntimeCode()`. |
| 26 | + |
| 27 | +- **pat.t**: The `run_tests` subroutine (lines 38-2652, ~2614 lines) falls back to the interpreter. All 1296 tests run through it. Confirmed with `JPERL_SHOW_FALLBACK=1`. |
| 28 | +- **sprintf2.t**: Same mechanism: the large test body falls back to the interpreter. |
| 29 | + |
| 30 | +## Baseline vs Branch |
| 31 | + |
| 32 | +| Test | Master baseline (397ba45d) | Branch HEAD | Delta | |
| 33 | +|------|---------------------------|-------------|-------| |
| 34 | +| re/pat.t | 1056/1296 | 1039/1296 | -17 | |
| 35 | +| op/sprintf2.t | 1652/1655 | 1649/1655 | -3 | |
| 36 | + |
| 37 | +## Methodology |
| 38 | + |
| 39 | +For each failing test: |
| 40 | + |
| 41 | +1. **Extract** the specific Perl code from the test file |
| 42 | +2. **Compare** JVM vs interpreter output: |
| 43 | + ```bash |
| 44 | + ./jperl -E 'extracted code' # JVM backend (correct behavior) |
| 45 | + ./jperl --interpreter -E 'extracted code' # Interpreter (may differ) |
| 46 | + ``` |
| 47 | +3. **When they differ**: identify the root cause in the interpreter code (BytecodeCompiler, BytecodeInterpreter, etc.) and fix it |
| 48 | +4. **When they don't differ standalone**: the failure depends on context from earlier tests in the same large function. Investigate what prior state affects the result — look at regex state, variable scoping, match variables, pos(), etc. |
| 49 | +5. **Verify** the fix doesn't break other tests |
| 50 | + |
| 51 | +## Running the Tests |
| 52 | + |
| 53 | +```bash |
| 54 | +# Build |
| 55 | +make build |
| 56 | + |
| 57 | +# Run individual tests via test runner (sets correct ENV vars) |
| 58 | +perl dev/tools/perl_test_runner.pl perl5_t/t/re/pat.t |
| 59 | +perl dev/tools/perl_test_runner.pl perl5_t/t/op/sprintf2.t |
| 60 | + |
| 61 | +# Run manually with correct ENV |
| 62 | +cd perl5_t/t |
| 63 | +PERL_SKIP_BIG_MEM_TESTS=1 JPERL_UNIMPLEMENTED=warn JPERL_OPTS="-Xss256m" ../../jperl re/pat.t |
| 64 | +PERL_SKIP_BIG_MEM_TESTS=1 JPERL_UNIMPLEMENTED=warn ../../jperl op/sprintf2.t |
| 65 | + |
| 66 | +# Compare JVM vs interpreter for a specific construct |
| 67 | +./jperl -E 'code' |
| 68 | +./jperl --interpreter -E 'code' |
| 69 | + |
| 70 | +# Check if a test file uses interpreter fallback |
| 71 | +cd perl5_t/t && JPERL_SHOW_FALLBACK=1 ../../jperl re/pat.t 2>&1 | grep 'interpreter backend' |
| 72 | + |
| 73 | +# Get interpreter bytecodes for a construct |
| 74 | +./jperl --interpreter --disassemble -E 'code' 2>&1 |
| 75 | +``` |
| 76 | + |
| 77 | +## pat.t: Exact Regressions (18 PASS->FAIL, 1 FAIL->PASS, net -17) |
| 78 | + |
| 79 | +### Tests that went from PASS to FAIL |
| 80 | + |
| 81 | +| # | Test Description | pat.t Line | Category | |
| 82 | +|---|-----------------|------------|----------| |
| 83 | +| 1 | Stack may be bad | 508 | regex match | |
| 84 | +| 2 | $^N, @- and @+ are read-only | 845-851 | eval STRING special vars | |
| 85 | +| 3-4 | \G testing (x2) | 858, 866 | \G anchor | |
| 86 | +| 5 | \b is not special | 1089 | word boundary | |
| 87 | +| 6-8 | \s, [[:space:]] and [[:blank:]] (x3) | 1223-1225 | POSIX classes | |
| 88 | +| 9 | got a latin string - rt75680 | 1252 | latin/unicode | |
| 89 | +| 10-11 | RT #3516 A, B | 1329, 1335 | \G loop | |
| 90 | +| 12 | Qr3 bare | ~1490 | qr// overload | |
| 91 | +| 13 | Qr3 bare - with use re eval | ~1498 | qr// eval | |
| 92 | +| 14 | Eval-group not allowed at runtime | 524 | regex eval | |
| 93 | +| 15-18 | Branch reset pattern 1-4 | 2392-2409 | branch reset | |
| 94 | + |
| 95 | +### Test that went from FAIL to PASS |
| 96 | + |
| 97 | +| Test Description | Category | |
| 98 | +|-----------------|----------| |
| 99 | +| 1 '', '1', '12' (Eval-group) | regex eval | |
| 100 | + |
| 101 | +## Interpreter Architecture |
| 102 | + |
| 103 | +``` |
| 104 | +Source -> Lexer -> Parser -> AST --+--> JVM Compiler (EmitterMethodCreator) -> JVM bytecode |
| 105 | + \--> BytecodeCompiler -> InterpretedCode -> BytecodeInterpreter |
| 106 | +``` |
| 107 | + |
| 108 | +Both backends share the same runtime (RuntimeRegex, RuntimeScalar, etc.). The difference is ONLY in how the AST is lowered to executable form. The interpreter must handle every construct identically to the JVM compiler. |
| 109 | + |
| 110 | +### Key interpreter files |
| 111 | + |
| 112 | +| File | Role | |
| 113 | +|------|------| |
| 114 | +| `backend/bytecode/BytecodeCompiler.java` | AST -> interpreter bytecodes | |
| 115 | +| `backend/bytecode/BytecodeInterpreter.java` | Main dispatch loop | |
| 116 | +| `backend/bytecode/InterpretedCode.java` | Code object + disassembler | |
| 117 | +| `backend/bytecode/Opcodes.java` | Opcode constants | |
| 118 | +| `backend/bytecode/CompileAssignment.java` | Assignment compilation | |
| 119 | +| `backend/bytecode/CompileBinaryOperator.java` | Binary ops compilation | |
| 120 | +| `backend/bytecode/CompileOperator.java` | Unary/misc ops compilation | |
| 121 | +| `backend/bytecode/SlowOpcodeHandler.java` | Rarely-used op handlers | |
| 122 | +| `backend/bytecode/OpcodeHandlerExtended.java` | CREATE_CLOSURE, STORE_GLOB, etc. | |
| 123 | +| `backend/bytecode/MiscOpcodeHandler.java` | Misc operations | |
| 124 | +| `backend/bytecode/EvalStringHandler.java` | eval STRING compilation for interpreter | |
| 125 | + |
| 126 | +All paths relative to `src/main/java/org/perlonjava/`. |
| 127 | + |
| 128 | +### Key source files (do NOT modify) |
| 129 | + |
| 130 | +| Area | File | Notes | |
| 131 | +|------|------|-------| |
| 132 | +| Regex runtime | `runtime/regex/RuntimeRegex.java` | DO NOT MODIFY | |
| 133 | +| Regex flags | `runtime/regex/RegexFlags.java` | DO NOT MODIFY | |
| 134 | +| Regex preprocessor | `runtime/regex/RegexPreprocessor.java` | DO NOT MODIFY | |
| 135 | + |
| 136 | +All paths relative to `src/main/java/org/perlonjava/`. |
| 137 | + |
| 138 | +## Verification Steps |
| 139 | + |
| 140 | +After any fix: |
| 141 | + |
| 142 | +```bash |
| 143 | +# 1. Build must pass |
| 144 | +make build |
| 145 | + |
| 146 | +# 2. Unit tests must pass |
| 147 | +make test-unit |
| 148 | + |
| 149 | +# 3. Check pat.t — must match baseline (1056/1296) |
| 150 | +perl dev/tools/perl_test_runner.pl perl5_t/t/re/pat.t |
| 151 | + |
| 152 | +# 4. Check sprintf2.t — must match baseline (1652/1655) |
| 153 | +perl dev/tools/perl_test_runner.pl perl5_t/t/op/sprintf2.t |
| 154 | + |
| 155 | +# 5. No regressions in other key tests |
| 156 | +perl dev/tools/perl_test_runner.pl perl5_t/t/op/pack.t |
| 157 | +perl dev/tools/perl_test_runner.pl perl5_t/t/re/pat_rt_report.t |
| 158 | +``` |
| 159 | + |
| 160 | +## Debugging Tips |
| 161 | + |
| 162 | +### Compare raw output between baseline and branch |
| 163 | +```bash |
| 164 | +# Save branch output |
| 165 | +cd perl5_t/t && PERL_SKIP_BIG_MEM_TESTS=1 JPERL_UNIMPLEMENTED=warn JPERL_OPTS="-Xss256m" ../../jperl re/pat.t > /tmp/pat_branch.txt 2>&1 |
| 166 | + |
| 167 | +# Compare by test name against a baseline saved the same way on master (/tmp/pat_base_raw.txt) |
| 168 | +LC_ALL=C diff \ |
| 169 | + <(LC_ALL=C grep -E '^(ok|not ok)' /tmp/pat_base_raw.txt | LC_ALL=C sed 's/^ok [0-9]* - /PASS: /;s/^not ok [0-9]* - /FAIL: /' | LC_ALL=C sort) \ |
| 170 | + <(LC_ALL=C grep -E '^(ok|not ok)' /tmp/pat_branch.txt | LC_ALL=C sed 's/^ok [0-9]* - /PASS: /;s/^not ok [0-9]* - /FAIL: /' | LC_ALL=C sort) \ |
| 171 | + | grep '^[<>]' |
| 172 | +``` |
| 173 | + |
| 174 | +### Test specific construct through both backends |
| 175 | +```bash |
| 176 | +./jperl -E 'my $s="abcde"; pos $s=2; say $s =~ /^\G/ ? "match" : "no"' |
| 177 | +./jperl --interpreter -E 'my $s="abcde"; pos $s=2; say $s =~ /^\G/ ? "match" : "no"' |
| 178 | +``` |
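
The Methodology and Debugging Tips sections of the file above describe two repeated manual steps: running a snippet through both backends, and diffing TAP results by test name. A minimal sketch of how those steps could be wrapped as shell functions follows. The function names (`compare_backends`, `tap_normalize`, `tap_regressions`) and the `JPERL` override variable are hypothetical and are not part of PerlOnJava; only the `-E` and `--interpreter` flags and the grep/sed pipeline come from the skill file itself.

```bash
#!/bin/sh
# Sketch only: helper names and the JPERL variable are illustrative assumptions.

# Run one snippet through both backends and report whether the output matches.
# JPERL (hypothetical) overrides the launcher path; the default is ./jperl.
compare_backends() {
  jperl_bin=${JPERL:-./jperl}
  jvm_out=$("$jperl_bin" -E "$1" 2>&1)
  int_out=$("$jperl_bin" --interpreter -E "$1" 2>&1)
  if [ "$jvm_out" = "$int_out" ]; then
    printf 'SAME: %s\n' "$jvm_out"
  else
    printf 'DIFF\n  jvm:         %s\n  interpreter: %s\n' "$jvm_out" "$int_out"
    return 1
  fi
}

# Reduce TAP output to sorted "PASS: name" / "FAIL: name" lines,
# using the same grep/sed pipeline as the debugging tip in the skill file.
tap_normalize() {
  LC_ALL=C grep -E '^(ok|not ok)' "$1" \
    | LC_ALL=C sed 's/^ok [0-9]* - /PASS: /;s/^not ok [0-9]* - /FAIL: /' \
    | LC_ALL=C sort
}

# Print status lines that differ: "<" is baseline-only, ">" is branch-only.
tap_regressions() {
  base_tmp=$(mktemp) || return 2
  branch_tmp=$(mktemp) || return 2
  tap_normalize "$1" > "$base_tmp"
  tap_normalize "$2" > "$branch_tmp"
  # diff exits nonzero when the files differ, so do not treat that as an error.
  LC_ALL=C diff "$base_tmp" "$branch_tmp" | grep '^[<>]' || true
  rm -f "$base_tmp" "$branch_tmp"
}
```

With these loaded, `compare_backends 'my $s="abcde"; pos $s=2; say $s =~ /^\G/ ? "match" : "no"'` would cover step 2 of the methodology, and `tap_regressions /tmp/pat_base_raw.txt /tmp/pat_branch.txt` would list the tests whose status changed. A nonzero return from `compare_backends` points at step 3 (fix the interpreter); a `SAME` result on a failing test points at step 4 (context-dependent state).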