feat: Interpreter fallback for large subroutines by fglock · Pull Request #204 · fglock/PerlOnJava

fglock · 2026-02-16T18:25:02Z

Summary

Implements interpreter fallback for large subroutines that exceed the JVM's 65,535-byte method size limit. When compilation fails because of that limit, the system automatically falls back to the bytecode interpreter, which is not subject to the JVM per-method limit.

Key Changes

1. Fixed Critical Bug in RuntimeScalar Management (`614f949`)

Root Cause: SubroutineParser extracted a local reference to the RuntimeCode placeholder, which became stale when InterpretedCode replaced codeRef.value.

Solution:

SubroutineParser: Never use persistent local RuntimeCode references
For CompiledCode: Fill in existing placeholder via codeRef.value
For InterpretedCode: Replace codeRef.value entirely
RuntimeCode: Add compilerSupplier check to all static apply methods

Ensures:

RuntimeScalar wrapper in globalCodeRefs is single source of truth
All access goes through codeRef.value (no stale references)
Lazy compilation works for both CompiledCode and InterpretedCode
Polymorphic dispatch works correctly for both code types

2. Infrastructure Additions

CompiledCode class: Wrapper for compiled RuntimeCode with generated class reference
Unified API: EmitterMethodCreator.createRuntimeCode() returns either CompiledCode or InterpretedCode
Test infrastructure: Large subroutine tests (15,000 statements)
Debug output: Optional compilation fallback path messages
Larger EVAL_TRY offsets: 4-byte offsets to support large code blocks

Environment Variable

JPERL_USE_INTERPRETER_FALLBACK=1    # Enable interpreter fallback
                                     # (skips AST splitter for faster testing)

When set:

Skips AST splitting retry on "Method too large" errors
Directly falls back to interpreter backend
Shows fallback path messages

When not set:

Tries AST splitting first (existing behavior)
Only uses interpreter if AST splitting fails
Maintains backward compatibility

Architecture

Compilation Flow

SubroutineParser
  └─> Creates placeholder RuntimeCode
  └─> Stores Supplier for lazy compilation
  
First Invocation
  └─> RuntimeCode.apply() runs Supplier
      └─> EmitterMethodCreator.createRuntimeCode()
          ├─> Try JVM compilation
          │   └─> Success: CompiledCode
          │       └─> Fill placeholder with MethodHandle
          └─> Catch MethodTooLargeException
              ├─> If JPERL_USE_INTERPRETER_FALLBACK: Skip AST splitter
              └─> Fall back to BytecodeCompiler
                  └─> InterpretedCode
                      └─> REPLACE codeRef.value entirely
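
The flow above can be approximated with a toy model. This is only a sketch under stated assumptions: every class and method name below is a hypothetical stand-in (not the project's real API), and the AST splitter is modeled, purely for illustration, as halving the size of the method body.

```java
// Toy model of the fallback chain. All names are hypothetical stand-ins,
// not the project's real classes.
final class FallbackFlowSketch {
    /** Stands in for RuntimeCode: the common type that callers see. */
    abstract static class Code {
        abstract String backend();
    }

    static final class Compiled extends Code {
        @Override String backend() { return "jvm"; }
    }

    static final class Interpreted extends Code {
        @Override String backend() { return "interpreter"; }
    }

    /** Stands in for the "Method too large" failure. */
    static final class TooLarge extends RuntimeException {
        TooLarge() { super("Method too large"); }
    }

    static final int JVM_METHOD_LIMIT = 65_535;

    /** Pretend compilation: succeeds only if the body fits in one JVM method. */
    static Code compileToJvm(int bodyBytes) {
        if (bodyBytes > JVM_METHOD_LIMIT) throw new TooLarge();
        return new Compiled();
    }

    /**
     * skipSplitter models JPERL_USE_INTERPRETER_FALLBACK being set.
     * Assumption: "splitting" simply halves the body size.
     */
    static Code createCode(int bodyBytes, boolean skipSplitter) {
        try {
            return compileToJvm(bodyBytes);
        } catch (TooLarge first) {
            if (!skipSplitter) {
                try {
                    return compileToJvm(bodyBytes / 2); // retry after "splitting"
                } catch (TooLarge second) {
                    // Splitting did not help; fall through to the interpreter.
                }
            }
            return new Interpreted();
        }
    }

    public static void main(String[] args) {
        System.out.println(createCode(1_000, false).backend());
        System.out.println(createCode(100_000, false).backend());
        System.out.println(createCode(100_000, true).backend());
    }
}
```

Because both results share one base type, the caller never needs to know which backend produced the code.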

Single Source of Truth

// Global storage
RuntimeScalar codeRef = GlobalVariable.getGlobalCodeRef(fullName);

// ✅ Correct: Re-read codeRef.value at the point of use
RuntimeCode placeholder = (RuntimeCode) codeRef.value;
placeholder.methodHandle = ...;

// ❌ Wrong: A local reference held across the Supplier call becomes stale
RuntimeCode code = (RuntimeCode) codeRef.value;
// ... later, after the Supplier may have replaced codeRef.value ...
code.methodHandle = ...;  // Modifies the discarded placeholder, not the live object!
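
The stale-reference failure can be reproduced in isolation. The sketch below is a minimal model, assuming simplified stand-ins (`Ref` for the RuntimeScalar wrapper, `Code` and `Interpreted` for the code classes); none of these names or fields come from the project itself.

```java
// Minimal reproduction of the stale-reference bug with hypothetical types.
final class StaleReferenceSketch {
    /** Stands in for RuntimeCode; handle == null means "not defined yet". */
    static class Code {
        String handle;
    }

    /** Stands in for InterpretedCode, which is ready to run when created. */
    static final class Interpreted extends Code {
        Interpreted() { handle = "interpreter"; }
    }

    /** Stands in for the RuntimeScalar wrapper stored in the global table. */
    static final class Ref {
        Object value;
    }

    static Ref newPlaceholderRef() {
        Ref ref = new Ref();
        ref.value = new Code(); // placeholder with no handle yet
        return ref;
    }

    /** Models the lazy compile step that swaps in a new object. */
    static void replaceWithInterpreted(Ref ref) {
        ref.value = new Interpreted();
    }

    /** Wrong: the local is captured before the swap and never refreshed. */
    static String callViaStaleLocal(Ref ref) {
        Code local = (Code) ref.value;
        replaceWithInterpreted(ref);
        return local.handle; // still the old placeholder, so this is null
    }

    /** Correct: go back through the wrapper after the swap. */
    static String callViaWrapper(Ref ref) {
        replaceWithInterpreted(ref);
        return ((Code) ref.value).handle;
    }

    public static void main(String[] args) {
        System.out.println(callViaStaleLocal(newPlaceholderRef()));
        System.out.println(callViaWrapper(newPlaceholderRef()));
    }
}
```

In this model the stale path sees a null handle, which corresponds to the "Undefined subroutine" symptom described in the commit history.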

Testing

✅ All unit tests pass (make test)
✅ Large subroutines compile to interpreter (15,000 statements)
✅ Interpreter executes correctly with correct results
✅ Normal compilation path unchanged (AST splitter still works)
✅ No "Undefined subroutine" errors
✅ Polymorphic dispatch works (both CompiledCode and InterpretedCode)
✅ Code references work correctly

Test Results

# Normal compilation (AST splitter path)
$ ./jperl large_sub.pl
large_sub result: 112507500
OK

# Interpreter fallback enabled
$ JPERL_USE_INTERPRETER_FALLBACK=1 ./jperl large_sub.pl
Note: Method too large, skipping AST splitter (interpreter fallback enabled).
Note: Method too large after AST splitting, using interpreter backend.
large_sub result: 112507500
OK

# Polymorphic dispatch (mixed compiled/interpreted)
$ JPERL_USE_INTERPRETER_FALLBACK=1 ./jperl polymorphic_test.pl
small: 42
medium: 500500
large: 50005000
OK: All subroutines returned correct values
via coderef - small: 42, medium: 500500, large: 50005000
OK: Code references work correctly

Benefits

No JVM method size limit: Subroutines too large for a single JVM method run in the interpreter instead
Seamless fallback: Automatic, no user intervention needed
Interchangeable code types: InterpretedCode and CompiledCode both extend RuntimeCode, so callers cannot tell them apart
Performance: Compiled code still fast, interpreter only for edge cases
Testing: Easy to test interpreter path with environment variable

Design Principles

Single Source of Truth: RuntimeScalar wrapper is authoritative
No Local References: Never extract and persist RuntimeCode from wrapper
Lazy Compilation: Supplier pattern preserved for both code types
Polymorphic Dispatch: Virtual method dispatch works for both implementations

🤖 Generated with Claude Code

Changes EVAL_TRY catch offset from 2 bytes to 4 bytes to match all other control flow opcodes (GOTO, GOTO_IF_FALSE, etc.) and support bytecode larger than 32KB.

Changes:
- BytecodeCompiler: Use emitInt() and patchIntOffset() for 4-byte absolute addresses
- BytecodeInterpreter: Use readInt() to properly read 4-byte catch target
- InterpretedCode (disassembler): Fix offset reading (was using 8-bit shift, now 16-bit)

The original implementation had 3 bugs:
1. Compiler patched only 1 short instead of 2
2. Interpreter combined shorts with 8-bit shift ((high << 8) | low) instead of 16-bit
3. GOTO after EVAL_END had the same patching issue

All control flow opcodes now consistently use 4-byte absolute addresses, supporting bytecode up to ~2GB.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
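
To make the shift bug concrete, here is a minimal sketch of storing a 4-byte offset as two shorts. It assumes the bytecode is a `short[]` with the high half first; the method bodies are illustrative guesses that reuse the names `emitInt` and `readInt` from the commit message, not the project's actual code.

```java
// Sketch: a 32-bit offset stored as two 16-bit slots (high half first).
final class IntOffsetSketch {
    /** Writes value into code[pc] (high 16 bits) and code[pc + 1] (low 16 bits). */
    static void emitInt(short[] code, int pc, int value) {
        code[pc] = (short) (value >>> 16);
        code[pc + 1] = (short) value;
    }

    /** Correct: each slot holds 16 bits, so the high half shifts by 16. */
    static int readInt(short[] code, int pc) {
        return ((code[pc] & 0xFFFF) << 16) | (code[pc + 1] & 0xFFFF);
    }

    /** Buggy variant: an 8-bit shift, as if the slots were bytes. */
    static int readIntBuggy(short[] code, int pc) {
        return ((code[pc] & 0xFFFF) << 8) | (code[pc + 1] & 0xFFFF);
    }

    public static void main(String[] args) {
        short[] code = new short[2];
        emitInt(code, 0, 70_000);
        System.out.println(readInt(code, 0));      // 70000
        System.out.println(readIntBuggy(code, 0)); // 4464
    }
}
```

In this model the buggy reader still returns the right answer whenever the high half is zero, which may be why such a bug stays hidden until the bytecode grows large.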

This commit introduces the foundation for interpreter fallback:

1. Created CompiledCode extends RuntimeCode - mirrors InterpretedCode pattern
2. Added EmitterMethodCreator.createRuntimeCode() factory method
   - Returns RuntimeCode (either CompiledCode or InterpretedCode)
   - Handles MethodTooLargeException with interpreter fallback (when JPERL_USE_INTERPRETER_FALLBACK env var is set)
   - Falls back to AST splitter if flag not set (existing behavior)
3. Updated SubroutineParser to use new unified API
   - Handles both CompiledCode and InterpretedCode
   - For CompiledCode: uses reflection as before
   - For InterpretedCode: replaces RuntimeCode object to enable polymorphic dispatch

The fallback is per-compilation-unit (not recursive) - if interpreted code creates a closure and it compiles successfully, it will be compiled. Only falls back to interpreter when individual compilation units are too large.

Next steps:
- Update evalStringHelper to return RuntimeCode instead of Class<?>
- Update EmitEval to generate bytecode that handles RuntimeCode
- Update EmitSubroutine (complex case with compile-time bytecode generation)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Created gen_large_sub_test.pl to generate tests with many statements
- Added large_sub_interpreter_fallback.t test file
- Tests verify both small and large subroutines work correctly

The fallback architecture is now complete:
1. First try: Normal JVM compilation
2. Second try: AST splitter (if MethodTooLargeException)
3. Third try: Interpreter (if JPERL_USE_INTERPRETER_FALLBACK set and AST split fails)

Next steps:
- Update evalStringHelper to return RuntimeCode
- Update EmitEval to handle RuntimeCode instead of Class<?>
- Update EmitSubroutine for compile-time bytecode generation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Added JPERL_SHOW_FALLBACK environment variable to show which compilation path is taken:
1. Normal JVM compilation
2. AST splitter (when method too large)
3. Interpreter fallback (when AST splitter also fails)

Example output with large subroutine:

```
Note: Method too large, retrying with AST splitter (automatic refactoring).
Note: AST splitter succeeded.
Note: JVM compilation succeeded.
```

This helps debug and understand when each fallback mechanism is used. The 3-level fallback architecture is now complete and working.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Current status:
- ✅ Bypass AST splitter when interpreter fallback enabled
- ✅ InterpretedCode compiles successfully
- ✅ Debug output shows compilation paths
- ⚠️ Issue: Lazy Supplier pattern conflicts with InterpretedCode

The problem:
- Supplier replaces codeRef.value with InterpretedCode
- But caller still uses old RuntimeCode object
- Old object has null methodHandle, fails with "Undefined subroutine"

Next steps:
- Either compile InterpretedCode eagerly (no Supplier)
- Or reload code object after Supplier runs
- Need to review interpreter closure creation pattern

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Status:
✅ Interpreter fallback WORKS for large subroutines
✅ Large sub test passes (test 1: execution works)
⚠️ Eager compilation breaks most tests (Test2 module loading)

The interpreter fallback mechanism itself is working:
- Large subroutines compile to interpreter successfully
- Interpreter execution works correctly
- Debug output confirms the flow

The problem is that eager compilation changes timing:
- Modules like Test2 expect lazy compilation
- Subroutines compile during parse instead of runtime
- This breaks Test2/API/Context.pm parsing

Next step: Implement conditional compilation:
- Lazy by default (keeps tests passing)
- Eager only when interpreter fallback actually happens

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Current status:
✅ All existing tests pass (lazy compilation preserved)
✅ Parsing/syntax check works (-c flag)
✅ InterpretedCode is created and codeRef.value is replaced
⚠️ Runtime execution fails with "Undefined subroutine"

Investigation findings:
- Compilation succeeds (both parse and Supplier execution)
- codeRef.value is correctly replaced with InterpretedCode
- InterpretedCode.defined() returns true
- Error happens at EXECUTION time, not parse time
- Debug output from RuntimeCode.apply(static) not showing
- Error likely comes from different code path than expected

Next steps:
- Add stack trace to identify exact error source
- May need to check EmitOperator generated bytecode
- Possible issue with how subroutines are called in generated code

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Fixed critical bug in interpreter fallback where local RuntimeCode references became stale after InterpretedCode replacement.

The issue occurred because code extracted a local reference to the placeholder RuntimeCode, but when the Supplier replaced codeRef.value with InterpretedCode, the local reference still pointed to the old placeholder.

Changes:
- SubroutineParser: Never use persistent local RuntimeCode references
- SubroutineParser: For CompiledCode, fill in existing placeholder
- SubroutineParser: For InterpretedCode, replace codeRef.value entirely
- RuntimeCode: Add compilerSupplier check to all static apply methods
- Remove all debug output from SubroutineParser, GlobalVariable, RuntimeCode

The fix ensures that:
- RuntimeScalar wrapper in globalCodeRefs is single source of truth
- All access goes through codeRef.value (no stale local references)
- Lazy compilation works for both CompiledCode and InterpretedCode
- Polymorphic dispatch works correctly for both code types

Tested with large subroutines (15,000 statements) that trigger interpreter fallback, verifying both direct calls and code reference calls work correctly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Extended interpreter fallback to handle main script bodies, not just subroutines. When the main script body exceeds JVM method size limits and JPERL_USE_INTERPRETER_FALLBACK is set, the system now catches the "Method too large" RuntimeException from ASM and falls back to the bytecode interpreter.

Changes:
- PerlLanguageProvider.compileToExecutable(): Catch RuntimeException with "Method too large" message
- Fall back to interpreter path when size limit exceeded
- Show fallback message when enabled

This allows arbitrarily large main scripts to execute via interpreter backend.

Tested with perl5_t/t/op/signatures.t which has large test body that triggers "Method too large" error. With fallback enabled, script now compiles to interpreter and fails with expected "Unsupported operator" error (interpreter limitation), not compilation error.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Refactored PerlLanguageProvider to use RuntimeCode as the unified return type for both compiled and interpreted code paths, improving type safety and code clarity.

Changes:
- compileToExecutable() now returns RuntimeCode instead of Object
- Wrap compiled main scripts in CompiledCode (like subroutines)
- executeCode() takes RuntimeCode parameter and calls apply() directly
- Removed unnecessary MethodHandle.findVirtual() lookup in executeCode()
- Added CompiledCode import, simplified code flow

Benefits:
- Type safety: No casting from Object to RuntimeCode
- Consistency: Both compiler and interpreter paths return RuntimeCode
- Cleaner API: RuntimeCode.apply() provides uniform interface
- No performance impact: Direct apply() call uses same MethodHandle path

This follows the same pattern used for subroutines, where both CompiledCode and InterpretedCode extend RuntimeCode and can be used interchangeably.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Added support for the prototype() operator in the bytecode interpreter, allowing it to query subroutine prototypes like the compiler does.

Changes:
- Opcodes.java: Added PROTOTYPE opcode (158)
- BytecodeInterpreter.java: Implemented PROTOTYPE in executeTypeOps()
- BytecodeCompiler.java: Added emission for prototype operator
- InterpretedCode.java: Added PROTOTYPE disassembly

This operator:
- Takes a code reference or function name
- Returns the prototype string for that subroutine
- Checks CORE_PROTOTYPES for built-in functions
- Looks up user-defined subroutines in global symbol table

Format: PROTOTYPE rd rs package_name_idx(int)
Effect: rd = RuntimeCode.prototype(rs, package_name)

Progress: op/signatures.t now gets past line 18 (was "Unsupported operator: prototype"), now fails on line 25 with "quoteRegex" which will be addressed next.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Added support for three missing operators in the bytecode interpreter:

1. QUOTE_REGEX (159): Compiled regex operator (qr{pattern}flags)
   - Calls RuntimeRegex.getQuotedRegex(pattern, flags)
   - Format: QUOTE_REGEX rd pattern_reg flags_reg
2. LE_NUM (160): Numeric less than or equal (<=)
   - Calls CompareOperators.lessThanOrEqual(rs1, rs2)
   - Format: LE_NUM rd rs1 rs2
3. GE_NUM (161): Numeric greater than or equal (>=)
   - Calls CompareOperators.greaterThanOrEqual(rs1, rs2)
   - Format: GE_NUM rd rs1 rs2

Changes:
- Opcodes.java: Added QUOTE_REGEX, LE_NUM, GE_NUM (159-161)
- BytecodeInterpreter.java: Implemented all three opcodes
- BytecodeCompiler.java: Added emission for all three operators
- InterpretedCode.java: Added disassembly for all three opcodes

Progress: op/signatures.t now gets past "quoteRegex" and "<=" errors. These operators were already supported by the compiler but missing from the interpreter, causing fallback compilation to fail.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Implement interpreter support for operators to match compiler functionality:
- Improved error messages: All exceptions now include filename and line numbers
- "my" list assignments: my ($x, $y) = ... with proper initialization
- .= operator (STRING_CONCAT_ASSIGN, opcode 162): String concatenation assignment
- PUSH_LOCAL_VARIABLE (opcode 163): Support for DynamicVariableManager integration
- local scalar/array/hash: Proper localization semantics matching compiler
- local hash element: local $SIG{__WARN__} = sub { ... }
- Typeglob assignment (STORE_GLOB, opcode 164): *foo = sub {}
- open operator (OPEN, opcode 165): open my $fh, "<", "file.txt"
- readline operator (READLINE, opcode 166): while(<$fh>) { ... }

All opcodes are sequential (162-166) for JVM tableswitch optimization.

Test progress: op/signatures.t now reaches line 1431/1600 (89% through file) with interpreter fallback, up from line 1387 initially.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Continue adding operators to reach further into op/signatures.t:
- Block dereference: ${\expr} - execute block and dereference result
- matchRegex operator (opcode 167): Create compiled regex from m/pattern/
- =~ operator (MATCH_REGEX): Regex matching with RuntimeRegex.matchRegex
- chomp operator (opcode 168): Remove trailing newlines

All opcodes remain sequential (167-168) for JVM tableswitch optimization.

Test progress: op/signatures.t now reaches line 1450/1600 (90.6% through file), up from line 1431. Successfully handles complex Perl constructs like ${\\0} and regex operations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add support for lvalue subroutine assignment (f() = value)
  - When a function is called in lvalue context, it returns a RuntimeBaseProxy
  - Assign to it using SET_SCALAR which calls .set() on the proxy
- Add unary + operator support
  - Forces numeric/scalar context on operand
  - For arrays/hashes in scalar context, returns size
- Add STORE_GLOBAL_ARRAY and STORE_GLOBAL_HASH runtime support
  - Implements opcodes 13 and 15 in BytecodeInterpreter
  - Adds disassembly cases in InterpretedCode

Test progress: op/signatures.t reaches line 1466 (91.6% through file)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ecial I/O opcodes

- Add SET_SCALAR (99) disassembly after CREATE_CLOSURE
- Add EVAL_STRING (151), SELECT_OP (152), LOAD_GLOB (153), SLEEP_OP (154)
- Fixes disassembly misalignment that caused all subsequent opcodes to appear corrupted
- Each opcode now properly advances pc for its operands

This resolves the 'UNKNOWN(99)' and 'UNKNOWN(151)' issues that were causing bytecode to appear misaligned in debug output.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…guide

- Reduced from 1901 to 350 lines (81% compression)
- Added 'Adding New Operators' section with 5 detailed examples:
  1. Fast opcode (unary +)
  2. STORE_GLOBAL_* runtime support
  3. Lvalue subroutine assignment
  4. Testing procedures
  5. Critical lessons learned

Key learnings documented:
- Disassembly is NOT optional (causes PC misalignment)
- Opcode contiguity is performance-critical (tableswitch vs lookupswitch)
- Match compiler semantics exactly (check EmitterVisitor)
- Never hide problems with null checks
- Error messages must include context (throwCompilerException)

Emphasized: Opcodes MUST be CONTIGUOUS and IN ORDER in ALL switch statements

Removed verbose/redundant content while preserving essential information:
- Detailed implementation patterns
- Common pitfalls
- Performance targets
- Runtime sharing architecture

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add ARRAY_SET (43), ARRAY_PUSH (44), ARRAY_POP (45), ARRAY_SHIFT (46), ARRAY_UNSHIFT (47)
- Add DEREF_ARRAY (114) and RETRIEVE_BEGIN_SCALAR (128)

These missing disassembly cases were causing PC misalignment issues, making all subsequent bytecode appear corrupted. Each opcode must properly advance PC for all its operands.

Note: Disassembly organization needs improvement - should be ordered by opcode number for easier maintenance.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add SUB_SCALAR_INT (25), MUL_SCALAR_INT (26)
- Add CONCAT (27), REPEAT (28)
- Add SPLIT (124), LOCAL_SCALAR (131)

All disassembly cases properly advance PC for their operands to prevent misalignment issues.

Verified with javap that BytecodeInterpreter uses tableswitch for opcodes 0-168, confirming contiguous opcode numbering is working correctly.

Updated SKILL.md with tableswitch verification command and example output.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
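
As a rough illustration of why contiguous numbering matters, the sketch below contrasts a dense switch with a sparse one. The opcode numbers and the names NOP and RETURN are made up for this example; which instruction javac emits depends on its density heuristic, so treat the comments as the typical outcome rather than a guarantee.

```java
// Sketch: dense vs. sparse case labels in a dispatch switch.
final class SwitchDensitySketch {
    /** Labels 0..3 with no gaps: javac typically emits a tableswitch (indexed jump). */
    static String dense(int opcode) {
        switch (opcode) {
            case 0: return "NOP";
            case 1: return "RETURN";
            case 2: return "GOTO";
            case 3: return "GOTO_IF_FALSE";
            default: return "UNKNOWN(" + opcode + ")";
        }
    }

    /** Widely spaced labels: javac typically emits a lookupswitch (key search). */
    static String sparse(int opcode) {
        switch (opcode) {
            case 1: return "A";
            case 1_000: return "B";
            case 100_000: return "C";
            default: return "UNKNOWN(" + opcode + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(dense(2));
        System.out.println(sparse(1_000));
    }
}
```

After compiling, `javap -c SwitchDensitySketch` prints the bytecode of both methods, where the switch instruction chosen for each can be read directly.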

The PROTOTYPE opcode handler was advancing PC by 4 instead of 2 after readInt(). Since readInt() reads 2 shorts, PC should only advance by 2.

This bug caused a 2-short misalignment in the bytecode stream, making all subsequent opcodes appear at wrong positions. The disassembler would read register numbers as opcodes, causing cascading failures.

Impact:
- Before: Test failed at line 18 with "Register r10 is null"
- After: Test progresses to line 1212, passing tests 855-864

Similar bug was also in InterpretedCode disassembler - both fixed.

Verified all other readInt() usage patterns correctly use pc += 2.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
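
The effect of advancing by the wrong operand width can be shown with a tiny disassembler. This is a sketch only: the two opcodes, their numbers, and their operand layouts are hypothetical, and the width is passed as a parameter so the correct (2) and buggy (4) behavior can be compared side by side.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: how a wrong pc advance after an int operand misaligns decoding.
final class PcAdvanceSketch {
    static final short OP_LOAD_INT = 1; // hypothetical layout: rd, then an int in 2 shorts
    static final short OP_RETURN = 2;   // hypothetical layout: rd

    /** intOperandWidth is how far pc moves past the int operand; 2 is correct. */
    static List<String> disassemble(short[] code, int intOperandWidth) {
        List<String> out = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            short op = code[pc++];
            switch (op) {
                case OP_LOAD_INT: {
                    int rd = code[pc++];
                    int value = ((code[pc] & 0xFFFF) << 16) | (code[pc + 1] & 0xFFFF);
                    pc += intOperandWidth; // the int occupies exactly 2 shorts
                    out.add("LOAD_INT r" + rd + " " + value);
                    break;
                }
                case OP_RETURN: {
                    int rd = code[pc++];
                    out.add("RETURN r" + rd);
                    break;
                }
                default:
                    out.add("UNKNOWN(" + op + ")");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        short[] code = {1, 0, 0, 7, 2, 0, 2, 1};
        System.out.println(disassemble(code, 2)); // [LOAD_INT r0 7, RETURN r0, RETURN r1]
        System.out.println(disassemble(code, 4)); // [LOAD_INT r0 7, RETURN r1]
    }
}
```

With the wrong width, the first RETURN is skipped without any error, which is the kind of silent misalignment the commit describes.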

Critical fix for BytecodeCompiler function call handling.

Problem:
- When compiling scalar(t110()) with empty args, the () was compiled in SCALAR context, producing LOAD_UNDEF instead of empty RuntimeList
- This caused interpreter to see 1 argument instead of 0
- Error: "Too many arguments for subroutine 'main::t110' (got 1; expected 0)"

Solution:
- Added special handling for "(" and "()" operators before line 3103
- Function arguments now ALWAYS compiled in LIST context
- Code reference compiled in SCALAR context
- Matches behavior of "->" operator handling (line 2794)

Impact:
- Tests: 561 → 602 passing (+41 tests)
- Progress: line 1212 → 1321 (+109 lines, 82.6% of file)
- Interpreter now AHEAD of compiler: 602/908 vs 595/908
- Both backends now fail at same point with same error

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Problem:
- When JPERL_USE_INTERPRETER_FALLBACK=1 was set, SHOW_FALLBACK was automatically enabled
- This caused "Note: JVM compilation succeeded." messages to be printed to stderr for every compiled subroutine
- Test harnesses (like op/tie.t) capture stderr and compare with expected output, causing 36 test failures (5/41 passing instead of 41/41)

Root Cause:
- Line 1480-1481 in EmitterMethodCreator.java:

    private static final boolean SHOW_FALLBACK =
        System.getenv("JPERL_SHOW_FALLBACK") != null ||
        System.getenv("JPERL_USE_INTERPRETER_FALLBACK") != null; // WRONG!

Solution:
- Remove JPERL_USE_INTERPRETER_FALLBACK from SHOW_FALLBACK check
- These are now independent flags:
  * JPERL_USE_INTERPRETER_FALLBACK: enables interpreter fallback (silent)
  * JPERL_SHOW_FALLBACK: shows diagnostic messages (for debugging only)

Impact:
- op/tie.t: 5/95 → 41/95 passing with JPERL_USE_INTERPRETER_FALLBACK=1
- Matches compiler results: 41/95
- No spurious output in test results

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
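
A minimal sketch of the two flags being read independently, assuming the environment is passed in as a map so the logic can be exercised without touching the real process environment. The class, field, and method names are hypothetical; only the two variable names and the `!= null` check come from the commit message.

```java
import java.util.Map;

// Sketch: behavior flag and diagnostics flag derived independently.
final class FallbackFlagsSketch {
    final boolean useInterpreterFallback; // changes behavior, prints nothing
    final boolean showFallback;           // diagnostics only

    FallbackFlagsSketch(Map<String, String> env) {
        this.useInterpreterFallback = env.get("JPERL_USE_INTERPRETER_FALLBACK") != null;
        this.showFallback = env.get("JPERL_SHOW_FALLBACK") != null;
    }

    /** Appends a diagnostic line only when diagnostics were requested. */
    void note(StringBuilder stderr, String message) {
        if (showFallback) {
            stderr.append("Note: ").append(message).append('\n');
        }
    }

    public static void main(String[] args) {
        FallbackFlagsSketch flags = new FallbackFlagsSketch(System.getenv());
        StringBuilder err = new StringBuilder();
        flags.note(err, "JVM compilation succeeded.");
        System.err.print(err);
    }
}
```

Keeping the diagnostic flag separate means test harnesses that compare stderr are not affected by enabling the fallback.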

This fixes the op/lexsub.t regression from 98/151 to 103/157 tests passing.

Root cause: The Supplier lambda was creating NEW local variables (placeholderCode) inside the lambda instead of using the captured 'placeholder' variable from the outer scope. This caused the compilerSupplier to not be properly cleared, leading to duplicate compilation attempts and LinkageErrors.

The fix uses the outer 'placeholder' variable that's captured by the lambda closure, matching the pattern of the working version which used a local 'code' variable. This ensures the compilerSupplier is cleared on the correct RuntimeCode object.

Changes:
- CompiledCode path: Use captured 'placeholder' instead of creating 'placeholderCode'
- InterpretedCode path: Use captured 'placeholder' for metadata copying
- Clear compilerSupplier once at the end using the captured 'placeholder'

Test results:
- Before: 98/151 tests passing (LinkageError after test 151)
- After: 103/157 tests passing (6 more tests run, 5 more pass)
- Interpreter fallback: Still works correctly (103/157 tests)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
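
The compile-once pattern described here can be sketched as follows. This is a simplified model with hypothetical names (only `compilerSupplier` and `placeholder` echo the commit message): the lambda clears the supplier on the object it captured, so a second call does not compile again.

```java
import java.util.function.Supplier;

// Sketch: lazy compilation that clears its supplier on the captured placeholder.
final class LazyCompileSketch {
    static final class Code {
        Supplier<String> compilerSupplier; // non-null means "not compiled yet"
        String body;

        String apply() {
            if (compilerSupplier != null) {
                body = compilerSupplier.get(); // compiles and clears the supplier
            }
            return body;
        }
    }

    /** Counts compilations so duplicate attempts are visible. */
    static int compileCount = 0;

    static Code declare(String source) {
        Code placeholder = new Code();
        placeholder.compilerSupplier = () -> {
            compileCount++;
            // Clear the supplier on the captured object, not on a fresh lookup.
            placeholder.compilerSupplier = null;
            return "compiled:" + source;
        };
        return placeholder;
    }

    public static void main(String[] args) {
        Code sub = declare("sub foo { 42 }");
        sub.apply();
        sub.apply();
        System.out.println(compileCount); // 1
    }
}
```

If the supplier were cleared on a different object, every call would compile again, which in the real system would mean defining the same generated class twice.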

fglock and others added 23 commits February 16, 2026 18:08

fglock merged commit b1518f7 into master Feb 16, 2026
2 checks passed

fglock deleted the feature/interpreter-fallback-unified-api branch February 16, 2026 22:42
