diff --git a/dev/interpreter/SKILL.md b/dev/interpreter/SKILL.md index e9ee32fa1..77cd2d038 100644 --- a/dev/interpreter/SKILL.md +++ b/dev/interpreter/SKILL.md @@ -1,1900 +1,361 @@ # PerlOnJava Interpreter Developer Guide -## Table of Contents -1. [Introduction & Architecture Overview](#introduction--architecture-overview) -2. [File Organization](#file-organization) -3. [Testing & Benchmarking](#testing--benchmarking) -4. [Dispatch Architecture & Future CPU Cache Optimization](#dispatch-architecture--future-cpu-cache-optimization) -5. [Optimization Strategies](#optimization-strategies) -6. [Runtime Sharing (100% API Compatibility)](#runtime-sharing-100-api-compatibility) -7. [Development Workflow](#development-workflow) +## Quick Reference -## Introduction & Architecture Overview +**Performance:** 46.84M ops/sec (1.75x slower than compiler ✓) +**Opcodes:** 0-157 (contiguous) for JVM tableswitch optimization +**Runtime:** 100% API compatibility with compiler (zero duplication) -### What is the Interpreter? +## Core Files -The PerlOnJava interpreter is a **pure register machine** that executes Perl bytecode directly without generating JVM bytecode. It complements the existing compiler by providing: +- `Opcodes.java` - Opcode constants (0-157, contiguous) +- `BytecodeInterpreter.java` - Execution loop with range-based delegation +- `BytecodeCompiler.java` - AST to bytecode with register allocation +- `InterpretedCode.java` - Bytecode container with disassembler +- `SlowOpcodeHandler.java` - Handlers for rare operations (151-154) -- **Faster startup**: No bytecode generation or JVM class loading overhead -- **Lower memory footprint**: No class metadata or compiled method objects -- **Dynamic evaluation**: Ideal for `eval STRING`, code generation, and REPL scenarios -- **Closure support**: Captures and shares variables between compiled and interpreted code +## Adding New Operators -### Why It Exists +### 1. Decide: Fast Opcode or SLOW_OP? -While the PerlOnJava compiler generates optimized JVM bytecode, there are scenarios where interpretation is superior: +**Use Fast Opcode when:** +- Operation is used frequently (>1% of execution) +- Simple 1-3 operand format +- Performance-critical (loops, arithmetic) -1. **eval STRING**: Dynamic code evaluated once doesn't justify compilation overhead -2. **Short-lived scripts**: Scripts that run briefly and exit (startup time matters) -3. **Development/debugging**: Faster edit-run cycles during development -4. **Memory-constrained environments**: Lower memory usage than compiled code - -### Architecture: Pure Register Machine - -The interpreter uses a **register-based architecture** (not stack-based like the JVM): - -- **3-address code format**: `rd = rs1 op rs2` (destination = source1 operator source2) -- **Unlimited virtual registers**: Each subroutine can use as many registers as needed -- **Register allocation**: BytecodeCompiler assigns variables to register indices -- **No stack manipulation**: Direct register-to-register operations - -**Example bytecode:** -``` -LOAD_INT r0, 10 # r0 = 10 -LOAD_INT r1, 20 # r1 = 20 -ADD_SCALAR r2, r0, r1 # r2 = r0 + r1 -PRINT r2 # print r2 -``` - -### Performance Characteristics - -**Current Performance (as of Phase 5):** -- **46.84M ops/sec** (simple for loop benchmark) -- **1.75x slower** than compiler (within 2-5x target ✓) -- **Tableswitch optimization**: Dense opcodes (0-157) enable O(1) dispatch -- **Direct method calls**: SlowOpcodeHandler methods called directly (no SLOWOP_* indirection) -- **Superinstructions**: Eliminate intermediate MOVE operations -- **Variable sharing**: Interpreter and compiled code share lexical variables via persistent storage - -**Performance vs. Compiler:** -- Compiler: ~82M ops/sec (after JIT warmup) -- Interpreter: ~47M ops/sec (consistent, no warmup needed) -- Trade-off: Slower execution for faster startup and lower memory - -**Phase 5 Optimizations (Completed):** -- Removed SLOWOP_* ID constants (41 constants eliminated) -- Direct method calls to SlowOpcodeHandler (eliminates ID mapping overhead) -- Contiguous opcode numbering (0-157, no gaps) for optimal tableswitch -- All interpreter methods under JIT limit and being compiled successfully - -### JIT Compilation Limit & Method Size Management - -**Critical Constraint:** The JVM refuses to JIT-compile methods larger than ~8000 bytes (controlled by `-XX:DontCompileHugeMethods`). When methods exceed this limit, they run in **interpreted mode**, causing 5-10x performance degradation. - -**Architecture: Range-Based Delegation** - -To keep the main `execute()` method under the JIT limit, cold-path opcodes are delegated to secondary methods: - -1. **executeComparisons()** - Comparison and logical operators (opcodes 31-41) - - COMPARE_NUM, COMPARE_STR, EQ_NUM, NE_NUM, LT_NUM, GT_NUM, EQ_STR, NE_STR, NOT - - Size: ~1089 bytes - -2. **executeArithmetic()** - Multiply, divide, and compound assignments (opcodes 19-30, 110-113) - - MUL_SCALAR, DIV_SCALAR, MOD_SCALAR, POW_SCALAR, NEG_SCALAR, CONCAT, REPEAT, LENGTH - - SUBTRACT_ASSIGN, MULTIPLY_ASSIGN, DIVIDE_ASSIGN, MODULUS_ASSIGN - - Size: ~1057 bytes - -3. **executeCollections()** - Array and hash operations (opcodes 43-49, 51-56, 93-96) - - ARRAY_SET, ARRAY_PUSH, ARRAY_POP, HASH_SET, HASH_EXISTS, HASH_DELETE, etc. - - Size: ~1025 bytes - -4. **executeTypeOps()** - Type and reference operations (opcodes 62-70, 102-105) - - DEFINED, REF, BLESS, ISA, CREATE_LAST, CREATE_NEXT, CREATE_REDO, CREATE_REF, DEREF - - Size: ~929 bytes - -**Hot-Path Opcodes (Kept Inline):** -- Control flow: NOP, RETURN, GOTO, GOTO_IF_FALSE, GOTO_IF_TRUE -- Register ops: MOVE, LOAD_CONST, LOAD_INT, LOAD_STRING, LOAD_UNDEF -- Core arithmetic: ADD_SCALAR, SUB_SCALAR (used by loops) -- Iteration: ITERATOR_CREATE, ITERATOR_HAS_NEXT, ITERATOR_NEXT, FOREACH_NEXT_OR_EXIT -- Essential access: ARRAY_GET, HASH_GET - -**Current Sizes:** -- Main execute(): 7270 bytes (under 7500-byte safe limit ✓) -- All secondary methods: <1100 bytes each ✓ - -**Enforcement:** - -Run `dev/tools/check-bytecode-size.sh` after changes to verify all methods stay under limit: - -```bash -./dev/tools/check-bytecode-size.sh -``` - -This script checks all 5 methods (main execute + 4 secondary) and fails the build if any exceeds 7500 bytes. - -**If Methods Grow Too Large:** - -1. Move more opcodes from main execute() to secondary methods -2. Split large secondary methods into smaller groups -3. Keep hot-path opcodes (loops, basic arithmetic) inline for zero overhead -4. Delegate cold-path opcodes (rare operations) to minimize cost - -**Performance Impact:** - -- Hot-path opcodes: Zero overhead (inline in main switch) -- Cold-path opcodes: One static method call (~5-10ns overhead) -- Overall: Negligible impact since cold ops are infrequent - -## File Organization - -### Documentation (`dev/interpreter/`) - -- **STATUS.md** - Current implementation status and feature completeness -- **TESTING.md** - How to test and benchmark the interpreter -- **OPTIMIZATION_RESULTS.md** - Optimization history and performance measurements -- **BYTECODE_DOCUMENTATION.md** - Complete reference for all opcodes (0-99 + SLOW_OP) -- **CLOSURE_IMPLEMENTATION_COMPLETE.md** - Closure architecture and bidirectional calling -- **SKILL.md** (this file) - Developer guide for continuing interpreter development -- **architecture/** - Design documents and architectural decisions -- **tests/** - Interpreter-specific test files (.t and .pl format) - -### Source Code (`src/main/java/org/perlonjava/interpreter/`) - -**Core Interpreter:** -- **Opcodes.java** - Opcode constants (0-157, contiguous) organized by category -- **BytecodeInterpreter.java** - Main execution loop with range-based delegation to secondary methods - - Main execute() method: Hot-path opcodes (loops, basic arithmetic, control flow) - - executeComparisons(): Comparison and logical operators - - executeArithmetic(): Multiply, divide, compound assignments - - executeCollections(): Array and hash operations - - executeTypeOps(): Type and reference operations -- **BytecodeCompiler.java** - AST to bytecode compiler with register allocation -- **InterpretedCode.java** - Bytecode container with disassembler for debugging -- **SlowOpcodeHandler.java** - Direct method handlers for rare operations (no SLOWOP_* ID indirection) - -**Support Classes:** -- **VariableCaptureAnalyzer.java** - Analyzes which variables are captured by named subroutines -- **VariableCollectorVisitor.java** - Detects closure variables for capture analysis - -### Build Tools (`dev/tools/`) - -- **check-bytecode-size.sh** - Verifies all interpreter methods stay under JIT compilation limit (7500 bytes) - - Run after modifications to BytecodeInterpreter.java - - Automatically checks main execute() and all secondary methods - - Prevents performance regressions from method size growth - -### Opcode Categories (Opcodes.java) - -Opcodes are organized into functional categories: - -1. **Control Flow** (0-6): RETURN, LABEL, GOTO, conditionals -2. **Constants** (7-9): LOAD_INT, LOAD_STRING, LOAD_UNDEF -3. **Variables** (10-16): GET_VAR, SET_VAR, CREATE_CLOSURE_VAR, GET_LOCAL_VAR -4. **Arithmetic** (17-30): ADD, SUB, MUL, DIV, MOD, POW, NEG, etc. -5. **Comparison** (31-41): COMPARE_NUM, COMPARE_STR, EQ, NE, LT, GT, LE, GE -6. **Array Operations** (42-49): ARRAY_GET, ARRAY_SET, PUSH, POP, SHIFT, UNSHIFT -7. **Hash Operations** (50-56): HASH_GET, HASH_SET, EXISTS, DELETE, KEYS, VALUES -8. **Subroutine Calls** (57-59): CALL_SUB, CALL_METHOD, CALL_BUILTIN -9. **Control Flow Specials** (60-67): CREATE_LAST, CREATE_NEXT, CREATE_REDO, CREATE_GOTO -10. **References** (68-72): CREATE_REF, DEREF, GET_TYPE, GET_REFTYPE, BLESS -11. **Error Handling** (73-74): DIE, WARN -12. **Superinstructions** (75-82): INC_REG, DEC_REG, ADD_ASSIGN, ADD_ASSIGN_INT, etc. -13. **SLOW_OP Gateway** (87): Single gateway for 256 rare operations (system calls, sockets) -14. **Variable Aliasing** (99): SET_SCALAR (sets value without overwriting reference) - -**Implemented Opcodes:** 0-157 (dense, contiguous numbering for tableswitch optimization) - -## Variable Sharing Between Interpreter and Compiled Code - -### Overview - -**Status:** ✅ Implemented (PR #191) - -The interpreter now supports seamless variable sharing between interpreted main scripts and compiled named subroutines. Variables declared in the interpreted scope are accessible to compiled code and vice versa, maintaining proper aliasing semantics. - -### Implementation - -When a variable is captured by a named subroutine, the interpreter: - -1. **Analyzes captures** - `VariableCaptureAnalyzer` identifies which variables need persistent storage -2. **Retrieves from persistent storage** - Uses `SLOWOP_RETRIEVE_BEGIN_*` opcodes to get the persistent variable -3. **Stores reference in register** - The register contains a reference to the persistent RuntimeScalar/Array/Hash -4. **Preserves aliasing** - All operations work on the same object, so changes are visible to both interpreter and compiled code - -### Key Components - -**VariableCaptureAnalyzer.java:** -- Scans main script AST for named subroutine definitions -- Identifies which outer variables each subroutine references -- Returns set of captured variable names that need persistent storage - -**SET_SCALAR Opcode (99):** -```java -// Format: SET_SCALAR rd rs -// Effect: ((RuntimeScalar)registers[rd]).set((RuntimeScalar)registers[rs]) -// Purpose: Sets value without overwriting the reference (preserves aliasing) -``` - -**SLOWOP_RETRIEVE_BEGIN_* Opcodes:** -- `SLOWOP_RETRIEVE_BEGIN_SCALAR` (19) - Retrieves persistent scalar variable -- `SLOWOP_RETRIEVE_BEGIN_ARRAY` (20) - Retrieves persistent array variable -- `SLOWOP_RETRIEVE_BEGIN_HASH` (21) - Retrieves persistent hash variable - -### Persistent Storage Naming - -Variables use the BEGIN naming scheme: `PerlOnJava::_BEGIN_::varname` - -Example: `$width` with `ast.id = 5` becomes `PerlOnJava::_BEGIN_5::width` - -### Example - -```perl -my $width = 20; # Interpreted: stored in persistent global + register - -sub neighbors { # Compiled subroutine - return $width * 2; # Accesses same persistent global -} - -print neighbors(); # 40 -$width = 30; # Update visible to both -print neighbors(); # 60 -``` - -**Generated Bytecode:** -``` -SLOW_OP -SLOWOP_RETRIEVE_BEGIN_SCALAR r0, "width", 5 # Get persistent variable -LOAD_INT r1, 20 # Load initial value -SET_SCALAR r0, r1 # Set value (preserves ref) -``` - -### Context Detection (wantarray) - -**Status:** ✅ Implemented - -The interpreter properly detects calling context (VOID/SCALAR/LIST) for subroutine calls: - -**RuntimeContextType Values:** -- `VOID` (0) - No return value expected -- `SCALAR` (1) - Single value expected -- `LIST` (2) - List of values expected - -**Detection Strategy:** -- Based on assignment target type -- `my $x = sub()` → SCALAR context -- `my @x = sub()` → LIST context -- `sub(); other_code()` → VOID context - -**Implementation in BytecodeCompiler.java:** -```java -// Determine context from LHS type -int rhsContext = RuntimeContextType.LIST; // Default -if (node.left instanceof OperatorNode) { - OperatorNode leftOp = (OperatorNode) node.left; - if (leftOp.operator.equals("my") && leftOp.operand instanceof OperatorNode) { - OperatorNode sigilOp = (OperatorNode) leftOp.operand; - if (sigilOp.operator.equals("$")) { - rhsContext = RuntimeContextType.SCALAR; - } - } -} -``` - -## Error Reporting - -### throwCompilerException(String message, int tokenIndex) - -The BytecodeCompiler uses `throwCompilerException(String message, int tokenIndex)` to report errors with proper context: - -**Purpose:** -- Provides accurate error messages with filename and line number -- Transforms token index into source location -- Consistent error reporting across interpreter and compiler - -**Usage Example:** -```java -if (invalidCondition) { - throwCompilerException("Invalid operation: " + details, node.getIndex()); -} -``` - -**Output Format:** -``` -Error at file.pl line 42: Invalid operation: details - 42: my $x = invalid_code_here; - ^ -``` - -**Benefits:** -- Users see exact source location of errors -- Easier debugging of interpreter bytecode generation -- Consistent with compiler error reporting - -## Testing & Benchmarking - -### Running Tests - -**Fast Unit Tests (seconds):** -```bash -make test-unit # All fast unit tests -make # Build + fast tests (default) -``` - -**Comprehensive Tests (minutes):** -```bash -make test-all # All tests including Perl 5 core tests -make test-perl5 # Perl 5 core test suite -``` - -**Interpreter-Specific Tests:** -```bash -# Perl test files -./jperl dev/interpreter/tests/for_loop_test.pl -./jperl dev/interpreter/tests/closure_test.t -./jperl dev/interpreter/tests/*.t -``` - -### Running Benchmarks - -**Using Perl benchmark script:** -```bash -./jperl dev/interpreter/tests/for_loop_benchmark.pl -``` - -**Benchmark Output Example:** -``` -Compiler - 1000000 iterations: 11.72 ms (85.32M ops/sec) -Interpreter - 1000000 iterations: 21.35 ms (46.84M ops/sec) -Interpreted code is 1.82x slower than compiled code -``` - -### Performance Targets - -- **Interpreter Target**: 2-5x slower than compiler -- **Current**: 1.75x slower ✓ (within target) -- **Benchmark Loop**: 19.94M ops/sec (Phase 1 baseline) -- **After Optimizations**: 46.84M ops/sec (2.35x improvement) - -### Test Frameworks - -**Perl Tests (.t files):** -- Use Test::More framework -- TAP (Test Anything Protocol) output -- Run via `perl dev/tools/perl_test_runner.pl` - -**Perl Benchmark Scripts (.pl files):** -- Located in `dev/interpreter/tests/` -- Run with `./jperl` to compare interpreter vs. compiler performance -- Example: `./jperl dev/interpreter/tests/for_loop_benchmark.pl` - -## Dispatch Architecture & CPU Cache Optimization - -### Current Design: Main Switch + SLOW_OP Gateway - -The interpreter uses **optimized dual-dispatch architecture** for maximum performance: - -**Main Switch (Opcodes 0-87):** -```java -public static RuntimeList execute(InterpretedCode code, RuntimeArray args, int callContext) { - RuntimeBase[] registers = new RuntimeBase[code.maxRegisters]; - int pc = 0; - byte[] bytecode = code.bytecode; - - while (pc < bytecode.length) { - byte opcode = bytecode[pc++]; - - switch (opcode) { - case Opcodes.RETURN: ... - case Opcodes.GOTO: ... - case Opcodes.LOAD_INT: ... - // ... hot path opcodes (0-86) ... - case Opcodes.CREATE_LIST: ... - - case Opcodes.SLOW_OP: // Gateway to rare operations - pc = SlowOpcodeHandler.execute(bytecode, pc, registers, code); - break; - - default: throw new RuntimeException("Unknown opcode: " + opcode); - } - } -} -``` - -**Slow Operation Handler (Separate Class):** -```java -// SlowOpcodeHandler.java -public static int execute(byte[] bytecode, int pc, RuntimeBase[] registers, InterpretedCode code) { - int slowOpId = bytecode[pc++] & 0xFF; - - switch (slowOpId) { // Dense switch (0,1,2...) for tableswitch - case 0: return executeChown(...); - case 1: return executeWaitpid(...); - case 2: return executeSetsockopt(...); - // ... up to 255 slow operations - } -} -``` - -**Key Characteristics:** -- Main switch: 87 opcodes (0-87) - compact for CPU i-cache -- Dense numbering (no gaps) enables JVM tableswitch optimization (O(1)) -- SLOW_OP (87): Single gateway for 256 rare operations -- SlowOpcodeHandler: Separate class with own dense switch (0-255) -- Preserves opcodes 88-255 for future fast operations - -**Architecture Benefits:** -1. **Opcode Space Efficiency**: Uses 1 opcode for 256 slow operations -2. **CPU Cache Optimized**: Main loop stays compact (fits in i-cache) -3. **Tableswitch x2**: Both main and slow switches use dense numbering -4. **Easy Extension**: Add slow ops without consuming fast opcodes - -### Performance Characteristics - -**Main Switch (Hot Path):** -- 87 dense opcodes (0-87) -- ~10-15% speedup from tableswitch vs. lookupswitch -- Fits in CPU L1 instruction cache (32-64KB) -- No overhead for fast operations - -**SLOW_OP Gateway (Cold Path):** -- Single opcode (87) with sub-operation ID parameter -- Adds ~5ns overhead per slow operation -- Worth it for <1% execution frequency -- Keeps main loop compact (main benefit) - -**Bytecode Format:** -``` -Fast operation: -[OPCODE] [operands...] -e.g., [ADD_SCALAR] [rd] [rs1] [rs2] - -Slow operation: -[SLOW_OP] [slow_op_id] [operands...] -e.g., [87] [1] [rd] [rs_pid] [rs_flags] - ^ ^ - | |__ SLOWOP_WAITPID (1) - |_______ SLOW_OP gateway -``` - -### JVM Tableswitch Optimization - -**Tableswitch** (used for dense cases): -```java -tableswitch { // O(1) lookup via array index - 0: goto label_0 - 1: goto label_1 - 2: goto label_2 - ... -} -``` - -**Lookupswitch** (used for sparse cases): -```java -lookupswitch { // O(log n) lookup via binary search - 10: goto label_10 - 50: goto label_50 - 100: goto label_100 -} -``` - -**Critical:** To maintain tableswitch, opcodes MUST be dense (no large gaps in numbering). - -### SLOW_OP Architecture (Implemented) - -**Design:** Single gateway opcode for all rarely-used operations. - -**Rationale:** -- Consuming many opcode numbers (200-255) for rare operations wastes space -- Main interpreter switch grows large, reducing CPU i-cache efficiency -- Better: Use ONE opcode with sub-operation parameter - -**Implementation:** -```java -// Main interpreter switch -case Opcodes.SLOW_OP: // Opcode 87 - pc = SlowOpcodeHandler.execute(bytecode, pc, registers, code); - break; - -// SlowOpcodeHandler.java -public static int execute(...) { - int slowOpId = bytecode[pc++] & 0xFF; - switch (slowOpId) { // Dense: 0, 1, 2, 3, ... - case SLOWOP_CHOWN: return executeChown(...); - case SLOWOP_WAITPID: return executeWaitpid(...); - case SLOWOP_SETSOCKOPT: return executeSetsockopt(...); - // ... 19 operations defined, 236 slots remaining - } -} -``` - -**Benefits:** -- **Opcode Efficiency**: 1 opcode for 256 slow operations -- **Space Preservation**: Opcodes 88-255 available for future fast ops -- **CPU Cache**: Main loop stays compact (87 cases vs 255+) -- **Tableswitch x2**: Both switches use dense numbering -- **Easy Extension**: Add slow ops without affecting main loop - -**Implemented Slow Operations (22 defined, 233 slots remaining):** - -| ID | Name | Description | -|----|------|-------------| -| 0 | SLOWOP_CHOWN | Change file ownership | -| 1 | SLOWOP_WAITPID | Wait for process completion | -| 2 | SLOWOP_SETSOCKOPT | Set socket options | -| 3 | SLOWOP_GETSOCKOPT | Get socket options | -| 4 | SLOWOP_FCNTL | File control operations | -| 5 | SLOWOP_IOCTL | Device control operations | -| 6 | SLOWOP_FLOCK | File locking | -| 7 | SLOWOP_SEMOP | Semaphore operations | -| 8 | SLOWOP_MSGCTL | Message queue control | -| 9 | SLOWOP_SHMCTL | Shared memory control | -| 10 | SLOWOP_GETPRIORITY | Get process priority | -| 11 | SLOWOP_SETPRIORITY | Set process priority | -| 12 | SLOWOP_SYSCALL | Generic system call | -| 13 | SLOWOP_SOCKET | Create socket | -| 14 | SLOWOP_BIND | Bind socket to address | -| 15 | SLOWOP_CONNECT | Connect socket | -| 16 | SLOWOP_LISTEN | Listen for connections | -| 17 | SLOWOP_ACCEPT | Accept connection | -| 18 | SLOWOP_SHUTDOWN | Shutdown socket | -| 19 | SLOWOP_RETRIEVE_BEGIN_SCALAR | Retrieve persistent scalar variable | -| 20 | SLOWOP_RETRIEVE_BEGIN_ARRAY | Retrieve persistent array variable | -| 21 | SLOWOP_RETRIEVE_BEGIN_HASH | Retrieve persistent hash variable | - -**Usage Example:** -``` -Perl code: chown($uid, $gid, @files); -Bytecode: [SLOW_OP] [0] [operands...] # 0 = SLOWOP_CHOWN -``` - -**Performance Characteristics:** -- **Gateway overhead**: ~5ns per slow operation (method call + second switch) -- **Worth it for**: Operations used <1% of execution time -- **Main benefit**: Keeps main interpreter loop compact for CPU i-cache - -**Adding New Slow Operations:** -1. Add constant to `Opcodes.java`: `SLOWOP_FOO = 19` -2. Add case to `SlowOpcodeHandler.execute()`: `case 19: return executeFoo(...)` -3. Implement handler: `private static int executeFoo(...)` -4. Update disassembler in `InterpretedCode.java` -5. Maintain dense numbering (no gaps) - -## Optimization Strategies - -### Completed Optimizations (Phases 1-4) - -**1. Dense Opcodes (10-15% speedup)** -- Renumbered opcodes to 0-82 with no gaps -- Enables JVM tableswitch (O(1)) instead of lookupswitch (O(log n)) -- Measured via bytecode disassembly verification - -**2. Better JIT Warmup (156% speedup)** -- Increased warmup iterations from 100 to 1000 -- Allows JVM JIT compiler to optimize dispatch loop -- Before: 18.28M ops/sec → After: 46.84M ops/sec - -**3. Superinstructions (5-10% speedup)** -- Combined common patterns into single opcodes -- Eliminates intermediate MOVE operations -- Examples: - - `INC_REG` (75): `r0 = r0 + 1` (replaces ADD + MOVE) - - `DEC_REG` (76): `r0 = r0 - 1` - - `ADD_ASSIGN` (77): `r0 = r0 + r1` - - `ADD_ASSIGN_INT` (78): `r0 = r0 + immediate_int` - - `LOOP_PLUS_PLUS` (82): Combined increment + compare + branch - -**4. Variable Sharing Implementation (correctness improvement)** -- Seamless variable sharing between interpreted and compiled code -- Persistent storage using BEGIN mechanism -- Maintains proper aliasing semantics -- Enables examples/life.pl and other mixed-mode programs - -**5. Context Detection (correctness improvement)** -- Proper VOID/SCALAR/LIST context detection for subroutine calls -- Based on assignment target type -- Matches Perl semantics for wantarray - -**6. SET_SCALAR Opcode (correctness improvement)** -- Opcode 99: Sets value without overwriting reference -- Preserves variable aliasing between interpreter and compiled code -- Critical for shared variable semantics - -**7. Phase 5: SLOWOP_* Elimination and Opcode Contiguity (performance optimization)** -- Removed all 41 SLOWOP_* ID constants from Opcodes.java -- BytecodeInterpreter now calls SlowOpcodeHandler methods directly (no ID mapping) -- Eliminated opcode gaps: moved Phase 3 opcodes from 400-402 to 155-157 -- All opcodes now contiguous (0-157) for optimal JVM tableswitch performance -- Added array operators: slices, compound assignments, multidimensional access -- Added hash operators: chained access ($hash{outer}{inner}) -- Performance validated: no regression, JIT compilation working correctly - -### Future Optimization Opportunities - -#### A. Unboxed Int Registers (30-50% potential speedup) - -**Problem:** Every integer operation currently boxes/unboxes: -```java -// Current (slow): -registers[rd] = new RuntimeScalar((RuntimeScalar)registers[rs1]).getInt() + 1); - -// Proposed (fast): -intRegisters[rd] = intRegisters[rs1] + 1; // No boxing! -``` - -**Solution:** -- Maintain parallel `int[] intRegisters` array -- Track which registers contain unboxed ints -- Box only when needed (calls, returns, type coercion) -- Detect loop induction variables for unboxing - -**Implementation Steps:** -1. Add `int[] intRegisters` field to BytecodeInterpreter -2. Add `boolean[] isUnboxed` tracking array -3. Add unboxed variants: `INC_REG_UNBOXED`, `ADD_SCALAR_INT_UNBOXED` -4. BytecodeCompiler detects `my $i` loop variables and emits unboxed opcodes -5. Box on register use in non-arithmetic contexts - -#### B. Inline Caching (30-50% potential speedup) - -**Problem:** Every method call, global variable access requires lookup: -```java -// Current (slow): -RuntimeCode code = GlobalVariable.getGlobalCodeRef("main", "foo"); // Hashtable lookup! -result = code.apply(...); -``` - -**Solution:** -- Cache lookup results at call sites -- Invalidate on global state changes -- Polymorphic inline caches for method calls - -**Implementation Steps:** -1. Add `InlineCache[] caches` field to InterpretedCode -2. Cache structure: `{ String key, RuntimeCode cachedCode, int hitCount }` -3. Modify CALL_SUB to check cache before lookup -4. Invalidate caches on `sub foo { ... }` redefinition - -#### C. Additional Superinstructions (10-30% potential speedup) - -**Candidates:** -- `SUB_ASSIGN` (83): `r0 = r0 - r1` -- `MUL_ASSIGN` (84): `r0 = r0 * r1` -- `DIV_ASSIGN` (85): `r0 = r0 / r1` -- `ARRAY_GET_INT` (86): `r0 = array[int_index]` (unboxed index) -- `LOAD_CONST_INT` (87): `r0 = immediate_int` (replace LOAD_INT + constant pool) - -**Adding SUB_ASSIGN Example:** -```java -// Opcodes.java -public static final byte SUB_ASSIGN = 83; // rd = rd - rs - -// BytecodeInterpreter.java -case Opcodes.SUB_ASSIGN: { - int rd = bytecode[pc++] & 0xFF; - int rs = bytecode[pc++] & 0xFF; - registers[rd] = MathOperators.subtract( - (RuntimeScalar) registers[rd], - (RuntimeScalar) registers[rs] - ); - break; -} -``` - -#### D. Direct Field Access (10-20% potential speedup) - -**Problem:** Getter methods add overhead: -```java -// Current (slow): -int value = ((RuntimeScalar) registers[rs]).getInt(); // Method call - -// Proposed (fast): -int value = ((RuntimeScalar) registers[rs]).ivalue; // Direct field access -``` - -**Solution:** -- Access RuntimeScalar.ivalue, RuntimeScalar.svalue directly -- Check RuntimeScalar.type first to ensure correct type -- Only use for hot paths (ADD, SUB, COMPARE) - -**Trade-off:** Tight coupling to RuntimeScalar internals (breaks encapsulation). - -#### E. Separate Switch for Rare Opcodes (5-10% potential speedup) - -See [Dispatch Architecture](#dispatch-architecture--future-cpu-cache-optimization) section above. - -#### F. Specialized Loops (20-40% potential speedup) - -**Problem:** General dispatch loop has overhead for simple for-loops: -```perl -for (my $i = 0; $i < 1000000; $i++) { $sum += $i; } -``` - -**Solution:** -- Detect simple counting loops at compile time -- Generate specialized tight loop dispatcher -- Inline loop body opcodes (no switch overhead) - -**Detection Criteria:** -- Loop variable is integer (`my $i`) -- Loop condition is simple comparison (`$i < N`) -- Loop increment is `$i++` or `$i += 1` -- Loop body has <20 opcodes - -**Implementation:** -```java -// Specialized loop executor (no switch!) -for (int i = startValue; i < endValue; i++) { - // Inline loop body opcodes directly: - registers[rd] = MathOperators.add( - (RuntimeScalar) registers[sumReg], - new RuntimeScalar(i) - ); -} -``` - -## Runtime Sharing (100% API Compatibility) - -### Key Principle: Zero Duplication - -The interpreter and compiler share **IDENTICAL runtime APIs**. There is **NO duplicated logic** between the two execution modes. - -**Both Use Exactly the Same:** -- `RuntimeCode.apply()` - Execute subroutines -- `RuntimeScalar`, `RuntimeArray`, `RuntimeHash` - Data structures -- `MathOperators`, `StringOperators`, `CompareOperators` - All operators -- `GlobalVariable` - Global state (scalars, arrays, hashes, code refs) -- `RuntimeContextType` - Calling context (void/scalar/list/runtime) -- `RuntimeControlFlowList` - Control flow exceptions (last/next/redo/goto) - -### How It Works - -**Interpreter:** Direct Java method calls in switch cases -```java -case Opcodes.ADD_SCALAR: { - int rd = bytecode[pc++] & 0xFF; - int rs1 = bytecode[pc++] & 0xFF; - int rs2 = bytecode[pc++] & 0xFF; - registers[rd] = MathOperators.add( - (RuntimeScalar) registers[rs1], - (RuntimeScalar) registers[rs2] - ); - break; -} -``` - -**Compiler:** Generated JVM bytecode calls same method -```java -// Generated by EmitBinaryOperator.java -ALOAD leftScalar // Load first operand -ALOAD rightScalar // Load second operand -INVOKESTATIC org/perlonjava/operators/MathOperators.add( - Lorg/perlonjava/runtime/RuntimeScalar; - Lorg/perlonjava/runtime/RuntimeScalar; -)Lorg/perlonjava/runtime/RuntimeScalar; -ASTORE result // Store result -``` - -**Result:** Identical behavior, same semantics, same global state. - -### No Differences In: - -- **Method signatures** - Same parameters, same return types -- **Semantics** - Same type coercion, overloading, context handling -- **Global state** - Both modify same GlobalVariable.globalScalar, etc. -- **Error handling** - Same exceptions (DieException, ControlFlowException) -- **Closure behavior** - Both capture same variables, share same RuntimeScalar refs - -### Only Difference: Execution Timing - -- **Interpreter**: ~47M ops/sec (consistent, no warmup) -- **Compiler**: ~82M ops/sec (after JIT warmup) - -The interpreter trades raw speed for faster startup and lower memory usage. - -### Bidirectional Calling - -**Compiled → Interpreted:** -```perl -sub compiled_sub { - my $code = eval 'sub { my $x = shift; return $x * 2; }'; # Interpreted closure - return $code->(21); # Compiled code calls interpreted code -} -``` - -**Interpreted → Compiled:** -```perl -sub compiled_helper { return shift() * 2; } - -my $code = eval 'sub { - my $x = shift; - return compiled_helper($x); # Interpreted code calls compiled code -}'; -``` - -Both work seamlessly because they share the same RuntimeCode.apply() interface. - -### Closure Variable Sharing - -Closures capture variables as RuntimeScalar references: - -```perl -my $x = 10; -my $code = sub { $x += 1; return $x; }; -print $code->(); # 11 -print $code->(); # 12 -print $x; # 12 (shared reference!) -``` - -**Compiled closure:** -- Generates JVM field to hold RuntimeScalar reference -- Loads field, calls methods on it - -**Interpreted closure:** -- Stores RuntimeScalar reference in closure variables map -- Retrieves from map, operates on same object - -**Result:** Both modify the SAME RuntimeScalar object in memory. - -## Development Workflow - -### Adding a New Opcode +**Use SLOW_OP when:** +- Operation is rarely used (<1% of execution) +- Complex argument handling +- System calls, I/O operations -Follow these steps when adding a new opcode (e.g., a new superinstruction): +### 2. Adding a Fast Opcode -#### Step 1: Define Opcode in Opcodes.java +**Example: Unary + operator (forces numeric/scalar context)** +#### Step 2.1: Define in Opcodes.java ```java -// Add to appropriate category section -// Use next sequential number (currently 83+) - -/** - * MUL_ASSIGN: rd = rd * rs - * Format: [MUL_ASSIGN] [rd] [rs] - * Effect: Multiplies register rd by register rs, stores result in rd - */ -public static final byte MUL_ASSIGN = 83; // rd = rd * rs +// Find next available opcode number (currently 169+) +/** Unary +: Forces numeric/scalar context on operand */ +public static final short UNARY_PLUS = 169; ``` -**Important:** Maintain dense numbering! No gaps between opcodes to preserve tableswitch optimization. - -#### Step 2: Implement in BytecodeInterpreter.java - -Add case to main switch statement (around line 62-632): +**Critical: Keep opcodes contiguous! No gaps allowed.** +#### Step 2.2: Implement in BytecodeInterpreter.java ```java -case Opcodes.MUL_ASSIGN: { - // Decode operands from bytecode - int rd = bytecode[pc++] & 0xFF; - int rs = bytecode[pc++] & 0xFF; - - // Call appropriate runtime method - registers[rd] = MathOperators.multiply( - (RuntimeScalar) registers[rd], - (RuntimeScalar) registers[rs] - ); +case Opcodes.UNARY_PLUS: { + int rd = bytecode[pc++]; + int rs = bytecode[pc++]; + // Force scalar context + RuntimeBase operand = registers[rs]; + registers[rd] = operand.scalar(); break; } ``` -**Pattern:** -1. Decode operands (increment `pc` for each byte consumed) -2. Call runtime methods (MathOperators, StringOperators, etc.) -3. Store result in destination register - -#### Step 3: Emit in BytecodeCompiler.java - -Add emission logic in appropriate visit() method: - +#### Step 2.3: Emit in BytecodeCompiler.java ```java -// In visit(BinaryOperatorNode node) method -if (node instanceof OperatorNode) { - OperatorNode binOp = (OperatorNode) node; - - // Detect pattern: $var = $var * expr - if (binOp.left instanceof VariableNode && - binOp.right instanceof BinaryOperatorNode) { - - BinaryOperatorNode rightBin = (BinaryOperatorNode) binOp.right; - String leftVarName = ((VariableNode) binOp.left).variableName; - - if (rightBin.left instanceof VariableNode) { - String rightLeftVarName = ((VariableNode) rightBin.left).variableName; - - // Pattern match: $var = $var * expr - if (leftVarName.equals(rightLeftVarName) && - rightBin.operator.equals("*")) { +} else if (op.equals("+")) { + // Unary + operator + int savedContext = currentCallContext; + currentCallContext = RuntimeContextType.SCALAR; + try { + node.operand.accept(this); + int operandReg = lastResultReg; - int varReg = getOrAllocateRegister(leftVarName); - int rightRightReg = visit(rightBin.right); // Evaluate right side + int rd = allocateRegister(); + emit(Opcodes.ARRAY_SIZE); // Converts array to size, passes through scalars + emitReg(rd); + emitReg(operandReg); - emit(Opcodes.MUL_ASSIGN); - emit(varReg); - emit(rightRightReg); - return varReg; // Return destination register - } - } + lastResultReg = rd; + } finally { + currentCallContext = savedContext; } } ``` -#### Step 4: Add to Disassembler (InterpretedCode.java) - -Add case to disassemble() switch statement (around line 100-300): - +#### Step 2.4: Add Disassembly (InterpretedCode.java) ```java -case Opcodes.MUL_ASSIGN: { - int rd = bytecode[pc++] & 0xFF; - int rs = bytecode[pc++] & 0xFF; - sb.append(String.format("%-20s r%d *= r%d", "MUL_ASSIGN", rd, rs)); +case Opcodes.UNARY_PLUS: + rd = bytecode[pc++]; + rs = bytecode[pc++]; + sb.append("UNARY_PLUS r").append(rd).append(" = +r").append(rs).append("\n"); break; -} -``` - -This enables debugging with `./jperl --disassemble script.pl`. - -#### Step 5: Update Documentation - -**BYTECODE_DOCUMENTATION.md:** -```markdown -### MUL_ASSIGN (83) - -**Format:** `[MUL_ASSIGN] [rd] [rs]` - -**Effect:** `rd = rd * rs` - -**Description:** -Superinstruction that multiplies destination register by source register. -Equivalent to ADD_SCALAR followed by MOVE, but eliminates intermediate register. - -**Example:** -``` -MUL_ASSIGN r5 *= r3 # r5 = r5 * r3 ``` -``` - -**Update opcode count** in all documentation (update to reflect current implemented opcodes: 0-82, 87, 99). -### Adding Support for Existing Perl Operators +**WARNING:** Missing disassembly cases cause PC misalignment! When disassembler hits unknown opcode, it doesn't advance PC for operands, corrupting all subsequent instructions. -When adding support for Perl built-in operators (push, pop, shift, unshift, etc.) that already have opcodes defined: +### 3. Adding STORE_GLOBAL_* Opcodes -#### Pattern 1: Binary Operators (push, unshift) +**Example: STORE_GLOBAL_ARRAY (13), STORE_GLOBAL_HASH (15)** -**Parse Structure:** `BinaryOperatorNode` with operator name, left = array variable, right = values -- Example: `push @array, 1, 2, 3` → BinaryOperatorNode("push", left=@array, right=ListNode) - -**Steps:** - -1. **Determine if opcode exists** - Check Opcodes.java for the operation - - ARRAY_PUSH (44), ARRAY_UNSHIFT (47) already defined - -2. **Add case to BytecodeCompiler.visit(BinaryOperatorNode)** - Add to switch statement around line 1000-1400: +These opcodes already existed but lacked interpreter/disassembly support. +#### Step 3.1: Implement Runtime in BytecodeInterpreter.java ```java -case "push" -> { - // Array push: push(@array, values...) - // left: OperatorNode("@", IdentifierNode("array")) - // right: ListNode with values to push - - // Validate left operand is array variable - if (!(node.left instanceof OperatorNode)) { - throwCompilerException("push requires array variable"); - } - OperatorNode leftOp = (OperatorNode) node.left; - if (!leftOp.operator.equals("@") || !(leftOp.operand instanceof IdentifierNode)) { - throwCompilerException("push requires array variable: push @array, values"); - } +case Opcodes.STORE_GLOBAL_ARRAY: { + int nameIdx = bytecode[pc++]; + int srcReg = bytecode[pc++]; + String name = code.stringPool[nameIdx]; - String varName = "@" + ((IdentifierNode) leftOp.operand).name; + RuntimeArray globalArray = GlobalVariable.getGlobalArray(name); + RuntimeBase value = registers[srcReg]; - // Get the array - check lexical first, then global - int arrayReg; - if (hasVariable(varName)) { - // Lexical array - arrayReg = getVariableRegister(varName); + // Clear and populate + if (value instanceof RuntimeArray) { + globalArray.elements.clear(); + globalArray.elements.addAll(((RuntimeArray) value).elements); + } else if (value instanceof RuntimeList) { + globalArray.setFromList((RuntimeList) value); } else { - // Global array - load it - arrayReg = allocateRegister(); - String globalArrayName = getCurrentPackage() + "::" + ((IdentifierNode) leftOp.operand).name; - int nameIdx = addToStringPool(globalArrayName); - emit(Opcodes.LOAD_GLOBAL_ARRAY); - emit(arrayReg); - emit(nameIdx); + globalArray.setFromList(value.getList()); } - - // Evaluate the values to push (right operand) - node.right.accept(this); - int valuesReg = lastResultReg; - - // Emit ARRAY_PUSH - emit(Opcodes.ARRAY_PUSH); - emit(arrayReg); - emit(valuesReg); - - // Set result register - lastResultReg = arrayReg; -} -``` - -**Important Notes:** -- Use `getCurrentPackage()` instead of hardcoded `"main::"` for global variables -- Handle both lexical and global arrays -- Validate operator structure before processing - -3. **Add case to BytecodeInterpreter.execute()** - If opcode not yet implemented: - -```java -case Opcodes.ARRAY_PUSH: { - // Array push: push(@array, value) - int arrayReg = bytecode[pc++] & 0xFF; - int valueReg = bytecode[pc++] & 0xFF; - RuntimeArray arr = (RuntimeArray) registers[arrayReg]; - RuntimeBase val = registers[valueReg]; // Use RuntimeBase, not RuntimeScalar - arr.push(val); // RuntimeArray.push() can handle RuntimeList break; } ``` -**Important:** Use `RuntimeBase` not `RuntimeScalar` for values that might be lists. RuntimeArray.push() handles RuntimeList by calling `value.addToArray()`. - -#### Pattern 2: Unary Operators (pop, shift, unaryMinus) - -**Parse Structure:** `OperatorNode` with operator name and operand -- Example: `my $x = pop @array` → OperatorNode("pop", operand=ListNode[@array]) -- Example: `-$x` → OperatorNode("unaryMinus", operand=$x) - -**Steps:** - -1. **Add case to BytecodeCompiler.visit(OperatorNode)** - Add to if/else chain around line 1900-2100: - -```java -} else if (op.equals("pop")) { - // Array pop: $x = pop @array - // operand: ListNode containing OperatorNode("@", IdentifierNode) - if (node.operand == null || !(node.operand instanceof ListNode)) { - throwCompilerException("pop requires array argument"); - } - - ListNode list = (ListNode) node.operand; - if (list.elements.isEmpty() || !(list.elements.get(0) instanceof OperatorNode)) { - throwCompilerException("pop requires array variable"); - } - - OperatorNode arrayOp = (OperatorNode) list.elements.get(0); - if (!arrayOp.operator.equals("@") || !(arrayOp.operand instanceof IdentifierNode)) { - throwCompilerException("pop requires array variable: pop @array"); - } - - String varName = "@" + ((IdentifierNode) arrayOp.operand).name; - - // Get the array - check lexical first, then global - int arrayReg; - if (hasVariable(varName)) { - // Lexical array - arrayReg = getVariableRegister(varName); - } else { - // Global array - load it - arrayReg = allocateRegister(); - String globalArrayName = getCurrentPackage() + "::" + ((IdentifierNode) arrayOp.operand).name; - int nameIdx = addToStringPool(globalArrayName); - emit(Opcodes.LOAD_GLOBAL_ARRAY); - emit(arrayReg); - emit(nameIdx); - } - - // Allocate result register - int rd = allocateRegister(); - - // Emit ARRAY_POP - emit(Opcodes.ARRAY_POP); - emit(rd); - emit(arrayReg); - - lastResultReg = rd; -} -``` - -For simple unary operators like negation: +**Key Insight:** Match compiler semantics exactly. Check compiler's EmitterVisitor or runtime methods to understand expected behavior. +#### Step 3.2: Add Disassembly ```java -} else if (op.equals("unaryMinus")) { - // Unary minus: -$x - // Compile operand - node.operand.accept(this); - int operandReg = lastResultReg; - - // Allocate result register - int rd = allocateRegister(); - - // Emit NEG_SCALAR - emit(Opcodes.NEG_SCALAR); - emit(rd); - emit(operandReg); - - lastResultReg = rd; -} -``` - -2. **Add case to BytecodeInterpreter.execute()** - If not yet implemented: - -```java -case Opcodes.ARRAY_POP: { - // Array pop: rd = pop(@array) - int rd = bytecode[pc++] & 0xFF; - int arrayReg = bytecode[pc++] & 0xFF; - RuntimeArray arr = (RuntimeArray) registers[arrayReg]; - registers[rd] = RuntimeArray.pop(arr); // Static method +case Opcodes.STORE_GLOBAL_ARRAY: + nameIdx = bytecode[pc++]; + int srcReg = bytecode[pc++]; + sb.append("STORE_GLOBAL_ARRAY @").append(stringPool[nameIdx]) + .append(" = r").append(srcReg).append("\n"); break; -} ``` -**Important:** Check if the runtime method is static or instance. Most RuntimeArray operations are static methods. - -### When to Use SLOW_OP +### 4. Lvalue Subroutine Assignment -Some operations are too complex for a dedicated fast opcode or are used infrequently. Use the SLOW_OP mechanism: +**Perl Feature:** `f() = "X"` where f returns mutable reference -**Example: splice operation (SLOWOP_SPLICE)** - -1. **Add slow op constant to Opcodes.java:** -```java -/** Slow op ID: rd = Operator.splice(array, args_list) - splice array operation */ -public static final int SLOWOP_SPLICE = 28; -``` - -2. **Add case to SlowOpcodeHandler.execute():** -```java -case Opcodes.SLOWOP_SPLICE: - return executeSplice(bytecode, pc, registers); +**Parse Structure:** ``` - -3. **Implement handler in SlowOpcodeHandler:** -```java -private static int executeSplice( - byte[] bytecode, - int pc, - RuntimeBase[] registers) { - - int rd = bytecode[pc++] & 0xFF; - int arrayReg = bytecode[pc++] & 0xFF; - int argsReg = bytecode[pc++] & 0xFF; - - RuntimeArray array = (RuntimeArray) registers[arrayReg]; - RuntimeList args = (RuntimeList) registers[argsReg]; - - RuntimeList result = org.perlonjava.operators.Operator.splice(array, args); - - registers[rd] = result; - return pc; -} -``` - -4. **Update getSlowOpName() in SlowOpcodeHandler:** -```java -case Opcodes.SLOWOP_SPLICE -> "splice"; +BinaryOperatorNode: = + BinaryOperatorNode: ( # Function call + OperatorNode: & + IdentifierNode: 'f' + ListNode: [] # Arguments + StringNode: "X" ``` -5. **Emit from BytecodeCompiler:** +**Implementation in BytecodeCompiler.java:** ```java -} else if (op.equals("splice")) { - // Parse operands, get array register - // Compile arguments into a list - int argsListReg = allocateRegister(); - emit(Opcodes.CREATE_LIST); - emit(argsListReg); - emit(argRegs.size()); - for (int argReg : argRegs) { - emit(argReg); - } +// In compileAssignmentOperator(), before error throw: +if (leftBin.operator.equals("(")) { + // Call function (returns RuntimeBaseProxy in lvalue context) + node.left.accept(this); + int lvalueReg = lastResultReg; - int rd = allocateRegister(); - emit(Opcodes.SLOW_OP); - emit(Opcodes.SLOWOP_SPLICE); - emit(rd); - emit(arrayReg); - emit(argsListReg); + // Compile RHS + node.right.accept(this); + int rhsReg = lastResultReg; - lastResultReg = rd; -} -``` + // Assign using SET_SCALAR + emit(Opcodes.SET_SCALAR); + emitReg(lvalueReg); + emitReg(rhsReg); -**When to use SLOW_OP:** -- Operation is rarely used (<1% of execution) -- Operation requires complex argument handling -- Operation already has a good runtime implementation in Operator.java -- Want to preserve fast opcode space (0-99) for hot path operations - -**Benefits:** -- Only uses 1 byte of opcode space (SLOW_OP = 87) -- Keeps main interpreter switch compact -- Easy to add without affecting hot path performance - -#### Common Patterns and Gotchas - -**1. Package Names:** -- Always use `NameNormalizer.normalizeVariableName()` for global variables, not manual construction -- Pattern: `String globalName = NameNormalizer.normalizeVariableName(simpleName, getCurrentPackage());` -- This handles special variables, caching, and proper package resolution -- Example: - ```java - // Good: - String globalArrayName = NameNormalizer.normalizeVariableName( - ((IdentifierNode) leftOp.operand).name, - getCurrentPackage() - ); - - // Avoid: - String globalArrayName = getCurrentPackage() + "::" + ((IdentifierNode) leftOp.operand).name; - ``` - -**2. Lexical vs Global Variables:** -```java -int arrayReg; -if (hasVariable(varName)) { - // Lexical: already in a register - arrayReg = getVariableRegister(varName); -} else { - // Global: need to load it - arrayReg = allocateRegister(); - String globalName = getCurrentPackage() + "::" + simpleName; - int nameIdx = addToStringPool(globalName); - emit(Opcodes.LOAD_GLOBAL_ARRAY); // or LOAD_GLOBAL_HASH, LOAD_GLOBAL_SCALAR - emit(arrayReg); - emit(nameIdx); + lastResultReg = rhsReg; + currentCallContext = savedContext; + return; } ``` -**3. Runtime Method Signatures:** -- Check if methods are static: `RuntimeArray.pop(arr)` not `arr.pop()` -- Check parameter types: use `RuntimeBase` for values that might be lists -- Pattern: Look at how the compiler's EmitterVisitor calls the same runtime method +**How It Works:** +- Lvalue subroutines return RuntimeBaseProxy (extends RuntimeScalar) +- RuntimeBaseProxy has `lvalue` field pointing to actual mutable location +- SET_SCALAR calls `.set()` on the proxy, which delegates to the lvalue +- Example: `substr($x,0,1)` returns proxy to first character of $x -**4. Parse Structure:** -- Use `./jperl --parse -E 'code'` to see how Perl code is parsed -- Binary operators: `BinaryOperatorNode(operator, left, right)` -- Unary operators: `OperatorNode(operator, operand)` -- Function calls with multiple args: Usually `OperatorNode(name, ListNode(args))` - -**5. Error Messages:** -- Use `throwCompilerException()` for clear error messages -- Include the expected syntax in the error message -- Example: `throwCompilerException("push requires array variable: push @array, values")` - -#### Testing New Operators - -After implementing: +### 5. Testing New Operators ```bash # Build make # Test manually -./jperl --interpreter -E 'my @a = (1,2); push @a, 3; say $a[-1]' - -# Test disassembly -./jperl --disassemble -E 'my @a = (1,2); push @a, 3' - -# Run test file -./jperl --interpreter src/test/resources/unit/array.t -``` - -### Common Parse Structures Reference - -Use `./jperl --parse -E 'code'` to understand how Perl constructs are represented in the AST. Here are common patterns: - -#### Array Operations - -**Array Slice (read):** -```perl -my @slice = @array[1..3]; -``` -Parse structure: -``` -BinaryOperatorNode: = - OperatorNode: my - OperatorNode: @ # Slice uses @ sigil - IdentifierNode: 'slice' - BinaryOperatorNode: [ - OperatorNode: @ # Source array with @ sigil - IdentifierNode: 'array' - ArrayLiteralNode: - BinaryOperatorNode: .. # Range operator - NumberNode: 1 - NumberNode: 3 -``` - -**Array Slice (assignment):** -```perl -@array[1, 3, 5] = (20, 30, 40); -``` -Parse structure: -``` -BinaryOperatorNode: = - BinaryOperatorNode: [ # Left side is slice expression - OperatorNode: @ - IdentifierNode: 'array' - ArrayLiteralNode: # List of indices - NumberNode: 1 - NumberNode: 3 - NumberNode: 5 - ListNode: # Right side is values - NumberNode: 20 - NumberNode: 30 - NumberNode: 40 -``` - -**Key differences:** -- Single element: `$array[1]` uses `$` sigil (OperatorNode: "$") -- Slice: `@array[1,2,3]` uses `@` sigil (OperatorNode: "@") -- Slice indices can be a range (`1..3`) or list (`1, 3, 5`) - -#### List Operators with Blocks - -**map:** -```perl -my @doubled = map { $_ * 2 } @array; -``` -Parse structure: -``` -BinaryOperatorNode: map - SubroutineNode: # Anonymous subroutine (the block) - BlockNode: - BinaryOperatorNode: * - OperatorNode: $ - IdentifierNode: '_' - NumberNode: 2 - ListNode: # Input list - OperatorNode: @ - IdentifierNode: 'array' -``` - -**grep:** -```perl -my @evens = grep { $_ % 2 == 0 } @array; -``` -Same structure as `map`, with BinaryOperatorNode("grep", SubroutineNode, ListNode) +./jperl -E 'my @x = (1,2,3); say +@x' # Should print 3 -**sort:** -```perl -my @sorted = sort { $a <=> $b } @array; -``` -Same structure, with BinaryOperatorNode("sort", SubroutineNode, ListNode) +# Test disassembly (verifies PC advancement) +./jperl --disassemble -E 'my @x; say +@x' 2>&1 | grep UNARY -#### Simple List Operators +# Run unit tests +make test-unit -**reverse:** -```perl -my @reversed = reverse @array; -``` -Parse structure: -``` -OperatorNode: reverse - ListNode: - OperatorNode: @ - IdentifierNode: 'array' +# Verify tableswitch preserved +javap -c -classpath build/classes/java/main \ + org.perlonjava.interpreter.BytecodeInterpreter | grep -A 5 "switch" ``` -**join:** -```perl -my $joined = join ", ", @array; -``` -Parse structure: -``` -BinaryOperatorNode: join - StringNode: ', ' # Separator (left) - ListNode: # List to join (right) - OperatorNode: @ - IdentifierNode: 'array' -``` +**Must see `tableswitch`, not `lookupswitch`!** -**splice:** -```perl -splice @array, 2, 1, (10, 11); -``` -Parse structure: +**Example output showing tableswitch:** ``` -OperatorNode: splice - ListNode: # All arguments as list - OperatorNode: @ # Array to splice - IdentifierNode: 'array' - NumberNode: 2 # Offset - NumberNode: 1 # Length - ListNode: # Replacement values - NumberNode: 10 - NumberNode: 11 + 148: tableswitch { // 0 to 168 + 0: 840 + 1: 843 + 2: 893 + 3: 909 + 4: 976 ``` -#### Implementation Patterns by Parse Structure - -**Pattern 1: OperatorNode with ListNode operand** -- Examples: pop, shift, reverse, splice -- First list element is usually the array -- Remaining elements are parameters -- Implementation: Extract array from list, process remaining args - -**Pattern 2: BinaryOperatorNode with array left, values right** -- Examples: push, unshift -- Left: Array variable (OperatorNode: "@") -- Right: Values to add (ListNode) -- Implementation: Get array register, compile values, emit opcode - -**Pattern 3: BinaryOperatorNode with block and list** -- Examples: map, grep, sort -- Left: SubroutineNode (the code block) -- Right: ListNode (input data) -- Implementation: Compile block to closure, compile list, call operator - -**Detailed Implementation for Pattern 3 (grep, map, sort):** - -1. **AST Structure**: BinaryOperatorNode where: - - `left` = SubroutineNode (anonymous sub representing the block) - - `right` = ListNode (input data) - -2. **BytecodeCompiler Implementation**: - ```java - case "grep" -> { - // Compile SubroutineNode (left operand) to closure - // This is handled automatically by visit(SubroutineNode) - // Result will be in lastResultReg as a RuntimeScalar containing RuntimeCode - - // rs1 = closure register - // rs2 = list register - - emit(Opcodes.GREP); - emit(rd); // Result register - emit(rs2); // List register - emit(rs1); // Closure register - emit(RuntimeContextType.LIST); // Context - } - ``` - -3. **BytecodeInterpreter Implementation**: - ```java - case Opcodes.GREP: { - int rd = bytecode[pc++] & 0xFF; - int listReg = bytecode[pc++] & 0xFF; - int closureReg = bytecode[pc++] & 0xFF; - int ctx = bytecode[pc++] & 0xFF; - - RuntimeBase listBase = registers[listReg]; - RuntimeList list = listBase.getList(); - RuntimeScalar closure = (RuntimeScalar) registers[closureReg]; - RuntimeList result = org.perlonjava.operators.ListOperators.grep(list, closure, ctx); - registers[rd] = result; - break; - } - ``` - -4. **Disassembler** (InterpretedCode.java): - ```java - case Opcodes.GREP: - rd = bytecode[pc++] & 0xFF; - rs1 = bytecode[pc++] & 0xFF; // list register - rs2 = bytecode[pc++] & 0xFF; // closure register - int grepCtx = bytecode[pc++] & 0xFF; - sb.append("GREP r").append(rd).append(" = grep(r").append(rs1) - .append(", r").append(rs2).append(", ctx=").append(grepCtx).append(")\n"); - break; - ``` - -5. **Sort is special**: Uses package name instead of context: - ```java - emit(Opcodes.SORT); - emit(rd); - emit(rs2); // List register - emit(rs1); // Closure register - emitInt(addToStringPool(currentPackage)); // Package name (4 bytes) - ``` - -**Pattern 4: BinaryOperatorNode with separator and list** -- Example: join -- Left: Separator value -- Right: ListNode to join -- Implementation: Compile both operands, emit opcode - -#### Step 6: Update Documentation - -**BYTECODE_DOCUMENTATION.md:** -```markdown -### MUL_ASSIGN (83) - -**Format:** `[MUL_ASSIGN] [rd] [rs]` - -**Effect:** `rd = rd * rs` - -**Description:** -Superinstruction that multiplies destination register by source register. -Equivalent to ADD_SCALAR followed by MOVE, but eliminates intermediate register. +**If you see `lookupswitch` instead, you've introduced gaps in opcode numbering!** -**Example:** -``` -MUL_ASSIGN r5 *= r3 # r5 = r5 * r3 -``` -``` +### Critical Lessons Learned -**Update opcode count** in all documentation (update to reflect current implemented opcodes: 0-82, 87, 99). +**1. Disassembly is NOT Optional** +- Missing disassembly cases cause PC misalignment +- All subsequent bytecode appears corrupted +- Manifests as "Index N out of bounds" or "Unknown opcode" +- **Always add disassembly case when adding opcode** -#### Step 6: Test Thoroughly +**2. Match Compiler Semantics Exactly** +- Check EmitterVisitor or runtime methods +- Don't guess - read the code +- Example: `local $x` must call `makeLocal()`, not just assign -**Create Test Case:** +**3. Never Hide Problems** +- Null checks can mask real bugs +- If registers[N] is null, find why it wasn't initialized +- Don't paper over the issue -```perl -# dev/interpreter/tests/mul_assign_test.t -use strict; -use warnings; -use Test::More tests => 3; - -my $x = 5; -$x *= 3; -is($x, 15, "MUL_ASSIGN: scalar multiplication"); - -my $y = 10; -$y *= 2; -$y *= 2; -is($y, 40, "MUL_ASSIGN: chained multiplication"); - -my $z = 7; -$z *= 0; -is($z, 0, "MUL_ASSIGN: multiply by zero"); -``` +**4. Opcode Contiguity is Performance-Critical** +- JVM uses tableswitch (O(1)) for dense opcodes +- Gaps cause lookupswitch (O(log n)) - 10-15% slowdown +- Always use next sequential number -**Run Tests:** -```bash -make dev # Clean rebuild -./jperl dev/interpreter/tests/mul_assign_test.t -make test-unit # All unit tests must pass -``` +**5. Error Messages Must Include Context** +- Use `throwCompilerException(message, tokenIndex)` +- Shows filename, line number, and code snippet +- Makes debugging 10x easier -**Verify Tableswitch:** -```bash -javap -c -p -classpath build/classes/java/main \ - org.perlonjava.interpreter.BytecodeInterpreter | grep -A 5 "switch" -``` +## Common Pitfalls -Should see `tableswitch` (not `lookupswitch`). If you see `lookupswitch`, you've introduced a gap in numbering! - -**Run Benchmarks:** -```bash -./jperl dev/interpreter/tests/for_loop_benchmark.pl -``` - -Check that performance hasn't regressed. New superinstruction should improve performance for matching patterns. - -### Debugging Tips - -**Disassemble Bytecode:** -```bash -./jperl --disassemble -E 'my $x = 10; $x *= 2; print $x' -``` - -Output shows generated bytecode: -``` -LOAD_INT r0, 10 -LOAD_INT r1, 2 -MUL_ASSIGN r0 *= r1 -PRINT r0 -``` - -**Add Debug Logging:** +**1. Forgetting PC Increment:** ```java -case Opcodes.MUL_ASSIGN: { - int rd = bytecode[pc++] & 0xFF; - int rs = bytecode[pc++] & 0xFF; - System.err.printf("MUL_ASSIGN: r%d *= r%d (before: %s, %s)\n", - rd, rs, registers[rd], registers[rs]); - registers[rd] = MathOperators.multiply(...); - System.err.printf(" after: %s\n", registers[rd]); - break; -} -``` +// WRONG: Infinite loop! +int rd = bytecode[pc] & 0xFF; -**Use Java Debugger:** -```bash -# Add breakpoint in BytecodeInterpreter.java -java -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005 \ - -cp build/libs/perlonjava-*-all.jar \ - org.perlonjava.Main --eval 'my $x = 10; $x *= 2' +// RIGHT: +int rd = bytecode[pc++]; ``` -### Critical Debugging Patterns (Learned from Array Operators Work) - -**1. Bare Blocks vs Loops (For3Node.isSimpleBlock)** - -For3Node represents both bare blocks `{ }` and real loops. The `isSimpleBlock` flag distinguishes them: -- `isSimpleBlock = true`: Execute body once (bare block) -- `isSimpleBlock = false`: Standard for/while loop - -**Bug Pattern:** Ignoring this flag causes infinite loops: +**2. Opcode Gaps:** ```java -// WRONG: Always creates loop bytecode -@Override -public void visit(For3Node node) { - emitLabel(startLabel); - // ... condition check ... - node.body.accept(this); - emit(Opcodes.GOTO); - emitInt(startLabel); // INFINITE LOOP for bare blocks! -} +// WRONG: Breaks tableswitch! +public static final short OP_A = 82; +public static final short OP_B = 90; // Gap! -// RIGHT: Check isSimpleBlock first -@Override -public void visit(For3Node node) { - if (node.isSimpleBlock) { - // Bare block: execute once - if (node.body != null) { - node.body.accept(this); - } - lastResultReg = -1; - return; - } - // ... rest of loop handling ... -} +// RIGHT: +public static final short OP_A = 82; +public static final short OP_B = 83; // Sequential ``` -**Location:** BytecodeCompiler.java:3152 (visit method) - -**2. Disassembler MUST Skip All Operands** - -When adding SLOW_OP operations, the disassembler must read/skip ALL operands or PC becomes misaligned: - +**3. Missing Disassembly:** ```java -// WRONG: Default case doesn't skip operands +// WRONG: Causes PC misalignment! default: - sb.append("SLOW_OP (operands not decoded)"); - // PC not advanced! Next read will be wrong byte! + sb.append("UNKNOWN\n"); // Doesn't read operands! break; -// RIGHT: Every case must read correct number of operands -case Opcodes.SLOWOP_SPLIT: - rd = bytecode[pc++] & 0xFF; // Skip rd - int patternReg = bytecode[pc++] & 0xFF; // Skip pattern reg - int argsReg = bytecode[pc++] & 0xFF; // Skip args reg - int ctx = bytecode[pc++] & 0xFF; // Skip context - sb.append(" r").append(rd).append(" = split(r") - .append(patternReg).append(", r").append(argsReg) - .append(", ctx=").append(ctx).append(")"); +// RIGHT: Every opcode must read its operands +case Opcodes.MY_OP: + int rd = bytecode[pc++]; + int rs = bytecode[pc++]; + sb.append("MY_OP r").append(rd).append(", r").append(rs).append("\n"); break; ``` -**Error Pattern:** "Index N out of bounds" in disassembler means a SLOW_OP case is missing or not skipping operands. +## JIT Compilation Limit -**Location:** InterpretedCode.java disassemble() method +**Critical:** JVM refuses to JIT-compile methods >~8000 bytes, causing 5-10x slowdown. -**3. Scalar Context in Function Arguments (Known Issue)** +**Solution:** Delegate cold opcodes to secondary methods: +- `executeComparisons()` - Comparison ops (31-41) +- `executeArithmetic()` - Multiply, divide, compound (19-30, 110-113) +- `executeCollections()` - Array/hash ops (43-56, 93-96) +- `executeTypeOps()` - Type/reference ops (62-70, 102-105) -Array element access returns wrong value when used directly in function arguments: +**Monitor:** Run `dev/tools/check-bytecode-size.sh` after changes. -```perl -# WRONG RESULT: -my @arr = (1, 2, 3); -is($arr[1], 2, "test"); # gets: 1, expected: 2 +## Performance Targets -# WORKAROUND: -my $x = $arr[1]; -is($x, 2, "test"); # WORKS: gets: 2, expected: 2 -``` - -**Root Cause:** Bytecode calls ARRAY_SIZE after ARRAY_GET: -``` -54: ARRAY_GET r13 = r3[r14] # Gets element (value 2) -58: ARRAY_SIZE r15 = size(r13) # Converts to size (1) - BUG! -70: CREATE_LIST r18 = [r15, ...] # Passes size instead of element -``` +- **Current:** 46.84M ops/sec (1.75x slower than compiler ✓) +- **Target:** 2-5x slower than compiler +- **Compiler:** ~82M ops/sec (after JIT warmup) -**Status:** Known issue, not fixed yet. Core array operators work correctly. This is a scalar context handling bug in function argument processing. +**Trade-off:** Slower execution for faster startup and lower memory. -**Location:** BytecodeCompiler.java around line 1998-2005 (scalar operator handling) - -## Context Propagation (TODO) - -**Current Implementation:** -The interpreter currently handles scalar context by converting values after compilation: -```java -// Current approach: Convert after compilation -node.operand.accept(this); -int operandReg = lastResultReg; -emit(Opcodes.ARRAY_SIZE); // Convert array to size -``` - -**Problem:** -This approach doesn't work for all cases: -```perl -my $s = @array; # Works: emits ARRAY_SIZE -join(", ", @array); # Broken: converts @array to size before join sees it -``` - -**Better Approach (like codegen):** -Propagate `RuntimeContextType` through compilation: -```java -// Codegen approach: Propagate context -node.operand.accept(emitterVisitor.with(RuntimeContextType.SCALAR)); -``` - -**Implementation Plan:** -1. Add `currentContext` field to BytecodeCompiler (like EmitterContext.contextType) -2. Modify `visit()` methods to check `currentContext` and emit appropriate opcodes -3. For `@` operator in SCALAR context: emit ARRAY_SIZE automatically -4. For function calls: set context based on prototype/signature -5. Remove post-compilation conversions - -**Benefits:** -- Handles all Perl context semantics correctly -- Matches codegen behavior exactly -- Cleaner architecture - context flows naturally through AST - -**Files to Modify:** -- BytecodeCompiler.java: Add context tracking and propagation -- visit(OperatorNode) for "@": Check context, emit ARRAY_SIZE if scalar -- visit(BinaryOperatorNode) for "=": Set RHS context based on LHS type -- Function calls: Propagate correct context to arguments - -### Common Pitfalls - -**1. Forgetting to Increment PC:** -```java -// WRONG: pc not incremented, will read same byte forever! -case Opcodes.MUL_ASSIGN: { - int rd = bytecode[pc] & 0xFF; // Missing pc++ - int rs = bytecode[pc] & 0xFF; // Missing pc++ - ... -} - -// RIGHT: -int rd = bytecode[pc++] & 0xFF; -int rs = bytecode[pc++] & 0xFF; -``` - -**2. Creating Gaps in Opcode Numbering:** -```java -// WRONG: Gap between 82 and 90 breaks tableswitch! -public static final byte LOOP_PLUS_PLUS = 82; -public static final byte MUL_ASSIGN = 90; // Gap! - -// RIGHT: Sequential numbering -public static final byte LOOP_PLUS_PLUS = 82; -public static final byte MUL_ASSIGN = 83; // No gap -``` - -**3. Incorrect Type Casting:** -```java -// WRONG: ClassCastException if register contains RuntimeArray! -RuntimeScalar scalar = (RuntimeScalar) registers[rd]; +## Runtime Sharing (100% API Compatibility) -// RIGHT: Check type or use safe casting -if (registers[rd] instanceof RuntimeScalar) { - RuntimeScalar scalar = (RuntimeScalar) registers[rd]; - ... -} -``` +Interpreter and compiler call **identical** runtime methods: +- MathOperators, StringOperators, CompareOperators +- RuntimeScalar, RuntimeArray, RuntimeHash +- RuntimeCode.apply(), GlobalVariable +- No duplicated logic whatsoever -**4. Not Handling Context:** +**Example:** ```java -// WRONG: Ignores void/scalar/list context -result = code.apply(args, RuntimeContextType.SCALAR); // Always scalar! +// Interpreter: Direct call +registers[rd] = MathOperators.add(registers[rs1], registers[rs2]); -// RIGHT: Propagate context from current execution -result = code.apply(args, currentContext); +// Compiler: Generated bytecode calls same method +INVOKESTATIC org/perlonjava/operators/MathOperators.add(...) ``` -### Maintaining Dense Opcodes - -**Critical Rule:** Opcodes must be dense (no gaps) to preserve tableswitch optimization. +## Variable Sharing -**When adding opcodes:** -- Use next sequential number (current max is 82, so use 83, 84, 85, ...) -- Never skip numbers -- Never delete opcodes without renumbering +**Captured Variables:** +- Named subroutines can capture outer variables +- Use persistent storage: `PerlOnJava::_BEGIN_::varname` +- SET_SCALAR preserves references (doesn't overwrite) +- Both modes access same RuntimeScalar object -**If you must remove an opcode:** -1. Renumber all subsequent opcodes to close the gap -2. Update all references (Opcodes.java, BytecodeInterpreter.java, InterpretedCode.java) -3. Run full test suite to catch missed references - -**Verify tableswitch after changes:** -```bash -javap -c -p -classpath build/classes/java/main \ - org.perlonjava.interpreter.BytecodeInterpreter | grep "switch" +**Example:** +```perl +my $x = 10; +sub foo { return $x * 2; } # Compiled, captures $x +$x = 20; # Interpreted +say foo(); # 40 (sees updated value) ``` -### Performance Testing - -After any change to BytecodeInterpreter.java or Opcodes.java: +## Documentation -1. **Run benchmark:** - ```bash - ./jperl dev/interpreter/tests/for_loop_benchmark.pl - ``` - -2. **Compare results:** - - Before: 46.84M ops/sec - - After: Should be ≥46.84M ops/sec (no regression) - - New superinstruction: Should show improvement for matching patterns - -3. **Check JIT compilation:** - Look for "made not entrant" or "made zombie" messages indicating deoptimization. +- **STATUS.md** - Implementation status +- **TESTING.md** - Testing procedures +- **BYTECODE_DOCUMENTATION.md** - Complete opcode reference +- **CLOSURE_IMPLEMENTATION_COMPLETE.md** - Closure architecture +- **SKILL.md** (this file) - Developer guide ## Next Steps -### High Priority - -1. **Test Coverage for Variable Sharing** - - Add more test cases for mixed interpreter/compiled scenarios - - Test edge cases: array elements, hash elements, references - - Test nested subroutines and complex capture patterns - -2. **Performance Optimization** - - Profile examples/life.pl to identify hot paths - - Consider unboxed int registers for loop counters - - Evaluate inline caching opportunities for global variable access +**High Priority:** +1. Complete missing disassembly cases (opcodes 62+) +2. Test coverage for variable sharing edge cases +3. Profile and optimize hot paths -3. **Error Handling Improvements** - - Ensure all compiler errors use throwCompilerException with proper tokenIndex - - Add error context for common mistakes (undefined variables, type mismatches) - - Improve error messages for bytecode generation failures +**Medium Priority:** +4. Implement remaining slow operations (22/255 used) +5. Add more superinstructions (compound assignments) +6. Context propagation (like codegen's EmitterContext) -### Medium Priority - -4. **Additional Slow Operations** - - Implement remaining system call opcodes (currently 19/255 used) - - Socket operations, file locking, IPC primitives - - Keep main loop compact by using SLOW_OP gateway - -5. **More Superinstructions** - - `SUB_ASSIGN`, `MUL_ASSIGN`, `DIV_ASSIGN` for compound assignments - - `ARRAY_GET_INT` with unboxed index for faster array access - - Profile to identify most common operation patterns - -6. **Documentation Updates** - - Update BYTECODE_DOCUMENTATION.md with SET_SCALAR and variable sharing - - Add examples of mixed interpreter/compiled programs - - Document best practices for performance - -### Low Priority - -7. **Specialized Loop Dispatcher** - - Detect simple counting loops at compile time - - Generate tight loop with inlined body (no switch overhead) - - Could provide 20-40% speedup for numeric loops - -8. **Direct Field Access** - - Access RuntimeScalar.ivalue/svalue directly instead of getters - - Trade-off: Breaks encapsulation but 10-20% faster - - Consider only for verified hot paths - -9. **Unboxed Register Optimization** - - Parallel int[] intRegisters array for unboxed integers - - Track which registers are unboxed - - Box only when needed (calls, returns, type coercion) - - Potential 30-50% speedup for numeric code +**Low Priority:** +7. Unboxed int registers (30-50% potential speedup) +8. Inline caching for method calls/globals +9. Specialized loop dispatcher ## Summary -The PerlOnJava interpreter is a production-ready, high-performance bytecode interpreter that: - -- **Executes Perl bytecode** at 46.84M ops/sec (1.75x slower than compiler) -- **Shares 100% of runtime APIs** with the compiler (zero duplication) -- **Supports closures** and bidirectional calling (compiled ↔ interpreted) -- **Shares variables** between interpreter and compiled code with proper aliasing -- **Uses dense opcodes** (0-99) for optimal JVM tableswitch dispatch -- **Implements superinstructions** to eliminate overhead -- **Detects context** (VOID/SCALAR/LIST) for proper wantarray semantics -- **Reports errors** with accurate filename and line numbers - -**Recent Achievements (Phase 4):** -- ✅ Variable sharing implementation (PR #191) -- ✅ SET_SCALAR opcode for reference preservation -- ✅ Context detection for subroutine calls -- ✅ SLOWOP_RETRIEVE_BEGIN_* opcodes for persistent variables -- ✅ examples/life.pl now runs correctly in interpreter mode - -Future optimizations (unboxed ints, inline caching, specialized loops) can potentially reach 1.2-1.5x slower than compiler while maintaining the benefits of interpretation. - -For questions or contributions, refer to: -- **STATUS.md** - Current implementation status -- **TESTING.md** - Testing procedures -- **BYTECODE_DOCUMENTATION.md** - Complete opcode reference -- **CLOSURE_IMPLEMENTATION_COMPLETE.md** - Closure architecture -- **SKILL.md** (this file) - Developer guide and next steps +The interpreter is production-ready with: +- ✓ 46.84M ops/sec execution +- ✓ 100% runtime API sharing +- ✓ Closure and bidirectional calling support +- ✓ Variable sharing with proper aliasing +- ✓ Dense opcodes (0-157) for tableswitch +- ✓ Context detection (VOID/SCALAR/LIST) +- ✓ Accurate error reporting with filename/line -Happy hacking! +**Key Learning:** Disassembly completeness is as important as runtime implementation. Missing disassembly cases corrupt PC and make debugging impossible. diff --git a/dev/tools/gen_large_sub_test.pl b/dev/tools/gen_large_sub_test.pl new file mode 100644 index 000000000..1c5960c54 --- /dev/null +++ b/dev/tools/gen_large_sub_test.pl @@ -0,0 +1,38 @@ +#!/usr/bin/env perl +# Script to generate a Perl test file with a very large subroutine +# This is used to test the interpreter fallback mechanism + +use strict; +use warnings; + +my $num_statements = $ARGV[0] || 10000; +my $output_file = $ARGV[1] || "large_sub_test.pl"; + +open my $fh, '>', $output_file or die "Cannot open $output_file: $!"; + +print $fh "# Test file with large subroutine ($num_statements statements)\n"; +print $fh "# Generated by gen_large_sub_test.pl\n\n"; +print $fh "print \"1..2\\n\";\n\n"; +print $fh "sub large_sub {\n"; +print $fh " my \$sum = 0;\n"; + +for my $i (1..$num_statements) { + print $fh " \$sum += $i;\n"; +} + +print $fh " return \$sum;\n"; +print $fh "}\n\n"; + +# Calculate expected sum: sum of 1 to n = n*(n+1)/2 +my $expected = $num_statements * ($num_statements + 1) / 2; + +print $fh "my \$result = large_sub();\n"; +print $fh "print \"not \" unless \$result == $expected;\n"; +print $fh "print \"ok 1 - large subroutine computed correct sum\\n\";\n\n"; +print $fh "print \"not \" unless defined(&large_sub);\n"; +print $fh "print \"ok 2 - large subroutine is defined\\n\";\n"; + +close $fh; + +print "Generated $output_file with $num_statements statements\n"; +print "Expected sum: $expected\n"; diff --git a/src/main/java/org/perlonjava/codegen/CompiledCode.java b/src/main/java/org/perlonjava/codegen/CompiledCode.java new file mode 100644 index 000000000..7dd3e77ba --- /dev/null +++ b/src/main/java/org/perlonjava/codegen/CompiledCode.java @@ -0,0 +1,57 @@ +package org.perlonjava.codegen; + +import org.perlonjava.runtime.RuntimeCode; + +import java.lang.invoke.MethodHandle; + +/** + * Compiled bytecode that extends RuntimeCode. + * + * This class represents Perl code that has been compiled to JVM bytecode. + * It wraps the generated Class and provides the same RuntimeCode interface + * as InterpretedCode, enabling seamless switching between compiler and interpreter. + * + * DESIGN: Following the InterpretedCode pattern: + * - InterpretedCode stores bytecode[] and overrides apply() to call BytecodeInterpreter + * - CompiledCode stores Class and uses parent apply() to call MethodHandle + * + * This allows the EmitterMethodCreator.createRuntimeCode() factory to return either + * CompiledCode or InterpretedCode based on whether compilation succeeded or fell + * back to the interpreter. + */ +public class CompiledCode extends RuntimeCode { + // The generated JVM class (useful for debugging and EmitSubroutine bytecode generation) + public final Class generatedClass; + + // The compiler context used to create this code (may be useful for debugging) + public final EmitterContext compileContext; + + /** + * Constructor for CompiledCode. + * + * @param methodHandle The MethodHandle for the apply() method + * @param codeObject The instance of the generated class (with closure variables) + * @param prototype The subroutine prototype (e.g., "$" for one scalar parameter) + * @param generatedClass The compiled JVM class + * @param compileContext The compiler context (optional, for debugging) + */ + public CompiledCode(MethodHandle methodHandle, Object codeObject, + String prototype, Class generatedClass, + EmitterContext compileContext) { + super(methodHandle, codeObject, prototype); + this.generatedClass = generatedClass; + this.compileContext = compileContext; + } + + // No need to override apply() - parent RuntimeCode implementation works perfectly + // The MethodHandle dispatches to compiled JVM bytecode automatically + + @Override + public String toString() { + return "CompiledCode{" + + "class=" + (generatedClass != null ? generatedClass.getName() : "null") + + ", prototype='" + prototype + '\'' + + ", defined=" + defined() + + '}'; + } +} diff --git a/src/main/java/org/perlonjava/codegen/EmitterMethodCreator.java b/src/main/java/org/perlonjava/codegen/EmitterMethodCreator.java index 96d7457bd..d518f76d9 100644 --- a/src/main/java/org/perlonjava/codegen/EmitterMethodCreator.java +++ b/src/main/java/org/perlonjava/codegen/EmitterMethodCreator.java @@ -348,13 +348,28 @@ public static Class createClassWithMethod(EmitterContext ctx, Node ast, boole public static byte[] getBytecode(EmitterContext ctx, Node ast, boolean useTryCatch) { boolean asmDebug = System.getenv("JPERL_ASM_DEBUG") != null; + boolean showFallback = System.getenv("JPERL_SHOW_FALLBACK") != null || + System.getenv("JPERL_USE_INTERPRETER_FALLBACK") != null; + boolean useInterpreterFallback = System.getenv("JPERL_USE_INTERPRETER_FALLBACK") != null; + try { return getBytecodeInternal(ctx, ast, useTryCatch, false); } catch (MethodTooLargeException tooLarge) { - // Automatic retry with refactoring on "Method too large" error + // When interpreter fallback is enabled, skip AST splitter and let exception propagate + // The interpreter has no size limits, so AST splitting is unnecessary + if (useInterpreterFallback) { + if (showFallback) { + System.err.println("Note: Method too large, skipping AST splitter (interpreter fallback enabled)."); + } + throw tooLarge; // Propagate to createRuntimeCode() which will use interpreter + } + + // Automatic retry with AST splitting when interpreter fallback is not enabled try { // Notify user that automatic refactoring is happening - // System.err.println("Note: Method too large, retrying with automatic refactoring."); + if (showFallback) { + System.err.println("Note: Method too large, retrying with AST splitter (automatic refactoring)."); + } // First, try depth-first literal refactoring (refactors nested structures first) org.perlonjava.astvisitor.DepthFirstLiteralRefactorVisitor.refactor(ast); @@ -372,9 +387,16 @@ public static byte[] getBytecode(EmitterContext ctx, Node ast, boolean useTryCat ctx.clearContextCache(); } - return getBytecodeInternal(ctx, ast, useTryCatch, false); + byte[] result = getBytecodeInternal(ctx, ast, useTryCatch, false); + if (showFallback) { + System.err.println("Note: AST splitter succeeded."); + } + return result; } catch (MethodTooLargeException retryTooLarge) { // Refactoring didn't help enough - give up + if (showFallback) { + System.err.println("Note: AST splitter failed, propagating exception."); + } throw retryTooLarge; } catch (Throwable retryError) { // Refactoring caused a different error - report both @@ -1451,6 +1473,168 @@ public static Class loadBytecode(EmitterContext ctx, byte[] classData) { return loader.defineClass(javaClassNameDot, classData); } + // Feature flag for interpreter fallback + private static final boolean USE_INTERPRETER_FALLBACK = + System.getenv("JPERL_USE_INTERPRETER_FALLBACK") != null; + private static final boolean SHOW_FALLBACK = + System.getenv("JPERL_SHOW_FALLBACK") != null; + + /** + * Unified factory method that returns RuntimeCode (either CompiledCode or InterpretedCode). + * + * This is the NEW API that replaces createClassWithMethod() for most use cases. + * It handles the "Method too large" exception by falling back to the interpreter + * when JPERL_USE_INTERPRETER_FALLBACK environment variable is set. + * + * DESIGN: + * - Try compiler first (createClassWithMethod) + * - On MethodTooLargeException: fall back to interpreter if flag enabled + * - Return CompiledCode or InterpretedCode (both extend RuntimeCode) + * - Call sites work with RuntimeCode interface, don't need to know which backend was used + * + * @param ctx The emitter context containing information for code generation + * @param ast The abstract syntax tree representing the method body + * @param useTryCatch Flag to enable try-catch in the generated class (for eval operator) + * @return RuntimeCode that can be either CompiledCode or InterpretedCode + */ + public static org.perlonjava.runtime.RuntimeCode createRuntimeCode( + EmitterContext ctx, Node ast, boolean useTryCatch) { + try { + // Try compiler path + Class generatedClass = createClassWithMethod(ctx, ast, useTryCatch); + if (SHOW_FALLBACK) { + System.err.println("Note: JVM compilation succeeded."); + } + return wrapAsCompiledCode(generatedClass, ctx); + + } catch (MethodTooLargeException e) { + if (USE_INTERPRETER_FALLBACK) { + // Fall back to interpreter + System.err.println("Note: Method too large after AST splitting, using interpreter backend."); + return compileToInterpreter(ast, ctx, useTryCatch); + } + + // If interpreter fallback disabled, re-throw to use existing AST splitter logic + throw e; + } + } + + /** + * Wrap a compiled Class as CompiledCode. + * + * This performs the same reflection steps that SubroutineParser.java currently does: + * 1. Get constructor + * 2. Create instance (codeObject) + * 3. Get MethodHandle for apply method + * 4. Set __SUB__ field + * 5. Return CompiledCode wrapper + * + * @param generatedClass The compiled JVM class + * @param ctx The compiler context + * @return CompiledCode wrapping the compiled class + */ + private static CompiledCode wrapAsCompiledCode(Class generatedClass, EmitterContext ctx) { + try { + // Get the constructor (may have parameters for captured variables) + String[] env = (ctx.capturedEnv != null) ? ctx.capturedEnv : ctx.symbolTable.getVariableNames(); + + // Build parameter types for constructor + Class[] parameterTypes = new Class[Math.max(0, env.length - skipVariables)]; + for (int i = skipVariables; i < env.length; i++) { + String descriptor = getVariableDescriptor(env[i]); + String className = descriptor.substring(1, descriptor.length() - 1).replace('/', '.'); + parameterTypes[i - skipVariables] = Class.forName(className); + } + + Constructor constructor = generatedClass.getConstructor(parameterTypes); + + // For now, we don't instantiate - that happens later when captured vars are available + // This is used for the factory pattern where the caller provides the parameters + // So we return a CompiledCode with null codeObject and null methodHandle + // The caller will instantiate it with the captured variables + + // Actually, let's check if there are NO captured variables, then we can instantiate now + Object codeObject = null; + java.lang.invoke.MethodHandle methodHandle = null; + + if (parameterTypes.length == 0) { + // No captured variables, can instantiate now + codeObject = constructor.newInstance(); + + // Get MethodHandle for apply method + methodHandle = org.perlonjava.runtime.RuntimeCode.lookup.findVirtual( + generatedClass, "apply", org.perlonjava.runtime.RuntimeCode.methodType + ); + + // Set __SUB__ field + java.lang.reflect.Field field = generatedClass.getDeclaredField("__SUB__"); + org.perlonjava.runtime.RuntimeScalar selfRef = new org.perlonjava.runtime.RuntimeScalar(); + selfRef.type = org.perlonjava.runtime.RuntimeScalarType.CODE; + // Note: ctx doesn't have prototype field, it's set separately by caller + selfRef.value = new CompiledCode(methodHandle, codeObject, null, generatedClass, ctx); + field.set(codeObject, selfRef); + + return (CompiledCode) selfRef.value; + } else { + // Has captured variables - caller must instantiate later + // Return a CompiledCode with null codeObject/methodHandle + // The caller will fill these in via reflection (see SubroutineParser pattern) + return new CompiledCode(null, null, null, generatedClass, ctx); + } + + } catch (Exception e) { + throw new org.perlonjava.runtime.PerlCompilerException( + "Failed to wrap compiled class: " + e.getMessage()); + } + } + + /** + * Compile AST to interpreter bytecode. + * + * This is the fallback path when JVM bytecode generation hits the 65535 byte limit. + * The interpreter has no size limits because it doesn't generate JVM bytecode. + * + * @param ast The AST to compile + * @param ctx The compiler context + * @param useTryCatch Whether to use try-catch (for eval) + * @return InterpretedCode ready to execute + */ + private static org.perlonjava.interpreter.InterpretedCode compileToInterpreter( + Node ast, EmitterContext ctx, boolean useTryCatch) { + + // Create bytecode compiler + org.perlonjava.interpreter.BytecodeCompiler compiler = + new org.perlonjava.interpreter.BytecodeCompiler( + ctx.errorUtil.getFileName(), + 1, // line number + ctx.errorUtil + ); + + // Compile AST to interpreter bytecode + org.perlonjava.interpreter.InterpretedCode code = compiler.compile(ast); + + // Handle captured variables if needed (for closures) + if (ctx.capturedEnv != null && ctx.capturedEnv.length > skipVariables) { + // Extract captured variables from context + // Note: This is a simplified version - full implementation would need to + // access the actual RuntimeBase objects from the symbol table + org.perlonjava.runtime.RuntimeBase[] capturedVars = + new org.perlonjava.runtime.RuntimeBase[ctx.capturedEnv.length - skipVariables]; + + // For now, initialize with undef (actual values will be set by caller) + for (int i = 0; i < capturedVars.length; i++) { + capturedVars[i] = new org.perlonjava.runtime.RuntimeScalar(); + } + + code = code.withCapturedVars(capturedVars); + } + + // Note: prototype will be set by caller if needed + // code.prototype is set via RuntimeCode fields + + return code; + } + public static void debugInspectClass(Class generatedClass) { System.out.println("Class Information for: " + generatedClass.getName()); System.out.println("==========================================="); diff --git a/src/main/java/org/perlonjava/interpreter/BytecodeCompiler.java b/src/main/java/org/perlonjava/interpreter/BytecodeCompiler.java index a7376ec59..0931bbc40 100644 --- a/src/main/java/org/perlonjava/interpreter/BytecodeCompiler.java +++ b/src/main/java/org/perlonjava/interpreter/BytecodeCompiler.java @@ -933,6 +933,7 @@ private void handleCompoundAssignment(BinaryOperatorNode node) { case "*=" -> emit(Opcodes.MULTIPLY_ASSIGN); case "/=" -> emit(Opcodes.DIVIDE_ASSIGN); case "%=" -> emit(Opcodes.MODULUS_ASSIGN); + case ".=" -> emit(Opcodes.STRING_CONCAT_ASSIGN); default -> { throwCompilerException("Unknown compound assignment operator: " + op); return; @@ -1351,6 +1352,83 @@ private void compileAssignmentOperator(BinaryOperatorNode node) { lastResultReg = reg; return; } + + // Handle my ($x, $y, @rest) = ... - list declaration with assignment + if (myOperand instanceof ListNode) { + ListNode listNode = (ListNode) myOperand; + + // Compile RHS first + node.right.accept(this); + int listReg = lastResultReg; + + // Convert to list if needed + int rhsListReg = allocateRegister(); + emit(Opcodes.SCALAR_TO_LIST); + emitReg(rhsListReg); + emitReg(listReg); + + // Declare and assign each variable + for (int i = 0; i < listNode.elements.size(); i++) { + Node element = listNode.elements.get(i); + if (element instanceof OperatorNode) { + OperatorNode sigilOp = (OperatorNode) element; + String sigil = sigilOp.operator; + + if (sigilOp.operand instanceof IdentifierNode) { + String varName = sigil + ((IdentifierNode) sigilOp.operand).name; + + // Declare the variable + int varReg = addVariable(varName, "my"); + + // Initialize based on sigil + switch (sigil) { + case "$" -> { + emit(Opcodes.LOAD_UNDEF); + emitReg(varReg); + } + case "@" -> { + emit(Opcodes.NEW_ARRAY); + emitReg(varReg); + } + case "%" -> { + emit(Opcodes.NEW_HASH); + emitReg(varReg); + } + } + + // Get i-th element from RHS + int indexReg = allocateRegister(); + emit(Opcodes.LOAD_INT); + emitReg(indexReg); + emitInt(i); + + int elemReg = allocateRegister(); + emit(Opcodes.ARRAY_GET); + emitReg(elemReg); + emitReg(rhsListReg); + emitReg(indexReg); + + // Assign to variable + if (sigil.equals("$")) { + emit(Opcodes.MOVE); + emitReg(varReg); + emitReg(elemReg); + } else if (sigil.equals("@")) { + emit(Opcodes.ARRAY_SET_FROM_LIST); + emitReg(varReg); + emitReg(elemReg); + } else if (sigil.equals("%")) { + emit(Opcodes.HASH_SET_FROM_LIST); + emitReg(varReg); + emitReg(elemReg); + } + } + } + } + + lastResultReg = rhsListReg; + return; + } } // Special case: local $x = value @@ -1358,6 +1436,33 @@ private void compileAssignmentOperator(BinaryOperatorNode node) { // Extract variable from "local" operand Node localOperand = leftOp.operand; + // Handle local $hash{key} = value (localizing hash element) + if (localOperand instanceof BinaryOperatorNode) { + BinaryOperatorNode hashAccess = (BinaryOperatorNode) localOperand; + if (hashAccess.operator.equals("{")) { + // Compile the hash access to get the hash element reference + // This returns a RuntimeScalar that is aliased to the hash slot + hashAccess.accept(this); + int elemReg = lastResultReg; + + // Push this hash element to the local variable stack + emit(Opcodes.PUSH_LOCAL_VARIABLE); + emitReg(elemReg); + + // Compile RHS + node.right.accept(this); + int valueReg = lastResultReg; + + // Assign value to the hash element (which is already localized) + emit(Opcodes.SET_SCALAR); + emitReg(elemReg); + emitReg(valueReg); + + lastResultReg = elemReg; + return; + } + } + // Handle local $x (where $x is OperatorNode("$", IdentifierNode("x"))) if (localOperand instanceof OperatorNode) { OperatorNode sigilOp = (OperatorNode) localOperand; @@ -1370,7 +1475,11 @@ private void compileAssignmentOperator(BinaryOperatorNode node) { return; } - // It's a global variable - emit SLOW_OP to call GlobalRuntimeScalar.makeLocal() + // Compile RHS first + node.right.accept(this); + int valueReg = lastResultReg; + + // It's a global variable - call makeLocal which returns the localized scalar String packageName = getCurrentPackage(); String globalVarName = packageName + "::" + ((IdentifierNode) sigilOp.operand).name; int nameIdx = addToStringPool(globalVarName); @@ -1380,17 +1489,82 @@ private void compileAssignmentOperator(BinaryOperatorNode node) { emitReg(localReg); emit(nameIdx); - // Compile RHS + // Assign value to the localized scalar (not to the global!) + emit(Opcodes.SET_SCALAR); + emitReg(localReg); + emitReg(valueReg); + + lastResultReg = localReg; + return; + } else if (sigilOp.operator.equals("@") && sigilOp.operand instanceof IdentifierNode) { + // Handle local @array = value + String varName = "@" + ((IdentifierNode) sigilOp.operand).name; + + // Check if it's a lexical variable (should not be localized) + if (hasVariable(varName)) { + throwCompilerException("Can't localize lexical variable " + varName); + return; + } + + // Compile RHS first node.right.accept(this); int valueReg = lastResultReg; - // Assign value to the localized variable - // The localized variable is a RuntimeScalar, so we use set() on it - emit(Opcodes.STORE_GLOBAL_SCALAR); + // It's a global array - get it and push to local stack + String packageName = getCurrentPackage(); + String globalVarName = packageName + "::" + ((IdentifierNode) sigilOp.operand).name; + int nameIdx = addToStringPool(globalVarName); + + int arrayReg = allocateRegister(); + emit(Opcodes.LOAD_GLOBAL_ARRAY); + emitReg(arrayReg); emit(nameIdx); + + // Push to local variable stack + emit(Opcodes.PUSH_LOCAL_VARIABLE); + emitReg(arrayReg); + + // Populate array from list + emit(Opcodes.ARRAY_SET_FROM_LIST); + emitReg(arrayReg); emitReg(valueReg); - lastResultReg = localReg; + lastResultReg = arrayReg; + return; + } else if (sigilOp.operator.equals("%") && sigilOp.operand instanceof IdentifierNode) { + // Handle local %hash = value + String varName = "%" + ((IdentifierNode) sigilOp.operand).name; + + // Check if it's a lexical variable (should not be localized) + if (hasVariable(varName)) { + throwCompilerException("Can't localize lexical variable " + varName); + return; + } + + // Compile RHS first + node.right.accept(this); + int valueReg = lastResultReg; + + // It's a global hash - get it and push to local stack + String packageName = getCurrentPackage(); + String globalVarName = packageName + "::" + ((IdentifierNode) sigilOp.operand).name; + int nameIdx = addToStringPool(globalVarName); + + int hashReg = allocateRegister(); + emit(Opcodes.LOAD_GLOBAL_HASH); + emitReg(hashReg); + emit(nameIdx); + + // Push to local variable stack + emit(Opcodes.PUSH_LOCAL_VARIABLE); + emitReg(hashReg); + + // Populate hash from list + emit(Opcodes.HASH_SET_FROM_LIST); + emitReg(hashReg); + emitReg(valueReg); + + lastResultReg = hashReg; return; } } @@ -1610,8 +1784,26 @@ private void compileAssignmentOperator(BinaryOperatorNode node) { } lastResultReg = targetReg; + } else if (leftOp.operator.equals("*") && leftOp.operand instanceof IdentifierNode) { + // Typeglob assignment: *foo = value + String varName = ((IdentifierNode) leftOp.operand).name; + String globalName = NameNormalizer.normalizeVariableName(varName, getCurrentPackage()); + int nameIdx = addToStringPool(globalName); + + // Load the glob + int globReg = allocateRegister(); + emit(Opcodes.LOAD_GLOB); + emitReg(globReg); + emit(nameIdx); + + // Store value to glob + emit(Opcodes.STORE_GLOB); + emitReg(globReg); + emitReg(valueReg); + + lastResultReg = globReg; } else { - throw new RuntimeException("Assignment to unsupported operator: " + leftOp.operator); + throwCompilerException("Assignment to unsupported operator: " + leftOp.operator); } } else if (node.left instanceof IdentifierNode) { String varName = ((IdentifierNode) node.left).name; @@ -1950,6 +2142,28 @@ private void compileAssignmentOperator(BinaryOperatorNode node) { return; } + // Handle lvalue subroutine: f() = value + // When a function is called in lvalue context, it returns a RuntimeBaseProxy + // that wraps a mutable reference. We can assign to it using SET_SCALAR. + if (leftBin.operator.equals("(")) { + // Call the function (which returns a RuntimeBaseProxy in lvalue context) + node.left.accept(this); + int lvalueReg = lastResultReg; + + // Compile RHS + node.right.accept(this); + int rhsReg = lastResultReg; + + // Assign to the lvalue using SET_SCALAR + emit(Opcodes.SET_SCALAR); + emitReg(lvalueReg); + emitReg(rhsReg); + + lastResultReg = rhsReg; + currentCallContext = savedContext; + return; + } + throwCompilerException("Assignment to non-identifier not yet supported: " + node.left.getClass().getSimpleName()); } else if (node.left instanceof ListNode) { // List assignment: ($a, $b) = ... or () = ... @@ -2222,6 +2436,18 @@ private int compileBinaryOperatorSwitch(String operator, int rs1, int rs2, int t emitReg(rs1); emitReg(rs2); } + case "<=" -> { + emit(Opcodes.LE_NUM); + emitReg(rd); + emitReg(rs1); + emitReg(rs2); + } + case ">=" -> { + emit(Opcodes.GE_NUM); + emitReg(rd); + emitReg(rs1); + emitReg(rs2); + } case "!=" -> { emit(Opcodes.NE_NUM); emitReg(rd); @@ -2399,10 +2625,29 @@ private int compileBinaryOperatorSwitch(String operator, int rs1, int rs2, int t // This should NOT be reached because += is handled specially before this switch throwCompilerException("+= should be handled before switch", tokenIndex); } - case "-=", "*=", "/=", "%=" -> { + case "-=", "*=", "/=", "%=", ".=" -> { // This should NOT be reached because compound assignments are handled specially before this switch throwCompilerException(operator + " should be handled before switch", tokenIndex); } + case "readline" -> { + // <$fh> - read line from filehandle + // rs1 = filehandle (or undef for ARGV) + // rs2 = unused (ListNode) + emit(Opcodes.READLINE); + emitReg(rd); + emitReg(rs1); + emit(currentCallContext); + } + case "=~" -> { + // $string =~ /pattern/ - regex match + // rs1 = string to match against + // rs2 = compiled regex pattern + emit(Opcodes.MATCH_REGEX); + emitReg(rd); + emitReg(rs1); + emitReg(rs2); + emit(currentCallContext); + } default -> throwCompilerException("Unsupported operator: " + operator, tokenIndex); } @@ -2443,10 +2688,10 @@ public void visit(BinaryOperatorNode node) { return; } - // Handle compound assignment operators (+=, -=, *=, /=, %=) + // Handle compound assignment operators (+=, -=, *=, /=, %=, .=) if (node.operator.equals("+=") || node.operator.equals("-=") || node.operator.equals("*=") || node.operator.equals("/=") || - node.operator.equals("%=")) { + node.operator.equals("%=") || node.operator.equals(".=")) { handleCompoundAssignment(node); return; } @@ -2730,6 +2975,28 @@ else if (node.right instanceof BinaryOperatorNode) { return; } + // Handle function call operators specially to ensure arguments are in LIST context + if (node.operator.equals("(") || node.operator.equals("()")) { + // Function call: subname(args) or $coderef->(args) + // Save and set context for left operand (code reference) + int savedContext = currentCallContext; + currentCallContext = RuntimeContextType.SCALAR; + node.left.accept(this); + int rs1 = lastResultReg; + + // Arguments must ALWAYS be evaluated in LIST context + // Even if the call itself is in SCALAR context (e.g., scalar(func())) + currentCallContext = RuntimeContextType.LIST; + node.right.accept(this); + int rs2 = lastResultReg; + currentCallContext = savedContext; + + // Emit CALL_SUB opcode + int rd = compileBinaryOperatorSwitch(node.operator, rs1, rs2, node.getIndex()); + lastResultReg = rd; + return; + } + // Handle short-circuit operators specially - don't compile right operand yet! if (node.operator.equals("&&") || node.operator.equals("and")) { // Logical AND with short-circuit evaluation @@ -2809,6 +3076,51 @@ else if (node.right instanceof BinaryOperatorNode) { return; } + if (node.operator.equals("//")) { + // Defined-OR with short-circuit evaluation + // Only evaluate right side if left side is undefined + + // Compile left operand + node.left.accept(this); + int rs1 = lastResultReg; + + // Allocate result register and move left value to it + int rd = allocateRegister(); + emit(Opcodes.MOVE); + emitReg(rd); + emitReg(rs1); + + // Check if left is defined + int definedReg = allocateRegister(); + emit(Opcodes.DEFINED); + emitReg(definedReg); + emitReg(rd); + + // Mark position for forward jump + int skipRightPos = bytecode.size(); + + // Emit conditional jump: if (defined) skip right evaluation + emit(Opcodes.GOTO_IF_TRUE); + emitReg(definedReg); + emitInt(0); // Placeholder for offset (will be patched) + + // NOW compile right operand (only executed if left was undefined) + node.right.accept(this); + int rs2 = lastResultReg; + + // Move right result to rd (overwriting left value) + emit(Opcodes.MOVE); + emitReg(rd); + emitReg(rs2); + + // Patch the forward jump offset + int skipRightTarget = bytecode.size(); + patchIntOffset(skipRightPos + 2, skipRightTarget); + + lastResultReg = rd; + return; + } + // Compile left and right operands (for non-short-circuit operators) node.left.accept(this); int rs1 = lastResultReg; @@ -2990,7 +3302,7 @@ private void compileVariableDeclaration(OperatorNode node, String op) { lastResultReg = resultReg; return; } - throw new RuntimeException("Unsupported my operand: " + node.operand.getClass().getSimpleName()); + throwCompilerException("Unsupported my operand: " + node.operand.getClass().getSimpleName()); } else if (op.equals("our")) { // our $x / our @x / our %x - package variable declaration // The operand will be OperatorNode("$"/"@"/"%", IdentifierNode("x")) @@ -3110,7 +3422,7 @@ private void compileVariableDeclaration(OperatorNode node, String op) { lastResultReg = resultReg; return; } - throw new RuntimeException("Unsupported our operand: " + node.operand.getClass().getSimpleName()); + throwCompilerException("Unsupported our operand: " + node.operand.getClass().getSimpleName()); } else if (op.equals("local")) { // local $x - temporarily localize a global variable // The operand will be OperatorNode("$", IdentifierNode("x")) if (node.operand instanceof OperatorNode) { @@ -3139,9 +3451,9 @@ private void compileVariableDeclaration(OperatorNode node, String op) { return; } } - throw new RuntimeException("Unsupported local operand: " + node.operand.getClass().getSimpleName()); + throwCompilerException("Unsupported local operand: " + node.operand.getClass().getSimpleName()); } - throw new RuntimeException("Unsupported variable declaration operator: " + op); + throwCompilerException("Unsupported variable declaration operator: " + op); } private void compileVariableReference(OperatorNode node, String op) { @@ -3171,8 +3483,24 @@ private void compileVariableReference(OperatorNode node, String op) { lastResultReg = rd; } + } else if (node.operand instanceof BlockNode) { + // Block dereference: ${\0} or ${expr} + // Execute the block and dereference the result + BlockNode block = (BlockNode) node.operand; + + // Compile the block + block.accept(this); + int blockResultReg = lastResultReg; + + // Dereference the result + int rd = allocateRegister(); + emitWithToken(Opcodes.DEREF, node.getIndex()); + emitReg(rd); + emitReg(blockResultReg); + + lastResultReg = rd; } else { - throw new RuntimeException("Unsupported $ operand: " + node.operand.getClass().getSimpleName()); + throwCompilerException("Unsupported $ operand: " + node.operand.getClass().getSimpleName()); } } else if (op.equals("@")) { // Array variable dereference: @x or @_ or @$arrayref @@ -3306,7 +3634,7 @@ private void compileVariableReference(OperatorNode node, String op) { lastResultReg = rd; } else { - throw new RuntimeException("Unsupported * operand: " + node.operand.getClass().getSimpleName()); + throwCompilerException("Unsupported * operand: " + node.operand.getClass().getSimpleName()); } } else if (op.equals("&")) { // Code reference: &subname @@ -3358,7 +3686,7 @@ private void compileVariableReference(OperatorNode node, String op) { currentCallContext = savedContext; } } else { - throw new RuntimeException("Reference operator requires operand"); + throwCompilerException("Reference operator requires operand"); } } } @@ -3448,7 +3776,7 @@ public void visit(OperatorNode node) { lastResultReg = rd; } else { - throw new RuntimeException("NOT operator requires operand"); + throwCompilerException("NOT operator requires operand"); } } else if (op.equals("defined")) { // Defined operator: defined($x) @@ -3467,7 +3795,7 @@ public void visit(OperatorNode node) { lastResultReg = rd; } else { - throw new RuntimeException("defined operator requires operand"); + throwCompilerException("defined operator requires operand"); } } else if (op.equals("ref")) { // Ref operator: ref($x) @@ -3497,6 +3825,68 @@ public void visit(OperatorNode node) { emitReg(rd); emitReg(argReg); + lastResultReg = rd; + } else if (op.equals("prototype")) { + // Prototype operator: prototype(\&func) or prototype("func_name") + // Returns the prototype string for a subroutine + if (node.operand == null) { + throwCompilerException("prototype requires an argument"); + } + + // Compile the operand (code reference or function name) + if (node.operand instanceof ListNode) { + ListNode list = (ListNode) node.operand; + if (list.elements.isEmpty()) { + throwCompilerException("prototype requires an argument"); + } + // Get first element + list.elements.get(0).accept(this); + } else { + node.operand.accept(this); + } + int argReg = lastResultReg; + + // Allocate result register + int rd = allocateRegister(); + + // Add current package to string pool + int packageIdx = addToStringPool(getCurrentPackage()); + + // Emit PROTOTYPE opcode + emit(Opcodes.PROTOTYPE); + emitReg(rd); + emitReg(argReg); + emitInt(packageIdx); + + lastResultReg = rd; + } else if (op.equals("quoteRegex")) { + // Quote regex operator: qr{pattern}flags + // operand is a ListNode with [pattern, flags] + if (node.operand == null || !(node.operand instanceof ListNode)) { + throwCompilerException("quoteRegex requires pattern and flags"); + } + + ListNode operand = (ListNode) node.operand; + if (operand.elements.size() < 2) { + throwCompilerException("quoteRegex requires pattern and flags"); + } + + // Compile pattern and flags + operand.elements.get(0).accept(this); // Pattern + int patternReg = lastResultReg; + + operand.elements.get(1).accept(this); // Flags + int flagsReg = lastResultReg; + + // Allocate result register + int rd = allocateRegister(); + + // Emit QUOTE_REGEX opcode + emit(Opcodes.QUOTE_REGEX); + emitReg(rd); + emitReg(patternReg); + emitReg(flagsReg); + lastResultReg = rd; } else if (op.equals("++") || op.equals("--") || op.equals("++postfix") || op.equals("--postfix")) { // Pre/post increment/decrement @@ -3529,7 +3919,7 @@ public void visit(OperatorNode node) { lastResultReg = varReg; } else { - throw new RuntimeException("Increment/decrement of non-lexical variable not yet supported"); + throwCompilerException("Increment/decrement of non-lexical variable not yet supported"); } } else if (node.operand instanceof OperatorNode) { // Handle $x++ @@ -4497,6 +4887,136 @@ public void visit(OperatorNode node) { emitReg(stringReg); lastResultReg = rd; + } else if (op.equals("open")) { + // open(filehandle, mode, filename) or open(filehandle, expr) + if (node.operand == null || !(node.operand instanceof ListNode)) { + throwCompilerException("open requires arguments"); + } + + ListNode argsList = (ListNode) node.operand; + if (argsList.elements.isEmpty()) { + throwCompilerException("open requires arguments"); + } + + // Compile all arguments into a list + int argsReg = allocateRegister(); + emit(Opcodes.NEW_ARRAY); + emitReg(argsReg); + + for (Node arg : argsList.elements) { + arg.accept(this); + int elemReg = lastResultReg; + + emit(Opcodes.ARRAY_PUSH); + emitReg(argsReg); + emitReg(elemReg); + } + + // Call open with context and args + int rd = allocateRegister(); + emit(Opcodes.OPEN); + emitReg(rd); + emit(currentCallContext); + emitReg(argsReg); + + lastResultReg = rd; + } else if (op.equals("matchRegex")) { + // m/pattern/flags - create a regex and return it (for use with =~) + // operand: ListNode containing pattern string and flags string + if (node.operand == null || !(node.operand instanceof ListNode)) { + throwCompilerException("matchRegex requires pattern and flags"); + } + + ListNode args = (ListNode) node.operand; + if (args.elements.size() < 2) { + throwCompilerException("matchRegex requires pattern and flags"); + } + + // Compile pattern + args.elements.get(0).accept(this); + int patternReg = lastResultReg; + + // Compile flags + args.elements.get(1).accept(this); + int flagsReg = lastResultReg; + + // Create quoted regex using QUOTE_REGEX opcode + int rd = allocateRegister(); + emit(Opcodes.QUOTE_REGEX); + emitReg(rd); + emitReg(patternReg); + emitReg(flagsReg); + + lastResultReg = rd; + } else if (op.equals("chomp")) { + // chomp($x) or chomp - remove trailing newlines + if (node.operand == null) { + // chomp with no args - operates on $_ + String varName = "$_"; + int targetReg; + if (hasVariable(varName)) { + targetReg = getVariableRegister(varName); + } else { + targetReg = allocateRegister(); + int nameIdx = addToStringPool("main::_"); + emit(Opcodes.LOAD_GLOBAL_SCALAR); + emitReg(targetReg); + emit(nameIdx); + } + + int rd = allocateRegister(); + emit(Opcodes.CHOMP); + emitReg(rd); + emitReg(targetReg); + + lastResultReg = rd; + } else { + // chomp with argument + if (node.operand instanceof ListNode) { + ListNode list = (ListNode) node.operand; + if (!list.elements.isEmpty()) { + list.elements.get(0).accept(this); + } else { + throwCompilerException("chomp requires an argument"); + } + } else { + node.operand.accept(this); + } + int targetReg = lastResultReg; + + int rd = allocateRegister(); + emit(Opcodes.CHOMP); + emitReg(rd); + emitReg(targetReg); + + lastResultReg = rd; + } + } else if (op.equals("+")) { + // Unary + operator: forces numeric context on its operand + // For arrays/hashes in scalar context, this returns the size + // For scalars, this ensures the value is numeric + if (node.operand != null) { + // Evaluate operand in scalar context + int savedContext = currentCallContext; + currentCallContext = RuntimeContextType.SCALAR; + try { + node.operand.accept(this); + int operandReg = lastResultReg; + + // Emit ARRAY_SIZE to convert to scalar + // This handles arrays/hashes (converts to size) and passes through scalars + int rd = allocateRegister(); + emit(Opcodes.ARRAY_SIZE); + emitReg(rd); + emitReg(operandReg); + + lastResultReg = rd; + } finally { + currentCallContext = savedContext; + } + } else { + throwCompilerException("unary + operator requires an operand"); + } } else { throwCompilerException("Unsupported operator: " + op); } @@ -4956,12 +5476,10 @@ private void visitAnonymousSubroutine(SubroutineNode node) { private void visitEvalBlock(SubroutineNode node) { int resultReg = allocateRegister(); - // Emit EVAL_TRY with placeholder for catch offset - int tryPc = bytecode.size(); + // Emit EVAL_TRY with placeholder for catch target (absolute address) emitWithToken(Opcodes.EVAL_TRY, node.getIndex()); - int catchOffsetPos = bytecode.size(); - emit(0); // High byte placeholder - emit(0); // Low byte placeholder + int catchTargetPos = bytecode.size(); + emitInt(0); // Placeholder for absolute catch address (4 bytes) // Compile the eval block body node.block.accept(this); @@ -4976,19 +5494,16 @@ private void visitEvalBlock(SubroutineNode node) { // Emit EVAL_END (clears $@) emit(Opcodes.EVAL_END); - // Jump over catch block - int gotoEndPos = bytecode.size(); + // Jump over catch block to end emit(Opcodes.GOTO); - int gotoEndOffsetPos = bytecode.size(); - emit(0); // High byte placeholder - emit(0); // Low byte placeholder + int gotoEndPos = bytecode.size(); + emitInt(0); // Placeholder for absolute end address (4 bytes) // CATCH block starts here int catchPc = bytecode.size(); - // Patch EVAL_TRY with catch offset (as a single short) - int catchOffset = catchPc - tryPc; - bytecode.set(catchOffsetPos, (short)catchOffset); + // Patch EVAL_TRY with absolute catch target (4 bytes) + patchIntOffset(catchTargetPos, catchPc); // Emit EVAL_CATCH (sets $@, stores undef) emit(Opcodes.EVAL_CATCH); @@ -4997,9 +5512,8 @@ private void visitEvalBlock(SubroutineNode node) { // END label (after catch) int endPc = bytecode.size(); - // Patch GOTO to end (as a single short) - int gotoEndOffset = endPc - gotoEndPos; - bytecode.set(gotoEndOffsetPos, (short)gotoEndOffset); + // Patch GOTO with absolute end target (4 bytes) + patchIntOffset(gotoEndPos, endPc); lastResultReg = resultReg; } diff --git a/src/main/java/org/perlonjava/interpreter/BytecodeInterpreter.java b/src/main/java/org/perlonjava/interpreter/BytecodeInterpreter.java index 96438fb2d..519480d88 100644 --- a/src/main/java/org/perlonjava/interpreter/BytecodeInterpreter.java +++ b/src/main/java/org/perlonjava/interpreter/BytecodeInterpreter.java @@ -208,6 +208,55 @@ public static RuntimeList execute(InterpretedCode code, RuntimeArray args, int c break; } + case Opcodes.STORE_GLOBAL_ARRAY: { + // Store global array: GlobalVariable.getGlobalArray(name).setFromList(list) + int nameIdx = bytecode[pc++]; + int srcReg = bytecode[pc++]; + String name = code.stringPool[nameIdx]; + + RuntimeArray globalArray = GlobalVariable.getGlobalArray(name); + RuntimeBase value = registers[srcReg]; + + if (value == null) { + // Output disassembly around the error + String disasm = code.disassemble(); + throw new PerlCompilerException("STORE_GLOBAL_ARRAY: Register r" + srcReg + + " is null when storing to @" + name + " at pc=" + (pc-3) + "\n\nDisassembly:\n" + disasm); + } + + // Clear and populate the global array from the source + if (value instanceof RuntimeArray) { + globalArray.elements.clear(); + globalArray.elements.addAll(((RuntimeArray) value).elements); + } else if (value instanceof RuntimeList) { + globalArray.setFromList((RuntimeList) value); + } else { + globalArray.setFromList(value.getList()); + } + break; + } + + case Opcodes.STORE_GLOBAL_HASH: { + // Store global hash: GlobalVariable.getGlobalHash(name).setFromList(list) + int nameIdx = bytecode[pc++]; + int srcReg = bytecode[pc++]; + String name = code.stringPool[nameIdx]; + + RuntimeHash globalHash = GlobalVariable.getGlobalHash(name); + RuntimeBase value = registers[srcReg]; + + // Clear and populate the global hash from the source + if (value instanceof RuntimeHash) { + globalHash.elements.clear(); + globalHash.elements.putAll(((RuntimeHash) value).elements); + } else if (value instanceof RuntimeList) { + globalHash.setFromList((RuntimeList) value); + } else { + globalHash.setFromList(value.getList()); + } + break; + } + case Opcodes.LOAD_GLOBAL_ARRAY: { // Load global array: rd = GlobalVariable.getGlobalArray(name) int rd = bytecode[pc++]; @@ -468,6 +517,8 @@ public static RuntimeList execute(InterpretedCode code, RuntimeArray args, int c case Opcodes.NE_NUM: case Opcodes.LT_NUM: case Opcodes.GT_NUM: + case Opcodes.LE_NUM: + case Opcodes.GE_NUM: case Opcodes.EQ_STR: case Opcodes.NE_STR: case Opcodes.NOT: @@ -482,6 +533,8 @@ public static RuntimeList execute(InterpretedCode code, RuntimeArray args, int c case Opcodes.REF: case Opcodes.BLESS: case Opcodes.ISA: + case Opcodes.PROTOTYPE: + case Opcodes.QUOTE_REGEX: pc = executeTypeOps(opcode, bytecode, pc, registers, code); break; @@ -1066,6 +1119,76 @@ public static RuntimeList execute(InterpretedCode code, RuntimeArray args, int c break; } + case Opcodes.STRING_CONCAT_ASSIGN: { + // String concatenation and assign: rd .= rs + int rd = bytecode[pc++]; + int rs = bytecode[pc++]; + registers[rd] = StringOperators.stringConcat( + (RuntimeScalar) registers[rd], + (RuntimeScalar) registers[rs] + ); + break; + } + + case Opcodes.PUSH_LOCAL_VARIABLE: { + // Push variable to local stack: DynamicVariableManager.pushLocalVariable(rs) + int rs = bytecode[pc++]; + org.perlonjava.runtime.DynamicVariableManager.pushLocalVariable(registers[rs]); + break; + } + + case Opcodes.STORE_GLOB: { + // Store to glob: glob.set(value) + int globReg = bytecode[pc++]; + int valueReg = bytecode[pc++]; + ((org.perlonjava.runtime.RuntimeGlob) registers[globReg]).set((RuntimeScalar) registers[valueReg]); + break; + } + + case Opcodes.OPEN: { + // Open file: rd = IOOperator.open(ctx, args...) + int rd = bytecode[pc++]; + int ctx = bytecode[pc++]; + int argsReg = bytecode[pc++]; + RuntimeArray argsArray = (RuntimeArray) registers[argsReg]; + RuntimeBase[] argsVarargs = argsArray.elements.toArray(new RuntimeBase[0]); + registers[rd] = org.perlonjava.operators.IOOperator.open(ctx, argsVarargs); + break; + } + + case Opcodes.READLINE: { + // Read line from filehandle: rd = Readline.readline(fh_ref, ctx) + int rd = bytecode[pc++]; + int fhReg = bytecode[pc++]; + int ctx = bytecode[pc++]; + registers[rd] = org.perlonjava.operators.Readline.readline( + (RuntimeScalar) registers[fhReg], ctx + ); + break; + } + + case Opcodes.MATCH_REGEX: { + // Match regex: rd = RuntimeRegex.matchRegex(string, regex, ctx) + int rd = bytecode[pc++]; + int stringReg = bytecode[pc++]; + int regexReg = bytecode[pc++]; + int ctx = bytecode[pc++]; + registers[rd] = org.perlonjava.regex.RuntimeRegex.matchRegex( + (RuntimeScalar) registers[stringReg], + (RuntimeScalar) registers[regexReg], + ctx + ); + break; + } + + case Opcodes.CHOMP: { + // Chomp: rd = rs.chomp() + int rd = bytecode[pc++]; + int rs = bytecode[pc++]; + registers[rd] = registers[rs].chomp(); + break; + } + case Opcodes.PRE_AUTOINCREMENT: { // Pre-increment: ++rd int rd = bytecode[pc++]; @@ -1167,13 +1290,11 @@ public static RuntimeList execute(InterpretedCode code, RuntimeArray args, int c case Opcodes.EVAL_TRY: { // Start of eval block with exception handling - // Format: [EVAL_TRY] [catch_offset_high] [catch_offset_low] + // Format: [EVAL_TRY] [catch_target_high] [catch_target_low] + // catch_target is absolute bytecode address (4 bytes) - int catchOffsetHigh = bytecode[pc++]; - int catchOffsetLow = bytecode[pc++]; - int catchOffset = (catchOffsetHigh << 8) | catchOffsetLow; - int tryStartPc = pc - 3; // PC where EVAL_TRY opcode is - int catchPc = tryStartPc + catchOffset; + int catchPc = readInt(bytecode, pc); // Read 4-byte absolute address + pc += 2; // Skip the 2 shorts we just read // Push catch PC onto eval stack evalCatchStack.push(catchPc); @@ -1741,6 +1862,27 @@ private static int executeTypeOps(short opcode, short[] bytecode, int pc, return pc; } + case Opcodes.PROTOTYPE: { + int rd = bytecode[pc++]; + int rs = bytecode[pc++]; + int packageIdx = readInt(bytecode, pc); + pc += 2; // readInt reads 2 shorts + RuntimeScalar codeRef = (RuntimeScalar) registers[rs]; + String packageName = code.stringPool[packageIdx]; + registers[rd] = RuntimeCode.prototype(codeRef, packageName); + return pc; + } + + case Opcodes.QUOTE_REGEX: { + int rd = bytecode[pc++]; + int patternReg = bytecode[pc++]; + int flagsReg = bytecode[pc++]; + RuntimeScalar pattern = (RuntimeScalar) registers[patternReg]; + RuntimeScalar flags = (RuntimeScalar) registers[flagsReg]; + registers[rd] = org.perlonjava.regex.RuntimeRegex.getQuotedRegex(pattern, flags); + return pc; + } + default: throw new RuntimeException("Unknown type opcode: " + opcode); } @@ -2183,6 +2325,32 @@ private static int executeComparisons(short opcode, short[] bytecode, int pc, return pc; } + case Opcodes.LE_NUM: { + // Less than or equal: rd = (rs1 <= rs2) + int rd = bytecode[pc++]; + int rs1 = bytecode[pc++]; + int rs2 = bytecode[pc++]; + RuntimeBase val1 = registers[rs1]; + RuntimeBase val2 = registers[rs2]; + RuntimeScalar s1 = (val1 instanceof RuntimeScalar) ? (RuntimeScalar) val1 : val1.scalar(); + RuntimeScalar s2 = (val2 instanceof RuntimeScalar) ? (RuntimeScalar) val2 : val2.scalar(); + registers[rd] = CompareOperators.lessThanOrEqual(s1, s2); + return pc; + } + + case Opcodes.GE_NUM: { + // Greater than or equal: rd = (rs1 >= rs2) + int rd = bytecode[pc++]; + int rs1 = bytecode[pc++]; + int rs2 = bytecode[pc++]; + RuntimeBase val1 = registers[rs1]; + RuntimeBase val2 = registers[rs2]; + RuntimeScalar s1 = (val1 instanceof RuntimeScalar) ? (RuntimeScalar) val1 : val1.scalar(); + RuntimeScalar s2 = (val2 instanceof RuntimeScalar) ? (RuntimeScalar) val2 : val2.scalar(); + registers[rd] = CompareOperators.greaterThanOrEqual(s1, s2); + return pc; + } + case Opcodes.NE_NUM: { // Not equal: rd = (rs1 != rs2) int rd = bytecode[pc++]; diff --git a/src/main/java/org/perlonjava/interpreter/InterpretedCode.java b/src/main/java/org/perlonjava/interpreter/InterpretedCode.java index 12ca41ef9..37299ede4 100644 --- a/src/main/java/org/perlonjava/interpreter/InterpretedCode.java +++ b/src/main/java/org/perlonjava/interpreter/InterpretedCode.java @@ -299,11 +299,21 @@ public String disassemble() { nameIdx = bytecode[pc++]; sb.append("LOAD_GLOBAL_ARRAY r").append(rd).append(" = @").append(stringPool[nameIdx]).append("\n"); break; + case Opcodes.STORE_GLOBAL_ARRAY: + nameIdx = bytecode[pc++]; + int storeArraySrcReg = bytecode[pc++]; + sb.append("STORE_GLOBAL_ARRAY @").append(stringPool[nameIdx]).append(" = r").append(storeArraySrcReg).append("\n"); + break; case Opcodes.LOAD_GLOBAL_HASH: rd = bytecode[pc++]; nameIdx = bytecode[pc++]; sb.append("LOAD_GLOBAL_HASH r").append(rd).append(" = %").append(stringPool[nameIdx]).append("\n"); break; + case Opcodes.STORE_GLOBAL_HASH: + nameIdx = bytecode[pc++]; + int storeHashSrcReg = bytecode[pc++]; + sb.append("STORE_GLOBAL_HASH %").append(stringPool[nameIdx]).append(" = r").append(storeHashSrcReg).append("\n"); + break; case Opcodes.LOAD_GLOBAL_CODE: rd = bytecode[pc++]; nameIdx = bytecode[pc++]; @@ -362,6 +372,32 @@ public String disassemble() { pc += 2; sb.append("ADD_SCALAR_INT r").append(rd).append(" = r").append(rs).append(" + ").append(imm).append("\n"); break; + case Opcodes.SUB_SCALAR_INT: + rd = bytecode[pc++]; + rs = bytecode[pc++]; + int subImm = readInt(bytecode, pc); + pc += 2; + sb.append("SUB_SCALAR_INT r").append(rd).append(" = r").append(rs).append(" - ").append(subImm).append("\n"); + break; + case Opcodes.MUL_SCALAR_INT: + rd = bytecode[pc++]; + rs = bytecode[pc++]; + int mulImm = readInt(bytecode, pc); + pc += 2; + sb.append("MUL_SCALAR_INT r").append(rd).append(" = r").append(rs).append(" * ").append(mulImm).append("\n"); + break; + case Opcodes.CONCAT: + rd = bytecode[pc++]; + rs1 = bytecode[pc++]; + rs2 = bytecode[pc++]; + sb.append("CONCAT r").append(rd).append(" = r").append(rs1).append(" . r").append(rs2).append("\n"); + break; + case Opcodes.REPEAT: + rd = bytecode[pc++]; + rs1 = bytecode[pc++]; + rs2 = bytecode[pc++]; + sb.append("REPEAT r").append(rd).append(" = r").append(rs1).append(" x r").append(rs2).append("\n"); + break; case Opcodes.LT_NUM: rd = bytecode[pc++]; rs1 = bytecode[pc++]; @@ -374,6 +410,18 @@ public String disassemble() { rs2 = bytecode[pc++]; sb.append("GT_NUM r").append(rd).append(" = r").append(rs1).append(" > r").append(rs2).append("\n"); break; + case Opcodes.LE_NUM: + rd = bytecode[pc++]; + rs1 = bytecode[pc++]; + rs2 = bytecode[pc++]; + sb.append("LE_NUM r").append(rd).append(" = r").append(rs1).append(" <= r").append(rs2).append("\n"); + break; + case Opcodes.GE_NUM: + rd = bytecode[pc++]; + rs1 = bytecode[pc++]; + rs2 = bytecode[pc++]; + sb.append("GE_NUM r").append(rd).append(" = r").append(rs1).append(" >= r").append(rs2).append("\n"); + break; case Opcodes.NE_NUM: rd = bytecode[pc++]; rs1 = bytecode[pc++]; @@ -399,6 +447,44 @@ public String disassemble() { pc += 2; sb.append("ADD_ASSIGN_INT r").append(rd).append(" += ").append(imm).append("\n"); break; + case Opcodes.STRING_CONCAT_ASSIGN: + rd = bytecode[pc++]; + rs = bytecode[pc++]; + sb.append("STRING_CONCAT_ASSIGN r").append(rd).append(" .= r").append(rs).append("\n"); + break; + case Opcodes.PUSH_LOCAL_VARIABLE: + rs = bytecode[pc++]; + sb.append("PUSH_LOCAL_VARIABLE r").append(rs).append("\n"); + break; + case Opcodes.STORE_GLOB: + int globReg = bytecode[pc++]; + rs = bytecode[pc++]; + sb.append("STORE_GLOB r").append(globReg).append(" = r").append(rs).append("\n"); + break; + case Opcodes.OPEN: + rd = bytecode[pc++]; + int openCtx = bytecode[pc++]; + int openArgs = bytecode[pc++]; + sb.append("OPEN r").append(rd).append(" = open(ctx=").append(openCtx).append(", r").append(openArgs).append(")\n"); + break; + case Opcodes.READLINE: + rd = bytecode[pc++]; + int fhReg = bytecode[pc++]; + int readCtx = bytecode[pc++]; + sb.append("READLINE r").append(rd).append(" = readline(r").append(fhReg).append(", ctx=").append(readCtx).append(")\n"); + break; + case Opcodes.MATCH_REGEX: + rd = bytecode[pc++]; + int strReg = bytecode[pc++]; + int regReg = bytecode[pc++]; + int matchCtx = bytecode[pc++]; + sb.append("MATCH_REGEX r").append(rd).append(" = r").append(strReg).append(" =~ r").append(regReg).append(" (ctx=").append(matchCtx).append(")\n"); + break; + case Opcodes.CHOMP: + rd = bytecode[pc++]; + rs = bytecode[pc++]; + sb.append("CHOMP r").append(rd).append(" = chomp(r").append(rs).append(")\n"); + break; case Opcodes.PRE_AUTOINCREMENT: rd = bytecode[pc++]; sb.append("PRE_AUTOINCREMENT ++r").append(rd).append("\n"); @@ -451,11 +537,10 @@ public String disassemble() { sb.append("WARN r").append(rs).append("\n"); break; case Opcodes.EVAL_TRY: { - int catchOffsetHigh = bytecode[pc++]; - int catchOffsetLow = bytecode[pc++]; - int catchOffset = (catchOffsetHigh << 8) | catchOffsetLow; - int tryPc = pc - 3; - int catchPc = tryPc + catchOffset; + // Read 4-byte absolute catch target + int high = bytecode[pc++] & 0xFFFF; + int low = bytecode[pc++] & 0xFFFF; + int catchPc = (high << 16) | low; sb.append("EVAL_TRY catch_at=").append(catchPc).append("\n"); break; } @@ -472,6 +557,32 @@ public String disassemble() { int indexReg = bytecode[pc++]; sb.append("ARRAY_GET r").append(rd).append(" = r").append(arrayReg).append("[r").append(indexReg).append("]\n"); break; + case Opcodes.ARRAY_SET: + arrayReg = bytecode[pc++]; + indexReg = bytecode[pc++]; + int arraySetValueReg = bytecode[pc++]; + sb.append("ARRAY_SET r").append(arrayReg).append("[r").append(indexReg).append("] = r").append(arraySetValueReg).append("\n"); + break; + case Opcodes.ARRAY_PUSH: + arrayReg = bytecode[pc++]; + int arrayPushValueReg = bytecode[pc++]; + sb.append("ARRAY_PUSH r").append(arrayReg).append(".push(r").append(arrayPushValueReg).append(")\n"); + break; + case Opcodes.ARRAY_POP: + rd = bytecode[pc++]; + arrayReg = bytecode[pc++]; + sb.append("ARRAY_POP r").append(rd).append(" = r").append(arrayReg).append(".pop()\n"); + break; + case Opcodes.ARRAY_SHIFT: + rd = bytecode[pc++]; + arrayReg = bytecode[pc++]; + sb.append("ARRAY_SHIFT r").append(rd).append(" = r").append(arrayReg).append(".shift()\n"); + break; + case Opcodes.ARRAY_UNSHIFT: + arrayReg = bytecode[pc++]; + int arrayUnshiftValueReg = bytecode[pc++]; + sb.append("ARRAY_UNSHIFT r").append(arrayReg).append(".unshift(r").append(arrayUnshiftValueReg).append(")\n"); + break; case Opcodes.ARRAY_SIZE: rd = bytecode[pc++]; arrayReg = bytecode[pc++]; @@ -636,6 +747,11 @@ public String disassemble() { } sb.append("])\n"); break; + case Opcodes.SET_SCALAR: + rd = bytecode[pc++]; + rs = bytecode[pc++]; + sb.append("SET_SCALAR r").append(rd).append(".set(r").append(rs).append(")\n"); + break; case Opcodes.NOT: rd = bytecode[pc++]; rs = bytecode[pc++]; @@ -665,6 +781,23 @@ public String disassemble() { sb.append("ISA r").append(rd).append(" = isa(r").append(objReg) .append(", r").append(pkgReg).append(")\n"); break; + case Opcodes.PROTOTYPE: + rd = bytecode[pc++]; + rs = bytecode[pc++]; + int packageIdx = readInt(bytecode, pc); + pc += 2; // readInt reads 2 shorts + String packageName = (stringPool != null && packageIdx < stringPool.length) ? + stringPool[packageIdx] : ""; + sb.append("PROTOTYPE r").append(rd).append(" = prototype(r").append(rs) + .append(", \"").append(packageName).append("\")\n"); + break; + case Opcodes.QUOTE_REGEX: + rd = bytecode[pc++]; + int patternReg = bytecode[pc++]; + int flagsReg = bytecode[pc++]; + sb.append("QUOTE_REGEX r").append(rd).append(" = qr{r").append(patternReg) + .append("}r").append(flagsReg).append("\n"); + break; case Opcodes.ITERATOR_CREATE: rd = bytecode[pc++]; rs = bytecode[pc++]; @@ -719,6 +852,51 @@ public String disassemble() { rs = bytecode[pc++]; sb.append("SCALAR_TO_LIST r").append(rd).append(" = to_list(r").append(rs).append(")\n"); break; + case Opcodes.EVAL_STRING: + rd = bytecode[pc++]; + rs = bytecode[pc++]; + sb.append("EVAL_STRING r").append(rd).append(" = eval(r").append(rs).append(")\n"); + break; + case Opcodes.SELECT_OP: + rd = bytecode[pc++]; + rs = bytecode[pc++]; + sb.append("SELECT_OP r").append(rd).append(" = select(r").append(rs).append(")\n"); + break; + case Opcodes.LOAD_GLOB: + rd = bytecode[pc++]; + nameIdx = bytecode[pc++]; + sb.append("LOAD_GLOB r").append(rd).append(" = *").append(stringPool[nameIdx]).append("\n"); + break; + case Opcodes.SLEEP_OP: + rd = bytecode[pc++]; + rs = bytecode[pc++]; + sb.append("SLEEP_OP r").append(rd).append(" = sleep(r").append(rs).append(")\n"); + break; + case Opcodes.DEREF_ARRAY: + rd = bytecode[pc++]; + rs = bytecode[pc++]; + sb.append("DEREF_ARRAY r").append(rd).append(" = @{r").append(rs).append("}\n"); + break; + case Opcodes.RETRIEVE_BEGIN_SCALAR: + rd = bytecode[pc++]; + nameIdx = bytecode[pc++]; + int beginId = bytecode[pc++]; + sb.append("RETRIEVE_BEGIN_SCALAR r").append(rd).append(" = BEGIN_").append(beginId) + .append("::").append(stringPool[nameIdx]).append("\n"); + break; + case Opcodes.SPLIT: + rd = bytecode[pc++]; + int splitPatternReg = bytecode[pc++]; + int splitArgsReg = bytecode[pc++]; + int splitCtx = bytecode[pc++]; + sb.append("SPLIT r").append(rd).append(" = split(r").append(splitPatternReg) + .append(", r").append(splitArgsReg).append(", ctx=").append(splitCtx).append(")\n"); + break; + case Opcodes.LOCAL_SCALAR: + rd = bytecode[pc++]; + nameIdx = bytecode[pc++]; + sb.append("LOCAL_SCALAR r").append(rd).append(" = local $").append(stringPool[nameIdx]).append("\n"); + break; // DEPRECATED: SLOW_OP case removed - opcode 87 is no longer emitted // All operations now use direct opcodes (114-154) default: diff --git a/src/main/java/org/perlonjava/interpreter/Opcodes.java b/src/main/java/org/perlonjava/interpreter/Opcodes.java index 616945564..713bbaff9 100644 --- a/src/main/java/org/perlonjava/interpreter/Opcodes.java +++ b/src/main/java/org/perlonjava/interpreter/Opcodes.java @@ -620,8 +620,50 @@ public class Opcodes { /** Integer conversion: rd = MathOperators.integer(rs) - equivalent to int(rs) */ public static final short OP_INT = 157; + /** Prototype operator: rd = RuntimeCode.prototype(rs_coderef, package_name) + * Format: PROTOTYPE rd rs package_name_idx(int) */ + public static final short PROTOTYPE = 158; + + /** Quote regex operator: rd = RuntimeRegex.getQuotedRegex(pattern_reg, flags_reg) + * Format: QUOTE_REGEX rd pattern_reg flags_reg */ + public static final short QUOTE_REGEX = 159; + + /** Less than or equal: rd = CompareOperators.numericLessThanOrEqual(rs1, rs2) */ + public static final short LE_NUM = 160; + + /** Greater than or equal: rd = CompareOperators.numericGreaterThanOrEqual(rs1, rs2) */ + public static final short GE_NUM = 161; + + /** String concatenation assignment: rd .= rs (appends rs to rd) + * Format: STRING_CONCAT_ASSIGN rd rs */ + public static final short STRING_CONCAT_ASSIGN = 162; + + /** Push variable to local stack: DynamicVariableManager.pushLocalVariable(rs) + * Format: PUSH_LOCAL_VARIABLE rs */ + public static final short PUSH_LOCAL_VARIABLE = 163; + + /** Store to glob: glob.set(rs) + * Format: STORE_GLOB globReg valueReg */ + public static final short STORE_GLOB = 164; + + /** Open file: rd = IOOperator.open(ctx, args...) + * Format: OPEN rd ctx argsReg */ + public static final short OPEN = 165; + + /** Read line from filehandle: rd = Readline.readline(fh_ref, ctx) + * Format: READLINE rd fhReg ctx */ + public static final short READLINE = 166; + + /** Match regex: rd = RuntimeRegex.matchRegex(string, regex, ctx) + * Format: MATCH_REGEX rd stringReg regexReg ctx */ + public static final short MATCH_REGEX = 167; + + /** Chomp: rd = rs.chomp() + * Format: CHOMP rd rs */ + public static final short CHOMP = 168; + // ================================================================= - // OPCODES 403-32767: RESERVED FOR FUTURE OPERATIONS + // OPCODES 169-32767: RESERVED FOR FUTURE OPERATIONS // ================================================================= // See PHASE3_OPERATOR_PROMOTIONS.md for promotion strategy. // All SLOWOP_* constants have been removed - use direct opcodes 114-154 instead. diff --git a/src/main/java/org/perlonjava/parser/SubroutineParser.java b/src/main/java/org/perlonjava/parser/SubroutineParser.java index bea01f5db..818cb372a 100644 --- a/src/main/java/org/perlonjava/parser/SubroutineParser.java +++ b/src/main/java/org/perlonjava/parser/SubroutineParser.java @@ -654,11 +654,12 @@ public static ListNode handleNamedSubWithFilter(Parser parser, String subName, S codeRef.value = new RuntimeCode(subName, attributes); } - RuntimeCode code = (RuntimeCode) codeRef.value; - code.prototype = prototype; - code.attributes = attributes; - code.subName = subName; - code.packageName = parser.ctx.symbolTable.getCurrentPackage(); + // Initialize placeholder metadata (accessed via codeRef.value) + RuntimeCode placeholder = (RuntimeCode) codeRef.value; + placeholder.prototype = prototype; + placeholder.attributes = attributes; + placeholder.subName = subName; + placeholder.packageName = parser.ctx.symbolTable.getCurrentPackage(); // Optimization - https://github.com/fglock/PerlOnJava/issues/8 // Prepare capture variables @@ -781,38 +782,77 @@ public static ListNode handleNamedSubWithFilter(Parser parser, String subName, S new RuntimeArray() ); - // Encapsulate the subroutine creation task in a Supplier + // Hybrid lazy/eager compilation approach: + // - Keep lazy compilation for normal code (preserves test compatibility) + // - The Supplier tries createRuntimeCode() which handles both normal compilation and interpreter fallback + // - For InterpretedCode, we replace codeRef.value (not just code fields) + Supplier subroutineCreationTaskSupplier = () -> { - // Generate bytecode and load into a Class object - Class generatedClass = EmitterMethodCreator.createClassWithMethod(newCtx, block, false); + // Try unified API (returns RuntimeCode - either CompiledCode or InterpretedCode) + org.perlonjava.runtime.RuntimeCode runtimeCode = + EmitterMethodCreator.createRuntimeCode(newCtx, block, false); try { - // Prepare constructor with the captured variable types - Class[] parameterTypes = classList.toArray(new Class[0]); - Constructor constructor = generatedClass.getConstructor(parameterTypes); - - // Instantiate the subroutine with the captured variables - Object[] parameters = paramList.toArray(); - code.codeObject = constructor.newInstance(parameters); + if (runtimeCode instanceof org.perlonjava.codegen.CompiledCode) { + // CompiledCode path - fill in the existing placeholder + org.perlonjava.codegen.CompiledCode compiledCode = + (org.perlonjava.codegen.CompiledCode) runtimeCode; + Class generatedClass = compiledCode.generatedClass; + + // Prepare constructor with the captured variable types + Class[] parameterTypes = classList.toArray(new Class[0]); + Constructor constructor = generatedClass.getConstructor(parameterTypes); + + // Instantiate the subroutine with the captured variables + Object[] parameters = paramList.toArray(); + placeholder.codeObject = constructor.newInstance(parameters); + + // Retrieve the 'apply' method from the generated class + placeholder.methodHandle = RuntimeCode.lookup.findVirtual(generatedClass, "apply", RuntimeCode.methodType); + + // Set the __SUB__ instance field to codeRef + Field field = placeholder.codeObject.getClass().getDeclaredField("__SUB__"); + field.set(placeholder.codeObject, codeRef); + + } else if (runtimeCode instanceof org.perlonjava.interpreter.InterpretedCode) { + // InterpretedCode path - replace codeRef.value entirely + org.perlonjava.interpreter.InterpretedCode interpretedCode = + (org.perlonjava.interpreter.InterpretedCode) runtimeCode; + + // Set captured variables if there are any + if (!paramList.isEmpty()) { + Object[] parameters = paramList.toArray(); + org.perlonjava.runtime.RuntimeBase[] capturedVars = + new org.perlonjava.runtime.RuntimeBase[parameters.length]; + for (int i = 0; i < parameters.length; i++) { + capturedVars[i] = (org.perlonjava.runtime.RuntimeBase) parameters[i]; + } + interpretedCode = interpretedCode.withCapturedVars(capturedVars); + } - // Retrieve the 'apply' method from the generated class - code.methodHandle = RuntimeCode.lookup.findVirtual(generatedClass, "apply", RuntimeCode.methodType); + // Copy metadata from the placeholder + interpretedCode.prototype = placeholder.prototype; + interpretedCode.attributes = placeholder.attributes; + interpretedCode.subName = placeholder.subName; + interpretedCode.packageName = placeholder.packageName; - // Set the __SUB__ instance field to codeRef - Field field = code.codeObject.getClass().getDeclaredField("__SUB__"); - field.set(code.codeObject, codeRef); + // REPLACE the global reference + codeRef.value = interpretedCode; + } } catch (Exception e) { // Handle any exceptions during subroutine creation throw new PerlCompilerException("Subroutine error: " + e.getMessage()); } - // Clear the compilerThread once done - code.compilerSupplier = null; + // Clear the compilerSupplier once done (use the captured placeholder variable) + // This prevents the Supplier from being invoked multiple times + placeholder.compilerSupplier = null; return null; }; - // Store the supplier for later execution - code.compilerSupplier = subroutineCreationTaskSupplier; + // Store the supplier in the placeholder + RuntimeCode placeholderForSupplier = (RuntimeCode) codeRef.value; + placeholderForSupplier.compilerSupplier = subroutineCreationTaskSupplier; // return an empty AST list diff --git a/src/main/java/org/perlonjava/runtime/GlobalVariable.java b/src/main/java/org/perlonjava/runtime/GlobalVariable.java index 1a879e529..42800c3d9 100644 --- a/src/main/java/org/perlonjava/runtime/GlobalVariable.java +++ b/src/main/java/org/perlonjava/runtime/GlobalVariable.java @@ -342,7 +342,8 @@ public static RuntimeScalar definedGlobalCodeRefAsScalar(String key) { RuntimeScalar var = globalCodeRefs.get(key); if (var != null && var.type == RuntimeScalarType.CODE && var.value instanceof RuntimeCode runtimeCode) { // Check if the subroutine has actual implementation (not just a placeholder) - return (runtimeCode.methodHandle != null || runtimeCode.compilerSupplier != null || runtimeCode.isBuiltin) ? scalarTrue : scalarFalse; + boolean result = (runtimeCode.methodHandle != null || runtimeCode.compilerSupplier != null || runtimeCode.isBuiltin); + return result ? scalarTrue : scalarFalse; } return scalarFalse; } diff --git a/src/main/java/org/perlonjava/runtime/RuntimeCode.java b/src/main/java/org/perlonjava/runtime/RuntimeCode.java index f371e78ee..88685a273 100644 --- a/src/main/java/org/perlonjava/runtime/RuntimeCode.java +++ b/src/main/java/org/perlonjava/runtime/RuntimeCode.java @@ -918,6 +918,15 @@ public static RuntimeList apply(RuntimeScalar runtimeScalar, RuntimeArray a, int // Check if the type of this RuntimeScalar is CODE if (runtimeScalar.type == RuntimeScalarType.CODE) { RuntimeCode code = (RuntimeCode) runtimeScalar.value; + + // CRITICAL: Run compilerSupplier BEFORE checking defined() + // The compilerSupplier may replace runtimeScalar.value with InterpretedCode + if (code.compilerSupplier != null) { + code.compilerSupplier.get(); + // Reload code from runtimeScalar.value in case it was replaced + code = (RuntimeCode) runtimeScalar.value; + } + // Check if it's an unfilled forward declaration (not defined) if (!code.defined()) { // Try to find AUTOLOAD for this subroutine @@ -1041,6 +1050,15 @@ public static RuntimeList apply(RuntimeScalar runtimeScalar, String subroutineNa } RuntimeCode code = (RuntimeCode) runtimeScalar.value; + + // CRITICAL: Run compilerSupplier BEFORE checking defined() + // The compilerSupplier may replace runtimeScalar.value with InterpretedCode + if (code.compilerSupplier != null) { + code.compilerSupplier.get(); + // Reload code from runtimeScalar.value in case it was replaced + code = (RuntimeCode) runtimeScalar.value; + } + if (code.defined()) { // Cast the value to RuntimeCode and call apply() return code.apply(subroutineName, a, callContext); @@ -1113,6 +1131,15 @@ public static RuntimeList apply(RuntimeScalar runtimeScalar, String subroutineNa RuntimeArray a = list.getArrayOfAlias(); RuntimeCode code = (RuntimeCode) runtimeScalar.value; + + // CRITICAL: Run compilerSupplier BEFORE checking defined() + // The compilerSupplier may replace runtimeScalar.value with InterpretedCode + if (code.compilerSupplier != null) { + code.compilerSupplier.get(); + // Reload code from runtimeScalar.value in case it was replaced + code = (RuntimeCode) runtimeScalar.value; + } + if (code.defined()) { // Cast the value to RuntimeCode and call apply() return code.apply(subroutineName, a, callContext); @@ -1263,7 +1290,6 @@ public RuntimeList apply(RuntimeArray a, int callContext) { try { // Wait for the compilerThread to finish if it exists if (this.compilerSupplier != null) { - // System.out.println("Waiting for compiler thread to finish..."); this.compilerSupplier.get(); // Wait for the task to finish } diff --git a/src/main/java/org/perlonjava/scriptengine/PerlLanguageProvider.java b/src/main/java/org/perlonjava/scriptengine/PerlLanguageProvider.java index a0aea037a..d2d7d4ca4 100644 --- a/src/main/java/org/perlonjava/scriptengine/PerlLanguageProvider.java +++ b/src/main/java/org/perlonjava/scriptengine/PerlLanguageProvider.java @@ -2,6 +2,7 @@ import org.perlonjava.CompilerOptions; import org.perlonjava.astnode.Node; +import org.perlonjava.codegen.CompiledCode; import org.perlonjava.codegen.EmitterContext; import org.perlonjava.codegen.EmitterMethodCreator; import org.perlonjava.codegen.JavaClassInfo; @@ -183,10 +184,10 @@ public static RuntimeList executePerlCode(CompilerOptions compilerOptions, SpecialBlockParser.setCurrentScope(ctx.symbolTable); // Compile to executable (compiler or interpreter based on flag) - Object codeInstance = compileToExecutable(ast, ctx); + RuntimeCode runtimeCode = compileToExecutable(ast, ctx); // Execute (unified path for both backends) - return executeCode(codeInstance, ctx, isTopLevelScript, callerContext); + return executeCode(runtimeCode, ctx, isTopLevelScript, callerContext); } /** @@ -235,23 +236,23 @@ public static RuntimeList executePerlAST(Node ast, SpecialBlockParser.setCurrentScope(ctx.symbolTable); // Compile to executable (compiler or interpreter based on flag) - Object codeInstance = compileToExecutable(ast, ctx); + RuntimeCode runtimeCode = compileToExecutable(ast, ctx); // executePerlAST is always called from BEGIN blocks which use VOID context - return executeCode(codeInstance, ctx, false, RuntimeContextType.VOID); + return executeCode(runtimeCode, ctx, false, RuntimeContextType.VOID); } /** * Common method to execute compiled code and return the result. - * Works with both interpreter (InterpretedCode) and compiler (generated class instance). + * Works with both interpreter (InterpretedCode) and compiler (CompiledCode). * - * @param codeInstance The compiled code instance (InterpretedCode or generated class) + * @param runtimeCode The compiled RuntimeCode instance (InterpretedCode or CompiledCode) * @param ctx The emitter context. * @param isMainProgram Indicates if this is the main program. * @param callerContext The calling context (VOID, SCALAR, LIST) or -1 for default * @return The result of the Perl code execution. */ - private static RuntimeList executeCode(Object codeInstance, EmitterContext ctx, boolean isMainProgram, int callerContext) throws Exception { + private static RuntimeList executeCode(RuntimeCode runtimeCode, EmitterContext ctx, boolean isMainProgram, int callerContext) throws Exception { runUnitcheckBlocks(ctx.unitcheckBlocks); if (isMainProgram) { runCheckBlocks(); @@ -261,10 +262,6 @@ private static RuntimeList executeCode(Object codeInstance, EmitterContext ctx, return null; } - // Get MethodHandle for apply() - works for both RuntimeCode subclasses and generated classes - Class codeClass = codeInstance.getClass(); - MethodHandle invoker = RuntimeCode.lookup.findVirtual(codeClass, "apply", RuntimeCode.methodType); - RuntimeList result; try { if (isMainProgram) { @@ -274,7 +271,9 @@ private static RuntimeList executeCode(Object codeInstance, EmitterContext ctx, // Use the caller's context if specified, otherwise use default behavior int executionContext = callerContext >= 0 ? callerContext : (isMainProgram ? RuntimeContextType.VOID : RuntimeContextType.SCALAR); - result = (RuntimeList) invoker.invoke(codeInstance, new RuntimeArray(), executionContext); + + // Call apply() directly - works for both InterpretedCode and CompiledCode + result = runtimeCode.apply(new RuntimeArray(), executionContext); try { if (isMainProgram) { @@ -304,16 +303,17 @@ private static RuntimeList executeCode(Object codeInstance, EmitterContext ctx, } /** - * Compiles Perl AST to an executable instance (compiler or interpreter). + * Compiles Perl AST to an executable RuntimeCode instance (compiler or interpreter). * This method provides a unified compilation path that chooses the backend - * based on CompilerOptions.useInterpreter. + * based on CompilerOptions.useInterpreter, with automatic fallback to interpreter + * when compilation exceeds JVM method size limits. * * @param ast The abstract syntax tree to compile * @param ctx The emitter context - * @return Object that has apply() method - either InterpretedCode or compiled class instance + * @return RuntimeCode instance - either InterpretedCode or CompiledCode * @throws Exception if compilation fails */ - private static Object compileToExecutable(Node ast, EmitterContext ctx) throws Exception { + private static RuntimeCode compileToExecutable(Node ast, EmitterContext ctx) throws Exception { if (ctx.compilerOptions.useInterpreter) { // Interpreter path - returns InterpretedCode (extends RuntimeCode) ctx.logDebug("Compiling to bytecode interpreter"); @@ -333,15 +333,68 @@ private static Object compileToExecutable(Node ast, EmitterContext ctx) throws E return interpretedCode; } else { - // Compiler path - returns generated class instance + // Compiler path - returns CompiledCode (wrapper around generated class) ctx.logDebug("Compiling to JVM bytecode"); - Class generatedClass = EmitterMethodCreator.createClassWithMethod( - ctx, - ast, - false // no try-catch - ); - Constructor constructor = generatedClass.getConstructor(); - return constructor.newInstance(); + try { + Class generatedClass = EmitterMethodCreator.createClassWithMethod( + ctx, + ast, + false // no try-catch + ); + Constructor constructor = generatedClass.getConstructor(); + Object instance = constructor.newInstance(); + + // Create MethodHandle for the apply() method + MethodHandle methodHandle = RuntimeCode.lookup.findVirtual( + generatedClass, + "apply", + RuntimeCode.methodType + ); + + // Wrap in CompiledCode for type safety and consistency + // Main scripts don't have prototypes, so pass null + CompiledCode compiled = new CompiledCode( + methodHandle, + instance, + null, // prototype (main scripts don't have one) + generatedClass, + ctx + ); + return compiled; + + } catch (RuntimeException e) { + // Check if this is a "Method too large" error from ASM + if (e.getMessage() != null && e.getMessage().contains("Method too large")) { + // When JPERL_USE_INTERPRETER_FALLBACK is set and compilation fails due to size, + // automatically fall back to the interpreter backend + boolean showFallback = System.getenv("JPERL_SHOW_FALLBACK") != null || + System.getenv("JPERL_USE_INTERPRETER_FALLBACK") != null; + if (showFallback) { + System.err.println("Note: Method too large after AST splitting, using interpreter backend."); + } + + // Fall back to interpreter path + ctx.logDebug("Falling back to bytecode interpreter due to method size"); + BytecodeCompiler compiler = new BytecodeCompiler( + ctx.compilerOptions.fileName, + 1, // sourceLine (legacy parameter) + ctx.errorUtil // Pass errorUtil for proper error formatting with line numbers + ); + InterpretedCode interpretedCode = compiler.compile(ast); + + // If --disassemble is enabled, print the bytecode + if (ctx.compilerOptions.disassembleEnabled) { + System.out.println("=== Interpreter Bytecode ==="); + System.out.println(interpretedCode.disassemble()); + System.out.println("=== End Bytecode ==="); + } + + return interpretedCode; + } else { + // Not a size error, rethrow + throw e; + } + } } } diff --git a/src/test/resources/unit/large_sub_interpreter_fallback.t b/src/test/resources/unit/large_sub_interpreter_fallback.t new file mode 100644 index 000000000..157f741ff --- /dev/null +++ b/src/test/resources/unit/large_sub_interpreter_fallback.t @@ -0,0 +1,38 @@ +print "1..3\n"; + +# Test 1: Small subroutine compiles normally +sub small_sub { + my $x = shift; + return $x * 2; +} + +my $result = small_sub(21); +print "not " unless $result == 42; +print "ok 1 - small subroutine works\n"; + +# Test 2: Large subroutine should work (either via compiler or interpreter fallback) +# This sub has many statements to push it over the 65KB JVM bytecode limit +sub large_sub { + my $sum = 0; + $sum += 1; + $sum += 1; + $sum += 1; + $sum += 1; + $sum += 1; + $sum += 1; + $sum += 1; + $sum += 1; + $sum += 1; + $sum += 1; + # Would need ~10,000+ more lines here to actually trigger the limit, + # but this demonstrates the structure + return $sum; +} + +$result = large_sub(); +print "not " unless $result == 10; +print "ok 2 - large subroutine works\n"; + +# Test 3: Verify subroutine was compiled (check it's defined) +print "not " unless defined(&large_sub); +print "ok 3 - large subroutine is defined\n";