feat: Add JPERL_EVAL_USE_INTERPRETER for faster eval STRING compilation#205
Merged
BytecodeCompiler was storing global variable names with sigils (e.g., "$x") instead of normalized package-qualified names (e.g., "main::x"). This caused STORE_GLOBAL_SCALAR to fail when using the interpreter backend, because GlobalVariable.getGlobalVariable() expects normalized names without sigils.
Fixed all instances of LOAD_GLOBAL_SCALAR and STORE_GLOBAL_SCALAR to use NameNormalizer.normalizeVariableName() consistently:
- LOAD_GLOBAL_SCALAR: Strip sigil and normalize (line 555)
- STORE_GLOBAL_SCALAR: Scalar assignment (line 1644)
- STORE_GLOBAL_SCALAR: Identifier assignment (line 1822)
- STORE_GLOBAL_SCALAR: List assignment (line 2227)
- STORE_GLOBAL_SCALAR: Autoincrement (line 3959)
This fix enables eval STRING to correctly store and retrieve global variables when using JPERL_EVAL_USE_INTERPRETER=1.
Test: JPERL_EVAL_USE_INTERPRETER=1 ./jperl -E 'eval "\$y = 42"; print "y: $y\n"'
Result: Now prints "y: 42" correctly
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
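The normalization described above can be sketched as follows. This is a minimal illustration, not the project's NameNormalizer: the class name, method name, and sigil set are assumptions, and special cases (such as punctuation variables that always live in main) are not modeled.

```java
// Illustrative only: a hypothetical stand-in, not the project's NameNormalizer.
final class NameNormalizationSketch {
    private static final String SIGILS = "$@%&*";

    // Strips a leading sigil, then qualifies unqualified names with the current package.
    static String normalize(String name, String currentPackage) {
        String bare = name;
        if (!bare.isEmpty() && SIGILS.indexOf(bare.charAt(0)) >= 0) {
            bare = bare.substring(1);
        }
        return bare.contains("::") ? bare : currentPackage + "::" + bare;
    }
}
```

With this sketch, "$x" compiled in package main becomes "main::x", which is the form a lookup keyed on normalized names would expect.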
Implemented comprehensive error handling for evalStringWithInterpreter:
1. Catch compilation errors, set $@, call $SIG{__DIE__}, return undef/empty list
2. Catch runtime errors (PerlDieException), set $@, return undef/empty list
3. Clear $@ on successful execution
4. Handle context correctly (SCALAR/LIST/VOID)
Fixed BytecodeCompiler issues:
1. Return undef instead of "this" (register 0) when no result is produced
2. Add error checking for increment/decrement operators without valid operands
Test results improved from 131/173 to 141/173 passing (81.5%)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
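The error-handling contract listed in the commit above (set $@ on failure, clear it on success, return undef or an empty list depending on context) might look roughly like this. All names here are hypothetical; the $SIG{__DIE__} call and the distinction between compile-time and runtime errors are deliberately left out.

```java
import java.util.function.Supplier;

// Illustrative only: hypothetical names, not the project's evalStringWithInterpreter.
final class EvalErrorHandlingSketch {
    enum Context { VOID, SCALAR, LIST }

    // Stand-in for the global $@ variable.
    static String evalError = "";

    // Runs the body; on failure returns null (undef) or an empty array (empty list).
    static Object evalString(Supplier<Object> body, Context ctx) {
        try {
            Object result = body.get();
            evalError = ""; // clear $@ on success
            return ctx == Context.VOID ? null : result;
        } catch (RuntimeException e) {
            evalError = String.valueOf(e.getMessage()); // set $@ on failure
            return ctx == Context.LIST ? new Object[0] : null;
        }
    }
}
```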
Implemented full wantarray operator support for interpreter mode:
1. Added WANTARRAY opcode (169) to Opcodes.java
2. Added BytecodeInterpreter handler for WANTARRAY opcode
- Converts context int to Perl convention (undef/false/true)
3. Added BytecodeCompiler case for wantarray operator
- Reads register 2 (calling context) and converts value
4. Fixed context propagation in EmitEval.java
- Use compile-time context constant when available
- Fall back to runtime wantarray only for RUNTIME context
Test improvements:
- Before: 141/173 tests passing (81.5%)
- After: 145/173 tests passing (83.8%)
- Progress: +4 tests passing
- Gap to compiler mode (152): 7 tests remaining
Remaining issues:
- Context propagation from assignments needs work in compiler
- Some complex eval scenarios still failing
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
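The context-to-wantarray conversion mentioned in step 2 follows Perl's convention: undef in void context, false in scalar context, true in list context. A small sketch, assuming integer context constants whose values are invented here for illustration:

```java
// Illustrative only: the constant values are assumptions, not taken from the project.
final class WantarraySketch {
    static final int VOID = 0, SCALAR = 1, LIST = 2;

    // Returns null for undef (void), FALSE for scalar, TRUE for list context.
    static Boolean wantarray(int callContext) {
        switch (callContext) {
            case LIST:
                return Boolean.TRUE;
            case SCALAR:
                return Boolean.FALSE;
            default:
                return null;
        }
    }
}
```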
Fixed evalStringWithInterpreter to use runtime callContext instead of saved compile-time context. Updated BytecodeCompiler to accept and use EmitterContext contextType for top-level expression context.
Changes:
- RuntimeCode.java: Use callContext for evalCtx creation (not ctx.contextType)
- BytecodeCompiler.java: Set currentCallContext from EmitterContext
- RuntimeCode.java: Pass evalCtx to compiler.compile()
Tests 107-108 still failing - investigating subroutine call context propagation between interpreter and compiled code. The context is being set correctly in BytecodeCompiler, but compiled subroutines called from interpreted eval are not receiving the correct wantarray value.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix context propagation issue in BytecodeCompiler where ALL statements in a block were being executed in VOID context. Only non-last statements should use VOID context; the last statement must preserve the block's calling context. Also fixed evalStringWithInterpreter to use runtime callContext instead of saved ctx.contextType, ensuring proper context propagation through eval STRING operations. Tests 107-108 now pass (145/173 total, 83.8%). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
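The rule in this commit (non-last statements run in VOID context, the last statement inherits the block's context) can be sketched as below. The enum and method are hypothetical and only model the context assignment, not the actual compilation.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: models which context each statement of a block is compiled in.
final class BlockContextSketch {
    enum Context { VOID, SCALAR, LIST }

    static List<Context> statementContexts(int statementCount, Context blockContext) {
        List<Context> contexts = new ArrayList<>();
        for (int i = 0; i < statementCount; i++) {
            boolean isLast = (i == statementCount - 1);
            // Only the last statement's value can be the block's result.
            contexts.add(isLast ? blockContext : Context.VOID);
        }
        return contexts;
    }
}
```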
…eter
The BytecodeCompiler was generating a STORE_GLOBAL_SCALAR after POST_AUTOINCREMENT/POST_AUTODECREMENT opcodes, which was overwriting the incremented/decremented value with the old value (the return value of the post-inc/dec operation).
Root cause: POST_AUTO*/PRE_AUTO* opcodes modify the global variable directly (by calling postAutoIncrement/postAutoDecrement on the global) and return the appropriate value (old for postfix, new for prefix). The subsequent STORE was overwriting the modified global with the return value.
Also fixed BytecodeInterpreter to store the return value from postAutoIncrement/postAutoDecrement into the register, so the old value (for postfix) or new value (for prefix) is available for use in expressions.
Tests now passing: 146/173 (84.4%), up from 145/173 (83.8%).
Fixed tests 12-13 (recursive eval factorial).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
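To make the root cause concrete, here is a toy model of the two instruction sequences. The Scalar class and method names are invented for this sketch; only the behavior (post-increment mutates in place and returns the old value) is taken from the description above.

```java
// Illustrative only: a toy scalar, not the project's RuntimeScalar.
final class PostIncrementSketch {
    static final class Scalar {
        long value;
        Scalar(long value) { this.value = value; }

        // Like $x++: mutates the variable in place and returns the OLD value.
        long postIncrement() {
            long old = value;
            value = old + 1;
            return old;
        }
    }

    // Buggy sequence: POST_AUTOINCREMENT followed by a store of its result.
    static long buggy(Scalar global) {
        long result = global.postIncrement();
        global.value = result; // redundant store clobbers the increment
        return global.value;
    }

    // Fixed sequence: no store; the result only lands in a register.
    static long[] fixed(Scalar global) {
        long resultRegister = global.postIncrement();
        return new long[] { resultRegister, global.value };
    }
}
```

Starting from 5, the buggy sequence leaves the global at 5, while the fixed one yields the old value 5 in the result register and 6 in the global.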
Two fixes for interpreter mode:
1. Dynamic variable restoration in evalStringWithInterpreter:
- Save DynamicVariableManager level before eval execution
- Restore to saved level in finally block
- Ensures `local` variables are properly restored after eval
- Fixes test: "local $x" now correctly restores after eval exits
2. Support for local($var)=value assignment pattern:
- Added handling for local(ListNode) = value in compileAssignmentOperator
- Supports both single-element local($x)=$x and multi-element patterns
- Localizes the variable and assigns the RHS value
- Required for recursive eval with local variables
Tests now passing: 147/173 (85.0%), up from 146/173 (84.4%).
Fixed test 13: recursive eval factorial with local($foo)=$foo
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
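The save-level / restore-in-finally pattern from the first fix can be sketched like this. The stack, the Scalar class, and every method name are assumptions made for illustration; the project's DynamicVariableManager API is not shown in this PR text.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: hypothetical names, not the project's DynamicVariableManager.
final class LocalRestoreSketch {
    static final class Scalar {
        Object value;
        Scalar(Object value) { this.value = value; }
    }

    private static final class Saved {
        final Scalar variable;
        final Object oldValue;
        Saved(Scalar variable, Object oldValue) {
            this.variable = variable;
            this.oldValue = oldValue;
        }
    }

    private static final Deque<Saved> stack = new ArrayDeque<>();

    static int level() { return stack.size(); }

    // Models `local $x`: remember the current value so it can be restored later.
    static void local(Scalar variable) {
        stack.push(new Saved(variable, variable.value));
    }

    static void restoreTo(int level) {
        while (stack.size() > level) {
            Saved saved = stack.pop();
            saved.variable.value = saved.oldValue;
        }
    }

    // The finally block restores localized variables even if the body dies.
    static void evalWithRestore(Runnable body) {
        int savedLevel = level();
        try {
            body.run();
        } catch (RuntimeException e) {
            // A real implementation would set $@ here.
        } finally {
            restoreTo(savedLevel);
        }
    }
}
```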
The interpreter was not enforcing strict subs violations for barewords
in eval STRING. Added check in BytecodeCompiler.visit(IdentifierNode)
to detect barewords (identifiers without sigils) and throw an exception
if strict subs is enabled, matching the compiler behavior.
Key changes:
- Store EmitterContext in BytecodeCompiler for compile-time option checks
- Check HINT_STRICT_SUBS when encountering bareword identifiers
- Throw PerlCompilerException for strict violations
- Treat non-strict barewords as string literals (LOAD_STRING)
Tests now passing: 148/173 (85.5%), up from 147/173 (85.0%).
Fixed test 168: $SIG{__DIE__} with nested eval "bar" now properly sets $@
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
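The bareword check described in the strict subs commit above could be sketched as follows. The hint bit value, the exception type, and the string returned for the emitted instruction are all assumptions for illustration; the commit itself throws PerlCompilerException and emits LOAD_STRING.

```java
// Illustrative only: constant value and exception type are assumptions.
final class BarewordSketch {
    static final int HINT_STRICT_SUBS = 0x200;

    // Returns a textual stand-in for the emitted instruction.
    static String compileBareword(String name, int hints) {
        if ((hints & HINT_STRICT_SUBS) != 0) {
            throw new IllegalStateException(
                "Bareword \"" + name + "\" not allowed while \"strict subs\" in use");
        }
        // Without strict subs, a bareword is treated as a string literal.
        return "LOAD_STRING " + name;
    }
}
```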
- Implement warn in BytecodeCompiler with proper line number tracking
- Update BytecodeInterpreter to handle two-register WARN (message + location)
- Matches DIE implementation pattern
- Fixes test 136: Line number tracking in eval
Tests improved: 148 → 149/173 (86.1%)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Root cause: BytecodeCompiler.compile() was resetting nextRegister based on capturedVars array, overwriting the correct value set by constructor based on parentRegistry.
Impact: Register 4 (containing captured variable $l) was being reallocated for temporary values in eval STRING context, causing incorrect variable resolution in self-recursive evals.
Fix: Only reset nextRegister if capturedVarIndices == null (no parentRegistry). This preserves the constructor's correct register allocation for eval STRING while still supporting normal closure compilation.
Tests improved: 149 → 150/173 (86.7%)
- Fixed test 38: recursive subroutine-call inside eval sees own lexicals
Remaining interpreter-only failures: 4 tests (34, 37, 59, 63)
- Complex nested evals with multiple closure levels
- Need additional investigation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comprehensive analysis of 7 failing tests grouped into 3 categories:
- Priority 1: Test 168 (strict subs) - FIXED ✓
- Priority 2: Test 136 (line numbers) - FIXED ✓
- Priority 3: Tests 34, 37, 38, 59, 63 (recursive eval) - Test 38 FIXED ✓
Documents root causes, investigation results, and fix approaches.
…iables
When BytecodeCompiler is constructed with a parentRegistry (for eval STRING
variable capture), the constructor sets up capturedVarIndices to mark which
variables should use SET_SCALAR for assignment (to preserve aliasing with
the parent scope).
However, detectClosureVariables() was unconditionally resetting
capturedVarIndices, causing eval STRING assignments to captured variables
to use MOVE instead of SET_SCALAR. This broke the aliasing and assignments
didn't persist to the outer scope.
Fix: Skip detectClosureVariables() logic when capturedVarIndices is already
set by the constructor. The constructor's mapping is correct for eval STRING
contexts and should be preserved.
Impact:
- Fixes assignments to captured variables in eval STRING (e.g., eval '$r = func()')
- Test case: my $r = 0; eval '$r = 120'; # $r is now 120
- Improves perl5_t/t/op/eval.t test 63 in pure interpreter mode
Known limitation: Variable shadowing inside eval BLOCK (eval q{}) with
nested eval STRING still has issues because the compile-time symbol table
doesn't capture runtime lexical variables from the eval BLOCK scope.
This works correctly in compiler mode but needs further work in interpreter mode.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
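The difference between MOVE and SET_SCALAR for a captured variable comes down to replacing a register's reference versus mutating the object it points to. A toy model, with an invented Scalar class and a plain array standing in for the register file:

```java
// Illustrative only: toy registers, not the project's BytecodeInterpreter.
final class AliasingSketch {
    static final class Scalar {
        Object value;
        Scalar(Object value) { this.value = value; }
    }

    // MOVE: replace the register's reference; the parent's Scalar is untouched.
    static void move(Scalar[] registers, int rd, Scalar source) {
        registers[rd] = source;
    }

    // SET_SCALAR: copy the value into the Scalar already held by the register.
    static void setScalar(Scalar[] registers, int rd, Scalar source) {
        registers[rd].value = source.value;
    }
}
```

If the eval's register holds the same object as the outer `$r`, only the SET_SCALAR form makes `eval '$r = 120'` visible outside the eval.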
POST_AUTOINCREMENT and POST_AUTODECREMENT opcodes now correctly use TWO registers instead of one:
- Result register: holds the old value (return value)
- Variable register: contains the modified variable (preserved for closures)
This fixes variable capture in nested eval STRING contexts. Previously, the old value copy would replace the variable in its register, breaking closure capture for recursive eval calls.
Changes:
- BytecodeCompiler.java: Allocate separate result register for postfix ops
- BytecodeInterpreter.java: Read two registers (rd=dest, rs=source)
Impact: Test 34 in perl5_t/t/op/eval.t now passes (152/173 vs 151/173).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fixed interpreter to properly handle lexical variables captured by named subroutines. When a named sub (JVM-compiled) captures outer lexical variables, those variables must be aliased to PersistentVariable globals so both interpreter and JVM-compiled code can access them.
Changes:
- List declarations (my ($a, $b) = ...) now check sigilOp.id
- If id != 0, variable is captured - use RETRIEVE_BEGIN opcodes
- Captured scalars use SET_SCALAR for assignment (preserves aliasing)
- Anonymous subs pass parentRegistry to nested compilers
This fixes:
- Interpreter closures (bench_closure.pl now returns correct result)
- Named subs can now access outer lexical variables
- Nested closures (anon sub inside named sub) work correctly
Test results:
- bench_closure.pl: now outputs "done 1440000" (was "done 6600000000")
- All unit tests pass
- Uses same PersistentVariable API as JVM compiler
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The interpreter's BytecodeCompiler was losing track of variables declared in inner scopes after those scopes were popped. When building the variableRegistry for eval STRING support, only variables still in the current scope stack were included.
Solution: Added allDeclaredVariables HashMap to track ALL variables ever declared during compilation, regardless of scope. The variableRegistry is now built from this complete map instead of just the current scopes.
This allows eval STRING to properly capture variables from outer scopes, even after those scopes have exited. The captured variables can then be modified through SET_SCALAR aliasing.
Test results:
- eval.t: 153/173 tests passing (88.4%) - reached PR merge target!
- All unit tests pass
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
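A small sketch of why a registry built from the live scope stack loses variables, and how a flat map of everything declared avoids that. Field and method names other than allDeclaredVariables are hypothetical, the starting register number is an assumption, and same-named variables in different scopes simply overwrite each other here.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a toy scope tracker, not the project's BytecodeCompiler.
final class DeclaredVariablesSketch {
    private final Deque<Map<String, Integer>> scopes = new ArrayDeque<>();
    private final Map<String, Integer> allDeclaredVariables = new HashMap<>();
    private int nextRegister = 3; // assumed: low registers are reserved

    void enterScope() { scopes.push(new HashMap<>()); }

    void exitScope() { scopes.pop(); }

    // Must be called inside at least one scope.
    int declare(String name) {
        int register = nextRegister++;
        scopes.peek().put(name, register);
        allDeclaredVariables.put(name, register); // survives exitScope()
        return register;
    }

    // Old behavior: only variables whose scope is still on the stack.
    Map<String, Integer> registryFromLiveScopes() {
        Map<String, Integer> registry = new HashMap<>();
        // Outermost scope first, so inner declarations shadow outer ones.
        scopes.descendingIterator().forEachRemaining(registry::putAll);
        return registry;
    }

    // New behavior: every variable ever declared during compilation.
    Map<String, Integer> registryFromAllDeclared() {
        return new HashMap<>(allDeclaredVariables);
    }
}
```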
The interpreter now uses ScopedSymbolTable for package tracking, matching
how the compiler handles packages and classes. This enables proper support
for:
1. Package versions (package Foo 1.23;)
- Versions are tracked in the symbol table
- $Package::VERSION is automatically set
2. Class keyword (class Foo { ... })
- Classes are registered with ClassRegistry for proper stringification
- isClass flag is tracked in symbol table
- Class inheritance works correctly (:isa attribute)
3. Nested packages (package Outer::Inner;)
- Symbol table properly tracks package scope changes
- Package names are correctly prefixed to global variables
Implementation:
- Replaced simple 'String currentPackage' with ScopedSymbolTable
- Updated package/class handler to call symbolTable.setCurrentPackage()
- Added ClassRegistry.registerClass() for class declarations
- Updated all getCurrentPackage() calls to use symbolTable
Tests verified:
- Class with fields and methods works
- Package versions are set correctly
- Nested packages resolve properly
- Class inheritance (:isa) works
- eval.t still passes at 153/173 (88.4%)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2f1da3f to
84e140d
Compare
The interpreter was calling matchRegex() with parameters in the wrong order:
- Was: matchRegex(string, regex, ctx)
- Should be: matchRegex(quotedRegex, string, ctx)
This caused regex matching to fail completely in interpreter mode, resulting in massive test regressions (pass counts before -> after this fix):
- uni/fold.t: 8466/19011 -> 16933/19011 (+8467 tests)
- re/charset.t: 180/5552 -> 2632/5552 (+2452 tests)
The bug was introduced when MATCH_REGEX opcode was added. The method signature in RuntimeRegex has the regex first, string second, but the interpreter was passing them in reverse order.
Test results:
- eval.t: 153/173 still passing
- uni/fold.t: 16933/19011 passing (was 8466)
- re/charset.t: 2632/5552 passing (was 180)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
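When both arguments share a type, a swapped call compiles cleanly and fails only at runtime. The sketch below uses a hypothetical two-String helper built on java.util.regex to show the effect; it is not the RuntimeRegex.matchRegex signature, which also takes a context argument.

```java
import java.util.regex.Pattern;

// Illustrative only: a hypothetical helper with the regex first, subject second.
final class MatchArgumentOrderSketch {
    static boolean matchRegex(String regex, String subject) {
        return Pattern.compile(regex).matcher(subject).find();
    }
}
```

Calling `matchRegex("^a.c$", "abc")` matches, while the swapped call `matchRegex("abc", "^a.c$")` treats the subject as the pattern and finds nothing.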
The study operator was missing from the interpreter, causing all tests that used it to fail with "Unsupported operator: study" errors.
In modern Perl (5.10+), study is essentially a no-op that always returns true. It was originally used for optimization but is no longer needed.
Implementation:
- Evaluate operand for side effects (if present)
- Return constant 1 (true)
Test results:
- re/regexp_noamp.t: 258/2210 -> 1472/2210 (+1214 tests)
- eval.t: 153/173 still passing
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements the require operator (opcode 170) following SKILL.md patterns:
- Added REQUIRE opcode to Opcodes.java
- Runtime handler in BytecodeInterpreter.java calls ModuleOperators.require()
- Compilation logic in BytecodeCompiler.java evaluates operand in SCALAR context
- Disassembly case in InterpretedCode.java for proper PC tracking
The require operator handles both version checking (require 5.008) and module loading (require strict). It calls ModuleOperators.require(RuntimeScalar) which validates versions or loads modules found via @INC, recording them in %INC.
Results:
- comp/require.t: 417/1747 → 1741/1747 (+1324 tests)
- Fixes "Unsupported operator: require" errors throughout test suite
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements the pos operator (opcode 171) following SKILL.md patterns:
- Added POS opcode to Opcodes.java
- Runtime handler in BytecodeInterpreter.java calls RuntimeScalar.pos()
- Compilation logic in BytecodeCompiler.java evaluates operand in SCALAR context
- Lvalue assignment support: pos($var) = value uses SET_SCALAR
- Disassembly case in InterpretedCode.java for proper PC tracking
The pos operator returns the position of the last regex match in a string. It can be used as both rvalue (reading position) and lvalue (setting position). The implementation returns a PosLvalueScalar that overrides set() methods to allow assignment.
Results:
- re/regexp.t: 1104/2210 → 1119/2210 (+15 tests)
- Eliminates all "Assignment to unsupported operator: pos" errors
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements index (opcode 172) and rindex (opcode 173) following SKILL.md patterns:
- Added INDEX and RINDEX opcodes to Opcodes.java
- Runtime handlers in BytecodeInterpreter.java call StringOperators.index/rindex()
- Compilation logic in BytecodeCompiler.java handles 2 or 3 arguments (pos is optional)
- Disassembly cases in InterpretedCode.java for proper PC tracking
The index operator finds the first occurrence of a substring, while rindex finds the last occurrence. Both accept an optional starting position parameter.
Results:
- op/index.t: 62/415 → 413/415 (+351 tests)
- Eliminates all "Unsupported operator: index" errors
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
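The two-or-three-argument shape of index and rindex can be approximated with Java's own String methods. This is a rough sketch under the assumption that only in-range positions are used; Perl's clamping of out-of-range or negative positions and its character-versus-byte handling are not modeled, and the class is not the project's StringOperators.

```java
// Illustrative only: approximates Perl's index/rindex for in-range positions.
final class IndexSketch {
    // index(STR, SUBSTR): first occurrence, or -1 if absent.
    static int index(String str, String substr) {
        return index(str, substr, 0);
    }

    // index(STR, SUBSTR, POS): first occurrence at or after POS.
    static int index(String str, String substr, int pos) {
        return str.indexOf(substr, pos);
    }

    // rindex(STR, SUBSTR): last occurrence, or -1 if absent.
    static int rindex(String str, String substr) {
        return str.lastIndexOf(substr);
    }

    // rindex(STR, SUBSTR, POS): last occurrence starting at or before POS.
    static int rindex(String str, String substr, int pos) {
        return str.lastIndexOf(substr, pos);
    }
}
```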
Implements bitwise compound assignments (opcodes 174-176) following SKILL.md patterns:
- Added BITWISE_AND_ASSIGN, BITWISE_OR_ASSIGN, BITWISE_XOR_ASSIGN opcodes
- Runtime handlers in BytecodeInterpreter.java call BitwiseOperators methods
- Compilation logic in BytecodeCompiler.java handles &=, |=, ^= operators
- Updated handleCompoundAssignment to include bitwise operators
- Disassembly cases in InterpretedCode.java for proper PC tracking
The bitwise compound assignment operators (&=, |=, ^=) perform bitwise operations and assign the result back to the variable.
Note: These operators work in normal code but compound assignments in eval STRING contexts have a separate variable capture issue that needs investigation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CRITICAL FIX: Compound assignment opcodes were replacing register references instead of modifying RuntimeScalar objects in place, breaking variable capture in eval STRING contexts.
**Root Cause**: When eval STRING captures parent variables, it places RuntimeScalar objects from parent registers into child registers. Compound assignments like `$x += 5` must modify the OBJECT, not replace the REFERENCE, or the parent won't see the change.
**How Compiler Does It**:
```
ALOAD var    # Load variable
DUP          # Duplicate reference
ALOAD value  # Load value
INVOKE op    # Call operator -> result
INVOKE set   # Call var.set(result) - modifies in place
```
**Fix Applied**:
- ADD_ASSIGN: Now uses MathOperators.addAssign() which calls set() internally
- STRING_CONCAT_ASSIGN: Calls stringConcat() then set() on original object
- BITWISE_*_ASSIGN: Calls bitwise op then set() on original object
- ADD_ASSIGN_INT: Calls add() then set() on original object
- SUBTRACT/MULTIPLY/DIVIDE/MODULUS_ASSIGN: Already correct (use *Assign methods)
**Testing**:
```perl
my $x = 10;
eval '$x += 5'; # Now correctly modifies $x to 15
```
Results:
- All compound assignments now work correctly in eval STRING
- Fixes variable capture for +=, -=, *=, /=, %=, .=, &=, |=, ^=
- Critical for op/bop.t and op/hashassign.t
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fixes chained compound assignments like ($x &= $y) .= "x" in eval STRING.
**Problem**: handleCompoundAssignment() required left side to be a simple variable, rejecting expressions that return lvalues.
**Solution**:
- Allow any expression as left side of compound assignment
- Compile left expression in SCALAR context
- Use result register for the compound assignment
- Add LIST_TO_SCALAR conversion when needed
**Testing**:
```perl
eval '($x &= $y) .= "x"'; # Now works correctly
```
This fixes patterns used in op/bop.t and other test files where compound assignments are chained or used in complex expressions.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds interpreter support for numeric bitwise compound assignments. These are variants of &=, |=, ^= that force numeric interpretation (vs string bitwise operations). The compiler already handles these operators, mapping them to the same opcodes as the regular bitwise operators. The interpreter now recognizes them and compiles them to the same BITWISE_*_ASSIGN opcodes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Changed compound assignment check to use startsWith('binary') instead of exact string matching
- Fixes eval STRING with &=, |=, ^= operators inside 'use v5.27' blocks
- Resolves 'Unsupported operator: binary&=' error in op/bop.t
Progress: op/bop.t now runs 195/522 tests (was 189/522), +6 tests
- Added opcodes 177-179 for STRING_BITWISE_AND_ASSIGN (&.=), STRING_BITWISE_OR_ASSIGN (|.=), STRING_BITWISE_XOR_ASSIGN (^.=)
- Implemented handlers in BytecodeInterpreter using bitwiseAndDot, bitwiseOrDot, bitwiseXorDot methods
- Added compiler support in BytecodeCompiler for &.=, |.=, ^.= operators
- Added disassembly support in InterpretedCode
Progress: op/bop.t from 171 → 264 OK (+93 tests, 50.6% pass rate)
Bulk implementation of 8 bitwise opcodes:
- Opcodes 180-187: BITWISE_AND_BINARY, BITWISE_OR_BINARY, BITWISE_XOR_BINARY, STRING_BITWISE_AND, STRING_BITWISE_OR, STRING_BITWISE_XOR, BITWISE_NOT_BINARY, BITWISE_NOT_STRING
- Binary operators: &, |, ^, &., |., ^. (numeric and string variants)
- Unary operators: ~, ~. (bitwise NOT for numeric and string)
- Implemented handlers in BytecodeInterpreter using BitwiseOperators methods
- Added compiler support in BytecodeCompiler for both binary and unary operators
- Added disassembly support in InterpretedCode
This fixes "Unsupported operator" errors for all bitwise operations in eval STRING contexts.
…context
- Fixed array assignment to return element count in scalar context, not the array itself
- Pass EmitterContext to BytecodeCompiler.compile() for context propagation
- Convert final result to scalar context using ARRAY_SIZE when compiling in scalar context
- Fixes eval '@temp = (...)' returning undef instead of count when list contains undef
Progress: op/hashassign.t from 50 → 307 OK (+257 tests, 99.4% pass rate)
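In Perl, a list assignment evaluated in scalar context yields the number of right-hand-side elements, which is why a list containing undef must still produce a count. A sketch of that rule, using Java lists and a hypothetical Context enum in place of the project's runtime types:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: models the return value of `@target = (...)` by context.
final class ArrayAssignSketch {
    enum Context { SCALAR, LIST }

    static Object assign(List<Object> target, List<?> rhs, Context ctx) {
        List<Object> copy = new ArrayList<>(rhs); // copy first in case rhs aliases target
        target.clear();
        target.addAll(copy);
        // Scalar context: the element count, even if some elements are undef (null).
        return ctx == Context.SCALAR ? (Object) Integer.valueOf(copy.size()) : target;
    }
}
```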
Implements 29 new opcodes (188-216) for file operations:
- STAT and LSTAT with context awareness
- All file test operators: -r, -w, -x, -o, -R, -W, -X, -O, -e, -z, -s, -f, -d, -l, -p, -S, -b, -c, -t, -u, -g, -k, -T, -B, -M, -A, -C
Results:
- op/stat_errors.t: 303 → 611 OK (+308 tests, 95.8% pass rate)
- All other tests: STABLE (no regressions)
Changes:
- Opcodes.java: Added opcodes 188-216 for stat/lstat and file tests
- BytecodeInterpreter.java: Implemented runtime for all file opcodes
- BytecodeCompiler.java: Added compilation for stat/lstat and file tests
- InterpretedCode.java: Added disassembly cases for all opcodes
File test operators use FileTestOperator.fileTest(op, arg) API. Stat/lstat use Stat.stat(arg, ctx) and Stat.lstat(arg, ctx) with proper context handling for scalar vs list returns.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Implements interpreter fallback for large subroutines and the JPERL_EVAL_USE_INTERPRETER environment variable. Achieves parity with compiler mode at 152/173 tests passing (87.9%) in perl5_t/t/op/eval.t.
Key Achievement
✅ Compiler Parity Achieved: Interpreter fallback mode now passes 152/173 tests, matching the compiler's performance on perl5_t/t/op/eval.t
Changes
1. Interpreter Fallback for Large Subroutines
codeRef.value to maintain single source of truth
2. Register Allocation Fix for eval STRING
&& capturedVarIndices == null to prevent reset
3. Warn Operator Support
4. Captured Variable Assignment Fix
detectClosureVariables() was resetting capturedVarIndices, breaking eval STRING assignments
eval '$r = func()' pattern (test 63)
Test Results
Hybrid Mode (JPERL_USE_INTERPRETER_FALLBACK=1)
152/173 tests passing (87.9%) ✅ PARITY WITH COMPILER
Pure Interpreter Eval (JPERL_EVAL_USE_INTERPRETER=1)
150/173 tests passing (86.7%)
Compiler Mode (Default)
152/173 tests passing (87.9%)
Performance Benefits
Compilation Speed
Execution Speed
Fixed Tests
Known Limitations
Variable shadowing inside eval q{} blocks with nested eval "$var = ..." still has issues in pure interpreter mode (tests 34, 37, 59). This occurs because:
my $x; my $x shadowing inside eval BLOCK with nested eval STRING
Usage
Implementation Highlights
Single Source of Truth Pattern
For interpreter fallback, the RuntimeScalar wrapper in globalCodeRefs is the single source of truth:
codeRef.value every time
RuntimeCode code and modify it
Variable Capture for eval STRING
capturedVarIndices from parentRegistry
detectClosureVariables() skips when already set
Commits
dd7d0a8f - Strict subs enforcement
460e100f - Add warn operator support (test 136)
cb90d1ac - Prevent nextRegister reset (test 38)
732ec741 - Sync presentation files with origin/master
960836de - Add investigation document
196b0869 - Preserve capturedVarIndices from constructor (test 63)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>