diff --git a/dev/architecture/large-code-refactoring.md b/dev/architecture/large-code-refactoring.md index 4bfe3c600..315f8ccff 100644 --- a/dev/architecture/large-code-refactoring.md +++ b/dev/architecture/large-code-refactoring.md @@ -4,7 +4,7 @@ PerlOnJava uses a **two-tier strategy** to handle Perl code that exceeds the JVM's 65,535-byte method size limit: -1. **Proactive**: During codegen, large blocks are detected and wrapped in closure calls to split them across multiple JVM methods +1. **Proactive**: During codegen, large blocks are detected and wrapped in a closure call to push them into a separate JVM method 2. **Reactive fallback**: If ASM still produces a method that's too large, the code is compiled using the bytecode interpreter backend instead ## The Problem @@ -13,7 +13,7 @@ The JVM limits each method to 65,535 bytes of bytecode. PerlOnJava compiles each ### Closure Scoping Complication -The natural fix is to split large blocks into chunks wrapped in anonymous subs: `sub { ...chunk... }->(@_)`. However, this changes lexical scoping. When `use` or `require` statements are wrapped in closures, their imports happen in the closure's scope instead of the package scope: +The natural fix is to wrap large blocks in anonymous subs: `sub { ...block... }->(@_)`. However, this changes lexical scoping. When `use` or `require` statements are wrapped in closures, their imports happen in the closure's scope instead of the package scope: ```perl # Original code @@ -31,7 +31,7 @@ my $x = $Config{foo}; # ERROR: %Config not in scope This is why proactive refactoring skips subroutines, special blocks (BEGIN/END/INIT/CHECK/UNITCHECK), and blocks with unsafe control flow. 
-## Tier 1: Proactive Block Refactoring +## Tier 1: Proactive Block Wrapping ### Entry Point @@ -52,7 +52,13 @@ EmitBlock.emitBlock(visitor, blockNode) └── Return false → normal block emission continues ``` -Wrapping pushes the block's code into a separate JVM method (the anonymous sub body), giving it its own 64KB budget. +Wrapping pushes the block's code into a separate JVM method (the anonymous sub body), giving it its own 64KB budget. This effectively doubles the available space for that block. + +### Limitations + +The wrapping is a **single-level** operation — it wraps the entire block in one closure. It does not recursively split the block into smaller chunks. This means: +- For blocks up to ~2x the 64KB limit, wrapping succeeds (the block fits in the new method) +- For blocks larger than ~2x the limit, wrapping is insufficient and the `MethodTooLargeException` still occurs, triggering Tier 2 ### Thresholds @@ -63,12 +69,12 @@ Wrapping pushes the block's code into a separate JVM method (the anonymous sub b ### Key Classes -- **`BlockRefactor`** (`backend/jvm/astrefactor/BlockRefactor.java`) — Utility methods: `createAnonSubCall()` creates `sub { ... }->(@_)` AST nodes, `buildNestedStructure()` builds nested tail-closure chains, `createBlockNode()` with anti-recursion guard +- **`BlockRefactor`** (`backend/jvm/astrefactor/BlockRefactor.java`) — Constants and `createAnonSubCall()` utility that creates `sub { ... }->(@_)` AST nodes - **`LargeBlockRefactorer`** (`backend/jvm/astrefactor/LargeBlockRefactorer.java`) — Orchestrates block-level refactoring: size estimation, control flow safety checks, whole-block wrapping ## Tier 2: Interpreter Fallback -When the proactive refactoring is insufficient (or skipped due to unsafe control flow), ASM may still throw `MethodTooLargeException`. The fallback catches this and compiles the code using the bytecode interpreter instead. 
+When the proactive wrapping is insufficient (or skipped due to unsafe control flow), ASM throws `MethodTooLargeException`. The fallback catches this and compiles the code using the bytecode interpreter instead. ### Flow @@ -94,7 +100,7 @@ The fallback also handles other compilation failures (`VerifyError`, `ClassForma When fallback is triggered with `JPERL_SHOW_FALLBACK=1`: ``` -Note: Method too large after AST splitting, using interpreter backend. +Note: Method too large, using interpreter backend. ``` ## Technical Details @@ -106,27 +112,14 @@ Note: Method too large after AST splitting, using interpreter backend. ### Refactoring Strategy 1. **Whole-block wrapping**: The entire block becomes `sub { }->(@_)` 2. **`@_` passthrough**: Arguments are forwarded so the wrapper is transparent -3. **Anti-recursion guard**: `BlockRefactor.createBlockNode()` sets a thread-local `skipRefactoring` flag to prevent infinite recursion when the wrapper's BlockNode is constructed +3. **Anti-recursion guard**: `blockAlreadyRefactored` annotation prevents infinite recursion when the wrapper's BlockNode is processed 4. **Safe boundaries**: Blocks with unlabeled control flow (`next`/`last`/`redo`/`goto` outside loops) are not refactored, since these would break when wrapped in a closure -### Dead Code - -The codebase contains remnants of a former retry-based approach that was replaced by the interpreter fallback: - -| Dead Code | Purpose (unused) | -|-----------|-----------------| -| `LargeBlockRefactorer.forceRefactorForCodegen()` | Was meant for reactive retry after MethodTooLargeException | -| `LargeBlockRefactorer.trySmartChunking()` | Sophisticated chunking algorithm (only called by dead code above) | -| `DepthFirstLiteralRefactorVisitor` (entire class) | Depth-first literal refactoring (marked OBSOLETE in design docs) | -| `LargeNodeRefactorer` (entire class) | Element list chunking (only called by dead code above) | - -These are candidates for removal. 
- ## Implementation Files | File | Role | |------|------| -| `backend/jvm/astrefactor/BlockRefactor.java` | Constants, closure-wrapping utilities | +| `backend/jvm/astrefactor/BlockRefactor.java` | Constants, closure-wrapping utility | | `backend/jvm/astrefactor/LargeBlockRefactorer.java` | Block-level proactive refactoring | | `backend/jvm/EmitBlock.java` | Calls `processBlock()` during block emission | | `backend/jvm/EmitterMethodCreator.java` | Catches `MethodTooLargeException`, triggers interpreter fallback | diff --git a/dev/custom_bytecode/STATUS.md b/dev/custom_bytecode/STATUS.md index f1bf1c828..97861f166 100644 --- a/dev/custom_bytecode/STATUS.md +++ b/dev/custom_bytecode/STATUS.md @@ -36,7 +36,7 @@ Show diagnostic messages when compilation paths are taken: export JPERL_SHOW_FALLBACK=1 ./jperl script.pl # Output: "Note: JVM compilation succeeded." -# Or: "Note: Method too large after AST splitting, using interpreter backend." +# Or: "Note: Method too large, using interpreter backend." 
``` ### JPERL_EVAL_USE_INTERPRETER diff --git a/src/main/java/org/perlonjava/app/scriptengine/PerlLanguageProvider.java b/src/main/java/org/perlonjava/app/scriptengine/PerlLanguageProvider.java index e16484ded..82fde36f4 100644 --- a/src/main/java/org/perlonjava/app/scriptengine/PerlLanguageProvider.java +++ b/src/main/java/org/perlonjava/app/scriptengine/PerlLanguageProvider.java @@ -510,7 +510,7 @@ private static RuntimeCode compileToExecutable(Node ast, EmitterContext ctx) thr if (needsInterpreterFallback(e)) { boolean showFallback = System.getenv("JPERL_SHOW_FALLBACK") != null; if (showFallback) { - System.err.println("Note: Method too large after AST splitting, using interpreter backend."); + System.err.println("Note: Method too large, using interpreter backend."); } if (CompilerOptions.DEBUG_ENABLED) ctx.logDebug("Falling back to bytecode interpreter due to method size"); diff --git a/src/main/java/org/perlonjava/backend/jvm/EmitterMethodCreator.java b/src/main/java/org/perlonjava/backend/jvm/EmitterMethodCreator.java index 4a7de4593..b1470199b 100644 --- a/src/main/java/org/perlonjava/backend/jvm/EmitterMethodCreator.java +++ b/src/main/java/org/perlonjava/backend/jvm/EmitterMethodCreator.java @@ -1529,7 +1529,7 @@ public static RuntimeCode createRuntimeCode( } catch (MethodTooLargeException e) { if (USE_INTERPRETER_FALLBACK) { if (SHOW_FALLBACK) { - System.err.println("Note: Method too large after AST splitting, using interpreter backend."); + System.err.println("Note: Method too large, using interpreter backend."); } return compileToInterpreter(ast, ctx, useTryCatch); } diff --git a/src/main/java/org/perlonjava/backend/jvm/astrefactor/BlockRefactor.java b/src/main/java/org/perlonjava/backend/jvm/astrefactor/BlockRefactor.java index 6282abe54..ed2824e91 100644 --- a/src/main/java/org/perlonjava/backend/jvm/astrefactor/BlockRefactor.java +++ b/src/main/java/org/perlonjava/backend/jvm/astrefactor/BlockRefactor.java @@ -28,113 +28,4 @@ public static 
BinaryOperatorNode createAnonSubCall(int tokenIndex, BlockNode nes tokenIndex ); } - - /** - * Builds nested closure structure from segments. - * Structure: direct1, direct2, sub{ chunk1, sub{ chunk2, chunk3 }->(@_) }->(@_) - * Closures are always placed at tail position to preserve variable scoping. - * - * @param segments List of segments (either Node for direct elements or List for chunks) - * @param tokenIndex token index for new nodes - * @param minChunkSize minimum size for a chunk to be wrapped in a closure - * @param returnTypeIsList if true, wrap elements in ListNode to return list; if false, execute statements - * @param skipRefactoring thread-local flag to prevent recursion during BlockNode construction - * @return List of processed elements with nested structure - */ - @SuppressWarnings("unchecked") - public static List buildNestedStructure( - List segments, - int tokenIndex, - int minChunkSize, - boolean returnTypeIsList, - ThreadLocal skipRefactoring) { - if (segments.isEmpty()) { - return new ArrayList<>(); - } - - int firstBigIndex = -1; - int endExclusive = segments.size(); - Node tailClosure = null; - - for (int i = segments.size() - 1; i >= 0; i--) { - Object segment = segments.get(i); - if (!(segment instanceof List)) { - continue; - } - List chunk = (List) segment; - if (chunk.size() < minChunkSize) { - continue; - } - - firstBigIndex = i; - - List blockElements = new ArrayList<>(); - blockElements.addAll(chunk); - for (int s = i + 1; s < endExclusive; s++) { - Object seg = segments.get(s); - if (seg instanceof Node directNode) { - blockElements.add(directNode); - } else { - blockElements.addAll((List) seg); - } - } - if (tailClosure != null) { - blockElements.add(tailClosure); - } - - List wrapped = returnTypeIsList ? 
wrapInListNode(blockElements, tokenIndex) : blockElements; - BlockNode block = createBlockNode(wrapped, tokenIndex, skipRefactoring); - tailClosure = createAnonSubCall(tokenIndex, block); - - endExclusive = i; - } - - if (tailClosure == null) { - List result = new ArrayList<>(); - for (Object segment : segments) { - if (segment instanceof Node directNode) { - result.add(directNode); - } else { - result.addAll((List) segment); - } - } - return result; - } - - List result = new ArrayList<>(); - for (int s = 0; s < firstBigIndex; s++) { - Object seg = segments.get(s); - if (seg instanceof Node directNode) { - result.add(directNode); - } else { - result.addAll((List) seg); - } - } - result.add(tailClosure); - return result; - } - - /** - * Wraps elements in a ListNode to ensure the closure returns a list of elements. - */ - private static List wrapInListNode(List elements, int tokenIndex) { - ListNode listNode = new ListNode(elements, tokenIndex); - listNode.setAnnotation("chunkAlreadyRefactored", true); - return List.of(listNode); - } - - /** - * Creates a BlockNode using thread-local flag to prevent recursion. 
- */ - private static BlockNode createBlockNode(List elements, int tokenIndex, ThreadLocal skipRefactoring) { - BlockNode block; - skipRefactoring.set(true); - try { - block = new BlockNode(elements, tokenIndex); - } finally { - skipRefactoring.set(false); - } - return block; - } - } diff --git a/src/main/java/org/perlonjava/backend/jvm/astrefactor/LargeBlockRefactorer.java b/src/main/java/org/perlonjava/backend/jvm/astrefactor/LargeBlockRefactorer.java index b3c4f7bff..2f019ac46 100644 --- a/src/main/java/org/perlonjava/backend/jvm/astrefactor/LargeBlockRefactorer.java +++ b/src/main/java/org/perlonjava/backend/jvm/astrefactor/LargeBlockRefactorer.java @@ -2,17 +2,11 @@ import org.perlonjava.frontend.analysis.BytecodeSizeEstimator; import org.perlonjava.frontend.analysis.ControlFlowDetectorVisitor; -import org.perlonjava.frontend.analysis.ControlFlowFinder; import org.perlonjava.frontend.analysis.EmitterVisitor; import org.perlonjava.frontend.astnode.BinaryOperatorNode; import org.perlonjava.frontend.astnode.BlockNode; -import org.perlonjava.frontend.astnode.LabelNode; import org.perlonjava.frontend.astnode.Node; -import org.perlonjava.frontend.parser.Parser; -import java.util.ArrayDeque; -import java.util.ArrayList; -import java.util.Deque; import java.util.List; import static org.perlonjava.backend.jvm.astrefactor.BlockRefactor.*; @@ -20,26 +14,19 @@ /** * Helper class for refactoring large blocks to avoid JVM's "Method too large" error. *
<p>
- * This class encapsulates all logic for detecting and transforming large blocks, - * including smart chunking strategies and control flow analysis. + * When a block's estimated bytecode size exceeds {@link BlockRefactor#LARGE_BYTECODE_SIZE}, + * the entire block is wrapped in an anonymous sub call: {@code sub { }->(@_)}. + * This pushes the block's code into a separate JVM method with its own 64KB budget. + *
<p>
+ * If wrapping is insufficient (the block is still too large for a single method), + * the caller ({@link org.perlonjava.backend.jvm.EmitterMethodCreator}) catches the + * resulting {@code MethodTooLargeException} and falls back to the interpreter backend. */ public class LargeBlockRefactorer { // Reusable visitor for control flow detection private static final ControlFlowDetectorVisitor controlFlowDetector = new ControlFlowDetectorVisitor(); - // Thread-local flag to prevent recursion when creating chunk blocks - private static final ThreadLocal skipRefactoring = ThreadLocal.withInitial(() -> false); - - private static final ThreadLocal controlFlowFinderTl = ThreadLocal.withInitial(ControlFlowFinder::new); - - private static final int FORCE_REFACTOR_ELEMENT_COUNT = 50000; - private static final int TARGET_CHUNK_BYTECODE_SIZE = LARGE_BYTECODE_SIZE / 2; - - private static final int MAX_REFACTOR_ATTEMPTS = 3; - private static final ThreadLocal> pendingRefactorBlocks = ThreadLocal.withInitial(ArrayDeque::new); - private static final ThreadLocal processingPendingRefactors = ThreadLocal.withInitial(() -> false); - private static long estimateTotalBytecodeSizeCapped(List nodes, long capInclusive) { long total = 0; for (Node node : nodes) { @@ -54,87 +41,16 @@ private static long estimateTotalBytecodeSizeCapped(List nodes, long capIn return total; } - private static int findChunkStartByEstimatedSize(List elements, - int safeRunStart, - int safeRunEndExclusive, - long suffixEstimatedSize, - int minChunkSize) { - int chunkStart = safeRunEndExclusive; - long chunkEstimatedSize = 0; - while (chunkStart > safeRunStart) { - Node candidate = elements.get(chunkStart - 1); - long candidateSize = candidate == null ? 
0 : BytecodeSizeEstimator.estimateSnippetSize(candidate); - int candidateChunkLen = safeRunEndExclusive - (chunkStart - 1); - if (candidateChunkLen < minChunkSize) { - chunkStart--; - chunkEstimatedSize += candidateSize; - continue; - } - if (chunkEstimatedSize + candidateSize + suffixEstimatedSize <= TARGET_CHUNK_BYTECODE_SIZE) { - chunkStart--; - chunkEstimatedSize += candidateSize; - continue; - } - break; - } - - if (safeRunEndExclusive - chunkStart < minChunkSize) { - chunkStart = Math.max(safeRunStart, safeRunEndExclusive - minChunkSize); - } - return chunkStart; - } - - private static void processPendingRefactors() { - if (processingPendingRefactors.get()) { - return; - } - processingPendingRefactors.set(true); - Deque queue = pendingRefactorBlocks.get(); - try { - while (!queue.isEmpty()) { - BlockNode block = queue.removeFirst(); - } - } finally { - queue.clear(); - processingPendingRefactors.set(false); - } - } - - /** - * Force refactoring of a block that has already reached codegen and failed with MethodTooLargeException. - * This is called during automatic error recovery. - * - * @param node The block to refactor (modified in place) - */ - public static void forceRefactorForCodegen(BlockNode node) { - if (node == null) { - return; - } - Object attemptsObj = node.getAnnotation("refactorAttempts"); - int attempts = attemptsObj instanceof Integer ? (Integer) attemptsObj : 0; - if (attempts >= MAX_REFACTOR_ATTEMPTS) { - return; - } - node.setAnnotation("refactorAttempts", attempts + 1); - - // The estimator can under-estimate; if we reached codegen overflow, we must allow another pass. - node.setAnnotation("blockAlreadyRefactored", false); - - // More aggressive than parse-time: allow deeper nesting to ensure we get under the JVM limit. - trySmartChunking(node, null, 256); - processPendingRefactors(); - } - /** * Process a block and refactor it if necessary to avoid method size limits. 
- * This is the code-generation time entry point (legacy, kept for compatibility). + * Called from {@link org.perlonjava.backend.jvm.EmitBlock#emitBlock} during bytecode emission. * * @param emitterVisitor The emitter visitor context * @param node The block to process * @return true if the block was refactored and emitted, false if no refactoring was needed */ public static boolean processBlock(EmitterVisitor emitterVisitor, BlockNode node) { - // CRITICAL: Skip if this block was already refactored to prevent infinite recursion + // Skip if this block was already refactored to prevent infinite recursion if (node.getBooleanAnnotation("blockAlreadyRefactored")) { return false; } @@ -145,9 +61,7 @@ public static boolean processBlock(EmitterVisitor emitterVisitor, BlockNode node } // Determine if we need to refactor - boolean needsRefactoring = shouldRefactorBlock(node, emitterVisitor); - - if (!needsRefactoring) { + if (!shouldRefactorBlock(node)) { return false; } @@ -157,21 +71,14 @@ public static boolean processBlock(EmitterVisitor emitterVisitor, BlockNode node return false; } - // Fallback: Try whole-block refactoring - return tryWholeBlockRefactoring(emitterVisitor, node); // Block was refactored and emitted - - // No refactoring was possible + // Try whole-block refactoring + return tryWholeBlockRefactoring(emitterVisitor, node); } /** * Determine if a block should be refactored based on size criteria. - * Uses minimal element count check to avoid overhead on trivial blocks. 
- * - * @param node The block to check - * @param emitterVisitor The emitter visitor for context - * @return true if the block should be refactored */ - private static boolean shouldRefactorBlock(BlockNode node, EmitterVisitor emitterVisitor) { + private static boolean shouldRefactorBlock(BlockNode node) { if (node.elements.size() <= MIN_CHUNK_SIZE) { return false; } @@ -181,7 +88,7 @@ private static boolean shouldRefactorBlock(BlockNode node, EmitterVisitor emitte } /** - * Check if the block is in a special context where smart chunking should be avoided. + * Check if the block is in a special context where refactoring should be avoided. */ private static boolean isSpecialContext(BlockNode node) { return node.getBooleanAnnotation("blockIsSpecial") || @@ -191,276 +98,7 @@ private static boolean isSpecialContext(BlockNode node) { } /** - * Try to apply smart chunking to reduce the number of top-level elements. - * Creates nested closures for proper lexical scoping. - * - * @param node The block to chunk - * @param parser The parser instance for access to error utilities (can be null) - */ - private static void trySmartChunking(BlockNode node, Parser parser, int maxNestedClosures) { - // Minimal check: skip very small blocks to avoid estimation overhead - if (node.elements.size() <= MIN_CHUNK_SIZE) { - if (parser != null || node.annotations != null) { - node.setAnnotation("refactorSkipReason", "Element count " + node.elements.size() + " <= " + MIN_CHUNK_SIZE + " (minimal threshold)"); - } - return; - } - - // Check bytecode size - skip if under threshold. - // IMPORTANT: use a larger cap here so we can compute a meaningful maxNestedClosuresEffective. - long estimatedSize = estimateTotalBytecodeSizeCapped(node.elements, (long) LARGE_BYTECODE_SIZE * maxNestedClosures); - long estimatedHalf = estimatedSize / 2; - long estimatedSizeWithSafetyMargin = estimatedSize > Long.MAX_VALUE - estimatedHalf ? 
Long.MAX_VALUE : estimatedSize + estimatedHalf; - if (parser != null || node.annotations != null) { - node.setAnnotation("estimatedBytecodeSize", estimatedSize); - node.setAnnotation("estimatedBytecodeSizeWithSafetyMargin", estimatedSizeWithSafetyMargin); - } - boolean forceRefactorByElementCount = node.elements.size() >= FORCE_REFACTOR_ELEMENT_COUNT; - if (!forceRefactorByElementCount && estimatedSizeWithSafetyMargin <= LARGE_BYTECODE_SIZE) { - if (parser != null || node.annotations != null) { - node.setAnnotation("refactorSkipReason", "Bytecode size " + estimatedSize + " <= threshold " + LARGE_BYTECODE_SIZE); - } - return; - } - - int effectiveMinChunkSize = MIN_CHUNK_SIZE; - - int maxNestedClosuresEffective = (int) Math.min( - maxNestedClosures, - Math.max(1L, (estimatedSizeWithSafetyMargin + TARGET_CHUNK_BYTECODE_SIZE - 1) / TARGET_CHUNK_BYTECODE_SIZE) - ); - - int closuresCreated = 0; - if (node.elements.size() > (long) effectiveMinChunkSize * maxNestedClosuresEffective) { - effectiveMinChunkSize = Math.max(MIN_CHUNK_SIZE, (node.elements.size() + maxNestedClosuresEffective - 1) / maxNestedClosuresEffective); - if (parser != null || node.annotations != null) { - node.setAnnotation("refactorEffectiveMinChunkSize", effectiveMinChunkSize); - } - } - - // Streaming construction from the end to avoid building large intermediate segment lists. - // We only materialize block bodies for chunks that will actually be wrapped. 
- List suffixReversed = new ArrayList<>(); - Node tailClosure = null; - boolean createdAnyClosure = false; - long suffixEstimatedSize = 0; - - int safeRunEndExclusive = node.elements.size(); - int safeRunLen = 0; - boolean safeRunActive = false; - - boolean hasLabelElement = false; - for (Node el : node.elements) { - if (el instanceof LabelNode) { - hasLabelElement = true; - break; - } - } - ControlFlowFinder blockFinder = controlFlowFinderTl.get(); - blockFinder.scan(node); - boolean hasAnyControlFlowInBlock = blockFinder.foundControlFlow; - boolean treatAllElementsAsSafe = !hasLabelElement && !hasAnyControlFlowInBlock; - - if (treatAllElementsAsSafe) { - safeRunActive = true; - safeRunLen = node.elements.size(); - safeRunEndExclusive = node.elements.size(); - } else { - - for (int i = node.elements.size() - 1; i >= 0; i--) { - Node element = node.elements.get(i); - boolean safeForChunk = !shouldBreakChunk(element); - - if (safeForChunk) { - safeRunActive = true; - safeRunLen++; - continue; - } - - if (safeRunActive) { - int safeRunStart = safeRunEndExclusive - safeRunLen; - while (safeRunLen >= effectiveMinChunkSize) { - int remainingBudget = maxNestedClosuresEffective - closuresCreated; - if (remainingBudget <= 0) { - break; - } - - int chunkStart = findChunkStartByEstimatedSize( - node.elements, - safeRunStart, - safeRunEndExclusive, - suffixEstimatedSize, - effectiveMinChunkSize - ); - int chunkLen = safeRunEndExclusive - chunkStart; - - if (chunkLen <= 0) { - break; - } - - List blockElements = new ArrayList<>(chunkLen + suffixReversed.size() + (tailClosure != null ? 
1 : 0)); - for (int j = chunkStart; j < safeRunEndExclusive; j++) { - blockElements.add(node.elements.get(j)); - } - for (int k = suffixReversed.size() - 1; k >= 0; k--) { - blockElements.add(suffixReversed.get(k)); - } - if (tailClosure != null) { - blockElements.add(tailClosure); - } - - BlockNode block = createBlockNode(blockElements, node.tokenIndex, skipRefactoring); - tailClosure = createAnonSubCall(node.tokenIndex, block); - suffixEstimatedSize = BytecodeSizeEstimator.estimateSnippetSize(tailClosure); - suffixReversed.clear(); - createdAnyClosure = true; - closuresCreated++; - - safeRunEndExclusive = chunkStart; - safeRunLen -= chunkLen; - } - - safeRunStart = safeRunEndExclusive - safeRunLen; - for (int j = safeRunEndExclusive - 1; j >= safeRunStart; j--) { - suffixReversed.add(node.elements.get(j)); - suffixEstimatedSize += BytecodeSizeEstimator.estimateSnippetSize(node.elements.get(j)); - } - - safeRunActive = false; - safeRunLen = 0; - } - - suffixReversed.add(element); - suffixEstimatedSize += BytecodeSizeEstimator.estimateSnippetSize(element); - safeRunEndExclusive = i; - } - } - - if (safeRunActive) { - int safeRunStart = safeRunEndExclusive - safeRunLen; - while (safeRunLen >= effectiveMinChunkSize) { - int remainingBudget = maxNestedClosuresEffective - closuresCreated; - if (remainingBudget <= 0) { - break; - } - - int chunkStart = findChunkStartByEstimatedSize( - node.elements, - safeRunStart, - safeRunEndExclusive, - suffixEstimatedSize, - effectiveMinChunkSize - ); - int chunkLen = safeRunEndExclusive - chunkStart; - - if (chunkLen <= 0) { - break; - } - - List blockElements = new ArrayList<>(chunkLen + suffixReversed.size() + (tailClosure != null ? 
1 : 0)); - for (int j = chunkStart; j < safeRunEndExclusive; j++) { - blockElements.add(node.elements.get(j)); - } - for (int k = suffixReversed.size() - 1; k >= 0; k--) { - blockElements.add(suffixReversed.get(k)); - } - if (tailClosure != null) { - blockElements.add(tailClosure); - } - - BlockNode block = createBlockNode(blockElements, node.tokenIndex, skipRefactoring); - tailClosure = createAnonSubCall(node.tokenIndex, block); - suffixEstimatedSize = BytecodeSizeEstimator.estimateSnippetSize(tailClosure); - suffixReversed.clear(); - createdAnyClosure = true; - closuresCreated++; - - safeRunEndExclusive = chunkStart; - safeRunLen -= chunkLen; - } - - safeRunStart = safeRunEndExclusive - safeRunLen; - for (int j = safeRunEndExclusive - 1; j >= safeRunStart; j--) { - suffixReversed.add(node.elements.get(j)); - } - } - - if (!createdAnyClosure) { - if (parser != null || node.annotations != null) { - node.setAnnotation("refactorSkipReason", "No chunk >= effective min chunk size " + effectiveMinChunkSize); - } - return; - } - - List processedElements = new ArrayList<>(suffixReversed.size() + 1); - for (int k = suffixReversed.size() - 1; k >= 0; k--) { - processedElements.add(suffixReversed.get(k)); - } - processedElements.add(tailClosure); - - boolean didReduceElementCount = processedElements.size() < node.elements.size(); - long originalSize = estimatedSize; - - // Apply chunking if we reduced the element count - if (didReduceElementCount) { - node.elements = processedElements; - } - - // Single verification pass after applying (or not applying) chunking. - long finalEstimatedSize = didReduceElementCount - ? estimateTotalBytecodeSizeCapped(node.elements, (long) LARGE_BYTECODE_SIZE * maxNestedClosures) - : estimatedSize; - long finalEstimatedHalf = finalEstimatedSize / 2; - long finalEstimatedSizeWithSafetyMargin = finalEstimatedSize > Long.MAX_VALUE - finalEstimatedHalf ? 
Long.MAX_VALUE : finalEstimatedSize + finalEstimatedHalf; - if (parser != null || node.annotations != null) { - if (didReduceElementCount) { - node.setAnnotation("refactoredBytecodeSize", finalEstimatedSize); - } - } - - if (finalEstimatedSizeWithSafetyMargin > LARGE_BYTECODE_SIZE) { - if (parser != null || node.annotations != null) { - if (didReduceElementCount) { - node.setAnnotation("refactorSkipReason", "Refactoring failed: size " + finalEstimatedSize + " still > threshold " + LARGE_BYTECODE_SIZE); - } else { - node.setAnnotation("refactorSkipReason", "Refactoring didn't reduce element count, size " + finalEstimatedSize + " > threshold " + LARGE_BYTECODE_SIZE); - } - } - return; - } - - if (parser != null || node.annotations != null) { - if (didReduceElementCount) { - node.setAnnotation("refactorSkipReason", "Successfully refactored: " + originalSize + " -> " + finalEstimatedSize + " bytes"); - } else { - node.setAnnotation("refactorSkipReason", "Refactoring didn't reduce element count, but size " + finalEstimatedSize + " <= threshold " + LARGE_BYTECODE_SIZE); - } - } - - } - - - /** - * Determine if an element should break the current chunk. - * Labels and ANY control flow statements break chunks - they must stay as direct elements. - * This is more conservative than ControlFlowDetectorVisitor because we need to catch - * ALL control flow, not just "unsafe" control flow (which considers loop depth). - */ - private static boolean shouldBreakChunk(Node element) { - // Labels break chunks - they're targets for goto/next/last - if (element instanceof LabelNode) { - return true; - } - - // Check if element contains ANY control flow (last/next/redo/goto) - // We use a custom visitor that doesn't consider loop depth - ControlFlowFinder finder = controlFlowFinderTl.get(); - finder.scan(element); - return finder.foundControlFlow; - } - - /** - * Try to refactor the entire block as a subroutine. 
+ * Try to refactor the entire block as a subroutine: {@code sub { }->(@_)}. */ private static boolean tryWholeBlockRefactoring(EmitterVisitor emitterVisitor, BlockNode node) { // Check for unsafe control flow using ControlFlowDetectorVisitor @@ -474,7 +112,7 @@ private static boolean tryWholeBlockRefactoring(EmitterVisitor emitterVisitor, B // Create sub {...}->(@_) for whole block int tokenIndex = node.tokenIndex; - // IMPORTANT: Mark the original block as already refactored to prevent recursion + // Mark the original block as already refactored to prevent recursion node.setAnnotation("blockAlreadyRefactored", true); // Create a wrapper block containing the original block @@ -486,16 +124,4 @@ private static boolean tryWholeBlockRefactoring(EmitterVisitor emitterVisitor, B subr.accept(emitterVisitor); return true; } - - private static BlockNode createBlockNode(List elements, int tokenIndex, ThreadLocal skipRefactoring) { - BlockNode block; - skipRefactoring.set(true); - try { - block = new BlockNode(elements, tokenIndex); - } finally { - skipRefactoring.set(false); - } - return block; - } - } diff --git a/src/main/java/org/perlonjava/backend/jvm/astrefactor/LargeNodeRefactorer.java b/src/main/java/org/perlonjava/backend/jvm/astrefactor/LargeNodeRefactorer.java deleted file mode 100644 index 292c5874c..000000000 --- a/src/main/java/org/perlonjava/backend/jvm/astrefactor/LargeNodeRefactorer.java +++ /dev/null @@ -1,189 +0,0 @@ -package org.perlonjava.backend.jvm.astrefactor; - -import org.perlonjava.frontend.analysis.BytecodeSizeEstimator; -import org.perlonjava.frontend.astnode.LabelNode; -import org.perlonjava.frontend.astnode.ListNode; -import org.perlonjava.frontend.astnode.Node; - -import java.util.ArrayList; -import java.util.List; - -import static org.perlonjava.backend.jvm.astrefactor.BlockRefactor.*; - -/** - * Helper class for refactoring large AST node lists to avoid JVM's "Method too large" error. - *
<p>
- * Problem: The JVM has a hard limit of 65535 bytes per method. Large Perl literals - * (arrays, hashes, lists with thousands of elements) can exceed this limit when compiled. - *
<p>
- * Solution: This class provides on-demand refactoring that splits large element lists - * into chunks, each wrapped in an anonymous subroutine. The chunks are then dereferenced - * and merged back together when compilation errors occur. - *
<p>
- * Integration: Used by {@link LargeBlockRefactorer} for automatic on-demand refactoring - * when "Method too large" errors are detected during bytecode generation. - *
<p>
- * Recursion Safety: The circular dependency (constructor calls refactorer which - * creates new nodes) breaks naturally when chunks become small enough (below MIN_CHUNK_SIZE). - * - * @see BytecodeSizeEstimator#estimateSnippetSize(Node) - * @see LargeBlockRefactorer - */ -public class LargeNodeRefactorer { - /** - * Maximum elements per chunk. Limits chunk size even if bytecode estimates - * suggest larger chunks would fit. - */ - private static final int MAX_CHUNK_SIZE = 200; - - /** - * Thread-local flag to prevent recursion when creating nested blocks. - */ - private static final ThreadLocal skipRefactoring = ThreadLocal.withInitial(() -> false); - - /** - * Refactors a large element list (for on-demand use when MethodTooLargeException occurs). - *

- * This method always attempts refactoring and is used by DepthFirstLiteralRefactorVisitor - * when "Method too large" errors are detected during bytecode generation. - * - * @param elements the elements list to refactor - * @param tokenIndex the token index for creating new nodes - * @return refactored list with elements chunked into closures, or original list if refactoring not possible - */ - public static List forceRefactorElements(List elements, int tokenIndex) { - if (elements == null || elements.isEmpty() || !shouldRefactor(elements)) { - return elements; - } - - // Check if elements contain any top-level labels - for (Node element : elements) { - if (element instanceof LabelNode) { - // Contains a label - skip refactoring to preserve label scope - return elements; - } - } - - List chunks = splitIntoDynamicChunks(elements); - - // Note: Control flow checks removed since master supports non-local gotos in subroutines - // Wrapping code in `sub { next }` is now safe - - // Create nested closures for proper lexical scoping - return createNestedListClosures(chunks, tokenIndex); - } - - /** - * Creates nested closures for LIST chunks to ensure proper lexical scoping. 
- * Structure: sub{ chunk1, sub{ chunk2, sub{ chunk3 }->(@_) }->(@_) }->(@_) - * - * @param chunks the list of chunks to nest - * @param tokenIndex token index for new nodes - * @return a single Node representing the nested closure structure, or a ListNode if only one small chunk - */ - private static List createNestedListClosures(List chunks, int tokenIndex) { - if (chunks.isEmpty()) { - return new ArrayList<>(); - } - - // If only one chunk and it's small, just return its elements - if (chunks.size() == 1 && chunks.get(0) instanceof ListNode listChunk && - listChunk.elements.size() < MIN_CHUNK_SIZE) { - return listChunk.elements; - } - - // Convert chunks (ListNode objects) to segments (List objects) - List segments = new ArrayList<>(); - for (Node chunk : chunks) { - if (chunk instanceof ListNode listChunk) { - segments.add(listChunk.elements); - } else { - segments.add(List.of(chunk)); - } - } - - // Use unified method with ListNode wrapper - // The wrapping is necessary because the closure needs to return a list of elements, - // not just execute them sequentially. - return buildNestedStructure( - segments, - tokenIndex, - MIN_CHUNK_SIZE, - true, // returnTypeIsList = true: wrap in ListNode to return list - skipRefactoring - ); - } - - /** - * Determines if a list of elements should be refactored based on size criteria. - *

- * Uses {@link BytecodeSizeEstimator#estimateSnippetSize(Node)} to estimate - * the bytecode that would be generated for each element. - * - * @param elements the list of AST nodes to evaluate - * @return true if the list exceeds size thresholds and should be refactored - */ - private static boolean shouldRefactor(List elements) { - // Estimate bytecode size by visiting all elements (no sampling) - // Sampling was causing inaccurate estimates for mixed element types - int n = elements.size(); - if (n == 0) { - return false; - } - if (n == 1) { - long size = BytecodeSizeEstimator.estimateSnippetSize(elements.get(0)); - return size > LARGE_BYTECODE_SIZE; - } - - // Estimate all elements for accurate size calculation - long totalSize = 0; - for (Node element : elements) { - totalSize += BytecodeSizeEstimator.estimateSnippetSize(element); - } - - return totalSize > LARGE_BYTECODE_SIZE; - } - - /** - * Splits a list of elements into dynamic chunks based on estimated bytecode sizes. - *

- * Each chunk is created by accumulating elements until the estimated bytecode size - * reaches LARGE_BYTECODE_SIZE. This ensures chunks stay under the size limit while - * maximizing elements per chunk. - *

- * Respects MIN_CHUNK_SIZE and MAX_CHUNK_SIZE constraints. - * - * @param elements the list to split - * @return list of ListNode chunks - */ - private static List splitIntoDynamicChunks(List elements) { - List chunks = new ArrayList<>(); - List currentChunk = new ArrayList<>(); - long currentChunkSize = 0; - - for (Node element : elements) { - long elementSize = BytecodeSizeEstimator.estimateSnippetSize(element); - - // Check if adding this element would exceed the size limit or max chunk size - if (!currentChunk.isEmpty() && - (currentChunkSize + elementSize > LARGE_BYTECODE_SIZE || - currentChunk.size() >= MAX_CHUNK_SIZE)) { - // Finalize current chunk - chunks.add(new ListNode(new ArrayList<>(currentChunk), currentChunk.get(0).getIndex())); - currentChunk.clear(); - currentChunkSize = 0; - } - - // Add element to current chunk - currentChunk.add(element); - currentChunkSize += elementSize; - } - - // Add the last chunk if it has elements - if (!currentChunk.isEmpty()) { - chunks.add(new ListNode(new ArrayList<>(currentChunk), currentChunk.get(0).getIndex())); - } - - return chunks; - } -} diff --git a/src/main/java/org/perlonjava/core/Configuration.java b/src/main/java/org/perlonjava/core/Configuration.java index ff19793f3..ae081404f 100644 --- a/src/main/java/org/perlonjava/core/Configuration.java +++ b/src/main/java/org/perlonjava/core/Configuration.java @@ -33,7 +33,7 @@ public final class Configuration { * Automatically populated by Gradle/Maven during build. * DO NOT EDIT MANUALLY - this value is replaced at build time. */ - public static final String gitCommitId = "f68479a46"; + public static final String gitCommitId = "ffc466124"; /** * Git commit date of the build (ISO format: YYYY-MM-DD). @@ -48,7 +48,7 @@ public final class Configuration { * Parsed by App::perlbrew and other tools via: perl -V | grep "Compiled at" * DO NOT EDIT MANUALLY - this value is replaced at build time. 
*/ - public static final String buildTimestamp = "Apr 10 2026 21:43:26"; + public static final String buildTimestamp = "Apr 10 2026 22:16:43"; // Prevent instantiation private Configuration() { diff --git a/src/main/java/org/perlonjava/frontend/analysis/BytecodeSizeEstimator.java b/src/main/java/org/perlonjava/frontend/analysis/BytecodeSizeEstimator.java index ba9546851..7c7679d10 100644 --- a/src/main/java/org/perlonjava/frontend/analysis/BytecodeSizeEstimator.java +++ b/src/main/java/org/perlonjava/frontend/analysis/BytecodeSizeEstimator.java @@ -1,7 +1,6 @@ package org.perlonjava.frontend.analysis; import org.perlonjava.backend.jvm.astrefactor.LargeBlockRefactorer; -import org.perlonjava.backend.jvm.astrefactor.LargeNodeRefactorer; import org.perlonjava.frontend.astnode.*; /** @@ -20,7 +19,7 @@ *

 * Usage:
 * <ul>
- *   <li>{@link #estimateSnippetSize(Node)} - For code snippets/chunks (no BASE_OVERHEAD, used by LargeNodeRefactorer)</li>
+ *   <li>{@link #estimateSnippetSize(Node)} - For code snippets/chunks (no BASE_OVERHEAD, used by LargeBlockRefactorer)</li>
 * </ul>
 * <p>
* The distinction between these methods is important: @@ -29,7 +28,6 @@ *

  • Code snippets are partial AST fragments where method overhead doesn't apply
  • * * - * @see LargeNodeRefactorer * @see LargeBlockRefactorer */ public class BytecodeSizeEstimator implements Visitor { diff --git a/src/main/java/org/perlonjava/frontend/analysis/ControlFlowFinder.java b/src/main/java/org/perlonjava/frontend/analysis/ControlFlowFinder.java deleted file mode 100644 index 733d7280e..000000000 --- a/src/main/java/org/perlonjava/frontend/analysis/ControlFlowFinder.java +++ /dev/null @@ -1,626 +0,0 @@ -package org.perlonjava.frontend.analysis; - -import org.perlonjava.frontend.astnode.*; - -/** - * Simple visitor that finds ANY control flow statement, ignoring loop depth. - */ -public class ControlFlowFinder implements Visitor { - public boolean foundControlFlow = false; - - private Node[] nodeStack = new Node[256]; - private int[] stateStack = new int[256]; - private int[] indexStack = new int[256]; - private int[] extraStack = new int[256]; - - private void ensureCapacity(int top) { - if (top < nodeStack.length) { - return; - } - int newCap = nodeStack.length * 2; - while (top >= newCap) { - newCap *= 2; - } - nodeStack = java.util.Arrays.copyOf(nodeStack, newCap); - stateStack = java.util.Arrays.copyOf(stateStack, newCap); - indexStack = java.util.Arrays.copyOf(indexStack, newCap); - extraStack = java.util.Arrays.copyOf(extraStack, newCap); - } - - /** - * Iterative (non-recursive) scan for control flow. - * - *

    Used by large-block refactoring to decide chunk boundaries without risking - * StackOverflowError on huge ASTs. - */ - public void scan(Node root) { - foundControlFlow = false; - if (root == null) { - return; - } - - if (root instanceof AbstractNode abstractNode) { - Boolean cached = abstractNode.getCachedHasAnyControlFlow(); - if (cached != null) { - foundControlFlow = cached; - return; - } - } - - int top = 0; - - ensureCapacity(0); - nodeStack[0] = root; - stateStack[0] = 0; - indexStack[0] = 0; - extraStack[0] = 0; - - while (top >= 0 && !foundControlFlow) { - Node node = nodeStack[top]; - int state = stateStack[top]; - - if (node == null) { - top--; - continue; - } - - if (node instanceof SubroutineNode) { - top--; - continue; - } - if (node instanceof LabelNode) { - top--; - continue; - } - - if (node instanceof OperatorNode op) { - if (state == 0) { - if ("last".equals(op.operator) || - "next".equals(op.operator) || - "redo".equals(op.operator) || "goto".equals(op.operator)) { - foundControlFlow = true; - continue; - } - stateStack[top] = 1; - if (op.operand != null) { - top++; - ensureCapacity(top); - nodeStack[top] = op.operand; - stateStack[top] = 0; - indexStack[top] = 0; - extraStack[top] = 0; - } - } else { - top--; - } - continue; - } - - if (node instanceof BlockNode block) { - if (state == 0) { - stateStack[top] = 1; - indexStack[top] = block.elements.size() - 1; - continue; - } - int idx = indexStack[top]; - while (idx >= 0) { - Node child = block.elements.get(idx); - idx--; - if (child != null) { - indexStack[top] = idx; - top++; - ensureCapacity(top); - nodeStack[top] = child; - stateStack[top] = 0; - indexStack[top] = 0; - extraStack[top] = 0; - break; - } - } - if (idx < 0) { - top--; - } - continue; - } - - if (node instanceof ListNode list) { - if (state == 0) { - stateStack[top] = 1; - indexStack[top] = list.elements.size() - 1; - extraStack[top] = 0; // handlePushed: 0=no, 1=yes - continue; - } - - int idx = indexStack[top]; - while 
(idx >= 0) { - Node child = list.elements.get(idx); - idx--; - if (child != null) { - indexStack[top] = idx; - top++; - ensureCapacity(top); - nodeStack[top] = child; - stateStack[top] = 0; - indexStack[top] = 0; - extraStack[top] = 0; - break; - } - } - - if (idx < 0) { - if (extraStack[top] == 0) { - extraStack[top] = 1; - if (list.handle != null) { - top++; - ensureCapacity(top); - nodeStack[top] = list.handle; - stateStack[top] = 0; - indexStack[top] = 0; - extraStack[top] = 0; - } - } else { - top--; - } - } else { - indexStack[top] = idx; - } - continue; - } - - if (node instanceof BinaryOperatorNode bin) { - if (state == 0) { - stateStack[top] = 1; - if (bin.right != null) { - top++; - ensureCapacity(top); - nodeStack[top] = bin.right; - stateStack[top] = 0; - indexStack[top] = 0; - extraStack[top] = 0; - } - continue; - } - if (state == 1) { - stateStack[top] = 2; - if (bin.left != null) { - top++; - ensureCapacity(top); - nodeStack[top] = bin.left; - stateStack[top] = 0; - indexStack[top] = 0; - extraStack[top] = 0; - } - continue; - } - top--; - continue; - } - - if (node instanceof TernaryOperatorNode tern) { - if (state == 0) { - stateStack[top] = 1; - if (tern.falseExpr != null) { - top++; - ensureCapacity(top); - nodeStack[top] = tern.falseExpr; - stateStack[top] = 0; - indexStack[top] = 0; - extraStack[top] = 0; - } - continue; - } - if (state == 1) { - stateStack[top] = 2; - if (tern.trueExpr != null) { - top++; - ensureCapacity(top); - nodeStack[top] = tern.trueExpr; - stateStack[top] = 0; - indexStack[top] = 0; - extraStack[top] = 0; - } - continue; - } - if (state == 2) { - stateStack[top] = 3; - if (tern.condition != null) { - top++; - ensureCapacity(top); - nodeStack[top] = tern.condition; - stateStack[top] = 0; - indexStack[top] = 0; - extraStack[top] = 0; - } - continue; - } - top--; - continue; - } - - if (node instanceof IfNode ifNode) { - if (state == 0) { - stateStack[top] = 1; - if (ifNode.elseBranch != null) { - top++; - 
ensureCapacity(top); - nodeStack[top] = ifNode.elseBranch; - stateStack[top] = 0; - indexStack[top] = 0; - extraStack[top] = 0; - } - continue; - } - if (state == 1) { - stateStack[top] = 2; - if (ifNode.thenBranch != null) { - top++; - ensureCapacity(top); - nodeStack[top] = ifNode.thenBranch; - stateStack[top] = 0; - indexStack[top] = 0; - extraStack[top] = 0; - } - continue; - } - if (state == 2) { - stateStack[top] = 3; - if (ifNode.condition != null) { - top++; - ensureCapacity(top); - nodeStack[top] = ifNode.condition; - stateStack[top] = 0; - indexStack[top] = 0; - extraStack[top] = 0; - } - continue; - } - top--; - continue; - } - - if (node instanceof For1Node for1) { - if (state == 0) { - stateStack[top] = 1; - if (for1.body != null) { - top++; - ensureCapacity(top); - nodeStack[top] = for1.body; - stateStack[top] = 0; - indexStack[top] = 0; - extraStack[top] = 0; - } - continue; - } - if (state == 1) { - stateStack[top] = 2; - if (for1.list != null) { - top++; - ensureCapacity(top); - nodeStack[top] = for1.list; - stateStack[top] = 0; - indexStack[top] = 0; - extraStack[top] = 0; - } - continue; - } - if (state == 2) { - stateStack[top] = 3; - if (for1.variable != null) { - top++; - ensureCapacity(top); - nodeStack[top] = for1.variable; - stateStack[top] = 0; - indexStack[top] = 0; - extraStack[top] = 0; - } - continue; - } - top--; - continue; - } - - if (node instanceof For3Node for3) { - if (state == 0) { - stateStack[top] = 1; - if (for3.body != null) { - top++; - ensureCapacity(top); - nodeStack[top] = for3.body; - stateStack[top] = 0; - indexStack[top] = 0; - extraStack[top] = 0; - } - continue; - } - if (state == 1) { - stateStack[top] = 2; - if (for3.increment != null) { - top++; - ensureCapacity(top); - nodeStack[top] = for3.increment; - stateStack[top] = 0; - indexStack[top] = 0; - extraStack[top] = 0; - } - continue; - } - if (state == 2) { - stateStack[top] = 3; - if (for3.condition != null) { - top++; - ensureCapacity(top); - nodeStack[top] 
= for3.condition; - stateStack[top] = 0; - indexStack[top] = 0; - extraStack[top] = 0; - } - continue; - } - if (state == 3) { - stateStack[top] = 4; - if (for3.initialization != null) { - top++; - ensureCapacity(top); - nodeStack[top] = for3.initialization; - stateStack[top] = 0; - indexStack[top] = 0; - extraStack[top] = 0; - } - continue; - } - top--; - continue; - } - - if (node instanceof TryNode tryNode) { - if (state == 0) { - stateStack[top] = 1; - if (tryNode.finallyBlock != null) { - top++; - ensureCapacity(top); - nodeStack[top] = tryNode.finallyBlock; - stateStack[top] = 0; - indexStack[top] = 0; - extraStack[top] = 0; - } - continue; - } - if (state == 1) { - stateStack[top] = 2; - if (tryNode.catchBlock != null) { - top++; - ensureCapacity(top); - nodeStack[top] = tryNode.catchBlock; - stateStack[top] = 0; - indexStack[top] = 0; - extraStack[top] = 0; - } - continue; - } - if (state == 2) { - stateStack[top] = 3; - if (tryNode.tryBlock != null) { - top++; - ensureCapacity(top); - nodeStack[top] = tryNode.tryBlock; - stateStack[top] = 0; - indexStack[top] = 0; - extraStack[top] = 0; - } - continue; - } - top--; - continue; - } - - if (node instanceof HashLiteralNode hash) { - if (state == 0) { - stateStack[top] = 1; - indexStack[top] = hash.elements.size() - 1; - continue; - } - int idx = indexStack[top]; - while (idx >= 0) { - Node child = hash.elements.get(idx); - idx--; - if (child != null) { - indexStack[top] = idx; - top++; - ensureCapacity(top); - nodeStack[top] = child; - stateStack[top] = 0; - indexStack[top] = 0; - extraStack[top] = 0; - break; - } - } - if (idx < 0) { - top--; - } - continue; - } - - if (node instanceof ArrayLiteralNode array) { - if (state == 0) { - stateStack[top] = 1; - indexStack[top] = array.elements.size() - 1; - continue; - } - int idx = indexStack[top]; - while (idx >= 0) { - Node child = array.elements.get(idx); - idx--; - if (child != null) { - indexStack[top] = idx; - top++; - ensureCapacity(top); - nodeStack[top] 
= child; - stateStack[top] = 0; - indexStack[top] = 0; - extraStack[top] = 0; - break; - } - } - if (idx < 0) { - top--; - } - continue; - } - - // Leaf nodes - top--; - } - - if (root instanceof AbstractNode abstractNode) { - abstractNode.setCachedHasAnyControlFlow(foundControlFlow); - } - } - - @Override - public void visit(OperatorNode node) { - if (foundControlFlow) return; - if ("last".equals(node.operator) || - "next".equals(node.operator) || - "redo".equals(node.operator) || "goto".equals(node.operator)) { - foundControlFlow = true; - return; - } - if (node.operand != null) { - node.operand.accept(this); - } - } - - @Override - public void visit(BlockNode node) { - if (foundControlFlow) return; - for (Node element : node.elements) { - if (element != null) { - element.accept(this); - if (foundControlFlow) return; - } - } - } - - @Override - public void visit(ListNode node) { - if (foundControlFlow) return; - for (Node element : node.elements) { - if (element != null) { - element.accept(this); - if (foundControlFlow) return; - } - } - } - - @Override - public void visit(BinaryOperatorNode node) { - if (foundControlFlow) return; - if (node.left != null) node.left.accept(this); - if (!foundControlFlow && node.right != null) node.right.accept(this); - } - - @Override - public void visit(TernaryOperatorNode node) { - if (foundControlFlow) return; - if (node.condition != null) node.condition.accept(this); - if (!foundControlFlow && node.trueExpr != null) node.trueExpr.accept(this); - if (!foundControlFlow && node.falseExpr != null) node.falseExpr.accept(this); - } - - @Override - public void visit(IfNode node) { - if (foundControlFlow) return; - if (node.condition != null) node.condition.accept(this); - if (!foundControlFlow && node.thenBranch != null) node.thenBranch.accept(this); - if (!foundControlFlow && node.elseBranch != null) node.elseBranch.accept(this); - } - - @Override - public void visit(For1Node node) { - // Traverse into loops to find ANY control flow 
for chunking purposes - if (foundControlFlow) return; - if (node.variable != null) node.variable.accept(this); - if (!foundControlFlow && node.list != null) node.list.accept(this); - if (!foundControlFlow && node.body != null) node.body.accept(this); - } - - @Override - public void visit(For3Node node) { - // Traverse into loops to find ANY control flow for chunking purposes - if (foundControlFlow) return; - if (node.initialization != null) node.initialization.accept(this); - if (!foundControlFlow && node.condition != null) node.condition.accept(this); - if (!foundControlFlow && node.increment != null) node.increment.accept(this); - if (!foundControlFlow && node.body != null) node.body.accept(this); - } - - @Override - public void visit(TryNode node) { - if (foundControlFlow) return; - if (node.tryBlock != null) node.tryBlock.accept(this); - if (!foundControlFlow && node.catchBlock != null) node.catchBlock.accept(this); - if (!foundControlFlow && node.finallyBlock != null) node.finallyBlock.accept(this); - } - - @Override - public void visit(DeferNode node) { - if (foundControlFlow) return; - if (node.block != null) node.block.accept(this); - } - - @Override - public void visit(HashLiteralNode node) { - if (foundControlFlow) return; - for (Node element : node.elements) { - if (element != null) { - element.accept(this); - if (foundControlFlow) return; - } - } - } - - @Override - public void visit(ArrayLiteralNode node) { - if (foundControlFlow) return; - for (Node element : node.elements) { - if (element != null) { - element.accept(this); - if (foundControlFlow) return; - } - } - } - - @Override - public void visit(SubroutineNode node) { - // Do not traverse into subroutines - control flow inside is scoped to that subroutine - } - - // Default implementations for leaf nodes - @Override - public void visit(IdentifierNode node) { - } - - @Override - public void visit(NumberNode node) { - } - - @Override - public void visit(StringNode node) { - } - - @Override - public 
void visit(LabelNode node) { - } - - @Override - public void visit(CompilerFlagNode node) { - } - - @Override - public void visit(FormatNode node) { - } - - @Override - public void visit(FormatLine node) { - } -} diff --git a/src/main/java/org/perlonjava/frontend/analysis/DepthFirstLiteralRefactorVisitor.java b/src/main/java/org/perlonjava/frontend/analysis/DepthFirstLiteralRefactorVisitor.java deleted file mode 100644 index 2db7b4f6b..000000000 --- a/src/main/java/org/perlonjava/frontend/analysis/DepthFirstLiteralRefactorVisitor.java +++ /dev/null @@ -1,297 +0,0 @@ -package org.perlonjava.frontend.analysis; - -import org.perlonjava.backend.jvm.astrefactor.BlockRefactor; -import org.perlonjava.backend.jvm.astrefactor.LargeNodeRefactorer; -import org.perlonjava.frontend.astnode.*; - -import java.util.List; - -/** - * Visitor that refactors large literals in a depth-first manner. - *

    - * This visitor traverses the AST and refactors large ListNode, HashLiteralNode, - * and ArrayLiteralNode structures by splitting them into smaller chunks wrapped - * in closures. The depth-first approach ensures that nested structures are - * refactored first, which naturally reduces the size of parent structures. - *

    - * Example: For a hash like: - *

    - * %hash = (
    - *   key1 => { nested => { deeply => 'nested' } },
    - *   key2 => { another => { structure => 'here' } },
    - *   ...
    - * )
    - * 
    - * The inner hashes are refactored first, making the outer hash smaller and - * potentially avoiding the need to refactor it at all. - *

    - * Control flow (next/last/redo) is safe to wrap in closures since master - * now supports non-local gotos in subroutines. - */ -public class DepthFirstLiteralRefactorVisitor implements Visitor { - - /** - * Minimum number of elements before considering refactoring. - * Avoids refactoring small but complex structures. - * Set conservatively high to only refactor truly massive structures. - */ - private static final int MIN_ELEMENTS_FOR_REFACTORING = 500; - - /** - * Debug flag for controlling debug output. - */ - private static final boolean DEBUG = false; - - /** - * Refactor an AST starting from the given node. - * Traverses depth-first and refactors large literals in-place. - * - * @param root the root node to start refactoring from - */ - public static void refactor(Node root) { - if (root != null) { - DepthFirstLiteralRefactorVisitor visitor = new DepthFirstLiteralRefactorVisitor(); - root.accept(visitor); - } - } - - @Override - public void visit(ListNode node) { - // First, recursively refactor all children (depth-first) - for (Node element : node.elements) { - if (element != null) { - element.accept(this); - } - } - if (node.handle != null) { - node.handle.accept(this); - } - - // Then, refactor this node if it's too large - if (shouldRefactor(node.elements)) { - if (DEBUG) { - System.err.println("DEBUG: Refactoring ListNode with " + node.elements.size() + " elements"); - System.err.println("DEBUG: First few elements: " + - node.elements.stream().limit(3).map(Node::toString).collect(java.util.stream.Collectors.joining(", "))); - } - List original = node.elements; - node.elements = LargeNodeRefactorer.forceRefactorElements(node.elements, node.getIndex()); - if (DEBUG) { - System.err.println("DEBUG: After refactoring: " + node.elements.size() + " elements"); - System.err.println("DEBUG: Refactored structure: " + node.elements.stream().limit(2) - .map(n -> n.getClass().getSimpleName()).collect(java.util.stream.Collectors.joining(", "))); - } - } - } - - 
@Override - public void visit(HashLiteralNode node) { - // First, recursively refactor all children (depth-first) - for (Node element : node.elements) { - if (element != null) { - element.accept(this); - } - } - - // Then, refactor this node if it's too large - if (shouldRefactor(node.elements)) { - node.elements = LargeNodeRefactorer.forceRefactorElements(node.elements, node.getIndex()); - } - } - - @Override - public void visit(ArrayLiteralNode node) { - // First, recursively refactor all children (depth-first) - for (Node element : node.elements) { - if (element != null) { - element.accept(this); - } - } - - // Then, refactor this node if it's too large - if (shouldRefactor(node.elements)) { - node.elements = LargeNodeRefactorer.forceRefactorElements(node.elements, node.getIndex()); - } - } - - @Override - public void visit(BlockNode node) { - // Recursively visit all elements - for (Node element : node.elements) { - if (element != null) { - element.accept(this); - } - } - } - - @Override - public void visit(BinaryOperatorNode node) { - if (node.left != null) { - node.left.accept(this); - } - if (node.right != null) { - node.right.accept(this); - } - } - - @Override - public void visit(TernaryOperatorNode node) { - if (node.condition != null) { - node.condition.accept(this); - } - if (node.trueExpr != null) { - node.trueExpr.accept(this); - } - if (node.falseExpr != null) { - node.falseExpr.accept(this); - } - } - - @Override - public void visit(IfNode node) { - if (node.condition != null) { - node.condition.accept(this); - } - if (node.thenBranch != null) { - node.thenBranch.accept(this); - } - if (node.elseBranch != null) { - node.elseBranch.accept(this); - } - } - - @Override - public void visit(For1Node node) { - if (node.variable != null) { - node.variable.accept(this); - } - if (node.list != null) { - node.list.accept(this); - } - if (node.body != null) { - node.body.accept(this); - } - if (node.continueBlock != null) { - node.continueBlock.accept(this); - 
} - } - - @Override - public void visit(For3Node node) { - if (node.initialization != null) { - node.initialization.accept(this); - } - if (node.condition != null) { - node.condition.accept(this); - } - if (node.increment != null) { - node.increment.accept(this); - } - if (node.body != null) { - node.body.accept(this); - } - if (node.continueBlock != null) { - node.continueBlock.accept(this); - } - } - - @Override - public void visit(TryNode node) { - if (node.tryBlock != null) { - node.tryBlock.accept(this); - } - if (node.catchBlock != null) { - node.catchBlock.accept(this); - } - if (node.finallyBlock != null) { - node.finallyBlock.accept(this); - } - } - - @Override - public void visit(DeferNode node) { - if (node.block != null) { - node.block.accept(this); - } - } - - @Override - public void visit(OperatorNode node) { - if (node.operand != null) { - node.operand.accept(this); - } - } - - @Override - public void visit(SubroutineNode node) { - // DO traverse into subroutines - we want to refactor large literals everywhere - if (node.block != null) { - node.block.accept(this); - } - } - - // Leaf nodes - no traversal needed - @Override - public void visit(IdentifierNode node) { - // No children to traverse - } - - @Override - public void visit(NumberNode node) { - // No children to traverse - } - - @Override - public void visit(StringNode node) { - // No children to traverse - } - - @Override - public void visit(LabelNode node) { - // No children to traverse - } - - @Override - public void visit(CompilerFlagNode node) { - // No children to traverse - } - - @Override - public void visit(FormatNode node) { - // Formats typically don't contain large literals - } - - @Override - public void visit(FormatLine node) { - // Format lines don't contain large literals - } - - /** - * Determine if a list of elements should be refactored. - * Uses the same logic as LargeNodeRefactorer for consistency. 
- * - * @param elements the elements to check - * @return true if refactoring is needed - */ - private boolean shouldRefactor(List elements) { - if (elements == null || elements.isEmpty()) { - return false; - } - - // Only refactor if we have a significant number of elements - // Avoids refactoring small but complex structures - if (elements.size() < MIN_ELEMENTS_FOR_REFACTORING) { - return false; - } - - // Use BlockRefactor.LARGE_BYTECODE_SIZE for consistency - long totalSize = 0; - for (Node element : elements) { - totalSize += BytecodeSizeEstimator.estimateSnippetSize(element); - if (totalSize > BlockRefactor.LARGE_BYTECODE_SIZE) { - return true; - } - } - return false; - } -} diff --git a/src/main/java/org/perlonjava/frontend/astnode/AbstractNode.java b/src/main/java/org/perlonjava/frontend/astnode/AbstractNode.java index 27a15f1ae..572fbe15f 100644 --- a/src/main/java/org/perlonjava/frontend/astnode/AbstractNode.java +++ b/src/main/java/org/perlonjava/frontend/astnode/AbstractNode.java @@ -14,14 +14,11 @@ */ public abstract class AbstractNode implements Node { private static final int FLAG_BLOCK_ALREADY_REFACTORED = 1; - private static final int FLAG_QUEUED_FOR_REFACTOR = 2; - private static final int FLAG_CHUNK_ALREADY_REFACTORED = 4; public int tokenIndex; // Lazy initialization - only created when first annotation is set public Map annotations; private int internalAnnotationFlags; private int cachedBytecodeSize = Integer.MIN_VALUE; - private byte cachedHasAnyControlFlow = -1; @Override public int getIndex() { @@ -57,14 +54,6 @@ public void setAnnotation(String key, Object value) { internalAnnotationFlags |= FLAG_BLOCK_ALREADY_REFACTORED; return; } - if ("queuedForRefactor".equals(key)) { - internalAnnotationFlags |= FLAG_QUEUED_FOR_REFACTOR; - return; - } - if ("chunkAlreadyRefactored".equals(key)) { - internalAnnotationFlags |= FLAG_CHUNK_ALREADY_REFACTORED; - return; - } } if (annotations == null) { annotations = new HashMap<>(); @@ -80,24 +69,10 @@ 
public void setCachedBytecodeSize(int size) { this.cachedBytecodeSize = size; } - public Boolean getCachedHasAnyControlFlow() { - return cachedHasAnyControlFlow < 0 ? null : cachedHasAnyControlFlow != 0; - } - - public void setCachedHasAnyControlFlow(boolean hasAnyControlFlow) { - this.cachedHasAnyControlFlow = (byte) (hasAnyControlFlow ? 1 : 0); - } - public Object getAnnotation(String key) { if ("blockAlreadyRefactored".equals(key)) { return (internalAnnotationFlags & FLAG_BLOCK_ALREADY_REFACTORED) != 0; } - if ("queuedForRefactor".equals(key)) { - return (internalAnnotationFlags & FLAG_QUEUED_FOR_REFACTOR) != 0; - } - if ("chunkAlreadyRefactored".equals(key)) { - return (internalAnnotationFlags & FLAG_CHUNK_ALREADY_REFACTORED) != 0; - } return annotations == null ? null : annotations.get(key); } diff --git a/src/main/java/org/perlonjava/frontend/astnode/ArrayLiteralNode.java b/src/main/java/org/perlonjava/frontend/astnode/ArrayLiteralNode.java index 8ee83c6c2..1812f4ada 100644 --- a/src/main/java/org/perlonjava/frontend/astnode/ArrayLiteralNode.java +++ b/src/main/java/org/perlonjava/frontend/astnode/ArrayLiteralNode.java @@ -1,6 +1,5 @@ package org.perlonjava.frontend.astnode; -import org.perlonjava.backend.jvm.astrefactor.LargeNodeRefactorer; import org.perlonjava.frontend.analysis.Visitor; import org.perlonjava.frontend.parser.Parser; @@ -15,9 +14,7 @@ *

  • {@code [$a, @b, %c]} - array with mixed elements
  • *
  • {@code [[1,2], [3,4]]} - nested array literals
  • * - *

    * - * @see LargeNodeRefactorer * @see HashLiteralNode * @see ListNode */ @@ -28,9 +25,6 @@ public class ArrayLiteralNode extends AbstractNode { * Each element is evaluated in LIST context when the array is constructed. * Elements may be scalars, arrays (which flatten), hashes (which flatten to key-value pairs), * or any expression. - *

    - * Note: This field is non-final because {@link LargeNodeRefactorer} may replace - * the original list with a refactored version containing chunk wrappers. */ public List elements; diff --git a/src/main/java/org/perlonjava/frontend/astnode/BlockNode.java b/src/main/java/org/perlonjava/frontend/astnode/BlockNode.java index 363588395..047666759 100644 --- a/src/main/java/org/perlonjava/frontend/astnode/BlockNode.java +++ b/src/main/java/org/perlonjava/frontend/astnode/BlockNode.java @@ -1,6 +1,5 @@ package org.perlonjava.frontend.astnode; -import org.perlonjava.backend.jvm.astrefactor.LargeBlockRefactorer; import org.perlonjava.frontend.analysis.Visitor; import org.perlonjava.frontend.parser.Parser; @@ -15,8 +14,6 @@ public class BlockNode extends AbstractNode { /** * The list of child nodes contained in this BlockNode. - * Note: This field is non-final because {@link LargeBlockRefactorer} may modify - * the list during parse-time refactoring. */ public List elements; diff --git a/src/main/java/org/perlonjava/frontend/astnode/HashLiteralNode.java b/src/main/java/org/perlonjava/frontend/astnode/HashLiteralNode.java index ba9f18782..66aed2d15 100644 --- a/src/main/java/org/perlonjava/frontend/astnode/HashLiteralNode.java +++ b/src/main/java/org/perlonjava/frontend/astnode/HashLiteralNode.java @@ -1,6 +1,5 @@ package org.perlonjava.frontend.astnode; -import org.perlonjava.backend.jvm.astrefactor.LargeNodeRefactorer; import org.perlonjava.frontend.analysis.Visitor; import org.perlonjava.frontend.parser.Parser; @@ -17,9 +16,7 @@ * *

    * The elements list contains key-value pairs in sequence: key1, value1, key2, value2, etc. - *

    * - * @see LargeNodeRefactorer * @see ArrayLiteralNode * @see ListNode */ @@ -29,9 +26,6 @@ public class HashLiteralNode extends AbstractNode { *

    * Elements are stored as alternating key-value pairs: [key1, value1, key2, value2, ...]. * Each element is evaluated in LIST context when the hash is constructed. - *

    - * Note: This field is non-final because {@link LargeNodeRefactorer} may replace - * the original list with a refactored version containing chunk wrappers. */ public List elements; diff --git a/src/main/java/org/perlonjava/frontend/astnode/ListNode.java b/src/main/java/org/perlonjava/frontend/astnode/ListNode.java index 0bc933566..0a58895a0 100644 --- a/src/main/java/org/perlonjava/frontend/astnode/ListNode.java +++ b/src/main/java/org/perlonjava/frontend/astnode/ListNode.java @@ -1,6 +1,5 @@ package org.perlonjava.frontend.astnode; -import org.perlonjava.backend.jvm.astrefactor.LargeNodeRefactorer; import org.perlonjava.frontend.analysis.Visitor; import org.perlonjava.frontend.parser.Parser; @@ -20,9 +19,7 @@ *

    * Unlike {@link ArrayLiteralNode} which creates an array reference, ListNode represents * a flat list that can be assigned to arrays or used in list context. - *

    * - * @see LargeNodeRefactorer * @see ArrayLiteralNode * @see HashLiteralNode */ @@ -33,9 +30,6 @@ public class ListNode extends AbstractNode { * Each element is an AST node representing an expression. Elements are evaluated * in the context determined by how the list is used (list context for assignments, * scalar context for the last element in scalar context, etc.). - *

    - * Note: This field is non-final because LargeNodeRefactorer may replace - * the original list with a refactored version containing chunk wrappers. */ public List elements;