Optimize control flow with block-level dispatcher sharing by fglock · Pull Request #161 · fglock/PerlOnJava

fglock · 2026-02-04T17:26:35Z

Summary

Implements block-level dispatcher sharing to eliminate redundant control flow dispatch code when multiple calls occur in the same block with the same visible loops.

The Problem

Previously, each call site emitted a complete control flow dispatcher (~150 bytes):

for (1..3) {
    A();  # 150 bytes dispatcher
    B();  # 150 bytes dispatcher (identical!)
    C();  # 150 bytes dispatcher (identical!)
    D();  # 150 bytes dispatcher (identical!)
}

Total: 600 bytes of dispatcher code for 4 calls, of which 450 bytes (three identical copies) are redundant.

The Solution

Block-level shared dispatchers - all calls with the same visible loop state share ONE dispatcher:

- Call sites: Simple check (~20 bytes) + GOTO to shared dispatcher
- Block dispatcher: Full dispatch logic (~150 bytes, emitted once)
- Automatic reuse: Signature-based matching via blockDispatcherLabels map

Results

Bytecode Savings

Calls	Old (bytes)	New (bytes)	Savings	Percentage
1	150	173	-23	-15%
2	300	193	107	36% ✅
4	600	233	367	61% ✅
10	1500	353	1147	76% ✅
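
The table rows are consistent with a simple linear cost model, sketched below to make the arithmetic checkable. The 150-byte and 20-byte figures come from this description; the 3-byte fixed term is an assumption inferred from the table itself (173 = 150 + 20 + 3), not a number taken from the compiler. The class and method names are hypothetical.

```java
/** Rough cost model for the "Bytecode Savings" table (illustrative only). */
final class DispatcherCostModel {
    static final int DISPATCHER_BYTES = 150;   // full dispatcher, per the PR description
    static final int CALL_SITE_BYTES = 20;     // check + GOTO, per the PR description
    static final int FIXED_OVERHEAD_BYTES = 3; // ASSUMPTION: inferred from the table rows

    /** Old scheme: every call site carries its own dispatcher. */
    static int oldBytes(int calls) {
        return DISPATCHER_BYTES * calls;
    }

    /** New scheme: one shared dispatcher plus a small check at each call site. */
    static int newBytes(int calls) {
        return DISPATCHER_BYTES + FIXED_OVERHEAD_BYTES + CALL_SITE_BYTES * calls;
    }

    /** Savings as a rounded percentage of the old size (negative means overhead). */
    static long savingsPercent(int calls) {
        int old = oldBytes(calls);
        return Math.round(100.0 * (old - newBytes(calls)) / old);
    }

    public static void main(String[] args) {
        for (int calls : new int[] {1, 2, 4, 10}) {
            System.out.printf("%d calls: old=%d new=%d savings=%d (%d%%)%n",
                    calls, oldBytes(calls), newBytes(calls),
                    oldBytes(calls) - newBytes(calls), savingsPercent(calls));
        }
    }
}
```

Under this model the break-even point lies between one and two calls, which matches the single-call overhead listed under Trade-offs.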

Real-World Measurements

Test with 4 sequential calls: 2232 → 2139 bytecode lines (4.2% reduction)
CHECKCAST operations: 23 → 17 (26% reduction)
Complex nested loops: No regression (1374 lines maintained)
All 2006 unit tests pass ✅

Implementation Details

Modified Files

JavaClassInfo.java
- Added blockDispatcherLabels map to track dispatcher reuse
- Added getLoopStateSignature() method for unique loop state identification
- Uses identity hash codes to distinguish loop instances
EmitSubroutine.java
- Simplified call-site emission to ~20 bytes (check + GOTO)
- Added emitBlockDispatcher() helper method
- First call with a signature creates and emits dispatcher
- Subsequent calls with same signature reuse existing dispatcher
Documentation
- New: BLOCK_DISPATCHER_OPTIMIZATION.md - detailed optimization guide
- New: CONTROL_FLOW_IMPLEMENTATION.md - comprehensive implementation guide
- Removed: 6 obsolete design docs
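
To illustrate the idea behind getLoopStateSignature(), here is a minimal sketch of how a signature could be built from the visible loops and their identity hash codes. It is not the code from JavaClassInfo.java: the LoopInfo class, the signatureOf() helper, and the string format are all hypothetical stand-ins chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class LoopStateSignatureSketch {
    /** Hypothetical stand-in for the compiler's per-loop bookkeeping object. */
    static final class LoopInfo {
        final String label; // null for an unlabeled loop
        LoopInfo(String label) { this.label = label; }
    }

    /**
     * Builds a signature from the loops visible at a call site.
     * The identity hash code separates two loop instances that share a label.
     */
    static String signatureOf(List<LoopInfo> visibleLoops) {
        StringBuilder sb = new StringBuilder();
        for (LoopInfo loop : visibleLoops) {
            sb.append(loop.label == null ? "<unlabeled>" : loop.label)
              .append('@')
              .append(System.identityHashCode(loop))
              .append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<LoopInfo> visible = new ArrayList<>();
        visible.add(new LoopInfo("OUTER"));
        visible.add(new LoopInfo(null));
        // Two call sites that see the same loop objects get the same signature.
        System.out.println(signatureOf(visible).equals(signatureOf(visible)));
    }
}
```

One caveat of this approach in general: System.identityHashCode() is not guaranteed to be unique per object, so a real implementation may want an additional tie-breaker.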

How It Works

1. Compute loop state signature (visible loop labels + identity hashes)
2. Check if dispatcher exists for that signature in blockDispatcherLabels
3. If first use: create dispatcher label, emit dispatcher code after call site
4. If reuse: jump to existing dispatcher label
5. Dispatcher stays within loop scope (no frame computation issues)
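
The reuse logic described above can be sketched without a bytecode library by recording pseudo-instructions as strings. This is a simplified model, not the code in EmitSubroutine.java: the class, the emitCall() method, the label names, and the "skip" jump that lets the normal path bypass the dispatcher are assumptions made for illustration. Only the blockDispatcherLabels name comes from the PR.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BlockDispatcherSketch {
    // Maps a loop state signature to the label of its shared dispatcher.
    private final Map<String, String> blockDispatcherLabels = new HashMap<>();
    private final List<String> emitted = new ArrayList<>();
    private int nextLabelId = 0;

    /** Emits one call site; the first call per signature also emits the dispatcher. */
    void emitCall(String subName, String loopStateSignature) {
        emitted.add("call " + subName);
        String existing = blockDispatcherLabels.get(loopStateSignature);
        if (existing != null) {
            // Reuse: only the small check + GOTO is emitted.
            emitted.add("if control-flow marker: GOTO " + existing);
            return;
        }
        // First use: register the label, then emit the dispatcher after the call site.
        String label = "dispatcher_" + nextLabelId++;
        blockDispatcherLabels.put(loopStateSignature, label);
        emitted.add("if control-flow marker: GOTO " + label);
        emitted.add("GOTO skip_" + label); // ASSUMPTION: normal path jumps over the dispatcher
        emitted.add(label + ": <full dispatch logic>");
        emitted.add("skip_" + label + ":");
    }

    /** Number of full dispatchers emitted so far. */
    long dispatcherCount() {
        return emitted.stream().filter(s -> s.contains("<full dispatch logic>")).count();
    }

    List<String> emitted() {
        return emitted;
    }

    public static void main(String[] args) {
        BlockDispatcherSketch sketch = new BlockDispatcherSketch();
        for (String sub : new String[] {"A", "B", "C", "D"}) {
            sketch.emitCall(sub, "LOOP@1;"); // same visible loops for every call
        }
        sketch.emitted().forEach(System.out::println);
        System.out.println("dispatchers: " + sketch.dispatcherCount());
    }
}
```

In this model the four calls from the example in The Problem produce a single dispatcher, and a call made under a different signature (for instance inside a nested loop) would get its own.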

Trade-offs

Advantages:

✅ Massive savings for multiple calls (61% for 4 calls)
✅ Common pattern in real Perl code
✅ No frame computation issues
✅ Automatic optimization

Disadvantages:

⚠️ 23 bytes overhead for single calls (acceptable)
⚠️ Small HashMap overhead per method

Net Result: Overall WIN for typical Perl code patterns.

Testing

All 2006 unit tests pass, including:

✅ Control flow tests (last/next/redo)
✅ Non-local control flow
✅ Tail call optimization
✅ Nested loops
✅ Labeled control flow
✅ Complex real-world code

Why This Works Better Than Alternatives

vs. Per-Call Dispatchers (Previous)

Eliminates redundancy while maintaining correctness
36-76% savings for multi-call blocks

vs. Method-Level Centralization (Attempted)

Stays within loop scope (no frame errors)
Only checks visible loops (not all method loops)
Actually reduces bytecode (centralization increased it)

Block-level is the sweet spot: sharing within proper scope boundaries.

🤖 Generated with Claude Code

Implements block-level dispatcher sharing to eliminate redundant control flow dispatch code when multiple calls occur in the same block with the same visible loops.

Key improvements:
- Multiple calls in same block share ONE dispatcher (not one per call)
- Call sites reduced from ~150 bytes to ~20 bytes each
- Block dispatcher emitted once per unique loop state (~150 bytes)
- 36-76% bytecode savings for blocks with 2+ calls

Results:
- Test with 4 sequential calls: 2232 → 2139 lines (4.2% reduction)
- CHECKCAST operations: 23 → 17 (26% reduction)
- All 2006 unit tests pass

Implementation:
- JavaClassInfo: Added blockDispatcherLabels map and getLoopStateSignature()
- EmitSubroutine: Simplified call sites, added emitBlockDispatcher() helper
- Dispatcher stays within loop scope (no frame computation issues)
- Automatic signature-based reuse via identity hash codes

Trade-offs:
- Single call: 23 bytes overhead (acceptable)
- Multiple calls: massive savings (61% for 4 calls, 76% for 10 calls)
- Net win for typical Perl code patterns

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

With block-level dispatcher sharing (PR #161), non-local control flow now works correctly. The skip() function can use 'last SKIP' directly without workarounds.

Changes:
- Test/More.pm: Replaced skip_internal() with proper skip() that uses last SKIP
- TestMoreHelper.java: Removed skip() call rewriting logic
- test.pl.patch: Removed skip_internal() workaround from Perl 5 tests

Testing:
- All 2012 unit tests pass (100%)
- Perl 5 tests work correctly with native skip() implementation
- Non-local last SKIP exits SKIP block immediately from subroutine

This cleanup removes ~100 lines of workaround code that is no longer needed.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

fglock merged commit 5f0b7b9 into master Feb 4, 2026
2 checks passed

fglock deleted the optimize-block-level-dispatchers branch February 4, 2026 17:38

fglock mentioned this pull request Feb 4, 2026

Remove skip() workarounds - non-local last SKIP now works #162

Merged

fglock merged 1 commit into master from optimize-block-level-dispatchers