
[Windows] Search for closest free page in x64 JIT memory allocation#124

Merged
mazong1123 merged 1 commit into main from jingm/fix-win-x64-jit-allocation-closest-first on Mar 25, 2026



@mazong1123 mazong1123 commented Mar 25, 2026

Problem

The x86_64 Windows implementation of allocate_jit_memory_windows scanned linearly upward from func_addr - 2GB and took the first free page it found, so the allocation often landed ~2GB away from the function. That address could fall in or near the stack region, disrupting the stack guard page and causing STATUS_STACK_OVERFLOW (0xc00000fd) during parallel test execution.

Observed: Function at 0x7ff65ad71670, JIT at 0x7ff5dad80000, distance: 2047 MB
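
The 2047 MB figure can be checked directly from the two addresses. The helper below is only an illustration; `distance_mb` is a hypothetical name, not a function from this repository.

```rust
/// Absolute distance between two addresses, in whole mebibytes (rounded down).
/// Hypothetical helper for illustration only.
fn distance_mb(a: u64, b: u64) -> u64 {
    a.abs_diff(b) / (1024 * 1024)
}

/// The observed values quoted above: function address and JIT address.
fn observed_distance_mb() -> u64 {
    let func_addr: u64 = 0x7ff6_5ad7_1670;
    let jit_addr: u64 = 0x7ff5_dad8_0000;
    distance_mb(func_addr, jit_addr)
}
```

The raw difference is 0x7FFF1670 bytes, just under the 2 GB (0x80000000) limit, which rounds down to 2047 MB.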

Fix

Replace the linear scan with a bidirectional closest-first search, matching the existing aarch64 Windows and Unix implementations. The allocator now searches outward from the function address at +offset and -offset, finding the closest available page first.

This is an improved version of #122 that additionally:

  • Searches both directions (not just downward) for robustness
  • Avoids an infinite loop when checked_sub fails by keeping offset += page_size outside the inner direction loop
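
A minimal sketch of the closest-first search described above, not the code from this PR. It assumes a caller-supplied `try_alloc` closure standing in for the real OS allocation call, and the names `find_closest_page` and `max_range` are hypothetical. It also assumes the search starts one page away from the function's own page.

```rust
/// Search outward from `func_addr` one page at a time, trying +offset then
/// -offset, and return the first address that `try_alloc` accepts.
/// `try_alloc` is a stand-in (assumption) for the real allocation attempt.
fn find_closest_page<F>(
    func_addr: usize,
    page_size: usize,
    max_range: usize,
    mut try_alloc: F,
) -> Option<usize>
where
    F: FnMut(usize) -> bool,
{
    assert!(page_size.is_power_of_two());
    // Round the function address down to its page boundary.
    let base = func_addr & !(page_size - 1);
    let mut offset = page_size;
    while offset <= max_range {
        for candidate in [base.checked_add(offset), base.checked_sub(offset)] {
            // An overflowing direction yields None and is simply skipped.
            if let Some(addr) = candidate {
                if try_alloc(addr) {
                    return Some(addr);
                }
            }
        }
        // The increment sits outside the direction loop, so a failed
        // checked_sub can never stall the search.
        offset = offset.checked_add(page_size)?;
    }
    None
}
```

For example, with a function at 0x10000, a page size of 0x1000, and free pages only at 0x13000 and 0xC000, the sketch returns 0x13000 because it is three pages away while 0xC000 is four pages away.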

Tests Added

Unit tests (src/injector_core/common.rs):

  • test_jit_allocation_is_close_to_source — verifies JIT allocation is within 128MB of the function (fails with old code at 2047MB distance)
  • test_jit_allocation_not_in_stack_region — verifies JIT allocation is >16MB from the stack

Integration tests (tests/stack_safety.rs):

  • test_stack_growth_works_after_patching — patches a function then does deep recursion (2000 frames)
  • test_concurrent_patching_with_deep_stack_usage — 8 threads concurrently patch + exercise deep recursion (replicates the original crash scenario)
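
The shape of the concurrent deep-recursion check might look roughly like the sketch below. It is not the test from tests/stack_safety.rs: the patching step is omitted because the crate's patching API is not shown here, the function names are hypothetical, and reusing the 2000-frame depth for the threaded case is an assumption.

```rust
use std::hint::black_box;
use std::thread;

/// Recurse `depth` frames, keeping a small buffer alive in each frame so the
/// stack really grows. Returns the number of recursive calls made.
fn recurse(depth: u32) -> u32 {
    let pad = [depth as u8; 64];
    black_box(&pad);
    if depth == 0 {
        0
    } else {
        1 + recurse(depth - 1)
    }
}

/// Spawn `threads` workers that each recurse `depth` frames and collect
/// their results. Joining fails loudly if any worker panicked.
fn run_deep_recursion_on_threads(threads: usize, depth: u32) -> Vec<u32> {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            thread::spawn(move || {
                // Assumption: the real test patches a function at this point
                // before exercising the stack.
                recurse(depth)
            })
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}
```

If stack growth were broken, a worker would be expected to crash the process rather than return, so a sketch like this can only show the passing case.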

The old x86_64 Windows implementation of allocate_jit_memory scanned linearly
from func_addr - 2GB upward, which often allocated memory ~2GB away from the
function. This could land in/near the stack region, disrupting the stack guard
page and causing STATUS_STACK_OVERFLOW (0xc00000fd) during parallel test
execution.

This change replaces the linear scan with a bidirectional closest-first search
(matching the existing aarch64 Windows and Unix implementations). The allocator
now searches outward from the function address at +offset and -offset, finding
the closest available page first.

This is an improved version of PR #122 that fixes two additional issues:
- Searches both directions (not just downward) for robustness
- Avoids an infinite loop when checked_sub fails by keeping offset increment
  outside the inner direction loop

Also adds tests:
- Unit test verifying JIT allocation is close to source (<128MB, not ~2GB)
- Unit test verifying JIT allocation is not near the stack region
- Integration test for stack growth after patching
- Integration test for concurrent patching with deep stack usage (8 threads)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@mazong1123 mazong1123 enabled auto-merge (squash) March 25, 2026 18:15
@mazong1123 mazong1123 disabled auto-merge March 25, 2026 18:15
@mazong1123 mazong1123 merged commit 98e4a92 into main Mar 25, 2026
17 checks passed
