Skip to content

perf(codegen): inline the slot load for guarded numeric array index reads#5132

Open
TheHypnoo wants to merge 7 commits into
mainfrom
perf/inline-numeric-array-get
Open

perf(codegen): inline the slot load for guarded numeric array index reads#5132
TheHypnoo wants to merge 7 commits into
mainfrom
perf/inline-numeric-array-get

Conversation

@TheHypnoo

@TheHypnoo TheHypnoo commented Jun 14, 2026

Copy link
Copy Markdown
Member

What

For a numeric (raw-f64) array read arr[i] that already passes the
numeric_array_index_get_guard, emit the slot load inline instead of
calling the runtime helper js_array_numeric_get_f64_unboxed.

Why it's safe

The guard on the way into the fast block already proves everything the helper
re-checks:

  • a plain, non-forwarded Array (plain_array_index_guard),
  • raw-f64 numeric layout (js_array_is_numeric_f64_layout),
  • index in bounds (plain_array_index_guard(.., in_bounds=true)).

Raw-f64 arrays are dense (no HOLE slots) and each slot holds a raw f64, so
the helper's hot path reduces to return *elements_ptr.add(index). We emit
exactly that: load double at arr + 8 + i*8 (ArrayHeader: length@0,
capacity@4, elements@8). The non-numeric else branch already addressed
elements the same way.

Impact

array_read micro-benchmark: 149ms → 91ms (~1.6×) by removing the
per-element call + redundant re-validation.

Validation

  • 29/29 benchmark-correctness programs match Node byte-for-byte.
  • Edge cases match Node: in-bounds sum, OOB index → undefined, negative index
    undefined.
  • Updated tests/typed_feedback.rs to assert the IR no longer contains
    call double @js_array_numeric_get_f64_unboxed and now contains load double.
  • Full parity suite: 702 pass / 53 fail, all 53 in the pre-existing
    gap/issue/parity-module baseline (none array-indexing related). Zero new
    regressions.

Summary by CodeRabbit

  • Performance

    • Improved the numeric array index “fast path” by eliminating the unboxed helper call and performing an inline double element load in the guarded typed-feedback path.
  • Tests

    • Updated typed feedback assertions to confirm the unboxed helper is not invoked and that direct double loads are generated.
    • Updated benchmark IR checks and regression IR fixtures to validate the guarded fast-path block shape and disallow the helper call.

…eads

A numeric `arr[i]` read is lowered as a typed-feedback guard followed, on the
fast path, by a call to `js_array_numeric_get_f64_unboxed`. But the
`numeric_array_index_get_guard` on the way into the fast block already proves
everything that helper re-checks: a plain non-forwarded `Array`, in raw-f64
numeric layout, with `index` in bounds (`plain_array_index_guard(.., in_bounds
=true)` && `js_array_is_numeric_f64_layout`). So the helper's hot path just does
`return *elements_ptr.add(index)` after re-validating — a redundant call per read.

Emit that slot load inline (`getelementptr` + `load double` at `arr + 8 +
index*8`) instead, matching the helper exactly. Raw-f64 arrays are dense (no
HOLE slots) and hold raw f64s, so no hole→undefined translation is needed; the
non-raw path already inlines (with the hole check) and is unchanged. Out-of-
bounds / negative / non-number indices fail the guard and route to the boxed
fallback exactly as before.

04_array_read: 149ms -> 91ms (~1.6x). numeric_array_numeric is read+write so it
is now write-bound (neutral). Part of closing the AOT array-element-access gap
(see #5094).

Verified: 29/29 suite benchmarks match Node; OOB/negative/hole reads on numeric
arrays return undefined identical to Node; codegen tests pass (the hot-path test
now asserts the inline load instead of the unboxed-get call).
@coderabbitai

coderabbitai Bot commented Jun 14, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: c1a05f9f-a7c3-4d06-a409-459fb9ed64d4

📥 Commits

Reviewing files that changed from the base of the PR and between 73c239d and 966e1aa.

📒 Files selected for processing (1)
  • tests/test_compiler_output_regression.py

📝 Walkthrough

Walkthrough

In lower_guarded_array_index_get, the require_numeric_layout fast path is changed to load the array element inline: it extends idx_i32 to i64, computes the byte offset (idx*8 + 8), adds that to the masked handle, casts to double*, and loads directly—removing the js_array_numeric_get_f64_unboxed runtime helper call. The corresponding unit test assertion is updated to verify the helper is absent and load double is present. The benchmark IR check and regression test fixture are also updated to assert the new behavior.

Changes

Inline numeric array element load

Layer / File(s) Summary
Inline load implementation, unit test, and benchmark IR verification
crates/perry-codegen/src/expr/index_get.rs, crates/perry-codegen/tests/typed_feedback.rs, benchmarks/compiler_output/workloads.toml, tests/test_compiler_output_regression.py
lower_guarded_array_index_get replaces the js_array_numeric_get_f64_unboxed call with inline pointer arithmetic (index extension → byte offset → masked_handle + offsetdouble* cast → load double). The typed_feedback_guards_computed_numeric_array_index_hot_path test removes the old positive call assertion, adds a negative assertion for the helper, and adds a positive assertion for load double, with a clarifying comment. The numeric_array_uses_unboxed_get IR check in the workload is updated to match the guarded typed-feedback path and assert the helper is not called. The regression test fixture is updated to reflect the new guard-driven control flow with inlined fast-path computation.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐇 No more helper hops across the runtime floor,
The bunny loads the double straight from memory's door.
Extend the index, shift by eight, add header too,
A pointer cast, a load, and we are nearly through!
Inline is faster—that's the carrot I adore. 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely describes the main optimization: inlining slot loads for guarded numeric array index reads, which is the primary change across all modified files.
Description check ✅ Passed The PR description includes all key sections: Summary (What/Why), Changes context, detailed validation results, performance metrics, and test updates, comprehensively documenting the optimization and its safety guarantees.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch perf/inline-numeric-array-get

Comment @coderabbitai help to get the list of available commands and usage tips.

TheHypnoo and others added 3 commits June 14, 2026 16:58
…ne GET

The inline numeric array read no longer emits a call to
js_array_numeric_get_f64_unboxed; the slot is loaded inline after the
guard. Update the native-region-proof IR check to assert the guard
(js_typed_feedback_numeric_array_index_get_guard) is present and that the
helper call is elided, instead of requiring the now-removed call. The
native-rep structural record still tags the logical consumer, so
numeric_array_get_fast_f64 continues to prove the checked_native fast path.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@benchmarks/compiler_output/workloads.toml`:
- Around line 627-630: The workload check for numeric_array_uses_unboxed_get at
lines 627-630 only verifies the presence of the guard and absence of the helper
function call, but does not assert that the expected direct inline-load
instruction is actually present in the IR output. Add a regex_contains or
contains check to verify the presence of the direct load instruction (such as
"load double") to strengthen the test and ensure it validates the complete
inline-load fast path optimization, preventing regressions where the helper
removal occurs without the expected inline load taking place.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 24f156c4-8dd6-479e-b179-ce281344a06d

📥 Commits

Reviewing files that changed from the base of the PR and between 6c906b0 and 5dfcfd6.

📒 Files selected for processing (1)
  • benchmarks/compiler_output/workloads.toml

Comment thread benchmarks/compiler_output/workloads.toml Outdated
TheHypnoo and others added 3 commits June 15, 2026 15:53
…path

Per CodeRabbit: prove the optimization contract positively, not just by
helper removal. Add a regex asserting the guarded fast block
(bidx.num.fast) computes the element pointer via inttoptr and performs an
inline 'load double' before branching to the merge block. Keeps the guard
presence + helper-call-absence checks. Validated against the optimized
llvm_after IR the harness gates on.
test_generic_native_rep_checks_require_configured_records fed synthetic
IR with the old 'call double @js_array_numeric_get_f64_unboxed' shape,
which no longer matches the updated numeric_array_uses_unboxed_get check
(guard present + inline inttoptr/load double in bidx.num.fast + helper
call elided). Update the fixture IR to the inlined fast-path shape;
push/set still exercise their guarded raw-f64 helpers. Native-rep records
keep the logical get consumer tag, so the structural checks are unchanged.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant