Skip to content

perf(codegen): inline class-field shape guard fast path (#5093)#5198

Open
proggeramlug wants to merge 1 commit into
mainfrom
fix/issue-5093-inline-shape-guard
Open

perf(codegen): inline class-field shape guard fast path (#5093)#5198
proggeramlug wants to merge 1 commit into
mainfrom
fix/issue-5093-inline-shape-guard

Conversation

@proggeramlug

@proggeramlug proggeramlug commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Summary

Inlines the class-field shape guard so a monomorphic this.field read/write on a
known class instance branches straight to the raw slot on a hit, skipping the
cross-crate js_typed_feedback_class_field_{get,set}_guard call that #5093
measured as the dominant per-access cost on 09_method_calls. On any miss it falls
through to the unchanged guard-call path, so the change is purely additive — it
can never take a fast path the guard would have rejected.

Addresses #5093 (incrementally — see Scope below).

How it works

  • Per-object intact bit (gc/layout.rs): a single free bit of
    GcHeader._reserved (GC_OBJ_TYPED_LAYOUT_INTACT), set when an object's canonical
    typed descriptor is installed and cleared on every downgrade/removal path
    (layout_set_typed_unknown, layout_clear_for_ptr, layout_mark_unknown,
    layout_init_pointer_free, rebuild, transfer). Because raw-f64 downgrade is
    all-or-nothing (the whole descriptor is removed), this one bit replaces the
    thread-local TYPED_LAYOUTS hashmap probe — so the inline guard concludes "slot K
    is raw-f64" from a single bit-test plus a class_id/keys match. The bit rides along
    with _reserved across copying/evacuating GC.
  • Inline pre-check (expr/class_field_inline_guard.rs, wired into
    property_get/property_set): a pointer-tag/handle-band gate guarding the
    dereference, then gc-type / forwarded / object-type / class_id / keys_array /
    field_count checks, the process-global enable flag, the intact bit for raw-f64
    fields, and — for raw sets — not-frozen + a plain-finite-number value check. GET
    inlines raw and non-raw fields; SET inlines raw-f64 fields only.
  • Safety gates: PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED (exported, flips on the
    moment any accessor/property descriptor comes into use or typed-feedback tracing is
    enabled — the inline path then falls back to the full guard, preserving
    descriptor-aware dispatch + feedback recording). PERRY_DISABLE_CLASS_FIELD_INLINE=1
    is an explicit escape hatch; PERRY_VERIFY_TYPED_INTACT=1 aborts if the intact bit
    ever disagrees with the side table.

Scope — incremental win, NOT #5093's ~60× target

This skips the guard call on the hot path (~10% on 09_method_calls) but does not
collapse the loop to load/fadd/store. Root cause, found by disassembling the
optimized loop: the cold guard-miss fallback call remains a GC safepoint, which
forces the shadow-stack-rooted receiver to reload every iteration, so the object
pointer is never loop-invariant and LLVM's LICM can't hoist any shape load (even
!invariant.load-tagged class_id/object_type). Collapsing the loop requires a
loop-versioning / deopt-exit pass that makes the hot loop call-free — a larger,
maintainer-driven change. Full analysis posted on #5093. This PR is the additive
foundation (per-object intact bit + inline fast path).

Testing

  • test-files/test_class_field_raw_f64_downgrade.ts (new): the memory-corruption
    trap perf(method dispatch): method_calls ~290× Node — remaining cost is per-field-access shape-guard calls (plan + standby) #5093 warns about — storing a non-number into a number-typed slot through an
    any alias must clear the intact bit so a later read returns the boxed value, not
    the pointer bits read as a raw double. Passes byte-for-byte vs
    node --experimental-strip-types
    , including under PERRY_VERIFY_TYPED_INTACT=1.
  • cargo test -p perry-runtime: gc/layout + typed_feedback/guards pass (1041 ok).
    The lone failure (builtin_prototype_methods_reject_dynamic_new) is a pre-existing
    parallel test-isolation flake
    origin/main fails a different test
    (stream_constructors_expose_static_method_values) with the identical 1-failure
    pattern; the test passes in isolation and within its own module.
  • Local parity sweep: the two mismatches (object_methods strict-mode freeze-write
    throw; perfhooks TypeError message wording) reproduce identically with the
    inline path disabled
    , i.e. they're pre-existing gaps in code paths this change
    doesn't touch.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Added an inline “class-field shape” fast-path pre-check for monomorphic this.field reads/writes, with safe fallback to the existing guarded path.
    • Introduced a GC typed-layout integrity flag and additional typed-intact verification under optional runtime checks.
    • Automatically disables the inline fast path when typed-feedback tracing or verification is enabled.
  • Tests
    • Added coverage for raw-f64 fast-path downgrade safety to ensure correct type behavior after non-number writes and subsequent repromotion.

@coderabbitai

coderabbitai Bot commented Jun 15, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

A new inlined class-field shape guard fast path is added across codegen and runtime. A per-object GC_OBJ_TYPED_LAYOUT_INTACT header bit is introduced to track canonical typed-descriptor state. A process-global PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED flag controls fallback to full guard calls. The codegen emits an inline precheck before property get/set guard calls, and a runtime verifier cross-checks intact-bit consistency.

Changes

Class Field Inline Guard Fast Path

Layer / File(s) Summary
GC typed-layout intact bit definition and lifecycle
crates/perry-runtime/src/gc/layout.rs
Defines GC_OBJ_TYPED_LAYOUT_INTACT, adds header bit helpers (header_set_typed_layout_intact, header_clear_typed_layout_intact, clear_typed_layout_intact_for_user), integrates bit set/clear into typed-descriptor init paths, layout_mark_unknown, layout_clear_for_ptr, layout_set_typed_unknown, layout_rebuild_from_slots_with_policy, layout_transfer, and adds layout_typed_intact_for_user query.
Process-global inline-guard disable flag and GC init wiring
crates/perry-runtime/src/typed_feedback.rs, crates/perry-runtime/src/object/mod.rs, crates/perry-runtime/src/gc/mod.rs
Adds typed_feedback_active() wrapper, introduces PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED AtomicU8 with disable_class_field_inline_guard() and class_field_inline_guard_enabled() helpers, calls the disable from set_property_attrs and set_accessor_descriptor, and wires js_gc_init() to disable the guard under typed-feedback or PERRY_DISABLE_CLASS_FIELD_INLINE via the new env_flag_enabled() truthy parser.
Codegen inline precheck module and runtime global declaration
crates/perry-codegen/src/runtime_decls/objects.rs, crates/perry-codegen/src/expr/mod.rs, crates/perry-codegen/src/expr/class_field_inline_guard.rs
Declares the external PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED global for codegen, adds the class_field_inline_guard submodule, and implements emit_class_field_inline_precheck emitting IR that gates on pointer safety, object/class header metadata, disable flag, and optional raw-f64/not-frozen/finite-f64 constraints, branching to fast_label on hit or a new guardcall block on miss.
Inline precheck wired into property get and set codegen
crates/perry-codegen/src/expr/property_get.rs, crates/perry-codegen/src/expr/property_set.rs
Precomputes guard operands once in both get and set paths; inserts emit_class_field_inline_precheck before js_typed_feedback_class_field_get_guard and (for raw-f64 candidates) before js_typed_feedback_class_field_set_guard; removes redundant obj_bits recomputation in the set fallback path.
Runtime guard integrity verification
crates/perry-runtime/src/typed_feedback/guards.rs
Refactors class_field_fast_contract shape check into a shape_ok predicate; adds a conditional cross-check via verify_typed_intact_enabled() that reads the intact bit and typed-raw side-table, aborting on inconsistency; adds the verify_typed_intact_enabled() helper with static-atomic caching in non-test builds.
Raw-f64 downgrade regression test
test-files/test_class_field_raw_f64_downgrade.ts
Adds a TypeScript test that warms up the raw-f64 path, forces downgrade via any-typed non-number writes, checks value/type correctness, repromotes, and verifies null and object downgrade safety.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

Poem

🐇 A bit in the header, a flag in the flag,
The guard skips the hashmap — no need to drag!
If shapes stay intact and the class IDs match,
The fast path just fires without any catch.
But freeze it or break it, the bit tumbles down,
And the full guard runs in — no garbage, no frown. ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately reflects the main change: inlining the class-field shape guard fast path for improved codegen performance. It is specific, concise, and clearly summarizes the primary objective.
Description check ✅ Passed The description is comprehensive and well-structured, covering summary, mechanism details, scope clarification, and testing. It follows the repository's template structure with all key sections properly filled.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/issue-5093-inline-shape-guard

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/perry-codegen/src/expr/class_field_inline_guard.rs`:
- Around line 75-141: The runtime flags PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED
and PERRY_VERIFY_TYPED_INTACT are being checked too late in the execution flow,
after the code has already performed unsafe memory loads from obj_handle
offsets. Move the flag loading and checking to occur before the conditional
branch that routes to deref_label, so that when either flag indicates the inline
path should be disabled, the code branches to guardcall_label instead of
executing the crash-prone inline dereferences. Incorporate both the
PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED check (currently done with the I8 load
and icmp_eq call near the end of the deref block) and a corresponding check for
PERRY_VERIFY_TYPED_INTACT into a combined gate condition that is evaluated
before the cond_br call, replacing the current late-stage flag validation that
occurs after loading from obj_ptr.

In `@crates/perry-codegen/src/runtime_decls/objects.rs`:
- Around line 26-31: The external global declaration for
PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED uses plain I8 type, but the runtime
defines this symbol as AtomicU8, creating a contract mismatch. Change the
add_external_global call to declare this symbol with atomic semantics to match
the runtime's definition, ensuring the codegen emits a relaxed atomic load
instead of a standard load when reading this flag in the class-field inline
guard path. Update both the declaration in objects.rs and the corresponding load
operation in class_field_inline_guard.rs to use atomic load semantics that
respect the atomic nature of the runtime's AtomicU8 definition.

In `@crates/perry-runtime/src/gc/mod.rs`:
- Around line 447-449: The condition checking PERRY_DISABLE_CLASS_FIELD_INLINE
uses .is_some() which only verifies the variable's presence, but the documented
escape hatch requires parsing its actual value (checking if it equals "1").
Replace the var_os("PERRY_DISABLE_CLASS_FIELD_INLINE").is_some() check with
logic that parses the environment variable's value and returns true only when it
equals "1", so that setting it to "0" or "false" will correctly allow the fast
path instead of disabling it.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: ecc743f5-0187-4676-a851-25362cffc7a2

📥 Commits

Reviewing files that changed from the base of the PR and between a1d9ce6 and 152ae93.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (14)
  • CHANGELOG.md
  • CLAUDE.md
  • Cargo.toml
  • crates/perry-codegen/src/expr/class_field_inline_guard.rs
  • crates/perry-codegen/src/expr/mod.rs
  • crates/perry-codegen/src/expr/property_get.rs
  • crates/perry-codegen/src/expr/property_set.rs
  • crates/perry-codegen/src/runtime_decls/objects.rs
  • crates/perry-runtime/src/gc/layout.rs
  • crates/perry-runtime/src/gc/mod.rs
  • crates/perry-runtime/src/object/mod.rs
  • crates/perry-runtime/src/typed_feedback.rs
  • crates/perry-runtime/src/typed_feedback/guards.rs
  • test-files/test_class_field_raw_f64_downgrade.ts

Comment on lines +75 to +141
// Gate the dereference: a basic block has no short-circuit, so the field
// loads below must only run once we know the receiver is a real heap object
// (POINTER_TAG and above the handle band). Otherwise fall to the guard call,
// which classifies non-pointer / handle receivers safely.
{
let blk = ctx.block();
let tag = blk.lshr(I64, obj_bits, "48");
let is_ptr = blk.icmp_eq(I64, &tag, POINTER_TAG_HI16);
let above_band = blk.icmp_ugt(I64, obj_handle, HANDLE_BAND_TOP);
let ptr_safe = blk.and(I1, &is_ptr, &above_band);
blk.cond_br(&ptr_safe, &deref_label, &guardcall_label);
}

ctx.current_block = deref_idx;
{
let blk = ctx.block();
let obj_ptr = blk.inttoptr(I64, obj_handle);

// GcHeader (precedes the object by 8 bytes): obj_type @-8 (i8),
// gc_flags @-7 (i8), _reserved @-6 (i16).
let gtype_ptr = blk.gep(I8, &obj_ptr, &[(I64, "-8")]);
let gtype = blk.load(I8, &gtype_ptr);
let gtype_ok = blk.icmp_eq(I8, &gtype, GC_TYPE_OBJECT);

let gflags_ptr = blk.gep(I8, &obj_ptr, &[(I64, "-7")]);
let gflags = blk.load(I8, &gflags_ptr);
let fwd = blk.and(I8, &gflags, GC_FLAG_FORWARDED_I8);
let not_fwd = blk.icmp_eq(I8, &fwd, "0");

let res_ptr = blk.gep(I8, &obj_ptr, &[(I64, "-6")]);
let reserved = blk.load(I16, &res_ptr);

// ObjectHeader: object_type @0 (i32)==REGULAR, class_id @4 (i32),
// field_count @12 (i32), keys_array @16 (i64).
let object_type = blk.load(I32, &obj_ptr);
let ot_ok = blk.icmp_eq(I32, &object_type, OBJECT_TYPE_REGULAR);

let cid_ptr = blk.gep(I8, &obj_ptr, &[(I64, "4")]);
let class_id = blk.load(I32, &cid_ptr);
let cid_ok = blk.icmp_eq(I32, &class_id, expected_class_id);

let fc_ptr = blk.gep(I8, &obj_ptr, &[(I64, "12")]);
let field_count = blk.load(I32, &fc_ptr);
let fc_ok = blk.icmp_ugt(I32, &field_count, &field_index_str);

let ka_ptr = blk.gep(I8, &obj_ptr, &[(I64, "16")]);
let keys_array = blk.load(I64, &ka_ptr);
let ka_ok = blk.icmp_eq(I64, &keys_array, expected_keys);

// Process-global gate: descriptors / typed-feedback in use disable the
// inline path (the guard then has observable behavior to preserve).
let flag = blk.load(I8, "@PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED");
let flag_ok = blk.icmp_eq(I8, &flag, "0");

let mut acc = blk.and(I1, &gtype_ok, &not_fwd);
acc = blk.and(I1, &acc, &ot_ok);
acc = blk.and(I1, &acc, &cid_ok);
acc = blk.and(I1, &acc, &fc_ok);
acc = blk.and(I1, &acc, &ka_ok);
acc = blk.and(I1, &acc, &flag_ok);

if require_raw_f64 {
// The slot is read/written as a raw double, so the per-object typed
// layout must be intact (no downgrade to a NaN-boxed value).
let intact = blk.and(I16, &reserved, TYPED_LAYOUT_INTACT_BIT);
let intact_ok = blk.icmp_ne(I16, &intact, "0");
acc = blk.and(I1, &acc, &intact_ok);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Apply the runtime disable/verify gates before any inline dereference.

PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED is checked after the helper has already loaded from obj_handle - 8, obj_handle, and ObjectHeader offsets. That means PERRY_DISABLE_CLASS_FIELD_INLINE=1 still executes the crash-prone inline layout reads instead of cleanly falling back to the guard call. Also ensure PERRY_VERIFY_TYPED_INTACT=1 forces this same guard-call path; otherwise raw-f64 inline hits skip the runtime verifier that cross-checks the intact bit.

Suggested direction
     {
         let blk = ctx.block();
+        let flag = blk.load(I8, "`@PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED`");
+        let flag_ok = blk.icmp_eq(I8, &flag, "0");
         let tag = blk.lshr(I64, obj_bits, "48");
         let is_ptr = blk.icmp_eq(I64, &tag, POINTER_TAG_HI16);
         let above_band = blk.icmp_ugt(I64, obj_handle, HANDLE_BAND_TOP);
         let ptr_safe = blk.and(I1, &is_ptr, &above_band);
-        blk.cond_br(&ptr_safe, &deref_label, &guardcall_label);
+        let can_inline_deref = blk.and(I1, &ptr_safe, &flag_ok);
+        blk.cond_br(&can_inline_deref, &deref_label, &guardcall_label);
     }
@@
-        // Process-global gate: descriptors / typed-feedback in use disable the
-        // inline path (the guard then has observable behavior to preserve).
-        let flag = blk.load(I8, "`@PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED`");
-        let flag_ok = blk.icmp_eq(I8, &flag, "0");
-
         let mut acc = blk.and(I1, &gtype_ok, &not_fwd);
@@
-        acc = blk.and(I1, &acc, &flag_ok);

Then wire PERRY_VERIFY_TYPED_INTACT=1 into the same disabled flag during runtime initialization, or add an equivalent pre-deref verifier gate that branches to guardcall_label.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Gate the dereference: a basic block has no short-circuit, so the field
// loads below must only run once we know the receiver is a real heap object
// (POINTER_TAG and above the handle band). Otherwise fall to the guard call,
// which classifies non-pointer / handle receivers safely.
{
let blk = ctx.block();
let tag = blk.lshr(I64, obj_bits, "48");
let is_ptr = blk.icmp_eq(I64, &tag, POINTER_TAG_HI16);
let above_band = blk.icmp_ugt(I64, obj_handle, HANDLE_BAND_TOP);
let ptr_safe = blk.and(I1, &is_ptr, &above_band);
blk.cond_br(&ptr_safe, &deref_label, &guardcall_label);
}
ctx.current_block = deref_idx;
{
let blk = ctx.block();
let obj_ptr = blk.inttoptr(I64, obj_handle);
// GcHeader (precedes the object by 8 bytes): obj_type @-8 (i8),
// gc_flags @-7 (i8), _reserved @-6 (i16).
let gtype_ptr = blk.gep(I8, &obj_ptr, &[(I64, "-8")]);
let gtype = blk.load(I8, &gtype_ptr);
let gtype_ok = blk.icmp_eq(I8, &gtype, GC_TYPE_OBJECT);
let gflags_ptr = blk.gep(I8, &obj_ptr, &[(I64, "-7")]);
let gflags = blk.load(I8, &gflags_ptr);
let fwd = blk.and(I8, &gflags, GC_FLAG_FORWARDED_I8);
let not_fwd = blk.icmp_eq(I8, &fwd, "0");
let res_ptr = blk.gep(I8, &obj_ptr, &[(I64, "-6")]);
let reserved = blk.load(I16, &res_ptr);
// ObjectHeader: object_type @0 (i32)==REGULAR, class_id @4 (i32),
// field_count @12 (i32), keys_array @16 (i64).
let object_type = blk.load(I32, &obj_ptr);
let ot_ok = blk.icmp_eq(I32, &object_type, OBJECT_TYPE_REGULAR);
let cid_ptr = blk.gep(I8, &obj_ptr, &[(I64, "4")]);
let class_id = blk.load(I32, &cid_ptr);
let cid_ok = blk.icmp_eq(I32, &class_id, expected_class_id);
let fc_ptr = blk.gep(I8, &obj_ptr, &[(I64, "12")]);
let field_count = blk.load(I32, &fc_ptr);
let fc_ok = blk.icmp_ugt(I32, &field_count, &field_index_str);
let ka_ptr = blk.gep(I8, &obj_ptr, &[(I64, "16")]);
let keys_array = blk.load(I64, &ka_ptr);
let ka_ok = blk.icmp_eq(I64, &keys_array, expected_keys);
// Process-global gate: descriptors / typed-feedback in use disable the
// inline path (the guard then has observable behavior to preserve).
let flag = blk.load(I8, "@PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED");
let flag_ok = blk.icmp_eq(I8, &flag, "0");
let mut acc = blk.and(I1, &gtype_ok, &not_fwd);
acc = blk.and(I1, &acc, &ot_ok);
acc = blk.and(I1, &acc, &cid_ok);
acc = blk.and(I1, &acc, &fc_ok);
acc = blk.and(I1, &acc, &ka_ok);
acc = blk.and(I1, &acc, &flag_ok);
if require_raw_f64 {
// The slot is read/written as a raw double, so the per-object typed
// layout must be intact (no downgrade to a NaN-boxed value).
let intact = blk.and(I16, &reserved, TYPED_LAYOUT_INTACT_BIT);
let intact_ok = blk.icmp_ne(I16, &intact, "0");
acc = blk.and(I1, &acc, &intact_ok);
// Gate the dereference: a basic block has no short-circuit, so the field
// loads below must only run once we know the receiver is a real heap object
// (POINTER_TAG and above the handle band). Otherwise fall to the guard call,
// which classifies non-pointer / handle receivers safely.
{
let blk = ctx.block();
let flag = blk.load(I8, "`@PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED`");
let flag_ok = blk.icmp_eq(I8, &flag, "0");
let tag = blk.lshr(I64, obj_bits, "48");
let is_ptr = blk.icmp_eq(I64, &tag, POINTER_TAG_HI16);
let above_band = blk.icmp_ugt(I64, obj_handle, HANDLE_BAND_TOP);
let ptr_safe = blk.and(I1, &is_ptr, &above_band);
let can_inline_deref = blk.and(I1, &ptr_safe, &flag_ok);
blk.cond_br(&can_inline_deref, &deref_label, &guardcall_label);
}
ctx.current_block = deref_idx;
{
let blk = ctx.block();
let obj_ptr = blk.inttoptr(I64, obj_handle);
// GcHeader (precedes the object by 8 bytes): obj_type `@-8` (i8),
// gc_flags `@-7` (i8), _reserved `@-6` (i16).
let gtype_ptr = blk.gep(I8, &obj_ptr, &[(I64, "-8")]);
let gtype = blk.load(I8, &gtype_ptr);
let gtype_ok = blk.icmp_eq(I8, &gtype, GC_TYPE_OBJECT);
let gflags_ptr = blk.gep(I8, &obj_ptr, &[(I64, "-7")]);
let gflags = blk.load(I8, &gflags_ptr);
let fwd = blk.and(I8, &gflags, GC_FLAG_FORWARDED_I8);
let not_fwd = blk.icmp_eq(I8, &fwd, "0");
let res_ptr = blk.gep(I8, &obj_ptr, &[(I64, "-6")]);
let reserved = blk.load(I16, &res_ptr);
// ObjectHeader: object_type `@0` (i32)==REGULAR, class_id `@4` (i32),
// field_count `@12` (i32), keys_array `@16` (i64).
let object_type = blk.load(I32, &obj_ptr);
let ot_ok = blk.icmp_eq(I32, &object_type, OBJECT_TYPE_REGULAR);
let cid_ptr = blk.gep(I8, &obj_ptr, &[(I64, "4")]);
let class_id = blk.load(I32, &cid_ptr);
let cid_ok = blk.icmp_eq(I32, &class_id, expected_class_id);
let fc_ptr = blk.gep(I8, &obj_ptr, &[(I64, "12")]);
let field_count = blk.load(I32, &fc_ptr);
let fc_ok = blk.icmp_ugt(I32, &field_count, &field_index_str);
let ka_ptr = blk.gep(I8, &obj_ptr, &[(I64, "16")]);
let keys_array = blk.load(I64, &ka_ptr);
let ka_ok = blk.icmp_eq(I64, &keys_array, expected_keys);
let mut acc = blk.and(I1, &gtype_ok, &not_fwd);
acc = blk.and(I1, &acc, &ot_ok);
acc = blk.and(I1, &acc, &cid_ok);
acc = blk.and(I1, &acc, &fc_ok);
acc = blk.and(I1, &acc, &ka_ok);
if require_raw_f64 {
// The slot is read/written as a raw double, so the per-object typed
// layout must be intact (no downgrade to a NaN-boxed value).
let intact = blk.and(I16, &reserved, TYPED_LAYOUT_INTACT_BIT);
let intact_ok = blk.icmp_ne(I16, &intact, "0");
acc = blk.and(I1, &acc, &intact_ok);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/perry-codegen/src/expr/class_field_inline_guard.rs` around lines 75 -
141, The runtime flags PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED and
PERRY_VERIFY_TYPED_INTACT are being checked too late in the execution flow,
after the code has already performed unsafe memory loads from obj_handle
offsets. Move the flag loading and checking to occur before the conditional
branch that routes to deref_label, so that when either flag indicates the inline
path should be disabled, the code branches to guardcall_label instead of
executing the crash-prone inline dereferences. Incorporate both the
PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED check (currently done with the I8 load
and icmp_eq call near the end of the deref block) and a corresponding check for
PERRY_VERIFY_TYPED_INTACT into a combined gate condition that is evaluated
before the cond_br call, replacing the current late-stage flag validation that
occurs after loading from obj_ptr.

Comment on lines +26 to +31
// #5093: sticky runtime flag (i8, 0 = enabled) gating the codegen-inlined
// class-field shape-guard fast path. The inline guard loads this directly
// and falls back to the full `js_typed_feedback_class_field_*_guard` call
// when it is non-zero (descriptors / typed-feedback in use). Defined in
// perry-runtime as `PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED`.
module.add_external_global("PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED", I8);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Inspect the generated-code load of PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED.
#
# Expected: the inline precheck either emits an atomic/volatile load, calls a runtime helper
# such as class_field_inline_guard_enabled(), or has an explicit single-threaded guarantee
# around all reads/writes of this symbol.
rg -n -C4 'PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED|class_field_inline_guard_enabled|disable_class_field_inline_guard|atomic|volatile' crates/perry-codegen/src crates/perry-runtime/src

# Print the inline precheck module if present so the load instruction can be reviewed in context.
fd -i '^class_field_inline_guard\.rs$' crates/perry-codegen/src --exec sed -n '1,260p' {}

Repository: PerryTS/perry

Length of output: 50372


🏁 Script executed:

# Find exact runtime definition of PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED
rg -n 'PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED' crates/perry-runtime/src --type rust -A 2 -B 2

# Search for the codegen file that emits the load
fd -i 'class.*field.*guard' crates/perry-codegen/src --type f

# Search for where this is being loaded in codegen
rg -n 'emit.*class.*field\|class_field.*load' crates/perry-codegen/src --type rust -A 3 -B 1

Repository: PerryTS/perry

Length of output: 1976


🏁 Script executed:

# Read the codegen file that emits the class field inline guard load
cat -n crates/perry-codegen/src/expr/class_field_inline_guard.rs | head -150

Repository: PerryTS/perry

Length of output: 8474


Verify the generated guard reads this atomic flag atomically.

The runtime defines PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED as AtomicU8 and documents that the codegen should emit a "relaxed load". However, codegen declares it as a plain I8 global and emits a standard (non-atomic) load at line 126 in class_field_inline_guard.rs. While the flag is monotonic (set-only), the declaration mismatch creates an implicit contract violation: the codegen should declare this symbol with atomic intent to match the runtime's AtomicU8.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/perry-codegen/src/runtime_decls/objects.rs` around lines 26 - 31, The
external global declaration for PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED uses
plain I8 type, but the runtime defines this symbol as AtomicU8, creating a
contract mismatch. Change the add_external_global call to declare this symbol
with atomic semantics to match the runtime's definition, ensuring the codegen
emits a relaxed atomic load instead of a standard load when reading this flag in
the class-field inline guard path. Update both the declaration in objects.rs and
the corresponding load operation in class_field_inline_guard.rs to use atomic
load semantics that respect the atomic nature of the runtime's AtomicU8
definition.

Comment thread crates/perry-runtime/src/gc/mod.rs
@proggeramlug proggeramlug force-pushed the fix/issue-5093-inline-shape-guard branch from 152ae93 to 44f9666 Compare June 15, 2026 14:02
Monomorphic this.field reads/writes on a known class instance previously
routed every access through a cross-crate js_typed_feedback_class_field_*
_guard call before touching the raw slot. #5093 measured the call itself as
the dominant per-access cost. Emit the cheap part of the guard contract as
inline IR (pointer-tag gate, gc-type/forwarded/object-type/class_id/keys/
field_count checks, plus a per-object typed-layout intact bit for raw-f64
fields, frozen + plain-number checks for raw sets) and branch straight to
the fast slot load/store on a hit, skipping the call. On any miss it falls
through to the unchanged guard-call path, so the change is purely additive.

Runtime: add a single per-object GC_OBJ_TYPED_LAYOUT_INTACT bit (free bit
of GcHeader._reserved), set when an object's canonical typed descriptor is
installed and cleared on every downgrade/removal path. Because raw-f64
downgrade is all-or-nothing (the whole descriptor is removed), this one bit
stands in for the thread-local raw-f64 layout probe. Add the exported
PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED flag (flips on when descriptors /
typed-feedback come into use; PERRY_DISABLE_CLASS_FIELD_INLINE=1 escape
hatch) and a PERRY_VERIFY_TYPED_INTACT=1 self-check.

NOTE: incremental win only (~10% on 09_method_calls; skips the call on the
hot path). It does NOT hit #5093's ~60x target: LLVM cannot collapse the
loop because the cold fallback call remains a GC safepoint that forces the
shadow-rooted receiver to reload every iteration, defeating LICM. The full
collapse needs a loop-versioning/deopt-exit pass (tracked in #5093).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@proggeramlug proggeramlug force-pushed the fix/issue-5093-inline-shape-guard branch from 44f9666 to eb592b0 Compare June 15, 2026 14:11
@proggeramlug

Copy link
Copy Markdown
Contributor Author

Addressed CodeRabbit review:

  1. (Major) The runtime enable-flag is now checked at the gate before any dereference (can_inline = ptr_safe && flag_ok), so a disabled inline path / verify mode skips the inline reads entirely. PERRY_VERIFY_TYPED_INTACT now also disables the inline path in js_gc_init, so the intact-bit verifier (which lives in the guard's fast contract) runs on every access in verify mode instead of being skipped on inline hits.
  2. (Critical) The generated flag load is now load volatile — LLVM can't hoist a stale 0 across a mid-execution flip (matches the runtime's relaxed-atomic read).
  3. (Minor) PERRY_DISABLE_CLASS_FIELD_INLINE and PERRY_VERIFY_TYPED_INTACT now parse the value (1/true/on/yes) instead of mere presence, so =0/=false no longer enable them.

Also fixed the failing lint check (cargo fmt). Re-validated: test_class_field_raw_f64_downgrade.ts matches Node byte-for-byte both inline-active and under PERRY_VERIFY_TYPED_INTACT=1 (verifier on every access, no abort).

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
crates/perry-runtime/src/typed_feedback/guards.rs (1)

348-355: ⚡ Quick win

Avoid duplicating truthy-env parsing logic across modules.

At Line 348, the PERRY_VERIFY_TYPED_INTACT parser duplicates gc/mod.rs::env_flag_enabled semantics. Consider sharing one parser to prevent future drift between “disable inline” and “enable verifier” control paths.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/perry-runtime/src/typed_feedback/guards.rs` around lines 348 - 355,
The environment variable parsing logic for PERRY_VERIFY_TYPED_INTACT at lines
348-355 duplicates the truthy-value matching semantics already implemented in
the env_flag_enabled function from gc/mod.rs. Replace the std::env::var call and
the manual matches! logic with a call to the shared env_flag_enabled function,
passing "PERRY_VERIFY_TYPED_INTACT" as the argument. This eliminates code
duplication and ensures both the "disable inline" and "enable verifier" control
paths use consistent environment flag parsing.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@crates/perry-runtime/src/typed_feedback/guards.rs`:
- Around line 348-355: The environment variable parsing logic for
PERRY_VERIFY_TYPED_INTACT at lines 348-355 duplicates the truthy-value matching
semantics already implemented in the env_flag_enabled function from gc/mod.rs.
Replace the std::env::var call and the manual matches! logic with a call to the
shared env_flag_enabled function, passing "PERRY_VERIFY_TYPED_INTACT" as the
argument. This eliminates code duplication and ensures both the "disable inline"
and "enable verifier" control paths use consistent environment flag parsing.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: fb02f17e-b140-4abc-b594-ee51ed0c1dee

📥 Commits

Reviewing files that changed from the base of the PR and between 44f9666 and eb592b0.

📒 Files selected for processing (11)
  • crates/perry-codegen/src/expr/class_field_inline_guard.rs
  • crates/perry-codegen/src/expr/mod.rs
  • crates/perry-codegen/src/expr/property_get.rs
  • crates/perry-codegen/src/expr/property_set.rs
  • crates/perry-codegen/src/runtime_decls/objects.rs
  • crates/perry-runtime/src/gc/layout.rs
  • crates/perry-runtime/src/gc/mod.rs
  • crates/perry-runtime/src/object/mod.rs
  • crates/perry-runtime/src/typed_feedback.rs
  • crates/perry-runtime/src/typed_feedback/guards.rs
  • test-files/test_class_field_raw_f64_downgrade.ts
✅ Files skipped from review due to trivial changes (1)
  • crates/perry-codegen/src/expr/mod.rs
🚧 Files skipped from review as they are similar to previous changes (7)
  • test-files/test_class_field_raw_f64_downgrade.ts
  • crates/perry-runtime/src/typed_feedback.rs
  • crates/perry-codegen/src/runtime_decls/objects.rs
  • crates/perry-codegen/src/expr/class_field_inline_guard.rs
  • crates/perry-codegen/src/expr/property_get.rs
  • crates/perry-codegen/src/expr/property_set.rs
  • crates/perry-runtime/src/object/mod.rs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant