perf(codegen): inline class-field shape guard fast path (#5093)#5198
perf(codegen): inline class-field shape guard fast path (#5093)#5198proggeramlug wants to merge 1 commit into
Conversation
📝 WalkthroughWalkthroughA new inlined class-field shape guard fast path is added across codegen and runtime. A per-object ChangesClass Field Inline Guard Fast Path
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/perry-codegen/src/expr/class_field_inline_guard.rs`:
- Around line 75-141: The runtime flags PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED
and PERRY_VERIFY_TYPED_INTACT are being checked too late in the execution flow,
after the code has already performed unsafe memory loads from obj_handle
offsets. Move the flag loading and checking to occur before the conditional
branch that routes to deref_label, so that when either flag indicates the inline
path should be disabled, the code branches to guardcall_label instead of
executing the crash-prone inline dereferences. Incorporate both the
PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED check (currently done with the I8 load
and icmp_eq call near the end of the deref block) and a corresponding check for
PERRY_VERIFY_TYPED_INTACT into a combined gate condition that is evaluated
before the cond_br call, replacing the current late-stage flag validation that
occurs after loading from obj_ptr.
In `@crates/perry-codegen/src/runtime_decls/objects.rs`:
- Around line 26-31: The external global declaration for
PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED uses plain I8 type, but the runtime
defines this symbol as AtomicU8, creating a contract mismatch. Change the
add_external_global call to declare this symbol with atomic semantics to match
the runtime's definition, ensuring the codegen emits a relaxed atomic load
instead of a standard load when reading this flag in the class-field inline
guard path. Update both the declaration in objects.rs and the corresponding load
operation in class_field_inline_guard.rs to use atomic load semantics that
respect the atomic nature of the runtime's AtomicU8 definition.
In `@crates/perry-runtime/src/gc/mod.rs`:
- Around line 447-449: The condition checking PERRY_DISABLE_CLASS_FIELD_INLINE
uses .is_some() which only verifies the variable's presence, but the documented
escape hatch requires parsing its actual value (checking if it equals "1").
Replace the var_os("PERRY_DISABLE_CLASS_FIELD_INLINE").is_some() check with
logic that parses the environment variable's value and returns true only when it
equals "1", so that setting it to "0" or "false" will correctly allow the fast
path instead of disabling it.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: ecc743f5-0187-4676-a851-25362cffc7a2
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (14)
CHANGELOG.mdCLAUDE.mdCargo.tomlcrates/perry-codegen/src/expr/class_field_inline_guard.rscrates/perry-codegen/src/expr/mod.rscrates/perry-codegen/src/expr/property_get.rscrates/perry-codegen/src/expr/property_set.rscrates/perry-codegen/src/runtime_decls/objects.rscrates/perry-runtime/src/gc/layout.rscrates/perry-runtime/src/gc/mod.rscrates/perry-runtime/src/object/mod.rscrates/perry-runtime/src/typed_feedback.rscrates/perry-runtime/src/typed_feedback/guards.rstest-files/test_class_field_raw_f64_downgrade.ts
| // Gate the dereference: a basic block has no short-circuit, so the field | ||
| // loads below must only run once we know the receiver is a real heap object | ||
| // (POINTER_TAG and above the handle band). Otherwise fall to the guard call, | ||
| // which classifies non-pointer / handle receivers safely. | ||
| { | ||
| let blk = ctx.block(); | ||
| let tag = blk.lshr(I64, obj_bits, "48"); | ||
| let is_ptr = blk.icmp_eq(I64, &tag, POINTER_TAG_HI16); | ||
| let above_band = blk.icmp_ugt(I64, obj_handle, HANDLE_BAND_TOP); | ||
| let ptr_safe = blk.and(I1, &is_ptr, &above_band); | ||
| blk.cond_br(&ptr_safe, &deref_label, &guardcall_label); | ||
| } | ||
|
|
||
| ctx.current_block = deref_idx; | ||
| { | ||
| let blk = ctx.block(); | ||
| let obj_ptr = blk.inttoptr(I64, obj_handle); | ||
|
|
||
| // GcHeader (precedes the object by 8 bytes): obj_type @-8 (i8), | ||
| // gc_flags @-7 (i8), _reserved @-6 (i16). | ||
| let gtype_ptr = blk.gep(I8, &obj_ptr, &[(I64, "-8")]); | ||
| let gtype = blk.load(I8, >ype_ptr); | ||
| let gtype_ok = blk.icmp_eq(I8, >ype, GC_TYPE_OBJECT); | ||
|
|
||
| let gflags_ptr = blk.gep(I8, &obj_ptr, &[(I64, "-7")]); | ||
| let gflags = blk.load(I8, &gflags_ptr); | ||
| let fwd = blk.and(I8, &gflags, GC_FLAG_FORWARDED_I8); | ||
| let not_fwd = blk.icmp_eq(I8, &fwd, "0"); | ||
|
|
||
| let res_ptr = blk.gep(I8, &obj_ptr, &[(I64, "-6")]); | ||
| let reserved = blk.load(I16, &res_ptr); | ||
|
|
||
| // ObjectHeader: object_type @0 (i32)==REGULAR, class_id @4 (i32), | ||
| // field_count @12 (i32), keys_array @16 (i64). | ||
| let object_type = blk.load(I32, &obj_ptr); | ||
| let ot_ok = blk.icmp_eq(I32, &object_type, OBJECT_TYPE_REGULAR); | ||
|
|
||
| let cid_ptr = blk.gep(I8, &obj_ptr, &[(I64, "4")]); | ||
| let class_id = blk.load(I32, &cid_ptr); | ||
| let cid_ok = blk.icmp_eq(I32, &class_id, expected_class_id); | ||
|
|
||
| let fc_ptr = blk.gep(I8, &obj_ptr, &[(I64, "12")]); | ||
| let field_count = blk.load(I32, &fc_ptr); | ||
| let fc_ok = blk.icmp_ugt(I32, &field_count, &field_index_str); | ||
|
|
||
| let ka_ptr = blk.gep(I8, &obj_ptr, &[(I64, "16")]); | ||
| let keys_array = blk.load(I64, &ka_ptr); | ||
| let ka_ok = blk.icmp_eq(I64, &keys_array, expected_keys); | ||
|
|
||
| // Process-global gate: descriptors / typed-feedback in use disable the | ||
| // inline path (the guard then has observable behavior to preserve). | ||
| let flag = blk.load(I8, "@PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED"); | ||
| let flag_ok = blk.icmp_eq(I8, &flag, "0"); | ||
|
|
||
| let mut acc = blk.and(I1, >ype_ok, ¬_fwd); | ||
| acc = blk.and(I1, &acc, &ot_ok); | ||
| acc = blk.and(I1, &acc, &cid_ok); | ||
| acc = blk.and(I1, &acc, &fc_ok); | ||
| acc = blk.and(I1, &acc, &ka_ok); | ||
| acc = blk.and(I1, &acc, &flag_ok); | ||
|
|
||
| if require_raw_f64 { | ||
| // The slot is read/written as a raw double, so the per-object typed | ||
| // layout must be intact (no downgrade to a NaN-boxed value). | ||
| let intact = blk.and(I16, &reserved, TYPED_LAYOUT_INTACT_BIT); | ||
| let intact_ok = blk.icmp_ne(I16, &intact, "0"); | ||
| acc = blk.and(I1, &acc, &intact_ok); |
There was a problem hiding this comment.
Apply the runtime disable/verify gates before any inline dereference.
PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED is checked after the helper has already loaded from obj_handle - 8, obj_handle, and ObjectHeader offsets. That means PERRY_DISABLE_CLASS_FIELD_INLINE=1 still executes the crash-prone inline layout reads instead of cleanly falling back to the guard call. Also ensure PERRY_VERIFY_TYPED_INTACT=1 forces this same guard-call path; otherwise raw-f64 inline hits skip the runtime verifier that cross-checks the intact bit.
Suggested direction
{
let blk = ctx.block();
+ let flag = blk.load(I8, "`@PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED`");
+ let flag_ok = blk.icmp_eq(I8, &flag, "0");
let tag = blk.lshr(I64, obj_bits, "48");
let is_ptr = blk.icmp_eq(I64, &tag, POINTER_TAG_HI16);
let above_band = blk.icmp_ugt(I64, obj_handle, HANDLE_BAND_TOP);
let ptr_safe = blk.and(I1, &is_ptr, &above_band);
- blk.cond_br(&ptr_safe, &deref_label, &guardcall_label);
+ let can_inline_deref = blk.and(I1, &ptr_safe, &flag_ok);
+ blk.cond_br(&can_inline_deref, &deref_label, &guardcall_label);
}
@@
- // Process-global gate: descriptors / typed-feedback in use disable the
- // inline path (the guard then has observable behavior to preserve).
- let flag = blk.load(I8, "`@PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED`");
- let flag_ok = blk.icmp_eq(I8, &flag, "0");
-
let mut acc = blk.and(I1, >ype_ok, ¬_fwd);
@@
- acc = blk.and(I1, &acc, &flag_ok);Then wire PERRY_VERIFY_TYPED_INTACT=1 into the same disabled flag during runtime initialization, or add an equivalent pre-deref verifier gate that branches to guardcall_label.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Gate the dereference: a basic block has no short-circuit, so the field | |
| // loads below must only run once we know the receiver is a real heap object | |
| // (POINTER_TAG and above the handle band). Otherwise fall to the guard call, | |
| // which classifies non-pointer / handle receivers safely. | |
| { | |
| let blk = ctx.block(); | |
| let tag = blk.lshr(I64, obj_bits, "48"); | |
| let is_ptr = blk.icmp_eq(I64, &tag, POINTER_TAG_HI16); | |
| let above_band = blk.icmp_ugt(I64, obj_handle, HANDLE_BAND_TOP); | |
| let ptr_safe = blk.and(I1, &is_ptr, &above_band); | |
| blk.cond_br(&ptr_safe, &deref_label, &guardcall_label); | |
| } | |
| ctx.current_block = deref_idx; | |
| { | |
| let blk = ctx.block(); | |
| let obj_ptr = blk.inttoptr(I64, obj_handle); | |
| // GcHeader (precedes the object by 8 bytes): obj_type @-8 (i8), | |
| // gc_flags @-7 (i8), _reserved @-6 (i16). | |
| let gtype_ptr = blk.gep(I8, &obj_ptr, &[(I64, "-8")]); | |
| let gtype = blk.load(I8, >ype_ptr); | |
| let gtype_ok = blk.icmp_eq(I8, >ype, GC_TYPE_OBJECT); | |
| let gflags_ptr = blk.gep(I8, &obj_ptr, &[(I64, "-7")]); | |
| let gflags = blk.load(I8, &gflags_ptr); | |
| let fwd = blk.and(I8, &gflags, GC_FLAG_FORWARDED_I8); | |
| let not_fwd = blk.icmp_eq(I8, &fwd, "0"); | |
| let res_ptr = blk.gep(I8, &obj_ptr, &[(I64, "-6")]); | |
| let reserved = blk.load(I16, &res_ptr); | |
| // ObjectHeader: object_type @0 (i32)==REGULAR, class_id @4 (i32), | |
| // field_count @12 (i32), keys_array @16 (i64). | |
| let object_type = blk.load(I32, &obj_ptr); | |
| let ot_ok = blk.icmp_eq(I32, &object_type, OBJECT_TYPE_REGULAR); | |
| let cid_ptr = blk.gep(I8, &obj_ptr, &[(I64, "4")]); | |
| let class_id = blk.load(I32, &cid_ptr); | |
| let cid_ok = blk.icmp_eq(I32, &class_id, expected_class_id); | |
| let fc_ptr = blk.gep(I8, &obj_ptr, &[(I64, "12")]); | |
| let field_count = blk.load(I32, &fc_ptr); | |
| let fc_ok = blk.icmp_ugt(I32, &field_count, &field_index_str); | |
| let ka_ptr = blk.gep(I8, &obj_ptr, &[(I64, "16")]); | |
| let keys_array = blk.load(I64, &ka_ptr); | |
| let ka_ok = blk.icmp_eq(I64, &keys_array, expected_keys); | |
| // Process-global gate: descriptors / typed-feedback in use disable the | |
| // inline path (the guard then has observable behavior to preserve). | |
| let flag = blk.load(I8, "@PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED"); | |
| let flag_ok = blk.icmp_eq(I8, &flag, "0"); | |
| let mut acc = blk.and(I1, >ype_ok, ¬_fwd); | |
| acc = blk.and(I1, &acc, &ot_ok); | |
| acc = blk.and(I1, &acc, &cid_ok); | |
| acc = blk.and(I1, &acc, &fc_ok); | |
| acc = blk.and(I1, &acc, &ka_ok); | |
| acc = blk.and(I1, &acc, &flag_ok); | |
| if require_raw_f64 { | |
| // The slot is read/written as a raw double, so the per-object typed | |
| // layout must be intact (no downgrade to a NaN-boxed value). | |
| let intact = blk.and(I16, &reserved, TYPED_LAYOUT_INTACT_BIT); | |
| let intact_ok = blk.icmp_ne(I16, &intact, "0"); | |
| acc = blk.and(I1, &acc, &intact_ok); | |
| // Gate the dereference: a basic block has no short-circuit, so the field | |
| // loads below must only run once we know the receiver is a real heap object | |
| // (POINTER_TAG and above the handle band). Otherwise fall to the guard call, | |
| // which classifies non-pointer / handle receivers safely. | |
| { | |
| let blk = ctx.block(); | |
| let flag = blk.load(I8, "`@PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED`"); | |
| let flag_ok = blk.icmp_eq(I8, &flag, "0"); | |
| let tag = blk.lshr(I64, obj_bits, "48"); | |
| let is_ptr = blk.icmp_eq(I64, &tag, POINTER_TAG_HI16); | |
| let above_band = blk.icmp_ugt(I64, obj_handle, HANDLE_BAND_TOP); | |
| let ptr_safe = blk.and(I1, &is_ptr, &above_band); | |
| let can_inline_deref = blk.and(I1, &ptr_safe, &flag_ok); | |
| blk.cond_br(&can_inline_deref, &deref_label, &guardcall_label); | |
| } | |
| ctx.current_block = deref_idx; | |
| { | |
| let blk = ctx.block(); | |
| let obj_ptr = blk.inttoptr(I64, obj_handle); | |
| // GcHeader (precedes the object by 8 bytes): obj_type `@-8` (i8), | |
| // gc_flags `@-7` (i8), _reserved `@-6` (i16). | |
| let gtype_ptr = blk.gep(I8, &obj_ptr, &[(I64, "-8")]); | |
| let gtype = blk.load(I8, >ype_ptr); | |
| let gtype_ok = blk.icmp_eq(I8, >ype, GC_TYPE_OBJECT); | |
| let gflags_ptr = blk.gep(I8, &obj_ptr, &[(I64, "-7")]); | |
| let gflags = blk.load(I8, &gflags_ptr); | |
| let fwd = blk.and(I8, &gflags, GC_FLAG_FORWARDED_I8); | |
| let not_fwd = blk.icmp_eq(I8, &fwd, "0"); | |
| let res_ptr = blk.gep(I8, &obj_ptr, &[(I64, "-6")]); | |
| let reserved = blk.load(I16, &res_ptr); | |
| // ObjectHeader: object_type `@0` (i32)==REGULAR, class_id `@4` (i32), | |
| // field_count `@12` (i32), keys_array `@16` (i64). | |
| let object_type = blk.load(I32, &obj_ptr); | |
| let ot_ok = blk.icmp_eq(I32, &object_type, OBJECT_TYPE_REGULAR); | |
| let cid_ptr = blk.gep(I8, &obj_ptr, &[(I64, "4")]); | |
| let class_id = blk.load(I32, &cid_ptr); | |
| let cid_ok = blk.icmp_eq(I32, &class_id, expected_class_id); | |
| let fc_ptr = blk.gep(I8, &obj_ptr, &[(I64, "12")]); | |
| let field_count = blk.load(I32, &fc_ptr); | |
| let fc_ok = blk.icmp_ugt(I32, &field_count, &field_index_str); | |
| let ka_ptr = blk.gep(I8, &obj_ptr, &[(I64, "16")]); | |
| let keys_array = blk.load(I64, &ka_ptr); | |
| let ka_ok = blk.icmp_eq(I64, &keys_array, expected_keys); | |
| let mut acc = blk.and(I1, >ype_ok, ¬_fwd); | |
| acc = blk.and(I1, &acc, &ot_ok); | |
| acc = blk.and(I1, &acc, &cid_ok); | |
| acc = blk.and(I1, &acc, &fc_ok); | |
| acc = blk.and(I1, &acc, &ka_ok); | |
| if require_raw_f64 { | |
| // The slot is read/written as a raw double, so the per-object typed | |
| // layout must be intact (no downgrade to a NaN-boxed value). | |
| let intact = blk.and(I16, &reserved, TYPED_LAYOUT_INTACT_BIT); | |
| let intact_ok = blk.icmp_ne(I16, &intact, "0"); | |
| acc = blk.and(I1, &acc, &intact_ok); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/perry-codegen/src/expr/class_field_inline_guard.rs` around lines 75 -
141, The runtime flags PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED and
PERRY_VERIFY_TYPED_INTACT are being checked too late in the execution flow,
after the code has already performed unsafe memory loads from obj_handle
offsets. Move the flag loading and checking to occur before the conditional
branch that routes to deref_label, so that when either flag indicates the inline
path should be disabled, the code branches to guardcall_label instead of
executing the crash-prone inline dereferences. Incorporate both the
PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED check (currently done with the I8 load
and icmp_eq call near the end of the deref block) and a corresponding check for
PERRY_VERIFY_TYPED_INTACT into a combined gate condition that is evaluated
before the cond_br call, replacing the current late-stage flag validation that
occurs after loading from obj_ptr.
| // #5093: sticky runtime flag (i8, 0 = enabled) gating the codegen-inlined | ||
| // class-field shape-guard fast path. The inline guard loads this directly | ||
| // and falls back to the full `js_typed_feedback_class_field_*_guard` call | ||
| // when it is non-zero (descriptors / typed-feedback in use). Defined in | ||
| // perry-runtime as `PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED`. | ||
| module.add_external_global("PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED", I8); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Inspect the generated-code load of PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED.
#
# Expected: the inline precheck either emits an atomic/volatile load, calls a runtime helper
# such as class_field_inline_guard_enabled(), or has an explicit single-threaded guarantee
# around all reads/writes of this symbol.
rg -n -C4 'PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED|class_field_inline_guard_enabled|disable_class_field_inline_guard|atomic|volatile' crates/perry-codegen/src crates/perry-runtime/src
# Print the inline precheck module if present so the load instruction can be reviewed in context.
fd -i '^class_field_inline_guard\.rs$' crates/perry-codegen/src --exec sed -n '1,260p' {}Repository: PerryTS/perry
Length of output: 50372
🏁 Script executed:
# Find exact runtime definition of PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED
rg -n 'PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED' crates/perry-runtime/src --type rust -A 2 -B 2
# Search for the codegen file that emits the load
fd -i 'class.*field.*guard' crates/perry-codegen/src --type f
# Search for where this is being loaded in codegen
rg -n 'emit.*class.*field\|class_field.*load' crates/perry-codegen/src --type rust -A 3 -B 1Repository: PerryTS/perry
Length of output: 1976
🏁 Script executed:
# Read the codegen file that emits the class field inline guard load
cat -n crates/perry-codegen/src/expr/class_field_inline_guard.rs | head -150Repository: PerryTS/perry
Length of output: 8474
Verify the generated guard reads this atomic flag atomically.
The runtime defines PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED as AtomicU8 and documents that the codegen should emit a "relaxed load". However, codegen declares it as a plain I8 global and emits a standard (non-atomic) load at line 126 in class_field_inline_guard.rs. While the flag is monotonic (set-only), the declaration mismatch creates an implicit contract violation: the codegen should declare this symbol with atomic intent to match the runtime's AtomicU8.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/perry-codegen/src/runtime_decls/objects.rs` around lines 26 - 31, The
external global declaration for PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED uses
plain I8 type, but the runtime defines this symbol as AtomicU8, creating a
contract mismatch. Change the add_external_global call to declare this symbol
with atomic semantics to match the runtime's definition, ensuring the codegen
emits a relaxed atomic load instead of a standard load when reading this flag in
the class-field inline guard path. Update both the declaration in objects.rs and
the corresponding load operation in class_field_inline_guard.rs to use atomic
load semantics that respect the atomic nature of the runtime's AtomicU8
definition.
152ae93 to
44f9666
Compare
Monomorphic this.field reads/writes on a known class instance previously routed every access through a cross-crate js_typed_feedback_class_field_* _guard call before touching the raw slot. #5093 measured the call itself as the dominant per-access cost. Emit the cheap part of the guard contract as inline IR (pointer-tag gate, gc-type/forwarded/object-type/class_id/keys/ field_count checks, plus a per-object typed-layout intact bit for raw-f64 fields, frozen + plain-number checks for raw sets) and branch straight to the fast slot load/store on a hit, skipping the call. On any miss it falls through to the unchanged guard-call path, so the change is purely additive. Runtime: add a single per-object GC_OBJ_TYPED_LAYOUT_INTACT bit (free bit of GcHeader._reserved), set when an object's canonical typed descriptor is installed and cleared on every downgrade/removal path. Because raw-f64 downgrade is all-or-nothing (the whole descriptor is removed), this one bit stands in for the thread-local raw-f64 layout probe. Add the exported PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED flag (flips on when descriptors / typed-feedback come into use; PERRY_DISABLE_CLASS_FIELD_INLINE=1 escape hatch) and a PERRY_VERIFY_TYPED_INTACT=1 self-check. NOTE: incremental win only (~10% on 09_method_calls; skips the call on the hot path). It does NOT hit #5093's ~60x target: LLVM cannot collapse the loop because the cold fallback call remains a GC safepoint that forces the shadow-rooted receiver to reload every iteration, defeating LICM. The full collapse needs a loop-versioning/deopt-exit pass (tracked in #5093). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
44f9666 to
eb592b0
Compare
|
Addressed CodeRabbit review:
Also fixed the failing |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
crates/perry-runtime/src/typed_feedback/guards.rs (1)
348-355: ⚡ Quick winAvoid duplicating truthy-env parsing logic across modules.
At Line 348, the
PERRY_VERIFY_TYPED_INTACTparser duplicatesgc/mod.rs::env_flag_enabledsemantics. Consider sharing one parser to prevent future drift between “disable inline” and “enable verifier” control paths.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/perry-runtime/src/typed_feedback/guards.rs` around lines 348 - 355, The environment variable parsing logic for PERRY_VERIFY_TYPED_INTACT at lines 348-355 duplicates the truthy-value matching semantics already implemented in the env_flag_enabled function from gc/mod.rs. Replace the std::env::var call and the manual matches! logic with a call to the shared env_flag_enabled function, passing "PERRY_VERIFY_TYPED_INTACT" as the argument. This eliminates code duplication and ensures both the "disable inline" and "enable verifier" control paths use consistent environment flag parsing.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@crates/perry-runtime/src/typed_feedback/guards.rs`:
- Around line 348-355: The environment variable parsing logic for
PERRY_VERIFY_TYPED_INTACT at lines 348-355 duplicates the truthy-value matching
semantics already implemented in the env_flag_enabled function from gc/mod.rs.
Replace the std::env::var call and the manual matches! logic with a call to the
shared env_flag_enabled function, passing "PERRY_VERIFY_TYPED_INTACT" as the
argument. This eliminates code duplication and ensures both the "disable inline"
and "enable verifier" control paths use consistent environment flag parsing.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: fb02f17e-b140-4abc-b594-ee51ed0c1dee
📒 Files selected for processing (11)
crates/perry-codegen/src/expr/class_field_inline_guard.rscrates/perry-codegen/src/expr/mod.rscrates/perry-codegen/src/expr/property_get.rscrates/perry-codegen/src/expr/property_set.rscrates/perry-codegen/src/runtime_decls/objects.rscrates/perry-runtime/src/gc/layout.rscrates/perry-runtime/src/gc/mod.rscrates/perry-runtime/src/object/mod.rscrates/perry-runtime/src/typed_feedback.rscrates/perry-runtime/src/typed_feedback/guards.rstest-files/test_class_field_raw_f64_downgrade.ts
✅ Files skipped from review due to trivial changes (1)
- crates/perry-codegen/src/expr/mod.rs
🚧 Files skipped from review as they are similar to previous changes (7)
- test-files/test_class_field_raw_f64_downgrade.ts
- crates/perry-runtime/src/typed_feedback.rs
- crates/perry-codegen/src/runtime_decls/objects.rs
- crates/perry-codegen/src/expr/class_field_inline_guard.rs
- crates/perry-codegen/src/expr/property_get.rs
- crates/perry-codegen/src/expr/property_set.rs
- crates/perry-runtime/src/object/mod.rs
Summary
Inlines the class-field shape guard so a monomorphic
this.fieldread/write on aknown class instance branches straight to the raw slot on a hit, skipping the
cross-crate
js_typed_feedback_class_field_{get,set}_guardcall that #5093measured as the dominant per-access cost on
09_method_calls. On any miss it fallsthrough to the unchanged guard-call path, so the change is purely additive — it
can never take a fast path the guard would have rejected.
Addresses #5093 (incrementally — see Scope below).
How it works
gc/layout.rs): a single free bit ofGcHeader._reserved(GC_OBJ_TYPED_LAYOUT_INTACT), set when an object's canonicaltyped descriptor is installed and cleared on every downgrade/removal path
(
layout_set_typed_unknown,layout_clear_for_ptr,layout_mark_unknown,layout_init_pointer_free, rebuild, transfer). Because raw-f64 downgrade isall-or-nothing (the whole descriptor is removed), this one bit replaces the
thread-local
TYPED_LAYOUTShashmap probe — so the inline guard concludes "slot Kis raw-f64" from a single bit-test plus a class_id/keys match. The bit rides along
with
_reservedacross copying/evacuating GC.expr/class_field_inline_guard.rs, wired intoproperty_get/property_set): a pointer-tag/handle-band gate guarding thedereference, then gc-type / forwarded / object-type / class_id / keys_array /
field_count checks, the process-global enable flag, the intact bit for raw-f64
fields, and — for raw sets — not-frozen + a plain-finite-number value check. GET
inlines raw and non-raw fields; SET inlines raw-f64 fields only.
PERRY_CLASS_FIELD_INLINE_GUARD_DISABLED(exported, flips on themoment any accessor/property descriptor comes into use or typed-feedback tracing is
enabled — the inline path then falls back to the full guard, preserving
descriptor-aware dispatch + feedback recording).
PERRY_DISABLE_CLASS_FIELD_INLINE=1is an explicit escape hatch;
PERRY_VERIFY_TYPED_INTACT=1aborts if the intact bitever disagrees with the side table.
Scope — incremental win, NOT #5093's ~60× target
This skips the guard call on the hot path (~10% on
09_method_calls) but does notcollapse the loop to
load/fadd/store. Root cause, found by disassembling theoptimized loop: the cold guard-miss fallback call remains a GC safepoint, which
forces the shadow-stack-rooted receiver to reload every iteration, so the object
pointer is never loop-invariant and LLVM's LICM can't hoist any shape load (even
!invariant.load-taggedclass_id/object_type). Collapsing the loop requires aloop-versioning / deopt-exit pass that makes the hot loop call-free — a larger,
maintainer-driven change. Full analysis posted on #5093. This PR is the additive
foundation (per-object intact bit + inline fast path).
Testing
test-files/test_class_field_raw_f64_downgrade.ts(new): the memory-corruptiontrap perf(method dispatch): method_calls ~290× Node — remaining cost is per-field-access shape-guard calls (plan + standby) #5093 warns about — storing a non-number into a
number-typed slot through ananyalias must clear the intact bit so a later read returns the boxed value, notthe pointer bits read as a raw
double. Passes byte-for-byte vsnode --experimental-strip-types, including underPERRY_VERIFY_TYPED_INTACT=1.cargo test -p perry-runtime: gc/layout + typed_feedback/guards pass (1041 ok).The lone failure (
builtin_prototype_methods_reject_dynamic_new) is a pre-existingparallel test-isolation flake —
origin/mainfails a different test(
stream_constructors_expose_static_method_values) with the identical 1-failurepattern; the test passes in isolation and within its own module.
object_methodsstrict-mode freeze-writethrow;
perfhooksTypeErrormessage wording) reproduce identically with theinline path disabled, i.e. they're pre-existing gaps in code paths this change
doesn't touch.
🤖 Generated with Claude Code
Summary by CodeRabbit
this.fieldreads/writes, with safe fallback to the existing guarded path.