From fa9655cb50bbd09a8ba19c4b6fa64555ca7b1770 Mon Sep 17 00:00:00 2001 From: Abmcar Date: Tue, 12 May 2026 16:12:03 +0800 Subject: [PATCH 1/6] perf(compiler): thread ValueRange through OR/XOR/SHR_U and skip OR-fold in U64 compares MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Extend the monotone-safe subset of G4 ValueRange propagation: - handleBitwiseOp Phase 1 u64-const fast path now carries OtherOp.getRange() (limbs[1..3] pass through as identical MInstruction pointers, so the value-range claim is preserved bit-identically). - handleBitwiseOp Phase 5 general path now carries max(LHS.range, RHS.range) — OR/XOR are monotone joins on bit width. AND is excluded (its monotone direction is min, already handled by its own earlier returns). - handleShift now carries ValueOp.getRange() — unsigned right shift cannot widen. SHL widens; SHR_S sign-fills; both keep U256. - handleCompareEqU64 / LtRhsU64 / GtRhsU64 skip the 3-upper-limb OR-fold and the boolean select when the wide operand carries ValueRange::U64. Every existing U64 producer materializes literal MIR zero in limbs[1..3], so the new fast path's runtime safety is structural. Out of scope per docs/research/directions/u256-strength-reduction/ analysis/2026-05-12-verified-opportunities.md §1.G4 verdict: SUB (wraps), SHL (widens), SAR (sign-fills), ADDMOD/MULMOD (need richer model), EXP, signed compare. Validation (worktree g4-safe-valuerange): - tools/format.sh check: clean - evmone-unittests multipass run list: 223/223 pass - evmone-statetest -k fork_Cancun multipass: 2723/2723 pass Spec + review artifacts: docs/changes/2026-05-12-g4-safe-valuerange/. Co-Authored-By: Claude Opus 4.7 (1M context) --- .../2026-05-12-g4-safe-valuerange/README.md | 157 +++++++++++ .../reviews/impl-round-1-codex.md | 24 ++ .../reviews/impl-round-1-opus.md | 93 +++++++ .../reviews/round-1-codex.md | 22 ++ .../reviews/round-1-opus.md | 244 ++++++++++++++++++ .../evm_frontend/evm_mir_compiler.cpp | 88 ++++--- src/compiler/evm_frontend/evm_mir_compiler.h | 23 +- 7 files changed, 617 insertions(+), 34 deletions(-) create mode 100644 docs/changes/2026-05-12-g4-safe-valuerange/README.md create mode 100644 docs/changes/2026-05-12-g4-safe-valuerange/reviews/impl-round-1-codex.md create mode 100644 docs/changes/2026-05-12-g4-safe-valuerange/reviews/impl-round-1-opus.md create mode 100644 docs/changes/2026-05-12-g4-safe-valuerange/reviews/round-1-codex.md create mode 100644 docs/changes/2026-05-12-g4-safe-valuerange/reviews/round-1-opus.md diff --git a/docs/changes/2026-05-12-g4-safe-valuerange/README.md b/docs/changes/2026-05-12-g4-safe-valuerange/README.md new file mode 100644 index 000000000..cbb6905ec --- /dev/null +++ b/docs/changes/2026-05-12-g4-safe-valuerange/README.md @@ -0,0 +1,157 @@ +# Change: G4-safe ValueRange propagation for OR/XOR/SHR_U + range-driven compare + +- **Status**: Proposed +- **Date**: 2026-05-12 +- **Tier**: Light + +## Overview + +Extend `EVMMirBuilder::ValueRange` propagation to three monotone-safe sites: + +1. `handleBitwiseOp` / `handleBitwiseOp`: result range = `max(LHS.range, RHS.range)` (both Phase-1 u64-const fast path and Phase-5 general path). +2. `handleShift` (unsigned right shift): result range = `ValueOp.range` (right shift cannot widen). +3. `handleCompareEqU64` / `handleCompareLtRhsU64` / `handleCompareGtRhsU64`: when the wide operand carries `ValueRange::U64`, skip the upper-limbs OR-fold and emit a direct single-limb compare. + +Explicitly **out of scope** for this PR per `2026-05-12-verified-opportunities.md` §1.G4 verdict: SUB (wraps to full U256), SHL (widens), SAR (sign-fills), ADDMOD/MULMOD (need richer range model), EXP, signed compare. + +This work is **orthogonal** to the `perf/value-range-cfg-join` branch, which performs CFG-level join/meet range analysis and plumbs analyzer-derived ranges into block-entry operands. G4-safe adds per-opcode producer threading; cfg-join feeds those producers across control flow. The two compose without conflict (G4-safe touches `handleBitwiseOp`, `handleShift`, `handleCompare*RhsU64`; cfg-join touches `createStackEntryOperand` and adds an `EVMValueRange` external type). + +## Motivation + +`ValueRange` was introduced in PR #458 (`fca0b1a`) as a wholesale producer-side annotation: ADD/MUL/DIV/MOD/AND/BYTE/u64-compare populate it; the only non-arithmetic consumer is AND. Every other bitwise/shift/compare operator **drops the range to U256** even when the inputs are demonstrably narrower: + +- `(narrow_mul a, b) | c` → range falls back to U256, blocking any downstream narrow-path lowering. +- `(narrow_mul a, b) >> k` → same loss. +- `(narrow_mul a, b) < C64` → constant fast path triggers, but the helper still emits a 4-limb OR-fold check on the wide operand's upper limbs that we *know* are zero. + +The 2026-05-12 verification doc isolated this as the smallest, lowest-risk subset of the G4 work: monotone propagation only, no wraparound or sign-fill semantics. The win is small per opcode but **load-bearing** for any future narrow consumer (compare-via-range, narrow lowering paths for additional opcodes). + +## Impact + +### Modules touched + +- `src/compiler/evm_frontend/evm_mir_compiler.h` — three return sites in `handleBitwiseOp` (Phase 1 OR/XOR fast path, Phase 5 general path) and one in `handleShift`. Optionally a small `static maxRange` helper near the `ValueRange` enum. +- `src/compiler/evm_frontend/evm_mir_compiler.cpp` — three helper bodies: `handleCompareEqU64` (around line 2978), `handleCompareLtRhsU64` (around line 3011), `handleCompareGtRhsU64` (around line 3046). + +### Contracts preserved + +- EVM observable semantics: unchanged. All optimizations rely on `getRange()` annotations established at producer sites; values whose runtime bits do not match their annotation would be a pre-existing invariant violation. +- `protectUnsafeValue` calls remain on every generated I64 result that flows into stack-residence — the consumer side only changes WHICH limbs are computed, not their barrier annotation. +- Result encoding for narrow-path compares unchanged: `Result[0] = cmp`, `Result[1..3] = Zero`, `Range = U64`. + +### Risks + +- **Low**: monotone non-widening transformations only. OR/XOR of two operands with range `U128` yields a U128 value (limbs[2..3] of both operands are 0 → OR/XOR of zeros is zero). Logical right shift of a U64 value yields a U64 value (limbs[1..3] of input are 0 → shifting out is still zero in limbs[1..3]; shift cannot inject bits from above-limb). +- **Compare consumer change**: requires the producer-side `ValueRange::U64` invariant to hold at runtime. PR #458 establishes this invariant from ADD/MUL/DIV/MOD/AND; this PR extends it to OR/XOR/SHR_U. + +### Invariant chain (correctness of the compare-side fast path) + +Today, every `ValueRange::U64` producer materializes **literal MIR zero constants** in limbs[1..3] — not just modeled-zero. Verified producer sites: + +- `evm_mir_compiler.h:565` (general compare result via `handleCompareImpl` → `handleCompareEQZ` / `handleCompareEQ` / `handleCompareGT_LT`; each writes `Result[1..3] = Zero` literally) +- `evm_mir_compiler.h:629` (AND u64-const fast path: `Result[I] = Zero` for `I ≥ 1`) +- `evm_mir_compiler.cpp:2024` (DIV u64÷u64: `{DivResult, Zero, Zero, Zero}`) +- `evm_mir_compiler.cpp:2194` (MOD u64÷u64: `{ModResult, Zero, Zero, Zero}`) +- `evm_mir_compiler.cpp:3002-3006` / `3037-3041` / `3073-3077` (the three existing `handleCompare*U64` helpers — `Result[I] = Zero`) +- `evm_mir_compiler.cpp:3689-3693` (BYTE: `ResultComponents[I] = Zero`) + +The new compare-side fast path **does not weaken** this invariant: it only reads the `Range` tag and *elides reading* `LHS[1..3]`. It never creates a U64-tagged operand with non-literal-zero upper limbs. + +**SHR_U caveat (value-level, not MIR-level)**: `handleLogicalRightShift` emits `Select(IsLargeShift, Zero, )` for upper limbs when the input has `ValueRange::U64`. The *runtime value* is zero (correct), but the *MIR* is not necessarily a literal `Zero` constant. Future consumers must gate on `Range` rather than MIR-pattern-match upper limbs. The compare-side fast path proposed here is unaffected because it elides reading upper limbs entirely when `Range == U64`. + +### Compose with `perf/value-range-cfg-join` + +Once `perf/value-range-cfg-join` lands, `setRange(...)` will retrofit `Range = U64` onto stack-popped operands whose backing variables may contain *anything* at the MIR level (the analyzer guarantees value-level zero, not MIR-level zero). This change extends the **trust chain** for compare-side correctness from: + +> compare correctness ⇐ Range == U64 ⇐ producer materializes literal-zero MIR + +to: + +> compare correctness ⇐ Range == U64 ⇐ (today) literal-zero MIR OR (with cfg-join) analyzer soundness + +The same trust model already applies to the AND `NarrowRange` path at `evm_mir_compiler.h:633-655`. The combination is consistent with current practice; the analyzer's correctness tests should exercise EQ/LT/GT with analyzer-narrowed U64 ranges once cfg-join lands. + +## Implementation + +### Step 1 — Helper: derive merged range for binary ops + +Add an inline helper in `evm_mir_compiler.h` (near the `ValueRange` enum), or inline at use site: + +```cpp +static ValueRange maxRange(const Operand &A, const Operand &B) { + return A.getRange() > B.getRange() ? A.getRange() : B.getRange(); +} +``` + +`ValueRange` is `uint8_t` with `U64=0 < U128=1 < U256=2`, so `>` gives the wider tier. + +### Step 2 — OR/XOR: thread range through two return sites + +In `handleBitwiseOp` (`evm_mir_compiler.h:570-693`): + +- Phase 1 u64-const fast path return (currently line 678): + ```cpp + return Operand(Result, EVMType::UINT256, OtherOp.getRange()); + ``` + Rationale: the u64-const side has range U64, so `max(U64, OtherOp.range) = OtherOp.range`. Limb[0] is recomputed; limbs[1..3] pass through unchanged from `OtherOp`. Their value-range claim is preserved. + +- Phase 5 general path return (currently line 692): + ```cpp + return Operand(Result, EVMType::UINT256, maxRange(LHSOp, RHSOp)); + ``` + +### Step 3 — Unsigned SHR: thread range from ValueOp + +In `handleShift` (`evm_mir_compiler.h:704-756`): + +Change the single shared `return Operand(Result, EVMType::UINT256);` (line 755) to be operator-aware: + +```cpp +if constexpr (Operator == BinaryOperator::BO_SHR_U) { + return Operand(Result, EVMType::UINT256, ValueOp.getRange()); +} +return Operand(Result, EVMType::UINT256); +``` + +Rationale: for SHL the range *can* widen and we keep U256 default; for SAR sign-fill can populate upper limbs and we keep U256 default. For SHR_U, an N-bit value shifted right yields an at-most-N-bit value, so the range is preserved. + +### Step 4 — Compare consumer: skip upper-limbs OR-fold when range is U64 + +Modify three helpers in `evm_mir_compiler.cpp`: + +- `handleCompareEqU64`: when `FullOp.getRange() == ValueRange::U64`, the 3-upper-limb OR-fold (`Upper = LHS[1] | LHS[2] | LHS[3]`) is provably zero. Skip it; emit `FinalResult = ICMP_EQ(LHS[0], U64Val)` directly. Save 2 OR + 1 CMP + 1 AND per call. +- `handleCompareLtRhsU64`: when `LHSOp.getRange() == ValueRange::U64`, `HasUpper` is provably false. Skip the 3-upper-limb OR-fold and the `SelectInstruction`; emit `FinalResult = ICMP_ULT(LHS[0], RhsVal)` directly. Save 2 OR + 1 CMP + 1 SELECT. +- `handleCompareGtRhsU64`: when `LHSOp.getRange() == ValueRange::U64`, mirror of LT: emit `FinalResult = ICMP_UGT(LHS[0], RhsVal)` directly. + +Preserve the `protectUnsafeValue` wrap and the `Result[1..3] = Zero` tail in all three. Return range remains `ValueRange::U64` (comparison results are 0 or 1). + +### Step 5 — Build + verify + +- Configure CMake with the worktree-bootstrap default flags (includes `-DZEN_ENABLE_JIT_PRECOMPILE_FALLBACK=ON` per memory `feedback_jit_fallback_required_flag.md`; otherwise peephole O(n²) `setInsertBlock` hangs). +- Run `tools/format.sh check`. +- Local tests per `.claude/rules/dtvm-local-test.md`: + - `evmone-unittests` multipass with `EVMOneMultipassUnitTestsRunList.txt` + - `evmone-statetest -k fork_Cancun` multipass with `enable_gas_metering=true` +- For perf measurement: `/bench-compare` on 27-bench vs upstream/main baseline (per `.claude/rules/dtvm-perf-worktree-lab.md`). + +### Step 6 — Document the result back into the analysis doc + +Update `docs/research/directions/u256-strength-reduction/analysis/2026-05-12-verified-opportunities.md`: + +- §0 landed-status table: add a row for "G4-safe ValueRange OR/XOR/SHR_U/compare" with commit reference. +- §2 Tier 1 entry #1 and §3 Sprint 1: annotate as `Implemented` with link to this change doc and bench numbers. + +## Checklist + +- [ ] Step 1 helper or inline merge in place +- [ ] Step 2: OR/XOR Phase 1 + Phase 5 returns thread range +- [ ] Step 3: SHR_U return threads `ValueOp.getRange()` +- [ ] Step 4: three compare helpers gain U64-range early branch +- [ ] `tools/format.sh check` passes +- [ ] Build with CI-faithful flags green +- [ ] `evmone-unittests` (multipass) green +- [ ] `evmone-statetest -k fork_Cancun` (multipass) green +- [ ] `/bench-compare` shows no regression vs upstream/main on 27-bench +- [ ] `2026-05-12-verified-opportunities.md` updated with shipped status +- [ ] Module specs in `docs/modules/` updated (if affected — likely not for this change) +- [ ] No new producer site emits `ValueRange::U64` without setting `Result[1..3] = Zero` at the MIR level (self-audit of the diff before commit) diff --git a/docs/changes/2026-05-12-g4-safe-valuerange/reviews/impl-round-1-codex.md b/docs/changes/2026-05-12-g4-safe-valuerange/reviews/impl-round-1-codex.md new file mode 100644 index 000000000..8992dafb9 --- /dev/null +++ b/docs/changes/2026-05-12-g4-safe-valuerange/reviews/impl-round-1-codex.md @@ -0,0 +1,24 @@ +# Round 1 — Codex Review of G4-safe Implementation + +## Verdict +PASS + +## Verified +- Check 1: ✅ `handleBitwiseOp` returns `Operand(Result, EVMType::UINT256, OtherOp.getRange())` inside the `BO_OR || BO_XOR` `if constexpr`; function return type is `Operand`. `src/compiler/evm_frontend/evm_mir_compiler.h:575`, `src/compiler/evm_frontend/evm_mir_compiler.h:664-686` +- Check 2: ✅ Phase 5 excludes AND via `if constexpr (BO_OR || BO_XOR)` and uses `Operand::maxRange`; AND still returns earlier for u64-const and U128/NarrowRange, while unconstrained U256 AND falls through to default `Operand(Result, EVMType::UINT256)`. `src/compiler/evm_frontend/evm_mir_compiler.h:615-661`, `src/compiler/evm_frontend/evm_mir_compiler.h:690-707` +- Check 3: ✅ `maxRange` is a static helper on `Operand`; call site correctly uses `Operand::maxRange`. `src/compiler/evm_frontend/evm_mir_compiler.h:266-269`, `src/compiler/evm_frontend/evm_mir_compiler.h:703-705` +- Check 4: ✅ only `BO_SHR_U` returns `ValueOp.getRange()`; SHL and SHR_S share the conservative default return. `src/compiler/evm_frontend/evm_mir_compiler.h:762-776` +- Check 5: ✅ `handleCompareEqU64` uses `FinalResult = LowEq` for `Range == U64`; else branch preserves upper-limb OR-fold, `ICMP_EQ` vs Zero, and AND with `LowEq`; both feed one protected result and zero-fill tail. `src/compiler/evm_frontend/evm_mir_compiler.cpp:2990-3014` +- Check 6: ✅ LT and GT use the same shape: `LowLt`/`LowGt` fast path when range is U64; else branch preserves HasUpper select semantics. `src/compiler/evm_frontend/evm_mir_compiler.cpp:3032-3056`, `src/compiler/evm_frontend/evm_mir_compiler.cpp:3074-3100` +- Check 7: ✅ `git diff --name-only HEAD -- src/` lists only `evm_mir_compiler.cpp` and `evm_mir_compiler.h`; no other `src/` file is modified. +- Check 8: ✅ no fast-path branch returns early; all three compare helpers converge to `Result[0] = protectUnsafeValue(...)` and `Result[1..3] = Zero`. `src/compiler/evm_frontend/evm_mir_compiler.cpp:3009-3014`, `src/compiler/evm_frontend/evm_mir_compiler.cpp:3051-3056`, `src/compiler/evm_frontend/evm_mir_compiler.cpp:3095-3100` + +## Refuted / concerns +- Check 9: ⚠️ `tools/format.sh check` was run and failed with exit 123 on unrelated existing files, e.g. `src/singlepass/x64/assembler.h` and `src/host/wasi/sandboxed-system-primitives/src/posix.c`. The two touched files pass `clang-format --dry-run -style=file -Werror`. +- No semantic scope creep found: SUB, SHL, SAR, ADDMOD/MULMOD, EXP, signed compare, and unrelated files are untouched by this diff. + +## Suggested commit subject +perf(evm): propagate safe ValueRange through OR/XOR and SHR_U + +## Trust-chain analysis +For check #10, the fast paths are intentionally unsound if `Range == U64` lies. EQ would be wrong when upper limbs are nonzero and the low limb equals the constant: it returns `LowEq` true instead of old `LowEq && UpperZero`. LT would be wrong when upper limbs are nonzero and `low < rhs`: it returns true even though the full U256 is greater than any u64. GT would be wrong when upper limbs are nonzero and `low <= rhs`: it returns false even though the full U256 is greater than any u64. This matches the spec trust chain: correctness depends on `ValueRange::U64` implying runtime-zero upper limbs. diff --git a/docs/changes/2026-05-12-g4-safe-valuerange/reviews/impl-round-1-opus.md b/docs/changes/2026-05-12-g4-safe-valuerange/reviews/impl-round-1-opus.md new file mode 100644 index 000000000..f4bdfd35b --- /dev/null +++ b/docs/changes/2026-05-12-g4-safe-valuerange/reviews/impl-round-1-opus.md @@ -0,0 +1,93 @@ +# Impl Review — Round 1 (Opus) + +**Subject**: G4-safe ValueRange propagation implementation +**Reviewer persona**: DTVM senior compiler reviewer (adversarial, contract-focused) +**Diff scope**: `src/compiler/evm_frontend/evm_mir_compiler.{h,cpp}` + +## Verdict: **PASS** + +Implementation faithfully realizes Steps 1–4 of the spec. No correctness regressions +identified, no unstated extra changes, format/build/tests clean. + +## Spec ↔ Impl mapping + +| Spec step | Site | Status | +|-----------|------|--------| +| 1. `maxRange` helper | `evm_mir_compiler.h:267-269` | OK — `static`, U64 B.getRange() ? A : B` gives the wider (more conservative) tier. Correct. +4. **Phase-1 OR/XOR limb pass-through soundness** — limbs[1..3] of `Result` come from `Other[1..3]` (`h:680-682`), i.e. the same `MInstruction *` pointers held by `OtherOp`. Returning `OtherOp.getRange()` is therefore strictly correct (no widening, no narrowing). Comment at `h:683-685` documents this. +5. **SHR_U gating on `ValueOp`** — `ValueOp` is the value being shifted (`h:718-719` signature `handleShift(Operand ShiftOp, Operand ValueOp)`); right-shifting an N-bit value yields at-most-N bits. Correct. +6. **Compare consumer invariants** — all three helpers: + - keep `Result[1..3] = Zero` initialization in BOTH branches (it's after the if/else, so unconditional); + - keep `Result[0] = protectUnsafeValue(FinalResult, MirI64Type)` wrap; + - keep `return Operand(Result, EVMType::UINT256, ValueRange::U64)`. + + `GtRhsU64` correctly only allocates `One` in the else branch — no dead use, no leak (instructions are arena-allocated regardless). + +## Style / nit issues + +- En-dash characters (`—`) in comments at `cpp:3034`, `cpp:3076`, `h:701`. ASCII-only is not codified in `.claude/rules/cpp-code-style.md`, and existing files contain similar Unicode. **Not blocking.** +- All comments in English (per `cpp-code-style.md`). No new doc-blocks added. LLVM naming preserved. Trailing newline preserved (file unchanged at tail). +- `tools/format.sh check` → exit 0 (no diff). + +## Build / test artifacts + +- `build/lib/libdtvmapi.so` present (confirmed `ls`). +- User-reported: evmone-unittests multipass 223/223, evmone-statetest `-k fork_Cancun` multipass 2723/2723. Consistent with diff scope (no semantic change for non-narrow producers; range-tag-only at most call sites). + +## Suggested commit subject + +``` +perf(compiler): thread ValueRange through OR/XOR/SHR_U and skip OR-fold in U64 compares +``` + +(89 chars; lowercase type/scope; imperative; under 120-char cap per `commitlint.config.js`.) + +Optional body: + +``` +Phase 1: extend monotone ValueRange propagation to OR/XOR (max of operand +ranges) and unsigned right shift (input range — N-bit input cannot widen +under SHR_U). + +Phase 2: in handleCompareEqU64 / handleCompareLtRhsU64 / handleCompareGtRhsU64, +when the wide operand carries ValueRange::U64, the upper-limb OR-fold zero-test +is provably redundant: every existing U64 producer materializes literal MIR +Zero in limbs[1..3]. Elide the OR-fold and the zero-AND / Select; emit the +single-limb compare directly. + +Out of scope: SUB/SHL/SAR (widening or sign-fill), ADDMOD/MULMOD, EXP, signed +compare. Composes with perf/value-range-cfg-join. +``` + +## Verified facts table + +| Claim | Source | +|-------|--------| +| `maxRange` helper added | `src/compiler/evm_frontend/evm_mir_compiler.h:266-269` | +| Phase-1 OR/XOR returns `OtherOp.getRange()` | `evm_mir_compiler.h:686` | +| Phase-1 guarded by OR/XOR `if constexpr` | `evm_mir_compiler.h:665-666` | +| Phase-5 OR/XOR returns `maxRange(LHSOp, RHSOp)` | `evm_mir_compiler.h:703-706` | +| Phase-5 AND falls through to default `Operand(..., UINT256)` | `evm_mir_compiler.h:707` | +| SHR_U threads `ValueOp.getRange()`; SHL/SAR keep default | `evm_mir_compiler.h:773-775` then `:776` | +| `handleCompareEqU64` early branch + tail preserved | `evm_mir_compiler.cpp:2990-3014` | +| `handleCompareLtRhsU64` early branch + tail preserved | `evm_mir_compiler.cpp:3032-3056` | +| `handleCompareGtRhsU64` `One` hoisted to else only | `evm_mir_compiler.cpp:3074-3093` | +| All three return `ValueRange::U64` | `cpp:3014`, `cpp:3056`, `cpp:3100` (tail unchanged) | +| `tools/format.sh check` exit 0 | command output | +| `build/lib/libdtvmapi.so` present | `ls` | diff --git a/docs/changes/2026-05-12-g4-safe-valuerange/reviews/round-1-codex.md b/docs/changes/2026-05-12-g4-safe-valuerange/reviews/round-1-codex.md new file mode 100644 index 000000000..0e0bd6941 --- /dev/null +++ b/docs/changes/2026-05-12-g4-safe-valuerange/reviews/round-1-codex.md @@ -0,0 +1,22 @@ +# Round 1 — Codex Review of G4-safe Spec + +## Verdict +REVISE + +## Verified +- Claim 1: ✅ confirmed. `ValueRange` is `enum class ... : uint8_t` with implicit enumerator order `U64`, `U128`, `U256`; current code also relies on ordering via `<=` and `std::min` in AND narrowing. Sources: `src/compiler/evm_frontend/evm_mir_compiler.h:127-131`, `src/compiler/evm_frontend/evm_mir_compiler.h:633-636`. +- Claim 2: ❌ refuted as written. OR/XOR Phase 1 does pass through upper limbs with `Result[I] = Other[I]`, but returns `Operand(Result, EVMType::UINT256)` without a range, so the result defaults to `U256`, not `OtherOp`'s range. Sources: `src/compiler/evm_frontend/evm_mir_compiler.h:659-679`, `src/compiler/evm_frontend/evm_mir_compiler.h:273`. +- Claim 3: ✅ confirmed. General bitwise return has no `ValueRange` argument and therefore defaults to `U256`. Sources: `src/compiler/evm_frontend/evm_mir_compiler.h:682-692`, `src/compiler/evm_frontend/evm_mir_compiler.h:273`. +- Claim 4: ✅ confirmed. `BO_SHR_U` dispatches to `handleLogicalRightShift(Value, ShiftAmount, IsLargeShift)`. Source: `src/compiler/evm_frontend/evm_mir_compiler.h:747-750`. +- Claim 5: ❌ refuted as phrased. The helpers emit a 3-upper-limb OR fold (`LHS[1] | LHS[2] | LHS[3]`), not a 4-limb fold. Sources: `src/compiler/evm_frontend/evm_mir_compiler.cpp:2990-2996`, `src/compiler/evm_frontend/evm_mir_compiler.cpp:3020-3026`, `src/compiler/evm_frontend/evm_mir_compiler.cpp:3056-3062`. +- Claim 5b: ✅ no unsafe U64 producer found in reviewed sites. U64 constants require limbs 1..3 zero; AND U64 paths write `Zero` to eliminated limbs; DIV/MOD U64 fast paths build `{result, Zero, Zero, Zero}`; BYTE writes `Zero` to limbs 1..3; compare results write `Zero` to limbs 1..3. ADD/MUL range fast paths are actually `U128`, not `U64`, and materialize limbs 2..3 as `Zero`. Sources: `src/compiler/evm_frontend/evm_mir_compiler.h:167-175`, `src/compiler/evm_frontend/evm_mir_compiler.h:621-629`, `src/compiler/evm_frontend/evm_mir_compiler.h:642-655`, `src/compiler/evm_frontend/evm_mir_compiler.cpp:2024-2025`, `src/compiler/evm_frontend/evm_mir_compiler.cpp:2194-2195`, `src/compiler/evm_frontend/evm_mir_compiler.cpp:3689-3695`, `src/compiler/evm_frontend/evm_mir_compiler.cpp:2769-2775`, `src/compiler/evm_frontend/evm_mir_compiler.cpp:2800-2806`, `src/compiler/evm_frontend/evm_mir_compiler.cpp:2873-2879`, `src/compiler/evm_frontend/evm_mir_compiler.h:381-392`, `src/compiler/evm_frontend/evm_mir_compiler.cpp:1883-1887`. +- Claim 6: ✅ confirmed by running the requested `git diff`. The diff only touched the enum/include/setRange/createStackEntryOperand plumbing, not `handleBitwiseOp`, `handleShift`, or compare-RHS-U64 helpers. Sources: `src/compiler/evm_frontend/evm_mir_compiler.h:7-13`, `src/compiler/evm_frontend/evm_mir_compiler.h:127-131`, `src/compiler/evm_frontend/evm_mir_compiler.h:258-259`, `src/compiler/evm_frontend/evm_mir_compiler.h:313`, `src/compiler/evm_frontend/evm_mir_compiler.cpp:1009-1014`. +- Claim 7: ✅ mostly confirmed. SUB can wrap because upper limbs use `SbbInstruction`; SHL can move low-limb bits into higher limbs; SAR can fill limbs with `AllOnes`; ADDMOD/MULMOD range depends on modulus and/or runtime helper; EXP repeatedly calls `handleMul` and returns full `U256`. Sources: `src/compiler/evm_frontend/evm_mir_compiler.h:461-475`, `src/compiler/evm_frontend/evm_mir_compiler.cpp:3191-3284`, `src/compiler/evm_frontend/evm_mir_compiler.cpp:3491-3508`, `src/compiler/evm_frontend/evm_mir_compiler.cpp:3593-3660`, `src/compiler/evm_frontend/evm_mir_compiler.cpp:2265-2277`, `src/compiler/evm_frontend/evm_mir_compiler.cpp:2454-2471`, `src/compiler/evm_frontend/evm_mir_compiler.cpp:2633-2640`, `src/compiler/evm_frontend/evm_mir_compiler.cpp:2739-2741`. + +## Unverified / hedged +- Claim 7 signed-compare exclusion: source confirms signed LT/GT uses a signed predicate only on the top limb and unsigned predicates on lower limbs, but this alone does not prove that any signed-compare optimization would break monotone narrowing. Source: `src/compiler/evm_frontend/evm_mir_compiler.cpp:2822-2854`. + +## Recommendations +- Fix Claim 2: say current OR/XOR fast path preserves upper limb instructions but does not preserve `ValueRange`. +- Fix Claim 5 wording from "4-limb OR-fold over upper limbs" to "3-upper-limb OR-fold". +- Keep the consumer shortcut only conditional on the established invariant: `ValueRange::U64` operands must have materialized zero limbs 1..3. diff --git a/docs/changes/2026-05-12-g4-safe-valuerange/reviews/round-1-opus.md b/docs/changes/2026-05-12-g4-safe-valuerange/reviews/round-1-opus.md new file mode 100644 index 000000000..0749dfbb6 --- /dev/null +++ b/docs/changes/2026-05-12-g4-safe-valuerange/reviews/round-1-opus.md @@ -0,0 +1,244 @@ +# Round 1 — Opus Review of G4-safe Spec + +Reviewer: Opus 4.7 +HEAD reviewed: `c644fbe` (worktree `/home/abmcar/DTVM/.worktrees/g4-safe-valuerange`) +Spec: `docs/changes/2026-05-12-g4-safe-valuerange/README.md` + +## Verdict + +**PASS** — with minor improvements. No correctness blockers found. + +The three transforms (OR/XOR range threading, SHR_U range threading, U64-range +compare-side OR-fold skip) are sound against current HEAD. The +`ValueRange::U64` invariant is **structurally enforced** today: every U64 +producer materializes literal MIR zero constants in limbs[1..3], so the +compare-side fast path's runtime safety does not rely on optimistic guesses. +The spec's line numbers (678, 692, 755 in the header; 2978, 3011, 3046 in the +.cpp) match HEAD exactly. + +Three nits in *Improvements* — none gate the implementation; can be addressed +in the same PR or follow-up. + +## Critical issues + +None. + +## Improvements + +### I1. Document the "literal MIR zero in upper limbs" invariant explicitly + +The spec says (lines 38–40 / 44–45): + +> EVM observable semantics: unchanged. All optimizations rely on `getRange()` +> annotations established at producer sites; values whose runtime bits do not +> match their annotation would be a pre-existing invariant violation. + +This is correct, but understates **why** today it is safe: every existing +`ValueRange::U64` producer materializes upper limbs as literal MIR zero +constants — the invariant is not just "runtime bits are zero", it is "the MIR +limb is a `createIntConstInstruction(I64Type, 0)`". I verified all eight +producer sites: + +- `src/compiler/evm_frontend/evm_mir_compiler.h:565` (general compare result — + built by `handleCompareImpl`, which calls `handleCompareEQZ` / `handleCompareEQ` + / `handleCompareGT_LT` — each writes `Result[1..3] = Zero` literally; see + `evm_mir_compiler.cpp:2771-2773`, `2802-2804`, and the GT_LT path) +- `evm_mir_compiler.h:629` (AND u64-const fast path) — `Result[I] = Zero` for `I >= 1` +- `evm_mir_compiler.cpp:2024` (DIV u64÷u64) — `U256Inst Result = {DivResult, Zero, Zero, Zero}` +- `evm_mir_compiler.cpp:2194` (MOD u64÷u64) — `U256Inst Result = {ModResult, Zero, Zero, Zero}` +- `evm_mir_compiler.cpp:3002-3006` (handleCompareEqU64) — `Result[I] = Zero` +- `evm_mir_compiler.cpp:3037-3041` (handleCompareLtRhsU64) — same +- `evm_mir_compiler.cpp:3073-3077` (handleCompareGtRhsU64) — same +- `evm_mir_compiler.cpp:3689-3693` (BYTE) — `ResultComponents[I] = Zero` + +(The Phase-2 AND NarrowRange path at `evm_mir_compiler.h:636-655` also writes +`Result[I] = Zero` whenever `NarrowRange == U64` and `I >= 1` — confirming the +invariant is already industry-standard inside this file.) + +Suggest adding a sentence to the *Risks* section that pins this down — and +that this PR preserves it: the new compare-side fast path **does not weaken** +the invariant because it only **reads** the Range tag and elides reading +`LHS[1..3]` entirely; it never *creates* a U64-tagged operand with +non-literal-zero upper limbs. + +### I2. Phase 2 OR/XOR helper is fine; "Result[1..3] = Other[1..3]" pass-through holds + +For Phase 1 OR/XOR fast path at `evm_mir_compiler.h:670-678`, the code +literally does: + +```cpp +Result[0] = protectUnsafeValue(, MirI64Type); +for (size_t I = 1; I < EVM_ELEMENTS_COUNT; ++I) { + Result[I] = Other[I]; +} +return Operand(Result, EVMType::UINT256); +``` + +The spec's proposed return-with-`OtherOp.getRange()` is sound because +`Result[1..3] == Other[1..3]` (identity pass-through; OR/XOR of `Other[I]` +with the U64 constant's upper limb, which is 0, is `Other[I]` and the code +correctly skips emitting that no-op). If `OtherOp` had `Range == U64`, its +upper limbs were already MIR zero (per I1), so `Result[1..3]` are also MIR +zero. Same logic for U128. **Spec claim verified.** + +One small textual nit: the spec at line 69 says "Limb[0] is recomputed; +limbs[1..3] pass through unchanged from `OtherOp`. Their value-range claim is +preserved." Consider tightening to: "limbs[1..3] are the *same MInstruction +pointers* as `Other[I]`, so any zero-limb materialization in the producer is +preserved bit-identically — no semantic re-derivation needed." + +### I3. SHR_U: confirm that result-limb MIR is **not** required to be literal-zero, only runtime-zero + +In `handleLogicalRightShift` (cpp:3291) — for a U64 input where +`Value[1..3]` are literal MIR zero constants: + +- Constant-shift path (3300-3355): result limbs are + `Select(IsLargeShift, Zero, R)` where `R` is composed from `Value[SrcIdx]` + and `Value[SrcIdx+1]` via `OP_ushr`/`OP_shl`/`OP_or`. If all `Value[J]` for + `J ≥ 1` are MIR zero constants, then `R` is shift/OR of zeros, which the + optimizer may or may not fold. **Runtime value is zero**, but MIR-level + identity is `Select(IsLargeShift, Zero, )`, **not** a literal + zero. +- Variable-shift path (3358-end): similar — variable-index selects from + `Value[*]`, but all the sources are zero at runtime. + +**Implication**: future consumers that re-read `extractU256Operand(SHR_U_result)` +and need MIR-level zero in limbs[1..3] (e.g., to constant-fold) **will not +get** literal zero — they will get a Select instruction. **Today's compare-side +fast path is unaffected** because it elides reading limbs[1..3] entirely when +`Range == U64`, but the spec should note that the SHR_U-produced Range +annotation is a *value-level* guarantee, not a *MIR-level* one. This matters +the moment a future opcode handler wants to constant-fold based on +`extractU256Operand[I] == Zero-instruction`. + +**Recommendation**: add a 1-line note to the *Risks* or *Contracts preserved* +section: "SHR_U-produced U64-range operands have runtime-zero upper limbs but +the corresponding MIR may not be a literal zero constant — consumers must +gate on `Range` rather than MIR-pattern-match upper limbs." + +### I4. Checklist is binary but missing one item + +The Implementation Checklist (lines 119-129) is mostly measurable. One gap: +no checklist item covers the **assertion / DCHECK** practice. Suggest adding: + +- [ ] No new producer site emits `ValueRange::U64` without setting + `Result[1..3] = Zero` at the MIR level. (Verify by reading the diff; + inheritance check, not codegen.) + +This is mostly self-policing since the PR only touches *consumers* (OR/XOR +threading is pass-through, SHR_U is read-only on `ValueOp.getRange()`), but +flagging it prevents future producer-extension PRs from regressing the +invariant silently. + +### I5. cfg-join interaction: textually orthogonal, semantically a soft coupling + +The spec (line 17) claims orthogonality with `perf/value-range-cfg-join`. +**Textual orthogonality holds** — the cfg-join diff against +`upstream/main` is 64 lines in `evm_mir_compiler.{h,cpp}` and touches: + +- `evm_mir_compiler.h:127` — `enum class ValueRange` → `using ValueRange = EVMValueRange` + (just refactors the enum into a shared header; no semantic change) +- `evm_mir_compiler.h:~260` — adds `void setRange(ValueRange NewRange)` +- `evm_mir_compiler.h:311` — `createStackEntryOperand` gains + `ValueRange Range = ValueRange::U256` parameter +- `evm_mir_compiler.cpp:1006-1017` — body of `createStackEntryOperand` +- `src/action/evm_bytecode_visitor.h:1132-1185` — calls `Opnd.setRange(EntryRanges[SlotIdx])` + on every operand popped at a lifted-block boundary, retrofitting the analyzer-computed range + +None of these overlap with G4-safe's touchpoints (`handleBitwiseOp:570-693`, +`handleShift:704-756`, three compare helpers at `evm_mir_compiler.cpp:2978/3011/3046`). +**The spec's "compose without conflict" claim is true at the file/diff level.** + +However, there is a **semantic soft coupling worth surfacing**: cfg-join +attaches `Range = U64` to operands whose `U256Var` backing variables may +contain *anything* in upper-limb slots — the analyzer guarantees value-level +zero, not MIR-level zero. The G4-safe compare fast path skips reading those +upper limbs and trusts the Range tag. **If cfg-join's analyzer ever has a +soundness bug that over-narrows a Range to U64, the compare-side change here +amplifies the impact from "wrong upper limbs in arithmetic" to "wrong +boolean result of EQ/LT/GT".** + +This is already the situation today for the AND NarrowRange path +(`evm_mir_compiler.h:633-655`), which also trusts `getRange()` and rewrites +upper limbs to Zero in the *output*. So G4-safe is **consistent with the +existing trust model**, not introducing a new one. But the spec should +acknowledge that cfg-join + G4-safe compose into a **trust-chain**: + +> compare-side fast path correctness ⇐ Range == U64 implies runtime upper +> limbs are zero ⇐ (today) producer materializes literal zero MIR ⇐ +> (with cfg-join) analyzer soundness + +Recommend adding to the *Risks* section: "When `perf/value-range-cfg-join` +lands, the compare-side fast path will additionally rely on the analyzer's +soundness — extend the analyzer's correctness test matrix to exercise EQ/LT/GT +with analyzer-narrowed range at the U64 boundary." + +## Spec compliance audit + +### Scope-creep check (passes) + +The spec's "out of scope" list (line 15) — SUB, SHL, SAR, ADDMOD, MULMOD, EXP, +signed compare — is internally consistent. The Implementation section never +re-introduces any of them: + +- Step 2 explicitly handles `BO_OR` / `BO_XOR` only (excludes `BO_AND` which + is already done). +- Step 3 explicitly handles `BO_SHR_U`; the `if constexpr (Operator == + BinaryOperator::BO_SHR_U)` guard at the proposed line 755 site explicitly + excludes SHL and SHR_S — verified against `evm_mir_compiler.h:747-753`. +- Step 4 explicitly names the three unsigned compare helpers; signed compare + (`CO_LT_S`, `CO_GT_S`, `handleCompareSlt*` if any) is untouched. + +No hidden scope-widening suggestion in the body. + +### Step-reference accuracy + +| Spec claim | Actual file:line | Match | +|---|---|---| +| `handleBitwiseOp` at `.h:570-693` | `.h:570-693` | ✓ | +| Phase 1 u64-const OR/XOR return at line 678 | `.h:678` (`return Operand(Result, EVMType::UINT256);`) | ✓ | +| Phase 5 general return at line 692 | `.h:692` (`return Operand(Result, EVMType::UINT256);`) | ✓ | +| `handleShift` at `.h:704-756` | `.h:704-756` (with return at 755) | ✓ | +| SHR return at line 755 | `.h:755` | ✓ | +| `handleCompareEqU64` around `.cpp:2978` | `.cpp:2978` | ✓ | +| `handleCompareLtRhsU64` around `.cpp:3011` | `.cpp:3011` | ✓ | +| `handleCompareGtRhsU64` around `.cpp:3046` | `.cpp:3046` | ✓ | +| `ValueRange` enum near `.h:127` | `.h:127-131` | ✓ | +| PR #458 → `fca0b1a` | `git log fca0b1a -1` confirms commit message starts `perf(evm): u256 arithmetic optimizations (shift/addmod/barrier/value-range/div-mod) (#458)` | ✓ | + +### "Result encoding for narrow-path compares unchanged" claim (passes) + +Spec (line 40) says `Result[0] = cmp`, `Result[1..3] = Zero`, `Range = U64`. +Verified for all three compare helpers — they always set `Result[I] = Zero` for +`I >= 1` regardless of whether the fast path was taken; the spec's "Preserve +the `protectUnsafeValue` wrap and the `Result[1..3] = Zero` tail in all three" +at line 99 is the correct guidance. + +## Verified facts + +| Claim | Source | +|---|---| +| `ValueRange` is `uint8_t` with `U64=0, U128=1, U256=2` | `evm_mir_compiler.h:127-131` | +| `Operand::getRange()` returns Range field | `evm_mir_compiler.h:259` | +| Range default is `U256` | `evm_mir_compiler.h:273` | +| `bothFitU64` checks both at `U64` | `evm_mir_compiler.h:262-264` | +| OR/XOR Phase-1 return drops range to default U256 today | `evm_mir_compiler.h:678` | +| OR/XOR Phase-5 general return drops range to default U256 today | `evm_mir_compiler.h:692` | +| SHR (all three) shares one return that drops range to default U256 today | `evm_mir_compiler.h:755` | +| `handleLogicalRightShift` body | `evm_mir_compiler.cpp:3291-3475` (approx) | +| `extractU256Operand` does not consult Range | `evm_mir_compiler.cpp:4900-4963` | +| All `ValueRange::U64` producers materialize literal-zero limbs[1..3] | See I1 list above | +| `handleCompareEqU64` upper-limbs OR-fold is at | `evm_mir_compiler.cpp:2991-3000` | +| `handleCompareLtRhsU64` upper-limbs OR-fold and SelectInstruction | `evm_mir_compiler.cpp:3020-3035` | +| `handleCompareGtRhsU64` upper-limbs OR-fold and SelectInstruction | `evm_mir_compiler.cpp:3056-3071` | +| cfg-join branch diff against upstream/main, evm_mir_compiler scope | 64 lines, no overlap with G4-safe touchpoints | +| `EVMValueRange` shared enum | `src/compiler/evm_frontend/evm_value_range.h` on `perf/value-range-cfg-join` | +| cfg-join retrofits Range onto popped operands | `src/action/evm_bytecode_visitor.h:1140-1185` on `perf/value-range-cfg-join` | + +## Summary + +Ship it. The spec is correct, scoped, and reproducible against HEAD. The +improvements above are documentation polish on the invariant chain (I1, I2, +I3, I5) and one optional checklist item (I4) — none block implementation. +The author can address them inline in the README or pick them up in PR +review comments. diff --git a/src/compiler/evm_frontend/evm_mir_compiler.cpp b/src/compiler/evm_frontend/evm_mir_compiler.cpp index c66401d16..276aa9b0c 100644 --- a/src/compiler/evm_frontend/evm_mir_compiler.cpp +++ b/src/compiler/evm_frontend/evm_mir_compiler.cpp @@ -2997,17 +2997,24 @@ EVMMirBuilder::handleCompareEqU64(const Operand &FullOp, uint64_t U64Val) { MInstruction *LowEq = createInstruction( false, EqPred, &Ctx.I64Type, LHS[0], CmpVal); - // Check that upper limbs are all zero via OR-fold - MInstruction *Upper = createInstruction( - false, OP_or, MirI64Type, LHS[1], LHS[2]); - Upper = createInstruction(false, OP_or, MirI64Type, Upper, - LHS[3]); - MInstruction *UpperZero = createInstruction( - false, EqPred, &Ctx.I64Type, Upper, Zero); + MInstruction *FinalResult; + if (FullOp.getRange() == ValueRange::U64) { + // Upper limbs are provably zero (every existing U64 producer materializes + // literal MIR Zero in limbs[1..3]). Skip the OR-fold and the zero-test. + FinalResult = LowEq; + } else { + // Check that upper limbs are all zero via OR-fold + MInstruction *Upper = createInstruction( + false, OP_or, MirI64Type, LHS[1], LHS[2]); + Upper = createInstruction(false, OP_or, MirI64Type, + Upper, LHS[3]); + MInstruction *UpperZero = createInstruction( + false, EqPred, &Ctx.I64Type, Upper, Zero); - // Final: low matches AND upper is zero - MInstruction *FinalResult = createInstruction( - false, OP_and, &Ctx.I64Type, LowEq, UpperZero); + // Final: low matches AND upper is zero + FinalResult = createInstruction( + false, OP_and, &Ctx.I64Type, LowEq, UpperZero); + } U256Inst Result = {}; Result[0] = protectUnsafeValue(FinalResult, MirI64Type); @@ -3027,22 +3034,29 @@ EVMMirBuilder::handleCompareLtRhsU64(const Operand &LHSOp, uint64_t RhsU64) { EVMFrontendContext::getMIRTypeFromEVMType(EVMType::UINT64); MInstruction *Zero = createIntConstInstruction(MirI64Type, 0); - MInstruction *Upper = createInstruction( - false, OP_or, MirI64Type, LHS[1], LHS[2]); - Upper = createInstruction(false, OP_or, MirI64Type, Upper, - LHS[3]); - auto NePred = CmpInstruction::Predicate::ICMP_NE; - MInstruction *HasUpper = createInstruction( - false, NePred, &Ctx.I64Type, Upper, Zero); - MInstruction *RhsVal = createIntConstInstruction(MirI64Type, RhsU64); auto LtPred = CmpInstruction::Predicate::ICMP_ULT; MInstruction *LowLt = createInstruction( false, LtPred, &Ctx.I64Type, LHS[0], RhsVal); - // result = hasUpper ? 0 : lowLt - MInstruction *FinalResult = createInstruction( - false, &Ctx.I64Type, HasUpper, Zero, LowLt); + MInstruction *FinalResult; + if (LHSOp.getRange() == ValueRange::U64) { + // Upper limbs are provably zero — HasUpper would always be false, so the + // select collapses to LowLt. + FinalResult = LowLt; + } else { + MInstruction *Upper = createInstruction( + false, OP_or, MirI64Type, LHS[1], LHS[2]); + Upper = createInstruction(false, OP_or, MirI64Type, + Upper, LHS[3]); + auto NePred = CmpInstruction::Predicate::ICMP_NE; + MInstruction *HasUpper = createInstruction( + false, NePred, &Ctx.I64Type, Upper, Zero); + + // result = hasUpper ? 0 : lowLt + FinalResult = createInstruction(false, &Ctx.I64Type, + HasUpper, Zero, LowLt); + } U256Inst Result = {}; Result[0] = protectUnsafeValue(FinalResult, MirI64Type); @@ -3061,24 +3075,32 @@ EVMMirBuilder::handleCompareGtRhsU64(const Operand &LHSOp, uint64_t RhsU64) { MType *MirI64Type = EVMFrontendContext::getMIRTypeFromEVMType(EVMType::UINT64); MInstruction *Zero = createIntConstInstruction(MirI64Type, 0); - MInstruction *One = createIntConstInstruction(MirI64Type, 1); - - MInstruction *Upper = createInstruction( - false, OP_or, MirI64Type, LHS[1], LHS[2]); - Upper = createInstruction(false, OP_or, MirI64Type, Upper, - LHS[3]); - auto NePred = CmpInstruction::Predicate::ICMP_NE; - MInstruction *HasUpper = createInstruction( - false, NePred, &Ctx.I64Type, Upper, Zero); MInstruction *RhsVal = createIntConstInstruction(MirI64Type, RhsU64); auto GtPred = CmpInstruction::Predicate::ICMP_UGT; MInstruction *LowGt = createInstruction( false, GtPred, &Ctx.I64Type, LHS[0], RhsVal); - // result = hasUpper ? 1 : lowGt - MInstruction *FinalResult = createInstruction( - false, &Ctx.I64Type, HasUpper, One, LowGt); + MInstruction *FinalResult; + if (LHSOp.getRange() == ValueRange::U64) { + // Upper limbs are provably zero — HasUpper would always be false, so the + // select collapses to LowGt. + FinalResult = LowGt; + } else { + MInstruction *One = createIntConstInstruction(MirI64Type, 1); + + MInstruction *Upper = createInstruction( + false, OP_or, MirI64Type, LHS[1], LHS[2]); + Upper = createInstruction(false, OP_or, MirI64Type, + Upper, LHS[3]); + auto NePred = CmpInstruction::Predicate::ICMP_NE; + MInstruction *HasUpper = createInstruction( + false, NePred, &Ctx.I64Type, Upper, Zero); + + // result = hasUpper ? 1 : lowGt + FinalResult = createInstruction(false, &Ctx.I64Type, + HasUpper, One, LowGt); + } U256Inst Result = {}; Result[0] = protectUnsafeValue(FinalResult, MirI64Type); diff --git a/src/compiler/evm_frontend/evm_mir_compiler.h b/src/compiler/evm_frontend/evm_mir_compiler.h index 6519acc38..7e20b3e1d 100644 --- a/src/compiler/evm_frontend/evm_mir_compiler.h +++ b/src/compiler/evm_frontend/evm_mir_compiler.h @@ -264,6 +264,11 @@ class EVMMirBuilder final { return A.getRange() == ValueRange::U64 && B.getRange() == ValueRange::U64; } + // Pick the wider range between two operands (monotone join for OR/XOR). + static ValueRange maxRange(const Operand &A, const Operand &B) { + return A.getRange() > B.getRange() ? A.getRange() : B.getRange(); + } + constexpr bool isReg() { return false; } constexpr bool isTempReg() { return true; } @@ -676,7 +681,10 @@ class EVMMirBuilder final { for (size_t I = 1; I < EVM_ELEMENTS_COUNT; ++I) { Result[I] = Other[I]; } - return Operand(Result, EVMType::UINT256); + // OR/XOR with a u64 constant: limbs[1..3] pass through as the same + // MInstruction pointers from OtherOp, so the value range of those + // limbs is preserved exactly. max(U64, OtherOp.range) = OtherOp.range. + return Operand(Result, EVMType::UINT256, OtherOp.getRange()); } } @@ -690,6 +698,13 @@ class EVMMirBuilder final { false, getMirOpcode(Operator), MirI64Type, LHS[I], RHS[I]); Result[I] = protectUnsafeValue(LocalResult, MirI64Type); } + // OR/XOR are monotone on range: bits set in either operand are set in the + // result, so result range = max(LHS, RHS). AND already returns earlier with + // its own narrowed range, so this branch only fires for OR/XOR. + if constexpr (Operator == BinaryOperator::BO_OR || + Operator == BinaryOperator::BO_XOR) { + return Operand(Result, EVMType::UINT256, Operand::maxRange(LHSOp, RHSOp)); + } return Operand(Result, EVMType::UINT256); } @@ -753,6 +768,12 @@ class EVMMirBuilder final { Result = handleArithmeticRightShift(Value, ShiftAmount, IsLargeShift); } + // Unsigned right shift cannot widen: an N-bit value shifted right yields an + // at-most-N-bit value. SHL widens by construction; SAR sign-fills upper + // limbs; both keep the conservative U256 default. + if constexpr (Operator == BinaryOperator::BO_SHR_U) { + return Operand(Result, EVMType::UINT256, ValueOp.getRange()); + } return Operand(Result, EVMType::UINT256); } From 6168f5a53d84e731da01c47030c569100433d51a Mon Sep 17 00:00:00 2001 From: Abmcar Date: Tue, 12 May 2026 16:42:44 +0800 Subject: [PATCH 2/6] perf(compiler): extend U64 compare narrow path to U128-range operands MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit For LHS operands tagged ValueRange::U128, limbs[2..3] are provably zero (every existing U128 producer materializes literal MIR Zero in those positions — see invariant chain in 2026-05-12-g4-safe-valuerange/README.md). The previous U256 fallback in handleCompareEqU64 / LtRhsU64 / GtRhsU64 emitted a 3-limb OR-fold (LHS[1] | LHS[2] | LHS[3]); for U128 inputs that collapses to a single LHS[1] vs Zero check. Per-helper savings vs the U256 fallback path: - handleCompareEqU64: 2 OR instructions saved (single CMP + AND remains). - handleCompareLtRhsU64: 2 OR saved (single NE-CMP + Select remains). - handleCompareGtRhsU64: 2 OR saved (single NE-CMP + Select remains). Also initialize FinalResult to nullptr in all three helpers as defensive practice against future branch additions (no behavior change today). Local validation on g4-safe-valuerange: - tools/format.sh check clean - multipass unittests 223/223 - multipass statetest -k fork_Cancun 2723/2723 - 27-bench local ping-pong is too noisy on this machine (baseline drifts ~10pp between back-to-back runs); deferring perf judgment to the CI multipass regression check. Co-Authored-By: Claude Opus 4.7 (1M context) --- .../evm_frontend/evm_mir_compiler.cpp | 27 ++++++++++++++++--- 1 file changed, 24 insertions(+), 3 deletions(-) diff --git a/src/compiler/evm_frontend/evm_mir_compiler.cpp b/src/compiler/evm_frontend/evm_mir_compiler.cpp index 276aa9b0c..03e52b761 100644 --- a/src/compiler/evm_frontend/evm_mir_compiler.cpp +++ b/src/compiler/evm_frontend/evm_mir_compiler.cpp @@ -2997,11 +2997,17 @@ EVMMirBuilder::handleCompareEqU64(const Operand &FullOp, uint64_t U64Val) { MInstruction *LowEq = createInstruction( false, EqPred, &Ctx.I64Type, LHS[0], CmpVal); - MInstruction *FinalResult; + MInstruction *FinalResult = nullptr; if (FullOp.getRange() == ValueRange::U64) { // Upper limbs are provably zero (every existing U64 producer materializes // literal MIR Zero in limbs[1..3]). Skip the OR-fold and the zero-test. FinalResult = LowEq; + } else if (FullOp.getRange() == ValueRange::U128) { + // Limbs[2..3] are provably zero — only LHS[1] needs to be checked. + MInstruction *UpperZero = createInstruction( + false, EqPred, &Ctx.I64Type, LHS[1], Zero); + FinalResult = createInstruction( + false, OP_and, &Ctx.I64Type, LowEq, UpperZero); } else { // Check that upper limbs are all zero via OR-fold MInstruction *Upper = createInstruction( @@ -3039,11 +3045,18 @@ EVMMirBuilder::handleCompareLtRhsU64(const Operand &LHSOp, uint64_t RhsU64) { MInstruction *LowLt = createInstruction( false, LtPred, &Ctx.I64Type, LHS[0], RhsVal); - MInstruction *FinalResult; + MInstruction *FinalResult = nullptr; if (LHSOp.getRange() == ValueRange::U64) { // Upper limbs are provably zero — HasUpper would always be false, so the // select collapses to LowLt. FinalResult = LowLt; + } else if (LHSOp.getRange() == ValueRange::U128) { + // Limbs[2..3] are provably zero — HasUpper reduces to LHS[1] != 0. + auto NePred = CmpInstruction::Predicate::ICMP_NE; + MInstruction *HasUpper = createInstruction( + false, NePred, &Ctx.I64Type, LHS[1], Zero); + FinalResult = createInstruction(false, &Ctx.I64Type, + HasUpper, Zero, LowLt); } else { MInstruction *Upper = createInstruction( false, OP_or, MirI64Type, LHS[1], LHS[2]); @@ -3081,11 +3094,19 @@ EVMMirBuilder::handleCompareGtRhsU64(const Operand &LHSOp, uint64_t RhsU64) { MInstruction *LowGt = createInstruction( false, GtPred, &Ctx.I64Type, LHS[0], RhsVal); - MInstruction *FinalResult; + MInstruction *FinalResult = nullptr; if (LHSOp.getRange() == ValueRange::U64) { // Upper limbs are provably zero — HasUpper would always be false, so the // select collapses to LowGt. FinalResult = LowGt; + } else if (LHSOp.getRange() == ValueRange::U128) { + // Limbs[2..3] are provably zero — HasUpper reduces to LHS[1] != 0. + MInstruction *One = createIntConstInstruction(MirI64Type, 1); + auto NePred = CmpInstruction::Predicate::ICMP_NE; + MInstruction *HasUpper = createInstruction( + false, NePred, &Ctx.I64Type, LHS[1], Zero); + FinalResult = createInstruction(false, &Ctx.I64Type, + HasUpper, One, LowGt); } else { MInstruction *One = createIntConstInstruction(MirI64Type, 1); From f63608afb57c845794095bf66cd8d18e37a8539e Mon Sep 17 00:00:00 2001 From: Abmcar Date: Wed, 13 May 2026 11:03:19 +0800 Subject: [PATCH 3/6] refactor(compiler): tighten G4-safe code (review follow-ups) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Cleanup pass on PR #499 from /simplify review: - Drop the Operand::maxRange one-line helper; use std::max at the single call site to match the existing std::min precedent for AND narrowing. - Add a "ordering is load-bearing" invariant note above the ValueRange enum to guard against silent breakage from future reorderings. - Drop the defensive `MInstruction *FinalResult = nullptr;` init in the three compare helpers; the if/else-if/else cascade already assigns on every path, and removing the nullptr lets the compiler warn loudly if a future branch addition skips assignment. - Trim Phase 5 OR/XOR comment from 3 lines to 1 (the AND-returns-earlier half is already encoded by the if constexpr guard). - Drop the WHAT-style trailing comments in the U256 fallback paths ("// Final: low matches AND upper is zero", "// result = hasUpper ? ...") that restate the next assignment. No behavior change. Local validation: format clean, dtvmapi builds, 223 / 223 multipass unittests pass. Statetest skipped (diff is structurally equivalent — only comment/init/helper-naming changes). Co-Authored-By: Claude Opus 4.7 (1M context) --- src/compiler/evm_frontend/evm_mir_compiler.cpp | 14 +++----------- src/compiler/evm_frontend/evm_mir_compiler.h | 12 +++--------- 2 files changed, 6 insertions(+), 20 deletions(-) diff --git a/src/compiler/evm_frontend/evm_mir_compiler.cpp b/src/compiler/evm_frontend/evm_mir_compiler.cpp index 03e52b761..f8610d724 100644 --- a/src/compiler/evm_frontend/evm_mir_compiler.cpp +++ b/src/compiler/evm_frontend/evm_mir_compiler.cpp @@ -2997,7 +2997,7 @@ EVMMirBuilder::handleCompareEqU64(const Operand &FullOp, uint64_t U64Val) { MInstruction *LowEq = createInstruction( false, EqPred, &Ctx.I64Type, LHS[0], CmpVal); - MInstruction *FinalResult = nullptr; + MInstruction *FinalResult; if (FullOp.getRange() == ValueRange::U64) { // Upper limbs are provably zero (every existing U64 producer materializes // literal MIR Zero in limbs[1..3]). Skip the OR-fold and the zero-test. @@ -3009,15 +3009,12 @@ EVMMirBuilder::handleCompareEqU64(const Operand &FullOp, uint64_t U64Val) { FinalResult = createInstruction( false, OP_and, &Ctx.I64Type, LowEq, UpperZero); } else { - // Check that upper limbs are all zero via OR-fold MInstruction *Upper = createInstruction( false, OP_or, MirI64Type, LHS[1], LHS[2]); Upper = createInstruction(false, OP_or, MirI64Type, Upper, LHS[3]); MInstruction *UpperZero = createInstruction( false, EqPred, &Ctx.I64Type, Upper, Zero); - - // Final: low matches AND upper is zero FinalResult = createInstruction( false, OP_and, &Ctx.I64Type, LowEq, UpperZero); } @@ -3045,7 +3042,7 @@ EVMMirBuilder::handleCompareLtRhsU64(const Operand &LHSOp, uint64_t RhsU64) { MInstruction *LowLt = createInstruction( false, LtPred, &Ctx.I64Type, LHS[0], RhsVal); - MInstruction *FinalResult = nullptr; + MInstruction *FinalResult; if (LHSOp.getRange() == ValueRange::U64) { // Upper limbs are provably zero — HasUpper would always be false, so the // select collapses to LowLt. @@ -3065,8 +3062,6 @@ EVMMirBuilder::handleCompareLtRhsU64(const Operand &LHSOp, uint64_t RhsU64) { auto NePred = CmpInstruction::Predicate::ICMP_NE; MInstruction *HasUpper = createInstruction( false, NePred, &Ctx.I64Type, Upper, Zero); - - // result = hasUpper ? 0 : lowLt FinalResult = createInstruction(false, &Ctx.I64Type, HasUpper, Zero, LowLt); } @@ -3094,7 +3089,7 @@ EVMMirBuilder::handleCompareGtRhsU64(const Operand &LHSOp, uint64_t RhsU64) { MInstruction *LowGt = createInstruction( false, GtPred, &Ctx.I64Type, LHS[0], RhsVal); - MInstruction *FinalResult = nullptr; + MInstruction *FinalResult; if (LHSOp.getRange() == ValueRange::U64) { // Upper limbs are provably zero — HasUpper would always be false, so the // select collapses to LowGt. @@ -3109,7 +3104,6 @@ EVMMirBuilder::handleCompareGtRhsU64(const Operand &LHSOp, uint64_t RhsU64) { HasUpper, One, LowGt); } else { MInstruction *One = createIntConstInstruction(MirI64Type, 1); - MInstruction *Upper = createInstruction( false, OP_or, MirI64Type, LHS[1], LHS[2]); Upper = createInstruction(false, OP_or, MirI64Type, @@ -3117,8 +3111,6 @@ EVMMirBuilder::handleCompareGtRhsU64(const Operand &LHSOp, uint64_t RhsU64) { auto NePred = CmpInstruction::Predicate::ICMP_NE; MInstruction *HasUpper = createInstruction( false, NePred, &Ctx.I64Type, Upper, Zero); - - // result = hasUpper ? 1 : lowGt FinalResult = createInstruction(false, &Ctx.I64Type, HasUpper, One, LowGt); } diff --git a/src/compiler/evm_frontend/evm_mir_compiler.h b/src/compiler/evm_frontend/evm_mir_compiler.h index 7e20b3e1d..6734a9d97 100644 --- a/src/compiler/evm_frontend/evm_mir_compiler.h +++ b/src/compiler/evm_frontend/evm_mir_compiler.h @@ -264,11 +264,6 @@ class EVMMirBuilder final { return A.getRange() == ValueRange::U64 && B.getRange() == ValueRange::U64; } - // Pick the wider range between two operands (monotone join for OR/XOR). - static ValueRange maxRange(const Operand &A, const Operand &B) { - return A.getRange() > B.getRange() ? A.getRange() : B.getRange(); - } - constexpr bool isReg() { return false; } constexpr bool isTempReg() { return true; } @@ -698,12 +693,11 @@ class EVMMirBuilder final { false, getMirOpcode(Operator), MirI64Type, LHS[I], RHS[I]); Result[I] = protectUnsafeValue(LocalResult, MirI64Type); } - // OR/XOR are monotone on range: bits set in either operand are set in the - // result, so result range = max(LHS, RHS). AND already returns earlier with - // its own narrowed range, so this branch only fires for OR/XOR. + // OR/XOR are monotone on range: result range = max(LHS, RHS). if constexpr (Operator == BinaryOperator::BO_OR || Operator == BinaryOperator::BO_XOR) { - return Operand(Result, EVMType::UINT256, Operand::maxRange(LHSOp, RHSOp)); + return Operand(Result, EVMType::UINT256, + std::max(LHSOp.getRange(), RHSOp.getRange())); } return Operand(Result, EVMType::UINT256); } From 2139cede5ed26d033dc67832dbe071cbf1fe340e Mon Sep 17 00:00:00 2001 From: Abmcar Date: Fri, 15 May 2026 10:57:54 +0800 Subject: [PATCH 4/6] refactor(compiler): tighten G4-safe invariants per Copilot review MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Static-assert ValueRange enum ordering (U64 < U128 < U256) so the std::min/std::max contract in AND/OR/XOR can't be silently broken. - Add isLiteralZero debug helper and assert it on the upper limbs of the U64/U128 fast paths in handleCompareEqU64, handleCompareLtRhsU64, handleCompareGtRhsU64 — promotes the producer-side invariant from prose to a runtime check (ZEN_ASSERT, zero cost in release). - Fix README wording: 4-limb OR-fold -> 3-upper-limb OR-fold. Addresses Copilot review on #499. Co-Authored-By: Claude Opus 4.7 --- .../2026-05-12-g4-safe-valuerange/README.md | 2 +- .../evm_frontend/evm_mir_compiler.cpp | 24 +++++++++++++++++++ src/compiler/evm_frontend/evm_value_range.h | 11 +++++++++ 3 files changed, 36 insertions(+), 1 deletion(-) diff --git a/docs/changes/2026-05-12-g4-safe-valuerange/README.md b/docs/changes/2026-05-12-g4-safe-valuerange/README.md index cbb6905ec..e43c88a95 100644 --- a/docs/changes/2026-05-12-g4-safe-valuerange/README.md +++ b/docs/changes/2026-05-12-g4-safe-valuerange/README.md @@ -22,7 +22,7 @@ This work is **orthogonal** to the `perf/value-range-cfg-join` branch, which per - `(narrow_mul a, b) | c` → range falls back to U256, blocking any downstream narrow-path lowering. - `(narrow_mul a, b) >> k` → same loss. -- `(narrow_mul a, b) < C64` → constant fast path triggers, but the helper still emits a 4-limb OR-fold check on the wide operand's upper limbs that we *know* are zero. +- `(narrow_mul a, b) < C64` → constant fast path triggers, but the helper still emits a 3-upper-limb OR-fold check (`LHS[1] | LHS[2] | LHS[3]`) that we *know* is zero. The 2026-05-12 verification doc isolated this as the smallest, lowest-risk subset of the G4 work: monotone propagation only, no wraparound or sign-fill semantics. The win is small per opcode but **load-bearing** for any future narrow consumer (compare-via-range, narrow lowering paths for additional opcodes). diff --git a/src/compiler/evm_frontend/evm_mir_compiler.cpp b/src/compiler/evm_frontend/evm_mir_compiler.cpp index f8610d724..82ded9569 100644 --- a/src/compiler/evm_frontend/evm_mir_compiler.cpp +++ b/src/compiler/evm_frontend/evm_mir_compiler.cpp @@ -2984,6 +2984,21 @@ EVMMirBuilder::handleSubU64Const(const Operand &LHSOp, return Operand(Result, EVMType::UINT256); } +namespace { +// True iff Inst is a materialized integer constant whose value is zero. +// Used to assert the upper-limb invariant of ValueRange::{U64,U128} producers +// in compare fast paths: skipping the OR-fold is only safe when those limbs +// are literal MIR zero. See evm_mir_compiler.h ValueRange comment. +bool isLiteralZero(MInstruction *Inst) { + if (auto *CI = llvm::dyn_cast(Inst)) { + if (auto *IntConst = llvm::dyn_cast(&CI->getConstant())) { + return IntConst->getValue().isZero(); + } + } + return false; +} +} // namespace + typename EVMMirBuilder::Operand EVMMirBuilder::handleCompareEqU64(const Operand &FullOp, uint64_t U64Val) { U256Inst LHS = extractU256Operand(FullOp); @@ -3001,9 +3016,12 @@ EVMMirBuilder::handleCompareEqU64(const Operand &FullOp, uint64_t U64Val) { if (FullOp.getRange() == ValueRange::U64) { // Upper limbs are provably zero (every existing U64 producer materializes // literal MIR Zero in limbs[1..3]). Skip the OR-fold and the zero-test. + ZEN_ASSERT(isLiteralZero(LHS[1]) && isLiteralZero(LHS[2]) && + isLiteralZero(LHS[3])); FinalResult = LowEq; } else if (FullOp.getRange() == ValueRange::U128) { // Limbs[2..3] are provably zero — only LHS[1] needs to be checked. + ZEN_ASSERT(isLiteralZero(LHS[2]) && isLiteralZero(LHS[3])); MInstruction *UpperZero = createInstruction( false, EqPred, &Ctx.I64Type, LHS[1], Zero); FinalResult = createInstruction( @@ -3046,9 +3064,12 @@ EVMMirBuilder::handleCompareLtRhsU64(const Operand &LHSOp, uint64_t RhsU64) { if (LHSOp.getRange() == ValueRange::U64) { // Upper limbs are provably zero — HasUpper would always be false, so the // select collapses to LowLt. + ZEN_ASSERT(isLiteralZero(LHS[1]) && isLiteralZero(LHS[2]) && + isLiteralZero(LHS[3])); FinalResult = LowLt; } else if (LHSOp.getRange() == ValueRange::U128) { // Limbs[2..3] are provably zero — HasUpper reduces to LHS[1] != 0. + ZEN_ASSERT(isLiteralZero(LHS[2]) && isLiteralZero(LHS[3])); auto NePred = CmpInstruction::Predicate::ICMP_NE; MInstruction *HasUpper = createInstruction( false, NePred, &Ctx.I64Type, LHS[1], Zero); @@ -3093,9 +3114,12 @@ EVMMirBuilder::handleCompareGtRhsU64(const Operand &LHSOp, uint64_t RhsU64) { if (LHSOp.getRange() == ValueRange::U64) { // Upper limbs are provably zero — HasUpper would always be false, so the // select collapses to LowGt. + ZEN_ASSERT(isLiteralZero(LHS[1]) && isLiteralZero(LHS[2]) && + isLiteralZero(LHS[3])); FinalResult = LowGt; } else if (LHSOp.getRange() == ValueRange::U128) { // Limbs[2..3] are provably zero — HasUpper reduces to LHS[1] != 0. + ZEN_ASSERT(isLiteralZero(LHS[2]) && isLiteralZero(LHS[3])); MInstruction *One = createIntConstInstruction(MirI64Type, 1); auto NePred = CmpInstruction::Predicate::ICMP_NE; MInstruction *HasUpper = createInstruction( diff --git a/src/compiler/evm_frontend/evm_value_range.h b/src/compiler/evm_frontend/evm_value_range.h index 01cab3a1f..0c1a6a448 100644 --- a/src/compiler/evm_frontend/evm_value_range.h +++ b/src/compiler/evm_frontend/evm_value_range.h @@ -12,11 +12,22 @@ namespace COMPILER { // single-instruction fast paths instead of expensive multi-limb arithmetic. // Shared between the EVM IR builder's `Operand` abstraction and EVMAnalyzer's // per-block-entry stack-slot range tracking. +// +// The enumerator ordering is load-bearing: ordinal must be non-decreasing in +// width (U64 < U128 < U256). bothFitU64, the std::min narrowing in AND, and +// the std::max widening in OR/XOR all rely on this. enum class EVMValueRange : uint8_t { U64, // Fits in 64 bits (limbs [1..3] == 0) U128, // Fits in 128 bits (limbs [2..3] == 0) U256, // Full 256 bits — conservative default }; +static_assert(static_cast(EVMValueRange::U64) < + static_cast(EVMValueRange::U128) && + static_cast(EVMValueRange::U128) < + static_cast(EVMValueRange::U256), + "EVMValueRange enumerators must be ordered U64 < U128 < U256; " + "AND uses std::min for narrowing and OR/XOR uses std::max for " + "widening, both rely on this ordinal contract."); } // namespace COMPILER From 8e0cd91add5da446809a0900364bfd7effdb29b9 Mon Sep 17 00:00:00 2001 From: Abmcar Date: Fri, 15 May 2026 11:09:04 +0800 Subject: [PATCH 5/6] chore(ci): trigger re-run after rapidjson fetch 504 Co-Authored-By: Claude Opus 4.7 From 7c67753a4a51e2c50d1bdb408c2169142f4de9b8 Mon Sep 17 00:00:00 2001 From: Abmcar Date: Fri, 15 May 2026 21:32:10 +0800 Subject: [PATCH 6/6] fix(compiler): drop isLiteralZero assertions broken by cfg-join landing MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The isLiteralZero ZEN_ASSERTs in the U64/U128 compare fast paths were checking a producer-side structural invariant ("upper limbs are literal MIR Zero constants"). PR #493 (EVMRangeAnalyzer cfg-join) widened the Range contract from "literal-zero MIR" to "value-level zero" — the analyzer now retrofits Range = U64 onto operands whose backing variables hold any MIR that evaluates to a u64-fitting value. The old assertions fired on analyzer-narrowed operands and crashed the JIT (CI hit SIGABRT on stEIP1153_transientStorage.transStorageReset). Remove the helper and the six ZEN_ASSERTs; the surrounding comments now say "value-zero by Range contract" instead of "provably zero (literal MIR)". The fast paths themselves are unchanged and still correct under the new contract — they read Range and elide reading upper limbs, which is sound for both producer-direct and analyzer-derived U64 tags. Update the change-doc's "Invariant chain" section to reflect the two producer paths (direct materialization + analyzer narrowing) now in effect, instead of the old "today vs. with-cfg-join" framing. Local validation: format clean, dtvmapi builds, 223/223 multipass unittests pass, 2723/2723 statetest -k fork_Cancun pass. Co-Authored-By: Claude Opus 4.7 --- .../2026-05-12-g4-safe-valuerange/README.md | 26 +++-------- .../evm_frontend/evm_mir_compiler.cpp | 44 +++++-------------- 2 files changed, 18 insertions(+), 52 deletions(-) diff --git a/docs/changes/2026-05-12-g4-safe-valuerange/README.md b/docs/changes/2026-05-12-g4-safe-valuerange/README.md index e43c88a95..de257cda1 100644 --- a/docs/changes/2026-05-12-g4-safe-valuerange/README.md +++ b/docs/changes/2026-05-12-g4-safe-valuerange/README.md @@ -46,30 +46,18 @@ The 2026-05-12 verification doc isolated this as the smallest, lowest-risk subse ### Invariant chain (correctness of the compare-side fast path) -Today, every `ValueRange::U64` producer materializes **literal MIR zero constants** in limbs[1..3] — not just modeled-zero. Verified producer sites: +The compare-side fast path trusts a **value-level** invariant: when an operand carries `ValueRange::U64`, its upper limbs evaluate to zero at runtime — even if the MIR for those limbs is not a literal `Zero` constant. This is the **Range contract**. -- `evm_mir_compiler.h:565` (general compare result via `handleCompareImpl` → `handleCompareEQZ` / `handleCompareEQ` / `handleCompareGT_LT`; each writes `Result[1..3] = Zero` literally) -- `evm_mir_compiler.h:629` (AND u64-const fast path: `Result[I] = Zero` for `I ≥ 1`) -- `evm_mir_compiler.cpp:2024` (DIV u64÷u64: `{DivResult, Zero, Zero, Zero}`) -- `evm_mir_compiler.cpp:2194` (MOD u64÷u64: `{ModResult, Zero, Zero, Zero}`) -- `evm_mir_compiler.cpp:3002-3006` / `3037-3041` / `3073-3077` (the three existing `handleCompare*U64` helpers — `Result[I] = Zero`) -- `evm_mir_compiler.cpp:3689-3693` (BYTE: `ResultComponents[I] = Zero`) +Two independent producer paths uphold the contract: -The new compare-side fast path **does not weaken** this invariant: it only reads the `Range` tag and *elides reading* `LHS[1..3]`. It never creates a U64-tagged operand with non-literal-zero upper limbs. +1. **Direct materialization** (pre-existing, since PR #458). Every producer that explicitly assigns `Range = U64` also writes literal MIR `Zero` to `limbs[1..3]`. Verified producer sites: `evm_mir_compiler.h:565` (general compare result), `evm_mir_compiler.h:629` (AND u64-const fast path), `evm_mir_compiler.cpp:2024` (DIV u64÷u64), `evm_mir_compiler.cpp:2194` (MOD u64÷u64), the three existing `handleCompare*U64` helpers, and BYTE. +2. **Analyzer-derived narrowing** (PR #493, `EVMRangeAnalyzer`). The dataflow analyzer retrofits `Range = U64` onto stack-popped operands whose backing variables hold any MIR that *evaluates to* a u64-fitting value. The analyzer guarantees value-level zero in `limbs[1..3]`, not literal-zero MIR. -**SHR_U caveat (value-level, not MIR-level)**: `handleLogicalRightShift` emits `Select(IsLargeShift, Zero, )` for upper limbs when the input has `ValueRange::U64`. The *runtime value* is zero (correct), but the *MIR* is not necessarily a literal `Zero` constant. Future consumers must gate on `Range` rather than MIR-pattern-match upper limbs. The compare-side fast path proposed here is unaffected because it elides reading upper limbs entirely when `Range == U64`. +The new compare-side fast path **does not weaken** the Range contract: it only reads `getRange()` and elides reading `LHS[1..3]`. It never creates a U64-tagged operand with non-zero upper-limb values. -### Compose with `perf/value-range-cfg-join` +**SHR_U caveat (value-level, not MIR-level)**: `handleLogicalRightShift` emits `Select(IsLargeShift, Zero, )` for upper limbs when the input has `ValueRange::U64`. The runtime value is zero, but the MIR is not necessarily a literal `Zero`. Consumers must gate on `Range` rather than MIR-pattern-match upper limbs — the compare-side fast path here does exactly that. -Once `perf/value-range-cfg-join` lands, `setRange(...)` will retrofit `Range = U64` onto stack-popped operands whose backing variables may contain *anything* at the MIR level (the analyzer guarantees value-level zero, not MIR-level zero). This change extends the **trust chain** for compare-side correctness from: - -> compare correctness ⇐ Range == U64 ⇐ producer materializes literal-zero MIR - -to: - -> compare correctness ⇐ Range == U64 ⇐ (today) literal-zero MIR OR (with cfg-join) analyzer soundness - -The same trust model already applies to the AND `NarrowRange` path at `evm_mir_compiler.h:633-655`. The combination is consistent with current practice; the analyzer's correctness tests should exercise EQ/LT/GT with analyzer-narrowed U64 ranges once cfg-join lands. +The same trust model already applies to the AND `NarrowRange` path at `evm_mir_compiler.h:633-655`. ## Implementation diff --git a/src/compiler/evm_frontend/evm_mir_compiler.cpp b/src/compiler/evm_frontend/evm_mir_compiler.cpp index 82ded9569..7558ca424 100644 --- a/src/compiler/evm_frontend/evm_mir_compiler.cpp +++ b/src/compiler/evm_frontend/evm_mir_compiler.cpp @@ -2984,21 +2984,6 @@ EVMMirBuilder::handleSubU64Const(const Operand &LHSOp, return Operand(Result, EVMType::UINT256); } -namespace { -// True iff Inst is a materialized integer constant whose value is zero. -// Used to assert the upper-limb invariant of ValueRange::{U64,U128} producers -// in compare fast paths: skipping the OR-fold is only safe when those limbs -// are literal MIR zero. See evm_mir_compiler.h ValueRange comment. -bool isLiteralZero(MInstruction *Inst) { - if (auto *CI = llvm::dyn_cast(Inst)) { - if (auto *IntConst = llvm::dyn_cast(&CI->getConstant())) { - return IntConst->getValue().isZero(); - } - } - return false; -} -} // namespace - typename EVMMirBuilder::Operand EVMMirBuilder::handleCompareEqU64(const Operand &FullOp, uint64_t U64Val) { U256Inst LHS = extractU256Operand(FullOp); @@ -3014,14 +2999,11 @@ EVMMirBuilder::handleCompareEqU64(const Operand &FullOp, uint64_t U64Val) { MInstruction *FinalResult; if (FullOp.getRange() == ValueRange::U64) { - // Upper limbs are provably zero (every existing U64 producer materializes - // literal MIR Zero in limbs[1..3]). Skip the OR-fold and the zero-test. - ZEN_ASSERT(isLiteralZero(LHS[1]) && isLiteralZero(LHS[2]) && - isLiteralZero(LHS[3])); + // Upper limbs are value-zero by Range contract (producer materialization + // or EVMRangeAnalyzer narrowing). Skip the OR-fold and the zero-test. FinalResult = LowEq; } else if (FullOp.getRange() == ValueRange::U128) { - // Limbs[2..3] are provably zero — only LHS[1] needs to be checked. - ZEN_ASSERT(isLiteralZero(LHS[2]) && isLiteralZero(LHS[3])); + // Limbs[2..3] are value-zero by Range contract — only LHS[1] needs check. MInstruction *UpperZero = createInstruction( false, EqPred, &Ctx.I64Type, LHS[1], Zero); FinalResult = createInstruction( @@ -3062,14 +3044,12 @@ EVMMirBuilder::handleCompareLtRhsU64(const Operand &LHSOp, uint64_t RhsU64) { MInstruction *FinalResult; if (LHSOp.getRange() == ValueRange::U64) { - // Upper limbs are provably zero — HasUpper would always be false, so the - // select collapses to LowLt. - ZEN_ASSERT(isLiteralZero(LHS[1]) && isLiteralZero(LHS[2]) && - isLiteralZero(LHS[3])); + // Upper limbs are value-zero by Range contract — HasUpper would always + // be false, so the select collapses to LowLt. FinalResult = LowLt; } else if (LHSOp.getRange() == ValueRange::U128) { - // Limbs[2..3] are provably zero — HasUpper reduces to LHS[1] != 0. - ZEN_ASSERT(isLiteralZero(LHS[2]) && isLiteralZero(LHS[3])); + // Limbs[2..3] are value-zero by Range contract — HasUpper reduces to + // LHS[1] != 0. auto NePred = CmpInstruction::Predicate::ICMP_NE; MInstruction *HasUpper = createInstruction( false, NePred, &Ctx.I64Type, LHS[1], Zero); @@ -3112,14 +3092,12 @@ EVMMirBuilder::handleCompareGtRhsU64(const Operand &LHSOp, uint64_t RhsU64) { MInstruction *FinalResult; if (LHSOp.getRange() == ValueRange::U64) { - // Upper limbs are provably zero — HasUpper would always be false, so the - // select collapses to LowGt. - ZEN_ASSERT(isLiteralZero(LHS[1]) && isLiteralZero(LHS[2]) && - isLiteralZero(LHS[3])); + // Upper limbs are value-zero by Range contract — HasUpper would always + // be false, so the select collapses to LowGt. FinalResult = LowGt; } else if (LHSOp.getRange() == ValueRange::U128) { - // Limbs[2..3] are provably zero — HasUpper reduces to LHS[1] != 0. - ZEN_ASSERT(isLiteralZero(LHS[2]) && isLiteralZero(LHS[3])); + // Limbs[2..3] are value-zero by Range contract — HasUpper reduces to + // LHS[1] != 0. MInstruction *One = createIntConstInstruction(MirI64Type, 1); auto NePred = CmpInstruction::Predicate::ICMP_NE; MInstruction *HasUpper = createInstruction(