Restore stochastic transition() semantics, schedule-independent $rdist_normal, and measured-law cross acceptance #12
Open
BucketSran wants to merge 9 commits into
Commit 06526fa achieved python/rust stochastic parity by bypassing the transition() ramp entirely for $rdist_normal-derived targets in BOTH engines (python: backend.py codegen direct _set_output; rust: snap block in rust_sim_apply_transitions plus blanket active-ramp breakpoint skips). That aligned the engines on the wrong semantics: outputs jumped instantaneously at the evaluation following the event, erasing the ramp that Spectre (and pre-06526fa EVAS) produce. On the vaBench dither case this moved the vout_o edge by ~250 ps and broke the certified EVAS/Spectre parity (1.19 ps -> 252 ps).
Remove the python codegen bypass and the rust snap, and re-enable active-ramp completion breakpoints for stochastic models. Keep the predictive target-change breakpoint skip (extrapolating future random draws is genuinely unsound) and keep the stateless time-hashed RNG.
Verified:
- vaBench dither tb gold edge delta vs stored Spectre evidence 252.2 ps -> 2.28 ps on both engines, python == rust bit-exact on that case;
- 37-entry release-evidence sweep shows exactly one signal changed (the fixed one);
- test_engine/test_compiler/test_examples/test_netlist all pass;
- 15 bundled parity examples pass.
Known follow-up (noise_gen example re-xfailed): with real ramps restored, python/rust step schedules can differ by ~ps at one event, and the time-hashed RNG then draws different values. Long-run bit-exact parity needs the hash keyed on event occurrence index instead of wall-clock time.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Pins the behavior restored by the previous commit: a transition() whose target derives from $rdist_normal must keep the pre-event value at the event-tick row and place the ramp midpoint ~tedge/2 after the tick. Verified discriminating: on 06526fa (bypass present) the probe reports value_at_tick=1.01 and midpoint 0.975 ns; on this branch 0.0 and 1.100 ns (analytic ideal). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
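The pinned behavior can be illustrated with a minimal sketch of a linear transition ramp. This is not the EVAS implementation: `transition_value` and `ramp_midpoint_time` are hypothetical helpers, and the event tick at 1.0 ns, the 200 ps edge time and zero delay are assumed values chosen so that the midpoint lands on the 1.100 ns analytic ideal quoted above.

```python
def transition_value(t, t_event, v_old, v_new, t_edge, delay=0.0):
    """Piecewise-linear ramp from v_old to v_new (hypothetical helper)."""
    t_start = t_event + delay
    if t <= t_start:
        # The event-tick row itself still carries the pre-event value.
        return v_old
    if t >= t_start + t_edge:
        return v_new
    return v_old + (v_new - v_old) * (t - t_start) / t_edge


def ramp_midpoint_time(t_event, t_edge, delay=0.0):
    """Time at which the ramp reaches halfway between old and new value."""
    return t_event + delay + 0.5 * t_edge


# Assumed probe parameters (not taken from the test itself).
t_event = 1.0e-9   # event tick, seconds
t_edge = 200e-12   # ramp duration, seconds
v_new = 1.01       # stochastic target drawn at the event

value_at_tick = transition_value(t_event, t_event, 0.0, v_new, t_edge)
midpoint = ramp_midpoint_time(t_event, t_edge)
print(f"value_at_tick={value_at_tick}, midpoint={midpoint * 1e9:.3f} ns")
```

A snapping engine would instead report the new target at the tick, which is the signature the regression test discriminates against.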
The stateless time-hashed RNG made every draw bit-sensitive to the exact event time: when python/rust step schedules differed at femtosecond level, the engines drew entirely different sequences (noise_gen example diverged after ~32.5 ns once true transition ramps were restored).
Hash (seed_bits, per-seed draw index) instead, identically in both engines. Event-body draws fire the same number of times in both engines, so sequences now match regardless of sub-ps schedule jitter, and the semantics matches the LRM's sequential per-seed stream model.
Implementation:
- python: per-instance draw counters in CompiledModel._rand_normal; the time argument is accepted but no longer enters the hash.
- rust: thread-local per-seed counters in util.rs, reset at the start of every full-program trace (program.rs) so runs stay reproducible.
- bundled noise_gen example rewritten to the EVAS-recommended sample-and-hold form (@(timer(dt, dt)) periodic resampling), matching the vaBench gold pattern; continuous-body draws remain supported but are documented as not bit-reproducible across engines (one draw per body evaluation; engines step on different schedules).
Verified:
- test_noise_gen_python_vs_evas_rust_noise_parity un-xfailed and passes: python == rust bit-exact over 1000 periodic draws (shape + atol=0.0).
- full suites green: engine/compiler/netlist 520, examples 12, rust parity 16/16.
- vaBench dither tb gold: 3.25 ps edge delta vs stored Spectre evidence (gate 5 ps), python == rust 0.00 ps; 37-entry release sweep shows the dither delta as the only changed signal (2.28 -> 3.25 ps, still pass).
Note: this is the second deliberate draw-sequence change after 06526fa's time-hash; any waveform baselines capturing $rdist_normal values must be regenerated (fold into the pending dual-evidence re-baseline).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
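A minimal sketch of the draw-index idea, under loud assumptions: the class name `DrawIndexNormal`, the SHA-256 hash, and the Box-Muller mapping are illustrative choices, not the hash or distribution code used by EVAS. The point is only that the `time` argument is accepted but never enters the hash, so two engines that evaluate the same events at slightly different times still draw identical values.

```python
import hashlib
import math
import struct
from collections import defaultdict


class DrawIndexNormal:
    """Normal draws keyed on (seed_bits, per-seed draw index). Illustrative only."""

    def __init__(self):
        self._counters = defaultdict(int)

    def reset(self):
        # Called at the start of a trace so repeated runs are reproducible.
        self._counters.clear()

    @staticmethod
    def _uniform(seed_bits, draw_index, lane):
        payload = struct.pack("<qQB", seed_bits, draw_index, lane)
        word = int.from_bytes(hashlib.sha256(payload).digest()[:8], "little")
        # 52 random bits plus a half-step offset: strictly inside (0, 1).
        return ((word >> 12) + 0.5) / float(1 << 52)

    def rdist_normal(self, seed_bits, mean, std, time=None):
        # `time` is accepted for call-site compatibility but is ignored.
        index = self._counters[seed_bits]
        self._counters[seed_bits] += 1
        u1 = self._uniform(seed_bits, index, 0)
        u2 = self._uniform(seed_bits, index, 1)
        z = math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)
        return mean + std * z


# Two "engines" whose event times differ by 3 fs draw the same sequence.
engine_a = DrawIndexNormal()
engine_b = DrawIndexNormal()
seq_a = [engine_a.rdist_normal(7, 0.0, 1.0, time=k * 1e-9) for k in range(5)]
seq_b = [engine_b.rdist_normal(7, 0.0, 1.0, time=k * 1e-9 + 3e-15) for k in range(5)]
print(seq_a == seq_b)
```

The trade-off is that parity now depends on both engines firing the same number of draws per seed, which is why event-body (sample-and-hold) draws are reproducible and continuous-body draws are not.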
…tics
Integrates the upstream fix series (1e6dd21, 21b6530, 28c9100) with this branch's semantics restoration. Resolution decisions:
- KEEP upstream's finer rdist gating for the predictive target-change breakpoint (transition-target / continuous-body granularity). It is strictly better than this branch's body-wide skip: event-body stochastic models regain predictive breakpoints for their deterministic transitions.
- KEEP upstream's cross-acceptance machinery (accepted event time), but flip the default: the slack factor is now an explicit opt-in via `simulatorOptions options evas_cross_acceptance_slack_factor=...` instead of defaulting to 0.25 under errpreset=conservative. Rationale: the 2026-04 closure decision requires tolerance-compatible event timing to be opt-in and never benchmark-default; the 0.25*ramp scaling also contradicts the April measurements (Spectre lateness varied 1.5ps->~0 with maxstep 20p->1p at constant tedge=200p, so ramp time is the wrong scaling variable). A principled window model is being derived from a dedicated DOE before any default is reconsidered.
- DROP upstream's force_immediate_random_target snap path entirely: snapping continuous-context stochastic transitions diverges from the python engine's true ramp semantics restored on this branch and from LRM transition() semantics; full ramping is kept in both engines.
- KEEP this branch's draw-index RNG, ramp restoration, and counter reset.
Verified post-merge: engine/compiler/netlist 522 passed (incl. both upstream tests), examples 12, rust parity 16/16; vaBench dither tb gold 3.25 ps vs stored Spectre on BOTH engines, python == rust 0.00 ps.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…tics Restore stochastic transition() semantics and make $rdist_normal schedule-independent
Replaces the heuristic 0.25 * transition-ramp acceptance window with the
law measured by the 2026-06-12 cross-lateness DOE (52 strict-Spectre
probe measurements, testspace/cross-lateness-doe-20260612):
Delta = 0.5 * reltol * |V_cross| / |slope|
independent of maxstep (1p-500p), errpreset and explicit cross time_tol;
the law retrodicts the April R014 residual exactly (0.5e-4 * 30ns =
1.5000 ps vs 1.500025 ps measured).
Implementation:
- runner: the user-facing simulatorOptions factor is now a multiple of
the law (1.0 = Spectre-like); the engine-level FFI scalar becomes
kappa = factor * 0.5 * reltol, so slack = kappa * |V_cross| / |slope|.
- rust: the acceptance breakpoint evaluates node magnitude AT the
interpolated root (the DOE shows the band references the crossing
value, not the step-end value) and divides by the bracket slope; the
ramp-time heuristic and its helper are removed.
- accepted-event-time is now PRE-PHASE ONLY: the acceptance breakpoint
scans pre-phase (source-driven) crossings, so post-phase
(contributed-node-referencing) crossings keep exact interpolated
timing. Previously they were switched to an arbitrary accepted grid
step without a matching breakpoint (10-18 ps error observed).
Validation: with factor=1.0, reltol=1e-4, EVAS reproduces the measured
strict-Spectre lateness on 4 held-out external-ramp probes to
0.001-0.05 fs. Default behavior (factor unset) remains exact/analytic.
New end-to-end guard: tests/test_netlist.py::TestCrossAcceptanceLawMode.
Suites: 523 + 12 + 16/16 green.
Known v1 scope limit: phase classification treats any cross expr that
references a contributed branch node (including rails like VSS used as
the node2 of output contributions) as post-phase, so law mode currently
applies to ground-referenced cross expressions on source-driven nets.
Refining the classification is a follow-up.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
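The arithmetic of the law can be sketched as follows. The function names are hypothetical and this is not the runner or rust code; the crossing value (0.3 V) and slope (1e7 V/s) are assumed numbers chosen only so that |V_cross| / |slope| equals the 30 ns quoted for the R014 case.

```python
def kappa_from_factor(factor, reltol):
    """Fold the user-facing factor into the engine-level scalar (sketch)."""
    return factor * 0.5 * reltol


def acceptance_slack(kappa, v_cross, slope):
    """slack = kappa * |V_cross| / |slope|, in seconds."""
    if slope == 0.0:
        raise ValueError("slope must be non-zero at a crossing")
    return kappa * abs(v_cross) / abs(slope)


reltol = 1e-4
v_cross = 0.3   # volts at the interpolated root (assumed value)
slope = 1.0e7   # volts per second across the bracket (assumed value)

kappa = kappa_from_factor(1.0, reltol)            # factor 1.0 = Spectre-like
slack = acceptance_slack(kappa, v_cross, slope)
print(f"slack = {slack * 1e12:.4f} ps")           # 0.5e-4 * 30 ns

# Factor left at 0.0 gives a zero-width window, i.e. exact/analytic timing.
exact = acceptance_slack(kappa_from_factor(0.0, reltol), v_cross, slope)
```

Because the factor multiplies the law rather than a ramp time, values other than 1.0 scale the measured Spectre lateness directly.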
V(n1, n2) <+ x drives n1 relative to n2; n2 is a reference, not a driven node. Counting node2 as contributed made the event phase classifier mark every cross expression that references a rail (e.g. V(in, VSS) with VSS used as the node2 of output contributions) as post-phase. Detection ordering does not require that: a node2 reference's value is not changed by the contribution, so reading it pre-phase is safe, and the common ground-referenced benchmark style was silently excluded from pre-phase-only features such as the cross-acceptance law mode.
Verified:
- V(in, VSS)-style external-ramp probes now reproduce the measured strict-Spectre lateness in law mode to 0.001-0.05 fs (previously they fell to post-phase and law mode never applied).
- Default behavior unchanged: full suites 523 + 12 + 16/16 green and the 37-entry release-evidence sweep shows zero signal-level diffs vs the previous baseline.
- New unit guard: tests/test_netlist.py::TestCrossPhaseClassification.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
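The classification rule described in this commit can be sketched as below. The `Contribution` type and both functions are hypothetical stand-ins, not the EVAS classifier; the sketch only shows that treating node1 alone as contributed keeps a rail-referenced cross expression in the pre-phase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contribution:
    """Models V(node1, node2) <+ expr; node2 is only the reference."""
    node1: str
    node2: str


def contributed_nodes(contributions):
    # Only node1 is driven by the contribution; node2 is read, not written.
    return {c.node1 for c in contributions}


def cross_phase(cross_nodes, contributions):
    """Return 'post' if the cross expression reads any driven node, else 'pre'."""
    driven = contributed_nodes(contributions)
    return "post" if any(n in driven for n in cross_nodes) else "pre"


contribs = [Contribution("out", "VSS")]
print(cross_phase(("in", "VSS"), contribs))   # rail reference stays pre-phase
print(cross_phase(("out", "VSS"), contribs))  # reads a driven node
```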
Summary
This series completes the stochastic-parity work started in `06526fa` and continued in `28c9100`, fixing the remaining semantic regressions on the python engine, making cross-engine random sequences schedule-independent, and replacing the cross-acceptance heuristic with a law measured from strict-mode Spectre.
1. Restore transition() semantics for stochastic targets in both engines
`06526fa` bypassed the `transition()` ramp for `$rdist_normal`-derived targets in both engines (python codegen direct-write + rust snap). `28c9100` fixed the rust side but left the python bypass, so as of upstream `main` the packaged default engine still snaps and the engines disagree with each other (measured on the vaBench dither tb gold: python 252.2 ps vs stored Spectre; python-vs-rust 275 ps).
This PR removes the python codegen bypass and the remaining rust snap path (`force_immediate_random_target`), re-enables active-ramp completion breakpoints (deterministic, no prediction involved), and keeps the predictive target-change breakpoint skip with `28c9100`'s finer gating. Result: dither tb gold 3.25 ps vs stored Spectre on BOTH engines, python == rust bit-exact.
2. Hash $rdist_normal on per-seed draw index instead of wall-clock time
The time-hashed RNG made every draw bit-sensitive to the exact event time: femtosecond-level schedule differences between engines produced entirely different sequences once real ramps were restored. Hashing (seed_bits, per-seed draw index), identically in python (per-instance counters) and rust (thread-local counters, reset per trace), makes sequences schedule-independent and matches the LRM's sequential per-seed stream model. The noise_gen parity test is un-xfailed and passes bit-exact over 1000 periodic draws; the bundled example is rewritten to the recommended `@(timer(dt, dt))` sample-and-hold form.
3. Cross-acceptance window driven by the measured Spectre law
A 52-measurement DOE against strict-mode Spectre ($abstime probe DUTs, slopes over 2 decades, irrational-flavored roots) shows Spectre accepts a cross event late by
Delta = 0.5 × reltol × |V_cross| / |slope|
independent of maxstep (1p–500p), errpreset, and explicit cross `time_tol`. The law retrodicts the April R014 residual exactly (0.5 × 1e-4 × 30 ns = 1.5000 ps vs 1.500025 ps measured); the April "maxstep=1p removes the delay" observation was a grid-aliasing artifact (root aligned to the step grid).
Accordingly:
- the heuristic 0.25 × transition-ramp window is replaced by the law (`kappa = user_factor × 0.5 × reltol` folded in the netlist runner; rust computes `slack = kappa × |V_root| / |slope|` with the magnitude evaluated at the interpolated root);
- the slack factor is an explicit opt-in via `simulatorOptions options evas_cross_acceptance_slack_factor=1.0` and defaults to 0.0 in all presets (per the 2026-04 closure decision, exact/analytic stays the default and benchmark flows must not enable tolerance-compatible behavior).
Validated end-to-end: with factor=1.0 EVAS reproduces the measured strict-Spectre lateness on held-out external-ramp probes to 0.001–0.05 fs.
Verification
`28c9100` tests, adapted)
Notes
`06526fa`); waveform baselines capturing `$rdist_normal` values must be regenerated.
`vaEvas/testspace/edge-timing-triage-20260611/` and `vaEvas/testspace/cross-lateness-doe-20260612/`.
🤖 Generated with Claude Code