Skip to content

Pass explicit AnalysisContext; treat VariableRegistry as a global service#1088

Merged
elazarg merged 21 commits intomainfrom
verification-context
Apr 17, 2026
Merged

Pass explicit AnalysisContext; treat VariableRegistry as a global service#1088
elazarg merged 21 commits intomainfrom
verification-context

Conversation

@elazarg
Copy link
Copy Markdown
Collaborator

@elazarg elazarg commented Apr 17, 2026

Summary

Reshapes the verifier's analysis API to thread per-analysis state explicitly via a new AnalysisContext struct (program_info, options, platform), and corrects the modeling of VariableRegistry as a global name-interning service rather than per-analysis state.

The branch grew through several refactor steps that, taken together, replace the implicit thread-local environment with explicit parameters wherever the data is genuinely per-analysis, and clarify the responsibilities of the registry where it isn't.

Why

Before this branch, the analyzer reached for thread_local_options, thread_local_program_info, and variable_registry from arbitrary depths in the call graph. That made:

  • the dependencies of EbpfDomain, EbpfChecker, EbpfTransformer opaque;
  • testing harder (every test sets up the thread-locals);
  • the registry's role unclear (used both as an interner and, incorrectly, as a per-analysis state oracle).

Two semantic bugs surfaced from that confusion:

  • EbpfDomain::get_loop_count_upper_bound() queried variable_registry.get_loop_counters(), which after the registry-as-global reframing accumulated counters across analyses. Stale counters evaluated to top in current state, sending max_loop_count silently to INT_MAX.
  • EbpfDomain::from_constraints could leave state bottom while stack stayed materialized — to_set() then printed stack content instead of _|_.

Both are fixed.

What changed

  • AnalysisContext carries the three things that vary per analysis. VariableRegistry deliberately isn't a member: it's a global interning service (no clearing, no LazyAllocator wrapper, never threaded as state).
  • Domain construction is context-aware: top(context) / top(size_t), setup_entry(bool, context), both from_constraints overloads, calculate_constant_limits(context, loop_counters), widen(other, to_constants, context, loop_counters). Sized BitsetDomain(size_t) / ArrayDomain(size_t) ctors; default ctors removed.
  • EbpfDomain::stack becomes std::optional<ArrayDomain> with the invariant state.is_bottom() ⇔ !stack.has_value(). Bottom default-construction needs no thread-local read. BitsetDomain and ArrayDomain lose their fake is_bottom/set_to_bottom (the bottom concept now lives only at EbpfDomain).
  • State-mutating methods on EbpfDomain (add_value_constraint, assume_eq_types, restrict_type) maintain the invariant via a single normalize_after_state_mutation helper.
  • Loop counters are sourced from the analyzer's Wto (per-program loop heads), not from the registry.
  • Type variables are sourced from TypeDomain::variables() / variables_with_type(type) (per-state tracked vars), not from variable_registry.get_type_variables() (which is removed).
  • VariableRegistry is now: non-copyable, all methods const over a mutable interning cache, make renamed to intern. RegPack and reg_pack moved here from type_to_num.hpp.
  • Other thread-local reads in adjacent code (extract_instruction_deps, RelevantState, get_assertions, Unmarshaller, cfg_builder, result.cpp) take their inputs explicitly.

Known leftover

EbpfDomain::to_set() still serializes a bottom domain as top, with a BUG: comment, to satisfy YAML test expectations encoded in post: []. Real fix is to return bottom and update the YAML expectations — left as a follow-up.

Test plan

  • Full test suite passes: 1296 / 236 expected-fail / 1 skipped.
  • Loop-counter-bound bug verified by inspection (no test pinned the value before).
  • Independent code review pass; one invariant hole found and fixed before opening this PR.

🤖 Generated with Claude Code

elazarg and others added 19 commits April 16, 2026 22:55
Wrap the thread-local analysis inputs (ProgramInfo, options, platform,
VariableRegistry) in a single AnalysisContext aggregate, and plumb it
explicitly through analyze, ebpf_domain_check, ebpf_domain_transform,
and ebpf_domain_initialize_loop_counter. Backward-compatible overloads
continue to synthesize the context from thread-locals.

This is a migration step: EbpfDomain internals and a few helpers in
ebpf_domain.cpp still reach the thread-locals, so analyze asserts that
the caller-supplied VariableRegistry matches variable_registry.get().

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Elazar Gershuni <elazarg@gmail.com>
…er helpers into instance methods

- ptr_max() in ebpf_domain.hpp now takes max_packet_size directly and
  no longer reads thread_local_options. The only remaining callers
  pass thread_local_options.max_packet_size (in ebpf_domain.cpp) or
  context.options.max_packet_size (in ebpf_transformer.cpp). The
  duplicate static ptr_max(context) in ebpf_transformer.cpp is gone.
- max_packet_size() helper removed; its sole caller inlines
  thread_local_options.max_packet_size, making the dependency visible.
- do_load_stack, do_store_stack, recompute_stack_numeric_size were
  static helpers taking an AnalysisContext and ArrayDomain. Since
  EbpfTransformer already holds both as members and is always
  constructed per-instruction, convert them to non-static methods
  that use this->stack and this->context.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Elazar Gershuni <elazarg@gmail.com>
…, initialize_packet

These entry-point domain methods previously read thread_local_options,
variable_registry, and thread_local_program_info directly. They now
receive them explicitly via AnalysisContext.

- setup_entry(bool init_r1, const AnalysisContext&)
- initialize_packet(const AnalysisContext&)
- from_constraints(const set<string>&, bool, const AnalysisContext&)
  (the string overload only; the typed-constraints overload doesn't
  touch thread-locals and is unchanged)

Callers:
- fwd_analyzer::analyze forwards its context.
- result.cpp callers construct a context from thread_local_analysis_context.
- test_analysis_context passes its explicit context.

The "migration step" assertion in analyze() remains justified by the
map accessors and calculate_constant_limits in ebpf_domain.cpp, which
still read thread-locals.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Elazar Gershuni <elazarg@gmail.com>
…loop_count_upper_bound

The five map accessors (get_map_type, get_map_inner_map_fd,
get_map_key_size, get_map_value_size, get_map_max_entries) now take
a const ebpf_platform_t& rather than reaching into
thread_local_program_info.

get_loop_count_upper_bound takes a VariableRegistry& rather than
reaching into variable_registry.

Callers (EbpfChecker::ValidMapKeyValue, EbpfTransformer::operator(Call),
InterleavedFwdFixpointIterator::max_loop_count) forward the right
piece of their already-held AnalysisContext.

The migration-phase assertion in analyze() still holds: only
calculate_constant_limits (consumed via LazyAllocator) remains on
thread-locals.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Elazar Gershuni <elazarg@gmail.com>
…yable

The registry is a name -> id cache. Given the same sequence of
`make(name)` calls it returns the same Variable ids; clients never
observe state beyond that mapping. Reflect this in the API:

- `names` is now `mutable`, and every public factory / enumeration
  method (`reg`, `cell_var`, `kind_var`, `loop_counter`,
  `meta_offset`, `packet_size`, `stack_frame_var`, `get_type_variables`,
  `get_loop_counters`) is `const`. Callers can hold a
  `const VariableRegistry&` and still construct Variables.
- A header comment explains that `mutable` is for memoization and
  that the existing thread_local storage is what lets us skip
  synchronization on `names`.

Copying is deleted: duplicating the cache would let two registries
assign different ids to the same name, and since `Variable` is a bare
index with no back-pointer, a Variable from one registry would
silently misinterpret against the other. Move is kept (it transfers
identity rather than forking it).

The one LazyAllocator<VariableRegistry>::operator=(const T&) caller in
test_analysis_context switches to `.clear()` — same effect (next
get() lazily re-initializes) without relying on copy.

`EbpfDomain::get_loop_count_upper_bound` can now take
`const VariableRegistry&` again, which is how it should have been
typed from the start.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Elazar Gershuni <elazarg@gmail.com>
…et to the compat shim

Now that VariableRegistry's public API is const-qualified, the context
can hold it by const reference. Callers never need a mutable handle;
the registry interns new ids through its internal `mutable` cache.
A comment on the field points readers at var_registry.hpp for the why.

Resetting the registry between runs was baked into
ebpf_verifier_clear_before_analysis. That made sense when the registry
was a thread-local singleton, but once the registry is passed in via
the context, its lifetime is the caller's concern. The reset moves to
the back-compat shims (analyze(prog) and analyze(prog, inv)), which
keep the old "each call starts with a fresh thread-local registry"
semantics. The context-taking overloads now just clear the
per-analysis ArrayDomain cache and trust the caller.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Elazar Gershuni <elazarg@gmail.com>
…e, drop the thread-local reads

extract_instruction_deps used thread_local_options.total_stack_size()
when normalizing derived-pointer stack offsets back into R10-relative
ones. Accept the stack size as an explicit argument instead; the
fwd_analyzer caller passes context.options.total_stack_size(), and the
test callers pass a default-constructed ebpf_verifier_options_t's value.

RelevantState used thread_local_options.total_stack_size() inside
is_relevant_constraint to convert relative stack offsets to absolute
"s[...]" names for pattern matching. Store the size on the struct at
construction (from context.options.total_stack_size()). The
join_relevance copy in printing.cpp propagates it from the first
parent's state.

Also: is_valid_after / check_observation_at_label already construct an
AnalysisContext locally; switch their remaining setup_constraints
reads from thread_local_options to context.options for consistency.

After this change, result.cpp has no remaining thread-local reads.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Elazar Gershuni <elazarg@gmail.com>
…ading thread_local_options

AssertExtractor's Mem overload had the only thread_local_options read
in the IR layer: a comparison against options.subprogram_stack_size
when the access is R10-relative. Carry the options explicitly through
AssertExtractor, get_assertions, and its three callers (Program::
from_sequence passes what it already has; EbpfChecker passes
context.options; test_marshal passes a default-constructed options).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Elazar Gershuni <elazarg@gmail.com>
…hread_local_options

Unmarshaller already stores subprogram_stack_size as a per-run field
populated from the options passed to unmarshal(). Add allow_division_by_zero
next to it, for the same reason, so the div-by-zero warning stops
relying on thread_local_options being set up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Elazar Gershuni <elazarg@gmail.com>
add_cfg_nodes used thread_local_options.max_call_stack_frames for
its depth check; instruction_seq_to_cfg and resolve_pseudo_load
used thread_local_program_info for the platform and map descriptors.
They now take what they need explicitly: max_call_stack_frames for
the nesting check, ProgramInfo for the platform and map descriptors.
Program::from_sequence passes its already-held info/options.

Note: the callback-target post-processing in Program::from_sequence
still mutates thread_local_program_info directly. That's tied to
how EbpfChecker reads callback_target_labels/_with_exit through
context.program_info — ProgramInfo holds these sets, and the compat
shim in thread_local_analysis_context reads the same thread-local.
Leaving that path alone for now.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Elazar Gershuni <elazarg@gmail.com>
…bottom

Stack content has no inherent bottom of its own; bottom for the whole
EbpfDomain is now represented uniformly by clearing all fields:
state goes to TypeToNumDomain::bottom() and stack drops to nullopt.
This is the AddBottom pattern already used for NumAbsDomain, applied
externally to ArrayDomain.

Invariant: state.is_bottom() <=> !stack.has_value(). Enforced by
factories (bottom(), set_to_bottom()) and by binary ops (which
short-circuit on bottom inputs and return bottom() outputs). The
transformer's mid-instruction mutations may temporarily violate the
invariant; ebpf_domain_transform normalizes at the end.

Default construction no longer reads thread_local_options for stack
sizing — bottom needs no size. Non-bottom factories (top, setup_entry,
from_constraints string overload) take an AnalysisContext and pass
total_stack_size through to a new sized BitsetDomain ctor. A size-only
top(size_t) overload is provided for tests that don't have a full
context handy.

calculate_constant_limits still reads thread_local_options because it
is consumed via a zero-arg LazyAllocator factory. Made explicit with
a TODO; will be addressed in stage 3 when the constant-limits cache
is restructured.

A leaked-behavior bug is preserved (and flagged) in to_set(): a bottom
domain serializes as top because several YAML tests encode that
historical answer in their `post: []` expectations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Elazar Gershuni <elazarg@gmail.com>
BitsetDomain::is_bottom() always returned false and set_to_bottom()
just reset the bitset to a non-bottom state — the bottom concept was
fictional. ArrayDomain forwarded both unchanged. Now that EbpfDomain
wraps stack in std::optional and represents bottom externally
(AddBottom-style), neither layer needs the fake.

Removed: is_bottom() and set_to_bottom() on both classes; the dead
`if (is_bottom())` branch in BitsetDomain::to_set().

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Elazar Gershuni <elazarg@gmail.com>
…ugh widen and calculate_constant_limits

The size-less default ctors were the last paths in BitsetDomain /
ArrayDomain that read thread_local_options. With every non-bottom
construction site now going through the size-taking ctors, the
defaults are unused and removed.

EbpfDomain::calculate_constant_limits is no longer a zero-arg
LazyAllocator factory. It takes the AnalysisContext explicitly,
reading total_stack_size, packet_size, check_for_termination, and
loop counters off the context instead of thread-locals. The cache is
gone; widen recomputes on demand. The cost (~100 add_constraint calls
per widen-with-constants) is small.

EbpfDomain::widen now takes context too. The single internal caller
(extrapolate in fwd_analyzer.cpp) forwards its context through.
clear_thread_local_state on EbpfDomain is removed (the cache it
cleared no longer exists); ebpf_verifier_clear_thread_local_state
drops the corresponding call.

The typed-overload of from_constraints picks up a defaulted
total_stack_size parameter so test callers (test_subsumption,
test_join) keep working without a context.

After this change ebpf_domain.cpp has zero thread-local reads.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Elazar Gershuni <elazarg@gmail.com>
reg_pack(i) is just a bulk version of reg(kind, i): every field is
registry->reg(SomeKind, i). Move it next to its peers (reg, type_reg,
cell_var, kind_var, …) on VariableRegistry, and move RegPack itself
into var_registry.hpp where its fields are produced.

EbpfTransformer and EbpfChecker get same-named member methods that
delegate to context.variables.reg_pack — so the existing ~37 unqualified
`reg_pack(...)` calls inside those classes resolve to the member with
no source change. Other callers that hold a context (setup_entry,
calculate_constant_limits, do_load_ctx) call context.variables.reg_pack
directly. do_load_packet_or_shared takes a VariableRegistry parameter.

The free `reg_pack(int)` and `reg_pack(Reg)` shims in type_to_num.hpp
remain (using the thread-local registry) for the call sites that don't
yet have a context: TypeToNumDomain methods, EbpfDomain const methods
(get_r0, get_stack_offset, get_map_fd_range), and tests. Those will
migrate in subsequent commits as their callers are parameterized.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Elazar Gershuni <elazarg@gmail.com>
…ervice

VariableRegistry is a name-interning service: name -> stable id, with
the only correctness requirement that the same name maps to the same
id within an analysis. Two analyses sharing one registry is fine
(monotonic growth across runs, but no observable effect). It belongs
in the same conceptual bucket as malloc — infrastructure, not
per-analysis state.

So:
- AnalysisContext loses its `variables` field. The struct is now the
  three things that genuinely vary per analysis: program_info, options,
  platform.
- The migration-phase asserts in `analyze(..., context)` go away —
  they have nothing to assert.
- The compat shims `analyze(prog)` / `analyze(prog, inv)` no longer
  reset the registry. Names accumulate across runs; that's fine.
- Domain code that briefly went through `context.variables.X` reverts
  to `variable_registry->X`. Convenience reg_pack methods on
  EbpfTransformer / EbpfChecker are removed (the free `reg_pack` shim
  in type_to_num.hpp is enough).
- `EbpfDomain::get_loop_count_upper_bound` drops its registry parameter.
- `do_load_packet_or_shared` drops the registry parameter we'd added.

Tests stay green (1297 / 236 expected-fail / 1 skipped). The bulk of
the previous context-plumbing work — `AnalysisContext` itself,
`setup_entry`/`from_constraints`/`top` taking context, the
options-and-platform threading — keeps its value: those things ARE
per-analysis. Only the registry plumbing was solving the wrong problem.

The Rust port (../prevail-rust) takes the opposite tack and threads
&mut VariableRegistry everywhere because the borrow checker mandates
it. C++ doesn't have that constraint, so the global is fine.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Elazar Gershuni <elazarg@gmail.com>
The registry doesn't need lazy construction (it's used by every
analysis, so first-touch eager init is fine) and explicitly never
clears (names accumulate monotonically, like a string pool). The
LazyAllocator wrapper exposed clear/set/operator= that we'd never
call and would invite misuse, so collapse the wrapping: a plain
`thread_local VariableRegistry variable_registry;` direct.

Header-side comment explains both choices in a few lines.

The ~80 callers switch from `variable_registry->X` to
`variable_registry.X`. Mechanical sed.

test_analysis_context.cpp was demonstrating that an explicit
AnalysisContext could carry a different VariableRegistry from the
thread-local. With the registry now correctly modeled as a global
service (not part of AnalysisContext), there's nothing left to
demonstrate; delete the file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Elazar Gershuni <elazarg@gmail.com>
`from_constraints` was the visible symptom: mutating state.types via
restrict_to / assume_eq_types could push the domain to bottom while
leaving stack materialized, breaking the documented "state.is_bottom()
<=> !stack" invariant. to_set() then printed the stack content
instead of "_|_".

Root cause: those direct accesses bypassed any EbpfDomain wrapper,
and the existing add_value_constraint wrapper didn't normalize either.

Fix at the right layer: every EbpfDomain-level state mutator
(add_value_constraint, assume_eq_types, restrict_type) calls a single
normalize_after_state_mutation helper that drops stack when state goes
bottom. Both from_constraints overloads now route through these
wrappers instead of poking state.types directly. The invariant is
re-established at the point the divergence could arise, not
defensively at the call site.

The numeric_ranges loop in from_constraints(string) gets one explicit
is_bottom() check before dereferencing inv.stack — the loop only
makes sense on a non-bottom domain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Elazar Gershuni <elazarg@gmail.com>
The two loop-counter call sites — calculate_constant_limits and
get_loop_count_upper_bound — were asking variable_registry "what loop
counters exist?" That question is analysis-specific (per-program loop
heads), not registry-state. With the registry as a true global, the
registry accumulates counters across analyses on the same thread, so
the answer is wrong: get_loop_count_upper_bound went to +∞ as soon as
any prior analysis introduced loop counters not in the current state's
numeric domain (which evaluate to top), making max_loop_count silently
saturate at INT_MAX.

Source the set from the right place: InterleavedFwdFixpointIterator
already walks _wto.for_each_loop_head to initialize counters. Compute
the corresponding Variables once in the analyzer's constructor and
thread the std::span<const Variable> through extrapolate -> widen ->
calculate_constant_limits, and through max_loop_count ->
get_loop_count_upper_bound. The registry never gets asked.

Tests stay green. The bug wasn't caught by tests because no test pins
a specific max_loop_count value — but multi-program runs in
long-running processes were silently reporting unbounded loops.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Elazar Gershuni <elazarg@gmail.com>
The four remaining variable_registry.get_type_variables() callers were
asking the global interner "what type variables exist?" — that's
analysis-state, not registry-state. With the registry as a permanent
global, the answer accumulates across runs on the same thread.

TypeDomain knows its own tracked variables (the var_ids map). Expose
two queries:

- variables(): the full set this domain tracks. Used by the join /
  subsumption helpers in TypeToNumDomain (operator<= and operator|=)
  that genuinely need to iterate per-variable structure.
- variables_with_type(type): tracked variables whose TypeSet may
  contain `type`. Used where the question is "which variables might
  have type T?" (havoc_all_locations_having_type, do_store_stack's
  stack_numeric_size update).

For collect_type_dependent_constraints we iterate the union of the
two operands' tracked variables, since a variable tracked on only
one side can still have differing type info.

variable_registry.get_type_variables() is removed — no callers left.
The std::string ends_with helper that supported it goes too.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Elazar Gershuni <elazarg@gmail.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 17, 2026

📝 Walkthrough

Walkthrough

Adds prevail::AnalysisContext and threads it through analysis, checker, transformer, and domain APIs; replaces many thread-local/global accesses with context or registry instance methods; converts stack/bitset domains to size-constructed, non-bottom-intrinsic types and represents stack bottom via std::optional; updates related signatures and tests.

Changes

Cohort / File(s) Summary
Context/Infrastructure
src/analysis_context.hpp, src/verifier.hpp
Introduce prevail::AnalysisContext, thread_local_analysis_context() helper, and new analyze overloads accepting an AnalysisContext.
Core EbpfDomain API
src/crab/ebpf_domain.hpp, src/crab/ebpf_domain.cpp
Make stack an std::optional<ArrayDomain>, add context-aware APIs (top/widen/calculate_constant_limits/setup_entry/from_constraints/initialize_packet/get_map_*), normalize bottom/stack invariant, change ptr_max signature, remove thread-local constant machinery.
Checker & Transformer
src/crab/ebpf_checker.cpp, src/crab/ebpf_transformer.cpp
Inject const AnalysisContext& into EbpfChecker/EbpfTransformer; add context parameters to ebpf_domain_check/ebpf_domain_transform and replace thread-local program/options/platform usage.
Array / Bitset Domains
src/crab/array_domain.hpp, src/crab/array_domain.cpp, src/crab/bitset_domain.hpp, src/crab/bitset_domain.cpp
Require explicit size constructors, remove intrinsic bottom APIs (set_to_bottom/is_bottom), adjust to_set semantics and document bottom represented externally.
Variable Registry & RegPack
src/crab/var_registry.hpp, src/crab/var_registry.cpp
Switch makeintern, mark many accessors const, add RegPack and reg_pack(int) const, replace thread_local LazyAllocator with thread_local VariableRegistry.
Registry Call-site Adjustments
src/crab/finite_domain.cpp, src/crab/type_domain.cpp, src/crab/type_domain.hpp, src/crab/type_to_num.cpp, src/crab/type_to_num.hpp, src/crab/zone_domain.cpp
Replace variable_registry->... with variable_registry...., add TypeDomain::variables() and variables_with_type(), adapt type-to-num iteration and reg_pack delegation to registry instance.
Forward Analyzer / Analysis Entrypoints
src/fwd_analyzer.cpp, src/result.cpp, src/result.hpp
Interleaved forward iterator and analyze overloads accept/store AnalysisContext; compute per-analysis loop-counters; propagate context into widen/extrapolate/check/transform calls; add total_stack_size plumbing to extract_instruction_deps/RelevantState.
IR, CFG, Assertions, Unmarshal
src/ir/assertions.cpp, src/ir/program.hpp, src/ir/cfg_builder.cpp, src/ir/parse.cpp, src/ir/unmarshal.cpp
get_assertions now takes options; CFG and pseudo resolution use ProgramInfo and limits from options; parse/unmarshal and Unmarshaller use instance-registry/options instead of thread-locals.
Printing & Reporting
src/printing.cpp
Printing updated to use member-access registry APIs; join relevance copies total_stack_size for correct relevance unions.
Tests
src/test/test_elf_loader.cpp, src/test/test_failure_slice.cpp, src/test/test_marshal.cpp
Update tests to pass explicit stack-size and options/context where required; minor formatting simplification in one test.
Misc: small API/site changes
src/crab/array_domain.cpp, src/crab/bitset_domain.cpp, various crab sources
Remove/adjust bottom-handling methods; replace pointer-style registry access with direct instance access across numerous files.

Possibly related issues

Possibly related PRs

Suggested labels

feature

Suggested reviewers

  • dthaler
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 17.62% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main architectural change: threading AnalysisContext explicitly and reclassifying VariableRegistry as a global service.
Description check ✅ Passed The description thoroughly explains the rationale, semantic bugs fixed, key API changes, and implementation details, directly relating to the changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch verification-context

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
src/crab/type_to_num.cpp (1)

114-128: 🧹 Nitpick | 🔵 Trivial

Iterating types.variables() instead of every registered type var is a precision improvement — worth noting soundness explicitly.

The previous implementation walked all registry-wide type variables; the new one only walks variables tracked in this. For an untracked variable, the abstract type is top (any type is possible), so none of its kinds are "nonexistent" and no havoc of tmp.values is warranted — meaning this change makes operator<= strictly stricter (less likely to return true spuriously), which is sound.

As per coding guidelines: "Any change that affects analysis results should spell out the argument for soundness: what inputs are assumed, what invariants are maintained, and how the change preserves them." Consider adding a one-liner comment on line 116 capturing that reasoning so auditors don't have to re-derive it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/crab/type_to_num.cpp` around lines 114 - 128, Add a one-line comment
inside TypeToNumDomain::get_nonexistent_kind_variables (near the loop over
types.variables()) explaining that we intentionally iterate only the variables
tracked in this->types (not all registry-wide type vars) because untracked
variables are treated as top (all types possible) so no kinds are "nonexistent"
for them; mention that this preserves soundness for operator<= and prevents
spurious inclusion in tmp.values, and reference type_to_kinds and
variable_registry.kind_var for context.
src/result.hpp (1)

66-74: ⚠️ Potential issue | 🟡 Minor

total_stack_size = 0 default is a silent-wrong-answer hazard.

RelevantState is passed around by value and constructed in several places (e.g., join_relevance on printing.cpp:815, where only the primary label's field is copied into the merged state). A zero default means any code path that forgets to seed it will quietly translate relative offsets against a stack size of 0 rather than failing loudly.

Either drop the default and force callers to supply it, or make translation assert total_stack_size > 0 before using it. A short comment here stating "must match options.total_stack_size() of the owning analysis" would also help.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/result.hpp` around lines 66 - 74, RelevantState's total_stack_size must
not default to 0 because callers sometimes copy/construct it without seeding and
that silently produces wrong translations; remove the "= 0" default, add an
explicit constructor (e.g., RelevantState(int total_stack_size)) that
initializes total_stack_size and keeps registers/stack_offsets
default-initialized, update all call sites that create RelevantState (including
the merge in join_relevance) to pass the correct options.total_stack_size(), and
add a short comment on the field that it must match options.total_stack_size()
of the owning analysis.
src/crab/var_registry.cpp (1)

232-240: 🧹 Nitpick | 🔵 Trivial

get_loop_counters() does a redundant O(n) lookup per counter and matches too loose a prefix.

Each matching name is re-interned via intern(name), which scans the entire names vector again — so this is O(n²) in the number of interned names, even though we already know the index while iterating. Also, loop_counter(label) produces "pc[" + label + "]", but the filter uses bare "pc", which would falsely match any future name starting with pc (e.g. pc_whatever).

♻️ Proposed refactor
 std::vector<Variable> VariableRegistry::get_loop_counters() const {
     std::vector<Variable> res;
-    for (const std::string& name : names) {
-        if (name.starts_with("pc")) {
-            res.push_back(intern(name));
-        }
+    for (size_t i = 0; i < names.size(); ++i) {
+        if (names[i].starts_with("pc[")) {
+            res.push_back(Variable(i));
+        }
     }
     return res;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/crab/var_registry.cpp` around lines 232 - 240, The get_loop_counters()
implementation currently re-interns each matching name causing O(n²) work and
uses too-broad prefix "pc"; change VariableRegistry::get_loop_counters to
iterate with an index over names, match the exact "pc[" prefix (since
loop_counter(label) formats as "pc[" + label + "]"), and construct Variables
directly from the known index/slot rather than calling intern(name) again so you
avoid the extra scan.
src/crab/ebpf_domain.cpp (2)

289-308: ⚠️ Potential issue | 🟠 Major

Wrong field compared in get_map_inner_map_fdmap->type vs. accumulated inner_map_fd.

After seeding inner_map_fd = map->inner_map_fd, subsequent iterations compare map->type != *inner_map_fd instead of map->inner_map_fd != *inner_map_fd. That means the consistency check across the fd range is effectively checking the wrong descriptor field: two maps whose inner_map_fd values differ will pass this guard as long as their types happen to match the first seen inner_map_fd (and vice versa). The function can then return a spurious "agreed" inner_map_fd, which is a soundness hazard for any caller keying off it.

Likely pre-existing, but worth fixing in this PR since the signature is being reworked here anyway.

🐛 Proposed fix
         if (!inner_map_fd.has_value()) {
             inner_map_fd = map->inner_map_fd;
-        } else if (map->type != *inner_map_fd) {
+        } else if (map->inner_map_fd != *inner_map_fd) {
             return {};
         }

As per coding guidelines: "Any change that affects analysis results should spell out the argument for soundness: what inputs are assumed, what invariants are maintained, and how the change preserves them."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/crab/ebpf_domain.cpp` around lines 289 - 308, The function
get_map_inner_map_fd seeds inner_map_fd from map->inner_map_fd but then
erroneously compares map->type to *inner_map_fd; change the consistency check so
each subsequent iteration compares map->inner_map_fd to *inner_map_fd (not
map->type) and return {} on mismatch. Locate the loop in get_map_inner_map_fd,
keep the initial seeding of inner_map_fd from map->inner_map_fd, and replace the
map->type != *inner_map_fd condition with map->inner_map_fd != *inner_map_fd to
ensure the returned inner_map_fd is consistent across the full fd range obtained
via get_map_fd_range and platform.get_map_descriptor.

267-286: ⚠️ Potential issue | 🟡 Minor

Dead null check on a reference-returning accessor.

platform.get_map_descriptor(map_fd) returns a reference, so &platform.get_map_descriptor(...) is never null and the if (map == nullptr) return {}; guard is unreachable. Same pattern at lines 297, 319, 337, 355. Pre-existing, but now that platform is passed explicitly the contract is clearer — either drop the dead branch, or switch to a nullable API if invalid fds should produce {} instead of relying on whatever get_map_descriptor does internally for unknown fds.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/crab/ebpf_domain.cpp` around lines 267 - 286,
platform.get_map_descriptor(...) returns a reference so taking its address and
checking for nullptr is dead code; change the local variable to a reference
(EbpfMapDescriptor& map = platform.get_map_descriptor(map_fd)) and remove the
unreachable if (map == nullptr) return {} check in EbpfDomain::get_map_type, and
apply the same fix to the other similar sites where you use
&platform.get_map_descriptor(...) (the checks at the other occurrences should be
removed or replaced by reference usage). If the intended semantics are to handle
invalid fds, instead change/get_map_descriptor to return a nullable type
(pointer or std::optional) and update callers accordingly — pick one approach
and make it consistent across all occurrences.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/analysis_context.hpp`:
- Around line 20-24: AnalysisContext contains non-owning const references
(program_info, options, platform) and can silently dangle if the referents are
shorter-lived; add a one-line doc comment immediately above struct
AnalysisContext clarifying the lifetime contract (e.g., that an AnalysisContext
is non-owning and must not outlive the ProgramInfo/options/platform used to
construct it and must not be cached beyond the creating analysis call), and
optionally add a brief TODO or assertion location comment where AnalysisContext
instances are created to remind callers to enforce that precondition; reference
the struct AnalysisContext and its members program_info, options, platform in
the comment.
- Around line 5-10: Remove the unused include of "crab/var_registry.hpp" from
the top of analysis_context.hpp: delete the `#include` "crab/var_registry.hpp"
line so AnalysisContext and its doc comment about variable_registry remain but
the registry header is not pulled in transitively; verify no symbols from
var_registry are referenced in the file (if any are, replace with a forward
declaration or adjust references) and run a rebuild to ensure no missing
dependencies.

In `@src/crab/ebpf_checker.cpp`:
- Around line 221-225: The checks in ebpf_checker.cpp use
context.program_info.callback_target_labels and callback_targets_with_exit but
Program::from_sequence() currently builds those sets on
thread_local_program_info, so the AnalysisContext's ProgramInfo stays empty and
valid callbacks are rejected; move the derived metadata population into the
ProgramInfo instance that AnalysisContext carries (or into Program) so that
context.program_info.callback_target_labels and
context.program_info.callback_targets_with_exit are populated before
ebpf_checker.cpp's checks run, ensuring the checks see the same sets regardless
of explicit-context vs thread-local paths (update Program::from_sequence() or
the code that constructs ProgramInfo to fill these fields, and remove reliance
on thread_local_program_info).

In `@src/crab/ebpf_domain.cpp`:
- Around line 22-31: EbpfDomain::to_set currently returns StringInvariant::top()
when stack is null to satisfy existing YAML tests, but it should return
StringInvariant::bottom(); change the return in EbpfDomain::to_set from
StringInvariant::top() to StringInvariant::bottom() and update any dependent
logic as needed, and create a tracking artifact (issue or PR stub) referencing
this change and the required YAML test updates so the test expectations (post:
[]) are amended in the same change; use the symbols EbpfDomain::to_set,
StringInvariant::bottom(), StringInvariant::top() in the tracking description.

In `@src/crab/ebpf_domain.hpp`:
- Around line 128-133: The comment claims all mutators maintain the
`state.is_bottom() <=> !stack` invariant via normalize_after_state_mutation, but
havoc(Variable) (implemented as state.values.havoc(var)) does not call
normalize_after_state_mutation; either defensively invoke
normalize_after_state_mutation() at the end of havoc(Variable) in
ebpf_domain.cpp to preserve the documented invariant (call
normalize_after_state_mutation() after state.values.havoc(var)), or update the
comment in ebpf_domain.hpp to exclude havoc from that group and add a short
rationale referencing havoc/ state.values.havoc explaining why it cannot drive
the numeric domain to bottom so the invariant still holds.

In `@src/crab/ebpf_transformer.cpp`:
- Around line 32-36: The constructor for EbpfTransformer currently
unconditionally dereferences _dom.stack (stack(*_dom.stack)) which can occur
when entry_inv is bottom; fix by guarding creation: before constructing
EbpfTransformer in ebpf_domain_transform (and the other site referenced around
1493-1495), check that _dom.stack has a value (e.g., if (!_dom.stack)
return/skip/handle bottom) or ensure entry_inv is satisfiable; alternatively
change EbpfTransformer to not dereference in its initializer (store a
pointer/optional reference to _dom.stack and only .value() after entry_inv is
confirmed non-bottom inside EbpfTransformer::run()). Ensure you update all
construction sites (including the ones noted) accordingly.

In `@src/crab/type_to_num.cpp`:
- Around line 209-218: The function
TypeToNumDomain::havoc_all_locations_having_type currently assumes
TypeDomain::variables_with_type returns empty when bottom; make that
precondition explicit by checking the domain bottom state at the start of
havoc_all_locations_having_type — e.g., add an assert(!is_bottom()) (or an
explicit if (is_bottom()) return;) and a short comment referencing the reliance
on variables_with_type's contract so the function fails fast if bottom-handling
changes in the future. Ensure the check uses TypeToNumDomain::is_bottom() to
locate the contract enforcement.

In `@src/result.cpp`:
- Around line 114-116: The AnalysisResult code currently rebuilds
AnalysisContext from thread_local_analysis_context() (see the calls near
EbpfDomain::from_constraints in result.cpp and the similar sites at 130-134 and
386-387), which causes results to depend on ambient TLS; change the
implementation to stop reading TLS: either fetch the AnalysisContext stored on
the AnalysisResult instance (add or use an existing member) or require an
explicit AnalysisContext parameter to the function that creates the EbpfDomain,
and pass that context into EbpfDomain::from_constraints instead of calling
thread_local_analysis_context(); update all similar call sites (the three
locations noted) to use the stored/explicit context.

In `@src/test/test_elf_loader.cpp`:
- Line 562: The SECTION call is on a single line while other SECTION blocks use
multi-line formatting; change SECTION("0x80000000 exceeds int32 — bails out
without mutation") { check_bailout_preserves_program(0x80000000ULL); } to the
same multi-line style used elsewhere (open SECTION on its own line, put the
check_bailout_preserves_program(0x80000000ULL); call on the next line, then
close the SECTION block on its own line) so the SECTION and the call to
check_bailout_preserves_program remain consistently formatted with surrounding
tests.

---

Outside diff comments:
In `@src/crab/ebpf_domain.cpp`:
- Around line 289-308: The function get_map_inner_map_fd seeds inner_map_fd from
map->inner_map_fd but then erroneously compares map->type to *inner_map_fd;
change the consistency check so each subsequent iteration compares
map->inner_map_fd to *inner_map_fd (not map->type) and return {} on mismatch.
Locate the loop in get_map_inner_map_fd, keep the initial seeding of
inner_map_fd from map->inner_map_fd, and replace the map->type != *inner_map_fd
condition with map->inner_map_fd != *inner_map_fd to ensure the returned
inner_map_fd is consistent across the full fd range obtained via
get_map_fd_range and platform.get_map_descriptor.
- Around line 267-286: platform.get_map_descriptor(...) returns a reference so
taking its address and checking for nullptr is dead code; change the local
variable to a reference (EbpfMapDescriptor& map =
platform.get_map_descriptor(map_fd)) and remove the unreachable if (map ==
nullptr) return {} check in EbpfDomain::get_map_type, and apply the same fix to
the other similar sites where you use &platform.get_map_descriptor(...) (the
checks at the other occurrences should be removed or replaced by reference
usage). If the intended semantics are to handle invalid fds, instead
change/get_map_descriptor to return a nullable type (pointer or std::optional)
and update callers accordingly — pick one approach and make it consistent across
all occurrences.

In `@src/crab/type_to_num.cpp`:
- Around line 114-128: Add a one-line comment inside
TypeToNumDomain::get_nonexistent_kind_variables (near the loop over
types.variables()) explaining that we intentionally iterate only the variables
tracked in this->types (not all registry-wide type vars) because untracked
variables are treated as top (all types possible) so no kinds are "nonexistent"
for them; mention that this preserves soundness for operator<= and prevents
spurious inclusion in tmp.values, and reference type_to_kinds and
variable_registry.kind_var for context.

In `@src/crab/var_registry.cpp`:
- Around line 232-240: The get_loop_counters() implementation currently
re-interns each matching name causing O(n²) work and uses too-broad prefix "pc";
change VariableRegistry::get_loop_counters to iterate with an index over names,
match the exact "pc[" prefix (since loop_counter(label) formats as "pc[" + label
+ "]"), and construct Variables directly from the known index/slot rather than
calling intern(name) again so you avoid the extra scan.

In `@src/result.hpp`:
- Around line 66-74: RelevantState's total_stack_size must not default to 0
because callers sometimes copy/construct it without seeding and that silently
produces wrong translations; remove the "= 0" default, add an explicit
constructor (e.g., RelevantState(int total_stack_size)) that initializes
total_stack_size and keeps registers/stack_offsets default-initialized, update
all call sites that create RelevantState (including the merge in join_relevance)
to pass the correct options.total_stack_size(), and add a short comment on the
field that it must match options.total_stack_size() of the owning analysis.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 3c0686dc-13ed-4737-ab14-d4fdffae0d13

📥 Commits

Reviewing files that changed from the base of the PR and between ac03ef6 and 6141700.

📒 Files selected for processing (30)
  • src/analysis_context.hpp
  • src/crab/array_domain.cpp
  • src/crab/array_domain.hpp
  • src/crab/bitset_domain.cpp
  • src/crab/bitset_domain.hpp
  • src/crab/ebpf_checker.cpp
  • src/crab/ebpf_domain.cpp
  • src/crab/ebpf_domain.hpp
  • src/crab/ebpf_transformer.cpp
  • src/crab/finite_domain.cpp
  • src/crab/type_domain.cpp
  • src/crab/type_domain.hpp
  • src/crab/type_to_num.cpp
  • src/crab/type_to_num.hpp
  • src/crab/var_registry.cpp
  • src/crab/var_registry.hpp
  • src/crab/zone_domain.cpp
  • src/fwd_analyzer.cpp
  • src/ir/assertions.cpp
  • src/ir/cfg_builder.cpp
  • src/ir/parse.cpp
  • src/ir/program.hpp
  • src/ir/unmarshal.cpp
  • src/printing.cpp
  • src/result.cpp
  • src/result.hpp
  • src/test/test_elf_loader.cpp
  • src/test/test_failure_slice.cpp
  • src/test/test_marshal.cpp
  • src/verifier.hpp
💤 Files with no reviewable changes (1)
  • src/crab/bitset_domain.cpp

Comment thread src/analysis_context.hpp
Comment thread src/analysis_context.hpp
Comment thread src/crab/ebpf_checker.cpp
Comment thread src/crab/ebpf_domain.cpp
Comment thread src/crab/ebpf_domain.hpp Outdated
Comment thread src/crab/ebpf_transformer.cpp
Comment thread src/crab/type_to_num.cpp
Comment thread src/result.cpp
Comment thread src/test/test_elf_loader.cpp
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6141700718

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/fwd_analyzer.cpp
- Guard ebpf_domain_initialize_loop_counter against bottom domains; the
  EbpfTransformer ctor dereferences dom.stack, which is nullopt when bottom.
- Assert !is_bottom() in TypeToNumDomain::havoc_all_locations_having_type
  so the reliance on variables_with_type returning empty for bottom fails
  loudly if that contract ever changes.
- Drop the unused crab/var_registry.hpp include from analysis_context.hpp.
- Spell out AnalysisContext's non-owning lifetime contract.
- Clarify in ebpf_domain.hpp that havoc does not need normalization.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Elazar Gershuni <elazarg@gmail.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/crab/ebpf_domain.hpp`:
- Around line 100-102: Remove the implicit default for total_stack_size on
EbpfDomain::from_constraints so callers must supply it explicitly: update the
declaration and corresponding definition of EbpfDomain::from_constraints to
remove the default argument (the parameter remains size_t total_stack_size), and
update all call sites (including tests) to pass an explicit value (use
ebpf_verifier_options_t{}.total_stack_size() in tests where the default behavior
is intended). Ensure any overloads or wrappers around from_constraints are
adjusted likewise to avoid implicit defaults.

In `@src/crab/type_to_num.cpp`:
- Around line 135-144: The current code builds type_vars as a std::set which
gives deterministic ordering but incurs O(log n) inserts; change type_vars to a
std::unordered_set<Variable> when collecting from types.variables() and
right.types.variables() to get amortized O(1) inserts (then iterate that
unordered_set when computing per-variable work for join_selective); if
deterministic iteration is actually required downstream (e.g. some future
non-commutative use of values.set(...)), sort the collected elements into a
vector after deduplication before iterating; ensure Variable is hashable
(add/enable a std::hash<Variable> specialization) or fall back to std::set if
hashing isn’t available.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9bd0b51c-6d06-4fab-a581-a68f1d46cdb1

📥 Commits

Reviewing files that changed from the base of the PR and between 6141700 and 6c4d449.

📒 Files selected for processing (4)
  • src/analysis_context.hpp
  • src/crab/ebpf_domain.hpp
  • src/crab/ebpf_transformer.cpp
  • src/crab/type_to_num.cpp

Comment thread src/crab/ebpf_domain.hpp
Comment thread src/crab/type_to_num.cpp
@elazarg elazarg changed the title Thread explicit AnalysisContext through the verifier; treat VariableRegistry as a global service Pass explicit AnalysisContext; treat VariableRegistry as a global service Apr 17, 2026
Resolve conflict in compute_failure_slices: keep main's
compute_slice_from_label extraction while preserving the PR's
explicit AnalysisContext for ebpf_domain_check and RelevantState init.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Elazar Gershuni <elazarg@gmail.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/result.cpp (1)

430-610: ⚠️ Potential issue | 🔴 Critical

Thread seed_relevance.total_stack_size through every RelevantState insertion into maps—stack constraint filtering is silently broken.

seed_relevance.total_stack_size arrives via parameter but never propagates into visited[current_label], slice_labels[current_label], or join_expansion entries. Every insertion uses default-constructed RelevantState with total_stack_size = 0.

is_relevant_constraint computes absolute stack offsets as total_stack_size + rel_offset (line ~69 in result.cpp). With zeroed total_stack_size, all stack range checks fail silently—constraints involving s[...] get compared against wrong absolute ranges.

Apply the fix at these locations:

  • Line 432: existing.total_stack_size = seed_relevance.total_stack_size; after visited[current_label]
  • Line 546–549: existing.total_stack_size = seed_relevance.total_stack_size; after slice_labels[current_label]
  • Line 555–557: existing.total_stack_size = seed_relevance.total_stack_size; after slice_labels[current_label]
  • Lines 594–603: Set total_stack_size when populating join_expansion
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/result.cpp` around lines 430 - 610, The RelevantState.total_stack_size
from seed_relevance is not being propagated when inserting into maps, breaking
stack-offset checks; update every place a RelevantState is
default-constructed/merged so its total_stack_size is set to
seed_relevance.total_stack_size: after creating/merging visited[current_label]
set existing.total_stack_size = seed_relevance.total_stack_size; after merging
into slice_labels[current_label] (both in the instruction_contributes branch and
the no-deps branch) set that existing.total_stack_size =
seed_relevance.total_stack_size; and when inserting entries into join_expansion
(both for the join label and for predecessor entries) initialize their
.total_stack_size from seed_relevance.total_stack_size (or from the visited
entry used when copying relevance). Ensure each inserted/merged RelevantState
preserves or copies total_stack_size rather than leaving it zero.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/result.hpp`:
- Around line 66-70: The field RelevantState::total_stack_size currently
defaults to 0 which silently mis-translates relative stack offsets; update
RelevantState so total_stack_size has no default (or use an explicit sentinel
like -1) and ensure callers like compute_slice_from_label initialize it from the
seed, then add an assertion at the top of RelevantState::is_relevant_constraint
(assert(total_stack_size >= 0)) before computing abs = total_stack_size +
rel_offset to fail loudly on uninitialized state.

---

Outside diff comments:
In `@src/result.cpp`:
- Around line 430-610: The RelevantState.total_stack_size from seed_relevance is
not being propagated when inserting into maps, breaking stack-offset checks;
update every place a RelevantState is default-constructed/merged so its
total_stack_size is set to seed_relevance.total_stack_size: after
creating/merging visited[current_label] set existing.total_stack_size =
seed_relevance.total_stack_size; after merging into slice_labels[current_label]
(both in the instruction_contributes branch and the no-deps branch) set that
existing.total_stack_size = seed_relevance.total_stack_size; and when inserting
entries into join_expansion (both for the join label and for predecessor
entries) initialize their .total_stack_size from seed_relevance.total_stack_size
(or from the visited entry used when copying relevance). Ensure each
inserted/merged RelevantState preserves or copies total_stack_size rather than
leaving it zero.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 18d38e41-a514-4886-ad2f-17bef043abdf

📥 Commits

Reviewing files that changed from the base of the PR and between 6c4d449 and 9153fed.

📒 Files selected for processing (2)
  • src/result.cpp
  • src/result.hpp

Comment thread src/result.hpp
@elazarg elazarg merged commit 4fda96b into main Apr 17, 2026
16 checks passed
@elazarg elazarg deleted the verification-context branch April 17, 2026 14:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant