feat(ir): add UseAfterDef structural property verifier#574
feat(ir): add UseAfterDef structural property verifier#574huxinyuan1215 wants to merge 7 commits intohw-native-sys:mainfrom
Conversation
📝 WalkthroughWalkthroughThis PR adds a UseAfterDef structural property verifier: new enum value, verifier implementation and factory, registry registration, error type and bindings, docs, and unit tests; it enforces variable-use-before-definition rules across function and control-flow scopes. Changes
Sequence DiagramsequenceDiagram
participant Registry as PropertyVerifierRegistry
participant Factory as CreateUseAfterDefPropertyVerifier
participant Verifier as UseAfterDefPropertyVerifier
participant Checker as UseAfterDefChecker
participant IR as FunctionBody
participant Diag as Diagnostics
Registry->>Factory: register(CreateUseAfterDefPropertyVerifier)
Factory->>Verifier: create verifier instance
Verifier->>Checker: instantiate visitor
loop for each Function in Program
Verifier->>IR: request traversal
IR->>Checker: traverse nodes
Checker->>Checker: manage scopes (params, assigns, loops, if-branches)
Checker->>Checker: lookup variable uses
alt use before def detected
Checker->>Diag: emit USE_BEFORE_DEF (401)
end
end
Verifier->>Diag: return diagnostics
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. 📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a new structural property, Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
The pull request successfully introduces the UseAfterDef structural property verifier, which is a valuable addition to the IR verification system. The implementation correctly handles complex scoping rules for various IR statement types, and the Python bindings and documentation have been updated accordingly. The comprehensive unit tests provide good coverage for both valid and invalid scenarios.
| auto saved_scope = in_scope_; | ||
|
|
||
| if (op->then_body_) VisitStmt(op->then_body_); | ||
|
|
||
| in_scope_ = saved_scope; | ||
| if (op->else_body_.has_value() && *op->else_body_) { | ||
| VisitStmt(*op->else_body_); | ||
| } | ||
|
|
||
| // Restore: branch-local defs do not propagate outward. | ||
| in_scope_ = saved_scope; |
There was a problem hiding this comment.
The current approach to managing variable scope in VisitStmt_(const IfStmtPtr& op) involves copying the in_scope_ unordered_set before and after visiting each branch. While correct, for deeply nested IR structures or functions with a large number of variables, this repeated copying could introduce performance overhead. Consider optimizing this by using a stack-based approach to manage scope changes, where only additions/removals for the current scope are pushed/popped, rather than copying the entire set. This could improve efficiency for large programs.
References
- This suggestion to use a stack-based approach for managing scope changes in nested IR structures aligns with the general principle of employing scope-aware mechanisms for nested scopes. Rule 3, for instance, emphasizes using a stack of substitution maps or save/restore semantics for correctly handling variable shadowing during AST transformations. Although the immediate concern here is performance optimization for scope management rather than variable substitution correctness, the recommended stack-based pattern is a robust solution for efficiently handling variable availability across nested scopes.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/ir/transforms/unroll_loops_pass.cpp (1)
247-259:CollectLastDefsByNameshould also recurse intoIfStmtbranches for consistency.Similar to
collect_defs, this misses definitions inside conditional branches. For robustness with complex loop bodies:♻️ Suggested enhancement
static void CollectLastDefsByName(const StmtPtr& s, std::unordered_map<std::string, VarPtr>& defs) { if (!s) return; if (auto assign = std::dynamic_pointer_cast<const AssignStmt>(s)) { if (assign->var_) defs[assign->var_->name_hint_] = assign->var_; } else if (auto seq = std::dynamic_pointer_cast<const SeqStmts>(s)) { for (const auto& child : seq->stmts_) CollectLastDefsByName(child, defs); } else if (auto for_s = std::dynamic_pointer_cast<const ForStmt>(s)) { if (for_s->body_) CollectLastDefsByName(for_s->body_, defs); + } else if (auto if_s = std::dynamic_pointer_cast<const IfStmt>(s)) { + CollectLastDefsByName(if_s->then_, defs); + CollectLastDefsByName(if_s->else_, defs); } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ir/transforms/unroll_loops_pass.cpp` around lines 247 - 259, CollectLastDefsByName currently misses definitions inside conditional branches; add an else-if that dynamic_pointer_casts s to const IfStmt and, if present, call CollectLastDefsByName on the IfStmt's then and else bodies (e.g., then_body_ and else_body_ or the actual field names used for the IfStmt's branches) so last defs inside both branches are recorded; update CollectLastDefsByName to check each branch for null before recursing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/ir/transforms/unroll_loops_pass.cpp`:
- Around line 247-259: CollectLastDefsByName currently misses definitions inside
conditional branches; add an else-if that dynamic_pointer_casts s to const
IfStmt and, if present, call CollectLastDefsByName on the IfStmt's then and else
bodies (e.g., then_body_ and else_body_ or the actual field names used for the
IfStmt's branches) so last defs inside both branches are recorded; update
CollectLastDefsByName to check each branch for null before recursing.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: ac497ba6-0246-4d80-905c-c25eabe173e7
📒 Files selected for processing (2)
src/ir/transforms/unroll_loops_pass.cppsrc/ir/verifier/verify_use_after_def.cpp
77ebc82 to
ff4e830
Compare
Add IRProperty::UseAfterDef and a corresponding PropertyVerifier that checks every Var reference in an expression is dominated by a definition. Register it as a structural property so it is verified automatically before and after every pass alongside TypeChecked, BreakContinueValid, and NoRedundantBlocks. Scoping rules implemented: - Function parameters: in scope for entire function body - AssignStmt: LHS enters scope after RHS is evaluated - ForStmt: loop_var and iter_args scoped to body; return_vars enter enclosing scope after the loop - WhileStmt: iter_args scoped to condition and body; return_vars enter enclosing scope after the loop - IfStmt: branch-local definitions do not propagate outward; return_vars enter enclosing scope after the if Closes hw-native-sys#548
…nused includes The UnrollLoops pass was producing invalid IR after the UseAfterDef structural property verifier was added: all unrolled iterations used the original outer variable instead of chaining outputs as inputs to the next iteration. Fix by building a shadow_map that identifies AssignStmt targets in the loop body that shadow outer-scope vars by name, then using carry_map to chain each iteration's output as the next iteration's input. The final carry is propagated to subsequent statements (e.g. return) via pending_subst in VisitStmt_(SeqStmtsPtr). Zero-trip loops are handled by mapping body def vars back to their outer vars. Nested unroll loops inside regular for loops are handled via CollectLastDefsByName which compares pre/post def vars to build the substitution. Also removes unused includes (function.h, program.h, span.h) from verify_use_after_def.cpp that were flagged by clang-tidy. Fixes 6 failing tests in test_unroll_loops_pass.py.
- Move UseAfterDef from GetStructuralProperties() to GetDefaultVerifyProperties() so it is checked on explicit request rather than before every pass, fixing 85 unit test failures caused by manually-constructed test IR that intentionally has undefined variables - Change pointer-keyed unordered_map to std::map in unroll_loops_pass.cpp (shadow_map, carry_map, final_carry, carry locals) to fix bugprone-nondeterministic-pointer-iteration-order clang-tidy errors - Add missing #include "pypto/ir/program.h" to verify_use_after_def.cpp to fix misc-include-cleaner clang-tidy error - Update Doxygen comments in ir_property.h and docs (EN + ZH) to reflect the new property set assignments
ff4e830 to
40d0a22
Compare
…perty-use-after-def # Conflicts: # include/pypto/ir/transforms/ir_property.h # src/ir/transforms/ir_property.cpp # tests/ut/ir/transforms/test_pass_manager.py # tests/ut/ir/transforms/test_pass_pipeline.py
- Fix MemRefUpdateMutator in allocate_memory_addr_pass to store new Var/IterArg pointers in var_remap_ so subsequent uses map to the same new pointer, preventing false use-before-def reports - Fix UseAfterDefChecker IfStmt handling: in leak mode (no return_vars), union both branch scopes into the enclosing scope so variables defined in both branches are visible after the if statement - Update test_verify_ssa_pass scope violation tests to accept either UseAfterDef or SSA verifier error messages (both are correct) - Restore test_verify_no_nested_call_pass to upstream's simpler IR that doesn't trigger NormalizedStmtStructure post-verification - Fix test_pass_manager merge artifact: restore missing pm/result lines
…perty-use-after-def
hw-native-sys#548) Fix the UseAfterDef verifier to correctly handle variables that leak out of if-branches and for-loops in non-SSA (leak) mode. Previously, the verifier incorrectly reported use-after-def errors for variables defined in then-only branches or loop bodies when used after the construct. Also fix upstream-missed test updates: add explicit memory_space argument to ir.TileType() calls in test files that were broken after commit 45a3110 moved memory_space from MemRef to TileType. Add memory_space parameter to IRBuilder.tile_type() helper for consistency.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a new UseAfterDef structural property verifier, which is a significant improvement for IR correctness. The changes correctly integrate the new verifier into the build system, documentation, and Python bindings. The LoopUnrollMutator and parse_if_statement logic have been appropriately updated to ensure correct variable scoping and substitution, which is crucial for the UseAfterDef analysis. The addition of memory_space to TileType constructors across various test files ensures consistency and proper type inference. The new unit tests for UseAfterDef cover both valid and invalid scenarios, demonstrating the effectiveness of the verifier. Overall, the changes are well-implemented and enhance the robustness of the IR.
There was a problem hiding this comment.
Pull request overview
Adds a new IR structural verifier (UseAfterDef) to ensure variable uses occur only after a dominating definition, integrating it into the verification pipeline (C++), Python bindings/stubs, docs, and unit tests.
Changes:
- Introduce
IRProperty::UseAfterDefand implement/registerUseAfterDefCheckverifier. - Expose the new property + error enum in Python bindings/type stubs and document the rule.
- Add focused unit tests and adjust existing tests affected by the stricter verification.
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/ut/ir/transforms/test_verify_use_after_def.py | New unit tests for the UseAfterDef verifier (valid/invalid cases + error code/rule name). |
| tests/ut/ir/transforms/test_verify_ssa_pass.py | Relaxed regex expectations to allow new “used before definition” failure mode. |
| tests/ut/ir/transforms/test_python_printer.py | Update TileType construction to include explicit memory_space when printing/round-tripping. |
| tests/ut/ir/transforms/test_insert_sync.py | Update TileType creation sites to pass memory_space. |
| tests/ut/ir/transforms/test_basic_memory_reuse.py | Update TileType construction for new signature/semantics. |
| tests/ut/ir/memory/test_memref.py | Update TileType construction sites to include memory_space. |
| src/ir/verifier/verify_use_after_def.cpp | New C++ verifier implementation (UseAfterDefCheck). |
| src/ir/verifier/property_verifier_registry.cpp | Register the new verifier in the registry. |
| src/ir/transforms/unroll_loops_pass.cpp | Update unroll logic to perform var substitution/carry across unrolled iterations. |
| src/ir/transforms/ir_property.cpp | Add UseAfterDef to stringification, structural props, and default verify set. |
| src/ir/transforms/allocate_memory_addr_pass.cpp | Attempt to cache remapped Vars/IterArgs during MemRef address updates. |
| python/pypto/pypto_core/passes.pyi | Add IRProperty.UseAfterDef and UseAfterDefErrorType to Python stubs. |
| python/pypto/language/parser/ast_parser.py | Fix if/else parsing so else branch doesn’t see then-branch leaked vars in leak mode. |
| python/pypto/ir/builder.py | Extend tile_type(...) builder helper to accept/pass through memory_space. |
| python/bindings/modules/passes.cpp | Bind IRProperty.UseAfterDef and UseAfterDefErrorType to Python. |
| include/pypto/ir/verifier/verifier.h | Declare CreateUseAfterDefPropertyVerifier() factory. |
| include/pypto/ir/verifier/verification_error.h | Define use_after_def::ErrorType (401) and declare ErrorTypeToString. |
| include/pypto/ir/transforms/ir_property.h | Add UseAfterDef to the IRProperty enum and update structural/default property docs. |
| docs/zh-cn/dev/passes/99-verifier.md | Document UseAfterDef rule and add it to structural/default property sets. |
| docs/en/dev/passes/99-verifier.md | Document UseAfterDef rule and add it to structural/default property sets. |
| CMakeLists.txt | Add new verifier source to the build. |
| /// Recurses into ForStmt bodies and SeqStmts. | ||
| static void CollectLastDefsByName(const StmtPtr& s, std::unordered_map<std::string, VarPtr>& defs) { | ||
| if (!s) return; | ||
| if (auto assign = std::dynamic_pointer_cast<const AssignStmt>(s)) { | ||
| if (assign->var_) defs[assign->var_->name_hint_] = assign->var_; | ||
| } else if (auto seq = std::dynamic_pointer_cast<const SeqStmts>(s)) { | ||
| for (const auto& child : seq->stmts_) CollectLastDefsByName(child, defs); | ||
| } else if (auto for_s = std::dynamic_pointer_cast<const ForStmt>(s)) { | ||
| if (for_s->body_) CollectLastDefsByName(for_s->body_, defs); |
| - `AssignStmt`: LHS variable enters scope after RHS is evaluated | ||
| - `ForStmt`: `loop_var` and `iter_args` are in scope inside the loop body only; `return_vars` enter the enclosing scope after the loop | ||
| - `WhileStmt`: `iter_args` are in scope for the condition and body; `return_vars` enter the enclosing scope after the loop | ||
| - `IfStmt`: definitions inside then/else branches do **not** propagate to the outer scope; `return_vars` enter the enclosing scope after the if |
| - `AssignStmt`:LHS 变量在 RHS 求值后进入作用域 | ||
| - `ForStmt`:`loop_var` 和 `iter_args` 仅在循环体内可见;`return_vars` 在循环结束后进入外层作用域 | ||
| - `WhileStmt`:`iter_args` 在条件和循环体内可见;`return_vars` 在循环结束后进入外层作用域 | ||
| - `IfStmt`:then/else 分支内的定义**不**传播到外层作用域;`return_vars` 在 if 结束后进入外层作用域 |
| static_cast<int>(use_after_def::ErrorType::USE_BEFORE_DEF), msg.str(), | ||
| op->span_); | ||
| } | ||
| // Leaf node — no children to recurse into. |
| auto saved_scope = in_scope_; | ||
|
|
||
| // loop_var and iter_args are in scope only inside the body. | ||
| if (op->loop_var_) in_scope_.insert(op->loop_var_.get()); | ||
| for (const auto& iter_arg : op->iter_args_) { |
| // Collect body def vars that shadow outer-scope vars by name. | ||
| // shadow_map: body_def_var* → outer_var VarPtr (the var it shadows) | ||
| std::map<const Var*, VarPtr> shadow_map; | ||
| { | ||
| std::unordered_map<std::string, const Var*> def_by_name; | ||
| std::function<void(const StmtPtr&)> collect_defs = [&](const StmtPtr& s) { | ||
| if (!s) return; | ||
| if (auto assign = std::dynamic_pointer_cast<const AssignStmt>(s)) { | ||
| if (assign->var_) def_by_name[assign->var_->name_hint_] = assign->var_.get(); | ||
| } else if (auto seq = std::dynamic_pointer_cast<const SeqStmts>(s)) { | ||
| for (const auto& child : seq->stmts_) collect_defs(child); | ||
| } | ||
| }; | ||
| collect_defs(op->body_); | ||
|
|
||
| std::function<void(const ExprPtr&)> collect_uses = [&](const ExprPtr& e) { | ||
| if (!e) return; | ||
| if (auto var = std::dynamic_pointer_cast<const Var>(e)) { | ||
| auto it = def_by_name.find(var->name_hint_); | ||
| if (it != def_by_name.end() && var.get() != it->second) { | ||
| shadow_map[it->second] = std::const_pointer_cast<Var>(var); | ||
| } |
| // Check if already remapped (same old pointer seen again). | ||
| auto it = var_remap_.find(op.get()); | ||
| if (it != var_remap_.end()) { | ||
| return it->second; | ||
| } | ||
| TypePtr new_type = UpdateTypeMemRef(op->GetType()); | ||
| if (new_type != op->GetType()) { | ||
| return std::make_shared<Var>(op->name_hint_, new_type, op->span_); | ||
| auto new_var = std::make_shared<Var>(op->name_hint_, new_type, op->span_); | ||
| var_remap_[op.get()] = new_var; | ||
| return new_var; |
| * - WhileStmt::iter_args_: in scope inside the loop body (including condition) | ||
| * - WhileStmt::return_vars_: defined in the enclosing scope after the loop | ||
| * - IfStmt::return_vars_: defined in the enclosing scope after the if | ||
| * - Definitions inside if/else branches do NOT propagate to the outer scope |
| // iter_args are in scope for condition and body. | ||
| for (const auto& iter_arg : op->iter_args_) { | ||
| if (iter_arg) in_scope_.insert(iter_arg.get()); | ||
| } | ||
|
|
||
| if (op->condition_) VisitExpr(op->condition_); | ||
| if (op->body_) VisitStmt(op->body_); | ||
|
|
||
| for (const auto& iter_arg : op->iter_args_) { | ||
| if (iter_arg) in_scope_.erase(iter_arg.get()); | ||
| } | ||
|
|
||
| // return_vars are defined in the enclosing scope after the loop. | ||
| for (const auto& rv : op->return_vars_) { | ||
| if (rv) in_scope_.insert(rv.get()); | ||
| } | ||
| } |
| std::function<void(const ExprPtr&)> collect_uses = [&](const ExprPtr& e) { | ||
| if (!e) return; | ||
| if (auto var = std::dynamic_pointer_cast<const Var>(e)) { | ||
| auto it = def_by_name.find(var->name_hint_); | ||
| if (it != def_by_name.end() && var.get() != it->second) { | ||
| shadow_map[it->second] = std::const_pointer_cast<Var>(var); | ||
| } | ||
| } else if (auto call = std::dynamic_pointer_cast<const Call>(e)) { | ||
| for (const auto& arg : call->args_) collect_uses(arg); | ||
| } | ||
| }; | ||
| std::function<void(const StmtPtr&)> collect_rhs = [&](const StmtPtr& s) { | ||
| if (!s) return; | ||
| if (auto assign = std::dynamic_pointer_cast<const AssignStmt>(s)) { | ||
| collect_uses(assign->value_); | ||
| } else if (auto seq = std::dynamic_pointer_cast<const SeqStmts>(s)) { | ||
| for (const auto& child : seq->stmts_) collect_rhs(child); | ||
| } | ||
| }; | ||
| collect_rhs(op->body_); | ||
| } |
There was a problem hiding this comment.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
python/pypto/ir/builder.py (1)
714-721:⚠️ Potential issue | 🟠 Major
tile_type()introduces a positional-argument breaking change.By inserting
memory_spacebeforespan(Line 714-Line 721), previous positional calls that passedspanas the 5th argument now bind incorrectly.Compatibility-safe signature update
def tile_type( self, shape: Sequence[int | ir.Expr], dtype: DataType, memref: ir.MemRef | None = None, tile_view: ir.TileView | None = None, - memory_space: ir.MemorySpace | None = None, span: ir.Span | None = None, + *, + memory_space: ir.MemorySpace | None = None, ) -> ir.TileType:Also applies to: 746-746
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@python/pypto/ir/builder.py` around lines 714 - 721, The tile_type() signature change moved memory_space before span and breaks existing positional calls; revert the breaking change by making memory_space a keyword-only parameter (e.g., insert a positional-only separator so span stays as the 5th positional arg) or else restore the original parameter order so span remains the 5th positional argument; apply the same fix to the other affected function(s) mentioned around line 746 to preserve backward compatibility.
🧹 Nitpick comments (1)
python/pypto/ir/builder.py (1)
730-730: Docstring contract and runtime behavior are inconsistent.Line 730 says
memory_spaceis required withmemref, but the method currently acceptsmemrefwithmemory_space=None. Either enforce it in code or soften the docstring.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@python/pypto/ir/builder.py` at line 730, The docstring states memory_space is required when memref is provided but the method currently allows memory_space=None; modify the function that accepts the parameters memref and memory_space to validate this contract at runtime by adding a guard like: if memref is not None and memory_space is None: raise ValueError("memory_space is required when memref is provided"), or alternatively update the docstring to indicate memory_space is optional—ensure the change is applied in the function/method that declares the memref and memory_space parameters so callers get consistent behavior and messaging.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/en/dev/passes/99-verifier.md`:
- Around line 121-128: The docs state that IfStmt branch-local definitions do
not propagate, but the UseAfterDef test
test_valid_then_only_leak_visible_after_if and the verifier accept a then-only
definition leaking out; update the IfStmt scoping rule text to reflect the
implemented contract: indicate that variables defined in a then/else branch may
propagate to the enclosing scope (or specifically that then-only definitions are
visible after the IfStmt) and clarify how return_vars behave, so the
documentation for IfStmt matches the verifier behavior (refer to IfStmt,
UseAfterDef, and test_valid_then_only_leak_visible_after_if to validate
wording).
In `@docs/zh-cn/dev/passes/99-verifier.md`:
- Around line 121-128: 文档中关于 IfStmt 的作用域规则描述不匹配测试实现:当前说“then/else
分支内的定义不传播到外层”,但 tests/ut/ir/transforms/test_verify_use_after_def.py 中的
test_valid_then_only_leak_visible_after_if 将“只有 then 分支定义的变量在 if 结束后可见”视为合法;请将
docs/zh-cn/dev/passes/99-verifier.md 中 IfStmt 条目改为与实现/英文文档保持一致,明确说明哪些
return_vars/分支内定义会在 if 结束后进入外层作用域(例如 then-only leak 情形),使中文文档与
test_valid_then_only_leak_visible_after_if 和 docs/en/dev 的语义一致。
In `@src/ir/transforms/ir_property.cpp`:
- Around line 115-124: GetStructuralProperties now includes
IRProperty::UseAfterDef but GetVerifiedProperties still omits it, causing
structural-start verification to skip UseAfterDef; update the implementation of
GetVerifiedProperties to include IRProperty::UseAfterDef in its returned
IRPropertySet (so the intersection
GetStructuralProperties().Intersection(GetVerifiedProperties()) will include
UseAfterDef), referencing the existing functions GetStructuralProperties,
GetDefaultVerifyProperties and GetVerifiedProperties to locate where to add
IRProperty::UseAfterDef.
In `@src/ir/transforms/unroll_loops_pass.cpp`:
- Around line 156-164: The carry mapping is captured from def_var_map computed
from DeepClone before recursive rewriting, so nested VisitStmt may rewrite those
defs and leave carry_map pointing to stale vars; move the carry derivation to
after processing the cloned body (i.e., call VisitStmt(cloned_body) first and
then obtain the final def remap to update carry_map), or change VisitStmt to
return both the rewritten stmt and its def remap and use that remap to populate
carry_map (reference symbols: DeepClone, cloned_body, def_var_map, VisitStmt,
carry_map, shadow_map, unrolled).
In `@src/ir/verifier/verify_use_after_def.cpp`:
- Around line 66-77: The VisitVarLike_ implementation flags a use-before-def but
never continues the IR traversal, so nested uses in var-like metadata are not
visited; update VisitVarLike_ to still call the base traversal
(IRVisitor::VisitVarLike_(op)) after performing the in_scope_ check and
recording the diagnostic (i.e., keep the scope check and diagnostics as-is, then
invoke IRVisitor::VisitVarLike_(op) to recurse into any metadata/children),
ensuring you reference the VisitVarLike_ method and IRVisitor::VisitVarLike_
when making the change.
- Around line 134-159: The VisitStmt_(const WhileStmtPtr& op) leaves loop-local
definitions from op->body_ in in_scope_ when op->return_vars_ are inserted, so
restore the original scope before adding the return_vars_; capture the in_scope_
state (or remove only loop-local symbols) after visiting op->body_ and before
inserting op->return_vars_, then insert the return_vars_ into the
restored/enclosing scope so loop-local defs are not visible after the loop;
refer to VisitStmt_(const WhileStmtPtr& op), in_scope_, op->iter_args_,
op->condition_, op->body_, and op->return_vars_ to locate where to
snapshot/restore or selectively erase loop-local entries.
In `@tests/ut/ir/transforms/test_verify_ssa_pass.py`:
- Line 307: Update the pytest.raises calls that pass regex patterns as normal
strings to use raw-string literals so backslashes and regex metacharacters are
not interpreted by Python string escapes; specifically change the match argument
in the pytest.raises call that currently reads match="used before
definition|used outside its defining scope" (and the similar patterns at the
other occurrences) to use a raw string (r"...") in the test_verify_ssa_pass.py
tests (look for the pytest.raises(...) occurrences around the failing
assertions).
---
Outside diff comments:
In `@python/pypto/ir/builder.py`:
- Around line 714-721: The tile_type() signature change moved memory_space
before span and breaks existing positional calls; revert the breaking change by
making memory_space a keyword-only parameter (e.g., insert a positional-only
separator so span stays as the 5th positional arg) or else restore the original
parameter order so span remains the 5th positional argument; apply the same fix
to the other affected function(s) mentioned around line 746 to preserve backward
compatibility.
---
Nitpick comments:
In `@python/pypto/ir/builder.py`:
- Line 730: The docstring states memory_space is required when memref is
provided but the method currently allows memory_space=None; modify the function
that accepts the parameters memref and memory_space to validate this contract at
runtime by adding a guard like: if memref is not None and memory_space is None:
raise ValueError("memory_space is required when memref is provided"), or
alternatively update the docstring to indicate memory_space is optional—ensure
the change is applied in the function/method that declares the memref and
memory_space parameters so callers get consistent behavior and messaging.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 60549225-8658-44d2-8bc1-2473d6b8b3e7
📒 Files selected for processing (21)
CMakeLists.txtdocs/en/dev/passes/99-verifier.mddocs/zh-cn/dev/passes/99-verifier.mdinclude/pypto/ir/transforms/ir_property.hinclude/pypto/ir/verifier/verification_error.hinclude/pypto/ir/verifier/verifier.hpython/bindings/modules/passes.cpppython/pypto/ir/builder.pypython/pypto/language/parser/ast_parser.pypython/pypto/pypto_core/passes.pyisrc/ir/transforms/allocate_memory_addr_pass.cppsrc/ir/transforms/ir_property.cppsrc/ir/transforms/unroll_loops_pass.cppsrc/ir/verifier/property_verifier_registry.cppsrc/ir/verifier/verify_use_after_def.cpptests/ut/ir/memory/test_memref.pytests/ut/ir/transforms/test_basic_memory_reuse.pytests/ut/ir/transforms/test_insert_sync.pytests/ut/ir/transforms/test_python_printer.pytests/ut/ir/transforms/test_verify_ssa_pass.pytests/ut/ir/transforms/test_verify_use_after_def.py
✅ Files skipped from review due to trivial changes (1)
- src/ir/transforms/allocate_memory_addr_pass.cpp
🚧 Files skipped from review as they are similar to previous changes (5)
- CMakeLists.txt
- python/bindings/modules/passes.cpp
- include/pypto/ir/verifier/verification_error.h
- include/pypto/ir/transforms/ir_property.h
- include/pypto/ir/verifier/verifier.h
| **Scoping rules:** | ||
|
|
||
| - Function parameters are in scope for the entire function body | ||
| - `AssignStmt`: LHS variable enters scope after RHS is evaluated | ||
| - `ForStmt`: `loop_var` and `iter_args` are in scope inside the loop body only; `return_vars` enter the enclosing scope after the loop | ||
| - `WhileStmt`: `iter_args` are in scope for the condition and body; `return_vars` enter the enclosing scope after the loop | ||
| - `IfStmt`: definitions inside then/else branches do **not** propagate to the outer scope; `return_vars` enter the enclosing scope after the if | ||
|
|
There was a problem hiding this comment.
IfStmt scoping rule documentation conflicts with new UseAfterDef test behavior.
Line 127 says branch-local definitions do not propagate, but test_valid_then_only_leak_visible_after_if in tests/ut/ir/transforms/test_verify_use_after_def.py (Lines 188-210) expects then-only definition leakage to be accepted by UseAfterDef. Please reconcile the documented rule with the implemented/tested contract.
As per coding guidelines: English docs in docs/en/dev/ are the ground truth and should be updated when behavior changes.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/en/dev/passes/99-verifier.md` around lines 121 - 128, The docs state
that IfStmt branch-local definitions do not propagate, but the UseAfterDef test
test_valid_then_only_leak_visible_after_if and the verifier accept a then-only
definition leaking out; update the IfStmt scoping rule text to reflect the
implemented contract: indicate that variables defined in a then/else branch may
propagate to the enclosing scope (or specifically that then-only definitions are
visible after the IfStmt) and clarify how return_vars behave, so the
documentation for IfStmt matches the verifier behavior (refer to IfStmt,
UseAfterDef, and test_valid_then_only_leak_visible_after_if to validate
wording).
| **作用域规则:** | ||
|
|
||
| - 函数参数在整个函数体内可见 | ||
| - `AssignStmt`:LHS 变量在 RHS 求值后进入作用域 | ||
| - `ForStmt`:`loop_var` 和 `iter_args` 仅在循环体内可见;`return_vars` 在循环结束后进入外层作用域 | ||
| - `WhileStmt`:`iter_args` 在条件和循环体内可见;`return_vars` 在循环结束后进入外层作用域 | ||
| - `IfStmt`:then/else 分支内的定义**不**传播到外层作用域;`return_vars` 在 if 结束后进入外层作用域 | ||
|
|
There was a problem hiding this comment.
IfStmt 作用域规则描述与新用例语义不一致。
Line 127 写明“then/else 内定义不传播到外层”,但 tests/ut/ir/transforms/test_verify_use_after_def.py 的 test_valid_then_only_leak_visible_after_if(Lines 188-210)明确将该场景视为 UseAfterDef 合法。建议统一文档与实现/测试语义,避免读者误解。
As per coding guidelines: docs/zh-cn/dev/ must stay aligned with English docs when docs/en/dev/ changes.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/zh-cn/dev/passes/99-verifier.md` around lines 121 - 128, 文档中关于 IfStmt
的作用域规则描述不匹配测试实现:当前说“then/else 分支内的定义不传播到外层”,但
tests/ut/ir/transforms/test_verify_use_after_def.py 中的
test_valid_then_only_leak_visible_after_if 将“只有 then 分支定义的变量在 if 结束后可见”视为合法;请将
docs/zh-cn/dev/passes/99-verifier.md 中 IfStmt 条目改为与实现/英文文档保持一致,明确说明哪些
return_vars/分支内定义会在 if 结束后进入外层作用域(例如 then-only leak 情形),使中文文档与
test_valid_then_only_leak_visible_after_if 和 docs/en/dev 的语义一致。
| const IRPropertySet& GetStructuralProperties() { | ||
| static const IRPropertySet props{IRProperty::TypeChecked, IRProperty::BreakContinueValid, | ||
| IRProperty::NoRedundantBlocks}; | ||
| IRProperty::NoRedundantBlocks, IRProperty::UseAfterDef}; | ||
| return props; | ||
| } | ||
|
|
||
| const IRPropertySet& GetDefaultVerifyProperties() { | ||
| static const IRPropertySet props{IRProperty::SSAForm, IRProperty::TypeChecked, IRProperty::NoNestedCalls, | ||
| IRProperty::BreakContinueValid, IRProperty::NoRedundantBlocks}; | ||
| static const IRPropertySet props{IRProperty::SSAForm, IRProperty::TypeChecked, | ||
| IRProperty::NoNestedCalls, IRProperty::BreakContinueValid, | ||
| IRProperty::NoRedundantBlocks, IRProperty::UseAfterDef}; |
There was a problem hiding this comment.
UseAfterDef is structural here, but still not in auto-verified set.
Line 115-Line 124 adds UseAfterDef to structural/default verification sets, but GetVerifiedProperties() still omits it. In the current pipeline behavior, that means structural-start verification can skip UseAfterDef.
Suggested fix (outside this hunk)
const IRPropertySet& GetVerifiedProperties() {
static const IRPropertySet props{IRProperty::SSAForm, IRProperty::TypeChecked,
IRProperty::AllocatedMemoryAddr, IRProperty::BreakContinueValid,
- IRProperty::NoRedundantBlocks};
+ IRProperty::NoRedundantBlocks, IRProperty::UseAfterDef};
return props;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ir/transforms/ir_property.cpp` around lines 115 - 124,
GetStructuralProperties now includes IRProperty::UseAfterDef but
GetVerifiedProperties still omits it, causing structural-start verification to
skip UseAfterDef; update the implementation of GetVerifiedProperties to include
IRProperty::UseAfterDef in its returned IRPropertySet (so the intersection
GetStructuralProperties().Intersection(GetVerifiedProperties()) will include
UseAfterDef), referencing the existing functions GetStructuralProperties,
GetDefaultVerifyProperties and GetVerifiedProperties to locate where to add
IRProperty::UseAfterDef.
| auto [cloned_body, def_var_map] = DeepClone(op->body_, sub_map); | ||
| // Update carry: outer_var → this iteration's fresh output var | ||
| for (const auto& [body_def, outer_var] : shadow_map) { | ||
| auto it = def_var_map.find(body_def); | ||
| if (it != def_var_map.end()) { | ||
| carry_map[outer_var.get()] = it->second; | ||
| } | ||
| } | ||
| unrolled.push_back(VisitStmt(cloned_body)); |
There was a problem hiding this comment.
Carry state is captured before recursive mutation, which can leave stale carried vars.
Line 156 computes def_var_map from the raw clone and Line 161 stores carry from it, but Line 164 then runs VisitStmt(cloned_body), which may rewrite defs (e.g., nested unroll/substitution). Next iterations can therefore substitute with a var that is no longer the final emitted def. Carry should be derived from the post-VisitStmt body, or VisitStmt should return a def remap alongside the new stmt.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ir/transforms/unroll_loops_pass.cpp` around lines 156 - 164, The carry
mapping is captured from def_var_map computed from DeepClone before recursive
rewriting, so nested VisitStmt may rewrite those defs and leave carry_map
pointing to stale vars; move the carry derivation to after processing the cloned
body (i.e., call VisitStmt(cloned_body) first and then obtain the final def
remap to update carry_map), or change VisitStmt to return both the rewritten
stmt and its def remap and use that remap to populate carry_map (reference
symbols: DeepClone, cloned_body, def_var_map, VisitStmt, carry_map, shadow_map,
unrolled).
| void VisitVarLike_(const VarPtr& op) override { | ||
| if (!op) return; | ||
| if (in_scope_.find(op.get()) == in_scope_.end()) { | ||
| std::ostringstream msg; | ||
| msg << "Variable '" << op->name_hint_ << "' used before definition" | ||
| << " in function '" << func_name_ << "'"; | ||
| diagnostics_.emplace_back(DiagnosticSeverity::Error, "UseAfterDefCheck", | ||
| static_cast<int>(use_after_def::ErrorType::USE_BEFORE_DEF), msg.str(), | ||
| op->span_); | ||
| } | ||
| // Leaf node — no children to recurse into. | ||
| } |
There was a problem hiding this comment.
VisitVarLike_ should continue base traversal after the scope check.
Line 66-Line 77 records the diagnostic but never calls IRVisitor::VisitVarLike_(op), so nested uses reachable through var-like metadata aren’t traversed.
Suggested fix
void VisitVarLike_(const VarPtr& op) override {
if (!op) return;
if (in_scope_.find(op.get()) == in_scope_.end()) {
std::ostringstream msg;
msg << "Variable '" << op->name_hint_ << "' used before definition"
<< " in function '" << func_name_ << "'";
diagnostics_.emplace_back(DiagnosticSeverity::Error, "UseAfterDefCheck",
static_cast<int>(use_after_def::ErrorType::USE_BEFORE_DEF), msg.str(),
op->span_);
}
- // Leaf node — no children to recurse into.
+ IRVisitor::VisitVarLike_(op);
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| void VisitVarLike_(const VarPtr& op) override { | |
| if (!op) return; | |
| if (in_scope_.find(op.get()) == in_scope_.end()) { | |
| std::ostringstream msg; | |
| msg << "Variable '" << op->name_hint_ << "' used before definition" | |
| << " in function '" << func_name_ << "'"; | |
| diagnostics_.emplace_back(DiagnosticSeverity::Error, "UseAfterDefCheck", | |
| static_cast<int>(use_after_def::ErrorType::USE_BEFORE_DEF), msg.str(), | |
| op->span_); | |
| } | |
| // Leaf node — no children to recurse into. | |
| } | |
| void VisitVarLike_(const VarPtr& op) override { | |
| if (!op) return; | |
| if (in_scope_.find(op.get()) == in_scope_.end()) { | |
| std::ostringstream msg; | |
| msg << "Variable '" << op->name_hint_ << "' used before definition" | |
| << " in function '" << func_name_ << "'"; | |
| diagnostics_.emplace_back(DiagnosticSeverity::Error, "UseAfterDefCheck", | |
| static_cast<int>(use_after_def::ErrorType::USE_BEFORE_DEF), msg.str(), | |
| op->span_); | |
| } | |
| IRVisitor::VisitVarLike_(op); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ir/verifier/verify_use_after_def.cpp` around lines 66 - 77, The
VisitVarLike_ implementation flags a use-before-def but never continues the IR
traversal, so nested uses in var-like metadata are not visited; update
VisitVarLike_ to still call the base traversal (IRVisitor::VisitVarLike_(op))
after performing the in_scope_ check and recording the diagnostic (i.e., keep
the scope check and diagnostics as-is, then invoke IRVisitor::VisitVarLike_(op)
to recurse into any metadata/children), ensuring you reference the VisitVarLike_
method and IRVisitor::VisitVarLike_ when making the change.
| void VisitStmt_(const WhileStmtPtr& op) override { | ||
| if (!op) return; | ||
|
|
||
| // IterArg initial values are evaluated in the outer scope. | ||
| for (const auto& iter_arg : op->iter_args_) { | ||
| if (iter_arg && iter_arg->initValue_) { | ||
| VisitExpr(iter_arg->initValue_); | ||
| } | ||
| } | ||
|
|
||
| // iter_args are in scope for condition and body. | ||
| for (const auto& iter_arg : op->iter_args_) { | ||
| if (iter_arg) in_scope_.insert(iter_arg.get()); | ||
| } | ||
|
|
||
| if (op->condition_) VisitExpr(op->condition_); | ||
| if (op->body_) VisitStmt(op->body_); | ||
|
|
||
| for (const auto& iter_arg : op->iter_args_) { | ||
| if (iter_arg) in_scope_.erase(iter_arg.get()); | ||
| } | ||
|
|
||
| // return_vars are defined in the enclosing scope after the loop. | ||
| for (const auto& rv : op->return_vars_) { | ||
| if (rv) in_scope_.insert(rv.get()); | ||
| } |
There was a problem hiding this comment.
WhileStmt scope restoration is incomplete in SSA mode.
In Line 134-Line 159, defs created inside op->body_ remain in in_scope_ even when return_vars_ is present. That makes loop-local defs incorrectly visible after the while loop.
Suggested fix
void VisitStmt_(const WhileStmtPtr& op) override {
if (!op) return;
+ auto saved_scope = in_scope_;
// IterArg initial values are evaluated in the outer scope.
for (const auto& iter_arg : op->iter_args_) {
if (iter_arg && iter_arg->initValue_) {
VisitExpr(iter_arg->initValue_);
}
}
// iter_args are in scope for condition and body.
for (const auto& iter_arg : op->iter_args_) {
if (iter_arg) in_scope_.insert(iter_arg.get());
}
if (op->condition_) VisitExpr(op->condition_);
if (op->body_) VisitStmt(op->body_);
for (const auto& iter_arg : op->iter_args_) {
if (iter_arg) in_scope_.erase(iter_arg.get());
}
- // return_vars are defined in the enclosing scope after the loop.
- for (const auto& rv : op->return_vars_) {
- if (rv) in_scope_.insert(rv.get());
- }
+ if (!op->return_vars_.empty()) {
+ // SSA mode: only return_vars are visible after the loop.
+ in_scope_ = saved_scope;
+ for (const auto& rv : op->return_vars_) {
+ if (rv) in_scope_.insert(rv.get());
+ }
+ }
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ir/verifier/verify_use_after_def.cpp` around lines 134 - 159, The
VisitStmt_(const WhileStmtPtr& op) leaves loop-local definitions from op->body_
in in_scope_ when op->return_vars_ are inserted, so restore the original scope
before adding the return_vars_; capture the in_scope_ state (or remove only
loop-local symbols) after visiting op->body_ and before inserting
op->return_vars_, then insert the return_vars_ into the restored/enclosing scope
so loop-local defs are not visible after the loop; refer to VisitStmt_(const
WhileStmtPtr& op), in_scope_, op->iter_args_, op->condition_, op->body_, and
op->return_vars_ to locate where to snapshot/restore or selectively erase
loop-local entries.
|
|
||
| verify_pass = passes.run_verifier() | ||
| with pytest.raises(Exception, match="used outside its defining scope"): | ||
| with pytest.raises(Exception, match="used before definition|used outside its defining scope"): |
There was a problem hiding this comment.
Use raw-string regex literals in pytest.raises(..., match=...).
Line 307, Line 341, Line 372, and Line 396 use regex metacharacters in normal strings, which triggers RUF043 and is easy to avoid with raw strings.
Suggested fix
- with pytest.raises(Exception, match="used before definition|used outside its defining scope"):
+ with pytest.raises(Exception, match=r"used before definition|used outside its defining scope"):
verify_pass(program)
...
- with pytest.raises(Exception, match="used before definition|used outside its defining scope"):
+ with pytest.raises(Exception, match=r"used before definition|used outside its defining scope"):
verify_pass(program)
...
- with pytest.raises(Exception, match="used before definition|used outside its defining scope"):
+ with pytest.raises(Exception, match=r"used before definition|used outside its defining scope"):
verify_pass(program)
...
- with pytest.raises(Exception, match="used before definition|used outside its defining scope"):
+ with pytest.raises(Exception, match=r"used before definition|used outside its defining scope"):
verify_pass(program)Also applies to: 341-341, 372-372, 396-396
🧰 Tools
🪛 Ruff (0.15.6)
[warning] 307-307: Pattern passed to match= contains metacharacters but is neither escaped nor raw
(RUF043)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/ut/ir/transforms/test_verify_ssa_pass.py` at line 307, Update the
pytest.raises calls that pass regex patterns as normal strings to use raw-string
literals so backslashes and regex metacharacters are not interpreted by Python
string escapes; specifically change the match argument in the pytest.raises call
that currently reads match="used before definition|used outside its defining
scope" (and the similar patterns at the other occurrences) to use a raw string
(r"...") in the test_verify_ssa_pass.py tests (look for the pytest.raises(...)
occurrences around the failing assertions).
Summary
Add `IRProperty::UseAfterDef` as a structural property with a corresponding `UseAfterDefVerifier` that checks all variable uses occur after their definitions
in the IR.
`AllocateStmt`; popped on exit)
Testing
Related Issues
Closes #(issue number)