
refactor(codegen): use Out/InOut params for orchestration output tensors#620

Merged
lyfne123 merged 4 commits into hw-native-sys:main from YunjiQin:orch
Mar 19, 2026

Conversation


@YunjiQin (Contributor) commented Mar 19, 2026

Summary

  • Refactor orchestration codegen to derive output tensors from Out/InOut function parameters instead of inferring them from return statements
  • Support incore call return tensors whose names differ from the corresponding output arguments
  • Update all examples and tests to use pl.Out[...] parameter syntax for output tensors
  • Remove dead fields from OrchestrationInfoCollector that were no longer read after the refactor
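The core change — outputs declared in the signature rather than allocated and returned — can be pictured with a plain-Python sketch (the function names here are illustrative, not the real `pl` API):

```python
# Before: the callee allocates and returns its output, so the codegen had to
# infer output tensors from return statements.
def scale_return(x, factor):
    return [v * factor for v in x]

# After: the output is an explicit parameter the caller provides, which is
# what pl.Out[...] expresses at the signature level.
def scale_out(x, factor, out):
    for i, v in enumerate(x):
        out[i] = v * factor

out = [0.0] * 3
scale_out([1.0, 2.0, 3.0], 2.0, out)
# out now holds [2.0, 4.0, 6.0]
```

With the output in the signature, the codegen no longer needs to guess which tensors escape the function; it reads them directly from the parameter annotations.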

Testing

  • All 2558 tests pass
  • Code review completed
  • Pre-commit hooks pass (clang-format, cpplint, ruff, pyright)

Related Issues

fix #583


coderabbitai bot commented Mar 19, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Refactors orchestration codegen to stop creating local return tensors and instead emit reference aliases for kernel outputs; updates many examples and tests to accept external output handles (pl.Out / pl.InOut) and adds comprehensive orchestration-codegen documentation.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Documentation (EN & ZH)**: `docs/en/dev/codegen/02-orchestration_codegen.md`, `docs/zh-cn/dev/codegen/02-orchestration_codegen.md` | New comprehensive docs describing orchestration codegen architecture, phases, internals (collector, stmt codegen, op registry), examples, and Python API notes. |
| **Core Orchestration Codegen**: `src/codegen/orchestration/orchestration_codegen.cpp` | Removed helpers for counting/inferring return tensors and expected args; simplified `OrchestrationInfoCollector` to tuple metadata only; changed `AssignStmt` emission to create C++ `Tensor&` aliases from callee Out/InOut params; removed orchestration-local return-tensor creation logic. |
| **IR-parser examples**: `examples/ir_parser/batch_paged_attention_example.py`, `.../orchestration_example.py`, `.../paged_attention_example.py`, `.../vector_example_dag.py` | Switched orchestration signatures to accept `pl.Out[...]` outputs and removed internal `pl.create_tensor` allocations; removed unused `size_*` scalar params in paged attention. |
| **Language examples (beginner/intermediate/llm)**: `examples/language/beginner/...`, `examples/language/intermediate/...`, `examples/language/llm_models/llama_7b_mini.py` | Many orchestrator functions updated to take `pl.Out[...]` outputs (or `pl.InOut[...]` where in-place), removing internal output allocations and forwarding caller-provided buffers into kernels. |
| **System tests (codegen & runtime)**: `tests/st/codegen/...`, `tests/st/runtime/...` | Test programs updated to reflect external output parameters (`pl.Out`/`pl.InOut`) and removed local `pl.create_tensor` allocations; paged-attention test tensor specs for `size_*` removed. |
| **Unit tests (codegen)**: `tests/ut/codegen/test_orchestration_codegen.py` | Multiple orchestration unit tests updated to expect external `pl.Out`/`pl.InOut` parameters; local `create_tensor` calls removed; tuple/in-place test signatures adjusted accordingly. |


Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes


Suggested reviewers

  • lyfne123
  • Hzfengsy

Poem

"🐰 I hopped through code and found a way,
To let outputs leap out into the day.
No more creating inside the nest,
Kernels write where callers request.
Aliases bind, the orchestration sings—hooray!"

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 20.16%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and accurately summarizes the main refactoring: changing orchestration codegen to use Out/InOut parameters for output tensors instead of inferring from returns. |
| Description check | ✅ Passed | The PR description accurately describes the changeset: refactoring orchestration codegen to use Out/InOut parameters for output tensors, updating examples/tests, and removing dead code. |



@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the orchestration code generation process by shifting from implicit inference of output tensors to explicit declaration using pl.Out and pl.InOut type annotations in Python function parameters. This change improves the robustness and readability of the orchestration IR, making the intent of output tensors clear at the function signature level. The underlying C++ code generation logic has been simplified, and internal metadata collection mechanisms have been streamlined to align with this more explicit approach.

Highlights

  • Refactored Orchestration Codegen: The orchestration code generation now explicitly derives output tensors from Out/InOut function parameters, enhancing clarity and simplifying the codegen logic.
  • Updated Python Syntax: All relevant examples and tests have been updated to utilize the new pl.Out[...] and pl.InOut[...] parameter syntax for declaring output tensors in Python orchestration functions.
  • Streamlined Metadata Collection: The OrchestrationInfoCollector has been optimized by removing fields previously used for inferring output tensors from return statements, as this information is now explicitly provided via parameter annotations.
  • C++ Alias Generation for InCore Calls: The codegen now emits C++ reference aliases for InCore call return values when their names differ from the corresponding Out/InOut arguments, ensuring correct variable mapping.
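The alias-generation point above can be illustrated with a plain-Python analogy (a sketch of the idea, not the generated C++; `out_c` and `y` are hypothetical names):

```python
# When an incore call's result is bound to a name ("y") that differs from the
# Out/InOut argument name ("out_c"), the refactored codegen emits a C++
# reference alias (roughly: Tensor& y = out_c;). Python name binding behaves
# analogously: both names refer to one underlying buffer, so no copy is made.
out_c = bytearray(4)   # stands in for the caller-provided output tensor
y = out_c              # the "alias": a second name for the same buffer
y[0] = 7               # the kernel writes through the alias...
assert out_c[0] == 7   # ...and the caller observes the result
```

The alias keeps later statements that refer to the local result name valid without materializing a separate tensor.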
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/st/runtime/test_ctrl_flow.py (1)

524-532: ⚠️ Potential issue | 🟡 Minor

Update orchestrator functions for consistency or add explanatory comment.

The three test cases (TestForLoopBreak, TestForLoopContinue, and TestForLoopBreakContinue) use pl.create_tensor() to allocate output tensors in their orchestrator functions, while all other test cases in this file use pl.Out[...] parameters. Additionally, the InCore kernel functions within these same test cases use pl.Out[...], creating an inconsistency within each test class.

Either update these orchestrator functions to use pl.Out[...] for consistency, or add a brief comment explaining why this pattern differs.
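A minimal plain-Python analogue of the suggested fix (hypothetical helper names; the real tests use the `pl` DSL with `pl.Out[[256, 64], pl.FP32]` annotations):

```python
# The orchestrator accepts the output buffer as a parameter and forwards it to
# the kernel, instead of allocating it internally (the pl.create_tensor
# pattern the comment flags as inconsistent).
def kernel_break(a, out):
    # toy stand-in for the incore kernel: copy values until a sentinel, then break
    for i, v in enumerate(a):
        if v < 0:
            break
        out[i] = v

def orchestrator(a, out):
    # forward the caller-provided buffer rather than allocating a local one
    kernel_break(a, out)

out = [0, 0, 0, 0]
orchestrator([1, 2, -1, 4], out)
# out holds [1, 2, 0, 0]: copying stopped at the sentinel
```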

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/st/runtime/test_ctrl_flow.py` around lines 524 - 532, The orchestrator
functions (e.g., orchestrator in
TestForLoopBreak/TestForLoopContinue/TestForLoopBreakContinue) are inconsistent
with the rest of the file by allocating outputs with pl.create_tensor rather
than taking pl.Out[...] parameters while their kernels (kernel_break,
kernel_continue, etc.) use pl.Out; update each orchestrator signature to accept
the output tensor as a pl.Out[[256, 64], pl.FP32] parameter and wire that
through when calling the corresponding kernel (e.g., pass the pl.Out c into
kernel_break), or if create_tensor is intentional, add a short explanatory
comment above each orchestrator explaining why create_tensor is required for
these tests to justify the deviation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 5a2ac031-bf8d-4d67-a831-dc874c631271

📥 Commits

Reviewing files that changed from the base of the PR and between 95925ed and 1f94b46.

📒 Files selected for processing (25)
  • docs/en/dev/codegen/02-orchestration_codegen.md
  • docs/zh-cn/dev/codegen/02-orchestration_codegen.md
  • examples/ir_parser/batch_paged_attention_example.py
  • examples/ir_parser/orchestration_example.py
  • examples/ir_parser/paged_attention_example.py
  • examples/ir_parser/vector_example_dag.py
  • examples/language/beginner/basic_ops.py
  • examples/language/beginner/elementwise.py
  • examples/language/beginner/hello_world.py
  • examples/language/beginner/matmul.py
  • examples/language/intermediate/activation.py
  • examples/language/intermediate/ffn_activations.py
  • examples/language/intermediate/layer_norm.py
  • examples/language/intermediate/rms_norm.py
  • examples/language/intermediate/softmax.py
  • examples/language/intermediate/vector_dag.py
  • examples/language/llm_models/llama_7b_mini.py
  • src/codegen/orchestration/orchestration_codegen.cpp
  • tests/st/codegen/test_batch_paged_attention.py
  • tests/st/codegen/test_paged_attention.py
  • tests/st/runtime/test_ctrl_flow.py
  • tests/st/runtime/test_dynamic_shape.py
  • tests/st/runtime/test_fillpad.py
  • tests/st/runtime/test_matmul.py
  • tests/ut/codegen/test_orchestration_codegen.py

@gemini-code-assist

Warning

Gemini is experiencing higher than usual traffic and was unable to create the review. Please try again in a few hours by commenting /gemini review.

Replace return-var inference with explicit pl.Out/pl.InOut parameter
annotations in orchestration functions. Output tensors are now passed
as params rather than allocated via pl.create_tensor() in the body.

C++ codegen removes dead code: CountReturnTensors, CountExpectedArgs,
GetIntermediateTensorType, return_vars tracking, and return_names_
member. Adds alias generation for InCore call return values that map
to Out/InOut args.
…t params

Migrate output tensor declarations from local pl.create_tensor() to
explicit pl.Out[...] parameters across all examples and tests, aligning
with the Out/InOut orchestration codegen refactor.
Remove output_tensors, output_tensor_assigns, tuple_element_map, and
call_to_result_var which were populated but never read after the
Out/InOut param refactor. Also remove the no-op SetTupleElementMap
method and its call site.
@YunjiQin
Contributor Author

@lyfne123

@lyfne123 lyfne123 merged commit e7271b2 into hw-native-sys:main Mar 19, 2026
7 checks passed
@YunjiQin YunjiQin deleted the orch branch March 19, 2026 12:14


Development

Successfully merging this pull request may close these issues.

[Bug] Generated orchestration uses undeclared source tensor for view on incore result

2 participants