Refactor TensorMap to per-ring buckets and pools by jvjhfhg · Pull Request #310 · hw-native-sys/simpler

jvjhfhg · 2026-03-17T12:09:59Z

- Refactor owner TensorMap storage into per-ring buckets, pools, and cleanup tracking.
- Add a fallback tensormap for external tensors and cross-ring INOUT modifiers while keeping owner-ring history ring-local.
- Route lookup and removal across owner and fallback sources, and bind make_tensor() to the current scope ring.
- Update paged attention to treat oi_batch as INOUT in the example and the matching device test.

gemini-code-assist · 2026-03-17T12:10:27Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the TensorMap implementation to improve performance and isolation by introducing a per-ring architecture. Instead of a single global TensorMap, each processing ring now maintains its own independent set of TensorMap data structures. This change enables more efficient cleanup and lookup operations, particularly through a "break-on-stale" optimization, and better manages tensor dependencies across different rings. The orchestrator has been updated to correctly assign ring IDs to tensors, ensuring proper routing within the new TensorMap structure.

Highlights

- Per-Ring TensorMap Structure: Introduced PTO2TensorMapRing to manage independent buckets, entry pools, free lists, and task entry heads for each processing ring, enhancing isolation and performance.
- TensorMap Facade: Refactored PTO2TensorMap to act as a facade, routing all operations (lookup, insert, cleanup) to the appropriate PTO2TensorMapRing instance based on the tensor's ring_id.
- Tensor Ring ID: Added a uint8_t ring_id field to the Tensor struct, allowing tensors to be explicitly associated with a specific ring, with TENSOR_RING_ID_NONE indicating an unassigned or external tensor.
- Orchestrator Ring ID Assignment: Modified the orchestrator to automatically assign the current ring's ring_id to OUTPUT tensors and unassigned INOUT tensors during task submission.
- Break-on-Stale Optimization: Enabled a "break-on-stale" optimization within PTO2TensorMapRing lookup chains, which leverages the newest-first ordering of entries to stop searching a chain as soon as a stale entry is encountered.
- Decentralized Cleanup Tracking: Moved the tensormap_last_cleanup tracking variable from the orchestrator's global state into each individual PTO2TensorMapRing instance, further decentralizing and isolating ring-specific state.
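
The facade-plus-per-ring layout described in the highlights can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `RingMap`, `TensorMapFacade`, `TensorKey`, `kRingIdNone`, and `kMaxRings` are hypothetical stand-ins for PTO2TensorMapRing, PTO2TensorMap, Tensor, TENSOR_RING_ID_NONE, and PTO2_MAX_RING_DEPTH, and a `std::unordered_map` replaces the real buckets and entry pools.

```cpp
#include <array>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-ins; names and values are illustrative only.
constexpr uint8_t kRingIdNone = 0xFF;  // plays the role of TENSOR_RING_ID_NONE
constexpr int kMaxRings = 4;           // plays the role of PTO2_MAX_RING_DEPTH

struct TensorKey {
    uint64_t addr;    // base address used as the lookup key
    uint8_t ring_id;  // owning ring, or kRingIdNone if unassigned
};

// One independent map per ring: producer task id keyed by tensor address.
struct RingMap {
    std::unordered_map<uint64_t, int32_t> producer_of;
    int32_t last_cleanup = 0;  // cleanup progress lives in the ring, not globally

    void insert(uint64_t addr, int32_t task_id) { producer_of[addr] = task_id; }

    int32_t lookup(uint64_t addr) const {
        auto it = producer_of.find(addr);
        return it == producer_of.end() ? -1 : it->second;
    }
};

// Facade: owns the per-ring maps and routes each call by ring_id.
struct TensorMapFacade {
    std::array<RingMap, kMaxRings> rings;

    bool insert(const TensorKey& t, int32_t task_id) {
        if (t.ring_id >= kMaxRings) return false;  // unassigned or out of range
        rings[t.ring_id].insert(t.addr, task_id);
        return true;
    }

    int32_t lookup(const TensorKey& t) const {
        if (t.ring_id >= kMaxRings) return -1;
        return rings[t.ring_id].lookup(t.addr);
    }
};
```

In this sketch a tensor with no ring is simply rejected; the PR description instead mentions a fallback tensormap for external tensors, which is not modeled here.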

Changelog

src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_orchestrator.cpp
- Removed initialization of orch->tensormap_last_cleanup.
- Added logic to assign ring_id to INOUT tensors if unassigned, and to OUTPUT tensors.
src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_orchestrator.h
- Removed tensormap_last_cleanup member from PTO2OrchestratorState.
src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_tensormap.cpp
- Renamed PTO2TensorMap to PTO2TensorMapRing and updated its init, destroy, print_stats, valid_count, lookup, insert, and cleanup_retired methods to operate on a single ring.
- Created a new PTO2TensorMap struct as a facade, containing an array of PTO2TensorMapRing instances and routing calls based on tensor.ring_id.
- Modified sync_tensormap to iterate through all PTO2TensorMapRing instances and perform per-ring validity synchronization and cleanup.
- Implemented the "break-on-stale" optimization in PTO2TensorMapRing::lookup.
src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_tensormap.h
- Introduced PTO2TensorMapRing struct with per-ring specific members like buckets, entry_pool, free_entry_list, task_entry_heads, task_window_size, last_task_alive, and last_cleanup.
- Updated PTO2TensorMap to be a facade, holding an array of PTO2TensorMapRing objects.
- Adjusted comments to reflect the new per-ring design and the "break-on-stale" optimization.
- Removed ring_id parameter from sync_validity and cleanup_retired in PTO2TensorMapRing as it now operates on its own ring.
src/a2a3/runtime/tensormap_and_ringbuffer/runtime/tensor.h
- Added uint8_t ring_id field to the Tensor struct.
- Defined TENSOR_RING_ID_NONE constant.
- Updated Tensor constructors and print method to include the new ring_id field.
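
The ring_id assignment rule listed for pto_orchestrator.cpp (OUTPUT tensors always take the current ring, INOUT tensors only if still unassigned) could look roughly like the sketch below. All names here (`ArgKind`, `TaskArg`, `assign_ring_ids`, `kRingIdNone`) are hypothetical and chosen for illustration; the actual orchestrator code is not shown on this page.

```cpp
#include <cstdint>
#include <vector>

constexpr uint8_t kRingIdNone = 0xFF;  // stand-in for TENSOR_RING_ID_NONE

enum class ArgKind { Input, Output, InOut };

struct Tensor {
    uint8_t ring_id = kRingIdNone;  // unassigned until a task claims it
};

struct TaskArg {
    Tensor* tensor;
    ArgKind kind;
};

// Applies the assignment rule at task submission time (hypothetical helper).
void assign_ring_ids(std::vector<TaskArg>& args, uint8_t current_ring) {
    for (TaskArg& arg : args) {
        if (arg.kind == ArgKind::Output) {
            // OUTPUT tensors are produced in the current ring.
            arg.tensor->ring_id = current_ring;
        } else if (arg.kind == ArgKind::InOut &&
                   arg.tensor->ring_id == kRingIdNone) {
            // INOUT tensors keep an existing owner; only unassigned ones are claimed.
            arg.tensor->ring_id = current_ring;
        }
        // INPUT tensors are left untouched.
    }
}
```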

Activity

No specific activity (comments, reviews, progress updates) was provided in the context for this pull request.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces a significant and well-executed refactoring of TensorMap to a per-ring architecture. By creating PTO2TensorMapRing for per-ring resource management and making PTO2TensorMap a facade, the design improves isolation and enables a key performance optimization in lookup with break-on-stale logic. The changes are consistent and robust across all modified files. My review includes a couple of minor suggestions to simplify memory allocation and initialization by using calloc.

gemini-code-assist · 2026-03-17T12:12:59Z

src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_tensormap.cpp

+    buckets = (PTO2TensorMapEntry**)malloc(new_num_buckets * sizeof(PTO2TensorMapEntry*));
+    if (!buckets) return false;
    for (int32_t i = 0; i < new_num_buckets; i++) {
        buckets[i] = nullptr;
    }


For simplicity and to ensure zero-initialization, you can replace malloc and the subsequent loop with a single call to calloc.

Suggested change

-    buckets = (PTO2TensorMapEntry**)malloc(new_num_buckets * sizeof(PTO2TensorMapEntry*));
-    if (!buckets) return false;
-    for (int32_t i = 0; i < new_num_buckets; i++) {
-        buckets[i] = nullptr;
-    }
+    buckets = (PTO2TensorMapEntry**)calloc(new_num_buckets, sizeof(PTO2TensorMapEntry*));
+    if (!buckets) return false;
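
A standalone version of the calloc-based allocation suggested above might look like this. It is a sketch with hypothetical names (`Entry`, `alloc_bucket_array`). One caveat worth stating: calloc zeroes bytes, and the C and C++ standards do not strictly guarantee that an all-zero bit pattern is a null pointer, although this holds on mainstream platforms.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

struct Entry;  // opaque here; only pointers to it are stored

// Allocates n bucket heads, all initially empty. Returns nullptr on failure
// or when n is not positive. The caller releases the array with std::free.
Entry** alloc_bucket_array(int32_t n) {
    if (n <= 0) return nullptr;
    // calloc takes the count and element size separately and returns
    // zero-filled memory, so no explicit initialization loop is needed.
    return static_cast<Entry**>(
        std::calloc(static_cast<std::size_t>(n), sizeof(Entry*)));
}
```

Passing the count and element size separately also lets the allocator reject a product that would overflow, which a hand-written `n * sizeof(...)` passed to malloc does not do.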

gemini-code-assist · 2026-03-17T12:12:59Z

src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_tensormap.cpp

+    task_entry_heads = (PTO2TensorMapEntry**)malloc(new_task_window_size * sizeof(PTO2TensorMapEntry*));
+    if (!task_entry_heads) {
+        free(buckets); buckets = nullptr;
+        free(entry_pool); entry_pool = nullptr;
+        free(free_entry_list); free_entry_list = nullptr;
+        return false;
    }
-
-    for (int r = 0; r < PTO2_MAX_RING_DEPTH; r++) {
-        last_task_alives[r] = 0;
+    for (int32_t i = 0; i < new_task_window_size; i++) {
+        task_entry_heads[i] = nullptr;
    }


Similar to the buckets allocation, you can use calloc here to both allocate and zero-initialize the task_entry_heads array, which simplifies the code.

Suggested change

-    task_entry_heads = (PTO2TensorMapEntry**)malloc(new_task_window_size * sizeof(PTO2TensorMapEntry*));
-    if (!task_entry_heads) {
-        free(buckets); buckets = nullptr;
-        free(entry_pool); entry_pool = nullptr;
-        free(free_entry_list); free_entry_list = nullptr;
-        return false;
-    }
-    for (int32_t i = 0; i < new_task_window_size; i++) {
-        task_entry_heads[i] = nullptr;
-    }
+    task_entry_heads = (PTO2TensorMapEntry**)calloc(new_task_window_size, sizeof(PTO2TensorMapEntry*));
+    if (!task_entry_heads) {
+        free(buckets); buckets = nullptr;
+        free(entry_pool); entry_pool = nullptr;
+        free(free_entry_list); free_entry_list = nullptr;
+        return false;
+    }
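
When several arrays are allocated in sequence, the repeated free-and-return blocks can be collapsed into a single rollback path. The sketch below is one possible shape, not the PR's implementation: `RingStorage` and its members are hypothetical, and it relies on the fact that `free(nullptr)` is a no-op.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

struct Entry {
    int32_t task_id;
    Entry* next;
};

// Hypothetical per-ring storage with a shared cleanup path.
struct RingStorage {
    Entry** buckets = nullptr;
    Entry* entry_pool = nullptr;
    Entry** task_entry_heads = nullptr;

    // Safe to call in any state: free(nullptr) does nothing.
    void destroy() {
        std::free(buckets);          buckets = nullptr;
        std::free(entry_pool);       entry_pool = nullptr;
        std::free(task_entry_heads); task_entry_heads = nullptr;
    }

    bool init(int32_t num_buckets, int32_t pool_size, int32_t window_size) {
        if (num_buckets <= 0 || pool_size <= 0 || window_size <= 0) return false;
        buckets = static_cast<Entry**>(
            std::calloc(static_cast<std::size_t>(num_buckets), sizeof(Entry*)));
        entry_pool = static_cast<Entry*>(
            std::calloc(static_cast<std::size_t>(pool_size), sizeof(Entry)));
        task_entry_heads = static_cast<Entry**>(
            std::calloc(static_cast<std::size_t>(window_size), sizeof(Entry*)));
        if (!buckets || !entry_pool || !task_entry_heads) {
            destroy();  // roll back whatever succeeded
            return false;
        }
        return true;
    }
};
```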

poursoul · 2026-03-17T12:18:56Z

src/a2a3/runtime/tensormap_and_ringbuffer/runtime/tensor.h

        bool is_raw_eq_shapes = false,
-        bool manual_dep = false) {
+        bool manual_dep = false,
+        uint8_t in_ring_id = TENSOR_RING_ID_NONE) {


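The tensor's ring_id must always be set. A tensor cannot be defined in scope A and then have its memory allocated by a task in scope B nested inside scope A, so the ring has to be specified at make_tensor time (equal to the current scope's ring_id), and view operations inherit it. When submit_task sees that a tensor's type is output, the task's ring_id must match the tensor's ring_id; otherwise it must report an error.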
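
The rule requested in this comment can be sketched as follows: bind the ring at creation, inherit it on views, and reject mismatched OUTPUT tensors at submission. The names `Scope`, `make_tensor` (with this signature), `view`, and `check_output_ring` are assumptions made for illustration; the real make_tensor() and submit_task signatures do not appear on this page.

```cpp
#include <cstdint>

constexpr uint8_t kRingIdNone = 0xFF;  // stand-in for TENSOR_RING_ID_NONE

enum class ArgKind { Input, Output, InOut };

struct Tensor {
    uint64_t addr = 0;
    uint64_t size = 0;
    uint8_t ring_id = kRingIdNone;

    // A view covers a sub-range and inherits the parent's ring.
    Tensor view(uint64_t offset, uint64_t len) const {
        Tensor v;
        v.addr = addr + offset;
        v.size = len;
        v.ring_id = ring_id;
        return v;
    }
};

struct Scope {
    uint8_t ring_id;  // ring of the currently open scope
};

// The ring is fixed when the tensor is created, from the current scope.
Tensor make_tensor(const Scope& scope, uint64_t addr, uint64_t size) {
    Tensor t;
    t.addr = addr;
    t.size = size;
    t.ring_id = scope.ring_id;
    return t;
}

// Submission-time check: an OUTPUT tensor must belong to the task's ring.
// Returns false where the real code would report an error.
bool check_output_ring(const Tensor& t, ArgKind kind, uint8_t task_ring) {
    if (kind != ArgKind::Output) return true;
    return t.ring_id == task_ring;
}
```

Under this sketch, a tensor created in an outer scope and passed as OUTPUT to a task in a nested scope with a different ring fails the check, which is the case the comment asks to reject.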

poursoul · 2026-03-17T12:20:54Z

src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_tensormap.h

+            // Per-ring chain: entries are ordered newest-first (head-insert).
+            // A stale entry means all subsequent entries are also stale — break.
            if (!entry_valid(*cur_entry)) {
-                cur_entry = next_entry;


This needs to be restored to the old logic: set next to nullptr. Whether to clean up the entries immediately can be decided later.
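
One plausible reading of this request is a lookup that, on meeting the first stale entry in a newest-first chain, cuts the chain at that point by nulling the link that leads to it, and does not reclaim the detached entries yet. The sketch below illustrates that reading only. `Entry`, `lookup_and_truncate`, and the staleness test `task_id < last_task_alive` are assumptions, not the PR's definitions.

```cpp
#include <cstdint>

struct Entry {
    uint64_t addr;    // tensor address this entry describes
    int32_t task_id;  // producer task; larger means newer
    Entry* next;      // next (older) entry in the chain
};

// Walks a newest-first chain. Returns the producer task id for addr, or -1.
// On the first stale entry, the chain is truncated there and the walk stops,
// since every later entry is older and therefore stale as well.
int32_t lookup_and_truncate(Entry*& head, uint64_t addr, int32_t last_task_alive) {
    Entry** link = &head;  // the pointer that leads to the current entry
    while (Entry* cur = *link) {
        if (cur->task_id < last_task_alive) {
            *link = nullptr;  // detach the stale tail; reclaiming it is deferred
            return -1;
        }
        if (cur->addr == addr) return cur->task_id;
        link = &cur->next;
    }
    return -1;
}
```

Detached entries are not returned to any free list here, which mirrors the comment's note that immediate cleanup is a separate decision.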

gemini-code-assist bot reviewed Mar 17, 2026

poursoul reviewed Mar 17, 2026

jvjhfhg force-pushed the refactor-tensormap branch 2 times, most recently from d00c195 to ec51fd1 · March 18, 2026 09:52

jvjhfhg force-pushed the refactor-tensormap branch from ec51fd1 to 0ef4a89 · March 18, 2026 11:00

jvjhfhg wants to merge 1 commit into hw-native-sys:main from jvjhfhg:refactor-tensormap
