Dev fused nms by ShawnXuan · Pull Request #10643 · Oneflow-Inc/oneflow

ShawnXuan · 2025-06-01T13:00:49Z

This PR mainly adds support for the NPU-compatible SortedNMS operator; to that end, sorted_score and input_indices inputs have been added.

Copilot

Pull Request Overview

Adds support for an NPU-compatible SortedNMS operator by extending both the Python-level API and underlying C++ implementations to accept sorted scores and input indices.

Python wrapper (nms_op) now branches for NPU to gather and pass sorted_scores and score_inds.
Core op registration and functor logic updated to include optional scores and input_indices, with new fused operator for NPU.
Data type inference and API signature modified for NPU support in nms_op.cpp and functional_api.yaml.
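To make the "sorted scores plus indices" idea concrete, here is a minimal CPU sketch of a sorted NMS in plain C++. It is not the OneFlow kernel or the NPU operator. The `Box` struct and the helpers `IoU`, `ArgsortDescending` and `SortedNmsMask` are hypothetical names introduced only for illustration, and the greedy suppression rule is the textbook one, assumed rather than taken from the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical box type for illustration: corners (x1, y1) and (x2, y2).
struct Box {
  float x1, y1, x2, y2;
};

// Intersection-over-union of two axis-aligned boxes.
float IoU(const Box& a, const Box& b) {
  const float iw = std::max(0.0f, std::min(a.x2, b.x2) - std::max(a.x1, b.x1));
  const float ih = std::max(0.0f, std::min(a.y2, b.y2) - std::max(a.y1, b.y1));
  const float inter = iw * ih;
  const float area_a = (a.x2 - a.x1) * (a.y2 - a.y1);
  const float area_b = (b.x2 - b.x1) * (b.y2 - b.y1);
  const float uni = area_a + area_b - inter;
  return uni > 0.0f ? inter / uni : 0.0f;
}

// Indices that order `scores` from highest to lowest (stable for ties).
std::vector<int64_t> ArgsortDescending(const std::vector<float>& scores) {
  std::vector<int64_t> inds(scores.size());
  std::iota(inds.begin(), inds.end(), int64_t{0});
  std::stable_sort(inds.begin(), inds.end(),
                   [&](int64_t a, int64_t b) { return scores[a] > scores[b]; });
  return inds;
}

// Greedy NMS over boxes visited in `score_inds` order.
// Returns a keep mask aligned with the sorted order (1 = keep, 0 = suppressed).
std::vector<int8_t> SortedNmsMask(const std::vector<Box>& boxes,
                                  const std::vector<int64_t>& score_inds,
                                  float iou_threshold) {
  assert(boxes.size() == score_inds.size());
  const std::size_t n = score_inds.size();
  std::vector<int8_t> keep(n, 1);
  for (std::size_t i = 0; i < n; ++i) {
    if (!keep[i]) continue;  // a suppressed box cannot suppress others
    for (std::size_t j = i + 1; j < n; ++j) {
      if (keep[j] && IoU(boxes[score_inds[i]], boxes[score_inds[j]]) > iou_threshold) {
        keep[j] = 0;
      }
    }
  }
  return keep;
}
```

In this sketch the caller computes the ordering once and passes it in, which mirrors the PR's approach of handing the operator pre-sorted scores and their indices instead of having the operator sort internally.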

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

File	Description
python/oneflow/nn/modules/nms.py	Branch NPU path to gather sorted scores and pass indices
oneflow/user/ops/nms_op.cpp	InferNmsDataType now sets different dtype for NPU
oneflow/ir/include/OneFlow/OneFlowUserOps.td	Added optional `scores` and `input_indices` inputs
oneflow/core/functional/impl/nn_functor.cpp	Built `fused_op_` and dispatch based on device and inputs
oneflow/core/functional/impl/array_functor.cpp	Route `argwhere` through CPU for NPU
oneflow/core/functional/functional_api.yaml	Updated `nms` signature with new optional tensor args

Comments suppressed due to low confidence (1)

oneflow/user/ops/nms_op.cpp:34

The CPU path sets output dtype to Int8, but NMS usually returns index tensors; Int32 or Int64 may be more appropriate.

ctx->SetOutputDType("out", 0, DataType::kInt8);
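One possible reading of the Int8 output, which is an assumption and is not confirmed by this excerpt, is that `out` is a per-box keep mask rather than an index tensor, with indices recovered afterwards (the PR also touches `argwhere`). Under that assumption, the conversion from mask to original box indices could look like the sketch below. `KeptOriginalIndices`, `keep_mask` and `score_inds` are illustrative names, not OneFlow APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Turn a keep mask (aligned with the score-sorted order) into indices into the
// original, unsorted box list. `score_inds[i]` is assumed to hold the original
// index of the i-th highest-scoring box.
std::vector<int64_t> KeptOriginalIndices(const std::vector<int8_t>& keep_mask,
                                         const std::vector<int64_t>& score_inds) {
  assert(keep_mask.size() == score_inds.size());
  std::vector<int64_t> kept;
  for (std::size_t i = 0; i < keep_mask.size(); ++i) {
    // A nonzero mask entry plays the role of an "argwhere" hit.
    if (keep_mask[i] != 0) kept.push_back(score_inds[i]);
  }
  return kept;
}
```

If the output really is a mask, a narrow integer type is enough for it, and only the final gathered result needs a wide index type.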

oneflow/user/ops/nms_op.cpp

Copilot · 2025-06-01T13:01:53Z

oneflow/core/functional/impl/nn_functor.cpp

-    return OpInterpUtil::Dispatch<Tensor>(*op_, {x}, attrs);
+    DeviceType device_type = JUST(x->device())->enum_type();
+    if (device_type == DeviceType::kNPU) {
+      if (scores) {


The fused NMS path checks only scores; it should verify both scores and input_indices are provided to avoid passing a null optional downstream.

Suggested change

-      if (scores) {
+      if (scores && input_indices) {
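The guard suggested above can be illustrated in isolation. The sketch below uses `std::optional` from C++17 as a stand-in for OneFlow's optional tensor arguments. `FakeTensor`, `Device` and `SelectNmsPath` are hypothetical names, and the strings "fused" and "default" only label the two branches; none of this reproduces the real functor.

```cpp
#include <optional>
#include <string>
#include <vector>

// Stand-in for a tensor handle; not OneFlow's Tensor type.
using FakeTensor = std::vector<float>;

enum class Device { kCPU, kNPU };

// Returns a label for the path a call would take. The fused path is chosen
// only when BOTH optional inputs are present, so neither one is ever
// dereferenced while empty; every other combination falls back.
std::string SelectNmsPath(Device device, const std::optional<FakeTensor>& scores,
                          const std::optional<FakeTensor>& input_indices) {
  if (device == Device::kNPU && scores.has_value() && input_indices.has_value()) {
    return "fused";
  }
  return "default";
}
```

Whether a partially specified call (only one of the two inputs) should fall back silently, as in this sketch, or raise an error is a design choice the review thread leaves open.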

oneflow/core/functional/functional_api.yaml

github-actions · 2025-06-01T13:02:04Z

Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.

Flowingsun007 · 2025-06-03T04:16:41Z

oneflow/core/functional/functional_api.yaml


 - name: "nms"
-  signature: "Tensor (Tensor x, Float iou_threshold, Int32 keep_n=-1) => Nms"
+  signature: "Tensor (Tensor x, Tensor scores=None, Tensor input_indices=None, Float iou_threshold, Int32 keep_n=-1) => Nms"


Could you check whether it is necessary or convenient to export a separate API/functor for the NPU NMS?

ShawnXuan and others added 6 commits May 29, 2025 09:53

Add NPU support for nms_op by dispatching to device-specific implementation

a76191c

fused nms

119e383

auto format by CI

670ee0a

Ensure scores is present or fall back to the non-fused implementation.

87969be

update

86a56cc

use score_inds

6d85296

ShawnXuan requested review from Flowingsun007, Copilot and oneflow-ci-bot June 1, 2025 13:00

ShawnXuan requested review from hjchen2 and jackalcooper as code owners June 1, 2025 13:00

ShawnXuan added op api labels Jun 1, 2025

Copilot AI reviewed Jun 1, 2025

auto format by CI

f171b46

rm LOG(ERROR)

4393f7a

ShawnXuan added the feature label Jun 2, 2025

Flowingsun007 reviewed Jun 3, 2025

Merge branch 'master' into dev_fused_nms

e0647bf

ShawnXuan wants to merge 9 commits into master from dev_fused_nms

4 participants
