[QNN-EP] Add dump_onnx_subgraph for per-partition debugging by qti-niscmami · Pull Request #372 · onnxruntime/onnxruntime-qnn

qti-niscmami · 2026-05-12T21:27:32Z

Adds session config option dump_onnx_subgraph=1 (+ onnx_subgraph_dir=<path>)
that emits each QNN-claimed partition as a self-contained, runnable ONNX model —
captured inside CompileImpl just before QNN op-builders rewrite the partition's
op_types into QNN_OP_*. Each partition produces <fused_node_name>.onnx plus
a <fused_node_name>.onnx.data sidecar (ONNX standard external_data layout).
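For reviewers who want to try the option, here is a minimal sketch of the two config entries and the file names a partition should produce. The option names and the `<fused_node_name>.onnx` / `.onnx.data` naming come from the description above. The helper names `subgraph_dump_config` and `expected_dump_files` are hypothetical, and the sketch assumes the keys are passed unprefixed.

```python
from pathlib import Path


def subgraph_dump_config(dump_dir: str) -> dict:
    """Session config entries named in the PR description.

    Assumption: the keys are used exactly as written, with no EP-specific prefix.
    """
    return {"dump_onnx_subgraph": "1", "onnx_subgraph_dir": dump_dir}


def expected_dump_files(dump_dir: str, fused_node_name: str):
    """Return (model_path, sidecar_path) for one QNN-claimed partition."""
    model_path = Path(dump_dir) / f"{fused_node_name}.onnx"
    # The sidecar keeps the full model file name and appends ".data".
    sidecar_path = model_path.with_name(model_path.name + ".data")
    return model_path, sidecar_path
```

With the ONNX Runtime Python API, these entries would typically be applied through `SessionOptions.add_session_config_entry(key, value)` before the session is created. Whether this EP reads them from that exact location is not confirmed here.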

Description

Changes:

- New dump_onnx_subgraph_ / onnx_subgraph_dir_ members on QnnEp, parsed from session config (mirrors the existing dump_json_qnn_graph pattern)
- New helper qnn::DumpPartitionAsOnnxModel in core/providers/qnn/builder/onnx_subgraph_dumper.{h,cc}, which walks the per-partition OrtGraph via the C API and emits onnx::ModelProto
- Dump call wired into QnnEp::CompileImpl between fused_node_name resolution and qnn_model->ComposeGraph(...); skipped on the EPContext / DLC-context load paths
- Initializers stream to a .onnx.data sidecar via TensorProto.external_data rather than inline raw_data, which avoids protobuf's 2 GB single-message ceiling on large models
- Subgraph attributes (If/Loop bodies) hard-fail with a warning rather than silently producing an unrunnable dump
- CMake links onnx onnx_proto into the QNN EP plugin
- New unit test onnx_subgraph_dump_test.cc: QDQ Conv→Relu dump + negative test
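To illustrate the external_data layout mentioned in the list above, the following is a simplified, standard-library-only sketch. It is not the C++ code in this PR. It appends each initializer's raw bytes to one sidecar file and records the `location`, `offset`, and `length` keys that ONNX external_data entries use. The function names are hypothetical. Alignment padding is ignored, and offsets are kept as integers, whereas ONNX stores them as strings.

```python
import os


def write_sidecar(initializers: dict, sidecar_path: str) -> dict:
    """Write raw tensor bytes back to back into a sidecar file.

    Returns {tensor_name: {"location", "offset", "length"}} so each tensor
    can be located later without loading the rest of the file.
    """
    entries = {}
    # external_data locations are relative to the model file's directory.
    location = os.path.basename(sidecar_path)
    with open(sidecar_path, "wb") as f:
        for name, raw in initializers.items():
            offset = f.tell()  # byte position where this tensor starts
            f.write(raw)
            entries[name] = {"location": location, "offset": offset, "length": len(raw)}
    return entries


def read_tensor(model_dir: str, entry: dict) -> bytes:
    """Read one tensor's bytes back using its recorded entry."""
    with open(os.path.join(model_dir, entry["location"]), "rb") as f:
        f.seek(entry["offset"])
        return f.read(entry["length"])
```

Because the model file then holds only these small entries instead of the weights themselves, the serialized protobuf stays far below the 2 GB limit regardless of initializer size.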

Motivation and Context

The QNN-side view of a partition is dumpable today (dump_json_qnn_graph,
qnn_saver_path, dump_qnn_ir_dlc, APIREC), but the ONNX-side view of the same
partition is not — making per-partition correlation, Netron inspection, and
isolated end-to-end debug runs painful. This option closes that gap. Each dumped
file passes onnx.checker.check_model and round-trips cleanly through a fresh
ORT+QNN session to a single all-HTP EPContext node.
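The checker step described above could be scripted roughly as follows. This is a sketch, not part of the PR: `check_dumped_partitions` is a hypothetical helper, and it assumes the `onnx` Python package is installed when no custom checker is passed. It hands the checker a file path rather than a loaded model, so the `.onnx.data` sidecar is resolved relative to the model file.

```python
from pathlib import Path


def check_dumped_partitions(dump_dir, checker=None) -> dict:
    """Run a checker over every *.onnx file in dump_dir.

    Returns {file_name: None} on success or {file_name: error_message} on failure.
    """
    if checker is None:
        # Imported lazily so the helper can be used with a custom checker
        # on machines without the onnx package.
        import onnx
        checker = onnx.checker.check_model
    results = {}
    # "*.onnx" does not match the "*.onnx.data" sidecars.
    for model_path in sorted(Path(dump_dir).glob("*.onnx")):
        try:
            checker(str(model_path))
            results[model_path.name] = None
        except Exception as exc:  # report per file instead of stopping at the first failure
            results[model_path.name] = str(exc)
    return results
```

The round trip through a fresh ORT+QNN session is not covered here, since it needs a QNN-capable runtime.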

qti-chuteng · 2026-05-13T02:42:45Z

Hi @qti-niscmami, this implementation will add onnx and onnx_proto as new dependencies. If these are necessary, that's fine, but otherwise, we prefer not to increase dependencies in our codebase. Also, expanding the config group may need further discussion.

I believe this functionality could be achieved by using a Python script for decoupling. Could you share your thoughts on this approach?

qti-niscmami added 6 commits May 12, 2026 14:23

Initial implementation of the onnx subgraph

8a05b57

Cleanup guard for onnx subgraph dump

5997d21

Fix unit tests

fd05f4f

Fix CMAKE build force third-party as SYSTEM

92f48e9

Fix dump_onnx_subgraph CI failures on aarch64 + lint

970bafc

Cleanup md description for flag

781bb9b

qti-niscmami wants to merge 6 commits into main from dev/qti-niscmami/onnx_subgraph_feature
