[QNN-EP] Add dump_onnx_subgraph for per-partition debugging #372
Draft
qti-niscmami wants to merge 6 commits into
Collaborator
Hi @qti-niscmami, this implementation will add onnx and onnx_proto as new dependencies. If these are necessary, that's fine, but otherwise, we prefer not to increase dependencies in our codebase. Also, expanding the config group may need further discussion. I believe this functionality could be achieved by using a Python script for decoupling. Could you share your thoughts on this approach?
Adds a session config option `dump_onnx_subgraph=1` (+ `onnx_subgraph_dir=<path>`) that emits each QNN-claimed partition as a self-contained, runnable ONNX model, captured inside `CompileImpl` just before QNN op-builders rewrite the partition's `op_type`s into `QNN_OP_*`. Each partition produces `<fused_node_name>.onnx` plus a `<fused_node_name>.onnx.data` sidecar (ONNX standard `external_data` layout).

Description
Changes:
- `dump_onnx_subgraph_` / `onnx_subgraph_dir_` members on `QnnEp`, parsed from session config (mirrors the existing `dump_json_qnn_graph` pattern)
- `qnn::DumpPartitionAsOnnxModel` in `core/providers/qnn/builder/onnx_subgraph_dumper.{h,cc}`: walks the per-partition `OrtGraph` via the C API and emits `onnx::ModelProto`
- Call site in `QnnEp::CompileImpl`, between `fused_node_name` resolution and `qnn_model->ComposeGraph(...)`; skipped on the EPContext / DLC-context load paths
- Tensor data goes to the `.onnx.data` sidecar via `TensorProto.external_data` rather than inline `raw_data`, which avoids protobuf's 2 GB single-message ceiling on large models silently producing an unrunnable dump
- Links `onnx` `onnx_proto` into the QNN EP plugin
- `onnx_subgraph_dump_test.cc`: QDQ Conv→Relu dump + negative test

Motivation and Context
The QNN-side view of a partition is dumpable today (`dump_json_qnn_graph`, `qnn_saver_path`, `dump_qnn_ir_dlc`, APIREC), but the ONNX-side view of the same partition is not, which makes per-partition correlation, Netron inspection, and isolated end-to-end debug runs painful. This option closes that gap. Each dumped file passes `onnx.checker.check_model` and round-trips cleanly through a fresh ORT+QNN session to a single all-HTP `EPContext` node.
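
For readers unfamiliar with the `external_data` layout mentioned in the change list, here is a rough, stdlib-only Python sketch of the idea: tensor payloads are appended to one sidecar file, and the model itself keeps only a small `location` / `offset` / `length` reference per tensor, so the serialized model stays far below the protobuf size limit. This is not the PR's C++ code. `ExternalRef`, `write_sidecar`, `read_tensor`, the file name `part0.onnx.data`, and the 64-byte alignment are all hypothetical names and choices made for illustration.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ExternalRef:
    """Stand-in for the key/value entries kept in TensorProto.external_data."""
    location: str  # sidecar file name, relative to the .onnx file
    offset: int    # byte offset of the payload inside the sidecar
    length: int    # payload size in bytes


def write_sidecar(sidecar: Path, tensors: dict, align: int = 64) -> dict:
    """Write every tensor's raw bytes into one sidecar file; return name -> ExternalRef."""
    refs = {}
    with open(sidecar, "wb") as f:
        for name, payload in tensors.items():
            # Pad so each payload starts on an aligned offset
            # (the alignment value is an assumption, not taken from the PR).
            f.write(b"\0" * ((-f.tell()) % align))
            refs[name] = ExternalRef(sidecar.name, f.tell(), len(payload))
            f.write(payload)
    return refs


def read_tensor(model_dir: Path, ref: ExternalRef) -> bytes:
    """Resolve a reference back to raw bytes, as a loader would at session creation."""
    with open(model_dir / ref.location, "rb") as f:
        f.seek(ref.offset)
        return f.read(ref.length)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        out_dir = Path(tmp)
        weights = {"conv.weight": bytes(range(100)), "conv.bias": b"\x02" * 7}
        refs = write_sidecar(out_dir / "part0.onnx.data", weights)
        for name, ref in refs.items():
            # Each payload read through its reference matches what was written.
            assert read_tensor(out_dir, ref) == weights[name]
            print(name, ref)
```

Because the model file carries only these references, its size depends on the graph structure rather than on the weight bytes, which is the property the sidecar bullet relies on.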