[QNN-EP] Add dump_onnx_subgraph for per-partition debugging #372

Draft
qti-niscmami wants to merge 6 commits into main from dev/qti-niscmami/onnx_subgraph_feature
@qti-niscmami (Collaborator) commented May 12, 2026

Adds session config option dump_onnx_subgraph=1 (+ onnx_subgraph_dir=<path>)
that emits each QNN-claimed partition as a self-contained, runnable ONNX model —
captured inside CompileImpl just before QNN op-builders rewrite the partition's
op_types into QNN_OP_*. Each partition produces <fused_node_name>.onnx plus
a <fused_node_name>.onnx.data sidecar (ONNX standard external_data layout).
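
As a rough illustration of the naming scheme described above, the sketch below derives the two output paths for one partition. The helper names `dump_enabled` and `dump_paths` are hypothetical, and the bare config key `dump_onnx_subgraph` is assumed to be used without any prefix; the actual parsing lives in the C++ EP and may differ.

```python
from pathlib import Path


def dump_enabled(session_config):
    """Return True when the (assumed) config key is set to "1"."""
    return session_config.get("dump_onnx_subgraph") == "1"


def dump_paths(out_dir, fused_node_name):
    """Return (model_path, sidecar_path) for one partition.

    Follows the layout stated in the description:
    <fused_node_name>.onnx plus <fused_node_name>.onnx.data.
    """
    model_path = Path(out_dir) / f"{fused_node_name}.onnx"
    # Append ".data" to the full file name rather than replacing the suffix.
    data_path = model_path.with_name(model_path.name + ".data")
    return model_path, data_path
```

The fused node name is used verbatim here; whether the real implementation sanitizes it for the filesystem is not stated in this PR.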

Description

Changes:

  • New dump_onnx_subgraph_ / onnx_subgraph_dir_ members on QnnEp, parsed
    from session config (mirrors the existing dump_json_qnn_graph pattern)
  • New helper qnn::DumpPartitionAsOnnxModel in
    core/providers/qnn/builder/onnx_subgraph_dumper.{h,cc} — walks the
    per-partition OrtGraph via the C API and emits onnx::ModelProto
  • Dump call wired into QnnEp::CompileImpl between fused_node_name resolution
    and qnn_model->ComposeGraph(...); skipped on the EPContext / DLC-context
    load paths
  • Initializers stream to a .onnx.data sidecar via TensorProto.external_data
    rather than inline raw_data — avoids protobuf's 2 GB single-message ceiling
    on large models
  • Partitions containing subgraph attributes (If/Loop bodies) hard-fail the
    dump and log a warning, rather than silently producing an unrunnable dump
  • CMake links onnx onnx_proto into the QNN EP plugin
  • New unit test onnx_subgraph_dump_test.cc: QDQ Conv→Relu dump + negative test
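
To make the sidecar idea in the list above concrete, here is a minimal, stdlib-only sketch of the bookkeeping behind an external-data layout: raw tensor bytes go into one `.onnx.data` file, and each tensor keeps only a `location` / `offset` / `length` record. Those three key names match the ONNX `external_data` entries (where values are stored as strings); the functions `write_sidecar` and `read_tensor` are hypothetical and are not the PR's C++ helper.

```python
import os


def write_sidecar(tensors, data_path):
    """Write each tensor's raw bytes back to back into one sidecar file.

    `tensors` maps tensor name -> raw bytes. Returns, per tensor, the
    location/offset/length record that would go into external_data.
    """
    location = os.path.basename(data_path)  # stored relative to the model file
    entries = {}
    with open(data_path, "wb") as f:
        for name, raw in tensors.items():
            offset = f.tell()  # byte position where this tensor starts
            f.write(raw)
            entries[name] = {
                "location": location,
                "offset": offset,
                "length": len(raw),
            }
    return entries


def read_tensor(entry, model_dir):
    """Read one tensor's bytes back using its external_data record."""
    with open(os.path.join(model_dir, entry["location"]), "rb") as f:
        f.seek(entry["offset"])
        return f.read(entry["length"])
```

Because the model file then carries only these small records, its serialized size stays far below the 2 GB protobuf limit regardless of how large the initializers are.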

Motivation and Context

The QNN-side view of a partition is dumpable today (dump_json_qnn_graph,
qnn_saver_path, dump_qnn_ir_dlc, APIREC), but the ONNX-side view of the same
partition is not — making per-partition correlation, Netron inspection, and
isolated end-to-end debug runs painful. This option closes that gap. Each dumped
file passes onnx.checker.check_model and round-trips cleanly through a fresh
ORT+QNN session to a single all-HTP EPContext node.
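
A check like the one mentioned above could be scripted roughly as follows. This is a sketch, not the PR's test: `find_dumps` and `check_dumps` are hypothetical names, and it assumes the `onnx` Python package is installed. Passing a file path (rather than a loaded model) to `onnx.checker.check_model` lets the checker handle models with external data.

```python
from pathlib import Path


def find_dumps(dump_dir):
    """Return the dumped model files, sorted; *.onnx.data sidecars are skipped."""
    return sorted(Path(dump_dir).glob("*.onnx"))


def check_dumps(dump_dir):
    """Run the ONNX checker on every dumped partition and return their names."""
    import onnx  # imported lazily so find_dumps works without onnx installed

    checked = []
    for model_path in find_dumps(dump_dir):
        # Raises onnx.checker.ValidationError if the model is malformed.
        onnx.checker.check_model(str(model_path))
        checked.append(model_path.name)
    return checked
```

Usage would be `check_dumps("<path>")`, with `<path>` being whatever directory was given as `onnx_subgraph_dir`.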

@qti-chuteng (Collaborator) commented

Hi @qti-niscmami, this implementation adds onnx and onnx_proto as new dependencies. If they are necessary, that's fine; otherwise, we prefer not to add dependencies to our codebase. Also, expanding the config group may need further discussion.

I believe this functionality could be achieved with a Python script, which would keep it decoupled from the EP. Could you share your thoughts on this approach?
