Add FP8 support for the ONNX backend by andrey-churkin · Pull Request #4072 · openvinotoolkit/nncf

andrey-churkin · 2026-05-15T08:24:26Z

Changes

Add support for nncf.CompressWeightsMode.FP8_E4M3 mode in the nncf.compress_weights() method for the ONNX backend.
Add support for quantization using nncf.QuantizationMode.FP8_E4M3 and nncf.QuantizationMode.FP8_E5M2 modes in the nncf.quantize() method for the ONNX backend.
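A minimal usage sketch of the two entry points named above, for illustration only. It assumes the model is an already loaded `onnx.ModelProto` and that `calibration_items` is an iterable of model inputs in whatever form the ONNX backend expects. The wrapper names `compress_weights_fp8` and `quantize_fp8` are hypothetical and not part of NNCF. The FP8 enum members are taken from this PR description and may not be available in NNCF builds that predate it.

```python
def compress_weights_fp8(model):
    """Compress the weights of an ONNX model to FP8 E4M3 (hypothetical wrapper)."""
    # Imported lazily so the sketch can be read and loaded without NNCF installed.
    import nncf

    return nncf.compress_weights(model, mode=nncf.CompressWeightsMode.FP8_E4M3)


def quantize_fp8(model, calibration_items, use_e5m2=False):
    """Quantize an ONNX model in one of the two FP8 modes (hypothetical wrapper)."""
    import nncf

    # Pick between the two FP8 quantization modes mentioned in the PR description.
    mode = nncf.QuantizationMode.FP8_E5M2 if use_e5m2 else nncf.QuantizationMode.FP8_E4M3
    # nncf.Dataset wraps an iterable of calibration samples.
    dataset = nncf.Dataset(calibration_items)
    return nncf.quantize(model, dataset, mode=mode)
```

Both wrappers return a new model object produced by NNCF; whether a given model benefits from E4M3 or E5M2 is something to verify on the actual accuracy metric.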

Reason for changes

Add support for FP8 quantization and weight compression in the ONNX backend.

Related tickets

CVS-180789

Tests

TBD

Weight compression - success

daniil-lyakhov

No major comments, please add some tests

daniil-lyakhov · 2026-05-18T11:27:38Z

+                get_weight_quantization_axis(node, target_point.port_id) if target_point.is_weight_target_point() else 1
+            )
+        onnx_parameters = convert_fc_params_to_onnx_params(parameters, axis)
+        nncf_input_node_next_nodes = ONNXMinMaxAlgoBackend._get_input_edges_mapping(nncf_graph)


Potential point for a future optimization. Maybe we could add a comment to highlight it.

daniil-lyakhov · 2026-05-18T11:30:45Z

+        if weight_dtype == onnx.TensorProto.FLOAT8E4M3FN:
+            np_dtype = helper.tensor_dtype_to_np_dtype(weight_dtype)
+            vals = onnx.numpy_helper.saturate_cast(np.asarray(quantized_weights), np_dtype).flatten()
+        else:
+            vals = quantized_weights


Two similar code blocks, maybe worth a private method?
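To make the saturating cast in the snippet above concrete, here is a pure-Python sketch of what saturating to FP8 E4M3FN means numerically. It is not the `onnx` implementation: the function name `saturate_to_e4m3fn` is hypothetical, and round-to-nearest-even is assumed. The format constants (3 mantissa bits, smallest normal exponent -6, largest finite value 448) are those of E4M3FN.

```python
import math

E4M3FN_MAX = 448.0           # largest finite FP8 E4M3FN value
E4M3FN_MIN_NORMAL_EXP = -6   # exponent of the smallest normal value (2**-6)
E4M3FN_MANTISSA_BITS = 3     # explicit mantissa bits


def saturate_to_e4m3fn(x: float) -> float:
    """Round x to the nearest E4M3FN value, clamping out-of-range inputs."""
    if math.isnan(x):
        return x  # NaN stays NaN
    sign = math.copysign(1.0, x)
    # Saturation: anything beyond the finite range (including inf) maps to +/-448.
    magnitude = min(abs(x), E4M3FN_MAX)
    if magnitude == 0.0:
        return sign * 0.0
    # frexp gives magnitude = m * 2**e with 0.5 <= m < 1, so the exponent
    # of the normalized form 1.f * 2**k is k = e - 1.
    _, e = math.frexp(magnitude)
    exponent = max(e - 1, E4M3FN_MIN_NORMAL_EXP)  # subnormals share the minimum exponent
    step = 2.0 ** (exponent - E4M3FN_MANTISSA_BITS)  # spacing of representable values
    # Python's round() on a float rounds half to even.
    return sign * round(magnitude / step) * step


print(saturate_to_e4m3fn(1000.0))  # 448.0 (clamped instead of overflowing)
print(saturate_to_e4m3fn(0.3))     # 0.3125
print(saturate_to_e4m3fn(17.0))    # 16.0 (tie between 16 and 18 goes to even)
```

Without saturation, an out-of-range value would have no finite E4M3FN encoding, which is why the weights are clamped before being stored in the initializer.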

Add FP8 quantization support for the ONNX backend

cd0f9fb

andrey-churkin requested a review from a team as a code owner May 15, 2026 08:24

update test

1dc817f

github-actions Bot added the NNCF ONNX label May 15, 2026

andrey-churkin requested review from andreyanufr and daniil-lyakhov May 15, 2026 13:11

daniil-lyakhov reviewed May 18, 2026

andrey-churkin wants to merge 2 commits into openvinotoolkit:develop from andrey-churkin:ac/fp8_onnx

andrey-churkin commented May 15, 2026 •

edited by github-actions Bot

Loading

Uh oh!

daniil-lyakhov left a comment

Uh oh!

daniil-lyakhov May 18, 2026

Uh oh!

daniil-lyakhov May 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

andrey-churkin commented May 15, 2026 • edited by github-actions Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

Reason for changes

Related tickets

Tests

Uh oh!

daniil-lyakhov left a comment

Choose a reason for hiding this comment

Uh oh!

daniil-lyakhov May 18, 2026

Choose a reason for hiding this comment

Uh oh!

daniil-lyakhov May 18, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

andrey-churkin commented May 15, 2026 •

edited by github-actions Bot

Loading