[OpenVINO] calibration_device parameter #4055
Open
daniil-lyakhov wants to merge 3 commits into
Pull request overview
Adds an OpenVINO-specific calibration_device advanced parameter to route calibration-time inference to a user-selected OpenVINO device (e.g., GPU) and makes non-OpenVINO backends explicitly reject this option.
Changes:
- Added `calibration_device: str | None` to the advanced quantization and compression parameter dataclasses (documented as OpenVINO-only).
- Implemented a contextvars-based `calibration_device_context()` and made `OVNativeEngine` compile models for the context-selected device.
- Propagated the option through the OpenVINO quantization and weight-compression flows, and added non-OV guards plus cross-framework tests.
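The context-variable mechanism described in the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `_CALIBRATION_DEVICE`, `get_calibration_device()`, and `build_compile_args()` are hypothetical names, the exact-match `device == "CPU"` check is an assumption about how the CPU-only FP32 rule might be decided, and the plain string key `"INFERENCE_PRECISION_HINT"` stands in for whatever property object the engine really passes.

```python
from __future__ import annotations

import contextvars
from contextlib import contextmanager
from typing import Iterator

# Hypothetical module-level variable; the default mirrors the previously hardcoded "CPU".
_CALIBRATION_DEVICE: contextvars.ContextVar[str] = contextvars.ContextVar(
    "calibration_device", default="CPU"
)


@contextmanager
def calibration_device_context(device: str | None) -> Iterator[None]:
    """Temporarily select the device used for calibration inference."""
    if device is None:
        # No override requested: leave the current value untouched.
        yield
        return
    token = _CALIBRATION_DEVICE.set(device)
    try:
        yield
    finally:
        # Restore the previous value even if calibration raises.
        _CALIBRATION_DEVICE.reset(token)


def get_calibration_device() -> str:
    """Return the device selected by the innermost active context."""
    return _CALIBRATION_DEVICE.get()


def build_compile_args(use_fp32_precision: bool = True) -> tuple[str, dict[str, str] | None]:
    """Return (device, config) the way an engine might before compiling a model."""
    device = get_calibration_device()
    config = None
    # Assumption: the FP32 precision hint is attached only for an exact "CPU" device.
    if use_fp32_precision and device == "CPU":
        config = {"INFERENCE_PRECISION_HINT": "f32"}
    return device, config
```

Using a `ContextVar` rather than a plain global means the selection is scoped to the `with` block and is restored on exit, so code outside the calibration flow keeps compiling for the default device.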
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `src/nncf/openvino/engine.py` | Introduces calibration_device_context() and makes OVNativeEngine compile on the context-selected device (CPU default). |
| `src/nncf/openvino/quantization/quantize_model.py` | Wraps OpenVINO calibration-related algorithm steps in calibration_device_context() to ensure device propagation. |
| `src/nncf/quantization/advanced_parameters.py` | Adds calibration_device field + docstring entries to AdvancedQuantizationParameters and AdvancedCompressionParameters. |
| `src/nncf/quantization/quantize_model.py` | Adds ParameterNotSupportedError guards for calibration_device on ONNX/Torch/TorchFX backends. |
| `tests/openvino/native/test_engine.py` | Adds unit test verifying calibration_device_context() affects OVNativeEngine compile device and CPU-only FP32 config behavior. |
| `tests/openvino/native/quantization/test_quantize_api.py` | Adds OpenVINO device-propagation tests for quantize and quantize_with_accuracy_control; refactors subset size validation test. |
| `tests/openvino/native/quantization/test_weights_compression.py` | Adds OpenVINO device-propagation test for compress_weights(...calibration_device=...). |
| `tests/cross_fw/test_templates/template_test_quantize_api.py` | Adds a cross-framework template asserting non-OV backends reject calibration_device for quantize. |
| `tests/torch/function_hook/quantization/test_quantize_api.py` | Instantiates the template for Torch Function Hook backend. |
| `tests/torch/fx/test_quantize_api.py` | Instantiates the template for Torch FX backend using a minimal exported model. |
| `tests/onnx/quantization/test_quantize_api.py` | Instantiates the template for ONNX and adds a quantize_with_accuracy_control rejection test. |
| `tests/cross_fw/test_templates/template_test_weights_compression.py` | Adds a cross-framework template test asserting non-OV backends reject calibration_device for compress_weights. |
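The non-OV guard behavior summarized for `src/nncf/quantization/quantize_model.py` might look something like the sketch below. Everything here is a stand-in: the local `ParameterNotSupportedError` class only mimics the NNCF error of the same name, and `AdvancedParams`, `check_calibration_device()`, and the backend strings are hypothetical names chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


class ParameterNotSupportedError(Exception):
    """Local stand-in for the NNCF error of the same name (hypothetical definition)."""


@dataclass
class AdvancedParams:
    """Hypothetical reduced parameter set carrying only the new field."""

    calibration_device: str | None = None


def check_calibration_device(backend: str, advanced_parameters: AdvancedParams | None) -> None:
    """Reject calibration_device on any backend other than OpenVINO."""
    if backend == "openvino" or advanced_parameters is None:
        return
    if advanced_parameters.calibration_device is not None:
        # Fail loudly instead of silently ignoring the option.
        raise ParameterNotSupportedError(
            f"calibration_device is supported only by the OpenVINO backend, got backend={backend!r}"
        )
```

Leaving the field at `None` keeps existing ONNX and Torch calls working unchanged; only an explicit device request on a non-OV backend triggers the error.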
Changes
Adds a `calibration_device` parameter to `AdvancedCompressionParameters` and `AdvancedQuantizationParameters` that lets users specify which OpenVINO device to use for calibration inference (e.g. `"GPU"`, `"AUTO:GPU,CPU"`).

Source changes:
- `src/nncf/openvino/engine.py`: Added the `calibration_device_context()` context manager, built on `contextvars.ContextVar`. `OVNativeEngine` reads the context variable to determine the compile device instead of hardcoding `"CPU"`. The FP32 inference precision config is applied only for the CPU device.
- `src/nncf/quantization/advanced_parameters.py`: Added the `calibration_device: str | None = None` field to both `AdvancedQuantizationParameters` and `AdvancedCompressionParameters`.
- `src/nncf/openvino/quantization/quantize_model.py`: Wraps calibration calls in `calibration_device_context()` for `quantize_impl`, `quantize_with_accuracy_control_impl`, and `compress_weights_impl`.
- `src/nncf/quantization/quantize_model.py`: Added `ParameterNotSupportedError` guards for `calibration_device` in non-OV backends (Torch, TorchFX, ONNX) across `compress_weights`, `quantize`, and `quantize_with_accuracy_control`.

Reason for changes
Up to 120x speedup for PI0.5 quantization calibration on Intel(R) Arc(TM) B580 Graphics.

Users need the ability to run calibration inference on a device other than CPU (e.g. GPU) for faster quantization and compression. This parameter is only meaningful for the OpenVINO backend; other backends now raise an explicit `ParameterNotSupportedError` instead of silently ignoring it.

Related tickets
184686
Tests
- `tests/openvino/native/test_engine.py`: Unit test for `calibration_device_context` and `OVNativeEngine` device propagation.
- `tests/cross_fw/test_templates/template_test_weights_compression.py`: Template test `test_compress_weights_calibration_device` verifying that `ParameterNotSupportedError` is raised on non-OV backends.
- `tests/openvino/native/quantization/test_weights_compression.py`: OV override verifying that the device is correctly passed through to `ov.Core.compile_model` via monkeypatch.
- `tests/cross_fw/test_templates/template_test_quantize_api.py`: New template `TemplateTestQuantizeApi` with `test_quantize_calibration_device` for non-OV backends.
- `tests/openvino/native/quantization/test_quantize_api.py`: OV overrides for `quantize` and `quantize_with_accuracy_control` verifying device propagation.
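The device-propagation checks listed above follow a common pattern: patch the compile call, run the flow, and inspect the recorded device. The sketch below illustrates that pattern with `unittest.mock.patch.object` rather than pytest's `monkeypatch`, and with stand-ins instead of OpenVINO: `FakeCore`, `calibrate()`, and `recorded_devices()` are hypothetical names, and `calibrate()` is only a placeholder for the real quantization flow.

```python
from __future__ import annotations

from unittest import mock


class FakeCore:
    """Stand-in for ov.Core; only the method under test is modelled."""

    def compile_model(self, model, device_name, config=None):
        return f"{model}@{device_name}"


def calibrate(model: str, calibration_device: str | None = None) -> str:
    """Hypothetical flow under test: compiles on the requested device, CPU by default."""
    device = calibration_device or "CPU"
    return FakeCore().compile_model(model, device_name=device)


def recorded_devices(calibration_device: str | None) -> list[str]:
    """Patch compile_model, run the flow, and return the devices it was called with."""
    with mock.patch.object(FakeCore, "compile_model", return_value="compiled") as spy:
        calibrate("model", calibration_device)
    return [call.kwargs["device_name"] for call in spy.call_args_list]
```

Asserting on the recorded `device_name` keeps such a test independent of whether a GPU is actually present on the machine running it.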