Aanuf/fix for asym#4074
Open
andreyanufr wants to merge 55 commits into
Pull request overview
Fixes a quantization range bug in asymmetric weight compression where input weights whose [min, max] range does not include zero (all-positive or all-negative values) produced degenerate decompressed outputs. The fix forces the quantization range to always span zero by clamping min_values <= 0 and max_values >= 0 before computing the scale and zero point. The change is mirrored in both the reference NumPy/Tensor path and the optimized OpenVINO graph builder.
Changes:
- In the reference asymmetric path, clamp min_values and max_values so the range always includes zero before calling calculate_scale_zero_point.
- In the optimized OpenVINO model builder, perform the equivalent opset.minimum/opset.maximum against a 0.0 constant when computing min/max.
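The clamping described above can be sketched in plain Python. This is a minimal illustration, not the code from the PR: `zero_inclusive_range` is a hypothetical helper, and it works on a flat list rather than on per-channel tensors reduced along `reduction_axes`.

```python
def zero_inclusive_range(values):
    """Return (min, max) of `values`, widened so the interval contains 0.0."""
    lo = min(min(values), 0.0)  # min_values <= 0
    hi = max(max(values), 0.0)  # max_values >= 0
    return lo, hi


print(zero_inclusive_range([-22.0, -15.0]))  # all-negative input: upper bound becomes 0.0
print(zero_inclusive_range([3.0, 9.0]))      # all-positive input: lower bound becomes 0.0
print(zero_inclusive_range([-1.0, 2.0]))     # range already spans zero: unchanged
```

A range that already spans zero is left untouched, so only inputs whose values are all positive or all negative are affected.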
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/nncf/quantization/algorithms/weight_compression/weight_lowering.py | Adds zero-inclusive clamping to min/max inside the asymmetric branch of calculate_integer_quantization_params. |
| src/nncf/openvino/optimized_functions/models.py | Adds equivalent zero-inclusive clamping to min/max in _build_integer_quantization_model, but unconditionally rather than only for asymmetric mode. |
Comment on lines +321 to +323

    zero = fns.zeros_like(min_values)
    min_values = fns.minimum(zero, min_values)
    max_values = fns.maximum(zero, max_values)

Comment on lines 315 to +323

    if config.is_asym_mode:
        level_low = 0
        level_high = 2**num_bits - 1
        min_values = fns.min(weight, axis=reduction_axes, keepdims=True)  # [a1, r, a2] -> [a1, 1, a2]
        max_values = fns.max(weight, axis=reduction_axes, keepdims=True)  # [a1, r, a2] -> [a1, 1, a2]

        zero = fns.zeros_like(min_values)
        min_values = fns.minimum(zero, min_values)
        max_values = fns.maximum(zero, max_values)

Comment on lines 141 to 145

    example_inputs_numpy = example_input.detach().cpu().numpy()
    stripped_ov_output = torch.tensor(model(example_inputs_numpy)[0], device=example_input.device)

    # TODO(aanuf): fix input_low, input_range computation for AsymmetricQuantizer
    assert torch.allclose(tuned_output, stripped_output, atol=1e-1)

Changes
Fixed compression range for asymmetric compression if all values are positive or negative.
Reason for changes
For the vector [-22. -21. -20. -19. -18. -17. -16. -15.], the current implementation returns decompressed values from integer_quantize_dequantize_weight(..) equal to [-7. -7. -7. -7. -7. -7. -7. -7.]. This happens because zero_point before the clamp is -22 / scale = -22 * 255 / (-15 + 22) ≈ -801 and becomes 0 after the clamp, while the minimum value is -22 / scale ≈ -801 and the maximum value is -15 / scale ≈ -546, so after the clamp all values are equal to zero.
If 0 is added to the range of values, [-22. -21. -20. -19. -18. -17. -16. -15. 0.], then scale = 22/255, zero_point = -255, min_value = -255, max_value = 0, and the range is correct.
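The failure can be reproduced with a small pure-Python round trip. This is a sketch under stated assumptions, not the NNCF implementation: `quantize_dequantize` is a hypothetical helper, it handles a single flat group of weights, and it assumes the convention `zero_point = level_low - round(min / scale)` followed by a clamp to `[level_low, level_high]`, which may differ in sign from the notation used in the description.

```python
def quantize_dequantize(weights, include_zero, num_bits=8):
    """Asymmetric quantize -> dequantize round trip for a flat list of floats.

    Hypothetical helper for illustration; not the NNCF implementation.
    """
    level_low, level_high = 0, 2**num_bits - 1
    w_min, w_max = min(weights), max(weights)
    if include_zero:
        # The fix: widen the range so it always contains zero.
        w_min, w_max = min(w_min, 0.0), max(w_max, 0.0)

    scale = (w_max - w_min) / (level_high - level_low)
    # Assumed convention: zero_point = level_low - round(min / scale), clamped.
    zero_point = level_low - round(w_min / scale)
    zero_point = max(level_low, min(level_high, zero_point))

    dequantized = []
    for w in weights:
        q = round(w / scale) + zero_point
        q = max(level_low, min(level_high, q))  # clamp to the integer grid
        dequantized.append((q - zero_point) * scale)
    return dequantized


weights = [-22.0, -21.0, -20.0, -19.0, -18.0, -17.0, -16.0, -15.0]
broken = quantize_dequantize(weights, include_zero=False)
fixed = quantize_dequantize(weights, include_zero=True)
print([round(v, 2) for v in broken])  # every entry collapses to about -7
print([round(v, 2) for v in fixed])   # entries stay close to the original weights
```

Without the widened range, the zero point saturates at the clamp boundary and every quantized value lands on the same level, so the output degenerates to a constant. With zero included, the zero point fits in the integer range and each weight is recovered to within about half a quantization step.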
Related tickets
CVS-186919
Tests
Test examples - success