Aanuf/fix for asym #4074

Open
andreyanufr wants to merge 55 commits into openvinotoolkit:develop from andreyanufr:aanuf/fix_for_asym

Conversation

@andreyanufr
Collaborator

@andreyanufr andreyanufr commented May 15, 2026

Changes

Fixed the compression range for asymmetric compression when all values are positive or all are negative.

Reason for changes

For the vector [-22. -21. -20. -19. -18. -17. -16. -15.] the current implementation gives decompressed values after integer_quantize_dequantize_weight(..) equal to [-7. -7. -7. -7. -7. -7. -7. -7.]. With scale = (-15 + 22) / 255, the zero_point before the clamp is 22 / scale = 22 * 255 / (-15 + 22) ≈ 801, and after the clamp to [0, 255] it is 255. The min value is -22 / scale ≈ -801 and the max value is -15 / scale ≈ -546, so after adding zero_point and clamping, all quantized values are equal to zero and every weight is decompressed to (0 - 255) * scale = -7.

But if 0 is added to the range of values, [-22. -21. -20. -19. -18. -17. -16. -15. 0.], then scale = 22/255, zero_point = 255, min value = -22/scale = -255 and max value = 0/scale = 0, so the quantized values span the full [0, 255] range and the result is correct.
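The arithmetic above can be reproduced with a small NumPy sketch. The helper `quantize_dequantize_asym` and its `include_zero` flag are hypothetical names used only for illustration; this is not the NNCF implementation, and it assumes 8-bit levels [0, 255] and a non-zero value range.

```python
import numpy as np


def quantize_dequantize_asym(weight, include_zero, num_bits=8):
    """Asymmetric quantize/dequantize of a 1-D array (illustrative helper, not NNCF code)."""
    level_low, level_high = 0, 2**num_bits - 1
    min_value, max_value = float(weight.min()), float(weight.max())
    if include_zero:
        # The fix described above: widen the range so that it always contains zero.
        min_value = min(min_value, 0.0)
        max_value = max(max_value, 0.0)
    # Assumes max_value > min_value, so the scale is non-zero.
    scale = (max_value - min_value) / (level_high - level_low)
    zero_point = np.clip(level_low - np.round(min_value / scale), level_low, level_high)
    quantized = np.clip(np.round(weight / scale + zero_point), level_low, level_high)
    return (quantized - zero_point) * scale


weight = np.arange(-22.0, -14.0)  # [-22, -21, ..., -15]
broken = quantize_dequantize_asym(weight, include_zero=False)  # every element is about -7
fixed = quantize_dequantize_asym(weight, include_zero=True)  # close to the original weights
```

With `include_zero=True` the step size grows from 7/255 to 22/255, so the fix trades some resolution for a zero point that fits into the level range.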

Related tickets

CVS-186919

Tests

Test examples - success

alexsu52 and others added 30 commits September 2, 2024 13:22
Copilot AI review requested due to automatic review settings May 15, 2026 14:29
@andreyanufr andreyanufr requested a review from a team as a code owner May 15, 2026 14:29
Contributor

Copilot AI left a comment


Pull request overview

Fixes a quantization range bug in asymmetric weight compression where input weights whose [min, max] range does not include zero (all-positive or all-negative values) produced degenerate decompressed outputs. The fix forces the quantization range to always span zero by clamping min_values <= 0 and max_values >= 0 before computing the scale and zero point. The change is mirrored in both the reference NumPy/Tensor path and the optimized OpenVINO graph builder.

Changes:

  • In the reference asymmetric path, clamp min_values and max_values so the range always includes zero before calling calculate_scale_zero_point.
  • In the optimized OpenVINO model builder, perform the equivalent opset.minimum/opset.maximum against a 0.0 constant when computing min/max.
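Why the range has to contain zero can be read off the usual affine quantization formulas. The notation below ($w_{\min}$, $w_{\max}$, scale $s$, zero point $z$, top level $L$ for $b$ bits) is assumed for illustration and is not taken from the NNCF sources; it also assumes $w_{\max} > w_{\min}$.

```latex
\begin{aligned}
s &= \frac{w_{\max} - w_{\min}}{L}, \qquad L = 2^{b} - 1,\\
z &= \operatorname{clip}\!\left(-\operatorname{round}\!\left(\frac{w_{\min}}{s}\right),\ 0,\ L\right),\\
-\frac{w_{\min}}{s} &= L \cdot \frac{-w_{\min}}{w_{\max} - w_{\min}} \in [0, L]
\iff w_{\min} \le 0 \le w_{\max}.
\end{aligned}
```

When the condition on the last line fails, the clip changes the zero point, and the quantized values are then clipped as well, which is the degenerate case this pull request addresses.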

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/nncf/quantization/algorithms/weight_compression/weight_lowering.py | Adds zero-inclusive clamping to min/max inside the asymmetric branch of calculate_integer_quantization_params. |
| src/nncf/openvino/optimized_functions/models.py | Adds equivalent zero-inclusive clamping to min/max in _build_integer_quantization_model, but unconditionally rather than only for asymmetric mode. |

Comment thread src/nncf/openvino/optimized_functions/models.py Outdated
Comment on lines +321 to +323
zero = fns.zeros_like(min_values)
min_values = fns.minimum(zero, min_values)
max_values = fns.maximum(zero, max_values)
@andreyanufr andreyanufr marked this pull request as draft May 15, 2026 14:55
@github-actions github-actions Bot added the NNCF OpenVINO label May 21, 2026
@github-actions github-actions Bot added the NNCF PT and NNCF ONNX labels May 22, 2026
@andreyanufr andreyanufr marked this pull request as ready for review May 22, 2026 13:55
@andreyanufr andreyanufr requested a review from Copilot May 26, 2026 09:15
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

Comment on lines 315 to +323
if config.is_asym_mode:
    level_low = 0
    level_high = 2**num_bits - 1
    min_values = fns.min(weight, axis=reduction_axes, keepdims=True)  # [a1, r, a2] -> [a1, 1, a2]
    max_values = fns.max(weight, axis=reduction_axes, keepdims=True)  # [a1, r, a2] -> [a1, 1, a2]

    zero = fns.zeros_like(min_values)
    min_values = fns.minimum(zero, min_values)
    max_values = fns.maximum(zero, max_values)
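The per-group shapes in the snippet above can be checked with plain NumPy standing in for the `fns` tensor functions. This is a sketch under assumptions: the weight shape [2, 4, 3] and the choice of axis 1 as the reduction axis are hypothetical, picked only to make the [a1, r, a2] -> [a1, 1, a2] comment concrete.

```python
import numpy as np

# Hypothetical weight of shape [a1, r, a2] = [2, 4, 3]; axis 1 is the reduction axis.
rng = np.random.default_rng(0)
weight = rng.uniform(1.0, 5.0, size=(2, 4, 3))  # all-positive on purpose
weight[1] *= -1.0  # make the second slice all-negative

reduction_axes = 1
min_values = weight.min(axis=reduction_axes, keepdims=True)  # [2, 4, 3] -> [2, 1, 3]
max_values = weight.max(axis=reduction_axes, keepdims=True)  # [2, 4, 3] -> [2, 1, 3]

# Same clamping as in the reviewed lines: force every range to contain zero.
zero = np.zeros_like(min_values)
min_values = np.minimum(zero, min_values)  # every entry is now <= 0
max_values = np.maximum(zero, max_values)  # every entry is now >= 0
```

For the all-positive slice the clamped minimum becomes exactly 0, and for the all-negative slice the clamped maximum becomes exactly 0, while groups that already straddle zero would be left unchanged.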
Comment on lines 141 to 145
example_inputs_numpy = example_input.detach().cpu().numpy()
stripped_ov_output = torch.tensor(model(example_inputs_numpy)[0], device=example_input.device)

# TODO(aanuf): fix input_low, input_range computation for AsymmetricQuantizer
assert torch.allclose(tuned_output, stripped_output, atol=1e-1)

Labels

NNCF ONNX, NNCF OpenVINO, NNCF PT


3 participants