From 51ef40f4cf3a7ee9b03dacabccc5fb1eb468d499 Mon Sep 17 00:00:00 2001 From: Alexander Dokuchaev Date: Mon, 2 Feb 2026 11:54:49 +0200 Subject: [PATCH 01/19] release notes template --- ReleaseNotes.md | 48 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 48 insertions(+) diff --git a/ReleaseNotes.md b/ReleaseNotes.md index 112ed45c2d0..57abd0599c2 100644 --- a/ReleaseNotes.md +++ b/ReleaseNotes.md @@ -1,5 +1,53 @@ # Release Notes +## New in Release 3.0.0 + +Post-training Quantization: + +- Breaking changes: + - ... +- General: + - ... +- Features: + - ... +- Fixes: + - ... +- Improvements: + - ... +- Deprecations/Removals: + - ... +- Tutorials: + - ... +- Known issues: + - ... + +Compression-aware training: + +- Breaking changes: + - ... +- General: + - ... +- Features: + - ... +- Fixes: + - ... +- Improvements: + - ... +- Deprecations/Removals: + - ... +- Tutorials: + - ... +- Known issues: + - ... + +Deprecations/Removals: + +- ... + +Requirements: + +- ... + ## New in Release 2.19.0 Post-training Quantization: From 8bf1e8e7fcc549ab75e1f765fc28ca604db5552a Mon Sep 17 00:00:00 2001 From: Liubov Talamanova Date: Fri, 6 Feb 2026 14:39:19 +0000 Subject: [PATCH 02/19] Update ReleaseNotes.md --- ReleaseNotes.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/ReleaseNotes.md b/ReleaseNotes.md index 57abd0599c2..51025464919 100644 --- a/ReleaseNotes.md +++ b/ReleaseNotes.md @@ -17,7 +17,10 @@ Post-training Quantization: - Deprecations/Removals: - ... - Tutorials: - - ... + - [Post-Training Optimization of Wan2.2 Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/supplementary_materials/notebooks/wan2.2-text-image-to-video/wan2.2-text-image-to-video.ipynb) + - [Post-Training Optimization of DeepSeek-OCR Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/deepseek-ocr/deepseek-ocr.ipynb) + - [Post-Training Optimization of Z-Image-Turbo Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/z-image-turbo/z-image-turbo.ipynb) + - [Post-Training Optimization of Qwen-Image Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/qwen-image/qwen-image.ipynb) - Known issues: - ... From 561997e6549455d1f00e9068363a772222891ac9 Mon Sep 17 00:00:00 2001 From: Liubov Talamanova Date: Fri, 6 Feb 2026 14:41:04 +0000 Subject: [PATCH 03/19] Update ReleaseNotes.md --- ReleaseNotes.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ReleaseNotes.md b/ReleaseNotes.md index 51025464919..6fb05d71817 100644 --- a/ReleaseNotes.md +++ b/ReleaseNotes.md @@ -17,7 +17,7 @@ Post-training Quantization: - Deprecations/Removals: - ... - Tutorials: - - [Post-Training Optimization of Wan2.2 Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/supplementary_materials/notebooks/wan2.2-text-image-to-video/wan2.2-text-image-to-video.ipynb) + - [Post-Training Optimization of Wan2.2 Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/wan2.2-text-image-to-video/wan2.2-text-image-to-video.ipynb) - [Post-Training Optimization of DeepSeek-OCR Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/deepseek-ocr/deepseek-ocr.ipynb) - [Post-Training Optimization of Z-Image-Turbo Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/z-image-turbo/z-image-turbo.ipynb) - [Post-Training Optimization of Qwen-Image Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/qwen-image/qwen-image.ipynb) From 26abcffedc030cb9c4fe948b156e3c76493e7ed0 Mon Sep 17 00:00:00 2001 From: Aamir Nazir Date: Fri, 6 Feb 2026 18:54:00 +0400 Subject: [PATCH 04/19] Update ReleaseNotes.md --- ReleaseNotes.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/ReleaseNotes.md b/ReleaseNotes.md index 6fb05d71817..d42422fb2ed 100644 --- a/ReleaseNotes.md +++ b/ReleaseNotes.md @@ -9,11 +9,11 @@ Post-training Quantization: - General: - ... - Features: - - ... + - (TorchFX) Preview support for the new `compress_pt2e` API has been introduced, enabling quantization of `torch.fx.GraphModule` models with the `OpenVINOQuantizer`. Users now can quantize their models in [ExecuTorch](https://github.com/pytorch/executorch) for the OpenVINO backend via the nncf `compress_pt2e` employing Scale Estimation and AWQ. - Fixes: - ... - Improvements: - - ... + - Add Support for compression of 3D weights in AWQ, Scale Estimation and GPTQ Algorithms. Models with MoE can be compressed now. - Deprecations/Removals: - ... - Tutorials: From 4510e78decaae2ccd4c61d4aec0a007804a470c9 Mon Sep 17 00:00:00 2001 From: Aamir Nazir Date: Fri, 6 Feb 2026 19:00:22 +0400 Subject: [PATCH 05/19] Update ReleaseNotes.md --- ReleaseNotes.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ReleaseNotes.md b/ReleaseNotes.md index d42422fb2ed..0b95c51e24a 100644 --- a/ReleaseNotes.md +++ b/ReleaseNotes.md @@ -13,7 +13,7 @@ Post-training Quantization: - Fixes: - ... - Improvements: - - Add Support for compression of 3D weights in AWQ, Scale Estimation and GPTQ Algorithms. Models with MoE can be compressed now. + - Add Support for compression of 3D weights in AWQ, Scale Estimation and GPTQ Algorithms. Models with MoE (Mixture of Experts) can be compressed now. - Deprecations/Removals: - ... - Tutorials: From 6be51ce970d5a1de643550aa7d47b97e706b87df Mon Sep 17 00:00:00 2001 From: AndreiAnufriev Date: Fri, 6 Feb 2026 16:04:16 +0100 Subject: [PATCH 06/19] Update ReleaseNotes.md --- ReleaseNotes.md | 1 + 1 file changed, 1 insertion(+) diff --git a/ReleaseNotes.md b/ReleaseNotes.md index 0b95c51e24a..ac2b430fe29 100644 --- a/ReleaseNotes.md +++ b/ReleaseNotes.md @@ -9,6 +9,7 @@ Post-training Quantization: - General: - ... - Features: + - (OpenVINO) Introduced new experimental compression data type ADAPTIVE_CODEBOOK. This compression type calculates a unique codebook for each MatMul or block of identical MatMuls (for example, all down_proj could have the same codebook). This approach reduces quality degradation in the case of per-channel weight compression. - (TorchFX) Preview support for the new `compress_pt2e` API has been introduced, enabling quantization of `torch.fx.GraphModule` models with the `OpenVINOQuantizer`. Users now can quantize their models in [ExecuTorch](https://github.com/pytorch/executorch) for the OpenVINO backend via the nncf `compress_pt2e` employing Scale Estimation and AWQ. - Fixes: - ... From edb75498ab2bc90513c5bf9b59a6a122bf7a6033 Mon Sep 17 00:00:00 2001 From: Daniil Lyakhov Date: Fri, 6 Feb 2026 16:25:14 +0100 Subject: [PATCH 07/19] Update ReleaseNotes.md --- ReleaseNotes.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/ReleaseNotes.md b/ReleaseNotes.md index ac2b430fe29..cbb01b64b17 100644 --- a/ReleaseNotes.md +++ b/ReleaseNotes.md @@ -9,6 +9,7 @@ Post-training Quantization: - General: - ... - Features: + - Models containing MatMul operations with transposed activation inputs was supported in Weight Compression and AWQ algorithms. - (OpenVINO) Introduced new experimental compression data type ADAPTIVE_CODEBOOK. This compression type calculates a unique codebook for each MatMul or block of identical MatMuls (for example, all down_proj could have the same codebook). This approach reduces quality degradation in the case of per-channel weight compression. - (TorchFX) Preview support for the new `compress_pt2e` API has been introduced, enabling quantization of `torch.fx.GraphModule` models with the `OpenVINOQuantizer`. Users now can quantize their models in [ExecuTorch](https://github.com/pytorch/executorch) for the OpenVINO backend via the nncf `compress_pt2e` employing Scale Estimation and AWQ. - Fixes: @@ -40,7 +41,7 @@ Compression-aware training: - Deprecations/Removals: - ... - Tutorials: - - ... + - [Post-Training Quantization of YOLO26 OpenVINO Model](https://github.com/openvinotoolkit/nncf/tree/develop/examples/post_training_quantization/openvino/yolo26) - Known issues: - ... From 7686f61eb024e90f20463a5f3d9f008d49330651 Mon Sep 17 00:00:00 2001 From: Andrey Churkin Date: Mon, 9 Feb 2026 07:21:32 +0000 Subject: [PATCH 08/19] Update ReleaseNotes.md --- ReleaseNotes.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/ReleaseNotes.md b/ReleaseNotes.md index cbb01b64b17..6b4f558f508 100644 --- a/ReleaseNotes.md +++ b/ReleaseNotes.md @@ -35,7 +35,8 @@ Compression-aware training: - Features: - ... - Fixes: - - ... + - (ONNX) Fixed `compress_quantize_weights_transformation()` method by removing names of deleted initializers from graph inputs. + - (ONNX) Fixed incorrect insertion of MatMulNBits nodes. - Improvements: - ... - Deprecations/Removals: From 47f896d5a71f6efe7f2b796f4b5a9ff553b58ba9 Mon Sep 17 00:00:00 2001 From: Andrey Churkin Date: Mon, 9 Feb 2026 07:22:40 +0000 Subject: [PATCH 09/19] Update ReleaseNotes.md --- ReleaseNotes.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/ReleaseNotes.md b/ReleaseNotes.md index 6b4f558f508..a174bf542ef 100644 --- a/ReleaseNotes.md +++ b/ReleaseNotes.md @@ -13,7 +13,8 @@ Post-training Quantization: - (OpenVINO) Introduced new experimental compression data type ADAPTIVE_CODEBOOK. This compression type calculates a unique codebook for each MatMul or block of identical MatMuls (for example, all down_proj could have the same codebook). This approach reduces quality degradation in the case of per-channel weight compression. - (TorchFX) Preview support for the new `compress_pt2e` API has been introduced, enabling quantization of `torch.fx.GraphModule` models with the `OpenVINOQuantizer`. Users now can quantize their models in [ExecuTorch](https://github.com/pytorch/executorch) for the OpenVINO backend via the nncf `compress_pt2e` employing Scale Estimation and AWQ. - Fixes: - - ... + - (ONNX) Fixed `compress_quantize_weights_transformation()` method by removing names of deleted initializers from graph inputs. + - (ONNX) Fixed incorrect insertion of MatMulNBits nodes. - Improvements: - Add Support for compression of 3D weights in AWQ, Scale Estimation and GPTQ Algorithms. Models with MoE (Mixture of Experts) can be compressed now. - Deprecations/Removals: @@ -35,8 +36,7 @@ Compression-aware training: - Features: - ... - Fixes: - - (ONNX) Fixed `compress_quantize_weights_transformation()` method by removing names of deleted initializers from graph inputs. - - (ONNX) Fixed incorrect insertion of MatMulNBits nodes. + - ... - Improvements: - ... - Deprecations/Removals: From fbf165825429590562f64e78bcadc3de5d30a442 Mon Sep 17 00:00:00 2001 From: Alexander Dokuchaev Date: Mon, 9 Feb 2026 11:51:00 +0200 Subject: [PATCH 10/19] t --- ReleaseNotes.md | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/ReleaseNotes.md b/ReleaseNotes.md index a174bf542ef..4f8a77280cd 100644 --- a/ReleaseNotes.md +++ b/ReleaseNotes.md @@ -5,13 +5,22 @@ Post-training Quantization: - Breaking changes: - - ... + - (TensorFlow) Removed support for TensorFlow backend. + - (PyTorch) Removed legacy `create_compressed_model` API for PyTorch backend, which was previously marked as deprecated. + - (PyTorch) Removed legacy algorithms for PyTorch that were based on using `NNCFNetwork`, e.g. NAS, Structural Pruning, AutoML, Knowledge Distillation, Mixed-Precision Quantization. + - Renamed `nncf.CompressWeightsMode.CB4_F8E4M3` mode option to `nncf.CompressWeightsMode.CB4`. - General: - - ... + - Added `nncf.prune` API function, which provides a unified interface for pruning algorithms. Currently available for PyTorch backend and supports Magnitude Pruning. + More details about the new API can be found in the [documentation](https://github.com/openvinotoolkit/nncf/tree/develop/docs/usage/training_time_compression/pruning/Usage.md). + - Added `nncf.build_graph` API function for building `NNCFGraph` from a model. + - Added [documentation](https://github.com/openvinotoolkit/nncf/blob/develop/docs/usage/IgnoredScope.md) about using `nncf.IgnoredScope`. + - Reworked `HWConfig`, now using Python-style definition of hardware configuration instead of JSON files. - Features: - Models containing MatMul operations with transposed activation inputs was supported in Weight Compression and AWQ algorithms. - (OpenVINO) Introduced new experimental compression data type ADAPTIVE_CODEBOOK. This compression type calculates a unique codebook for each MatMul or block of identical MatMuls (for example, all down_proj could have the same codebook). This approach reduces quality degradation in the case of per-channel weight compression. - (TorchFX) Preview support for the new `compress_pt2e` API has been introduced, enabling quantization of `torch.fx.GraphModule` models with the `OpenVINOQuantizer`. Users now can quantize their models in [ExecuTorch](https://github.com/pytorch/executorch) for the OpenVINO backend via the nncf `compress_pt2e` employing Scale Estimation and AWQ. + - (PyTorch) Added support for linear functions for the Fast Bias Correction algorithm. + - (OpenVINO) Added [activation profiler](https://github.com/openvinotoolkit/nncf/tree/develop/tools/activation_profiler) tool to collect and visualize tensor statistics. - Fixes: - (ONNX) Fixed `compress_quantize_weights_transformation()` method by removing names of deleted initializers from graph inputs. - (ONNX) Fixed incorrect insertion of MatMulNBits nodes. @@ -52,7 +61,7 @@ Deprecations/Removals: Requirements: -- ... +- Dropped `jsonschema`, `natsort`, and `pymoo` from dependencies as they are no longer required. ## New in Release 2.19.0 From bfd296139cf19a36952154f0f6ce2a3b3965146c Mon Sep 17 00:00:00 2001 From: Alexander Dokuchaev Date: Mon, 9 Feb 2026 16:01:53 +0400 Subject: [PATCH 11/19] Update ReleaseNotes.md --- ReleaseNotes.md | 1 + 1 file changed, 1 insertion(+) diff --git a/ReleaseNotes.md b/ReleaseNotes.md index 4f8a77280cd..59ffc9c603f 100644 --- a/ReleaseNotes.md +++ b/ReleaseNotes.md @@ -62,6 +62,7 @@ Deprecations/Removals: Requirements: - Dropped `jsonschema`, `natsort`, and `pymoo` from dependencies as they are no longer required. +- Updated `numpy` to `>=1.24.0, <2.5.0`. ## New in Release 2.19.0 From e80b361f3f76a94bd30fe850f9b8ce6d71560d27 Mon Sep 17 00:00:00 2001 From: Aamir Nazir Date: Tue, 10 Feb 2026 12:37:07 +0400 Subject: [PATCH 12/19] Update ReleaseNotes.md --- ReleaseNotes.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ReleaseNotes.md b/ReleaseNotes.md index 59ffc9c603f..ff5d8f9c19a 100644 --- a/ReleaseNotes.md +++ b/ReleaseNotes.md @@ -25,7 +25,7 @@ Post-training Quantization: - (ONNX) Fixed `compress_quantize_weights_transformation()` method by removing names of deleted initializers from graph inputs. - (ONNX) Fixed incorrect insertion of MatMulNBits nodes. - Improvements: - - Add Support for compression of 3D weights in AWQ, Scale Estimation and GPTQ Algorithms. Models with MoE (Mixture of Experts) can be compressed now. + - Added support for the compression of 3D weights in AWQ, Scale Estimation, and GPTQ algorithms. Models with MoE (Mixture of Experts), such as GPT-OSS-20B and Qwen3-30B-A3B, can be compressed with data-aware methods now. - Deprecations/Removals: - ... - Tutorials: From 8fe5f9d701ab6bc018cb2d80bb2ac7c3e9d30fc6 Mon Sep 17 00:00:00 2001 From: Liubov Talamanova Date: Tue, 10 Feb 2026 08:43:41 +0000 Subject: [PATCH 13/19] Update ReleaseNotes.md --- ReleaseNotes.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/ReleaseNotes.md b/ReleaseNotes.md index ff5d8f9c19a..6292924144c 100644 --- a/ReleaseNotes.md +++ b/ReleaseNotes.md @@ -33,6 +33,8 @@ Post-training Quantization: - [Post-Training Optimization of DeepSeek-OCR Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/deepseek-ocr/deepseek-ocr.ipynb) - [Post-Training Optimization of Z-Image-Turbo Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/z-image-turbo/z-image-turbo.ipynb) - [Post-Training Optimization of Qwen-Image Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/qwen-image/qwen-image.ipynb) + - [Post-Training Optimization of Fun-ASR-Nano Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/funasr-nano/funasr-nano.ipynb) + - [Post-Training Optimization of Fun-CosyVoice 3.0 Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/cosyvoice3-tts/cosyvoice3-tts.ipynb) - Known issues: - ... From 8a5fa206070a942cfa30248eaf9f5c9d8128338f Mon Sep 17 00:00:00 2001 From: Maksim Proshin Date: Wed, 11 Feb 2026 12:30:24 +0400 Subject: [PATCH 14/19] Update ReleaseNotes.md --- ReleaseNotes.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/ReleaseNotes.md b/ReleaseNotes.md index 6292924144c..afcc1278ef5 100644 --- a/ReleaseNotes.md +++ b/ReleaseNotes.md @@ -33,6 +33,8 @@ Post-training Quantization: - [Post-Training Optimization of DeepSeek-OCR Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/deepseek-ocr/deepseek-ocr.ipynb) - [Post-Training Optimization of Z-Image-Turbo Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/z-image-turbo/z-image-turbo.ipynb) - [Post-Training Optimization of Qwen-Image Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/qwen-image/qwen-image.ipynb) + - [Post-Training Optimization of Qwen3-TTS Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/qwen3-tts/qwen3-tts.ipynb) + - [Post-Training Optimization of Qwen3-ASR Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/qwen3-asr/qwen3-asr.ipynb) - [Post-Training Optimization of Fun-ASR-Nano Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/funasr-nano/funasr-nano.ipynb) - [Post-Training Optimization of Fun-CosyVoice 3.0 Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/cosyvoice3-tts/cosyvoice3-tts.ipynb) - Known issues: From ff121a745bde6c4416745a986cf02384266ed5b0 Mon Sep 17 00:00:00 2001 From: Daniil Lyakhov Date: Wed, 11 Feb 2026 11:26:35 +0100 Subject: [PATCH 15/19] Comments --- ReleaseNotes.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/ReleaseNotes.md b/ReleaseNotes.md index afcc1278ef5..3c8e86d3505 100644 --- a/ReleaseNotes.md +++ b/ReleaseNotes.md @@ -16,7 +16,7 @@ Post-training Quantization: - Added [documentation](https://github.com/openvinotoolkit/nncf/blob/develop/docs/usage/IgnoredScope.md) about using `nncf.IgnoredScope`. - Reworked `HWConfig`, now using Python-style definition of hardware configuration instead of JSON files. - Features: - - Models containing MatMul operations with transposed activation inputs was supported in Weight Compression and AWQ algorithms. + - Added support for models containing MatMul operations with transposed activation inputs in data-free Weight Compression and data-aware AWQ algorithms. - (OpenVINO) Introduced new experimental compression data type ADAPTIVE_CODEBOOK. This compression type calculates a unique codebook for each MatMul or block of identical MatMuls (for example, all down_proj could have the same codebook). This approach reduces quality degradation in the case of per-channel weight compression. - (TorchFX) Preview support for the new `compress_pt2e` API has been introduced, enabling quantization of `torch.fx.GraphModule` models with the `OpenVINOQuantizer`. Users now can quantize their models in [ExecuTorch](https://github.com/pytorch/executorch) for the OpenVINO backend via the nncf `compress_pt2e` employing Scale Estimation and AWQ. - (PyTorch) Added support for linear functions for the Fast Bias Correction algorithm. @@ -29,6 +29,7 @@ Post-training Quantization: - Deprecations/Removals: - ... - Tutorials: + - [Post-Training Quantization of YOLO26 OpenVINO Model](https://github.com/openvinotoolkit/nncf/tree/develop/examples/post_training_quantization/openvino/yolo26) - [Post-Training Optimization of Wan2.2 Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/wan2.2-text-image-to-video/wan2.2-text-image-to-video.ipynb) - [Post-Training Optimization of DeepSeek-OCR Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/deepseek-ocr/deepseek-ocr.ipynb) - [Post-Training Optimization of Z-Image-Turbo Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/z-image-turbo/z-image-turbo.ipynb) @@ -55,7 +56,7 @@ Compression-aware training: - Deprecations/Removals: - ... - Tutorials: - - [Post-Training Quantization of YOLO26 OpenVINO Model](https://github.com/openvinotoolkit/nncf/tree/develop/examples/post_training_quantization/openvino/yolo26) + - - Known issues: - ... From c86eebb04d1e0707e0a8babb4570b74d42b80add Mon Sep 17 00:00:00 2001 From: Maksim Proshin Date: Wed, 11 Feb 2026 16:28:05 +0400 Subject: [PATCH 16/19] Update ReleaseNotes.md --- ReleaseNotes.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/ReleaseNotes.md b/ReleaseNotes.md index 3c8e86d3505..1be7e5f7615 100644 --- a/ReleaseNotes.md +++ b/ReleaseNotes.md @@ -7,7 +7,7 @@ Post-training Quantization: - Breaking changes: - (TensorFlow) Removed support for TensorFlow backend. - (PyTorch) Removed legacy `create_compressed_model` API for PyTorch backend, which was previously marked as deprecated. - - (PyTorch) Removed legacy algorithms for PyTorch that were based on using `NNCFNetwork`, e.g. NAS, Structural Pruning, AutoML, Knowledge Distillation, Mixed-Precision Quantization. + - (PyTorch) Removed legacy algorithms for PyTorch that were based on using `NNCFNetwork`: NAS, Structural Pruning, AutoML, Knowledge Distillation, Mixed-Precision Quantization, and Movement Sparsity. - Renamed `nncf.CompressWeightsMode.CB4_F8E4M3` mode option to `nncf.CompressWeightsMode.CB4`. - General: - Added `nncf.prune` API function, which provides a unified interface for pruning algorithms. Currently available for PyTorch backend and supports Magnitude Pruning. @@ -17,7 +17,7 @@ Post-training Quantization: - Reworked `HWConfig`, now using Python-style definition of hardware configuration instead of JSON files. - Features: - Added support for models containing MatMul operations with transposed activation inputs in data-free Weight Compression and data-aware AWQ algorithms. - - (OpenVINO) Introduced new experimental compression data type ADAPTIVE_CODEBOOK. This compression type calculates a unique codebook for each MatMul or block of identical MatMuls (for example, all down_proj could have the same codebook). This approach reduces quality degradation in the case of per-channel weight compression. + - (OpenVINO) Introduced new experimental compression data type ADAPTIVE_CODEBOOK. This compression type calculates a unique codebook for each MatMul or block of identical MatMuls (for example, all down_proj could have the same codebook). This approach reduces quality degradation in the case of per-channel weight compression. See [example](https://github.com/openvinotoolkit/nncf/tree/develop/examples/llm_compression/openvino/smollm2_360m_adaptive_codebook). - (TorchFX) Preview support for the new `compress_pt2e` API has been introduced, enabling quantization of `torch.fx.GraphModule` models with the `OpenVINOQuantizer`. Users now can quantize their models in [ExecuTorch](https://github.com/pytorch/executorch) for the OpenVINO backend via the nncf `compress_pt2e` employing Scale Estimation and AWQ. - (PyTorch) Added support for linear functions for the Fast Bias Correction algorithm. - (OpenVINO) Added [activation profiler](https://github.com/openvinotoolkit/nncf/tree/develop/tools/activation_profiler) tool to collect and visualize tensor statistics. From 0618d6a836d234d6e283785c144ed79834ebc03b Mon Sep 17 00:00:00 2001 From: Maksim Proshin Date: Wed, 11 Feb 2026 16:32:16 +0400 Subject: [PATCH 17/19] Update ReleaseNotes.md --- ReleaseNotes.md | 28 +++------------------------- 1 file changed, 3 insertions(+), 25 deletions(-) diff --git a/ReleaseNotes.md b/ReleaseNotes.md index 1be7e5f7615..65af9deb887 100644 --- a/ReleaseNotes.md +++ b/ReleaseNotes.md @@ -5,9 +5,6 @@ Post-training Quantization: - Breaking changes: - - (TensorFlow) Removed support for TensorFlow backend. - - (PyTorch) Removed legacy `create_compressed_model` API for PyTorch backend, which was previously marked as deprecated. - - (PyTorch) Removed legacy algorithms for PyTorch that were based on using `NNCFNetwork`: NAS, Structural Pruning, AutoML, Knowledge Distillation, Mixed-Precision Quantization, and Movement Sparsity. - Renamed `nncf.CompressWeightsMode.CB4_F8E4M3` mode option to `nncf.CompressWeightsMode.CB4`. - General: - Added `nncf.prune` API function, which provides a unified interface for pruning algorithms. Currently available for PyTorch backend and supports Magnitude Pruning. @@ -38,31 +35,12 @@ Post-training Quantization: - [Post-Training Optimization of Qwen3-ASR Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/qwen3-asr/qwen3-asr.ipynb) - [Post-Training Optimization of Fun-ASR-Nano Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/funasr-nano/funasr-nano.ipynb) - [Post-Training Optimization of Fun-CosyVoice 3.0 Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/cosyvoice3-tts/cosyvoice3-tts.ipynb) -- Known issues: - - ... - -Compression-aware training: - -- Breaking changes: - - ... -- General: - - ... -- Features: - - ... -- Fixes: - - ... -- Improvements: - - ... -- Deprecations/Removals: - - ... -- Tutorials: - - -- Known issues: - - ... Deprecations/Removals: -- ... + - (TensorFlow) Removed support for TensorFlow backend. + - (PyTorch) Removed legacy `create_compressed_model` API for PyTorch backend, which was previously marked as deprecated. + - (PyTorch) Removed legacy algorithms for PyTorch that were based on using `NNCFNetwork`: NAS, Structural Pruning, AutoML, Knowledge Distillation, Mixed-Precision Quantization, and Movement Sparsity. Requirements: From f85fea990cd6973beb7b2ee22e92b7ca69ccc7ef Mon Sep 17 00:00:00 2001 From: Alexander Dokuchaev Date: Mon, 16 Feb 2026 13:57:06 +0400 Subject: [PATCH 18/19] Update ReleaseNotes.md --- ReleaseNotes.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/ReleaseNotes.md b/ReleaseNotes.md index 65af9deb887..753e08d275d 100644 --- a/ReleaseNotes.md +++ b/ReleaseNotes.md @@ -9,14 +9,14 @@ Post-training Quantization: - General: - Added `nncf.prune` API function, which provides a unified interface for pruning algorithms. Currently available for PyTorch backend and supports Magnitude Pruning. More details about the new API can be found in the [documentation](https://github.com/openvinotoolkit/nncf/tree/develop/docs/usage/training_time_compression/pruning/Usage.md). - - Added `nncf.build_graph` API function for building `NNCFGraph` from a model. + - Added `nncf.build_graph` API function for building `NNCFGraph` from a model. This API can be used to inspect and define [the ignored scope](/docs/usage/IgnoredScope.md). - Added [documentation](https://github.com/openvinotoolkit/nncf/blob/develop/docs/usage/IgnoredScope.md) about using `nncf.IgnoredScope`. - Reworked `HWConfig`, now using Python-style definition of hardware configuration instead of JSON files. - Features: - Added support for models containing MatMul operations with transposed activation inputs in data-free Weight Compression and data-aware AWQ algorithms. - (OpenVINO) Introduced new experimental compression data type ADAPTIVE_CODEBOOK. This compression type calculates a unique codebook for each MatMul or block of identical MatMuls (for example, all down_proj could have the same codebook). This approach reduces quality degradation in the case of per-channel weight compression. See [example](https://github.com/openvinotoolkit/nncf/tree/develop/examples/llm_compression/openvino/smollm2_360m_adaptive_codebook). - (TorchFX) Preview support for the new `compress_pt2e` API has been introduced, enabling quantization of `torch.fx.GraphModule` models with the `OpenVINOQuantizer`. Users now can quantize their models in [ExecuTorch](https://github.com/pytorch/executorch) for the OpenVINO backend via the nncf `compress_pt2e` employing Scale Estimation and AWQ. - - (PyTorch) Added support for linear functions for the Fast Bias Correction algorithm. + - (PyTorch) Added support for linear functions for the Fast Bias Correction algorithm to improve the accuracy of such models after the quantization. - (OpenVINO) Added [activation profiler](https://github.com/openvinotoolkit/nncf/tree/develop/tools/activation_profiler) tool to collect and visualize tensor statistics. - Fixes: - (ONNX) Fixed `compress_quantize_weights_transformation()` method by removing names of deleted initializers from graph inputs. From 8d0b68bbbdb5785e00c65d3577348314e4406a04 Mon Sep 17 00:00:00 2001 From: Alexander Dokuchaev Date: Thu, 19 Feb 2026 13:05:51 +0200 Subject: [PATCH 19/19] linter --- ReleaseNotes.md | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/ReleaseNotes.md b/ReleaseNotes.md index 753e08d275d..b5bf0a034cc 100644 --- a/ReleaseNotes.md +++ b/ReleaseNotes.md @@ -23,8 +23,6 @@ Post-training Quantization: - (ONNX) Fixed incorrect insertion of MatMulNBits nodes. - Improvements: - Added support for the compression of 3D weights in AWQ, Scale Estimation, and GPTQ algorithms. Models with MoE (Mixture of Experts), such as GPT-OSS-20B and Qwen3-30B-A3B, can be compressed with data-aware methods now. -- Deprecations/Removals: - - ... - Tutorials: - [Post-Training Quantization of YOLO26 OpenVINO Model](https://github.com/openvinotoolkit/nncf/tree/develop/examples/post_training_quantization/openvino/yolo26) - [Post-Training Optimization of Wan2.2 Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/wan2.2-text-image-to-video/wan2.2-text-image-to-video.ipynb) @@ -38,9 +36,9 @@ Post-training Quantization: Deprecations/Removals: - - (TensorFlow) Removed support for TensorFlow backend. - - (PyTorch) Removed legacy `create_compressed_model` API for PyTorch backend, which was previously marked as deprecated. - - (PyTorch) Removed legacy algorithms for PyTorch that were based on using `NNCFNetwork`: NAS, Structural Pruning, AutoML, Knowledge Distillation, Mixed-Precision Quantization, and Movement Sparsity. +- (TensorFlow) Removed support for TensorFlow backend. +- (PyTorch) Removed legacy `create_compressed_model` API for PyTorch backend, which was previously marked as deprecated. +- (PyTorch) Removed legacy algorithms for PyTorch that were based on using `NNCFNetwork`: NAS, Structural Pruning, AutoML, Knowledge Distillation, Mixed-Precision Quantization, and Movement Sparsity. Requirements: