[quantization] Introduce wrapper for Qwen3VLVisionRotaryEmbedding #496
dvsav wants to merge 1 commit into Samsung:main
Conversation
For Reviewers

Below is the source code from `transformers/models/qwen3_vl/modeling_qwen3_vl.py`:

```python
class Qwen3VLVisionRotaryEmbedding(nn.Module):
    inv_freq: torch.Tensor  # fix linting for `register_buffer`

    def __init__(self, dim: int, theta: float = 10000.0) -> None:
        super().__init__()
        self.dim = dim
        self.theta = theta
        inv_freq = 1.0 / (theta ** (torch.arange(0, dim, 2, dtype=torch.float) / dim))
        self.register_buffer("inv_freq", inv_freq, persistent=False)

    def forward(self, seqlen: int) -> torch.Tensor:
        seq = torch.arange(seqlen, device=self.inv_freq.device, dtype=self.inv_freq.dtype)
        freqs = torch.outer(seq, self.inv_freq)
        return freqs
```
FYI, we used to create positional embeddings "statically". When we run transformers on devices, we fix the sequence length for efficiency. By fixing the vision embedding's input seq_len, `Qwen3VLVisionRotaryEmbedding` folds into a constant table. @mhs4670go mentioned that, because of this, once you implement the upper class's (`Qwen3VLVisionModel`) wrapq class, the logic above will be treated as a single constant, needing no wrapper.
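To make the constant-folding concrete, here is a minimal pure-Python sketch (the function name and the fixed length are illustrative, not from the PR): once seq_len is pinned, the module's forward collapses to a table that can be computed ahead of time.

```python
def rope_freqs(seqlen: int, dim: int, theta: float = 10000.0):
    # Mirrors Qwen3VLVisionRotaryEmbedding.forward in plain Python:
    # inv_freq[i] = theta ** (-2*i / dim);  freqs[s][i] = s * inv_freq[i]
    inv_freq = [theta ** (-(2 * i) / dim) for i in range(dim // 2)]
    return [[s * f for f in inv_freq] for s in range(seqlen)]

# On-device deployment fixes the sequence length, so the output becomes a
# constant (seqlen x dim/2) table, computed once, with no runtime compute.
FIXED_SEQ_LEN = 4
TABLE = rope_freqs(FIXED_SEQ_LEN, dim=8)
```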
To give you more context: by casting seq_len to a Python integer in the code, torch treats it as a "specialized variable" (see: #431 (comment)).
This change introduces the `QuantQwen3VLVisionRotaryEmbedding` wrapper to support post-training quantization of the `Qwen3VLVisionRotaryEmbedding` module.
TICO-DCO-1.0-Signed-off-by: d.savchenkov <d.savchenkov@partner.samsung.com>
d614df6 to f5d54ff
Hi @mhs4670go and @dayo09, thanks for your comments! Please correct me if I'm wrong: to be able to precompute the RoPE embeddings, we'll need to presume a fixed seq_len. Given that fixed value and the example input passed in, should the example inputs used during calibration comply with that fixed seq_len?
I think option 1 sounds reasonable. You can use …
This change introduces the `QuantQwen3VLVisionRotaryEmbedding` wrapper to support post-training quantization of the `Qwen3VLVisionRotaryEmbedding` module.

Why?

- The `Qwen3VLVisionRotaryEmbedding` module is used in the image encoder of the Qwen model.
- Trying to quantize `Qwen3VLVisionRotaryEmbedding` via PTQ raises the exception `PTQQuantizer: no quantization wrapper for Qwen3VLVisionRotaryEmbedding`.

What

This change introduces:

- `QuantQwen3VLVisionRotaryEmbedding` (`tico/quantization/wrapq/wrappers/qwen_vl/quant_vision_rotary_embedding.py`).
- `class TestQuantQwen3VLVisionRotaryEmbedding` (`test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_rotary_embedding.py`), skipped if the `transformers` package is not installed.
- Registration of `tico.quantization.wrapq.wrappers.qwen_vl.quant_vision_rotary_embedding` in `_CORE_MODULES` (`tico/quantization/wrapq/wrappers/registry.py`).
- An example of `Qwen3VLVisionRotaryEmbedding` quantization and conversion to Circle (`tico/quantization/wrapq/examples/qwen/quantize_qwen_vision_rotary_embedding.py`).

Unit Tests
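Purely as an illustration of what such a PTQ wrapper does (the class name and calibration scheme below are assumptions for this sketch, not TICO's actual wrapq API): during a calibration pass it observes the module's output range, then fake-quantizes subsequent outputs onto a symmetric int8 grid.

```python
class QuantVisionRotaryEmbedding:
    """Hypothetical PTQ wrapper sketch (not TICO's real API)."""

    def __init__(self, dim: int, theta: float = 10000.0):
        self.inv_freq = [theta ** (-(2 * i) / dim) for i in range(dim // 2)]
        self.scale = None  # observed during the calibration pass

    def __call__(self, seqlen: int):
        freqs = [[s * f for f in self.inv_freq] for s in range(seqlen)]
        if self.scale is None:
            # Calibration: record the dynamic range of the (constant) output.
            hi = max(abs(v) for row in freqs for v in row)
            self.scale = (hi / 127.0) or 1.0  # symmetric int8, per-tensor
        # Fake-quantize: snap each value to the int8 grid, then dequantize.
        return [[round(v / self.scale) * self.scale for v in row] for row in freqs]

wrapper = QuantVisionRotaryEmbedding(dim=8)
table = wrapper(4)  # first call calibrates, then returns fake-quantized freqs
```

Because the module folds to a constant table for a fixed seq_len, calibration here only needs a single forward pass with that same length.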
Unit test results with coverage information:
Coverage info (irrelevant files skipped):