[quantization] Introduce wrapper for Qwen3VLVisionRotaryEmbedding #496
dvsav wants to merge 1 commit into Samsung:main
Conversation
For Reviewers

Below is the source code from `transformers/models/qwen3_vl/modeling_qwen3_vl.py`:

```python
class Qwen3VLVisionRotaryEmbedding(nn.Module):
    inv_freq: torch.Tensor  # fix linting for `register_buffer`

    def __init__(self, dim: int, theta: float = 10000.0) -> None:
        super().__init__()
        self.dim = dim
        self.theta = theta
        inv_freq = 1.0 / (theta ** (torch.arange(0, dim, 2, dtype=torch.float) / dim))
        self.register_buffer("inv_freq", inv_freq, persistent=False)

    def forward(self, seqlen: int) -> torch.Tensor:
        seq = torch.arange(seqlen, device=self.inv_freq.device, dtype=self.inv_freq.dtype)
        freqs = torch.outer(seq, self.inv_freq)
        return freqs
```
FYI, we used to create positional embeddings "statically". When we run transformers on devices, we fix the sequence length for efficiency. By fixing the vision embedding's input seq_len, `Qwen3VLVisionRotaryEmbedding` folds into a constant table. @mhs4670go mentioned that, because of this, once you implement the upper class's (`Qwen3VLVisionModel`) wrapq class, the logic above will be treated as a single constant, needing no wrapper.
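To make the constant-folding concrete, here is a minimal pure-Python sketch (the function name and the fixed length are illustrative, not from the PR): once seq_len is pinned, the module's forward collapses to a table that can be computed ahead of time.

```python
def rope_freqs(seqlen: int, dim: int, theta: float = 10000.0):
    # Mirrors Qwen3VLVisionRotaryEmbedding.forward in plain Python:
    # inv_freq[i] = theta ** (-2*i / dim);  freqs[s][i] = s * inv_freq[i]
    inv_freq = [theta ** (-(2 * i) / dim) for i in range(dim // 2)]
    return [[s * f for f in inv_freq] for s in range(seqlen)]

# On-device deployment fixes the sequence length, so the output becomes a
# constant (seqlen x dim/2) table, computed once, with no runtime compute.
FIXED_SEQ_LEN = 4
TABLE = rope_freqs(FIXED_SEQ_LEN, dim=8)
```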
To give you more context: by casting seq_len to a Python integer in the code, torch treats it as a "specialized variable" (see: #431 (comment)).
This change introduces the `QuantQwen3VLVisionRotaryEmbedding` wrapper to support post-training quantization of the `Qwen3VLVisionRotaryEmbedding` module.
TICO-DCO-1.0-Signed-off-by: d.savchenkov <d.savchenkov@partner.samsung.com>
d614df6 to f5d54ff
Hi @mhs4670go and @dayo09, thanks for your comments! Please correct me if I'm wrong: to be able to precompute the RoPE embeddings, we'll need to presume a fixed seq_len. Given that fixed value and the example input passed in, should the example inputs used during calibration comply with that fixed seq_len?
I think option 1 sounds reasonable. You can use …
This change introduces the `QuantQwen3VLVisionRotaryEmbedding` wrapper to support post-training quantization of the `Qwen3VLVisionRotaryEmbedding` module.

Why?

- The `Qwen3VLVisionRotaryEmbedding` module is used in the image encoder of the Qwen model.
- Trying to quantize `Qwen3VLVisionRotaryEmbedding` via PTQ raises the exception `PTQQuantizer: no quantization wrapper for Qwen3VLVisionRotaryEmbedding`.

What

This change introduces:

- `QuantQwen3VLVisionRotaryEmbedding` (`tico/quantization/wrapq/wrappers/qwen_vl/quant_vision_rotary_embedding.py`).
- `class TestQuantQwen3VLVisionRotaryEmbedding` (`test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_rotary_embedding.py`), skipped if the `transformers` package is not installed.
- Registration of `tico.quantization.wrapq.wrappers.qwen_vl.quant_vision_rotary_embedding` in `_CORE_MODULES` (`tico/quantization/wrapq/wrappers/registry.py`).
- An example of `Qwen3VLVisionRotaryEmbedding` quantization and conversion to Circle (`tico/quantization/wrapq/examples/qwen/quantize_qwen_vision_rotary_embedding.py`).

Unit Tests
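Purely as an illustration of what such a PTQ wrapper does (the class name and calibration scheme below are assumptions for this sketch, not TICO's actual wrapq API): during a calibration pass it observes the module's output range, then fake-quantizes subsequent outputs onto a symmetric int8 grid.

```python
class QuantVisionRotaryEmbedding:
    """Hypothetical PTQ wrapper sketch (not TICO's real API)."""

    def __init__(self, dim: int, theta: float = 10000.0):
        self.inv_freq = [theta ** (-(2 * i) / dim) for i in range(dim // 2)]
        self.scale = None  # observed during the calibration pass

    def __call__(self, seqlen: int):
        freqs = [[s * f for f in self.inv_freq] for s in range(seqlen)]
        if self.scale is None:
            # Calibration: record the dynamic range of the (constant) output.
            hi = max(abs(v) for row in freqs for v in row)
            self.scale = (hi / 127.0) or 1.0  # symmetric int8, per-tensor
        # Fake-quantize: snap each value to the int8 grid, then dequantize.
        return [[round(v / self.scale) * self.scale for v in row] for row in freqs]

wrapper = QuantVisionRotaryEmbedding(dim=8)
table = wrapper(4)  # first call calibrates, then returns fake-quantized freqs
```

Because the module folds to a constant table for a fixed seq_len, calibration here only needs a single forward pass with that same length.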
Unit test results with coverage information:
Coverage info (irrelevant files skipped):