[quantization] Introduce wrappers for Qwen3VLTextDecoderLayer and Qwen3VLTextModel #535
Draft
dayo09 wants to merge 4 commits into Samsung:main from …
Conversation
[quantization] Introduce wrappers for Qwen3VLTextDecoderLayer and Qwen3VLTextModel

- Add `QuantQwen3VLTextDecoderLayer`: wraps the attention, MLP, and layernorm blocks; pre-builds static causal mask and RoPE templates to avoid dynamic ops in the forward pass
- Add `QuantQwen3VLTextModel`: pre-computes the shared causal mask and RoPE once and passes them to every decoder layer, so they are quantized exactly once rather than independently in each layer
- Register both wrappers in `_CORE_MODULES`

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
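For readers unfamiliar with the pattern, here is a minimal self-contained sketch of the precompute-once idea described above. This is not the PR's actual code; `max_seq_len`, `head_dim`, and `rope_theta` are illustrative assumptions.

```python
import torch
import torch.nn as nn

class StaticMaskRoPETemplates(nn.Module):
    """Sketch: build causal mask and RoPE tables once at init,
    then only slice them per forward call (no dynamic ops at runtime)."""

    def __init__(self, max_seq_len: int = 1024, head_dim: int = 128,
                 rope_theta: float = 10000.0):
        super().__init__()
        # Static causal mask template: -inf above the diagonal, 0 elsewhere.
        mask = torch.triu(
            torch.full((max_seq_len, max_seq_len), float("-inf")), diagonal=1
        )
        self.register_buffer("causal_mask_template", mask[None, None])

        # RoPE cos/sin templates for every position up to max_seq_len.
        inv_freq = 1.0 / (
            rope_theta ** (torch.arange(0, head_dim, 2).float() / head_dim)
        )
        freqs = torch.outer(torch.arange(max_seq_len).float(), inv_freq)
        emb = torch.cat((freqs, freqs), dim=-1)  # (max_seq_len, head_dim)
        self.register_buffer("cos_template", emb.cos())
        self.register_buffer("sin_template", emb.sin())

    def forward(self, seq_len: int, device: torch.device):
        # Only static slicing happens here; no shape-dependent graph ops.
        mask = self.causal_mask_template[..., :seq_len, :seq_len].to(device)
        cos = self.cos_template[:seq_len].to(device)
        sin = self.sin_template[:seq_len].to(device)
        return mask, cos, sin
```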
stamalakhov (Contributor) reviewed on Mar 5, 2026
Comment on lines +190 to +191
```python
self._fq(cos, self.obs_cos),
self._fq(sin, self.obs_sin),
```
@dayo09
Sorry for the disturbance, but slicing these templates by the current sequence length would remove the dependence on the input size (it proved useful for LLaMA). It's similar to `self.causal_mask_template[..., :seq_len, :seq_len].to(device)` above (Ln 127). IMHO.
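The suggested code block did not survive extraction here; judging from the surrounding comment, the idea is to slice the cos/sin templates to the current sequence length before fake-quantizing, mirroring the mask slicing. A hypothetical reconstruction:

```python
# Hypothetical reconstruction (the original suggestion block is missing);
# `cos_template`/`sin_template` are assumed names for the precomputed RoPE tables.
self._fq(self.cos_template[:seq_len].to(device), self.obs_cos),
self._fq(self.sin_template[:seq_len].to(device), self.obs_sin),
```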
dayo09 force-pushed from 4829fa4 to e71a9b1
dayo09 (Contributor, Author) commented on Mar 11, 2026
| print(f"│ Mean |diff|: {(q_out - fp_out).abs().mean().item():.6f}") | ||
| print(f"│ PEIR : {compute_peir(fp_out, q_out) * 100:.6f} %") | ||
| print("└──────────────────────────────────────────────────────") | ||
| print(plot_two_outputs(fp_out, q_out)) |
┌───────────── Quantization Error Summary ─────────────
│ Mean |diff|: 0.071578
│ PEIR : 9.253764 %
└──────────────────────────────────────────────────────
┌────────────────────────────────────────────┐
5.1┤ • │
3.4┤ • •••• • │
1.7┤ •••••••••• │
0.0┤ •••••••••• │
-1.7┤ • •••••• │
-3.4┤ •••••••• │
-5.1┤ • │
└┬──────────┬──────────┬─────────┬──────────┬┘
-5.1 -2.5 0.0 2.5 5.1
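For context on the metric: assuming PEIR stands for peak-error-to-interval ratio (this definition is an assumption consistent with the printed numbers, not taken from the PR), `compute_peir` could look like:

```python
import torch

def compute_peir(fp_out: torch.Tensor, q_out: torch.Tensor) -> float:
    """Assumed definition: peak absolute error divided by the range
    (max - min) of the float reference output."""
    peak_error = (q_out - fp_out).abs().max()
    interval = fp_out.max() - fp_out.min()
    return (peak_error / interval).item()
```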
dayo09 (Contributor, Author) commented on Mar 11, 2026
| print(f"│ Mean |diff|: {(q_out - fp_out).abs().mean().item():.6f}") | ||
| print(f"│ PEIR : {compute_peir(fp_out, q_out) * 100:.6f} %") | ||
| print("└──────────────────────────────────────────────────────") | ||
| print(plot_two_outputs(fp_out, q_out)) |
python3 tico/quantization/wrapq/examples/qwen/quantize_text_model.py
┌───────────── Quantization Error Summary ─────────────
│ Mean |diff|: 0.904804
│ PEIR : 351.709125 %
└──────────────────────────────────────────────────────
┌──────────────────────────────────────────┐
28.2┤ │
│ │
│ •• •• • │
4.7┤ ••••• │
│ ••••• │
│ •••• │
-18.7┤ ••• │
│ │
│ │
-42.2┤ │
│ │
│ │
│ │
-65.6┤ │
│ │
│ │
-89.1┤ │
│ │
│ • │
-112.5┤ │
└┬─────────┬──────────┬─────────┬─────────┬┘
-112.5 -77.4 -42.2 -7.0 28.2
dayo09 (Contributor, Author) commented
There is one big outlier. 😢
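A peak-based metric like PEIR is dominated by exactly this kind of outlier: one bad element can push PEIR into the hundreds of percent even while the mean error stays small. An illustrative check under the assumed definition above:

```python
import torch

# 1,000 well-quantized values plus a single large outlier.
fp_out = torch.linspace(-5.0, 5.0, 1000)
q_out = fp_out.clone()
q_out[0] = -112.5                            # the one bad element

mean_abs = (q_out - fp_out).abs().mean()     # ≈ 0.11: outlier is averaged away
peak = (q_out - fp_out).abs().max()          # 107.5: dominated by the outlier
peir = peak / (fp_out.max() - fp_out.min())  # interval = 10.0
print(f"mean|diff| = {mean_abs:.3f}, PEIR = {peir * 100:.1f} %")  # PEIR = 1075.0 %
```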
Let's add wrappers for the upper-level Qwen3VL layers.

TICO-DCO-1.0-Signed-off-by: Dayoung Lee <dayoung.lee@samsung.com>
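As a rough sketch of what the `_CORE_MODULES` registration might look like (the dictionary shape, the wrapper constructors, and the `wrap_if_registered` helper are all assumptions; only the name `_CORE_MODULES` comes from the commit message):

```python
import torch.nn as nn

# Placeholder stand-ins so the sketch runs standalone; in the PR these are
# the real HF Qwen3VL modules and the new Quant* wrappers.
class Qwen3VLTextDecoderLayer(nn.Module): ...
class Qwen3VLTextModel(nn.Module): ...

class QuantQwen3VLTextDecoderLayer(nn.Module):
    def __init__(self, fp_module: nn.Module):
        super().__init__()
        self.fp = fp_module

class QuantQwen3VLTextModel(nn.Module):
    def __init__(self, fp_module: nn.Module):
        super().__init__()
        self.fp = fp_module

# Assumed registry shape mapping fp module class -> quantization wrapper class.
_CORE_MODULES = {
    Qwen3VLTextDecoderLayer: QuantQwen3VLTextDecoderLayer,
    Qwen3VLTextModel: QuantQwen3VLTextModel,
}

def wrap_if_registered(module: nn.Module) -> nn.Module:
    """Swap a module for its registered quantization wrapper, if one exists."""
    wrapper_cls = _CORE_MODULES.get(type(module))
    return wrapper_cls(module) if wrapper_cls is not None else module
```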