[QwixOdmlQuantizationBoundary] Fix tfl.mul element type mismatch in selective quantization. #254

Open
copybara-service[bot] wants to merge 1 commit into main from test_900510098

copybara-service bot commented Apr 16, 2026

[QwixOdmlQuantizationBoundary] Fix tfl.mul element type mismatch in selective quantization.

This change stabilizes the Qwix ODML quantization pipeline by enforcing strict boundaries between quantized and floating-point regions.

Key changes:

  • In _maybe_fake_quant, rule is None or rule.act_qtype is None now always means full precision (FP): the function clears all _FQ_ metadata and returns immediately. This prevents quantization metadata from propagating ("leaking") across operations that have no quantization rule defined.
  • FinalOutput now explicitly retrieves the previous_rule and passes it to _maybe_fake_quant, which preserves the intended delayed quantization of model outputs while keeping strict boundaries everywhere else.

Together, these changes keep quantization intent from leaking across boundaries and breaking binary operations such as multiplication (tfl.mul), where the operands would otherwise end up with mismatched element types. This resolves the MLIR assertion failures seen during TFLite conversion.
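
The boundary rule described above can be illustrated with a small, self-contained sketch. This is not the Qwix implementation. The `Rule` and `Value` classes, the `_FQ_qtype` key, the `element_type` helper, and the function signatures are all assumptions made for illustration; only the names `_maybe_fake_quant`, `FinalOutput`, `previous_rule`, `act_qtype`, and the `_FQ_` prefix come from the description.

```python
from dataclasses import dataclass, field, replace
from typing import Optional

FQ_PREFIX = "_FQ_"  # prefix named in the PR; the "_FQ_qtype" key below is invented


@dataclass(frozen=True)
class Rule:
    """Hypothetical quantization rule; only act_qtype matters here."""
    act_qtype: Optional[str] = None  # e.g. "int8"; None means no activation quantization


@dataclass(frozen=True)
class Value:
    """Hypothetical traced value that carries per-tensor metadata."""
    name: str
    meta: dict = field(default_factory=dict)


def maybe_fake_quant(x: Value, rule: Optional[Rule]) -> Value:
    """Toy model of the boundary check, not the real _maybe_fake_quant."""
    if rule is None or rule.act_qtype is None:
        # Full precision: strip every _FQ_ entry and return immediately.
        kept = {k: v for k, v in x.meta.items() if not k.startswith(FQ_PREFIX)}
        return replace(x, meta=kept)
    # A rule with an activation type exists, so record the quantization intent.
    return replace(x, meta={**x.meta, FQ_PREFIX + "qtype": rule.act_qtype})


def final_output(x: Value, previous_rule: Optional[Rule]) -> Value:
    """Toy FinalOutput: the producing op's rule is passed in explicitly."""
    return maybe_fake_quant(x, previous_rule)


def element_type(x: Value) -> str:
    """Element type inferred from the metadata in this toy model."""
    return x.meta.get(FQ_PREFIX + "qtype", "float32")


# An operand that still carries stale metadata from an upstream quantized op.
stale = Value("lhs", {FQ_PREFIX + "qtype": "int8", "shape": (2, 2)})
rhs = Value("rhs", {"shape": (2, 2)})

lhs = maybe_fake_quant(stale, rule=None)        # no rule: forced back to FP
out = final_output(Value("out"), Rule("int8"))  # delayed output quantization

print(element_type(stale), element_type(rhs))   # operand types before the boundary check
print(element_type(lhs), element_type(rhs))     # operand types after it
print(out.meta)
```

In this toy model, the stale operand and its float partner disagree on element type until the boundary check clears the `_FQ_` entries, which mirrors the kind of mismatch the description attributes to tfl.mul. Passing `previous_rule` explicitly lets the output path stay quantized without relaxing the check for every other operation.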

copybara-service bot force-pushed the test_900510098 branch 3 times, most recently from 67c9e64 to af12f15, on April 17, 2026 00:20

PiperOrigin-RevId: 900510098