    p.t().dot(&x)
} else {
    x.to_owned()
};
Bug: OPQ Rotation Matrix Mismatch
The ProductQuantizer applies the rotation matrix inconsistently between the training (build_level0) and encoding (encode_vector) phases. During training, the rotation is applied as a right multiplication (X @ P), but during encoding it is applied as a left multiplication by the transpose (P^T @ x). This mathematical discrepancy leads to incorrect vector transformations and quantization results when Optimized Product Quantization (OPQ) is enabled. The encode_vector method should apply the rotation as x.dot(&p) to match the training phase.
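One way to guard against a convention mismatch like the one described here is a round-trip check: rotate a vector, undo the rotation, and compare with the input. The sketch below is illustrative only. It uses plain `Vec<f32>` instead of ndarray, assumes the row-vector convention (y = x · P) and an orthogonal P, and the names `rotate` and `unrotate` are hypothetical, not taken from the PR.

```rust
/// Rotate a row vector: y = x · P (right multiplication).
/// `p` is a square d × d matrix stored as rows. Hypothetical helper.
fn rotate(x: &[f32], p: &[Vec<f32>]) -> Vec<f32> {
    let d = x.len();
    (0..d)
        .map(|j| (0..d).map(|i| x[i] * p[i][j]).sum::<f32>())
        .collect()
}

/// Undo the rotation: x = y · Pᵀ.
/// This is the inverse only under the assumption that P is orthogonal.
fn unrotate(y: &[f32], p: &[Vec<f32>]) -> Vec<f32> {
    let d = y.len();
    (0..d)
        .map(|i| (0..d).map(|j| y[j] * p[i][j]).sum::<f32>())
        .collect()
}
```

For an orthogonal P, `unrotate(&rotate(&x, &p), &p)` should reproduce `x` up to floating-point error. A unit test of that form on the training, encoding, and decoding paths would catch any mix of conventions.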
        "Dimension mismatch: {} != {} * {}",
        total_dims, m_blocks, dims_per_block
    ));
}
Bug: Quantizer Limits Exceeded
The ProductQuantizer's encoding and decoding logic implicitly limits k_centroids and m_blocks to a maximum of 16. This is because centroid IDs are packed into 4-bit fields within a u64 code.
- If k_centroids exceeds 16, centroid IDs will be truncated to 4 bits, causing silent data corruption.
- If m_blocks exceeds 16, the bit shift for encoding will exceed the u64's 64-bit width, leading to a shift-overflow panic in debug builds or silently incorrect encoding in release builds.

The build_level0 function lacks validation for these implicit limits.
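A minimal sketch of the validation this comment asks for, assuming 4-bit IDs packed into a u64 with block 0 in the lowest bits. The bit order, the constant names, and the functions `validate_pq_params` and `pack_ids` are assumptions for illustration, not code from the PR.

```rust
const BITS_PER_ID: u32 = 4;
/// Largest number of centroids a 4-bit ID can address (16).
const MAX_CENTROIDS: usize = 1 << BITS_PER_ID;
/// Number of 4-bit fields that fit in a u64 (16).
const MAX_BLOCKS: usize = (u64::BITS / BITS_PER_ID) as usize;

/// Reject parameters that cannot be represented by the packed code.
fn validate_pq_params(k_centroids: usize, m_blocks: usize) -> Result<(), String> {
    if k_centroids == 0 || k_centroids > MAX_CENTROIDS {
        return Err(format!("k_centroids must be in 1..={MAX_CENTROIDS}, got {k_centroids}"));
    }
    if m_blocks == 0 || m_blocks > MAX_BLOCKS {
        return Err(format!("m_blocks must be in 1..={MAX_BLOCKS}, got {m_blocks}"));
    }
    Ok(())
}

/// Pack one centroid ID per block into a u64, block 0 in the low bits.
/// Fails instead of truncating an ID or shifting past bit 63.
fn pack_ids(ids: &[u8]) -> Result<u64, String> {
    if ids.len() > MAX_BLOCKS {
        return Err(format!("too many blocks: {}", ids.len()));
    }
    let mut code = 0u64;
    for (block, &id) in ids.iter().enumerate() {
        if usize::from(id) >= MAX_CENTROIDS {
            return Err(format!("centroid id {id} does not fit in {BITS_PER_ID} bits"));
        }
        code |= u64::from(id) << (block as u32 * BITS_PER_ID);
    }
    Ok(code)
}
```

Calling a check like `validate_pq_params` at the start of training would turn both failure modes into an explicit error before any code is written.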
    p.dot(&decoded)
} else {
    decoded
}
Bug: Quantizer Decoding Errors: Rotation and Bit Packing
The ProductQuantizer::decode_code method contains two bugs:
- Incorrect OPQ inverse rotation: It applies the rotation matrix P instead of its transpose P^T to reverse the transformation.
- k_centroids bit packing issue: It uses 4-bit packing (& 0xF) for centroid IDs, implicitly limiting k_centroids to 16. This causes out-of-bounds access if k_centroids < 16, because a 4-bit field can hold an ID as large as 15, and incorrect encoding/decoding if k_centroids > 16 (due to ID truncation).
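The out-of-bounds case on the decoding side can be sketched as a bounds-checked unpacking step. This assumes the same 4-bit layout with block 0 in the lowest bits, which is a guess at the PR's layout. `unpack_ids` is a hypothetical helper, not the PR's decode_code.

```rust
/// Extract `m_blocks` 4-bit centroid IDs from a packed code.
/// Returns an error rather than indexing a codebook with an ID >= k_centroids.
fn unpack_ids(code: u64, m_blocks: usize, k_centroids: usize) -> Result<Vec<usize>, String> {
    if m_blocks > 16 {
        return Err(format!("m_blocks {m_blocks} exceeds the 16 fields in a u64"));
    }
    (0..m_blocks)
        .map(|block| {
            // Shift is at most 60 bits because m_blocks <= 16.
            let id = ((code >> (block * 4)) & 0xF) as usize;
            if id < k_centroids {
                Ok(id)
            } else {
                Err(format!("block {block}: id {id} out of range for {k_centroids} centroids"))
            }
        })
        .collect()
}
```

With k_centroids = 8, a field holding 15 is reported as an error here, whereas an unchecked codebook lookup would read past the end of the centroid table.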
🔁 Pull Request Template – BharatMLStack
📌 Summary
📂 Modules Affected
- horizon (Real-time systems / networking)
- online-feature-store (Feature serving infra)
- trufflebox-ui (Admin panel / UI)
- infra (Docker, CI/CD, GCP/AWS setup)
- docs (Documentation updates)
- ___________
✅ Type of Change
- ___________
📊 Benchmark / Metrics (if applicable)