sycl: Q5_K reorder MMVQ/dequant + Q8_0 reorder MMVQ path by aicss-genai · Pull Request #22152 · ggml-org/llama.cpp

aicss-genai · 2026-04-20T07:04:28Z

Overview

Authors

Extends the reorder-quantized codepath to Q5_K (new) and adds a reorder
MMVQ kernel for Q8_0.

Adds block_q_t<GGML_TYPE_Q5_K> specialization with layout [qs (QK_K/2 per block)] [qh (QK_K/8 per block)] [scales] [dm] and matching get_block_offset / get_d_offset.
Adds reorder_qw_q5_k (weight reorder), reorder_mul_mat_vec_q5_k_q8_1_sycl (MMVQ kernel), dequantize_row_q5_K_sycl_reorder and the reorder variant of dequantize_block_q5_K.
Wires Q5_K into ggml_sycl_supports_reorder_mul_mat_sycl, ggml_sycl_supports_reorder_mmvq, and the reorder_qw dispatch.
Adds reorder_mul_mat_vec_q8_0_q8_1_sycl and inlines reorder_vec_dot_q_sycl<Q8_0>::operator() (removes the small vec_dot_q8_0_q8_1_impl helper).
Adds dequantize_q8_0_reorder and dequantize_block_q8_0_reorder helpers used by the Q8_0 reorder MMVQ path.
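
To illustrate the Q5_K reorder layout listed above, here is a minimal sketch of how byte offsets into a [qs] [qh] [scales] [dm] buffer could be computed. It is not the PR's get_block_offset / get_d_offset code: the names q5k_reorder_offsets, q5k_offsets_for and q5k_reorder_total_bytes are hypothetical, and the 12-byte scales and 4-byte dm sizes are assumed from ggml's block_q5_K definition rather than stated in this PR.

```cpp
#include <cassert>
#include <cstddef>

// Sizes per Q5_K super-block. QK_K/2 and QK_K/8 come from the PR text;
// the scales and dm sizes are assumptions based on ggml's block_q5_K.
constexpr std::size_t QK_K         = 256;
constexpr std::size_t QS_BYTES     = QK_K / 2;  // low 4 bits of each quant
constexpr std::size_t QH_BYTES     = QK_K / 8;  // high (5th) bit of each quant
constexpr std::size_t SCALES_BYTES = 12;        // assumed: packed 6-bit scales/mins
constexpr std::size_t DM_BYTES     = 4;         // assumed: two fp16 values (d, dmin)

// Byte offsets of one block's fields inside the reordered buffer.
struct q5k_reorder_offsets {
    std::size_t qs;
    std::size_t qh;
    std::size_t scales;
    std::size_t dm;
};

// Total size of a reordered buffer holding nblocks blocks.
constexpr std::size_t q5k_reorder_total_bytes(std::size_t nblocks) {
    return nblocks * (QS_BYTES + QH_BYTES + SCALES_BYTES + DM_BYTES);
}

// Hypothetical helper: each field is stored contiguously for all blocks,
// in the order [qs][qh][scales][dm], so a region's base is the sum of the
// sizes of the regions before it.
inline q5k_reorder_offsets q5k_offsets_for(std::size_t block, std::size_t nblocks) {
    assert(block < nblocks);
    const std::size_t qh_base     = nblocks * QS_BYTES;
    const std::size_t scales_base = qh_base + nblocks * QH_BYTES;
    const std::size_t dm_base     = scales_base + nblocks * SCALES_BYTES;
    return {
        block * QS_BYTES,
        qh_base + block * QH_BYTES,
        scales_base + block * SCALES_BYTES,
        dm_base + block * DM_BYTES,
    };
}

// Reordering moves bytes around but does not change the per-block footprint.
static_assert(q5k_reorder_total_bytes(1) == 176, "Q5_K block is 176 bytes");
```

Under these assumptions the reordered buffer is the same size as the original array of blocks; only the field order changes, so that consecutive blocks' qs (and qh, scales, dm) values sit next to each other in memory.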

Uses the existing g_ggml_sycl_use_async_mem_op flag (default off in master); no dependency on #22066's async-toggle change.

Additional information

Split from #22066 per reviewer request for independent review.

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: Yes. This work was partially produced with an agentic engineering approach: agents surface issues and explore experiments while engineers identify and reject candidates using domain knowledge. Human feedback involved.

Signed-off-by: Chun Tao <chun.tao@intel.com>

ggml-gh-bot · 2026-04-20T07:08:44Z

Hi @aicss-genai, thanks for your contribution!

Per our contribution guidelines, the automated PR checker found the following issue(s) that need your attention:

Multiple open PRs from a new contributor: We limit new contributors (those without a previously merged PR) to 1 open PR at a time. You currently have 7 open PRs.

Please note that maintainers reserve the right to make final decisions on PRs. If you believe there is a mistake, please comment below.

ctao456 and others added 2 commits April 19, 2026 23:37

sycl: Q5_K reorder MMVQ/dequant + Q8_0 reorder MMVQ path

ac54489

Signed-off-by: Chun Tao <chun.tao@intel.com>

Merge branch 'ggml-org:master' into aicss-genai/sycl-bmg-upstream-pr-5

ce43550

aicss-genai requested a review from a team as a code owner April 20, 2026 07:04

github-actions bot added the ggml and SYCL labels Apr 20, 2026

aicss-genai wants to merge 2 commits into ggml-org:master from aicss-genai:aicss-genai/sycl-bmg-upstream-pr-5
