sycl: Battlemage AOT build via spir64_gen + MMQ subgroup annotations by aicss-genai · Pull Request #22147 · ggml-org/llama.cpp

aicss-genai · 2026-04-20T06:53:29Z

Overview

Authors

Enables ahead-of-time (AOT) builds for Intel GPUs (validated on Intel® Arc™ Pro B70, BMG-G31, Xe2-HPG):

- When GGML_SYCL_DEVICE_ARCH is set, switch to -fsycl-targets=spir64_gen with -Xsycl-target-backend="-device <arch>" and skip -ze-intel-greater-than-4GB-buffer-required (not accepted by the AOT path). Behavior is unchanged when GGML_SYCL_DEVICE_ARCH is unset.
- Adds [[intel::reqd_sub_group_size(WARP_SIZE)]] to the MMQ Q4_0/Q4_1/Q5_0/Q5_1 kernel launches. WARP_SIZE is 16 on Intel targets; pinning the subgroup size is needed for correctness under spir64_gen AOT and documents the intent on the JIT path.
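
For reviewers who want the flag selection in one place, here is a minimal sketch of the branching described in the first item. It is not the code in this PR: the real logic lives in the CMake build files, and this is only a C++ restatement of it. The function name `sycl_extra_flags`, its return shape, and the way each option is spelled as a single string are assumptions made for illustration. That the JIT path still passes the large-buffer option is inferred from "behavior is unchanged".

```cpp
#include <string>
#include <vector>

// Hypothetical helper, for illustration only. It mirrors the branching in the
// PR description; it is not taken from the actual CMake change.
// device_arch stands in for the value of GGML_SYCL_DEVICE_ARCH ("" = unset).
std::vector<std::string> sycl_extra_flags(const std::string & device_arch) {
    if (device_arch.empty()) {
        // JIT path: GGML_SYCL_DEVICE_ARCH is unset, so behavior is unchanged
        // and the large-buffer option is assumed to be passed as before.
        return { "-ze-intel-greater-than-4GB-buffer-required" };
    }
    // AOT path: target spir64_gen and forward the device name to the backend
    // compiler. The large-buffer option is skipped here because the AOT path
    // does not accept it.
    return {
        "-fsycl-targets=spir64_gen",
        "-Xsycl-target-backend=\"-device " + device_arch + "\"",
    };
}
```

Calling `sycl_extra_flags("")` yields only the large-buffer option, while passing any non-empty architecture string yields the two AOT options with that string substituted for `<arch>`.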

Also adds a documentation-only update to ggml_sycl_supports_mmq (still returns false) and a note in ggml_sycl_op_mul_mat_sycl recording that fused dequant+GEMM and MMQ/DPAS were both slower than dequant+oneDNN in our experiments.

Additional information

Split from #22066 per reviewer request for independent review.

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: Yes. This work was partially produced with an agentic engineering approach: agents surface issues and explore experiments while engineers identify and reject candidates using domain knowledge. Human feedback involved.

Signed-off-by: Chun Tao <chun.tao@intel.com>

ggml-gh-bot · 2026-04-20T06:58:06Z

Hi @aicss-genai, thanks for your contribution!

Per our contribution guidelines, the automated PR checker found the following issue(s) that need your attention:

Multiple open PRs from a new contributor: We limit new contributors (those without a previously merged PR) to 1 open PR at a time. You currently have 3 open PRs.

Please note that maintainers reserve the right to make final decisions on PRs. If you believe there is a mistake, please comment below.

ctao456 and others added 2 commits April 19, 2026 23:34

sycl: Battlemage AOT build via spir64_gen + MMQ subgroup annotations

4065b65

Signed-off-by: Chun Tao <chun.tao@intel.com>

Merge branch 'ggml-org:master' into aicss-genai/sycl-bmg-upstream-pr-1

dbbf560

aicss-genai requested a review from a team as a code owner April 20, 2026 06:53

github-actions bot added the ggml and SYCL labels Apr 20, 2026

aicss-genai wants to merge 2 commits into ggml-org:master from aicss-genai:aicss-genai/sycl-bmg-upstream-pr-1
