
Fix TurboQuant turbo4 Vulkan SET_ROWS support #4

Merged
romgenie merged 1 commit into master from codex/fix-turbo4-vulkan-set-rows, May 23, 2026

@romgenie commented May 23, 2026

Summary

  • Ports TurboQuant Vulkan SET_ROWS support for GGML_TYPE_TURBO4_0 so -ctv turbo4 can populate KV-cache rows on Vulkan instead of failing backend support checks.
  • Keeps turbo2/turbo3/turbo4 SET_ROWS pipeline creation behind Vulkan subgroup capability checks, with the turbo3 ballot path requiring ballot support.
  • Fixes review/CI issues found after opening the PR: restores non-Windows shared GGML_API extern, exports the TurboQuant CPU WHT global from its definition only, recomputes layer-adaptive KV mode per cache construction, rejects Grok+TurboKV when flash-attn is incompatible, bumps RPC protocol major after the ggml_op enum renumber, and removes unused CUDA SET_ROWS locals that fail NVCC -Werror.
  • Keeps related CI repairs in the branch: Windows server link fix, perplexity unused-variable fix, and ROCm/HIP 640x512 attention config coverage.
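
The subgroup gating in the second bullet can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `DeviceCaps`, `TurboKind`, and `can_create_set_rows_pipeline` are hypothetical names, and treating turbo2/turbo4 as needing only basic subgroup support is an assumption; the summary only states that the turbo3 ballot path requires ballot support.

```cpp
#include <cassert>

// Hypothetical capability flags; a real backend would query these from the Vulkan device.
struct DeviceCaps {
    bool subgroup_basic;   // device exposes basic subgroup operations
    bool subgroup_ballot;  // device exposes subgroup ballot operations
};

enum class TurboKind { Turbo2, Turbo3, Turbo4 };

// Returns true when a SET_ROWS pipeline for `kind` may be created on this device.
inline bool can_create_set_rows_pipeline(const DeviceCaps & caps, TurboKind kind) {
    if (!caps.subgroup_basic) {
        return false;  // no subgroup support: skip every turbo SET_ROWS pipeline
    }
    if (kind == TurboKind::Turbo3) {
        return caps.subgroup_ballot;  // assumed: the ballot path also needs ballot support
    }
    return true;
}

// Example usage with illustrative capability values.
inline void gating_example() {
    const DeviceCaps no_ballot{true, false};
    assert(can_create_set_rows_pipeline(no_ballot, TurboKind::Turbo4));
    assert(!can_create_set_rows_pipeline(no_ballot, TurboKind::Turbo3));
}
```

When a pipeline is skipped this way, the backend's support check should also report the op as unsupported so the scheduler can fall back instead of dispatching a missing pipeline.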

Verification

  • Remote B60/Vulkan build completed for ggml-base, llama-perplexity, llama-cli, llama-server, and test-backend-ops.
  • RPC verification built ggml-rpc and rpc-server with GGML_RPC=ON.
  • ./build/bin/test-backend-ops test -b Vulkan0 -o SET_ROWS_TURBO3 passed 21/21.
  • ./build/bin/test-backend-ops test -b Vulkan0 -o SET_ROWS_TURBO4 passed 21/21.
  • ./build/bin/test-chat passed.
  • Repository review agent re-run after fixes: no concerns.

Head verified locally/remotely before CUDA warning cleanup: 48dbb220c32b4186dd6861a98acad5b28d1fd35c.
Current head with CUDA warning cleanup: c43d35bbf.

@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, add credits to your account and enable them for code reviews in your settings.

@romgenie (Author)

@codex review

@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, add credits to your account and enable them for code reviews in your settings.

@romgenie (Author)

@codex review

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 93b95eb170

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/llama-kv-cache.cpp Outdated
Comment thread ggml/include/ggml.h Outdated
Comment thread src/llama-context.cpp Outdated
@romgenie force-pushed the codex/fix-turbo4-vulkan-set-rows branch from 93b95eb to dedc62b on May 23, 2026 16:47

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dedc62b298

Comment thread ggml/include/ggml.h
@romgenie force-pushed the codex/fix-turbo4-vulkan-set-rows branch from dedc62b to 48dbb22 on May 23, 2026 17:02

@romgenie (Author)

@codex review

Please review the updated head 48dbb220c32b4186dd6861a98acad5b28d1fd35c. I resolved the prior concerns around GGML_API extern, per-cache adaptive mode, Grok flash-attn handling, and the RPC protocol major bump after the ggml_op enum renumber.
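
As a rough illustration of why an enum renumber calls for a major bump: if operation IDs travel over the wire as integers, a client and server built before and after the renumber would disagree on what each ID means, so mismatched majors should refuse to talk to each other. The sketch below assumes that model; `RpcVersion` and `rpc_versions_compatible` are hypothetical names, and the version numbers are made up.

```cpp
#include <cassert>

// Hypothetical version triple; the field names are illustrative only.
struct RpcVersion {
    int major_ver;
    int minor_ver;
    int patch_ver;
};

// Peers are treated as compatible only when their major versions match.
inline bool rpc_versions_compatible(const RpcVersion & client, const RpcVersion & server) {
    return client.major_ver == server.major_ver;
}

// Example usage with made-up version numbers.
inline void rpc_version_example() {
    assert(rpc_versions_compatible({3, 0, 0}, {3, 1, 2}));
    assert(!rpc_versions_compatible({3, 0, 0}, {4, 0, 0}));
}
```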

@romgenie force-pushed the codex/fix-turbo4-vulkan-set-rows branch from 48dbb22 to c43d35b on May 23, 2026 17:21

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c43d35bbfb

Comment thread ggml/src/ggml-cpu/ops.cpp Outdated
Comment thread src/llama-kv-cache.cpp Outdated
Comment thread src/llama-kv-cache.cpp
@romgenie force-pushed the codex/fix-turbo4-vulkan-set-rows branch from c43d35b to 2d0f573 on May 23, 2026 18:08

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2d0f573fa6

Comment thread src/llama-graph.cpp
Comment thread src/llama-kv-cache.cpp Outdated
@romgenie force-pushed the codex/fix-turbo4-vulkan-set-rows branch from 2d0f573 to bf83572 on May 23, 2026 18:32

@romgenie (Author)

@codex review

Please review the updated head bf8357253. This push is intentionally marked [skip ci] while we resolve review feedback before resuming CI. It addresses the current unresolved Codex feedback: CPU SET_ROWS WHT group state is thread-local, InnerQ tensor updates are generation-based per KV cache, K-shift uses padded K tensor dimensions, InnerQ scale allocation covers turbo V-only/auto-asymmetric paths, and Vulkan now supports 128-wide GGML_OP_TURBO_WHT with InnerQ scale binding.
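
One way to read "generation-based per KV cache" is that each cache carries its own counter, and a consumer re-copies the scale data only when that counter has advanced. The sketch below illustrates that pattern under this assumption; `InnerQState` and `ScaleTensorMirror` are hypothetical types, not the structures used in this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical per-cache scale state: bumping `generation` signals that `scales` changed.
struct InnerQState {
    uint64_t generation = 0;
    std::vector<float> scales;

    void publish(std::vector<float> new_scales) {
        scales = std::move(new_scales);
        ++generation;
    }
};

// Hypothetical consumer that mirrors the scales into a tensor-like buffer.
struct ScaleTensorMirror {
    uint64_t seen_generation = 0;
    std::vector<float> data;

    // Copies the scales only when the cache has published a newer generation.
    // Returns true if a copy was performed.
    bool sync(const InnerQState & state) {
        if (state.generation == seen_generation) {
            return false;
        }
        data = state.scales;
        seen_generation = state.generation;
        return true;
    }
};

// Example usage: the second sync is a no-op because nothing new was published.
inline void innerq_sync_example() {
    InnerQState state;
    ScaleTensorMirror mirror;
    state.publish({0.5f, 2.0f});
    assert(mirror.sync(state));
    assert(!mirror.sync(state));
}
```

Keeping the counter inside each cache object, rather than in a process-wide global, means two caches cannot mask each other's updates.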

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. You're on a roll.

@romgenie force-pushed the codex/fix-turbo4-vulkan-set-rows branch from bf83572 to 5a8882d on May 23, 2026 18:39

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5a8882d90a

Comment thread ggml/src/ggml-cpu/ggml-cpu.c Outdated
Comment thread ggml/src/ggml-vulkan/ggml-vulkan.cpp Outdated
Comment thread ggml/src/ggml.c
Comment thread ggml/src/ggml-metal/ggml-metal-ops.cpp
Comment thread ggml/src/ggml-metal/ggml-metal-ops.cpp
@romgenie force-pushed the codex/fix-turbo4-vulkan-set-rows branch 2 times, most recently from be3ac96 to cf34062, on May 23, 2026 19:06

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cf34062e51

Comment thread ggml/src/ggml-turbo-quant.c Outdated
Comment thread ggml/src/ggml-cuda/turbo-innerq.cu
@romgenie force-pushed the codex/fix-turbo4-vulkan-set-rows branch 2 times, most recently from 379c56b to 921f10e, on May 23, 2026 20:18

@romgenie (Author)

@codex review

Please review the current CI-suppressed head 921f10e671. CI is intentionally stopped while we iterate on review feedback. This head resolves the prior conversations: CPU WHT group state, InnerQ generation/snapshot publication, padded K shift strides, Vulkan/Metal TURBO_WHT support and scale handling, TURBO_WHT scale validation, CPU vec_dot heap allocation, and TurboQuant lazy-init thread safety. If you find concerns, comment inline so I can fix them before CI is re-enabled.
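
For readers unfamiliar with the lazy-init concern, one common C++ way to make one-time table construction safe is a function-local static, whose initialization has been guaranteed thread-safe since C++11. The sketch below shows only that general pattern: `turbo_sign_table` and its contents are placeholders, and since `ggml-turbo-quant.c` is a C file, the mechanism actually used in this PR is likely different.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical lookup table; the values are placeholders, not TurboQuant's real data.
inline const std::array<float, 128> & turbo_sign_table() {
    // A function-local static is initialized exactly once; since C++11 the
    // initialization is thread-safe even if several threads arrive concurrently.
    static const std::array<float, 128> table = [] {
        std::array<float, 128> t{};
        for (std::size_t i = 0; i < t.size(); ++i) {
            t[i] = (i % 2 == 0) ? 1.0f : -1.0f;
        }
        return t;
    }();
    return table;
}

// Example usage: repeated calls return the same, already-built table.
inline void lazy_init_example() {
    assert(&turbo_sign_table() == &turbo_sign_table());
    assert(turbo_sign_table()[0] == 1.0f);
}
```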

@romgenie (Author)

@codex review

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 921f10e671

Comment thread ggml/src/ggml-vulkan/ggml-vulkan.cpp Outdated
@romgenie force-pushed the codex/fix-turbo4-vulkan-set-rows branch from 921f10e to b4461ca on May 23, 2026 20:31

@romgenie (Author)

@codex review

Please review the current CI-suppressed head b4461ca19. This push resolves the latest Vulkan SET_ROWS concern by rejecting Turbo SET_ROWS when the source row width is not a 128-element block multiple, matching the shader block geometry. CI remains intentionally stopped until your review has no unresolved concerns.
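
The width check described above amounts to a divisibility test. A minimal sketch, assuming the 128-element block size quoted in the comment; `TURBO_BLOCK_ELEMS` and `turbo_set_rows_width_supported` are hypothetical names, and the real support check presumably tests other conditions as well.

```cpp
#include <cassert>
#include <cstdint>

// Assumed block width, taken from the "128-element block" wording above.
constexpr int64_t TURBO_BLOCK_ELEMS = 128;

// Hypothetical support check: accept only rows that split into whole 128-element blocks.
inline bool turbo_set_rows_width_supported(int64_t row_width) {
    return row_width > 0 && row_width % TURBO_BLOCK_ELEMS == 0;
}

// Example usage: 256 is two full blocks, 192 leaves a partial block.
inline void width_check_example() {
    assert(turbo_set_rows_width_supported(256));
    assert(!turbo_set_rows_width_supported(192));
}
```

Rejecting the op up front lets the scheduler route such rows to another backend rather than running a shader whose block geometry does not match the data.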

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Bravo.

@romgenie force-pushed the codex/fix-turbo4-vulkan-set-rows branch from b4461ca to 9711db6 on May 23, 2026 20:38
@romgenie merged commit 60635b6 into master on May 23, 2026
47 of 61 checks passed