replace ck_tile type convert by opus cast #2331

Open
yzhou103 wants to merge 8 commits into main from replace_cvt_by_opus
Conversation

@yzhou103
Contributor

Motivation

Technical Details

Test Plan

Test Result

Submission Checklist

yzhou103 requested review from a team and Copilot on March 18, 2026 11:00
@github-actions
Contributor

🏷️ CI Guide

Runs automatically on every PR:

  • ✅ Pre-checks (submodule verification, code formatting)
  • ✅ Aiter op tests (gfx942 + gfx950)
  • ✅ Triton tests (only when aiter/ops/triton/** or related paths are changed)

Extended tests (opt-in via labels):

| Label | Tests |
| --- | --- |
| ci:sglang | SGLang integration tests |
| ci:atom | ATOM benchmark (DeepSeek-R1 + GPT-OSS) |
| ci:vllm | vLLM benchmark |
| ci:all | All of the above |

Add labels via the sidebar or gh pr edit 2331 --add-label <label>

Contributor

Copilot AI left a comment

Pull request overview

This PR migrates the fused Q/K RoPE + concat + cache CUDA/HIP kernel path away from ck_tile vector conversion utilities toward OPUS-based casting/conversion (aiter::scaled_cast, opus::cast) and introduces an OPUS-specific dtype dispatch macro to instantiate kernels with OPUS scalar types.

Changes:

  • Update cache_kernels.cu to use aiter::scaled_cast / opus::cast and switch kernel instantiation to DISPATCH_BY_KV_CACHE_QUERY_DTYPE_OPUS.
  • Add DISPATCH_BY_KV_CACHE_QUERY_DTYPE_OPUS macro in quant_utils.cuh for OPUS type dispatch.
  • Extend aiter_opus_plus.h with scaled conversion helpers and add #pragma once where missing.
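As a rough illustration of what a scaled conversion helper does, here is a minimal host-side C++ sketch. It is not the `aiter::scaled_cast` implementation or signature: the name `scaled_cast_sketch`, the choice to divide by the scale, the round-to-nearest step, and the use of small integer targets in place of FP8 are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a scaled conversion helper.
// NOT the aiter::scaled_cast signature; integer targets replace FP8 here.
// Divides by the scale, rounds to the nearest integer, and saturates to Out's range.
template <typename Out>
Out scaled_cast_sketch(float x, float scale)
{
    static_assert(std::numeric_limits<Out>::is_integer && sizeof(Out) <= 2,
                  "sketch covers 8- and 16-bit integer targets only");
    assert(scale > 0.0f); // assumption: scales are strictly positive

    if (std::isnan(x)) {
        return Out{0}; // assumption: map NaN to zero instead of invoking UB
    }

    const float lo = static_cast<float>(std::numeric_limits<Out>::lowest());
    const float hi = static_cast<float>(std::numeric_limits<Out>::max());
    const float q  = std::nearbyint(x / scale); // uses the current rounding mode

    // Clamp before the narrowing cast so out-of-range values saturate.
    return static_cast<Out>(std::min(std::max(q, lo), hi));
}
```

For example, `scaled_cast_sketch<std::int8_t>(1.0f, 0.5f)` yields 2, and any input far outside the target range saturates to the type's minimum or maximum.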

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| csrc/kernels/cache_kernels.cu | Replace ck_tile conversion/casts with OPUS equivalents and switch dtype dispatch to OPUS. |
| csrc/include/quant_utils.cuh | Add OPUS variant of KV-cache/query dtype dispatch macro. |
| csrc/include/hip_reduce.h | Add #pragma once and update copyright year. |
| csrc/include/aiter_opus_plus.h | Add #pragma once and introduce OPUS-based scaled conversion utilities (scaled_cast). |
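The general pattern behind a dtype dispatch macro, mapping a runtime dtype value to a compile-time template instantiation, can be sketched in plain C++ as follows. This is not the body or signature of `DISPATCH_BY_KV_CACHE_QUERY_DTYPE_OPUS`: the tag types, the string-based dtype key, `launch_sketch`, and `DISPATCH_BY_DTYPE_SKETCH` are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical tag types standing in for OPUS scalar types.
struct bf16_tag {};
struct fp16_tag {};

// Maps each tag type to a printable name so the dispatch result is observable.
template <typename T> struct dtype_name;
template <> struct dtype_name<bf16_tag> { static const char* value() { return "bf16"; } };
template <> struct dtype_name<fp16_tag> { static const char* value() { return "fp16"; } };

// Hypothetical stand-in for a templated kernel launcher.
template <typename query_t>
std::string launch_sketch()
{
    return std::string("launched with ") + dtype_name<query_t>::value();
}

// Selects the template argument for FN from a runtime dtype string.
// Throws for any dtype that has no instantiation.
#define DISPATCH_BY_DTYPE_SKETCH(DTYPE, FN)                           \
    do {                                                              \
        if ((DTYPE) == "bf16") { return FN<bf16_tag>(); }             \
        if ((DTYPE) == "fp16") { return FN<fp16_tag>(); }             \
        throw std::invalid_argument("unsupported dtype: " + (DTYPE)); \
    } while (0)

std::string dispatch_query_dtype_sketch(const std::string& dtype)
{
    DISPATCH_BY_DTYPE_SKETCH(dtype, launch_sketch);
}
```

Calling `dispatch_query_dtype_sketch("bf16")` returns the string produced by `launch_sketch<bf16_tag>()`. A variant of such a macro that targets a different type family only needs to change which types appear in the branches, leaving call sites untouched.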

yzhou103 changed the title from "replace ck_tile type covert by opus cast" to "replace ck_tile type convert by opus cast" on Mar 19, 2026
4 participants