GEMM+GEMM and CONV+GEMM support to quickTuningGen and GEMM+GEMM quick tuning list #2262

Open
dorde-antic wants to merge 11 commits into develop from AIROCMLIR-72-gemm-gemm-support-in-quick-tuning-gen

@dorde-antic dorde-antic commented Mar 1, 2026

Motivation

  • AIROCMLIR-71 ⌛ (waiting for access to some machines to complete it)
    • gfx908 ✅
    • gfx1200 ✅
    • gfx1100 ✅
    • gfx90a ⌛
    • gfx942 ⌛
    • gfx950 ⌛
  • AIROCMLIR-72 ✅
  • AIROCMLIR-198 ✅

Technical Details

Test Plan

Quick tuning locally
tuningRunner and perfRunner in general
CI

Test Result

Quick tuning locally ✅
PR CI
Weekly CI
Nightly CI

Submission Checklist

Copilot AI left a comment

Pull request overview

This PR adds two groups of changes: (1) updates the rocprofv3 profiler invocation in perfRunner.py and tuningRunner.py to use the correct --output-format csv flag (replacing the old -f csv flag), and (2) extends quickTuningGen.py to handle gemm_gemm and conv_gemm operations, and adds the corresponding GEMM+GEMM quick tuning parameter arrays for gfx908 (f16, f32) and gfx1200 (f16) architectures to QuickTuningPerfconfigs.inc.

Changes:

  • Updated rocprofv3 flag from -f csv to --output-format csv across performance runner scripts
  • Added GEMM+GEMM and CONV+GEMM operation support in the quickTuningGen.py code generator
  • Added GEMM+GEMM quick tuning parameter lists for gfx908 and gfx1200 to the .inc file
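The flag change can be illustrated with a minimal sketch of how a runner script might assemble the profiler command. Only `--output-format csv` comes from this PR; the helper name `build_rocprofv3_cmd`, the `--kernel-trace` and `-d` options, and the benchmark path are assumptions chosen for illustration and may not match what perfRunner.py or tuningRunner.py actually pass.

```python
import shlex
from typing import List


def build_rocprofv3_cmd(app_cmd: List[str], output_dir: str) -> List[str]:
    """Wrap an application command in a rocprofv3 invocation that writes CSV.

    Hypothetical helper: the real runner scripts may build this differently.
    """
    return [
        "rocprofv3",
        "--kernel-trace",          # assumption: which trace to collect
        "--output-format", "csv",  # from the PR: replaces the old `-f csv`
        "-d", output_dir,          # assumption: output directory option
        "--",                      # everything after this is the profiled app
        *app_cmd,
    ]


# Hypothetical benchmark binary and arguments.
cmd = build_rocprofv3_cmd(["./my_benchmark", "--iters", "10"], "/tmp/rocprof_out")
print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps each flag and its value as separate arguments, which avoids quoting problems if it is later handed to `subprocess.run`.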

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| mlir/utils/performance/perfRunner.py | Updates rocprofv3 --output-format csv flag in two profiler invocations |
| mlir/utils/performance/tuningRunner.py | Same rocprofv3 flag update in verification pipeline |
| mlir/utils/performance/analysis/quickTuningGen.py | Adds column definitions and full code-generator support for gemm_gemm and conv_gemm ops |
| mlir/include/mlir/Dialect/Rock/Tuning/QuickTuningPerfconfigs.inc | Adds GEMM+GEMM quick tuning parameter arrays and lookup entries for gfx908 (f16, f32) and gfx1200 (f16) |

@dorde-antic dorde-antic requested a review from umangyadav March 2, 2026 12:00
@dorde-antic dorde-antic marked this pull request as ready for review March 11, 2026 15:21
@dorde-antic dorde-antic requested a review from causten as a code owner March 11, 2026 15:21