
[AIROCMLIR-447] Lower migraphx.conv into Linalg#2241

Open
Mr-Anyone wants to merge 23 commits into develop from pr-template-migraphx-to-linalg-conv

Conversation

Member

@Mr-Anyone Mr-Anyone commented Feb 17, 2026

Motivation

Enable lowering migraphx.conv into the Linalg dialect.

Technical Details

Forward convolution is lowered in three steps:

  1. Apply padding to the input when the op has non-zero padding.
  2. Expand the channel dimension into (group, channel_per_group), introducing a group dimension G. Input becomes NGC* (e.g. NGCL, NGCHW, NGCDHW) and filter becomes GFC* (e.g. GFCL, GFCHW, GFCDHW), matching the group attr.
  3. Emit the grouped linalg convolution (1D/2D/3D), then collapse the result back to the original NFHW/NFDHW shape for the type converter.
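The steps above can be sketched in NumPy; the shapes here are hypothetical, chosen only to illustrate the step-2 group expansion, which is just a reshape splitting the channel and filter dimensions:

```python
import numpy as np

# Hypothetical 2-D example: N=2, C=8, H=W=5, group=4, F=12.
N, C, H, W = 2, 8, 5, 5
G, F, KH, KW = 4, 12, 3, 3

x = np.zeros((N, C, H, W), dtype=np.float32)
w = np.zeros((F, C // G, KH, KW), dtype=np.float32)

# Input: NCHW -> NGCHW, splitting C into (G, C_per_group).
x_grouped = x.reshape(N, G, C // G, H, W)
# Filter: FCHW -> GFCHW, splitting F into (G, F_per_group).
w_grouped = w.reshape(G, F // G, C // G, KH, KW)

print(x_grouped.shape, w_grouped.shape)  # (2, 4, 2, 5, 5) (4, 3, 2, 3, 3)
```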

The linalg.generic emitted for Conv1D essentially implements this Python loop:

import numpy as np

def my_conv_1d(x, filters, group, strides=1, dilation=1):
    # Input layout: NGCL
    batch = x.shape[0]
    assert x.shape[1] == group and filters.shape[0] == group
    channel = x.shape[2]
    input_length = x.shape[3]
    # Filter layout: GFCL
    filter_count = filters.shape[1]
    assert filters.shape[2] == channel
    kernel_length = filters.shape[3]

    output_length = (input_length - kernel_length - (kernel_length - 1) * (dilation - 1)) // strides + 1
    result = np.zeros((batch, group, filter_count, output_length))
    for n in range(batch):
        for g in range(group):
            for f in range(filter_count):
                for lo in range(output_length):
                    # reduction starts here: accumulate the convolution
                    for c in range(channel):
                        for k in range(kernel_length):
                            input_access = lo * strides + k * dilation
                            result[n, g, f, lo] += x[n, g, c, input_access] * filters[g, f, c, k]
    return result
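The output-size expression in the loop above can be sanity-checked on its own. A minimal sketch (the numbers match the grouped 1-D example discussed later in this thread: a length-123 input padded by 3 on each side, kernel 7, stride 2, dilation 4):

```python
def conv_out_size(in_size, kernel, stride, dilation):
    # Same formula as in the loop above: subtract the effective
    # (dilated) kernel extent, then divide by the stride.
    effective_kernel = (kernel - 1) * dilation + 1
    return (in_size - effective_kernel) // stride + 1

print(conv_out_size(123 + 3 + 3, kernel=7, stride=2, dilation=4))  # 53
```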

The linalg.generic emitted for Conv3D essentially implements this Python loop:

def my_conv_3d(x, filters, group, strides=(1, 1, 1), dilation=(1, 1, 1)):
    # Input layout: NGC plus three spatial dims (named height, width, depth below)
    batch = x.shape[0]
    assert x.shape[1] == group and filters.shape[0] == group
    channel = x.shape[2]
    input_height = x.shape[3]
    input_width = x.shape[4]
    input_depth = x.shape[5]

    # Filter layout: GFC plus three spatial dims
    filter_count = filters.shape[1]
    assert filters.shape[2] == channel
    kernel_height = filters.shape[3]
    kernel_width = filters.shape[4]
    kernel_depth = filters.shape[5]

    output_height = (input_height - kernel_height - (kernel_height - 1) * (dilation[0] - 1)) // strides[0] + 1
    output_width = (input_width - kernel_width - (kernel_width - 1) * (dilation[1] - 1)) // strides[1] + 1
    output_depth = (input_depth - kernel_depth - (kernel_depth - 1) * (dilation[2] - 1)) // strides[2] + 1

    result = np.zeros((batch, group, filter_count, output_height, output_width, output_depth))
    for n in range(batch):
        for g in range(group):  # parallel
            for f in range(filter_count):
                for ho in range(output_height):
                    for wo in range(output_width):
                        for do in range(output_depth):
                            # reduction starts here: accumulate the convolution
                            for c in range(channel):
                                for k in range(kernel_height):
                                    for j in range(kernel_width):
                                        for z in range(kernel_depth):
                                            input_height_access = ho * strides[0] + k * dilation[0]
                                            input_width_access = wo * strides[1] + j * dilation[1]
                                            input_depth_access = do * strides[2] + z * dilation[2]
                                            result[n, g, f, ho, wo, do] += (
                                                x[n, g, c, input_height_access, input_width_access, input_depth_access]
                                                * filters[g, f, c, k, j, z]
                                            )
    return result

Test Plan

Added both e2e and lit tests. The e2e test values are verified against PyTorch.
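As a sketch of that PyTorch verification (the shapes below are taken from the grouped 1-D example in this thread; the actual verification script is not shown here), golden values can be produced with torch.nn.functional.conv1d:

```python
import torch
import torch.nn.functional as F

# Grouped 1-D conv matching migraphx.convolution
# {stride = [2], padding = [3, 3], dilation = [4], group = 4}.
x = torch.randn(10, 8, 123)   # N x C x L
w = torch.randn(12, 2, 7)     # F x (C/G) x K
y = F.conv1d(x, w, stride=2, padding=3, dilation=4, groups=4)
print(y.shape)  # torch.Size([10, 12, 53])
```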

Test Result

Both the e2e and lit tests pass.

Submission Checklist

@Mr-Anyone Mr-Anyone requested a review from causten as a code owner February 17, 2026 21:03
@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-conv branch from 9704ab3 to 14d20ad Compare February 17, 2026 21:13
@Mr-Anyone Mr-Anyone marked this pull request as draft February 18, 2026 02:24
@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-6 branch from 3d4478e to 3d9c23f Compare February 18, 2026 15:07
@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-conv branch 6 times, most recently from fe2bbc8 to 0f84c0a Compare February 18, 2026 22:07
@Mr-Anyone Mr-Anyone marked this pull request as ready for review February 18, 2026 22:08
@Mr-Anyone Mr-Anyone changed the title from "[AIROCMLIR-447] WIP (needs testcase) Lower migraphx.conv into Linalg" to "[AIROCMLIR-447] Lower migraphx.conv into Linalg" Feb 18, 2026
@@ -0,0 +1,35 @@
// RUN: rocmlir-gen -fut conv -arch %arch --clone-harness %s | rocmlir-driver --host-pipeline=migraphx-linalg,highlevel | rocmlir-gen -ph -print-results -rand_type float -rand_min 0 -rand_max 0 -fut conv_wrapper --verifier clone - | xmir-runner --shared-libs=%linalg_test_lib_dir/libmlir_rocm_runtime%shlibext,%conv_validation_wrapper_library_dir/libconv-validation-wrappers%shlibext,%linalg_test_lib_dir/libmlir_runner_utils%shlibext,%linalg_test_lib_dir/libmlir_float16_utils%shlibext,%linalg_test_lib_dir/libmlir_c_runner_utils%shlibext,%linalg_test_lib_dir/libmlir_async_runtime%shlibext --entry-point-result=void | FileCheck %s
Contributor

If you actually want to set random values use a meaningful range. E.g., something like -rand_min -1 -rand_max 1

@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-conv branch 2 times, most recently from db757be to 18b0a33 Compare February 19, 2026 16:19
Contributor

@pabloantoniom pabloantoniom left a comment

IR looks pretty good!

}
};

// Step 2: expand group dimension (NCHW -> NGCHW, FCHW -> GFCHW).
Contributor

Why is this needed? The description is missing some motivation to explain why we need this.

Contributor

@pabloantoniom pabloantoniom Mar 3, 2026

We discussed this offline, can you please add the motivation here or in the description please? Or even better, with a comment @Mr-Anyone

@@ -0,0 +1,44 @@
// RUN: rocmlir-opt -split-input-file --migraphx-to-linalg -verify-diagnostics %s | FileCheck %s
Contributor

I think in this PR you support all the convolution flavors like dilation, padding, stride, and group. Is that correct? If so, I think we need to:

  1. Make it clear in the description
  2. Add proper testing for all that functionality

In particular, I think it would be beneficial to split this test into 4: 1D, 2D, 3D, and unsupported cases (e.g. 4D, or other cases we don't support). And in the 1D/2D/3D tests, we should add tests for dilation, padding (and non-padding), stride, and groups.

Member Author

I have generated more test cases; see migraphx-to-linalg-conv*

patterns.add<AsUnderlyingShapeConverter, AsLogicalShapeOpConverter>(
typeConverter, patterns.getContext());

// mhal.launch can be generated through rocmlir-gen, so we need a way to
Contributor

I don't understand, can you give an example to show why we need this? Please add it to the description to motivate why we need it

Member Author

During e2e testing, rocmlir-gen seems to generate a wrapper that contains mhal.launch.

Example (from test case below):

root@14298acabab2 ~/rocMLIR/build-release (pr-template-migraphx-to-linalg-conv)$ cat main.mlir
func.func @conv_1d_group(%in: !migraphx.shaped<10x8x123xf32, 984x123x1>, %fil: !migraphx.shaped<12x2x7xf32, 14x7x1>) -> !migraphx.shaped<10x12x53xf32, 636x53x1> {
  %out = migraphx.convolution %in, %fil {dilation = [4], group = 4 : i64, padding = [3,3], padding_mode = 0 : i64, stride = [2]} :
    <10x8x123xf32, 984x123x1>, <12x2x7xf32, 14x7x1> -> <10x12x53xf32,  636x53x1>
  func.return %out : !migraphx.shaped<10x12x53xf32,  636x53x1>
}
root@14298acabab2 ~/rocMLIR/build-release (pr-template-migraphx-to-linalg-conv)$ ./bin/rocmlir-gen main.mlir --clone-harness --arch gfx950 -fut conv_1d_group
module {
  func.func @conv_1d_group(%arg0: !migraphx.shaped<10x8x123xf32, 984x123x1> {mhal.read_access}, %arg1: !migraphx.shaped<12x2x7xf32, 14x7x1> {mhal.read_access}) -> (!migraphx.shaped<10x12x53xf32, 636x53x1> {mhal.write_access}) {
    %0 = migraphx.convolution %arg0, %arg1 {dilation = [4], group = 4 : i64, padding = [3, 3], padding_mode = 0 : i64, stride = [2]} : <10x8x123xf32, 984x123x1>, <12x2x7xf32, 14x7x1> -> <10x12x53xf32, 636x53x1>
    return %0 : !migraphx.shaped<10x12x53xf32, 636x53x1>
  }
  func.func @conv_1d_group_wrapper(%arg0: !migraphx.shaped<10x8x123xf32, 984x123x1>, %arg1: !migraphx.shaped<12x2x7xf32, 14x7x1>) -> !migraphx.shaped<10x12x53xf32, 636x53x1> {
    %token, %results = mhal.launch @conv_1d_group (%arg0, %arg1) : (!migraphx.shaped<10x8x123xf32, 984x123x1>, !migraphx.shaped<12x2x7xf32, 14x7x1>) -> !migraphx.shaped<10x12x53xf32, 636x53x1>
    mhal.await %token : !mhal.token
    return %results : !migraphx.shaped<10x12x53xf32, 636x53x1>
  }
  module @__xmodule_ attributes {mhal.arch = "gfx950", mhal.module} {
    func.func @conv_1d_group(%arg0: !migraphx.shaped<10x8x123xf32, 984x123x1> {mhal.read_access}, %arg1: !migraphx.shaped<12x2x7xf32, 14x7x1> {mhal.read_access}) -> (!migraphx.shaped<10x12x53xf32, 636x53x1> {mhal.write_access}) attributes {kernel, original_func = @conv_1d_group} {
      %0 = migraphx.convolution %arg0, %arg1 {dilation = [4], group = 4 : i64, padding = [3, 3], padding_mode = 0 : i64, stride = [2]} : <10x8x123xf32, 984x123x1>, <12x2x7xf32, 14x7x1> -> <10x12x53xf32, 636x53x1>
      return %0 : !migraphx.shaped<10x12x53xf32, 636x53x1>
    }
  }
}

Without the line above, passing the output through ./bin/rocmlir-driver seems to fail because the migraphx types in mhal.launch are never legalized.

Contributor

Can you post the rocmlir-driver command as well? I cannot reproduce your issue

Member Author

Applying this diff:

diff --git a/mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalg.cpp b/mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalg.cpp
index 742d2b29056d..61f746ae9414 100644
--- a/mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalg.cpp
+++ b/mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalg.cpp
@@ -799,7 +799,7 @@ void mlir::migraphx::populateMIGraphXFuncBoundaryToLinalgConversionPatterns(

   // mhal.launch can be generated through rocmlir-gen, so we need a way to
   // legalize it
-  populateMIGraphXToLinalgMHALLauncherConversion(patterns, typeConverter);
+  // populateMIGraphXToLinalgMHALLauncherConversion(patterns, typeConverter);
   populateAnyFunctionOpInterfaceTypeConversionPattern(patterns, typeConverter);
   populateReturnOpTypeConversionPattern(patterns, typeConverter);
   populateCallOpTypeConversionPattern(patterns, typeConverter);
diff --git a/mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalgPass.cpp b/mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalgPass.cpp
index 75f5588d2024..9fd00ff82ed6 100644
--- a/mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalgPass.cpp
+++ b/mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalgPass.cpp
@@ -52,11 +52,11 @@ void mlir::migraphx::populateMIGraphXToLinalgBoundaryDialectConversion(
   target.addDynamicallyLegalOp<func::FuncOp>([&](func::FuncOp op) {
     return typeConverter.isSignatureLegal(op.getFunctionType());
   });
-  target.addDynamicallyLegalOp<mhal::LaunchOp>(
-      [=](mhal::LaunchOp op) -> std::optional<bool> {
-        return typeConverter.isLegal(op.getResultTypes()) &&
-               typeConverter.isLegal(op.getOperandTypes());
-      });
+  // target.addDynamicallyLegalOp<mhal::LaunchOp>(
+  //     [=](mhal::LaunchOp op) -> std::optional<bool> {
+  //       return typeConverter.isLegal(op.getResultTypes()) &&
+  //              typeConverter.isLegal(op.getOperandTypes());
+  //     });
   target.addDynamicallyLegalOp<func::ReturnOp>(
       [&](func::ReturnOp op) { return typeConverter.isLegal(op); });
   target.addDynamicallyLegalOp<func::CallOp>(

Running:

./bin/rocmlir-gen -fut conv_3d -arch gfx950 --clone-harness <filename> | ./bin/rocmlir-driver --host-pipeline=migraphx-linalg,highlevel --kernel-pipeline=migraphx-linalg,highlevel

On:

func.func @conv_3d(%arg0: !migraphx.shaped<2x4x2x2x2xf32, 32x8x4x2x1>, %arg1: !migraphx.shaped<2x3x5x5x5xf32, 375x125x25x5x1>, %arg2: !migraphx.shaped<4x3x2x2x2xf32, 24x8x4x2x1>) -> !migraphx.shaped<2x4x2x2x2xf32, 32x8x4x2x1>  {
  %0 = migraphx.convolution %arg1, %arg2 {dilation = [2, 2, 2], group = 1 : i64, padding = [0, 0, 0, 0, 0, 0], padding_mode = 0 : i64, stride = [2, 2, 2]} : <2x3x5x5x5xf32, 375x125x25x5x1>, <4x3x2x2x2xf32, 24x8x4x2x1> -> <2x4x2x2x2xf32, 32x8x4x2x1>
  %1 = migraphx.add %0, %arg0 : <2x4x2x2x2xf32, 32x8x4x2x1>, <2x4x2x2x2xf32, 32x8x4x2x1> -> <2x4x2x2x2xf32, 32x8x4x2x1>
  return %1 : !migraphx.shaped<2x4x2x2x2xf32, 32x8x4x2x1>
}

I get this on my end:

loc("<stdin>":7:138): error: failed to legalize unresolved materialization from ('tensor<96xf32>') to ('!migraphx.shaped<4x3x2x2x2xf32, 24x8x4x2x1>') that remained live after conversion
Lowering failed.

Contributor

@pabloantoniom pabloantoniom Mar 3, 2026

Thanks for the report. I have dug into this issue, and I think this is a design flaw of clone-harness. So I'm happy to keep this hack that you have, because MIGraphXToTosa also does it. I have created https://amd-hub.atlassian.net/browse/AIROCMLIR-555 to fix this design problem.

// converter.
Location loc = op.getLoc();
int64_t group = op.getGroupAttr().getInt();
int64_t dim = cast<RankedTensorType>(input.getType()).getRank() -
Contributor

Don't we have a migraphx::ConvolutionOp method to get the rank?

@@ -0,0 +1,44 @@
// RUN: rocmlir-opt -split-input-file --migraphx-to-linalg -verify-diagnostics %s | FileCheck %s
Contributor

Also, does this PR support input and output with different types? For example f16 inputs with f32 outputs. If so, we need to test for that. If not, probably add it as a negative test (to the 4th test case if you follow my suggestion above)

SmallVector<int64_t, 4> newShape;
int64_t n = resultType.getDimSize(0);
int64_t newF = resultType.getDimSize(1) / group;
assert(resultType.getDimSize(1) % group == 0 &&
Contributor

Let's generate a proper failure error (LogicalResult) instead of using asserts

Member Author

@Mr-Anyone Mr-Anyone Feb 23, 2026

My original intention was to move this to the verifier, and make this an invariant of the operation.

Contributor

Copilot AI left a comment

Pull request overview

This PR implements lowering of the migraphx.convolution operation to the Linalg dialect, enabling an alternative compilation path for MIGraphX convolutions. The implementation supports 1D, 2D, and 3D convolutions with configurable padding, stride, dilation, and grouping parameters.

Changes:

  • Added ConvConverter to lower migraphx.convolution to Linalg operations (generic ops for 1D/3D, named op for 2D)
  • Introduced new "migraphx-linalg" pipeline option in the rocmlir-driver
  • Added comprehensive lit tests and end-to-end tests with PyTorch validation

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 10 comments.

Show a summary per file

  • mlir/tools/rocmlir-driver/rocmlir-driver.cpp: Added "migraphx-linalg" pipeline option and updated help text
  • mlir/test/Conversion/MIGraphXToLinalg/migraphx-to-linalg-not-implemented.mlir: Removed convolution test since it's now implemented
  • mlir/test/Conversion/MIGraphXToLinalg/migraphx-to-linalg-conv.mlir: Added lit tests for 1D, 2D, and 3D convolution lowering
  • mlir/test/Conversion/MIGraphXToLinalg/e2e/migraphx-to-linalg-conv-cpu.e2e.mlir: Added end-to-end test with PyTorch-validated golden values
  • mlir/lib/Conversion/MIGraphXToTosa/MIGraphXToTosa.cpp: Added populateMIGraphXToLinalgMHALLauncherConversion function
  • mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalgPass.cpp: Added mhal.launch legalization support
  • mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalg.cpp: Implemented ConvConverter with padding, group expansion, and linalg lowering
  • mlir/include/mlir/Conversion/MIGraphXToTosa/MIGraphXToTosa.h: Added function declaration for MHALLauncher conversion


@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-6 branch from 3d9c23f to 2246225 Compare February 20, 2026 18:44
@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-conv branch from 18b0a33 to cf79235 Compare February 20, 2026 18:45
@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-6 branch from 2246225 to 56be60a Compare February 20, 2026 19:01
@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-conv branch from cf79235 to 648cc4f Compare February 20, 2026 19:01
@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-6 branch 2 times, most recently from 3de417c to 7ef3371 Compare February 20, 2026 19:59

def LinalgConvType
: Rock_I32Enum<"LinalgConvType",
"The layout of a grouped convolution operation",
Member

I wouldn't use the word "layout": NGCH doesn't mean the arguments are laid out in that order in memory; here it is just used as a hint. "Layout" has a special meaning about memory order.

It is better to say "Hint for convolution type" or something similar.

Comment on lines +194 to +195
AffineExpr batch = d[0], group = d[1], filterExpr = d[2];
AffineExpr channel = d[3 + dim];
Member

nit:
I find it better to use the names inChannels and outChannels; I haven't seen people use filterExpr.

// Iteration domain layout:
// parallel: batch, group, filter, oh_0 .. oh_{dim-1}
// reduction: channel, kh_0 .. kh_{dim-1}
int64_t totalDims = 4 + 2 * dim;
Member

Instead of using magic numbers, derive the 4 from the output shape or resultType.

Comment on lines +199 to +200
inputExprs.push_back(d[3 + i] * strideVals[i] +
d[4 + dim + i] * dilationVals[i]);
Member

nit:
I find it better if you define d[3 + i] as a named variable first, e.g. ih_d or something similar. That is much more readable than working out what d[3 + i] means.

Comment on lines +215 to +216
SmallVector<utils::IteratorType> iteratorTypes(3 + dim,
utils::IteratorType::parallel);
Member

nit:
Avoid magic numbers; derive the value 3 from the input/output shapes if possible.

// RUN: rocmlir-gen -fut conv_1d_group -arch %arch --clone-harness %s | rocmlir-driver --host-pipeline=migraphx-linalg,highlevel | rocmlir-gen -ph -print-results -rand 1 -rand_type=float -fut conv_1d_group_wrapper --verifier clone - | xmir-runner --shared-libs=%linalg_test_lib_dir/libmlir_rocm_runtime%shlibext,%conv_validation_wrapper_library_dir/libconv-validation-wrappers%shlibext,%linalg_test_lib_dir/libmlir_runner_utils%shlibext,%linalg_test_lib_dir/libmlir_float16_utils%shlibext,%linalg_test_lib_dir/libmlir_c_runner_utils%shlibext,%linalg_test_lib_dir/libmlir_async_runtime%shlibext --entry-point-result=void | FileCheck %s --check-prefix=BOTH
// RUN: rocmlir-gen -fut conv_1d_group -arch %arch --clone-harness %s | rocmlir-driver --host-pipeline=migraphx,highlevel --kernel-pipeline=migraphx,highlevel | rocmlir-gen -ph -print-results -rand 1 -rand_type=float -fut conv_1d_group_wrapper --verifier clone - | rocmlir-driver -host-pipeline mhal -kernel-pipeline full | xmir-runner --shared-libs=%linalg_test_lib_dir/libmlir_rocm_runtime%shlibext,%conv_validation_wrapper_library_dir/libconv-validation-wrappers%shlibext,%linalg_test_lib_dir/libmlir_runner_utils%shlibext,%linalg_test_lib_dir/libmlir_float16_utils%shlibext,%linalg_test_lib_dir/libmlir_c_runner_utils%shlibext,%linalg_test_lib_dir/libmlir_async_runtime%shlibext --entry-point-result=void | FileCheck %s --check-prefix=BOTH

// Only a small subset of the array is checked because it is quite huge
Member

Can we keep only one of the tests in pr-e2e and move the others into e2e? We run pr-e2e on all PRs, and since this runs on the CPU it can increase PR CI time. e2e runs during the nightly.

// RUN: rocmlir-gen -fut conv_2d_group -arch %arch --clone-harness %s | rocmlir-driver --host-pipeline=migraphx,highlevel --kernel-pipeline=migraphx,highlevel | rocmlir-gen -ph -print-results -rand 1 -rand_type=float -fut conv_2d_group_wrapper --verifier clone - | rocmlir-driver -host-pipeline mhal -kernel-pipeline full | xmir-runner --shared-libs=%linalg_test_lib_dir/libmlir_rocm_runtime%shlibext,%conv_validation_wrapper_library_dir/libconv-validation-wrappers%shlibext,%linalg_test_lib_dir/libmlir_runner_utils%shlibext,%linalg_test_lib_dir/libmlir_float16_utils%shlibext,%linalg_test_lib_dir/libmlir_c_runner_utils%shlibext,%linalg_test_lib_dir/libmlir_async_runtime%shlibext --entry-point-result=void | FileCheck %s --check-prefix=BOTH

// Here we are checking to see if conv_2d with non standard stride, dilation, and a group parameter matches the existing tosa pipeline
// Note - this array is quite large, so we are only checking a small subset
Member

Only keep one of the tests in pr-e2e.

cl::value_desc("comma separated list of rock pipelines: "
"migraphx,highlevel,mhal,runner or full"),
cl::init(""));
static cl::opt<std::string> hostPipeline(
Member

Are there runner-pipelines.mlir tests that you need to modify?

If not, can you create such a test?

Member Author

Sure, I have added a file in runner-pipelines.mlir.

// CHECK-DAG: %[[expanded_1:.*]] = tensor.expand_shape %[[expanded_0]]
// CHECK-DAG: %[[expanded_2:.*]] = tensor.expand_shape %[[expanded]]
// CHECK-DAG: %[[cst:.*]] = arith.constant
// CHECK-DAG: %[[conv:.*]] = linalg.generic {{.*}} ins(%[[expanded_1]], %[[expanded_2]] : tensor{{.*}}) outs(%[[cst]] : tensor{{.*}})
Member

nit:
Add checks for the affine maps, because those are essential for correctness. Adding them to just one of the tests is enough, not necessarily all.

// CHECK-DAG: %[[collapsed_0:.*]] = tensor.collapse_shape %[[collapsed]]
// CHECK-DAG: return %[[collapsed_0]]
func.func @conv_3d_groups(%in: !migraphx.shaped<1x6x10x10x10xf32, 6000x1000x100x10x1>, %fil: !migraphx.shaped<9x2x3x3x3xf32, 54x27x9x3x1>) -> !migraphx.shaped<1x9x8x8x8xf32, 4608x512x64x8x1> {
%out = migraphx.convolution %in, %fil {dilation = [1, 1, 1], group = 3 : i64, padding = [0, 0, 0, 0, 0, 0], padding_mode = 0 : i64, stride = [1, 1, 1]} :
Member

Try values other than [1, 1, 1] or [0, 0, 0] for strides, dilation, and padding, and check the affine maps.

Member Author

I'm not sure a dilation of 0 is actually supported in ONNX:

import torch
import torch.nn as nn
conv = nn.Conv3d(in_channels=3, out_channels=6, kernel_size=3, stride=1, dilation=0, padding=0)
x = torch.randn(1, 3, 10, 10, 10)
y = conv(x)

seems to give me this error:

RuntimeError: dilation should be greater than zero, but got [0, 0, 0]

Maybe we should move this check to the verifier?


// -----

// Input: NCDHW = 1x6x10x10x10, Filter: F(C/G)DHW = 9x2x3x3x3 (group=3, C_per_group=2, F_per_group=3)
Member

@umangyadav umangyadav Mar 11, 2026

We use k as the symbol for output channels and c for input channels.

Can we use that convention throughout?

@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-conv branch 2 times, most recently from 65d2df2 to 2942a50 Compare March 13, 2026 20:33
@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-conv branch from 2942a50 to 8cb54a3 Compare March 16, 2026 20:06