[AIROCMLIR-447] Lower migraphx.conv into Linalg#2241
@@ -0,0 +1,35 @@
// RUN: rocmlir-gen -fut conv -arch %arch --clone-harness %s | rocmlir-driver --host-pipeline=migraphx-linalg,highlevel | rocmlir-gen -ph -print-results -rand_type float -rand_min 0 -rand_max 0 -fut conv_wrapper --verifier clone - | xmir-runner --shared-libs=%linalg_test_lib_dir/libmlir_rocm_runtime%shlibext,%conv_validation_wrapper_library_dir/libconv-validation-wrappers%shlibext,%linalg_test_lib_dir/libmlir_runner_utils%shlibext,%linalg_test_lib_dir/libmlir_float16_utils%shlibext,%linalg_test_lib_dir/libmlir_c_runner_utils%shlibext,%linalg_test_lib_dir/libmlir_async_runtime%shlibext --entry-point-result=void | FileCheck %s
If you actually want to set random values, use a meaningful range, e.g. something like -rand_min -1 -rand_max 1.
pabloantoniom left a comment:
IR looks pretty good!
}
};

// Step 2: expand group dimension (NCHW -> NGCHW, FCHW -> GFCHW).
Why is this needed? The description is missing some motivation to explain why we need this.
We discussed this offline; can you please add the motivation here or in the description? Or even better, with a comment, @Mr-Anyone
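To sketch the motivation being asked about here (my own illustration with assumed names, not the PR's code): expanding NCHW to NGCHW splits the channel dimension C into (G, C/G), so each group's filters index only into their own channel slice and the group index becomes an ordinary parallel loop dimension. For 1D data (NCW -> NGCW):

```python
def expand_groups(x, groups):
    """Split the channel dim C of an NCW nested-list tensor into (G, C // G),
    producing NGCW; structurally this mirrors what a tensor.expand_shape does."""
    n = len(x)
    c = len(x[0])
    assert c % groups == 0, "channel count must be divisible by group count"
    cg = c // groups
    return [
        [[x[b][g * cg + ci] for ci in range(cg)] for g in range(groups)]
        for b in range(n)
    ]

# N=1, C=4, W=1, groups=2: channels {0,1} land in group 0, {2,3} in group 1.
x = [[[1], [2], [3], [4]]]
print(expand_groups(x, 2))  # [[[[1], [2]], [[3], [4]]]]
```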
@@ -0,0 +1,44 @@
// RUN: rocmlir-opt -split-input-file --migraphx-to-linalg -verify-diagnostics %s | FileCheck %s
I think in this PR you support all the convolution flavors, like dilation, padding, stride, and group. Is that correct? If so, I think we need to:
- Make it clear in the description
- Add proper testing for all that functionality

In particular, I think it would be beneficial to split this test into four: 1D, 2D, 3D, and unsupported cases (e.g., 4D, or other cases we don't support). And in the 1D/2D/3D tests, we should add tests for dilation, padding (and non-padding), stride, and groups.
I have generated more test cases; see migraphx-to-linalg-conv*.
patterns.add<AsUnderlyingShapeConverter, AsLogicalShapeOpConverter>(
    typeConverter, patterns.getContext());

// mhal.launch can be generated through rocmlir-gen, so we need a way to
I don't understand; can you give an example to show why we need this? Please add it to the description to motivate why we need it.
During e2e testing, rocmlir-gen seems to generate a wrapper that contains mhal.launch.
Example (from test case below):
root@14298acabab2 ~/rocMLIR/build-release (pr-template-migraphx-to-linalg-conv)$ cat main.mlir
func.func @conv_1d_group(%in: !migraphx.shaped<10x8x123xf32, 984x123x1>, %fil: !migraphx.shaped<12x2x7xf32, 14x7x1>) -> !migraphx.shaped<10x12x53xf32, 636x53x1> {
%out = migraphx.convolution %in, %fil {dilation = [4], group = 4 : i64, padding = [3,3], padding_mode = 0 : i64, stride = [2]} :
<10x8x123xf32, 984x123x1>, <12x2x7xf32, 14x7x1> -> <10x12x53xf32, 636x53x1>
func.return %out : !migraphx.shaped<10x12x53xf32, 636x53x1>
}
root@14298acabab2 ~/rocMLIR/build-release (pr-template-migraphx-to-linalg-conv)$ ./bin/rocmlir-gen main.mlir --clone-harness --arch gfx950 -fut conv_1d_group
module {
func.func @conv_1d_group(%arg0: !migraphx.shaped<10x8x123xf32, 984x123x1> {mhal.read_access}, %arg1: !migraphx.shaped<12x2x7xf32, 14x7x1> {mhal.read_access}) -> (!migraphx.shaped<10x12x53xf32, 636x53x1> {mhal.write_access}) {
%0 = migraphx.convolution %arg0, %arg1 {dilation = [4], group = 4 : i64, padding = [3, 3], padding_mode = 0 : i64, stride = [2]} : <10x8x123xf32, 984x123x1>, <12x2x7xf32, 14x7x1> -> <10x12x53xf32, 636x53x1>
return %0 : !migraphx.shaped<10x12x53xf32, 636x53x1>
}
func.func @conv_1d_group_wrapper(%arg0: !migraphx.shaped<10x8x123xf32, 984x123x1>, %arg1: !migraphx.shaped<12x2x7xf32, 14x7x1>) -> !migraphx.shaped<10x12x53xf32, 636x53x1> {
%token, %results = mhal.launch @conv_1d_group (%arg0, %arg1) : (!migraphx.shaped<10x8x123xf32, 984x123x1>, !migraphx.shaped<12x2x7xf32, 14x7x1>) -> !migraphx.shaped<10x12x53xf32, 636x53x1>
mhal.await %token : !mhal.token
return %results : !migraphx.shaped<10x12x53xf32, 636x53x1>
}
module @__xmodule_ attributes {mhal.arch = "gfx950", mhal.module} {
func.func @conv_1d_group(%arg0: !migraphx.shaped<10x8x123xf32, 984x123x1> {mhal.read_access}, %arg1: !migraphx.shaped<12x2x7xf32, 14x7x1> {mhal.read_access}) -> (!migraphx.shaped<10x12x53xf32, 636x53x1> {mhal.write_access}) attributes {kernel, original_func = @conv_1d_group} {
%0 = migraphx.convolution %arg0, %arg1 {dilation = [4], group = 4 : i64, padding = [3, 3], padding_mode = 0 : i64, stride = [2]} : <10x8x123xf32, 984x123x1>, <12x2x7xf32, 14x7x1> -> <10x12x53xf32, 636x53x1>
return %0 : !migraphx.shaped<10x12x53xf32, 636x53x1>
}
}
}
If we don't have this line above, passing through ./bin/rocmlir-driver seems to result in a failure due to the type not being legalized yet.
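As a side note (not part of the PR, just a sanity check of the shapes in the snippet above): the 10x12x53 result type follows from the standard convolution output-size formula:

```python
def conv_out_size(in_size, pad_begin, pad_end, dilation, kernel, stride):
    """Standard convolution output-size formula (floor division)."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (in_size + pad_begin + pad_end - effective_kernel) // stride + 1

# The conv_1d_group example above: W=123, padding=[3,3], dilation=4, K=7, stride=2.
out_w = conv_out_size(123, 3, 3, 4, 7, 2)
print(out_w)  # 53, matching the <10x12x53xf32, ...> output type
```

Batch (10) and output channels (12) pass through unchanged, giving 10x12x53.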
Can you post the rocmlir-driver command as well? I cannot reproduce your issue
Applying this diff:
diff --git a/mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalg.cpp b/mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalg.cpp
index 742d2b29056d..61f746ae9414 100644
--- a/mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalg.cpp
+++ b/mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalg.cpp
@@ -799,7 +799,7 @@ void mlir::migraphx::populateMIGraphXFuncBoundaryToLinalgConversionPatterns(
// mhal.launch can be generated through rocmlir-gen, so we need a way to
// legalize it
- populateMIGraphXToLinalgMHALLauncherConversion(patterns, typeConverter);
+ // populateMIGraphXToLinalgMHALLauncherConversion(patterns, typeConverter);
populateAnyFunctionOpInterfaceTypeConversionPattern(patterns, typeConverter);
populateReturnOpTypeConversionPattern(patterns, typeConverter);
populateCallOpTypeConversionPattern(patterns, typeConverter);
diff --git a/mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalgPass.cpp b/mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalgPass.cpp
index 75f5588d2024..9fd00ff82ed6 100644
--- a/mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalgPass.cpp
+++ b/mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalgPass.cpp
@@ -52,11 +52,11 @@ void mlir::migraphx::populateMIGraphXToLinalgBoundaryDialectConversion(
target.addDynamicallyLegalOp<func::FuncOp>([&](func::FuncOp op) {
return typeConverter.isSignatureLegal(op.getFunctionType());
});
- target.addDynamicallyLegalOp<mhal::LaunchOp>(
- [=](mhal::LaunchOp op) -> std::optional<bool> {
- return typeConverter.isLegal(op.getResultTypes()) &&
- typeConverter.isLegal(op.getOperandTypes());
- });
+ // target.addDynamicallyLegalOp<mhal::LaunchOp>(
+ // [=](mhal::LaunchOp op) -> std::optional<bool> {
+ // return typeConverter.isLegal(op.getResultTypes()) &&
+ // typeConverter.isLegal(op.getOperandTypes());
+ // });
target.addDynamicallyLegalOp<func::ReturnOp>(
[&](func::ReturnOp op) { return typeConverter.isLegal(op); });
target.addDynamicallyLegalOp<func::CallOp>(
Running:
./bin/rocmlir-gen -fut conv_3d -arch gfx950 --clone-harness <filename> | ./bin/rocmlir-driver --host-pipeline=migraphx-linalg,highlevel --kernel-pipeline=migraphx-linalg,highlevel
On:
func.func @conv_3d(%arg0: !migraphx.shaped<2x4x2x2x2xf32, 32x8x4x2x1>, %arg1: !migraphx.shaped<2x3x5x5x5xf32, 375x125x25x5x1>, %arg2: !migraphx.shaped<4x3x2x2x2xf32, 24x8x4x2x1>) -> !migraphx.shaped<2x4x2x2x2xf32, 32x8x4x2x1> {
%0 = migraphx.convolution %arg1, %arg2 {dilation = [2, 2, 2], group = 1 : i64, padding = [0, 0, 0, 0, 0, 0], padding_mode = 0 : i64, stride = [2, 2, 2]} : <2x3x5x5x5xf32, 375x125x25x5x1>, <4x3x2x2x2xf32, 24x8x4x2x1> -> <2x4x2x2x2xf32, 32x8x4x2x1>
%1 = migraphx.add %0, %arg0 : <2x4x2x2x2xf32, 32x8x4x2x1>, <2x4x2x2x2xf32, 32x8x4x2x1> -> <2x4x2x2x2xf32, 32x8x4x2x1>
return %1 : !migraphx.shaped<2x4x2x2x2xf32, 32x8x4x2x1>
}
I got this on my end:
loc("<stdin>":7:138): error: failed to legalize unresolved materialization from ('tensor<96xf32>') to ('!migraphx.shaped<4x3x2x2x2xf32, 24x8x4x2x1>') that remained live after conversion
Lowering failed.
Thanks for the report. I have dug into this issue, and I think this is a design flaw of clone-harness. So I'm happy to keep this hack that you have, because MIGraphXToTosa also does it. I have created https://amd-hub.atlassian.net/browse/AIROCMLIR-555 to fix this design problem.
// converter.
Location loc = op.getLoc();
int64_t group = op.getGroupAttr().getInt();
int64_t dim = cast<RankedTensorType>(input.getType()).getRank() -
Don't we have a migraphx::ConvolutionOp method to get the rank?
@@ -0,0 +1,44 @@
// RUN: rocmlir-opt -split-input-file --migraphx-to-linalg -verify-diagnostics %s | FileCheck %s
Also, does this PR support input and output with different types? For example, f16 inputs with f32 outputs. If so, we need to test for that. If not, probably add it as a negative test (to the 4th test case, if you follow my suggestion above).
SmallVector<int64_t, 4> newShape;
int64_t n = resultType.getDimSize(0);
int64_t newF = resultType.getDimSize(1) / group;
assert(resultType.getDimSize(1) % group == 0 &&
Let's generate a proper failure (LogicalResult) instead of using asserts.
My original intention was to move this to the verifier, and make this an invariant of the operation.
Pull request overview
This PR implements lowering of the migraphx.convolution operation to the Linalg dialect, enabling an alternative compilation path for MIGraphX convolutions. The implementation supports 1D, 2D, and 3D convolutions with configurable padding, stride, dilation, and grouping parameters.
Changes:
- Added ConvConverter to lower migraphx.convolution to Linalg operations (generic ops for 1D/3D, a named op for 2D)
- Introduced a new "migraphx-linalg" pipeline option in the rocmlir-driver
- Added comprehensive lit tests and end-to-end tests with PyTorch validation
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| mlir/tools/rocmlir-driver/rocmlir-driver.cpp | Added "migraphx-linalg" pipeline option and updated help text |
| mlir/test/Conversion/MIGraphXToLinalg/migraphx-to-linalg-not-implemented.mlir | Removed convolution test since it's now implemented |
| mlir/test/Conversion/MIGraphXToLinalg/migraphx-to-linalg-conv.mlir | Added lit tests for 1D, 2D, and 3D convolution lowering |
| mlir/test/Conversion/MIGraphXToLinalg/e2e/migraphx-to-linalg-conv-cpu.e2e.mlir | Added end-to-end test with PyTorch-validated golden values |
| mlir/lib/Conversion/MIGraphXToTosa/MIGraphXToTosa.cpp | Added populateMIGraphXToLinalgMHALLauncherConversion function |
| mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalgPass.cpp | Added mhal.launch legalization support |
| mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalg.cpp | Implemented ConvConverter with padding, group expansion, and linalg lowering |
| mlir/include/mlir/Conversion/MIGraphXToTosa/MIGraphXToTosa.h | Added function declaration for MHALLauncher conversion |
def LinalgConvType
    : Rock_I32Enum<"LinalgConvType",
                   "The layout of a grouped convolution operation",
I wouldn't use the word "layout". NGCHW doesn't mean the arguments are laid out in that order in memory; here it is just used as a hint, and "layout" has a special meaning about memory order.
It is better to say "hint for the convolution type" or something else.
AffineExpr batch = d[0], group = d[1], filterExpr = d[2];
AffineExpr channel = d[3 + dim];
nit:
I find it better to use the names inChannels and outChannels. I haven't seen people using filterExpr.
// Iteration domain layout:
//   parallel: batch, group, filter, oh_0 .. oh_{dim-1}
//   reduction: channel, kh_0 .. kh_{dim-1}
int64_t totalDims = 4 + 2 * dim;
Instead of using magic numbers, derive the number 4 from the output shape or resultType.
inputExprs.push_back(d[3 + i] * strideVals[i] +
                     d[4 + dim + i] * dilationVals[i]);
nit:
I find it better if you first define d[3 + i] as a named variable, i.e. ih_d or something similar. That is much better for readability than trying to figure out what d[3 + i] means.
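To illustrate the readability point (hypothetical names, not the PR's code): for each spatial dimension i, d[3 + i] is the output position and d[4 + dim + i] is the kernel position, so the access expression above computes the input coordinate like this:

```python
def input_position(oh, kh, stride, dilation):
    """Input coordinate read for output position oh and kernel tap kh;
    this is the d[3 + i] * stride + d[4 + dim + i] * dilation expression
    with the two indices given descriptive names."""
    return oh * stride + kh * dilation

# With stride=2 and dilation=4 (as in the conv_1d_group example earlier),
# output position 1 reads input taps 2, 6, 10, ...
taps = [input_position(1, kh, 2, 4) for kh in range(3)]
print(taps)  # [2, 6, 10]
```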
SmallVector<utils::IteratorType> iteratorTypes(3 + dim,
                                               utils::IteratorType::parallel);
nit:
Avoid using magic numbers; derive the value 3 from the input/output shapes if possible.
// RUN: rocmlir-gen -fut conv_1d_group -arch %arch --clone-harness %s | rocmlir-driver --host-pipeline=migraphx-linalg,highlevel | rocmlir-gen -ph -print-results -rand 1 -rand_type=float -fut conv_1d_group_wrapper --verifier clone - | xmir-runner --shared-libs=%linalg_test_lib_dir/libmlir_rocm_runtime%shlibext,%conv_validation_wrapper_library_dir/libconv-validation-wrappers%shlibext,%linalg_test_lib_dir/libmlir_runner_utils%shlibext,%linalg_test_lib_dir/libmlir_float16_utils%shlibext,%linalg_test_lib_dir/libmlir_c_runner_utils%shlibext,%linalg_test_lib_dir/libmlir_async_runtime%shlibext --entry-point-result=void | FileCheck %s --check-prefix=BOTH
// RUN: rocmlir-gen -fut conv_1d_group -arch %arch --clone-harness %s | rocmlir-driver --host-pipeline=migraphx,highlevel --kernel-pipeline=migraphx,highlevel | rocmlir-gen -ph -print-results -rand 1 -rand_type=float -fut conv_1d_group_wrapper --verifier clone - | rocmlir-driver -host-pipeline mhal -kernel-pipeline full | xmir-runner --shared-libs=%linalg_test_lib_dir/libmlir_rocm_runtime%shlibext,%conv_validation_wrapper_library_dir/libconv-validation-wrappers%shlibext,%linalg_test_lib_dir/libmlir_runner_utils%shlibext,%linalg_test_lib_dir/libmlir_float16_utils%shlibext,%linalg_test_lib_dir/libmlir_c_runner_utils%shlibext,%linalg_test_lib_dir/libmlir_async_runtime%shlibext --entry-point-result=void | FileCheck %s --check-prefix=BOTH

// Only a small subset of the array is checked because it is quite huge
Can we keep only one of these tests in pr-e2e and move the others into e2e? We run pr-e2e on all PRs, and given this runs on CPU, it can increase PR CI time.
e2e will run during the nightly.
// RUN: rocmlir-gen -fut conv_2d_group -arch %arch --clone-harness %s | rocmlir-driver --host-pipeline=migraphx,highlevel --kernel-pipeline=migraphx,highlevel | rocmlir-gen -ph -print-results -rand 1 -rand_type=float -fut conv_2d_group_wrapper --verifier clone - | rocmlir-driver -host-pipeline mhal -kernel-pipeline full | xmir-runner --shared-libs=%linalg_test_lib_dir/libmlir_rocm_runtime%shlibext,%conv_validation_wrapper_library_dir/libconv-validation-wrappers%shlibext,%linalg_test_lib_dir/libmlir_runner_utils%shlibext,%linalg_test_lib_dir/libmlir_float16_utils%shlibext,%linalg_test_lib_dir/libmlir_c_runner_utils%shlibext,%linalg_test_lib_dir/libmlir_async_runtime%shlibext --entry-point-result=void | FileCheck %s --check-prefix=BOTH

// Here we are checking to see if conv_2d with non-standard stride, dilation, and a group parameter matches the existing tosa pipeline
// Note - this array is quite large, so we are only checking a small subset
Only keep one of the tests in pr-e2e.
cl::value_desc("comma separated list of rock pipelines: "
               "migraphx,highlevel,mhal,runner or full"),
cl::init(""));
static cl::opt<std::string> hostPipeline(
Is there a runner-pipelines.mlir test that you need to modify?
If not, can you create such a test?
Sure, I have added a test in runner-pipelines.mlir.
// CHECK-DAG: %[[expanded_1:.*]] = tensor.expand_shape %[[expanded_0]]
// CHECK-DAG: %[[expanded_2:.*]] = tensor.expand_shape %[[expanded]]
// CHECK-DAG: %[[cst:.*]] = arith.constant
// CHECK-DAG: %[[conv:.*]] = linalg.generic {{.*}} ins(%[[expanded_1]], %[[expanded_2]] : tensor{{.*}}) outs(%[[cst]] : tensor{{.*}})
nit:
Add checks for the affine maps, because that is essential for correctness. You can just add that for one of the tests, not necessarily all.
// CHECK-DAG: %[[collapsed_0:.*]] = tensor.collapse_shape %[[collapsed]]
// CHECK-DAG: return %[[collapsed_0]]
func.func @conv_3d_groups(%in: !migraphx.shaped<1x6x10x10x10xf32, 6000x1000x100x10x1>, %fil: !migraphx.shaped<9x2x3x3x3xf32, 54x27x9x3x1>) -> !migraphx.shaped<1x9x8x8x8xf32, 4608x512x64x8x1> {
  %out = migraphx.convolution %in, %fil {dilation = [1, 1, 1], group = 3 : i64, padding = [0, 0, 0, 0, 0, 0], padding_mode = 0 : i64, stride = [1, 1, 1]} :
Try values other than 1, 1, 1 or 0, 0, 0 for strides, dilation, and padding, and check the affine maps.
I'm not sure a dilation of 0 is actually supported in ONNX:
import torch
import torch.nn as nn
conv = nn.Conv3d(in_channels=3, out_channels=6, kernel_size=3, stride=1, dilation=0, padding=0)
x = torch.randn(1, 3, 10, 10, 10)
y = conv(x)
seems to give me this error:
RuntimeError: dilation should be greater than zero, but got [0, 0, 0]
maybe we should move this to the verifier?
// -----

// Input: NCDHW = 1x6x10x10x10, Filter: F(C/G)DHW = 9x2x3x3x3 (group=3, C_per_group=2, F_per_group=3)
We use the k symbol for the output channels and c for the input channels.
Can we use that convention throughout?
Motivation
Being able to lower migraphx.conv into linalg.

Technical Details
Forward convolution is lowered in three steps:
The linalg.generic for Conv1D essentially implements this Python loop
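For reference, a grouped, strided, dilated 1D convolution of this kind can be sketched in plain Python (my own hedged sketch with assumed names, not the PR's exact loop; padding is folded into the index check here rather than done as a separate pad step):

```python
def conv1d_grouped(x, w, stride=1, dilation=1, padding=0, groups=1):
    """Reference 1D grouped convolution over nested lists.
    x: [N][C][W] input, w: [F][C // groups][K] filters -> [N][F][OW]."""
    n_sz, c_sz, w_sz = len(x), len(x[0]), len(x[0][0])
    f_sz, cg, k_sz = len(w), len(w[0]), len(w[0][0])
    fg = f_sz // groups                          # filters per group
    ow_sz = (w_sz + 2 * padding - dilation * (k_sz - 1) - 1) // stride + 1
    out = [[[0.0] * ow_sz for _ in range(f_sz)] for _ in range(n_sz)]
    for n in range(n_sz):                        # parallel: batch
        for g in range(groups):                  # parallel: group
            for f in range(fg):                  # parallel: filter within group
                for ow in range(ow_sz):          # parallel: output position
                    acc = 0.0
                    for c in range(cg):          # reduction: channel within group
                        for k in range(k_sz):    # reduction: kernel position
                            iw = ow * stride + k * dilation - padding
                            if 0 <= iw < w_sz:   # zero padding outside bounds
                                acc += x[n][g * cg + c][iw] * w[g * fg + f][c][k]
                    out[n][g * fg + f][ow] = acc
    return out

# groups=2: each filter sees only its own group's channel.
x = [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]]        # N=1, C=2, W=3
w = [[[1.0, 1.0]], [[1.0, -1.0]]]               # F=2, C/G=1, K=2
print(conv1d_grouped(x, w, groups=2))           # [[[3.0, 5.0], [-1.0, -1.0]]]
```

The higher-dimensional cases add one more parallel output loop and one more reduction loop per spatial dimension.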
The linalg.generic for Conv3D essentially implements this Python loop
Test Plan
Added both e2e tests and lit tests. The e2e test values are verified against PyTorch.
Test Result
Passed both the e2e tests and the lit tests.
Submission Checklist