Problem Statement
Currently, the MLGraphBuilder.gemm() operation is restricted to 2D inputs for the a and b operands. However, MLGraphBuilder.matmul() fully supports batching via N-dimensional tensors.
If a developer wants to perform a batched General Matrix Multiplication — calculating $\alpha A B + \beta C$ across a batch — they cannot use the gemm operator natively.
Current Workaround & Limitations
To achieve a batched GEMM, developers currently have to emulate the operation manually by composing a subgraph of multiple operators (see the code sketch after this list):
- matmul (to handle the batched multiplication of A and B)
- mul (to scale the result by $\alpha$)
- mul (to scale C by $\beta$)
- add (to combine the results)
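For concreteness, here is a minimal sketch of this workaround. It assumes a WebNN MLGraphBuilder named builder; the shapes, input names, and the $\alpha = 2.0$, $\beta = 0.5$ values are illustrative, and the exact MLOperandDescriptor key (shape vs. dimensions) depends on the spec revision in use.

```js
// Hypothetical emulation of a batched GEMM with today's operators.
// Illustrative shapes: a = [8, 32, 64], b = [8, 64, 16], c = [8, 32, 16].
const a = builder.input('a', { dataType: 'float32', shape: [8, 32, 64] });
const b = builder.input('b', { dataType: 'float32', shape: [8, 64, 16] });
const c = builder.input('c', { dataType: 'float32', shape: [8, 32, 16] });

// alpha and beta must be materialized as scalar constant operands
// so that mul() can consume them.
const alpha = builder.constant({ dataType: 'float32', shape: [] },
                               new Float32Array([2.0]));
const beta = builder.constant({ dataType: 'float32', shape: [] },
                              new Float32Array([0.5]));

// Four graph nodes where a single batched GEMM kernel would suffice:
const ab = builder.matmul(a, b);            // batched A x B
const scaledAB = builder.mul(ab, alpha);    // alpha * (A x B)
const scaledC = builder.mul(c, beta);       // beta * C
const out = builder.add(scaledAB, scaledC); // alpha * A x B + beta * C
```

Note that the scalar multipliers have to become constant operands in their own right, which contributes directly to the graph bloat described below.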
This emulation is undesirable for a few reasons:
- Graph Bloat: It increases the number of nodes in the graph, adding overhead to graph construction and compilation.
- Performance Penalties: While some backend compilers might successfully fuse these separate matmul, mul, and add nodes back into a single batched GEMM operation, this fusion is not guaranteed. If the fusion step fails or isn't supported by a specific hardware backend, the execution will incur multiple memory read/write passes instead of utilizing a single, highly optimized hardware kernel (e.g., batched GEMM on GPUs or NPUs).
Proposed Solution
Update the gemm operator specification to accept N-dimensional tensors (rank >= 2). For inputs of rank > 2, the leading dimensions should be treated as batch dimensions, exactly as the matmul operation handles them today. A usage sketch follows.
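Under the proposed change, the emulation above collapses into a single node. A minimal sketch, reusing the batched operands from the earlier example (alpha and beta are the existing float members of MLGemmOptions, so no constant operands are needed):

```js
// Proposed: gemm accepts rank > 2 inputs and batches over the leading
// dimensions. For each batch index i:
//   out[i] = 2.0 * (a[i] x b[i]) + 0.5 * c[i]
const out = builder.gemm(a, b, { c, alpha: 2.0, beta: 0.5 });
```

This gives backends a single node to lower directly onto a batched GEMM kernel, rather than relying on fusion to recover one.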