Change matmul / kernel_bench tests to reuse CPU pipeline#140

Merged
rengolin merged 8 commits into llvm:main from rengolin:kb_pipeline
May 12, 2026

@rengolin
Member

@rengolin rengolin commented May 11, 2026

This PR moves all remaining schedules from the CPU test `matmul.py` into their own schedules in Lighthouse, so that the kernel_bench test can reuse them for similar tests (L1, K1 & K2).

The performance of matmul.py remains the same, while the perf for the Kernel Bench kernels has improved dramatically.

This currently works only for FP32, so the BF16 tests in KB still use the old (lower-to-loops) strategy. A follow-up PR will fix the BF16 performance and update the Kernel Bench testing to start tracking performance, just like `matmul.py`.

Fixes #123

assisted-by: GitHub Copilot

@rengolin rengolin requested a review from adam-smnk May 11, 2026 13:58
@rengolin
Member Author

PS: I added some benchmarking facilities on a separate branch and am getting 70-80% of the `matmul.py` performance on the two f32 KB tests, with sizes 32x32x32 for the first and 64x32x64 for the second.

Comment thread lighthouse/schedule/x86/cache_tiling.py Outdated
Comment thread examples/end-to-end/KernelBench/cpu_matmul.yaml
Member

@adam-smnk adam-smnk left a comment

Overall direction is good.
+1 for converging on schedules instead of transforms/both
-1 for finer details

Comment thread lighthouse/pipeline/descriptor.py Outdated
Comment thread examples/end-to-end/KernelBench/test_kernel_bench.py Outdated
Comment thread examples/end-to-end/KernelBench/test_kernel_bench.py
Comment thread examples/cpu/x86/matmul.py Outdated
Comment thread examples/cpu/x86/matmul.py
Comment thread examples/cpu/x86/matmul.py
Comment thread lighthouse/schedule/x86/register_tiling.py Outdated
Comment thread examples/cpu/x86/matmul.py
Comment thread lighthouse/schedule/packing.py Outdated
Comment thread lighthouse/schedule/vectorization.py Outdated
Comment thread lighthouse/schedule/x86/cache_tiling.py Outdated
Comment thread lighthouse/schedule/x86/cache_tiling.py
Comment thread lighthouse/schedule/packing.py
Comment thread lighthouse/schedule/packing.py Outdated
Comment thread lighthouse/schedule/x86/register_tiling.py
@rengolin rengolin merged commit affba79 into llvm:main May 12, 2026
3 checks passed
@rengolin rengolin deleted the kb_pipeline branch May 12, 2026 13:53
Linked issue: Create a CPU general pipeline

2 participants