CUDA Operator Practice Framework

This repository now includes a small CMake-based framework for CUDA operator practice.

Build

cmake -S . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j

Optional architecture override (86 targets compute capability 8.6; use the value that matches your GPU):

cmake -S . -B build -DCMAKE_CUDA_ARCHITECTURES=86

Correctness Test

ctest --test-dir build --output-on-failure

Current test:

  • vec_add_correctness

Benchmark

./build/vec_add_bench --size 16777216 --warmup 20 --iters 100

Arguments:

  • --size: number of elements (default 1<<24, i.e. 16777216)
  • --warmup: number of untimed warmup launches run before measurement
  • --iters: number of timed launches
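
The three flags above could be parsed along the following lines. This is a minimal host-only sketch, not the repository's actual parser: `BenchArgs` and `parse_bench_args` are hypothetical names, and the `warmup` and `iters` defaults are assumptions taken from the example command (only the `--size` default is documented).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical container for the benchmark options described above.
struct BenchArgs {
    std::size_t size = std::size_t{1} << 24;  // documented default (16777216 elements)
    int warmup = 20;                          // assumed default, not documented
    int iters = 100;                          // assumed default, not documented
};

// Parses "--flag value" pairs; throws std::invalid_argument on malformed input.
inline BenchArgs parse_bench_args(int argc, const char* const* argv) {
    assert(argv != nullptr);
    BenchArgs args;
    for (int i = 1; i < argc; ++i) {
        const std::string flag = argv[i];
        if (i + 1 >= argc) {
            throw std::invalid_argument("missing value for " + flag);
        }
        const char* value = argv[++i];
        char* end = nullptr;
        const unsigned long long parsed = std::strtoull(value, &end, 10);
        if (end == value || *end != '\0') {
            throw std::invalid_argument("bad value for " + flag);
        }
        if (flag == "--size") {
            args.size = static_cast<std::size_t>(parsed);
        } else if (flag == "--warmup") {
            args.warmup = static_cast<int>(parsed);
        } else if (flag == "--iters") {
            args.iters = static_cast<int>(parsed);
        } else {
            throw std::invalid_argument("unknown flag " + flag);
        }
    }
    return args;
}
```

A benchmark `main(int argc, char** argv)` would call `parse_bench_args(argc, argv)` once and pass the resulting fields to the allocation and timing code.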

Add a New Operator

  1. Add declaration in include/ops/<your_op>.cuh.
  2. Add CUDA implementation in src/ops/<your_op>.cu.
  3. Register source file in cuda_ops inside CMakeLists.txt.
  4. Add correctness test in tests/<your_op>_test.cu and register it with add_test in CMakeLists.txt.
  5. Add benchmark binary in benchmarks/<your_op>_bench.cu.
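
As an illustration of steps 1, 2 and 4, here is a single-file sketch of what a new operator and its correctness check might look like. The operator `scale`, its signature, and the tolerance are all hypothetical; the repository's own conventions (namespaces, helper macros, test harness) may differ, and in the layout above the pieces would be split across the listed files.

```cuda
// Single-file sketch. In the repository layout, the declaration would live in
// include/ops/scale.cuh, the kernel and launcher in src/ops/scale.cu, and
// main() in tests/scale_test.cu. "scale" is a hypothetical operator.
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Step 1: declaration. Computes y[i] = alpha * x[i]; x and y are device pointers.
void scale(const float* x, float* y, float alpha, int n, cudaStream_t stream = 0);

// Step 2: implementation, one thread per element with a bounds check.
__global__ void scale_kernel(const float* x, float* y, float alpha, int n) {
    const int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = alpha * x[i];
}

void scale(const float* x, float* y, float alpha, int n, cudaStream_t stream) {
    if (n <= 0) return;
    constexpr int kThreads = 256;
    const int blocks = (n + kThreads - 1) / kThreads;  // ceil(n / kThreads)
    scale_kernel<<<blocks, kThreads, 0, stream>>>(x, y, alpha, n);
}

// Step 4: correctness check against a CPU reference.
// A nonzero exit code is what CTest reports as a failed test.
int main() {
    const int n = 1 << 20;
    const float alpha = 2.5f;
    const size_t bytes = static_cast<size_t>(n) * sizeof(float);
    std::vector<float> h_x(n), h_y(n);
    for (int i = 0; i < n; ++i) h_x[i] = 0.001f * static_cast<float>(i);

    float* d_x = nullptr;
    float* d_y = nullptr;
    if (cudaMalloc(&d_x, bytes) != cudaSuccess) return 1;
    if (cudaMalloc(&d_y, bytes) != cudaSuccess) { cudaFree(d_x); return 1; }
    cudaMemcpy(d_x, h_x.data(), bytes, cudaMemcpyHostToDevice);

    scale(d_x, d_y, alpha, n);

    // The blocking copy waits for the kernel and surfaces any launch error.
    const cudaError_t err = cudaMemcpy(h_y.data(), d_y, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_x);
    cudaFree(d_y);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int i = 0; i < n; ++i) {
        const float ref = alpha * h_x[i];
        if (std::fabs(h_y[i] - ref) > 1e-5f * std::fabs(ref)) {
            std::fprintf(stderr, "mismatch at %d: got %f, expected %f\n", i, h_y[i], ref);
            return 1;
        }
    }
    return 0;
}
```

The launcher hides the grid configuration from callers, so the test and the benchmark can share the same entry point.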

Reusable helpers:

  • include/common/cuda_check.cuh: CUDA error checks.
  • include/common/timer.cuh: event-based GPU timing.
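
The contents of these headers are not shown here, so the following is only a sketch of what such helpers commonly look like and how a benchmark might combine them with warmup and timed launches. `CUDA_CHECK`, `GpuTimer`, and `noop_kernel` are hypothetical names; the actual macros and classes in the repository may be named and structured differently.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical error-check macro: aborts with file and line on any CUDA error.
#define CUDA_CHECK(expr)                                                  \
    do {                                                                  \
        cudaError_t err_ = (expr);                                        \
        if (err_ != cudaSuccess) {                                        \
            std::fprintf(stderr, "CUDA error \"%s\" at %s:%d\n",          \
                         cudaGetErrorString(err_), __FILE__, __LINE__);   \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

// Hypothetical event-based timer: measures GPU time between start() and stop_ms().
struct GpuTimer {
    cudaEvent_t start_{};
    cudaEvent_t stop_{};

    GpuTimer() {
        CUDA_CHECK(cudaEventCreate(&start_));
        CUDA_CHECK(cudaEventCreate(&stop_));
    }
    ~GpuTimer() {
        cudaEventDestroy(start_);
        cudaEventDestroy(stop_);
    }
    void start(cudaStream_t stream = 0) {
        CUDA_CHECK(cudaEventRecord(start_, stream));
    }
    // Records the stop event, waits for it, and returns elapsed milliseconds.
    float stop_ms(cudaStream_t stream = 0) {
        CUDA_CHECK(cudaEventRecord(stop_, stream));
        CUDA_CHECK(cudaEventSynchronize(stop_));
        float ms = 0.0f;
        CUDA_CHECK(cudaEventElapsedTime(&ms, start_, stop_));
        return ms;
    }
};

// Placeholder kernel standing in for the operator under test.
__global__ void noop_kernel() {}

int main() {
    const int warmup = 20;
    const int iters = 100;

    // Untimed warmup launches.
    for (int i = 0; i < warmup; ++i) noop_kernel<<<1, 32>>>();
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaDeviceSynchronize());

    // Timed launches: one event pair around the whole loop.
    GpuTimer timer;
    timer.start();
    for (int i = 0; i < iters; ++i) noop_kernel<<<1, 32>>>();
    CUDA_CHECK(cudaGetLastError());
    const float total_ms = timer.stop_ms();

    std::printf("average: %.3f us per launch\n", total_ms * 1000.0f / iters);
    return 0;
}
```

Timing the whole loop with a single event pair amortizes event overhead across launches; per-launch event pairs would instead expose variance at the cost of extra overhead.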