gfx906-fa-vllm (Public)
FlashAttention-style custom attention backend for vLLM on AMD MI50/MI60/Radeon VII (gfx906). Downstream fork of mixa3607/ML-gfx906, with replacement HIP kernels and a `vllm.general_plugins` entry point.
Python