OpenAlchemy fork of whisper.cpp — RTX 50-series Blackwell (sm_120) NULL-slot guards on top of upstream. Powers /v1/audio/transcriptions
Updated Jun 1, 2026 - C++
🧪 OpenAlchemy fork of llama.cpp — TurboQuant KV-cache compression (3-bit / 2-bit). ~4 GB VRAM saved on Qwen2.5-Coder-14B Q4_K_M @ 32k ctx, +47% gen speed.