High-performance CPU KV-cache quantization engine for LLM inference (~10× speedup, 4× memory reduction) with Python & PyTorch support.
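The repository itself is not shown here, so as a rough illustration only: a 4× memory reduction is consistent with quantizing an fp32 KV cache to int8. The sketch below (function names and per-channel symmetric scheme are my assumptions, not this project's API) shows the idea.

```python
import numpy as np

# Hypothetical sketch: symmetric per-channel INT8 quantization of a KV-cache
# slice. fp32 -> int8 yields the 4x memory reduction the description cites.
def quantize_kv(kv: np.ndarray):
    # kv: (num_tokens, head_dim) fp32 slice of the key or value cache
    scale = np.abs(kv).max(axis=0, keepdims=True) / 127.0  # per-channel scale
    scale = np.where(scale == 0, 1.0, scale)               # avoid div-by-zero
    q = np.clip(np.round(kv / scale), -127, 127).astype(np.int8)
    return q, scale.astype(np.float32)

def dequantize_kv(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    return q.astype(np.float32) * scale

kv = np.random.randn(16, 8).astype(np.float32)
q, s = quantize_kv(kv)
err = np.abs(dequantize_kv(q, s) - kv).max()  # bounded by half a scale step
```

A real engine would additionally fuse dequantization into the SIMD (e.g. AVX2) attention kernels rather than materializing the fp32 tensor, which is where the claimed CPU speedup would come from.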
Topics: python, inference, pytorch, simd, attention, avx2, quantization, kv-cache, llm, cpu-optimization, machine-learning-performance
Updated Apr 25, 2026 · C++