aphroditeformal93

Popular repositories

vllm-awq4-qwen (Public)

Run Qwen 3.6-27B AWQ-INT4 models with DFlash speculative decoding on AMD Strix Halo hardware using vLLM for high-throughput inference.

Python