GPU-tuned Docker images for LLM inference on consumer hardware. Auto-detects your GPU, downloads the model, serves an OpenAI-compatible API.
Updated Mar 2, 2026 - Shell
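A minimal usage sketch of the workflow the description outlines: run the container, let it detect the GPU and pull the model, then query the OpenAI-compatible endpoint. The image name, port, and model identifier below are illustrative assumptions, not taken from the repository.

```shell
# Start the inference container (image name, tag, and port are assumptions).
# --gpus all exposes the host GPU so the image can auto-detect it.
docker run --gpus all -p 8000:8000 ghcr.io/example/llm-inference:latest

# Once the server is up, talk to it like any OpenAI-compatible API.
# The model name "local" is a placeholder; the real server may expect
# a specific model identifier.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "local",
        "messages": [{"role": "user", "content": "Hello"}]
      }'
```

Because the API is OpenAI-compatible, existing OpenAI client libraries should also work by pointing their base URL at `http://localhost:8000/v1`.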