[feature] Offload KV cache from GPU (VRAM) to RAM

Running on an RTX 6000 (96 GB), 92 GB of VRAM is consumed with the model loaded and 1 user (max 100K context). The remaining 4 GB would fill up quickly with multiple users, even if they are not concurrent.
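To give a rough sense of why 4 GB does not go far, here is a back-of-the-envelope sketch of KV cache size for a standard transformer (keys plus values, per layer, per KV head). The model dimensions below are hypothetical placeholders, not the model from this report, and the function names are made up for illustration. Plug in the real values from the model config.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_tokens, bytes_per_elem=2):
    """Approximate KV cache size in bytes for n_tokens of context.

    The factor 2 accounts for storing both keys and values.
    bytes_per_elem=2 assumes an fp16/bf16 cache (an assumption).
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * n_tokens


def tokens_that_fit(free_bytes, bytes_per_token):
    """How many tokens of KV cache fit into the given free memory."""
    return free_bytes // bytes_per_token


if __name__ == "__main__":
    # Hypothetical model shape, NOT taken from this issue.
    layers, kv_heads, head_dim = 48, 8, 128

    per_token = kv_cache_bytes(layers, kv_heads, head_dim, n_tokens=1)
    full_ctx = kv_cache_bytes(layers, kv_heads, head_dim, n_tokens=100_000)
    free_vram = 4 * 10**9  # the remaining 4 GB mentioned above

    print(f"KV cache per token:    {per_token} bytes")
    print(f"KV cache at 100K ctx:  {full_ctx / 1e9:.2f} GB")
    print(f"Tokens fitting in 4GB: {tokens_that_fit(free_vram, per_token)}")
```

With numbers in this range, the leftover VRAM holds only a fraction of one additional full-length context, which is the case where spilling idle users' KV cache to system RAM would help.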