feat: wire gpu backend #1

Merged
mudler merged 8 commits into main from feat/gpu-backend on May 29, 2026
mudler (Collaborator) commented on May 29, 2026

This wires up the NVIDIA, Metal, Vulkan, and hipBLAS GPU backends.

mudler and others added 8 commits May 29, 2026 13:26
…efines

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ation pending

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
On-hardware validation (plan Task 7) complete on an NVIDIA GB10 (Grace
Blackwell, CUDA 13.0, compute capability 12.1):
  - GPU backend activates; weights realized in VRAM
  - rfdetr-base F16: 23.6 ms/image median (vs 274 ms on the same box's ARM CPU),
    an 11.6x speedup
  - detections match the CPU baseline 8/8 within tolerance (score diff <= 0.05,
    bbox diff <= 2 px)
  - the 3 deformable-attention CUSTOM ops confirmed running on CPU via the
    scheduler fallback (GGML_SCHED_DEBUG=2)

Flips the README/BENCHMARK GPU status from 'validation pending' to validated
with real numbers.
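
The tolerance check and speedup figure reported above can be sketched roughly as follows. This is a minimal illustration, not the validation script used in this PR: the `Detection` type, the `(x1, y1, x2, y2)` pixel box layout, the requirement that class labels agree, and the greedy one-to-one pairing are all assumptions. Only the two tolerances (score <= 0.05, bbox <= 2 px) and the timings come from the commit message.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Detection:
    # Hypothetical container; the real project may represent detections differently.
    label: int
    score: float
    box: Tuple[float, float, float, float]  # assumed (x1, y1, x2, y2) in pixels


def matches(cpu: Detection, gpu: Detection,
            score_tol: float = 0.05, bbox_tol: float = 2.0) -> bool:
    """True if a GPU detection agrees with a CPU detection within tolerance."""
    if cpu.label != gpu.label:  # assumption: labels must agree exactly
        return False
    if abs(cpu.score - gpu.score) > score_tol:
        return False
    # Every box coordinate must be within bbox_tol pixels.
    return all(abs(a - b) <= bbox_tol for a, b in zip(cpu.box, gpu.box))


def count_matches(cpu_dets: List[Detection], gpu_dets: List[Detection],
                  score_tol: float = 0.05, bbox_tol: float = 2.0) -> int:
    """Greedily pair each CPU detection with one unused GPU detection."""
    unused = list(gpu_dets)
    matched = 0
    for c in cpu_dets:
        for g in unused:
            if matches(c, g, score_tol, bbox_tol):
                unused.remove(g)  # each GPU detection is used at most once
                matched += 1
                break
    return matched


def speedup(cpu_ms: float, gpu_ms: float) -> float:
    """Ratio of CPU latency to GPU latency."""
    return cpu_ms / gpu_ms


if __name__ == "__main__":
    # Timings taken from the commit message above.
    print(f"{speedup(274.0, 23.6):.1f}x")
```

With the reported medians, 274 ms / 23.6 ms gives roughly 11.6, which is where the quoted speedup comes from. A result such as "8/8" would correspond to `count_matches` returning 8 for 8 baseline detections.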
@mudler mudler merged commit 65c0ffc into main May 29, 2026
1 check passed
