
Add GGML CUDA backend support in InferenceEngine and GGML native bridge#7

Open
zhongkaifu wants to merge 3 commits into main from
codex/update-to-support-ggml_cuda-backend
Conversation

@zhongkaifu
Owner

Motivation

  • Enable use of the GGML CUDA backend so GGML-based models can run on NVIDIA GPUs where available.
  • Make the managed GGML layer and native bridge detect, initialize, and report CUDA support, and improve native library resolution across platforms.

Description

  • Added BackendType.GgmlCuda and wired model initialization in InferenceEngine.ModelBase to create a GgmlContext with GgmlBackendType.Cuda and use GgmlAllocator for CUDA-backed models.
  • Extended TensorSharp.GGML surface: added GgmlBackendType.Cuda, updated GgmlAllocator.BlasEnum to return BlasEnum.CUDA for CUDA backends, and improved native library probing to look for platform-specific filenames (.dll, .so, .dylib).
  • Updated native bridge (TensorSharp.GGML.Native/ggml_ops.cpp) to detect ggml-cuda.h when present, added BACKEND_TYPE_CUDA, and initialize CUDA with ggml_backend_cuda_init(0) when compiled in, while returning a clear error if CUDA support was not built into the native bridge.
  • Made CMakeLists.txt configurable to enable CUDA using TSG_ENABLE_CUDA and gated Metal/ObjC language and flags to Apple builds only so the native build can be targeted per-platform.
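The native-bridge change above can be sketched as follows. This is a minimal illustration of the conditional-compilation pattern, not the actual ggml_ops.cpp: the guard macro TSG_HAVE_CUDA, the function tsg_init_backend, and the BACKEND_TYPE_* values are hypothetical stand-ins; only ggml-cuda.h, BACKEND_TYPE_CUDA, and ggml_backend_cuda_init(0) come from the PR description.

```cpp
#include <cstdio>

// Detect ggml-cuda.h at compile time; only then is CUDA support compiled in.
#if defined(__has_include)
#  if __has_include("ggml-cuda.h")
#    include "ggml-cuda.h"
#    define TSG_HAVE_CUDA 1
#  endif
#endif

// Illustrative backend identifiers (the PR adds a CUDA backend type).
#define BACKEND_TYPE_CPU  0
#define BACKEND_TYPE_CUDA 1

// Returns 0 on success, -1 when CUDA was requested but not built in.
extern "C" int tsg_init_backend(int backend_type) {
    if (backend_type == BACKEND_TYPE_CUDA) {
#ifdef TSG_HAVE_CUDA
        // Initialize CUDA on device 0, as the PR describes.
        return ggml_backend_cuda_init(0) != nullptr ? 0 : -1;
#else
        // Clear error when CUDA support was not built into the bridge.
        std::fprintf(stderr,
            "CUDA support was not built into the native bridge\n");
        return -1;
#endif
    }
    return 0; // CPU backend needs no special initialization here
}
```

When the bridge is built without TSG_ENABLE_CUDA (so ggml-cuda.h is absent), a CUDA initialization request fails with the explicit error path instead of crashing, which matches the behavior the description calls out.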

Testing

  • Attempted dotnet build InferenceEngine/InferenceEngine.csproj -v minimal, but dotnet was not available in the environment, so the managed build could not be executed.
  • No native cmake/make build was run in this environment, so native bridge compilation with CUDA was not validated by automated builds here.

Codex Task

