Add GGML CUDA backend support across inference and native bridge#3

Open
zhongkaifu wants to merge 1 commit into main from codex/update-inferenceengine-for-ggml_cuda-support
Conversation

@zhongkaifu (Owner)

Motivation

  • Enable selection and use of the GGML CUDA backend end-to-end so models can run on CUDA-capable GPUs through the existing GGML bridge and inference stack.
  • Provide a compile-time toggle to build the native GGML bridge with CUDA to avoid changing default builds when CUDA is not available.

Description

  • Added BackendType.GgmlCuda and wired it to GgmlBackendType.Cuda so the inference layer can select CUDA-accelerated GGML execution. (files: InferenceEngine/ModelBase.cs, InferenceConsole/Program.cs, InferenceWeb/ModelService.cs, InferenceWeb/wwwroot/index.html)
  • Extended the managed GGML bindings with GgmlBackendType.Cuda and improved error messaging to report ggml-cuda initialization failures. (file: TensorSharp.GGML/GgmlNative.cs)
  • Mapped the GGML allocator to report BlasEnum.CUDA for CUDA backends and generalized the Float32-only error text so it is no longer Metal-specific. (files: TensorSharp.GGML/GgmlAllocator.cs, TensorSharp.GGML/GgmlStorage.cs, TensorSharp.GGML/GgmlBasicOps.cs)
  • Added conditional CUDA support to the native bridge: a ggml-cuda.h include, BACKEND_TYPE_CUDA handling with a runtime guard via ggml_backend_cuda_init(0), and a -DTENSORSHARP_ENABLE_CUDA=ON CMake option that drives GGML_CUDA. (files: TensorSharp.GGML.Native/ggml_ops.cpp, TensorSharp.GGML.Native/CMakeLists.txt)
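The compile-time toggle described above might be wired roughly like this in the bridge's CMakeLists.txt. This is a minimal sketch, not the PR's actual diff: the option name, the GGML_CUDA variable, and the GGML_USE_CUDA guard come from the PR text and review comment, while the exact forwarding mechanism is assumed.

```cmake
# Sketch only: OFF by default so builds without CUDA are unchanged.
option(TENSORSHARP_ENABLE_CUDA "Build the GGML native bridge with CUDA support" OFF)

if(TENSORSHARP_ENABLE_CUDA)
  # Forward the toggle to ggml's own CUDA switch before ggml is configured.
  set(GGML_CUDA ON CACHE BOOL "Enable ggml CUDA backend" FORCE)
  # Preprocessor guard assumed to gate the #ifdef blocks in ggml_ops.cpp.
  add_compile_definitions(GGML_USE_CUDA)
endif()
```

Configured with `cmake -S TensorSharp.GGML.Native -B build -DTENSORSHARP_ENABLE_CUDA=ON`, a pattern like this keeps default builds untouched while letting CUDA be opted in explicitly.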

Testing

  • Ran dotnet build TensorSharp.slnx -c Release; it failed because dotnet is not installed in this environment. (result: failed)
  • Ran a CMake configure with CUDA enabled, cmake -S TensorSharp.GGML.Native -B /tmp/tensorsharp-ggml-native-build -DTENSORSHARP_ENABLE_CUDA=ON; it failed early, before any CUDA-specific checks could run, due to a missing Objective-C compiler component (cc1obj). (result: failed)
  • No unit or integration test runs completed in this environment due to the missing toolchain components, but runtime guards and compile-time gating are in place to prevent accidental CUDA usage when the native bridge is built without CUDA support.

Codex Task


@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a2baff2d5e


Comment on lines +305 to +306
set_last_error("ggml-cuda backend requested, but this native bridge was built without CUDA support.");
return;


P1: Reset backend state when CUDA support is unavailable

When ggml_cuda is requested on a build compiled without GGML_USE_CUDA, this branch returns after setting an error, but ensure_backend has already latched g_backend_type to CUDA, and std::call_once prevents re-initialization. That leaves the process stuck: later attempts to load ggml_cpu or ggml_metal fail with "A different GGML backend was already initialized," so a single failed CUDA attempt permanently breaks backend selection until restart.

