Add GGML CUDA backend support across inference and native bridge #3
zhongkaifu wants to merge 1 commit into main from
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a2baff2d5e
```cpp
set_last_error("ggml-cuda backend requested, but this native bridge was built without CUDA support.");
return;
```
Reset backend state when CUDA support is unavailable

When `ggml_cuda` is requested on a build compiled without `GGML_USE_CUDA`, this branch returns after setting an error, but `ensure_backend` has already latched `g_backend_type` to CUDA, and `std::call_once` prevents re-initialization. The process is then stuck: later attempts to load `ggml_cpu` or `ggml_metal` fail with "A different GGML backend was already initialized", so a single failed CUDA attempt permanently breaks backend selection until restart. The failure branch should reset the latched backend state so a subsequent request can retry with a different backend.
Motivation
Description
- Added `BackendType.GgmlCuda` and wired it to `GgmlBackendType.Cuda` so the inference layer can select CUDA-accelerated GGML execution. (files: `InferenceEngine/ModelBase.cs`, `InferenceConsole/Program.cs`, `InferenceWeb/ModelService.cs`, `InferenceWeb/wwwroot/index.html`)
- Added `GgmlBackendType.Cuda` and improved error messaging to report `ggml-cuda` initialization failures. (file: `TensorSharp.GGML/GgmlNative.cs`)
- Mapped `BlasEnum.CUDA` for CUDA backends and generalized the Float32-only error text (no longer Metal-specific). (files: `TensorSharp.GGML/GgmlAllocator.cs`, `TensorSharp.GGML/GgmlStorage.cs`, `TensorSharp.GGML/GgmlBasicOps.cs`)
- Added the `ggml-cuda.h` include, `BACKEND_TYPE_CUDA` handling with a runtime guard using `ggml_backend_cuda_init(0)`, and a `-DTENSORSHARP_ENABLE_CUDA=ON` CMake option that drives `GGML_CUDA`. (files: `TensorSharp.GGML.Native/ggml_ops.cpp`, `TensorSharp.GGML.Native/CMakeLists.txt`)

Testing
- Ran `dotnet build TensorSharp.slnx -c Release`, which failed because `dotnet` is not installed in this environment. (result: failed)
- Ran `cmake -S TensorSharp.GGML.Native -B /tmp/tensorsharp-ggml-native-build -DTENSORSHARP_ENABLE_CUDA=ON`, which failed early, before CUDA-specific checks could run, due to a missing Objective-C compiler component (`cc1obj`) in the environment. (result: failed)
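For context, the `-DTENSORSHARP_ENABLE_CUDA=ON` switch exercised above could be wired to GGML roughly as follows. This is a sketch only: the target name `tensorsharp_ggml_native` and the exact cache-variable handling are assumptions, not the PR's actual `CMakeLists.txt`:

```cmake
option(TENSORSHARP_ENABLE_CUDA "Build the native bridge with GGML CUDA support" OFF)

if(TENSORSHARP_ENABLE_CUDA)
  # GGML_CUDA drives ggml's own CUDA build; the compile definition lets
  # ggml_ops.cpp guard its BACKEND_TYPE_CUDA branch at compile time.
  set(GGML_CUDA ON CACHE BOOL "Enable ggml CUDA backend" FORCE)
  target_compile_definitions(tensorsharp_ggml_native PRIVATE GGML_USE_CUDA)
endif()
```

Without the option, the `#else` branch in `ggml_ops.cpp` reports the "built without CUDA support" error quoted in the review above.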