
Add GGML CUDA backend support in InferenceEngine and GGML native bridge#7

Open
zhongkaifu wants to merge 3 commits into main from
codex/update-to-support-ggml_cuda-backend
Conversation

@zhongkaifu
Owner

Motivation

  • Enable use of the GGML CUDA backend so GGML-based models can run on NVIDIA GPUs where available.
  • Make the managed GGML layer and native bridge detect, initialize, and report CUDA support, and improve native library resolution across platforms.

Description

  • Added BackendType.GgmlCuda and wired model initialization in InferenceEngine.ModelBase to create a GgmlContext with GgmlBackendType.Cuda and use GgmlAllocator for CUDA-backed models.
  • Extended TensorSharp.GGML surface: added GgmlBackendType.Cuda, updated GgmlAllocator.BlasEnum to return BlasEnum.CUDA for CUDA backends, and improved native library probing to look for platform-specific filenames (.dll, .so, .dylib).
  • Updated native bridge (TensorSharp.GGML.Native/ggml_ops.cpp) to detect ggml-cuda.h when present, added BACKEND_TYPE_CUDA, and initialize CUDA with ggml_backend_cuda_init(0) when compiled in, while returning a clear error if CUDA support was not built into the native bridge.
  • Made CMakeLists.txt configurable to enable CUDA using TSG_ENABLE_CUDA and gated Metal/ObjC language and flags to Apple builds only so the native build can be targeted per-platform.
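The native-bridge change above can be sketched as follows. This is a minimal illustration of the conditional-compilation pattern, not the actual ggml_ops.cpp: the guard macro TSG_HAVE_CUDA, the function tsg_init_backend, and the BACKEND_TYPE_* values are hypothetical stand-ins; only ggml-cuda.h, BACKEND_TYPE_CUDA, and ggml_backend_cuda_init(0) come from the PR description.

```cpp
#include <cstdio>

// Detect ggml-cuda.h at compile time; only then is CUDA support compiled in.
#if defined(__has_include)
#  if __has_include("ggml-cuda.h")
#    include "ggml-cuda.h"
#    define TSG_HAVE_CUDA 1
#  endif
#endif

// Illustrative backend identifiers (the PR adds a CUDA backend type).
#define BACKEND_TYPE_CPU  0
#define BACKEND_TYPE_CUDA 1

// Returns 0 on success, -1 when CUDA was requested but not built in.
extern "C" int tsg_init_backend(int backend_type) {
    if (backend_type == BACKEND_TYPE_CUDA) {
#ifdef TSG_HAVE_CUDA
        // Initialize CUDA on device 0, as the PR describes.
        return ggml_backend_cuda_init(0) != nullptr ? 0 : -1;
#else
        // Clear error when CUDA support was not built into the bridge.
        std::fprintf(stderr,
            "CUDA support was not built into the native bridge\n");
        return -1;
#endif
    }
    return 0; // CPU backend needs no special initialization here
}
```

When the bridge is built without TSG_ENABLE_CUDA (so ggml-cuda.h is absent), a CUDA initialization request fails with the explicit error path instead of crashing, which matches the behavior the description calls out.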

Testing

  • Attempted dotnet build InferenceEngine/InferenceEngine.csproj -v minimal, but dotnet was not available in the environment, so the managed build could not be executed.
  • No native cmake/make build was run in this environment, so native bridge compilation with CUDA was not validated by automated builds here.

Codex Task

