feat: Add support for split GGUF model loading #808
Open
hongkongkiwi wants to merge 4 commits into utilityai:main from
Conversation
This commit introduces comprehensive support for loading models from multiple split files:

- Added `load_from_splits()` method to `LlamaModel` for loading models split across multiple files
- Added utility functions `split_path()` and `split_prefix()` for working with split file naming conventions
- Added split_model example demonstrating usage of the split loading functionality
- Updated workspace Cargo.toml to include the new split_model example

This feature enables loading very large models that have been split due to filesystem limitations or distribution requirements.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Remove unused Path import from split_model example
- Remove RPC example from workspace members on split-model-loading branch

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Added documentation comments for RopeType enum variants
- Ensures all public APIs are properly documented

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Author
This basically adds missing functionality into the Rust library that's present in llama.cpp in the tools dir. I think it's worth keeping feature parity so all llama.cpp features can be used.
…r split-model-loading

This commit adds the scalable dynamic tools building system to the split-model-loading branch:

- Adds generate_tools_cmake() function to dynamically create tools/CMakeLists.txt
- Only builds tools for enabled features (solving PR utilityai#806 issue)
- Split model loading doesn't require tools but maintains architecture consistency
- Includes tools/CMakeLists.txt in Cargo.toml for build system compatibility
- Uses feature-based conditional compilation for future extensibility

This creates a merge-friendly architecture where each feature branch can extend tool building without conflicts.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Contributor
This seems to include #810; please separate these PRs properly.
Summary
- `load_from_splits()` method to `LlamaModel` for loading models split across multiple files
- `split_path()` utility function to generate standardized split file paths
- `split_prefix()` utility function to extract the prefix from split file paths

Features Added
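To make the "standardized split file paths" concrete, here is a standalone sketch of the naming convention llama.cpp uses for split GGUF files (`<prefix>-%05d-of-%05d.gguf`, with a 1-based split index). This is an illustrative reimplementation for reference, not the crate's actual code, which delegates to llama.cpp:

```rust
// Sketch of llama.cpp's split-file naming convention: a zero-padded,
// 1-based split index and the total split count appended to a prefix.
// Assumes the "-%05d-of-%05d.gguf" format used by llama.cpp's split tooling.
fn split_path(prefix: &str, split_no: i32, split_count: i32) -> String {
    // split_no is 0-based in the call, but rendered 1-based in the file name.
    format!("{prefix}-{:05}-of-{:05}.gguf", split_no + 1, split_count)
}

fn main() {
    // A model split into three parts yields three numbered files.
    for i in 0..3 {
        println!("{}", split_path("models/llama-70b", i, 3));
    }
}
```

Running this prints `models/llama-70b-00001-of-00003.gguf` through `models/llama-70b-00003-of-00003.gguf`, the shape of paths `load_from_splits()` expects to receive.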
API Changes
- `LlamaModel::load_from_splits(&backend, &[impl AsRef<Path>], &params) -> Result<Self, LlamaModelLoadError>`
- `LlamaModel::split_path(prefix: &str, split_no: i32, split_count: i32) -> String`
- `LlamaModel::split_prefix(split_path: &str, split_no: i32, split_count: i32) -> Option<String>`

Test Plan
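The `Option<String>` return of `split_prefix` can be illustrated with a self-contained sketch of the inverse operation: recovering the prefix from a split file name, or `None` when the name does not match the expected index and count. This mirrors the signature above but is an illustrative reimplementation, not the crate's FFI-backed code:

```rust
// Sketch: recover the path prefix from a split file name, assuming the
// llama.cpp "-%05d-of-%05d.gguf" suffix convention with a 1-based index.
fn split_prefix(split_path: &str, split_no: i32, split_count: i32) -> Option<String> {
    // The suffix this particular split should carry, e.g. "-00002-of-00003.gguf".
    let suffix = format!("-{:05}-of-{:05}.gguf", split_no + 1, split_count);
    // Some(prefix) on a match, None otherwise - the wrapper signals failure via Option.
    split_path
        .strip_suffix(&suffix)
        .map(|prefix| prefix.to_string())
}

fn main() {
    // Matching index and count: the prefix is recovered.
    assert_eq!(
        split_prefix("llama-70b-00002-of-00003.gguf", 1, 3),
        Some("llama-70b".to_string())
    );
    // Wrong index for this file name: None.
    assert_eq!(split_prefix("llama-70b-00002-of-00003.gguf", 0, 3), None);
    println!("ok");
}
```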
Technical Details
The implementation uses the underlying `llama_model_load_from_splits`, `llama_split_path`, and `llama_split_prefix` functions from llama.cpp, providing safe Rust wrappers with proper error handling and memory management.

🤖 Generated with Claude Code
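The "safe Rust wrapper" pattern for a C function like `llama_split_path`, which writes a NUL-terminated string into a caller-provided buffer, can be sketched as follows. The C side is mocked here with a plain Rust function (`fake_c_split_path`, a hypothetical stand-in, not a real binding) so the sketch is self-contained; in the real crate this would be an `extern "C"` declaration generated from llama.h:

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

// Hypothetical stand-in for a C function that writes a NUL-terminated path
// into `buf` (at most `maxlen` bytes including the terminator) and returns
// the number of bytes written, mimicking llama_split_path's buffer contract.
unsafe fn fake_c_split_path(buf: *mut c_char, maxlen: usize, prefix: &str, no: i32, count: i32) -> i32 {
    let s = format!("{prefix}-{:05}-of-{:05}.gguf", no + 1, count);
    let bytes = s.as_bytes();
    let n = bytes.len().min(maxlen - 1);
    std::ptr::copy_nonoverlapping(bytes.as_ptr() as *const c_char, buf, n);
    *buf.add(n) = 0; // NUL terminator
    n as i32
}

// The safe-wrapper pattern: own the buffer on the Rust side, hand the C
// function a raw pointer, then convert the result into an owned String.
fn safe_split_path(prefix: &str, no: i32, count: i32) -> String {
    let mut buf = vec![0 as c_char; 1024];
    let len = unsafe { fake_c_split_path(buf.as_mut_ptr(), buf.len(), prefix, no, count) };
    let cstr = unsafe { CStr::from_ptr(buf.as_ptr()) };
    debug_assert_eq!(cstr.to_bytes().len(), len as usize);
    cstr.to_string_lossy().into_owned()
}

fn main() {
    println!("{}", safe_split_path("model", 0, 2));
}
```

The key memory-management point is that the buffer lives in safe Rust code and is freed automatically when `buf` drops; only the pointer handoff and the `CStr` read are `unsafe`.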