Unified model catalog across HF, LM Studio, Ollama, llama.cpp + content-based resolution#2012
Draft
AlexCheema wants to merge 2 commits into
Conversation
… content-based resolution Unifies exo's downloads view with locally-installed models from external tools so users can see (and where format permits, load) what they already have on disk without re-downloading. Drops the long-standing requirement that every model ship a safetensors index — single-file models like Qwen/Qwen3-0.6B now work. Switches model lookup from path-based to content-based fingerprinting so a mistyped folder name no longer triggers a redundant re-download. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
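The single-file relaxation can be sketched as a completeness check along these lines (a minimal sketch with assumed helper names; exo's real scanner also handles sharded weights and other formats):

```python
from pathlib import Path

def is_model_directory_complete(model_dir: Path) -> bool:
    """A model dir counts as complete if it has either a safetensors
    index (sharded models) or a fully-downloaded single-file weights
    file. In-progress downloads carry a `.partial` suffix, so a plain
    `model.safetensors` means the download finished."""
    if (model_dir / "model.safetensors.index.json").is_file():
        return True
    return (model_dir / "model.safetensors").is_file()
```

Under this check, a single-file model like `Qwen/Qwen3-0.6B` is accepted without any `model.safetensors.index.json` present.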
…he receive loop A peer running an incompatible schema (newer or older) used to take down the entire gossipsub receive loop with a single ValidationError. Caught locally — log and drop the bad message, keep the loop alive. Surfaced while testing this branch against a peer on a different schema; the fix is independent of the schema work. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
A `ModelSource` registry detects locally-installed models from the HuggingFace cache, LM Studio, Ollama, llama.cpp, and exo's own dirs. Each entry surfaces in `state.local_models`, tagged with `source` + `format` (safetensors / mlx / gguf). The Models page (renamed from Downloads) shows source badges, a source filter, and a "GGUF · n/a — not yet loadable" badge for entries exo can't run today. Inference path-resolution falls through to external sources, so MLX/safetensors models in the HF cache or LM Studio load without re-downloading.

Single-file models (`Qwen/Qwen3-0.6B`): drop the hard requirement on `model.safetensors.index.json`. `fetch_safetensors_size` falls back to `huggingface_hub.model_info().safetensors.total` when the index 404s; `_scan_model_directory` / `is_model_directory_complete` recognise a non-`.partial` `model.safetensors` as a complete dir.

Content-based resolution: `resolve_existing_model` now does a convention pass first (unchanged), then falls through to a content pass that fingerprints architecture-defining keys in `config.json` and finds a matching dir regardless of folder name. Rename `Qwen--Qwen3-0.6B/` to `wrong-typo/` and exo still finds it — no redundant re-download. Fingerprint cost is sub-millisecond at realistic model counts (10 μs/dir scanned).

End-to-end verified live on macOS against a real LM Studio install, a real HF cache, and a real Ollama install.
Test plan
- `uv run basedpyright` — 0 errors
- `uv run ruff check` — clean
- `nix fmt` — applied
- `uv run pytest` — 445 passed (+22 new: 19 source scanners + 4 scanner service + 8 fingerprint + 7 content resolution + 7 single-file)
- `npx svelte-check` — no new dashboard errors (16 pre-existing, unchanged)
- `GET /sources` returns the 5 sources with availability flags
- `GET /state` — `localModels` populates after the worker's first scan tick
- `POST /models/add Qwen/Qwen3-0.6B` returns 200 (single-file regression)
- Chat with `Qwen/Qwen3-0.6B` returns tokens
- `mv ~/.exo/models/Qwen--Qwen3-0.6B ~/.exo/models/wrong-typo`, then chat — log shows `Resolved … via content fingerprint (folder name 'wrong-typo' != convention 'Qwen--Qwen3-0.6B')`, no re-download
- `/downloads` — verify source badges, filter chips, the GGUF "not yet loadable" state, and that delete buttons are hidden on non-exo entries
- A sharded model (`mlx-community/Qwen3-30B-A3B-4bit`) still downloads and runs — regression check

Out of scope (named so we don't slip)
- Wiring `state.local_models` into the chat model picker — independent UI follow-up. Today the Models page surfaces them; the picker still searches HF + bundled cards.

🤖 Generated with Claude Code