fix: use Ollama OpenAI-compatible endpoint (/v1) by default #685
The Ollama provider inherits from OpenAI but used the native API base (e.g. http://localhost:11434), which doesn't serve endpoints like /chat/completions or /models. This caused:
- Model listing to fail with 404 (tried /models instead of /api/tags)
- with_schema to silently not work (models not in registry, no structured_output capability detected)

Fix: automatically append /v1 to ollama_api_base so the provider uses Ollama's OpenAI-compatible endpoint. Idempotent: does not double-append if /v1 is already present.

This gives:
- Model listing works (/v1/models)
- Chat works (/v1/chat/completions)
- with_schema works (response_format passed correctly)
- All OpenAI provider logic inherited without changes

Tested with Ollama 0.17.5, gemma:latest (9B), llama3.2:3b (3B).

Note: Ollama's native /api/chat endpoint with `format` param provides better nested JSON Schema enforcement than the OpenAI-compatible endpoint. A future enhancement could use the native endpoint for structured output.
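To make the note about the native endpoint concrete, here is a minimal sketch of how the two request bodies differ when a JSON Schema is attached. It only builds the payloads and sends nothing over the network. The schema, the message, and the helper names `openai_compatible_body` and `native_body` are hypothetical and are not part of RubyLLM; the exact body RubyLLM sends may differ.

```ruby
require "json"

# Hypothetical nested schema, used only for illustration.
SCHEMA = {
  type: "object",
  properties: {
    name: { type: "string" },
    address: {
      type: "object",
      properties: { city: { type: "string" } },
      required: ["city"]
    }
  },
  required: %w[name address]
}.freeze

MESSAGES = [{ role: "user", content: "Describe a person as JSON." }].freeze

# Body for POST <base>/v1/chat/completions (OpenAI-compatible endpoint):
# the schema travels inside response_format.
def openai_compatible_body(model, messages, schema)
  {
    model: model,
    messages: messages,
    response_format: {
      type: "json_schema",
      json_schema: { name: "response", schema: schema }
    }
  }
end

# Body for POST <base>/api/chat (Ollama native endpoint):
# the schema is passed directly as the format parameter.
def native_body(model, messages, schema)
  { model: model, messages: messages, stream: false, format: schema }
end

puts JSON.pretty_generate(openai_compatible_body("llama3.2:3b", MESSAGES, SCHEMA))
puts JSON.pretty_generate(native_body("llama3.2:3b", MESSAGES, SCHEMA))
```

A future enhancement along the lines of the note would presumably switch between these two shapes depending on whether a schema is present; that routing is not sketched here.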
Pull request overview
This PR fixes the Ollama provider’s default API base so it targets Ollama’s OpenAI-compatible endpoints, which enables OpenAI-inherited behavior (model listing, chat, and structured output) to work correctly against a standard Ollama host URL.
Changes:
- Normalize `ollama_api_base` to always use the `/v1` OpenAI-compatible endpoint (without double-appending and with trailing-slash cleanup).
- Update Ollama provider specs to cover `#api_base` normalization cases and to use a local config double (consistent with other provider specs).
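The normalization described above can be sketched as follows. This is an illustration under assumptions, not the code from lib/ruby_llm/providers/ollama.rb: the helper name `normalize_ollama_api_base` is hypothetical, and the actual `#api_base` implementation may differ.

```ruby
# Hypothetical helper: strips trailing slashes, then appends /v1
# unless the base already ends with it (so the operation is idempotent).
def normalize_ollama_api_base(base)
  trimmed = base.to_s.sub(%r{/+\z}, "")
  trimmed.end_with?("/v1") ? trimmed : "#{trimmed}/v1"
end

[
  "http://localhost:11434",
  "http://localhost:11434/",
  "http://localhost:11434/v1",
  "http://localhost:11434/v1/",
  "https://my-ollama.com:8080"
].each do |input|
  puts "#{input} -> #{normalize_ollama_api_base(input)}"
end
```

Stripping the trailing slash before checking for `/v1` is what lets a single `end_with?` test cover both the `/v1` and `/v1/` inputs.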
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| lib/ruby_llm/providers/ollama.rb | Adjusts #api_base to normalize to the /v1 OpenAI-compatible base URL for Ollama. |
| spec/ruby_llm/providers/ollama_spec.rb | Adds test coverage for #api_base normalization and refactors header tests to use an isolated config double. |
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #685 +/- ##
=======================================
Coverage 87.05% 87.05%
=======================================
Files 119 119
Lines 5594 5595 +1
Branches 1407 1408 +1
=======================================
+ Hits 4870 4871 +1
  Misses 724 724
Problem
The Ollama provider inherits from `OpenAI` but uses the native API base (e.g. `http://localhost:11434`), which doesn't serve OpenAI-compatible endpoints. This causes:
- Model listing fails with 404: the provider requests `GET /models`, but Ollama serves models at `/api/tags` (native) or `/v1/models` (OpenAI-compatible)
- `with_schema` silently doesn't work: since models aren't in the registry, they don't have the `structured_output` capability, so `response_format` is never sent

Steps to reproduce
Solution
Automatically append `/v1` to `ollama_api_base` so the provider uses Ollama's OpenAI-compatible endpoint. Idempotent: does not double-append if `/v1` is already present (as in the existing test configuration).

Since `Ollama < OpenAI`, using the `/v1` endpoint means all OpenAI provider logic (chat, models, schemas) works without any other changes.

What works after this fix
- `RubyLLM.models.refresh!` fetches Ollama models correctly
- `RubyLLM.chat(model: "gemma:latest")` finds the model
- `chat.with_schema(MySchema)` sends `response_format` and returns parsed Hash

Edge cases handled
| Input | Result |
|---|---|
| http://localhost:11434 | http://localhost:11434/v1 |
| http://localhost:11434/ | http://localhost:11434/v1 |
| http://localhost:11434/v1 | http://localhost:11434/v1 (no change) |
| http://localhost:11434/v1/ | http://localhost:11434/v1 |
| https://my-ollama.com:8080 | https://my-ollama.com:8080/v1 |

Note on native API
Ollama's native `/api/chat` endpoint with `format` param provides better nested JSON Schema enforcement than the OpenAI-compatible endpoint. A future enhancement could use the native endpoint for structured output while keeping `/v1` for everything else.

Tests
5 new test cases for `#api_base` added to `ollama_spec.rb`, following existing spec style (`subject`, `let`, `context`, `instance_double`).

Tested locally with Ollama 0.17.5, gemma:latest (9B), llama3.2:3b (3B).
🤖 Generated with Claude Code