
fix: use Ollama OpenAI-compatible endpoint (/v1) by default #685

Open
justi wants to merge 2 commits into crmne:main from justi:fix/ollama-v1-api-base

justi commented Mar 17, 2026

Contributor

Problem

The Ollama provider inherits from OpenAI but uses the native API base (e.g. http://localhost:11434), which does not serve OpenAI-compatible endpoints at its root. This causes two failures:

  1. Model listing fails with 404: the provider requests GET /models, but Ollama serves models at /api/tags (native) or /v1/models (OpenAI-compatible).
  2. with_schema silently does nothing: because the models never reach the registry, they lack the structured_output capability, so response_format is never sent.
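
The second failure can be pictured with a toy model. This is only a sketch of the behaviour described above: `build_chat_payload`, the hash-based registry, and the payload layout are hypothetical names made up for illustration and are not RubyLLM's real internals.

```ruby
# Toy illustration only: all names here are hypothetical, not RubyLLM's API.
# registry maps a model id to its list of capability strings.
def build_chat_payload(registry, model_id, schema)
  payload = { model: model_id, messages: [] }
  capabilities = registry.fetch(model_id, [])

  # The schema is attached only when the registry reports the capability.
  # A model that never made it into the registry silently loses its schema.
  if capabilities.include?('structured_output')
    payload[:response_format] = {
      type: 'json_schema',
      json_schema: { name: 'response', schema: schema }
    }
  end

  payload
end

schema = { type: 'object', properties: { answer: { type: 'string' } } }

# Registry left empty because the model listing request returned 404.
empty_registry = {}
p build_chat_payload(empty_registry, 'gemma:latest', schema).key?(:response_format)
# prints false

# Registry populated after a successful model listing.
filled_registry = { 'gemma:latest' => ['structured_output'] }
p build_chat_payload(filled_registry, 'gemma:latest', schema).key?(:response_format)
# prints true
```

No error is raised in the first case, which is why the problem is easy to miss.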

Steps to reproduce

RubyLLM.configure { |c| c.ollama_api_base = "http://localhost:11434" }
RubyLLM.models.refresh!
# => WARN: Failed to fetch Ollama models (RubyLLM::Error: 404 page not found)

chat = RubyLLM.chat(model: "gemma:latest")
# => RubyLLM::ModelNotFoundError: Unknown model: gemma:latest

Solution

Automatically append /v1 to ollama_api_base so the provider uses Ollama's OpenAI-compatible endpoint. The normalization is idempotent: it does not double-append if /v1 is already present (as in the existing test configuration).

Since the Ollama provider inherits from OpenAI (Ollama < OpenAI), using the /v1 endpoint means all OpenAI provider logic (chat, models, schemas) works without any other changes.

# Before: http://localhost:11434       → GET /models       → 404
# After:  http://localhost:11434/v1    → GET /v1/models    → 200 ✓

What works after this fix

  • RubyLLM.models.refresh! fetches Ollama models correctly
  • RubyLLM.chat(model: "gemma:latest") finds the model
  • chat.with_schema(MySchema) sends response_format and returns a parsed Hash

Edge cases handled

| Input | Result |
| --- | --- |
| http://localhost:11434 | http://localhost:11434/v1 |
| http://localhost:11434/ | http://localhost:11434/v1 |
| http://localhost:11434/v1 | http://localhost:11434/v1 (no change) |
| http://localhost:11434/v1/ | http://localhost:11434/v1 |
| https://my-ollama.com:8080 | https://my-ollama.com:8080/v1 |
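
The normalization described under Solution can be sketched as a small pure function and checked against the rows above. This is a minimal sketch, not the code in the PR: `normalize_ollama_api_base` is a hypothetical helper name, and the actual change lives in the provider's #api_base method.

```ruby
# Hypothetical helper, not the PR's actual implementation: make sure an
# Ollama base URL ends in exactly one "/v1" segment.
def normalize_ollama_api_base(base)
  trimmed = base.to_s.sub(%r{/+\z}, '') # drop any trailing slashes
  trimmed.end_with?('/v1') ? trimmed : "#{trimmed}/v1"
end

# Inputs and expected results taken from the edge-case table.
edge_cases = {
  'http://localhost:11434'     => 'http://localhost:11434/v1',
  'http://localhost:11434/'    => 'http://localhost:11434/v1',
  'http://localhost:11434/v1'  => 'http://localhost:11434/v1',
  'http://localhost:11434/v1/' => 'http://localhost:11434/v1',
  'https://my-ollama.com:8080' => 'https://my-ollama.com:8080/v1'
}

edge_cases.each do |input, expected|
  actual = normalize_ollama_api_base(input)
  raise "#{input}: expected #{expected}, got #{actual}" unless actual == expected
end
```

Applying the function to its own output returns the same string, which is the idempotency property the description relies on.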

Note on native API

Ollama's native /api/chat endpoint with the format parameter provides better nested JSON Schema enforcement than the OpenAI-compatible endpoint. A future enhancement could use the native endpoint for structured output while keeping /v1 for everything else.
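
To make the difference concrete, here is a rough sketch of the two request bodies. The native body passes the JSON Schema directly in the top-level format field, while the OpenAI-compatible body wraps it in response_format. The schema, the method names, and the "person" schema name are invented for this example, and the exact body RubyLLM sends may differ.

```ruby
require 'json'

# Illustrative nested schema, made up for this sketch.
def person_schema
  {
    type: 'object',
    properties: {
      name: { type: 'string' },
      address: {
        type: 'object',
        properties: { city: { type: 'string' } },
        required: ['city']
      }
    },
    required: %w[name address]
  }
end

# Body for the native endpoint: POST {host}/api/chat
# The schema goes straight into the top-level "format" field.
def native_chat_body(model, prompt, schema)
  {
    model: model,
    messages: [{ role: 'user', content: prompt }],
    format: schema,
    stream: false
  }
end

# Body for the OpenAI-compatible endpoint: POST {host}/v1/chat/completions
# The schema is wrapped in "response_format", following the OpenAI API shape.
def openai_compatible_chat_body(model, prompt, schema)
  {
    model: model,
    messages: [{ role: 'user', content: prompt }],
    response_format: {
      type: 'json_schema',
      json_schema: { name: 'person', schema: schema }
    }
  }
end

puts JSON.pretty_generate(native_chat_body('llama3.2:3b', 'Describe a person.', person_schema))
puts JSON.pretty_generate(openai_compatible_chat_body('llama3.2:3b', 'Describe a person.', person_schema))
```

A provider that wanted to route only structured-output requests to the native endpoint would need to translate between these two shapes.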

Tests

Added 5 new test cases for #api_base to ollama_spec.rb, following the existing spec style (subject, let, context, instance_double).

Tested locally with Ollama 0.17.5, gemma:latest (9B), llama3.2:3b (3B).

🤖 Generated with Claude Code

The Ollama provider inherits from OpenAI but used the native API
base (e.g. http://localhost:11434) which doesn't serve endpoints
like /chat/completions or /models. This caused:

- Model listing to fail with 404 (tried /models instead of /api/tags)
- with_schema to silently not work (models not in registry, no
  structured_output capability detected)

Fix: automatically append /v1 to ollama_api_base so the provider
uses Ollama's OpenAI-compatible endpoint. Idempotent — does not
double-append if /v1 is already present. This gives:

- Model listing works (/v1/models)
- Chat works (/v1/chat/completions)
- with_schema works (response_format passed correctly)
- All OpenAI provider logic inherited without changes

Tested with Ollama 0.17.5, gemma:latest (9B), llama3.2:3b (3B).

Note: Ollama's native /api/chat endpoint with `format` param
provides better nested JSON Schema enforcement than the OpenAI-
compatible endpoint. A future enhancement could use the native
endpoint for structured output.
Copilot AI review requested due to automatic review settings March 17, 2026 23:33

Copilot AI left a comment

Pull request overview

This PR fixes the Ollama provider’s default API base so it targets Ollama’s OpenAI-compatible endpoints, which enables OpenAI-inherited behavior (model listing, chat, and structured output) to work correctly against a standard Ollama host URL.

Changes:

  • Normalize ollama_api_base to always use the /v1 OpenAI-compatible endpoint (without double-appending and with trailing-slash cleanup).
  • Update Ollama provider specs to cover #api_base normalization cases and to use a local config double (consistent with other provider specs).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| lib/ruby_llm/providers/ollama.rb | Adjusts #api_base to normalize to the /v1 OpenAI-compatible base URL for Ollama. |
| spec/ruby_llm/providers/ollama_spec.rb | Adds test coverage for #api_base normalization and refactors header tests to use an isolated config double. |

codecov Bot commented May 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.05%. Comparing base (4942d6c) to head (d5fa8ec).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #685   +/-   ##
=======================================
  Coverage   87.05%   87.05%           
=======================================
  Files         119      119           
  Lines        5594     5595    +1     
  Branches     1407     1408    +1     
=======================================
+ Hits         4870     4871    +1     
  Misses        724      724           
