
Z-M-Huang edited this page May 1, 2026 · 7 revisions

OpenAI-Compatible

contractVersion: 1.1.0

Verified against: @ai-sdk/openai@3.0.55 (verification date 2026-05-01). Re-verified on every dependency bump to this package.

The bundled OpenAI-compatible provider is the reference extension for any backend that speaks an OpenAI-schema API. It supports OpenAI direct, self-hosted gateways, corporate proxies, and local runtimes through one baseURL configuration value.

protocol: openai-compatible


Configuration

| Field | Required | Type | Meaning |
|---|---|---|---|
| protocol | yes | "openai-compatible" | Selects this adapter. |
| apiKeyRef | yes | reference ({ kind: "env" \| "keyring", name }) | Credential reference. Resolved at request time; never persisted in the manifest. |
| baseURL | yes | absolute http/https URL | Forwarded without mutation. |
| models | yes | string[] | The model names this entry serves. Each is selectable via /model while this provider is active. |
| apiShape | no | "chat-completions" (default) \| "responses" | Selects the wire contract. |
| timeoutMs | no | integer ms | Request timeout budget. |
| defaultParams | no | object | Provider-specific request defaults. |

Required fields:

  • protocol is the literal string "openai-compatible". Core uses this — not the settings.json.providers map key — to look up the adapter.
  • apiKeyRef stores only a credential reference. The provider resolves it at request time.
  • baseURL is an absolute http or https URL and is forwarded without mutation.
  • models lists every model name the backend at baseURL serves. The provider entry exposes the union of these via /model; the active model defaults to models[0] unless active.model overrides.

Optional fields:

  • apiShape selects the wire contract. It defaults to chat-completions.
  • timeoutMs sets the request timeout budget.
  • defaultParams supplies provider-specific request defaults.

Example — the entry lives at settings.json.providers.<id>. The user picks the id (typically the backend's name):

{
  "providers": {
    "openai-prod": {
      "protocol": "openai-compatible",
      "apiKeyRef": { "kind": "env", "name": "OPENAI_API_KEY" },
      "baseURL": "https://api.openai.com/v1",
      "models": ["gpt-4o", "gpt-4o-mini"],
      "apiShape": "chat-completions"
    }
  },
  "active": { "provider": "openai-prod", "model": "gpt-4o" }
}
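Under the assumption that manifests are validated in TypeScript, the field table above can be sketched as a type plus a minimal structural check. The type and function names are illustrative, not part of the contract:

```typescript
// Hypothetical type for one settings.json.providers entry, mirroring the
// Configuration table above. Only the field names come from this page.
type ApiKeyRef = { kind: "env" | "keyring"; name: string };

interface OpenAICompatibleEntry {
  protocol: "openai-compatible";               // literal; selects this adapter
  apiKeyRef: ApiKeyRef;                        // resolved at request time, never persisted
  baseURL: string;                             // absolute http/https URL, forwarded without mutation
  models: string[];                            // selectable via /model; models[0] is the default
  apiShape?: "chat-completions" | "responses"; // defaults to "chat-completions"
  timeoutMs?: number;                          // request timeout budget in ms
  defaultParams?: Record<string, unknown>;     // provider-specific request defaults
}

// Minimal structural check over the required fields.
function isOpenAICompatibleEntry(v: any): v is OpenAICompatibleEntry {
  return (
    v?.protocol === "openai-compatible" &&
    (v?.apiKeyRef?.kind === "env" || v?.apiKeyRef?.kind === "keyring") &&
    typeof v?.apiKeyRef?.name === "string" &&
    typeof v?.baseURL === "string" &&
    /^https?:\/\//.test(v.baseURL) &&
    Array.isArray(v?.models) && v.models.length > 0
  );
}
```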

Multiple openai-compatible backends

Two entries may share protocol: "openai-compatible" and remain independently selectable. Use this when you want one config to expose, say, an OpenAI cloud endpoint and a local OpenAI-compatible runtime:

{
  "providers": {
    "openai-prod": {
      "protocol": "openai-compatible",
      "apiKeyRef": { "kind": "env", "name": "OPENAI_API_KEY" },
      "baseURL": "https://api.openai.com/v1",
      "models": ["gpt-4o", "gpt-4o-mini"]
    },
    "bailian": {
      "protocol": "openai-compatible",
      "apiKeyRef": { "kind": "keyring", "name": "bailian-api-key" },
      "baseURL": "http://192.168.1.253:8317/v1",
      "models": ["qwen3.6-plus", "glm-5"]
    }
  },
  "active": { "provider": "bailian", "model": "qwen3.6-plus" }
}

Each entry registers (providerId, modelId) pairs in the model registry. /provider bailian and /provider openai-prod swap the active provider; /model qwen3.6-plus and /model gpt-4o swap within the active provider's models[]. See Cardinality and Activation — Providers are loaded unlimited and active unlimited.
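The (providerId, modelId) registration described above can be sketched as a pure expansion over the providers map. The shape is hypothetical; this page does not specify the real registry API:

```typescript
// Illustrative: expand a providers map into the (providerId, modelId) pairs
// the model registry holds. /provider swaps the first element; /model swaps
// the second within the active provider's models[].
type ProviderMap = Record<string, { models: string[] }>;

function registryPairs(providers: ProviderMap): Array<[string, string]> {
  const pairs: Array<[string, string]> = [];
  for (const [providerId, entry] of Object.entries(providers)) {
    for (const modelId of entry.models) pairs.push([providerId, modelId]);
  }
  return pairs;
}
```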


API shapes

| apiShape | Endpoint family | Use when |
|---|---|---|
| chat-completions | Chat Completions compatible stream shape | The backend exposes /chat/completions-style chunks. |
| responses | Responses compatible stream shape | The backend exposes the newer Responses stream shape. |

Both shapes normalize through the shared protocol adapter and emit the same internal StreamEvent union. See Protocol Adapters for the wire-shape mapping rows.


OpenAI-compatible wire-shape specialization

A single baseURL configuration value covers any backend that speaks one of the two OpenAI schemas. The apiShape toggle selects which wire contract the adapter assembles and parses.

chat-completions shape

The Chat Completions stream emits choices[].delta chunks. The adapter handles three delta classes:

| Delta class | Adapter behavior |
|---|---|
| choices[].delta.content | Emit text-delta. |
| choices[].delta.tool_calls[] | Buffer tool-call-delta keyed by index; emit assembled tool-call once the name is complete and arguments parses as JSON. |
| choices[].finish_reason | Emit exactly one finish event with the normalized reason. |

The Chat Completions shape is the historical default and the most widely supported across OSS proxies, corporate gateways, and local runtimes.
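The tool_calls[] buffering rule can be sketched as follows. The class and event shapes are illustrative, and "name is complete" is approximated here by a successful JSON parse of the accumulated arguments:

```typescript
// Sketch: buffer tool-call deltas keyed by index; emit the assembled call
// once the accumulated arguments string parses as JSON.
type ToolCallDelta = {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
};
type ToolCallEvent = { type: "tool-call"; id: string; name: string; args: unknown };

class ToolCallBuffer {
  private buf = new Map<number, { id: string; name: string; args: string }>();

  push(d: ToolCallDelta): ToolCallEvent | null {
    const slot = this.buf.get(d.index) ?? { id: "", name: "", args: "" };
    if (d.id) slot.id = d.id;
    if (d.function?.name) slot.name += d.function.name;
    if (d.function?.arguments) slot.args += d.function.arguments;
    this.buf.set(d.index, slot);
    if (slot.name && slot.args) {
      try {
        const args = JSON.parse(slot.args); // emit only once arguments parse
        this.buf.delete(d.index);
        return { type: "tool-call", id: slot.id, name: slot.name, args };
      } catch {
        // Arguments still streaming; keep buffering.
      }
    }
    return null;
  }
}
```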

responses shape

The Responses stream emits typed events on a single event channel. The adapter handles four event classes:

| Event class | Adapter behavior |
|---|---|
| response.output_text.delta | Emit text-delta. |
| response.function_call_arguments.delta | Buffer tool-call-delta; emit assembled tool-call once arguments parse. |
| response.reasoning_text.delta | When passReasoningToLoop: true, emit reasoning; otherwise drop. |
| response.completed | Emit finish with the normalized reason. |

The Responses shape is the newer schema; backends that opt in expose richer event taxonomy (reasoning, structured output, citations) on the same channel.
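The event dispatch above can be sketched as a switch over the typed event channel. Event type strings come from the table; the handler and event shapes are illustrative, and function-call argument deltas are assumed to go through the same buffering as in the chat-completions shape, so they are elided here:

```typescript
// Sketch of the responses-shape dispatch. Returns null for events that are
// buffered elsewhere or dropped.
type StreamEvent =
  | { type: "text-delta"; text: string }
  | { type: "reasoning"; text: string }
  | { type: "finish"; reason: string };

function dispatchResponsesEvent(
  ev: { type: string; delta?: string; response?: { status?: string } },
  opts: { passReasoningToLoop: boolean },
): StreamEvent | null {
  switch (ev.type) {
    case "response.output_text.delta":
      return { type: "text-delta", text: ev.delta ?? "" };
    case "response.reasoning_text.delta":
      // Forward reasoning only when the loop opted in; otherwise drop.
      return opts.passReasoningToLoop ? { type: "reasoning", text: ev.delta ?? "" } : null;
    case "response.completed":
      return { type: "finish", reason: ev.response?.status ?? "stop" };
    default:
      return null; // e.g. function_call_arguments deltas, buffered elsewhere
  }
}
```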

baseURL routing

baseURL is forwarded without mutation to the underlying ai-sdk client. The adapter does not normalize trailing slashes, append paths, or rewrite the URL. The configuration is the URL the backend serves; if the backend wants /v1/chat/completions, the user provides a baseURL that resolves correctly under the chosen apiShape.

Capability posture

The OpenAI-compatible reference provider declares streaming and toolCalling as hard; structuredOutput as preferred; everything else (multimodal, reasoning, contextWindow, promptCaching) as probed. See Model Capabilities for the seven-vector definition.

Per-vendor notes: the probed defaults exist because OpenAI-compatible backends vary widely. Capability Negotiation verifies the actual posture at session start before any dependent flow relies on it.


Native fields {#native-fields}

The adapter accepts the AI SDK's providerOptions.openai shape verbatim per @ai-sdk/openai@3.0.55. The native field set differs slightly between Responses and Chat Completions API shapes; both surfaces are covered below. Wire snake_case (e.g., reasoning_effort, service_tier, safety_identifier) is not accepted in defaultParams; only the AI SDK camelCase form is canonical. Wire names appear here as translation notes only.
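A minimal sketch of the snake_case rejection rule, assuming defaultParams arrives as a plain object (the function name is illustrative):

```typescript
// Wire snake_case is a translation note only; any underscore-bearing key in
// defaultParams is flagged so the caller can reject it with a diagnostic.
function rejectSnakeCaseKeys(params: Record<string, unknown>): string[] {
  return Object.keys(params).filter((k) => k.includes("_"));
}
```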

Responses API (apiShape: "responses")

| Field | Type / Values | Notes |
|---|---|---|
| reasoningEffort | "none" \| "minimal" \| "low" \| "medium" \| "high" \| "xhigh" | Default "medium". Per-model gating per OpenAI's API reference: o-series on Chat Completions accepts minimal \| low \| medium \| high; GPT-5.1 has no minimal; GPT-5.1-Codex-Max accepts none \| medium \| high \| xhigh; xhigh applies to models after gpt-5.1-codex-max. Wire name: reasoning_effort. |
| reasoningSummary | "auto" \| "detailed" | Default undefined. When set, reasoning summaries appear in the stream as reasoning events and in non-streaming responses under the reasoning field. |
| forceReasoning | bool | Forces a reasoning pass even on models where it would otherwise be skipped. |
| textVerbosity | "low" \| "medium" \| "high" | Default "medium". |
| serviceTier | "auto" \| "flex" \| "priority" \| "default" | Default "auto". flex available on o3 / o4-mini / gpt-5; priority is Enterprise-gated. Wire name: service_tier. |
| safetyIdentifier | string | User-provided safety/abuse identifier. New in latest stable. Wire name: safety_identifier. |
| systemMessageMode | "system" \| "developer" \| "remove" | Renders the assembled system layer as a system message, a developer message, or omits it entirely. When the assembled system layer is load-bearing (SM-stage bodies or system-message Context Provider contributions), "remove" is rejected; when allowed, it emits a loud diagnostic. See Provider Params § Reserved options. |
| parallelToolCalls | bool | Default true. |
| store | bool | Default true. |
| maxToolCalls | integer | Cap on built-in tool-call invocations per response. |
| metadata | Record<string, string> | Free metadata stored with the generation. |
| conversation | string | OpenAI Conversation id to continue. Mutually exclusive with previousResponseId. |
| previousResponseId | string | Continuation handle for the prior response. |
| user | string | End-user identifier (legacy alias of safetyIdentifier). |
| logprobs | bool \| number | Return token logprobs (or top-N where N is the number). |
| truncation | "auto" \| "disabled" | Truncation strategy when input exceeds the context window. Default "disabled" (request fails on overflow). |
| strictJsonSchema | bool | Default true. Strict structured-output schema enforcement. |
| include | string[] | Additional content to include in the response (["file_search_call.results"], ["message.output_text.logprobs"]). |
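For illustration, a defaultParams fragment for the responses shape using fields from the table above might look like this. The values are arbitrary examples, not recommendations; see Provider Params for the merge semantics:

```json
{
  "defaultParams": {
    "reasoningEffort": "high",
    "reasoningSummary": "auto",
    "serviceTier": "auto",
    "parallelToolCalls": true,
    "truncation": "auto"
  }
}
```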

Chat Completions API (apiShape: "chat-completions", default)

The Chat Completions surface accepts most of the Responses fields above (where applicable) plus provider-specific sampling controls that live here in the adapter-native bucket (not the common bucket):

| Field | Type / Values | Notes |
|---|---|---|
| presencePenalty | number | Sampling penalty. |
| frequencyPenalty | number | Sampling penalty. |
| logitBias | Record<string, number> | Token-id biases. |
| maxCompletionTokens | integer | Cap on completion tokens (Chat-Completions-specific; the common bucket's maxOutputTokens maps here for o-series and standard models). |

reasoningEffort on Chat Completions accepts minimal | low | medium | high only — narrower than Responses.

Reserved (adapter-managed; defaultParams MAY NOT set)

The full defaultParams shape (zone split, validation, merge layers) is pinned by Provider Params. The reservations below are the OpenAI-specific carve-outs:

  • promptCacheKey, promptCacheRetention — cache identity is owned by Prompt Caching. The adapter forwards prompt_cache_key derived from the session id by default. The AI SDK exposes both as provider options; the wiki spec carves them out so cache identity routing remains an adapter concern in v1. promptCacheRetention: 'in_memory' | '24h' ('24h' on 5.1 series only) becomes user-configurable in a future contract revision.
  • instructions — Reserved. Setting instructions would create a second system-prompt surface that bypasses Context Assembly. The single assembled system layer is the canonical source. Future contract revision may define a constrained interaction.
  • prediction — Speculative-decoding hint. Reserved in v1 pending a dedicated decoding-hints contract.
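The carve-outs above can be sketched as a reserved-key check run over defaultParams before merging (the constant and function names are illustrative):

```typescript
// Reserved, adapter-managed keys: naming any of them in defaultParams is a
// validation error per the carve-outs documented above.
const RESERVED_OPENAI_KEYS = [
  "promptCacheKey",
  "promptCacheRetention",
  "instructions",
  "prediction",
] as const;

function findReservedKeys(params: Record<string, unknown>): string[] {
  return RESERVED_OPENAI_KEYS.filter((k) => k in params);
}
```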

System-prompt rendering and prefix caching

The adapter renders the Protocol Adapters system argument differently per apiShape:

| apiShape | Wire rendering of system |
|---|---|
| chat-completions | A synthetic { role: "system", content } message is prepended to the messages array. |
| responses | The top-level instructions field carries the merged system content. |

Prefix caching on supported backends is automatic — there is no in-band cache marker to place. The adapter's job is to keep the static prefix byte-stable across turns: canonical JSON ordering, deterministic field order, no incidental whitespace drift. A stable prefix is the precondition for the backend's automatic prefix cache to hit.
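Byte-stable serialization of the static prefix can be sketched with a recursive key-sorting serializer. This is illustrative, not the adapter's actual serializer:

```typescript
// Canonical JSON: sort object keys recursively and emit no incidental
// whitespace, so the same logical prefix serializes to the same bytes on
// every turn -- the precondition for automatic prefix-cache hits.
function canonicalJson(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map(canonicalJson).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const keys = Object.keys(value as object).sort();
    return (
      "{" +
      keys.map((k) => JSON.stringify(k) + ":" + canonicalJson((value as any)[k])).join(",") +
      "}"
    );
  }
  return JSON.stringify(value);
}
```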

Optional knobs the adapter forwards when the deployment exposes them:

  • prompt_cache_key — a routing identifier the backend uses to keep cache entries attached to the same logical session across replicas. The adapter derives this from the session id by default; deployments that pin the cache to a different identity can override.
  • prompt_cache_retention — a retention hint where supported.

Cache observability lands on usage.prompt_tokens_details.cached_tokens in the stream's terminal finish usage payload, normalized into the shared usage bag described in Protocol Adapters.
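That normalization can be sketched as follows, assuming the wire usage payload carries the field names shown above (the UsageBag shape is illustrative; the real shared bag is defined in Protocol Adapters):

```typescript
// Normalize the terminal usage payload into a shared usage bag, lifting the
// cached-token count out of prompt_tokens_details.
interface UsageBag {
  inputTokens: number;
  outputTokens: number;
  cachedInputTokens: number;
}

function normalizeUsage(wire: {
  prompt_tokens?: number;
  completion_tokens?: number;
  prompt_tokens_details?: { cached_tokens?: number };
}): UsageBag {
  return {
    inputTokens: wire.prompt_tokens ?? 0,
    outputTokens: wire.completion_tokens ?? 0,
    cachedInputTokens: wire.prompt_tokens_details?.cached_tokens ?? 0,
  };
}
```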

See Prompt Caching for the cross-provider strategy.


Auth and secrets hygiene

The provider never stores plaintext keys in manifests and never resolves apiKeyRef during construction. Request execution resolves the reference through the host secrets surface when present, falling back to host.env.get for environment references.

401 responses and credential-resolution failures surface as ProviderTransient / Unauthorized without including the resolved credential in errors or logs.
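The resolution order can be sketched as below, assuming a host surface with an optional secrets getter and the host.env.get fallback named above. The Host interface and function name are illustrative:

```typescript
// Request-time credential resolution: keyring references go through the host
// secrets surface when present; env references fall back to host.env.get.
// The error message never contains a resolved credential.
interface Host {
  secrets?: { get(name: string): Promise<string | undefined> };
  env: { get(name: string): string | undefined };
}

async function resolveApiKeyRef(
  host: Host,
  ref: { kind: "env" | "keyring"; name: string },
): Promise<string> {
  const value =
    ref.kind === "keyring" ? await host.secrets?.get(ref.name) : host.env.get(ref.name);
  if (!value) {
    throw new Error(`credential reference ${ref.kind}:${ref.name} did not resolve`);
  }
  return value;
}
```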


Error translation

| Wire condition | Core class + code |
|---|---|
| 429 / rate limit | ProviderTransient / RateLimited |
| 5xx | ProviderTransient / Provider5xx |
| 401 | ProviderTransient / Unauthorized |
| Network timeout | ProviderTransient / NetworkTimeout |
| Tool capability mismatch | ProviderCapability / MissingToolCalling |

The adapter honors signal and has no filesystem side effects.
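The translation table can be sketched as a mapping function. The condition and result shapes are illustrative; only the class and code strings come from the table:

```typescript
// Map a wire condition to the core error taxonomy. Returns null when the
// condition is not one the adapter translates.
type CoreError = { class: "ProviderTransient" | "ProviderCapability"; code: string };

function translateWireError(cond: {
  status?: number;
  timedOut?: boolean;
  missingToolCalling?: boolean;
}): CoreError | null {
  if (cond.missingToolCalling) return { class: "ProviderCapability", code: "MissingToolCalling" };
  if (cond.timedOut) return { class: "ProviderTransient", code: "NetworkTimeout" };
  if (cond.status === 429) return { class: "ProviderTransient", code: "RateLimited" };
  if (cond.status === 401) return { class: "ProviderTransient", code: "Unauthorized" };
  if (cond.status !== undefined && cond.status >= 500) {
    return { class: "ProviderTransient", code: "Provider5xx" };
  }
  return null;
}
```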


Related pages


Changelog

1.0.0 — initial

  • OpenAI-compatible reference provider: apiShape defaults to chat-completions; standard provider error taxonomy; apiKeyRef resolved at request time.
  • OpenAI-compatible wire-shape specialization on top of ai-sdk v6: both chat-completions and responses stream shapes, selected by the apiShape toggle. baseURL routing semantics: the URL is forwarded without mutation.
  • System-prompt rendering: chat-completions prepends a synthetic system-role message; responses uses top-level instructions. Automatic prefix caching benefits from byte-stable prefix serialization; optional prompt_cache_key / prompt_cache_retention forwarded where supported. Cache hits reported via usage.prompt_tokens_details.cached_tokens. See Prompt Caching.

1.1.0 — defaultParams shape; native fields enumerated; reserved keys

  • New "Native fields" section enumerates the AI SDK 3.0.55 providerOptions.openai surface verbatim for both API shapes. Responses includes new safetyIdentifier, systemMessageMode, forceReasoning fields; reasoningEffort carries the full none | minimal | low | medium | high | xhigh enum with per-model gating per OpenAI's API reference (GPT-5.1 has no minimal; GPT-5.1-Codex-Max accepts none | medium | high | xhigh; xhigh for "models after gpt-5.1-codex-max"). Chat Completions section calls out adapter-native sampling fields (presencePenalty, frequencyPenalty, logitBias, maxCompletionTokens).
  • Reserved (adapter-managed) subsection: promptCacheKey and promptCacheRetention (cache identity owned by Prompt Caching; rationale documented); instructions (would create a second system-prompt surface bypassing Context Assembly); prediction (speculative-decoding hint deferred to future contract).
  • systemMessageMode: "remove" documented as rejected when the assembled system layer is load-bearing (SM stage body, system-message Context Provider contributions). Loud diagnostic when allowed; mutually exclusive with the assembled system layer in the load-bearing case.
  • Wire snake_case (reasoning_effort, service_tier, safety_identifier, etc.) documented as translation notes only; canonical accepted shape in defaultParams is the AI SDK camelCase form.
  • No removal of pre-existing prose; all changes are additive on top of 1.0.0.
