Summary
Assessment of how OVO — a local, macOS/Apple-Silicon desktop LLM runtime — could be consumed from ail pipelines. Three integration paths are possible; one works today with zero code.
This issue is a scoping document, not an implementation plan. It exists so we can decide which path to invest in before writing code.
What OVO is
OVO is a desktop application (Tauri + FastAPI sidecar, MLX inference). It is not a CLI, not a stdin/stdout subprocess, and not a library — it is a long-running service that exposes three HTTP APIs on localhost:
| Port | Flavor | Key endpoints |
| --- | --- | --- |
| 11435 | Ollama-compatible | POST /api/chat, POST /api/generate, POST /api/pull, GET /api/tags |
| 11436 | OpenAI-compatible | POST /v1/chat/completions (SSE via stream=true), POST /v1/completions, GET /v1/models |
| 11437 | OVO-native | GET /ovo/models, POST /ovo/models/download, GET /ovo/settings, GET /ovo/claude/context, GET /ovo/audit |
Platform: macOS 13+ on Apple Silicon only (no Intel, no Linux, no Windows). No documented auth scheme (localhost-only). Tool/function-calling support: not documented; needs empirical verification.
What ail already provides
- Runner trait (ail-core/src/runner/mod.rs:280) with invoke() and invoke_streaming().
- Built-in http / ollama runner (ail-core/src/runner/http.rs) that already speaks OpenAI-compatible /chat/completions. Configured via AIL_HTTP_BASE_URL, AIL_HTTP_TOKEN, AIL_HTTP_MODEL, AIL_HTTP_THINK. Does not stream, has no tool support, keeps session history in-process with full replay each turn (O(N²) tokens).
- Runtime plugin protocol (spec/runner/r10-plugin-protocol.md, spec/runner/r11-plugin-discovery.md) — JSON-RPC 2.0 / NDJSON over stdin/stdout, manifests at ~/.ail/runners/*.yaml, handshake + invoke + streaming notifications (stream/delta, stream/thinking, stream/tool_use, stream/tool_result, stream/cost_update, stream/permission_request). Fully implemented in ail-core/src/runner/plugin/.
- Built-in name-collision guard reserves claude, http, ollama, stub. The name ovo is free.
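The plugin protocol's wire format described above (JSON-RPC 2.0 objects, one per newline-terminated NDJSON line) can be sketched as follows. The exact field shapes of the initialize request and stream/delta params here are illustrative assumptions, not copied from the r10 spec:

```python
import json

def encode_frame(msg: dict) -> bytes:
    # One JSON-RPC 2.0 object per line, newline-terminated (NDJSON).
    return (json.dumps(msg, separators=(",", ":")) + "\n").encode("utf-8")

def decode_frames(buf: bytes):
    # Split a byte stream back into parsed JSON-RPC messages.
    for line in buf.split(b"\n"):
        if line.strip():
            yield json.loads(line)

# A handshake request and a streaming notification; per JSON-RPC 2.0,
# notifications carry no "id" and expect no response.
initialize = {"jsonrpc": "2.0", "id": 1, "method": "initialize"}
delta = {"jsonrpc": "2.0", "method": "stream/delta",
         "params": {"text": "hel"}}

wire = encode_frame(initialize) + encode_frame(delta)
msgs = list(decode_frames(wire))
```

The newline framing is what lets the host read messages incrementally from the plugin's stdout without a length prefix.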
Path A — Zero-code: point the existing http runner at OVO
What it looks like:
```sh
export AIL_HTTP_BASE_URL="http://localhost:11436/v1"
export AIL_HTTP_MODEL="<ovo-model-name>"
ail --once "hello" --pipeline demo.ail.yaml
```

```yaml
# demo.ail.yaml
pipeline:
  - id: think
    prompt: "{{ step.invocation.prompt }}"
    runner: ollama  # alias for http
```
Pros: works today, no code, no release.
Cons: no SSE streaming (buffered full response), no tool-calling path, full-history replay on each resume, no access to OVO's native 11437 endpoints (model list, audit, settings).
Effort: docs + a demo/ovo.ail.yaml fixture only.
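The full-history replay cost noted in the cons above can be made concrete with a back-of-envelope calculation. The 100-tokens-per-message figure is an arbitrary illustration:

```python
# Illustrative cost of full-history replay: each turn resends the whole
# transcript, so cumulative input tokens grow quadratically in turn count.
def replayed_tokens(turns: int, tokens_per_message: int = 100) -> int:
    # Turn k replays its k prior messages plus the new one.
    return sum((k + 1) * tokens_per_message for k in range(turns))
```

At 10 turns of 100-token messages this is 5,500 input tokens instead of the 1,000 a server-side session would consume, which is why Path C's in-plugin session state matters for long conversations.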
Path B — Native Rust runner (ovo built-in inside ail-core)
What it looks like: new module ail-core/src/runner/ovo.rs, register in ail-core/src/runner/factory.rs, reserve the name alongside claude/http/ollama/stub.
Pros: direct, in-process, can call the native 11437 endpoints, can add SSE streaming and tool bridging, no subprocess overhead.
Cons: couples ail-core to a specific vendor's product, duplicates most of HttpRunner, burns a built-in name, adds an HTTP client dependency path we'd maintain forever. Contradicts the "Runner trait is the seam" principle in ARCHITECTURE.md — built-ins should be generic (HTTP, Claude CLI, Codex CLI), not per-vendor.
Recommendation: don't do this. Not appropriate for a third-party desktop app.
Path C — External plugin (~/.ail/runners/ovo.yaml)
What it looks like:
```yaml
# ~/.ail/runners/ovo.yaml
name: ovo
version: "0.1.0"
executable: /usr/local/bin/ovo-ail-plugin
protocol_version: "1"
env:
  OVO_BASE_URL: http://localhost:11436
```

```yaml
# pipeline
pipeline:
  - id: think
    runner: ovo
    prompt: "{{ step.invocation.prompt }}"
```
The plugin is a standalone executable (Python, Go, Rust — any language) that:
- Reads JSON-RPC 2.0 on stdin, writes on stdout (NDJSON, \n-terminated).
- On initialize, declares { name: "ovo", version, protocol_version: "1", capabilities: { streaming: true, session_resume: true, tool_events: <tbd>, permission_requests: false } }.
- On invoke, calls OVO's POST /v1/chat/completions with stream=true, translates SSE deltas to stream/delta notifications, and closes with a final invoke response populated with response, session_id, input_tokens, output_tokens, model.
- Holds conversation history in-plugin keyed by session_id so resume_session_id is cheap (no replay from ail).
- Optionally calls GET /ovo/models on initialize to advertise available models, and reports /ovo/audit events through stream/cost_update or log lines.
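The SSE-to-notification translation in the invoke step above can be sketched as a pure function. This assumes OVO follows the standard OpenAI streaming framing (data: {json} chunks with a data: [DONE] sentinel), which is exactly the kind of detail the empirical probe needs to confirm:

```python
import json

def sse_to_deltas(sse_body: str):
    """Translate OpenAI-style SSE chunks into stream/delta notifications.

    Assumes 'data: {json}' framing with a terminal 'data: [DONE]'
    sentinel; OVO's exact framing is unverified.
    """
    for line in sse_body.splitlines():
        if not line.startswith("data: "):
            continue  # skip comments, blank keep-alive lines
        data = line[len("data: "):]
        if data == "[DONE]":
            return
        chunk = json.loads(data)
        text = chunk["choices"][0]["delta"].get("content")
        if text:
            yield {"jsonrpc": "2.0", "method": "stream/delta",
                   "params": {"text": text}}

# Canned two-chunk stream, as /v1/chat/completions would emit it.
sse = (
    'data: {"choices":[{"delta":{"content":"hel"}}]}\n'
    'data: {"choices":[{"delta":{"content":"lo"}}]}\n'
    'data: [DONE]\n'
)
notes = list(sse_to_deltas(sse))
```

In the real plugin each yielded notification would be written to stdout as one NDJSON line before the final invoke response is sent.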
Pros: zero changes to ail-core, isolates OVO's API churn to one executable, unlocks SSE streaming + cheap session resume + future tool-event bridging, ships and versions independently, written in whatever language is most convenient.
Cons: subprocess spawn per invocation, plugin author owns retry/timeout/error-mapping, need to distribute a binary (or ask users to pip install a Python entry point).
Effort: ~1–2 days for a minimum-viable plugin + manifest + demo/ovo.ail.yaml + a short README. Spec is already published at spec/runner/r10-plugin-protocol.md.
Gaps / missing features to flag
Not blockers, but worth tracking:
- ail's built-in HttpRunner does not consume SSE. Even when the backend streams, ail buffers. A plugin can translate SSE to stream/delta notifications; a future enhancement to HttpRunner could do the same generically (#TBD — worth a separate issue).
- ail's HttpRunner has no tool/function-calling path. If OVO's /v1/chat/completions honors OpenAI tools, ail cannot currently route tool_call / tool_result pairs through the generic runner. A plugin can emit stream/tool_use / stream/tool_result notifications, so Path C is the unlock here.
- Session resume in HttpRunner is full-replay. A plugin can hold server-side state.
- OVO's tool-calling capability is undocumented. Needs verification (run a probe against /v1/chat/completions with tools in the payload). The outcome determines whether a plugin should bother with the tool-event surface.
- OVO's auth model is undocumented. Assumed localhost-only. Any non-local deployment story (SSH tunnel, remote Mac) needs separate design.
- OVO is macOS-only. Linux / Windows ail users cannot run OVO locally; documentation should be explicit about this.
- Plugin protocol capabilities.permission_requests: true is not useful for OVO (no tools with side effects). Fine — the plugin can report false.
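The tool-calling probe flagged above can be as simple as one request. Everything here is a throwaway illustration (the model placeholder, the get_weather tool); the only thing under test is whether OVO accepts the standard OpenAI tools and tool_choice fields:

```python
import json

# Minimal probe payload to check whether OVO's /v1/chat/completions
# honors OpenAI-style function calling.
probe = {
    "model": "<ovo-model-name>",
    "messages": [{"role": "user", "content": "What is the weather in Oslo?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    "tool_choice": "auto",
}
# POST this body to http://localhost:11436/v1/chat/completions and check
# whether the reply carries choices[0].message.tool_calls; if the field
# is absent or the request is rejected, the plugin should report
# capabilities.tool_events: false and skip the tool-event surface.
body = json.dumps(probe)
```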
Recommendation
Do both A and C, skip B.
- Immediately: document Path A in README.md and add demo/ovo.ail.yaml so users can use OVO via the existing http/ollama runner today. Effort: hours.
- When there is demand (or as a reference plugin): build Path C as a separate repository (e.g. ail-runner-ovo or a folder under demo/plugins/). This doubles as the first real-world validation of the plugin protocol beyond the in-tree tests, and is the right home for OVO-specific behavior. Effort: 1–2 days.
- Explicitly reject Path B. A vendor-specific built-in is the wrong layer; the plugin seam exists precisely to avoid this.
A formal implementation plan for Path C can follow once this direction is agreed.
Questions before committing to a plan
- Is OVO's /v1/chat/completions faithful enough to OpenAI's spec that a plugin can assume tools, tool_choice, and stream all work? (Needs an empirical probe.)
- Is there appetite to enhance HttpRunner with SSE streaming generically, independent of OVO? That would let Path A cover more ground and reduce the need for a plugin.
- Should the OVO plugin live in this repo under demo/plugins/ovo/ (as a reference) or in a separate repo?