OVO integration assessment: paths forward for ail ↔ ovoment/ovo-local-llm #159

@AlexChesser


Summary

Assessment of how OVO — a local, macOS/Apple-Silicon desktop LLM runtime — could be consumed from ail pipelines. Three integration paths are possible; one works today with zero code.

This issue is a scoping document, not an implementation plan. It exists so we can decide which path to invest in before writing code.

What OVO is

OVO is a desktop application (Tauri + FastAPI sidecar, MLX inference). It is not a CLI, not a stdin/stdout subprocess, and not a library — it is a long-running service that exposes three HTTP APIs on localhost:

| Port  | Flavor            | Key endpoints |
|-------|-------------------|---------------|
| 11435 | Ollama-compatible | `POST /api/chat`, `POST /api/generate`, `POST /api/pull`, `GET /api/tags` |
| 11436 | OpenAI-compatible | `POST /v1/chat/completions` (SSE via `stream=true`), `POST /v1/completions`, `GET /v1/models` |
| 11437 | OVO-native        | `GET /ovo/models`, `POST /ovo/models/download`, `GET /ovo/settings`, `GET /ovo/claude/context`, `GET /ovo/audit` |

Platform: macOS 13+ on Apple Silicon only (no Intel, no Linux, no Windows). No documented auth scheme (localhost-only). Tool/function-calling support: not documented; needs empirical verification.

What ail already provides

  • Runner trait (ail-core/src/runner/mod.rs:280) with invoke() and invoke_streaming().
  • Built-in http / ollama runner (ail-core/src/runner/http.rs) that already speaks OpenAI-compatible /chat/completions. Configured via AIL_HTTP_BASE_URL, AIL_HTTP_TOKEN, AIL_HTTP_MODEL, AIL_HTTP_THINK. Does not stream, has no tool support, keeps session history in-process with full replay each turn (O(N²) tokens).
  • Runtime plugin protocol (spec/runner/r10-plugin-protocol.md, spec/runner/r11-plugin-discovery.md) — JSON-RPC 2.0 / NDJSON over stdin-stdout, manifests at ~/.ail/runners/*.yaml, handshake + invoke + streaming notifications (stream/delta, stream/thinking, stream/tool_use, stream/tool_result, stream/cost_update, stream/permission_request). Fully implemented in ail-core/src/runner/plugin/.
  • Built-in name-collision guard reserves claude, http, ollama, stub. The name ovo is free.
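For orientation, the NDJSON framing the r10 protocol uses is just one JSON-RPC 2.0 object per \n-terminated line. A minimal sketch; the stream/delta method name is from the notification list above, the helper names and params shape are ours:

```python
import json

def frame(msg: dict) -> bytes:
    """Serialize one JSON-RPC 2.0 message as a single NDJSON line."""
    return (json.dumps(msg, separators=(",", ":")) + "\n").encode()

def parse_line(line: bytes) -> dict:
    """Parse one NDJSON line back into a message."""
    return json.loads(line)

# A streaming notification shaped like the r10 surface describes.
delta = {"jsonrpc": "2.0", "method": "stream/delta",
         "params": {"text": "hel"}}
assert parse_line(frame(delta))["method"] == "stream/delta"
```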

Path A — Zero-code: point the existing http runner at OVO

What it looks like:

```sh
export AIL_HTTP_BASE_URL="http://localhost:11436/v1"
export AIL_HTTP_MODEL="<ovo-model-name>"
ail --once "hello" --pipeline demo.ail.yaml
```

```yaml
# demo.ail.yaml
pipeline:
  - id: think
    prompt: "{{ step.invocation.prompt }}"
    runner: ollama   # alias for http
```

Pros: works today, no code, no release.
Cons: no SSE streaming (buffered full response), no tool-calling path, full-history replay on each resume, no access to OVO's native 11437 endpoints (model list, audit, settings).
Effort: docs + a demo/ovo.ail.yaml fixture only.
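As a smoke test before writing the fixture, the same call the http runner makes can be issued directly against OVO's OpenAI-compatible port. A sketch, assuming the standard OpenAI response shape; the model name and prompt are placeholders:

```python
import json
import urllib.request

def chat_payload(model: str, prompt: str) -> dict:
    """Build an OpenAI-style, non-streaming chat completion body."""
    return {"model": model,
            "messages": [{"role": "user", "content": prompt}],
            "stream": False}

def chat_once(base_url: str, model: str, prompt: str) -> str:
    """POST one chat completion and return the assistant text."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(chat_payload(model, prompt)).encode(),
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=60) as r:
        body = json.load(r)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    try:
        print(chat_once("http://localhost:11436/v1", "<ovo-model-name>", "hello"))
    except OSError as exc:
        print("OVO not reachable:", exc)
```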

Path B — Native Rust runner (ovo built-in inside ail-core)

What it looks like: new module ail-core/src/runner/ovo.rs, register in ail-core/src/runner/factory.rs, reserve the name alongside claude/http/ollama/stub.

Pros: direct, in-process, can call the native 11437 endpoints, can add SSE streaming and tool bridging, no subprocess overhead.
Cons: couples ail-core to a specific vendor's product, duplicates most of HttpRunner, burns a built-in name, adds an HTTP client dependency path we'd maintain forever. Contradicts the "Runner trait is the seam" principle in ARCHITECTURE.md — built-ins should be generic (HTTP, Claude CLI, Codex CLI), not per-vendor.

Recommendation: don't do this. Not appropriate for a third-party desktop app.

Path C — External plugin (~/.ail/runners/ovo.yaml)

What it looks like:

```yaml
# ~/.ail/runners/ovo.yaml
name: ovo
version: "0.1.0"
executable: /usr/local/bin/ovo-ail-plugin
protocol_version: "1"
env:
  OVO_BASE_URL: http://localhost:11436
```

```yaml
# pipeline
pipeline:
  - id: think
    runner: ovo
    prompt: "{{ step.invocation.prompt }}"
```

The plugin is a standalone executable (Python, Go, Rust — any language) that:

  1. Reads JSON-RPC 2.0 on stdin, writes on stdout (NDJSON, \n-terminated).
  2. On initialize, declares { name: "ovo", version, protocol_version: "1", capabilities: { streaming: true, session_resume: true, tool_events: <tbd>, permission_requests: false } }.
  3. On invoke, calls OVO's POST /v1/chat/completions with stream=true, translates SSE deltas to stream/delta notifications, closes with a final invoke response populated with response, session_id, input_tokens, output_tokens, model.
  4. Holds conversation history in-plugin keyed by session_id so resume_session_id is cheap (no replay from ail).
  5. Optionally calls GET /ovo/models on initialize to advertise available models, and reports /ovo/audit events through stream/cost_update or log lines.
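Steps 1–3 can be compressed into a sketch. It assumes OVO's stream follows the OpenAI `data: {...}` / `data: [DONE]` SSE convention (unverified; see the open questions below) and that invoke params carry `model` and `prompt` fields; the real field names come from the r10 spec and must be checked against it. Step 4's session store is omitted, so the sketch honestly advertises `session_resume: false`:

```python
import json
import os
import sys
import urllib.request

# Base URL from the manifest's env block, with a localhost fallback.
OVO = os.environ.get("OVO_BASE_URL", "http://localhost:11436") + "/v1"

def emit(msg: dict) -> None:
    """Write one JSON-RPC message as a single NDJSON line on stdout."""
    sys.stdout.write(json.dumps(msg) + "\n")
    sys.stdout.flush()

def sse_deltas(lines):
    """Yield content fragments from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":
            return
        delta = json.loads(data)["choices"][0]["delta"].get("content")
        if delta:
            yield delta

def handle_invoke(req_id, params):
    """Call OVO with stream=true, relay deltas, then answer the request."""
    body = {"model": params["model"],
            "messages": [{"role": "user", "content": params["prompt"]}],
            "stream": True}
    req = urllib.request.Request(
        f"{OVO}/chat/completions",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"})
    pieces = []
    with urllib.request.urlopen(req) as resp:
        for delta in sse_deltas(line.decode() for line in resp):
            pieces.append(delta)
            emit({"jsonrpc": "2.0", "method": "stream/delta",
                  "params": {"text": delta}})
    emit({"jsonrpc": "2.0", "id": req_id,
          "result": {"response": "".join(pieces), "model": params["model"]}})

def main():
    """Read JSON-RPC requests from stdin until EOF."""
    for line in sys.stdin:
        msg = json.loads(line)
        if msg.get("method") == "initialize":
            emit({"jsonrpc": "2.0", "id": msg["id"],
                  "result": {"name": "ovo", "version": "0.1.0",
                             "protocol_version": "1",
                             "capabilities": {"streaming": True,
                                              "session_resume": False,
                                              "permission_requests": False}}})
        elif msg.get("method") == "invoke":
            handle_invoke(msg["id"], msg["params"])
```

Pointing the manifest's `executable` at a wrapper that calls `main()` is enough for a first end-to-end run.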

Pros: zero changes to ail-core, isolates OVO's API churn to one executable, unlocks SSE streaming + cheap session resume + future tool-event bridging, ships and versions independently, written in whatever language is most convenient.
Cons: subprocess spawn per invocation, plugin author owns retry/timeout/error-mapping, need to distribute a binary (or ask users to pip install a Python entry point).

Effort: ~1–2 days for a minimum-viable plugin + manifest + demo/ovo.ail.yaml + a short README. Spec is already published at spec/runner/r10-plugin-protocol.md.

Gaps / missing features to flag

Not blockers, but worth tracking:

  • ail's built-in HttpRunner does not consume SSE. Even when the backend streams, ail buffers. A plugin can translate SSE to stream/delta notifications; a future enhancement to HttpRunner could do the same generically (#TBD — worth a separate issue).
  • ail's HttpRunner has no tool/function-calling path. If OVO's /v1/chat/completions honors OpenAI tools, ail cannot currently route tool_call / tool_result pairs through the generic runner. A plugin can emit stream/tool_use / stream/tool_result notifications, so Path C is the unlock here.
  • Session resume in HttpRunner is full-replay. Plugin can hold server-side state.
  • OVO's tool-calling capability is undocumented. Needs verification (run a probe against /v1/chat/completions with tools in the payload). Outcome determines whether a plugin should bother with the tool-event surface.
  • OVO's auth model is undocumented. Assumed localhost-only. Any non-local deployment story (SSH tunnel, remote Mac) needs separate design.
  • OVO is macOS-only. Linux / Windows ail users cannot run OVO locally; documentation should be explicit about this.
  • The plugin protocol's capabilities.permission_requests flag is not useful for OVO (no tools with side effects); the plugin can simply report false.
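The tool-calling probe flagged above can be as simple as sending one trivial tool definition and checking whether the response carries tool_calls. A sketch, assuming OVO mirrors the OpenAI tools schema, which is exactly the thing being tested; the tool name and parameters are ours:

```python
import json
import urllib.request

def tools_probe_payload(model: str) -> dict:
    """An OpenAI-style request carrying one trivial tool definition."""
    return {
        "model": model,
        "messages": [{"role": "user",
                      "content": "What is the weather in Toronto?"}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }

def probe(base_url: str, model: str) -> bool:
    """True if the backend emitted a tool call; False if it ignored
    the tools field. An HTTP error means the field was rejected."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(tools_probe_payload(model)).encode(),
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=60) as r:
        msg = json.load(r)["choices"][0]["message"]
    return bool(msg.get("tool_calls"))
```

The outcome decides whether the Path C plugin grows a stream/tool_use surface at all.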

Recommendation

Do both A and C, skip B.

  1. Immediately: document Path A in README.md and add demo/ovo.ail.yaml so users can use OVO via the existing http/ollama runner today. Effort: hours.
  2. When there is demand (or as a reference plugin): build Path C as a separate repository (e.g. ail-runner-ovo or a folder under demo/plugins/). This doubles as the first real-world validation of the plugin protocol beyond the in-tree tests, and is the right home for OVO-specific behavior. Effort: 1–2 days.
  3. Explicitly reject Path B. A vendor-specific built-in is the wrong layer; the plugin seam exists precisely to avoid this.

A formal implementation plan for Path C can follow once this direction is agreed.

Questions before committing to a plan

  • Is OVO's /v1/chat/completions faithful enough to OpenAI's spec that a plugin can assume tools, tool_choice, and stream all work? (Needs empirical probe.)
  • Is there appetite to enhance HttpRunner with SSE streaming generically, independent of OVO? That would let Path A cover more ground and reduce the need for a plugin.
  • Should the OVO plugin live in this repo under demo/plugins/ovo/ (as a reference) or in a separate repo?
