Summary
Assessment of how OVO — a local, macOS/Apple-Silicon desktop LLM runtime — could be consumed from ail pipelines. Three integration paths are possible; one works today with zero code.
This issue is a scoping document, not an implementation plan. It exists so we can decide which path to invest in before writing code.
What OVO is
OVO is a desktop application (Tauri + FastAPI sidecar, MLX inference). It is not a CLI, not a stdin/stdout subprocess, and not a library — it is a long-running service that exposes three HTTP APIs on localhost:
| Port | Flavor | Key endpoints |
| --- | --- | --- |
| 11435 | Ollama-compatible | POST /api/chat, POST /api/generate, POST /api/pull, GET /api/tags |
| 11436 | OpenAI-compatible | POST /v1/chat/completions (SSE via stream=true), POST /v1/completions, GET /v1/models |
| 11437 | OVO-native | GET /ovo/models, POST /ovo/models/download, GET /ovo/settings, GET /ovo/claude/context, GET /ovo/audit |
Platform: macOS 13+ on Apple Silicon only (no Intel, no Linux, no Windows). No documented auth scheme (localhost-only). Tool/function-calling support: not documented; needs empirical verification.
What ail already provides
- Runner trait (ail-core/src/runner/mod.rs:280) with invoke() and invoke_streaming().
- Built-in http / ollama runner (ail-core/src/runner/http.rs) that already speaks OpenAI-compatible /chat/completions. Configured via AIL_HTTP_BASE_URL, AIL_HTTP_TOKEN, AIL_HTTP_MODEL, AIL_HTTP_THINK. Does not stream, has no tool support, keeps session history in-process with full replay each turn (O(N²) tokens).
- Runtime plugin protocol (spec/runner/r10-plugin-protocol.md, spec/runner/r11-plugin-discovery.md) — JSON-RPC 2.0 / NDJSON over stdin/stdout, manifests at ~/.ail/runners/*.yaml, handshake + invoke + streaming notifications (stream/delta, stream/thinking, stream/tool_use, stream/tool_result, stream/cost_update, stream/permission_request). Fully implemented in ail-core/src/runner/plugin/.
- Built-in name-collision guard reserves claude, http, ollama, stub. The name ovo is free.
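The plugin protocol's wire format described above (JSON-RPC 2.0 objects, one per newline-terminated NDJSON line) can be sketched as follows. The exact field shapes of the initialize request and stream/delta params here are illustrative assumptions, not copied from the r10 spec:

```python
import json

def encode_frame(msg: dict) -> bytes:
    # One JSON-RPC 2.0 object per line, newline-terminated (NDJSON).
    return (json.dumps(msg, separators=(",", ":")) + "\n").encode("utf-8")

def decode_frames(buf: bytes):
    # Split a byte stream back into parsed JSON-RPC messages.
    for line in buf.split(b"\n"):
        if line.strip():
            yield json.loads(line)

# A handshake request and a streaming notification; per JSON-RPC 2.0,
# notifications carry no "id" and expect no response.
initialize = {"jsonrpc": "2.0", "id": 1, "method": "initialize"}
delta = {"jsonrpc": "2.0", "method": "stream/delta",
         "params": {"text": "hel"}}

wire = encode_frame(initialize) + encode_frame(delta)
msgs = list(decode_frames(wire))
```

The newline framing is what lets the host read messages incrementally from the plugin's stdout without a length prefix.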
Path A — Zero-code: point the existing http runner at OVO
What it looks like:
```sh
export AIL_HTTP_BASE_URL="http://localhost:11436/v1"
export AIL_HTTP_MODEL="<ovo-model-name>"
ail --once "hello" --pipeline demo.ail.yaml
```

```yaml
# demo.ail.yaml
pipeline:
  - id: think
    prompt: "{{ step.invocation.prompt }}"
    runner: ollama  # alias for http
```
Pros: works today, no code, no release.
Cons: no SSE streaming (buffered full response), no tool-calling path, full-history replay on each resume, no access to OVO's native 11437 endpoints (model list, audit, settings).
Effort: docs + a demo/ovo.ail.yaml fixture only.
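The full-history replay cost noted in the cons above can be made concrete with a back-of-envelope calculation. The 100-tokens-per-message figure is an arbitrary illustration:

```python
# Illustrative cost of full-history replay: each turn resends the whole
# transcript, so cumulative input tokens grow quadratically in turn count.
def replayed_tokens(turns: int, tokens_per_message: int = 100) -> int:
    # Turn k replays its k prior messages plus the new one.
    return sum((k + 1) * tokens_per_message for k in range(turns))
```

At 10 turns of 100-token messages this is 5,500 input tokens instead of the 1,000 a server-side session would consume, which is why Path C's in-plugin session state matters for long conversations.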
Path B — Native Rust runner (ovo built-in inside ail-core)
What it looks like: new module ail-core/src/runner/ovo.rs, register in ail-core/src/runner/factory.rs, reserve the name alongside claude/http/ollama/stub.
Pros: direct, in-process, can call the native 11437 endpoints, can add SSE streaming and tool bridging, no subprocess overhead.
Cons: couples ail-core to a specific vendor's product, duplicates most of HttpRunner, burns a built-in name, adds an HTTP client dependency path we'd maintain forever. Contradicts the "Runner trait is the seam" principle in ARCHITECTURE.md — built-ins should be generic (HTTP, Claude CLI, Codex CLI), not per-vendor.
Recommendation: don't do this. Not appropriate for a third-party desktop app.
Path C — External plugin (~/.ail/runners/ovo.yaml)
What it looks like:
```yaml
# ~/.ail/runners/ovo.yaml
name: ovo
version: "0.1.0"
executable: /usr/local/bin/ovo-ail-plugin
protocol_version: "1"
env:
  OVO_BASE_URL: http://localhost:11436
```

```yaml
# pipeline
pipeline:
  - id: think
    runner: ovo
    prompt: "{{ step.invocation.prompt }}"
```
The plugin is a standalone executable (Python, Go, Rust — any language) that:
- Reads JSON-RPC 2.0 on stdin, writes on stdout (NDJSON, \n-terminated).
- On initialize, declares { name: "ovo", version, protocol_version: "1", capabilities: { streaming: true, session_resume: true, tool_events: <tbd>, permission_requests: false } }.
- On invoke, calls OVO's POST /v1/chat/completions with stream=true, translates SSE deltas to stream/delta notifications, and closes with a final invoke response populated with response, session_id, input_tokens, output_tokens, model.
- Holds conversation history in-plugin keyed by session_id so resume_session_id is cheap (no replay from ail).
- Optionally calls GET /ovo/models on initialize to advertise available models, and reports /ovo/audit events through stream/cost_update or log lines.
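The SSE-to-notification translation in the invoke step above can be sketched as a pure function. This assumes OVO follows the standard OpenAI streaming framing (data: {json} chunks with a data: [DONE] sentinel), which is exactly the kind of detail the empirical probe needs to confirm:

```python
import json

def sse_to_deltas(sse_body: str):
    """Translate OpenAI-style SSE chunks into stream/delta notifications.

    Assumes 'data: {json}' framing with a terminal 'data: [DONE]'
    sentinel; OVO's exact framing is unverified.
    """
    for line in sse_body.splitlines():
        if not line.startswith("data: "):
            continue  # skip comments, blank keep-alive lines
        data = line[len("data: "):]
        if data == "[DONE]":
            return
        chunk = json.loads(data)
        text = chunk["choices"][0]["delta"].get("content")
        if text:
            yield {"jsonrpc": "2.0", "method": "stream/delta",
                   "params": {"text": text}}

# Canned two-chunk stream, as /v1/chat/completions would emit it.
sse = (
    'data: {"choices":[{"delta":{"content":"hel"}}]}\n'
    'data: {"choices":[{"delta":{"content":"lo"}}]}\n'
    'data: [DONE]\n'
)
notes = list(sse_to_deltas(sse))
```

In the real plugin each yielded notification would be written to stdout as one NDJSON line before the final invoke response is sent.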
Pros: zero changes to ail-core, isolates OVO's API churn to one executable, unlocks SSE streaming + cheap session resume + future tool-event bridging, ships and versions independently, written in whatever language is most convenient.
Cons: subprocess spawn per invocation, plugin author owns retry/timeout/error-mapping, need to distribute a binary (or ask users to pip install a Python entry point).
Effort: ~1–2 days for a minimum-viable plugin + manifest + demo/ovo.ail.yaml + a short README. Spec is already published at spec/runner/r10-plugin-protocol.md.
Gaps / missing features to flag
Not blockers, but worth tracking:
- ail's built-in HttpRunner does not consume SSE. Even when the backend streams, ail buffers. A plugin can translate SSE to stream/delta notifications; a future enhancement to HttpRunner could do the same generically (#TBD — worth a separate issue).
- ail's HttpRunner has no tool/function-calling path. If OVO's /v1/chat/completions honors OpenAI tools, ail cannot currently route tool_call / tool_result pairs through the generic runner. A plugin can emit stream/tool_use / stream/tool_result notifications, so Path C is the unlock here.
- Session resume in HttpRunner is full-replay. A plugin can hold server-side state.
- OVO's tool-calling capability is undocumented. Needs verification (run a probe against /v1/chat/completions with tools in the payload). The outcome determines whether a plugin should bother with the tool-event surface.
- OVO's auth model is undocumented. Assumed localhost-only. Any non-local deployment story (SSH tunnel, remote Mac) needs separate design.
- OVO is macOS-only. Linux / Windows ail users cannot run OVO locally; documentation should be explicit about this.
- Plugin protocol capabilities.permission_requests: true is not useful for OVO (no tools with side effects). Fine — the plugin can report false.
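The tool-calling probe flagged above can be as simple as one request. Everything here is a throwaway illustration (the model placeholder, the get_weather tool); the only thing under test is whether OVO accepts the standard OpenAI tools and tool_choice fields:

```python
import json

# Minimal probe payload to check whether OVO's /v1/chat/completions
# honors OpenAI-style function calling.
probe = {
    "model": "<ovo-model-name>",
    "messages": [{"role": "user", "content": "What is the weather in Oslo?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    "tool_choice": "auto",
}
# POST this body to http://localhost:11436/v1/chat/completions and check
# whether the reply carries choices[0].message.tool_calls; if the field
# is absent or the request is rejected, the plugin should report
# capabilities.tool_events: false and skip the tool-event surface.
body = json.dumps(probe)
```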
Recommendation
Do both A and C, skip B.
- Immediately: document Path A in README.md and add demo/ovo.ail.yaml so users can use OVO via the existing http/ollama runner today. Effort: hours.
- When there is demand (or as a reference plugin): build Path C as a separate repository (e.g. ail-runner-ovo or a folder under demo/plugins/). This doubles as the first real-world validation of the plugin protocol beyond the in-tree tests, and is the right home for OVO-specific behavior. Effort: 1–2 days.
- Explicitly reject Path B. A vendor-specific built-in is the wrong layer; the plugin seam exists precisely to avoid this.
A formal implementation plan for Path C can follow once this direction is agreed.
Questions before committing to a plan
- Is OVO's /v1/chat/completions faithful enough to OpenAI's spec that a plugin can assume tools, tool_choice, and stream all work? (Needs an empirical probe.)
- Is there appetite to enhance HttpRunner with SSE streaming generically, independent of OVO? That would let Path A cover more ground and reduce the need for a plugin.
- Should the OVO plugin live in this repo under demo/plugins/ovo/ (as a reference) or in a separate repo?