Codex backend: subscription-billed OpenAI vertical independent of Claude

## Why

Kai currently runs every agent path through `claude --print` (or `claude` with stream-json headless flags): the conversational backend, PR review, issue triage, behavioral evaluation, and semantic-memory extraction. Anthropic's upcoming changes to Claude Code subscription coverage move those headless invocations behind separate metering starting around mid-June 2026, which exposes every Telegram turn, every triage pass, every review, and every memory extraction to per-call billing.

A second backend that bills against an OpenAI subscription (ChatGPT Plus / Pro / Business) restores zero-marginal-cost operation for operators who already pay for one. OpenAI Codex CLI is the closest structural analog to Claude Code: subscription billing, vendor-managed CLI, headless mode, persistent app-server option.

This epic is operational continuity, not architectural elegance. Multi-vendor resilience is a side benefit; the load-bearing motivation is that one vendor is changing the rules and the project needs an alternative substrate.

## Goal

Add a Codex backend that is a **fully independent vertical** from the existing claude backend. When an operator selects `AGENT_BACKEND=codex`, every agent-driven path in Kai (conversation, triage, review, behavioral eval, memory extraction) runs through OpenAI via the codex CLI, with no Anthropic dependency anywhere in the running process. A machine with no Claude Code installation must be able to run Kai end-to-end on the codex backend.

## Scope

Each vertical owns its own implementation, end-to-end:

- New `src/kai/codex.py` modeled on `src/kai/goose.py` (JSON-RPC 2.0 over a persistent subprocess). Subprocess lifecycle, model mapping (`_CODEX_MODEL_MAP`), event-to-`StreamEvent` translation, restart/kill semantics.
- New codex equivalents (or codex branches) of `triage.py`, `review.py`, `eval/behavioral.py`, `memory_extraction.py` that call OpenAI directly. The existing claude implementations continue to handle the claude backend; they are not modified to wrap a vendor-agnostic abstraction.
- `src/kai/config.py`: add `codex` to `VALID_BACKENDS`, add `VALID_PROVIDERS["codex"]` if needed, add codex-specific env vars (auth path, timeout, context window, etc.).
- `src/kai/install.py`: third backend option in the wizard; codex-aware prompts; `codex login` walkthrough.
- `src/kai/pool.py`: branch on `AGENT_BACKEND=codex` to instantiate `CodexBackend`.
- Tests: codex-side test density comparable to goose's (~60-100 tests), covering subprocess lifecycle, JSON-RPC framing, model mapping, and the codex-side triage/review/eval/extraction equivalents.

## Strictly out of scope

- No proxy, translation layer, or shared helper that routes a codex-active call through Anthropic or vice versa. The two backends do not share agent helpers above the `AgentBackend` ABC.
- No modifications to `claude.py`, `triage.py`, `review.py`, `eval/behavioral.py`, or `memory_extraction.py` to make them "backend-aware." Those modules continue to handle the claude backend; the codex equivalents are separate code.
- Backend-agnostic semantic memory. The codex backend ships with `MEMORY_ENABLED=false` as the default for codex-active installs, same posture the goose+openai smoke test runs with. Backend-agnostic extraction is a follow-up epic.
- OpenCode (#296) and the planned `AcpBackend` extraction. Both remain in the roadmap, deferred behind this epic.

## Acceptance criteria

1. A fresh install with `AGENT_BACKEND=codex`, ChatGPT subscription auth, and no Claude Code present completes the install wizard, starts the service, and handles a Telegram round-trip end-to-end.
2. On a codex-active install, no path invokes the `claude` binary. Verified by grep across the running modules and by absence of `claude --print` calls in the codex equivalents of triage/review/eval/extraction.
3. On a claude-active install, behavior is identical to today. The claude backend is untouched.
4. PR review and issue triage run on the codex backend when configured; behavioral evaluation runs against codex when configured; memory extraction is disabled by default when the backend is codex.
5. Schema-drift mitigation in place: codex CLI version pinned in install metadata; parsers validated against actual output from that version.

## Sequencing notes

- Codex first; OpenCode (#296) plus the `AcpBackend` extraction return to the roadmap after.
- Backend-agnostic semantic memory is a separate epic, filed after this one ships.
- Continuity deadline is approximately 30 days from filing.

## Sub-issues

To be filed against the spec once it lands. Initial slicing candidates:

- CodexBackend subprocess + JSON-RPC framing
- Wizard + config + pool wiring
- Codex triage agent
- Codex PR review agent
- Codex behavioral evaluation
- Codex-aware install wizard prompts
- Schema-drift defenses and version pinning

## Related

- #296 (OpenCode backend) — deferred behind this epic.
- Discussion #299 (multi-vendor resilience) — related goal, not the primary motivation here.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Codex backend: subscription-billed OpenAI vertical independent of Claude #480

Why

Goal

Scope

Strictly out of scope

Acceptance criteria

Sequencing notes

Sub-issues

Related

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Codex backend: subscription-billed OpenAI vertical independent of Claude #480

Description

Why

Goal

Scope

Strictly out of scope

Acceptance criteria

Sequencing notes

Sub-issues

Related

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions