Skip to content

Codex backend: subscription-billed OpenAI vertical independent of Claude #480

@dcellison

Description

@dcellison

Why

Kai currently runs every agent path through claude --print (or claude with stream-json headless flags): the conversational backend, PR review, issue triage, behavioral evaluation, and semantic-memory extraction. Anthropic's upcoming changes to Claude Code subscription coverage move those headless invocations behind separate metering starting around mid-June 2026, which exposes every Telegram turn, every triage pass, every review, and every memory extraction to per-call billing.

A second backend that bills against an OpenAI subscription (ChatGPT Plus / Pro / Business) restores zero-marginal-cost operation for operators who already pay for one. OpenAI Codex CLI is the closest structural analog to Claude Code: subscription billing, vendor-managed CLI, headless mode, persistent app-server option.

This epic is operational continuity, not architectural elegance. Multi-vendor resilience is a side benefit; the load-bearing motivation is that one vendor is changing the rules and the project needs an alternative substrate.

Goal

Add a Codex backend that is a fully independent vertical from the existing claude backend. When an operator selects AGENT_BACKEND=codex, every agent-driven path in Kai (conversation, triage, review, behavioral eval, memory extraction) runs through OpenAI via the codex CLI, with no Anthropic dependency anywhere in the running process. A machine with no Claude Code installation must be able to run Kai end-to-end on the codex backend.

Scope

Each vertical owns its own implementation, end-to-end:

  • New src/kai/codex.py modeled on src/kai/goose.py (JSON-RPC 2.0 over a persistent subprocess). Subprocess lifecycle, model mapping (_CODEX_MODEL_MAP), event-to-StreamEvent translation, restart/kill semantics.
  • New codex equivalents (or codex branches) of triage.py, review.py, eval/behavioral.py, memory_extraction.py that call OpenAI directly. The existing claude implementations continue to handle the claude backend; they are not modified to wrap a vendor-agnostic abstraction.
  • src/kai/config.py: add codex to VALID_BACKENDS, add VALID_PROVIDERS["codex"] if needed, add codex-specific env vars (auth path, timeout, context window, etc.).
  • src/kai/install.py: third backend option in the wizard; codex-aware prompts; codex login walkthrough.
  • src/kai/pool.py: branch on AGENT_BACKEND=codex to instantiate CodexBackend.
  • Tests: codex-side test density comparable to goose's (~60-100 tests), covering subprocess lifecycle, JSON-RPC framing, model mapping, and the codex-side triage/review/eval/extraction equivalents.

Strictly out of scope

  • No proxy, translation layer, or shared helper that routes a codex-active call through Anthropic or vice versa. The two backends do not share agent helpers above the AgentBackend ABC.
  • No modifications to claude.py, triage.py, review.py, eval/behavioral.py, or memory_extraction.py to make them "backend-aware." Those modules continue to handle the claude backend; the codex equivalents are separate code.
  • Backend-agnostic semantic memory. The codex backend ships with MEMORY_ENABLED=false as the default for codex-active installs, same posture the goose+openai smoke test runs with. Backend-agnostic extraction is a follow-up epic.
  • OpenCode (Add OpenCode as a third agent backend #296) and the planned AcpBackend extraction. Both remain in the roadmap, deferred behind this epic.

Acceptance criteria

  1. A fresh install with AGENT_BACKEND=codex, ChatGPT subscription auth, and no Claude Code present completes the install wizard, starts the service, and handles a Telegram round-trip end-to-end.
  2. On a codex-active install, no path invokes the claude binary. Verified by grep across the running modules and by absence of claude --print calls in the codex equivalents of triage/review/eval/extraction.
  3. On a claude-active install, behavior is identical to today. The claude backend is untouched.
  4. PR review and issue triage run on the codex backend when configured; behavioral evaluation runs against codex when configured; memory extraction is disabled by default when the backend is codex.
  5. Schema-drift mitigation in place: codex CLI version pinned in install metadata; parsers validated against actual output from that version.

Sequencing notes

  • Codex first; OpenCode (Add OpenCode as a third agent backend #296) plus the AcpBackend extraction return to the roadmap after.
  • Backend-agnostic semantic memory is a separate epic, filed after this one ships.
  • Continuity deadline is approximately 30 days from filing.

Sub-issues

To be filed against the spec once it lands. Initial slicing candidates:

  • CodexBackend subprocess + JSON-RPC framing
  • Wizard + config + pool wiring
  • Codex triage agent
  • Codex PR review agent
  • Codex behavioral evaluation
  • Codex-aware install wizard prompts
  • Schema-drift defenses and version pinning

Related

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions