
feat: add pi coding agent as a runtime#65

Open: armstrongsamr wants to merge 4 commits into main from feat/pi-runtime

@armstrongsamr

Contributor

What

Adds the pi coding agent (earendil-works/pi) as a fourth agent runtime alongside Mastra, Claude Code, and Codex. Users can select it in Settings → Runtimes and for autonomous task agents.

Unlike the Claude/Codex runtimes (which wrap an official "drive-the-CLI" SDK), pi has no such SDK — its CLI exposes a headless --mode json. So the adapter spawns the pi binary once per turn (pi --mode json), parses pi's JSONL event stream, and translates its AgentEvent / AssistantMessageEvent union into Kai's StreamEvent format — mirroring the Codex per-turn model.

Key behaviours:

  • Detection on PATH only (detectPiCli/resolvePiCliPath), no bundling — consistent with how claude/codex are detected; reports "inactive" with an install hint when pi is absent.
  • Session resume via a Kai-owned UUID passed as --session-id (idempotent create-or-open in pi), persisted in conversationMetadata.piSessionId. (--resume/--continue are interactive/most-recent and can't resume a specific id headlessly.)
  • Model mapping: drives pi only for first-party Anthropic/OpenAI/Google/Bedrock models at their official endpoints, injecting the provider-specific API key env var. pi has no --base-url, so custom-endpoint models can't be targeted — those fall back to pi's own configured default with a one-line in-chat note.
  • Capabilities: builtInTools, sessions, multiProvider. No MCP (pi has none), so Kai skills/plugins/custom tools and plan mode are unavailable in this runtime; no memory/compaction/observer.
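
The session-resume behaviour above can be sketched roughly as follows. This is a minimal illustration, not the adapter's actual code: the `ConversationMetadata` shape and both function names are hypothetical, and only `piSessionId`, `--mode json`, and `--session-id` come from this description.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical shape: only the `piSessionId` field is named in the PR text.
interface ConversationMetadata {
  piSessionId?: string;
}

/**
 * Returns the session id for this conversation, minting and storing a new
 * UUID on the first turn so that later turns reopen the same pi session.
 */
export function ensurePiSessionId(metadata: ConversationMetadata): string {
  if (!metadata.piSessionId) {
    metadata.piSessionId = randomUUID();
  }
  return metadata.piSessionId;
}

/** Builds the per-turn argv using only the flags named in the PR text. */
export function buildSessionArgs(metadata: ConversationMetadata): string[] {
  return ["--mode", "json", "--session-id", ensurePiSessionId(metadata)];
}
```

Because the id is owned by Kai and pi treats `--session-id` as create-or-open, the first turn and every later turn can share one code path.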

Security

  • API key passed via the provider-specific env var, never on argv (argv is visible via ps).
  • Prompt delivered via stdin, not argv — pi treats a leading @ as a file-read and - as a flag, and has no -- separator.
  • pi has no per-tool approval hook in any headless mode, so it runs bash + file edits unsupervised. The Kai approval mode maps to spawn-time tool scoping (the only gate available): full-auto (default) = all tools; auto-edit = --exclude-tools bash; suggest = --exclude-tools bash,edit,write. The Settings description states the autonomy plainly.
  • Child spawned detached and killed by process group on abort, so pi's bash grandchildren are reaped (no orphans). JSONL parsed line-by-line with size caps; JSON.parse only.
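
As an illustration of the approval-mode mapping described above, spawn-time tool scoping could look like the sketch below. The three mode names, the `--exclude-tools` flag, and the tool lists are taken from this description; the type, table, and function names are hypothetical.

```typescript
// Mode names and tool lists come from the PR text; identifiers are hypothetical.
type ApprovalMode = "full-auto" | "auto-edit" | "suggest";

const EXCLUDED_TOOLS: Record<ApprovalMode, readonly string[]> = {
  "full-auto": [], // all tools available
  "auto-edit": ["bash"], // file edits allowed, no shell
  suggest: ["bash", "edit", "write"], // no shell, no file mutation
};

/** Translates a Kai approval mode into extra argv entries for the pi spawn. */
export function toolScopeArgs(mode: ApprovalMode = "full-auto"): string[] {
  const excluded = EXCLUDED_TOOLS[mode];
  return excluded.length === 0 ? [] : ["--exclude-tools", excluded.join(",")];
}
```

Since the scope is fixed at spawn time, changing the approval mode can only take effect on the next turn, not in the middle of a running one.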

Why

Gives Kai users a third external coding-agent option. pi is multi-provider and self-extensible; wiring it as a first-class runtime lets people drive it from the same chat/agent surfaces as Claude Code and Codex.

Notes / follow-ups

  • Image attachments and a persistent --mode rpc variant are intentionally out of scope for v1 (the RPC variant can be retrofitted behind the same AgentRuntime facade with no contract change).
  • piSdk config lives under agent, which desktopConfigPayload() already persists wholesale — no allowlist change needed.

Checklist

  • pnpm lint passes
  • pnpm type-check passes
  • pnpm test passes (339 passed; 16 new tests in pi-runtime.test.ts)
  • pnpm build succeeds
  • No IPC change — pi reuses the existing agent stream/getAvailableRuntimes channels, so no preload bridge update needed
  • piSdk config nests under agent (persisted wholesale by desktopConfigPayload()) — no allowlist change required
  • Doc impact considered — runtime header comment updated; Settings/runtime UI describe pi's autonomy and limitations

Adds the pi coding agent (earendil-works/pi) as a fourth agent runtime
alongside Mastra, Claude Code, and Codex.

Unlike Claude/Codex (which wrap an official SDK), pi has no drive-the-CLI
SDK, so the adapter spawns the `pi` binary in headless `--mode json`
per turn and translates pi's JSONL event stream (AgentEvent /
AssistantMessageEvent) into Kai StreamEvents — mirroring the Codex
per-turn model.

- New PiRuntime (electron/agent/runtime/pi-runtime.ts): spawn pi --mode
  json, parse JSONL, translate events; Codex-parity capabilities
  (builtInTools + sessions + multiProvider; no MCP/memory/compaction).
- Detection on PATH only (detectPiCli/resolvePiCliPath) — no bundling.
- Session resume via a Kai-owned UUID passed as --session-id (idempotent
  create-or-open), persisted in conversationMetadata.piSessionId.
- Model mapping: drive pi only for first-party Anthropic/OpenAI/Google/
  Bedrock models at their official endpoints; otherwise fall back to pi's
  own default with an in-chat note (pi has no --base-url).
- Security: API key via provider-specific env var (never argv), prompt
  via stdin (avoids @/- argv hazards), spawn-time tool scoping from the
  approval mode (pi has no mid-stream approval hook), detached process
  group killed on abort.
- Wiring: register in main.ts, RuntimeId/labels/config schema (piSdk),
  explicit-mode resolution case, Settings + autonomous-agent picker, and
  the task-terminal PTY path.
- 16 new unit tests; full suite green (339 passed).
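
For readers unfamiliar with the "detached process group killed on abort" item above, here is a minimal sketch of the general technique on POSIX systems. It is not the code in pi-runtime.ts: `killProcessGroup` and `KillFn` are hypothetical names, and the injectable `kill` parameter exists only so the helper can be exercised without spawning anything.

```typescript
// Hypothetical helper; `KillFn` mirrors the shape of Node's `process.kill`.
type KillFn = (pid: number, signal?: string | number) => boolean;

/**
 * Signals the whole process group led by `pid`.
 * Returns true if a signal was sent, false if there was nothing to reap.
 */
export function killProcessGroup(
  pid: number | undefined,
  kill: KillFn = (p, s) => process.kill(p, s),
  signal: string = "SIGTERM",
): boolean {
  if (pid === undefined || pid <= 0) return false; // spawn failed, nothing to reap
  try {
    // A negative pid targets the process group on POSIX. This reaches
    // grandchildren only if the child was spawned with `detached: true`,
    // which makes it the leader of a new group.
    kill(-pid, signal);
    return true;
  } catch (err) {
    // ESRCH means the group has already exited, which is the desired outcome.
    if ((err as { code?: string }).code === "ESRCH") return false;
    throw err;
  }
}
```

A caller would typically pass `child.pid` from `spawn(..., { detached: true })` inside an abort listener. Negative pids are a POSIX convention, so a Windows build would need a different reaping strategy.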
@github-actions

github-actions Bot commented Jun 3, 2026

Contributor

Coverage report

Project-wide coverage (reporting only — no gate):

| Metric | This PR | Baseline |
|---|---|---|
| lines | 32.91% | |
| statements | 31.46% | |
| functions | 29.42% | |
| branches | 24.75% | |

No baseline available — first run, missing artifact, or fetch failed.

Changed-file coverage (8 of 19 files instrumented; 11 not in unit slice)

Per-file table
| File | Lines | Statements | Functions | Branches |
|---|---|---|---|---|
| electron/agent/runtime/claude-agent-runtime.ts | 45.15% | 44.69% | 30.76% | 46.17% |
| electron/agent/runtime/codex-runtime.ts | 44.97% | 43.50% | 58.33% | 35.78% |
| electron/agent/runtime/index.ts | 66.66% | 66.66% | 85.71% | 62.50% |
| electron/agent/runtime/mastra-runtime.ts | 80.00% | 80.00% | 66.66% | 78.57% |
| electron/agent/runtime/model-runtime-compat.ts | 35.71% | 35.21% | 70.00% | 28.40% |
| electron/agent/runtime/pi-runtime.ts | 63.45% | 61.21% | 63.15% | 44.04% |
| electron/config/schema.ts | 100.00% | 100.00% | 100.00% | 100.00% |
| electron/ipc/agent.ts | 6.99% | 6.48% | 7.31% | 0.91% |

Coverage is reporting-only. No thresholds, no merge gate.

Adds two e2e suites validating the pi runtime end-to-end, and fixes a
wiring bug they surfaced.

- fix(ipc/agent): add 'pi' to the isBuiltInRuntime guard in the stream
  handler. Without it, selecting pi was misrouted down the plugin
  inference-provider path and failed with "no inference provider is
  available" instead of spawning the pi CLI. (Mirrors the same fix already
  made in model-runtime-compat.ts.)
- test(integration): real-subprocess suite (pi-runtime.integration.test.ts)
  drives the real PiRuntime.stream() against a fake `pi` shim
  (__tests__/fixtures/fake-pi.mjs) — exercises real spawn/stdin/stdout/exit,
  api-key-via-env, prompt-via-stdin, session-id resume, and verifies the
  process group (incl. a bash grandchild) is reaped on abort.
- test(e2e): Electron+Playwright GUI suite (e2e/pi-runtime.spec.ts) launches
  the live app with a fake `pi` on PATH + seeded config, asserts pi is
  detected via the real getAvailableRuntimes IPC, and runs a turn through
  agent.stream({runtimeOverride:'pi'}) asserting streamed text + tool events
  reach the renderer.
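
The misrouting fixed in the first item above is the kind of bug a single shared guard can prevent. The sketch below is an assumption-laden illustration rather than the code in ipc/agent.ts: `'pi'` and `'claude-code'` appear in this PR, while the other ids, the route labels, and `routeFor` are hypothetical.

```typescript
// 'pi' and 'claude-code' appear in the PR; 'mastra' and 'codex' are assumed ids.
const BUILT_IN_RUNTIMES = ["mastra", "claude-code", "codex", "pi"] as const;
type BuiltInRuntimeId = (typeof BUILT_IN_RUNTIMES)[number];

/** Type guard derived from the single list above, so the two cannot drift. */
export function isBuiltInRuntime(id: string): id is BuiltInRuntimeId {
  return (BUILT_IN_RUNTIMES as readonly string[]).includes(id);
}

/** Hypothetical router: unknown ids fall through to the plugin path. */
export function routeFor(id: string): "built-in" | "plugin-inference" {
  return isBuiltInRuntime(id) ? "built-in" : "plugin-inference";
}
```

Deriving both the union type and the runtime check from one array means that adding a runtime id is a one-line change that every call site picks up.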

All green: type-check, lint, unit (344), integration (5), e2e (2).

Addresses findings from a security review of the pi runtime:

- M1 (process reaping): abort during spawn, or an abort against a hung
  child that has emitted nothing, could leave the stdout read loop blocked
  and skip the finally that reaps the process group. onAbort now also
  destroys child.stdout to unblock the loop, and we reap immediately if the
  signal is already aborted before the listener attaches. Adds a regression
  test (pre-aborted signal must still terminate cleanly).
- L1 (DoS): add a 64 MiB aggregate stdout ceiling (per-turn) on top of the
  existing 1 MiB per-line cap; emit an error and kill the process group if
  exceeded.
- L2 (defense-in-depth): only forward reasoningEffort to --thinking when it
  is one of pi's accepted levels.
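
The two size limits from L1 can be pictured with a small line accumulator like the one below. This is a sketch under stated assumptions, not the shipped parser: the class name and result shape are hypothetical, sizes are measured in string length rather than raw bytes for simplicity, and silently skipping malformed lines is an assumed policy.

```typescript
const MAX_LINE = 1024 * 1024; // 1 MiB per-line cap (figure from the PR)
const MAX_TOTAL = 64 * 1024 * 1024; // 64 MiB per-turn ceiling (figure from the PR)

type PushResult = { ok: true; events: unknown[] } | { ok: false; reason: string };

/** Splits streamed stdout into JSONL events while enforcing both caps. */
export class JsonlAccumulator {
  private buffer = "";
  private total = 0;
  private readonly maxLine: number;
  private readonly maxTotal: number;

  constructor(maxLine: number = MAX_LINE, maxTotal: number = MAX_TOTAL) {
    this.maxLine = maxLine;
    this.maxTotal = maxTotal;
  }

  push(chunk: string): PushResult {
    this.total += chunk.length;
    if (this.total > this.maxTotal) {
      return { ok: false, reason: "aggregate stdout ceiling exceeded" };
    }
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    this.buffer = lines.pop() ?? ""; // keep the trailing partial line
    if (this.buffer.length > this.maxLine) {
      return { ok: false, reason: "line cap exceeded" };
    }
    const events: unknown[] = [];
    for (const line of lines) {
      if (line.length > this.maxLine) {
        return { ok: false, reason: "line cap exceeded" };
      }
      const trimmed = line.trim();
      if (trimmed === "") continue;
      try {
        events.push(JSON.parse(trimmed)); // JSON.parse only, never eval
      } catch {
        // Assumed policy: skip a malformed line rather than abort the turn.
      }
    }
    return { ok: true, events };
  }
}
```

On a `{ ok: false }` result the caller would emit an error event and kill the process group, as the L1 item describes.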

No CRITICAL/HIGH findings: key-via-env-not-argv, prompt-via-stdin,
shell:false, hardened JSONL parsing, and the PTY arg-stripping were all
confirmed correct. All suites green (unit 345, integration 6, e2e 2).
… (M2)

Implements the DrZero-debated 'C+' resolution for M2 (pi runs tools with no
per-action approval): keep full-auto default, but make the disclosure derive
from the runtime contract instead of per-id magic strings.

- Add RuntimeCapabilities.perActionApproval (types.ts). false for pi (no
  headless tool-approval hook); true for mastra/claude/codex (Kai can gate
  each action via canUseTool/approvalPolicy/its own tool exec). Updates all
  runtime capability literals + test stubs.
- Expose perActionApproval through getAvailableRuntimes (index.ts).
- New shared <AutonomyWarning> component rendered wherever a runtime with
  perActionApproval===false is selected — closes the prior drift where
  RuntimeSettings warned but RuntimePicker (and ThreadSettingsModal) did not.
  Wired into all three selection surfaces; removed the autonomy clause from
  the pi description string (now capability-derived).
- Robustness: swallow async EPIPE on the child's stdin when an abort lands
  during spawn (writing to an already-exited child). Surfaced by the
  pre-abort regression test as an unhandled error.
- Tests: RuntimePicker component test (warning shown for pi, hidden for
  claude-code). All suites green: unit+component 345, integration 6, e2e 2.
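
To make the capability-derived disclosure concrete, the check behind a component like `<AutonomyWarning>` might reduce to something like this. The sketch is hypothetical: `perActionApproval`, `builtInTools`, `sessions`, and `multiProvider` are named in this PR, while `RuntimeInfo` and `needsAutonomyWarning` are illustrative names.

```typescript
// Capability names come from the PR; the surrounding shapes are assumed.
interface RuntimeCapabilities {
  builtInTools: boolean;
  sessions: boolean;
  multiProvider: boolean;
  perActionApproval: boolean;
}

interface RuntimeInfo {
  id: string;
  capabilities: RuntimeCapabilities;
}

/**
 * True when the selected runtime cannot be gated per action, so every
 * selection surface should show the same warning without checking ids.
 */
export function needsAutonomyWarning(runtime: RuntimeInfo | undefined): boolean {
  return runtime !== undefined && runtime.capabilities.perActionApproval === false;
}
```

Keying the warning off the capability rather than `id === 'pi'` means a future runtime without an approval hook gets the disclosure automatically.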

Follow-up (not in scope): workspace/cwd confinement + cred-env scrubbing
should be applied uniformly to ALL autonomous runtimes (pi, codex full-auto,
claude bypass), tracked separately rather than singling out pi.