Unify real-time context fill signal across CLI and ACP backends

## Problem

Callers building real-time context fill indicators (fuel gauges) must handle backend-specific differences that should be abstracted by the library. A broader review of the MCP diagnostic server (the first non-trivial consumer of agentrun) reveals additional abstraction leaks beyond context fill that force every consumer to manage engine-level complexity.

### Current state — context fill

| Aspect | CLI (Claude) | ACP (OpenCode) |
|---|---|---|
| Message type carrying signal | `MessageText`, `MessageThinking` | `MessageContextWindow` |
| `ContextUsedTokens` | Per-call fill (input+cache) | Session-level fill |
| `ContextSizeTokens` | 0 (not reported) | Reported (e.g., 1048576) |
| Can compute percentage? | No | Yes |

### What callers have to do today

```go
// Must handle two different message types
switch msg.Type {
case agentrun.MessageContextWindow:
    // ACP: dedicated message, has both size and used
    pct := float64(msg.Usage.ContextUsedTokens) / float64(msg.Usage.ContextSizeTokens)
    updateFuelGauge(pct)

case agentrun.MessageText, agentrun.MessageThinking:
    // CLI: usage piggybacks on content messages, no ContextSizeTokens
    if msg.Usage != nil && msg.Usage.ContextUsedTokens > 0 {
        // absolute number only — can't compute percentage
    }
}
```

### What callers should be able to do

```go
switch msg.Type {
case agentrun.MessageContextWindow:
    pct := float64(msg.Usage.ContextUsedTokens) / float64(msg.Usage.ContextSizeTokens)
    updateFuelGauge(pct)
}
```

Same message type, same fields, all backends.
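Even with a unified message type, a consumer still has to cope with backends that cannot report a window size. The following is a minimal sketch of such a guard, not library code: `Usage` is a local stand-in that mirrors only the two fields named in this issue (their real types are assumed to be integers), and `fillFraction` is a hypothetical helper name.

```go
package main

// Usage is a local stand-in for the library's usage struct. Only the two
// fields discussed in this issue are mirrored; the int type is an assumption.
type Usage struct {
	ContextUsedTokens int
	ContextSizeTokens int
}

// fillFraction (hypothetical helper) returns the context fill clamped to
// [0, 1]. ok is false when no window size was reported, in which case the
// caller can fall back to showing the absolute token count.
func fillFraction(u *Usage) (frac float64, ok bool) {
	if u == nil || u.ContextSizeTokens <= 0 {
		return 0, false
	}
	frac = float64(u.ContextUsedTokens) / float64(u.ContextSizeTokens)
	if frac > 1 {
		frac = 1
	}
	return frac, true
}
```

The `ok` result keeps the "no denominator" case explicit instead of producing `+Inf` or `NaN` from a division by zero.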

## Abstraction leaks found in the MCP server

Reviewing `cmd/agentrun-mcp/` as the first real consumer exposed four complexity leaks. The first three go beyond context fill; the fourth is the context fill problem as it appears in the MCP server:

### 1. `spawnPerTurn` branching — duplicated in every consumer

The MCP server branches on `spawnPerTurn` in **three places** (`doRunTurn`, `doSessionStart`, `doSessionSend`):

```go
if spawnPerTurn {
    drainSpawnPerTurn(ctx, proc, handler)
} else {
    turnErr = agentrun.RunTurn(ctx, proc, input.Prompt, handler)
}
```

Every consumer (MCP server, Foundry, future orchestrators) must know whether a backend is spawn-per-turn vs streaming and branch accordingly. `RunTurn` should handle both engine types internally.

### 2. `makeEngine` returns a boolean consumers must carry forever

`makeEngine` returns `(agentrun.Engine, bool, error)` where the bool is `spawnPerTurn`. This gets stored on `sessionEntry` and checked on every `session_send`. The engine knows its own semantics, but that knowledge doesn't survive the `agentrun.Engine` interface boundary — consumers track it externally.

### 3. No unified turn summary

After a turn, callers want: text, thinking, tool calls, usage, stop reason, denials. Currently they iterate all messages and switch on type. The MCP server returns raw message arrays with no summarization. A `TurnSummary` type or helper would let consumers avoid reimplementing message-iteration logic.

### 4. Context fill requires filtering on 3+ message types

Per the core issue — CLI puts `ContextUsedTokens` on `MessageText`/`MessageThinking`, ACP puts it on `MessageContextWindow`. The MCP server passes these through without normalization. A caller monitoring fill has to know which types carry the signal per backend.

## Underlying causes — context fill

1. **Different message types carry the signal.** ACP emits a dedicated `MessageContextWindow`. CLI piggybacks `ContextUsedTokens` onto content messages. Callers must know which types to listen on per backend.

2. **CLI lacks `ContextSizeTokens`.** Claude CLI doesn't report context window capacity mid-turn. Without the denominator, callers can't compute fill percentage. However, Claude CLI's result event includes `modelUsage.<model>.contextWindow` (e.g., `200000`); that value could be captured from init or first-event metadata and used as the denominator.

3. **Semantic gap.** CLI's `ContextUsedTokens` is per-API-call input-side fill (`InputTokens + CacheReadTokens + CacheWriteTokens`). ACP's is session-level context fill. Both answer "how full is the context?" but the measurement differs.
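To make the CLI-side measurement concrete, here is a small worked sketch of the formula quoted in cause 3. The `callUsage` struct and `contextUsed` function are hypothetical stand-ins; only the three field names come from this issue, and the numbers in the note below are illustrative.

```go
package main

// callUsage is a hypothetical stand-in for the per-API-call usage that the
// CLI backend reports. Field names follow the formula in this issue.
type callUsage struct {
	InputTokens      int
	CacheReadTokens  int
	CacheWriteTokens int
}

// contextUsed applies the CLI formula from this issue: everything on the
// input side of one API call counts toward context fill.
func contextUsed(u callUsage) int {
	return u.InputTokens + u.CacheReadTokens + u.CacheWriteTokens
}
```

For example, a call with 1,200 input tokens, 30,000 cache-read tokens, and 800 cache-write tokens gives a fill of 32,000 tokens. Against a 200,000-token window that is 16%, which is comparable to, but not measured the same way as, ACP's session-level figure.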

## Design considerations

### Context fill unification
- CLI backends that report per-call usage mid-turn (Claude) should synthesize `MessageContextWindow` messages — same as ACP already does natively.
- `ContextSizeTokens` could be populated from model metadata if available (Claude CLI `contextWindow` field, ACP `usage_update.size`). When not available, consumers fall back to absolute values.
- The synthesized `MessageContextWindow` should not duplicate data already on content messages — it's a separate signal.
- Backends that don't report per-call usage mid-turn (Codex, OpenCode CLI) would simply not emit `MessageContextWindow`, same as today.
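The synthesis step described in these bullets could be sketched roughly as follows. This is an illustration under assumptions, not the actual `engine/cli` code: the types are local stand-ins, the string values of the message type constants are invented, and `synthesizeContextWindow` is a hypothetical name.

```go
package main

// MessageType and its constants are local stand-ins; the string values are
// assumptions made for this sketch.
type MessageType string

const (
	MessageText          MessageType = "text"
	MessageThinking      MessageType = "thinking"
	MessageContextWindow MessageType = "context_window"
)

// Usage mirrors only the two fields discussed in this issue.
type Usage struct {
	ContextUsedTokens int
	ContextSizeTokens int
}

// Message is a minimal stand-in for the library's message type.
type Message struct {
	Type  MessageType
	Usage *Usage
}

// synthesizeContextWindow (hypothetical) derives a MessageContextWindow from
// a content message that carries usage. windowSize is the capacity learned
// from model metadata, or 0 when unknown. ok is false when there is nothing
// to emit, which is the case for backends without mid-turn usage.
func synthesizeContextWindow(msg Message, windowSize int) (out Message, ok bool) {
	if msg.Type != MessageText && msg.Type != MessageThinking {
		return Message{}, false
	}
	if msg.Usage == nil || msg.Usage.ContextUsedTokens <= 0 {
		return Message{}, false
	}
	return Message{
		Type: MessageContextWindow,
		Usage: &Usage{
			ContextUsedTokens: msg.Usage.ContextUsedTokens,
			ContextSizeTokens: windowSize,
		},
	}, true
}
```

The synthesized message gets its own `Usage` value rather than sharing the pointer from the content message, so the two signals stay independent.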

### `RunTurn` unification
- `RunTurn` should accept all engine types and handle the send+drain vs drain-only branching internally.
- The `Process` interface or engine metadata should expose send semantics so `RunTurn` can branch — consumers should not.
- This eliminates `spawnPerTurn` as an external concept entirely.
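One possible shape for this, sketched under heavy assumptions: the process advertises its send semantics through an optional interface, and the turn runner checks it with a type assertion. Everything below is hypothetical, including the `Process` methods, the `promptAtSpawn` interface, and the `runTurnUnified` name; the real `agentrun` interfaces may look quite different.

```go
package main

import "context"

// Message is a minimal stand-in for the library's message type.
type Message struct{ Text string }

// Process is a hypothetical stand-in for the library's process handle.
type Process interface {
	// Send delivers a prompt to a long-lived streaming process.
	Send(ctx context.Context, prompt string) error
	// Drain passes messages to handler until the turn ends.
	Drain(ctx context.Context, handler func(Message) error) error
}

// promptAtSpawn is a hypothetical optional interface. A process whose prompt
// was already supplied at spawn time implements it and returns true.
type promptAtSpawn interface {
	PromptAtSpawn() bool
}

// runTurnUnified sends the prompt only when the process expects one, then
// drains in either case, so consumers never branch on engine type.
func runTurnUnified(ctx context.Context, proc Process, prompt string, handler func(Message) error) error {
	if p, ok := proc.(promptAtSpawn); !ok || !p.PromptAtSpawn() {
		if err := proc.Send(ctx, prompt); err != nil {
			return err
		}
	}
	return proc.Drain(ctx, handler)
}

// fakeProc is an in-memory stand-in used only to exercise the sketch.
type fakeProc struct {
	spawnPerTurn bool
	sent         []string
	out          []Message
}

func (f *fakeProc) Send(_ context.Context, prompt string) error {
	f.sent = append(f.sent, prompt)
	return nil
}

func (f *fakeProc) Drain(_ context.Context, handler func(Message) error) error {
	for _, m := range f.out {
		if err := handler(m); err != nil {
			return err
		}
	}
	return nil
}

func (f *fakeProc) PromptAtSpawn() bool { return f.spawnPerTurn }
```

With this shape, a consumer calls `runTurnUnified` for every engine, and the boolean that `makeEngine` returns today would no longer need to be stored on `sessionEntry`.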

### Turn summary
- A `TurnSummary` helper or type in the root package would collect text, thinking, tool calls, usage, stop reason, and denials from a message stream.
- Both the MCP server and Foundry currently reimplement this iteration logic.
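A rough sketch of what such a helper might collect is shown below. It covers only text, thinking, and the latest context usage; tool calls, stop reason, and denials are left out because this issue does not name the message types that carry them. The types, the `Content` field, the constant values, and the `summarize` name are all assumptions made for illustration.

```go
package main

import "strings"

// MessageType and its constants are local stand-ins with assumed values.
type MessageType string

const (
	MessageText          MessageType = "text"
	MessageThinking      MessageType = "thinking"
	MessageContextWindow MessageType = "context_window"
)

// Usage mirrors only the two fields discussed in this issue.
type Usage struct {
	ContextUsedTokens int
	ContextSizeTokens int
}

// Message is a minimal stand-in; Content is an assumed field name.
type Message struct {
	Type    MessageType
	Content string
	Usage   *Usage
}

// TurnSummary is a partial, hypothetical version of the proposed type.
type TurnSummary struct {
	Text     string
	Thinking string
	Usage    *Usage // most recent context window report, if any
}

// summarize walks the message stream once and folds it into a TurnSummary.
func summarize(msgs []Message) TurnSummary {
	var text, thinking strings.Builder
	var s TurnSummary
	for _, m := range msgs {
		switch m.Type {
		case MessageText:
			text.WriteString(m.Content)
		case MessageThinking:
			thinking.WriteString(m.Content)
		case MessageContextWindow:
			if m.Usage != nil {
				s.Usage = m.Usage // later reports replace earlier ones
			}
		}
	}
	s.Text = text.String()
	s.Thinking = thinking.String()
	return s
}
```

Consumers would then read fields off the result instead of each writing their own type switch.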

## Proposed simplification summary

| Change | Where | Impact |
|---|---|---|
| Synthesize `MessageContextWindow` from CLI | `engine/cli/process.go` | Unified message type for context fill |
| Populate `ContextSizeTokens` from model metadata | `engine/cli/claude/parse.go` | Percentage-based fuel gauge for CLI |
| `RunTurn` handles all engine types | `runturn.go` | Eliminates `spawnPerTurn` branching in every consumer |
| Expose send semantics on Process/Engine | `agentrun.go` | Consumers don't carry external booleans |
| `TurnSummary` helper | root package | Consumers iterate once, get structured result |

## Related

- #39 — Populate `ContextUsedTokens` on mid-turn messages (implemented, immediate improvement)
- This issue addresses the remaining abstraction gaps after #39

Unify real-time context fill signal across CLI and ACP backends #40
