perf: trim agent.agent_context.skills from /api/conversations responses by default (-98% wire bytes) #3301

## Summary

The agent-server's ``ConversationInfo`` response includes the full ``agent.agent_context.skills`` list. For an agent configured with ``load_user_skills=true`` / ``load_public_skills=true``, the SDK's ``_load_auto_skills`` validator resolves the entire skill catalog (~40 entries in stock setups) and persists them inline. Every ``GET /api/conversations`` therefore carries ~260 KB of skill content that's only meaningful server-side — skill bodies are consumed at prompt-render time, never by API clients.

**Breaking change**: ``GET /conversations*`` now trims ``agent.agent_context.skills`` to ``[]`` by default. Callers that still need the legacy full-payload shape can opt back in with ``?include_skills=true``. Persisted state and the in-memory runtime are untouched.
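For a caller that does need the legacy shape, the opt-in is just one extra query parameter. A minimal sketch of building the request URL, assuming the ``/api/conversations`` path and the ``127.0.0.1:18000`` address used in the measurements in this issue; the helper name ``conversations_url`` is hypothetical and authentication is left out:

```python
from urllib.parse import urlencode


def conversations_url(base_url: str, include_skills: bool = False, **params: str) -> str:
    """Build a GET URL for the conversations endpoint (hypothetical helper).

    The default omits include_skills, so the server returns the trimmed shape.
    """
    query = dict(params)
    if include_skills:
        # Opt back in to the legacy full-payload shape.
        query["include_skills"] = "true"
    path = f"{base_url.rstrip('/')}/api/conversations"
    return f"{path}?{urlencode(query)}" if query else path


default_url = conversations_url("http://127.0.0.1:18000")
legacy_url = conversations_url("http://127.0.0.1:18000", include_skills=True)
print(default_url)
print(legacy_url)
```

Any other query parameters the endpoint accepts can be passed through as keyword arguments; their exact format is not assumed here.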

## Audit: no consumer actually reads this field over HTTP

- **agent-canvas** — does not read ``agent.agent_context.skills`` from ``ConversationInfo`` (only reads ``agent.kind`` and ``agent.llm.model``).
- **OpenHands/OpenHands** (app-server) — does not read it from HTTP; consumed in-process via the agent's runtime.
- **SDK examples** — use ``LocalConversation`` (in-process), not ``RemoteConversation``.

No known caller is affected by the default trim. The ``?include_skills=true`` escape hatch exists only for unknown custom integrations.

## Empirical measurements (honest)

Run against agent-server on ``127.0.0.1:18000``, real 40-skill conversation. Two profile scripts attached at the bottom of this issue.

### Wire-level (rock solid, reproducible across 5 GETs each)

| Endpoint | Legacy (``?include_skills=true``) | Default (trimmed) | Delta |
|---|---|---|---|
| ``GET /api/conversations?ids=…`` (40-skill conv) | 265 KB, 3.3 ms server | 4.7 KB, 2.4 ms server | **-98% bytes, -27% server time** |
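The deltas in the table follow directly from the measured values. A short check of the arithmetic (the helper name ``percent_change`` is only for illustration):

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


# Measured values from the wire-level table above.
bytes_delta = percent_change(265.0, 4.7)   # payload size in KB
server_delta = percent_change(3.3, 2.4)    # server time in ms

print(f"bytes: {bytes_delta:.1f}%, server time: {server_delta:.1f}%")
```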

### Browser cold-cold-cold (3 trials; servers killed, Vite cache wiped between trials)

| Trial | Baseline (full payload) | With trim | Per-trial delta |
|---|---|---|---|
| 1 | 5144 ms | 4889 ms | -255 ms |
| 2 | 4956 ms | 4990 ms | +34 ms |
| 3 | 4970 ms | 4883 ms | -87 ms |
| **Median** | **4970 ms** | **4889 ms** | **-81 ms (-1.6%)** |
| **Mean** | **5023 ms** | **4921 ms** | **-102 ms (-2.0%)** |

Run-to-run spread is ~200 ms (188 ms across the baseline trials); the trim's mean effect is ~100 ms. Three trials are too few to establish statistical significance: the result is directionally consistent (2 of 3 trials favor the trim) but sits close to the noise floor.
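The summary rows can be recomputed from the three trials with the standard library. Note that the mean delta in the table is the difference of the rounded means; the unrounded difference is about -102.7 ms.

```python
from statistics import mean, median

# Cold-cold-cold wall-clock timings from the table above, in ms.
baseline_ms = [5144, 4956, 4970]
trimmed_ms = [4889, 4990, 4883]

# Per-trial deltas (negative means the trim was faster).
deltas = [t - b for b, t in zip(baseline_ms, trimmed_ms)]

median_delta = median(trimmed_ms) - median(baseline_ms)
mean_delta = round(mean(trimmed_ms)) - round(mean(baseline_ms))

# Range of the baseline trials, a rough proxy for run-to-run noise.
baseline_spread = max(baseline_ms) - min(baseline_ms)

print(deltas, median_delta, mean_delta, baseline_spread)
```

With only three samples per arm, the baseline spread is larger than either summary delta, which is why the result is best read as directional.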

Order-of-magnitude reconciliation:
- Network transfer (260 KB → 5 KB on localhost): ~2-5 ms
- JSON.parse (260 KB on V8): ~5-10 ms
- React Query structural-share traversal: ~5-10 ms
- Server-side serialization: ~1 ms
- Expected total: ~15-25 ms. The observed mean of ~100 ms is larger than this back-of-envelope estimate, possibly because of React reconciliation or GC pressure amplification, possibly because of noise.
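Summing the per-component ranges listed above makes the gap explicit. The bounds are the rough estimates from the list, not measurements:

```python
# (low, high) estimates in ms for each component listed above.
estimates_ms = {
    "network transfer": (2, 5),
    "JSON.parse": (5, 10),
    "React Query structural sharing": (5, 10),
    "server-side serialization": (1, 1),
}

low_total = sum(low for low, _ in estimates_ms.values())
high_total = sum(high for _, high in estimates_ms.values())

observed_ms = 102  # mean delta from the browser trials

print(f"expected {low_total}-{high_total} ms, observed ~{observed_ms} ms")
print(f"observed is {observed_ms / high_total:.1f}x to {observed_ms / low_total:.1f}x the estimate")
```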

## Where the trim actually wins

The browser cold-open wall-clock improvement is small on a fast localhost dev box. Where the optimization matters more:

- **Network bandwidth on cloud backends** (100ms+ RTT) — 260 KB per fetch is a real cost; 4.7 KB is not. Big difference on slow/metered links.
- **Sustained polling load** — ``useUserConversation`` refetches via ``refetchInterval``. Over a long session the cumulative bytes drop ~98%.
- **Server-side CPU** — 27% less serialization time per request. Adds up across many concurrent clients.
- **Client memory** — 260 KB × N conversations held in React Query's cache → ~5 KB × N. Meaningful for users with many open conversations.
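To get a feel for the polling case, here is a rough estimate of cumulative transfer for one polled conversation. The 5-second interval and one-hour session are hypothetical inputs chosen for illustration; the actual ``refetchInterval`` is not stated in this issue.

```python
FULL_KB = 265.0     # measured legacy payload per fetch
TRIMMED_KB = 4.7    # measured trimmed payload per fetch


def session_transfer_mb(payload_kb: float, poll_interval_s: int, session_s: int) -> float:
    """Total MB transferred by polling one conversation for a whole session."""
    polls = session_s // poll_interval_s
    return payload_kb * polls / 1024


POLL_INTERVAL_S = 5    # assumption, not taken from the canvas source
SESSION_S = 60 * 60    # assumption: a one-hour session

full_mb = session_transfer_mb(FULL_KB, POLL_INTERVAL_S, SESSION_S)
trimmed_mb = session_transfer_mb(TRIMMED_KB, POLL_INTERVAL_S, SESSION_S)

print(f"full: {full_mb:.1f} MB, trimmed: {trimmed_mb:.1f} MB")
```

The ratio between the two totals is the same ~98% regardless of the interval chosen; only the absolute numbers depend on the assumptions.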

## Implementation approach (route-boundary trim, breaking default)

Drop the ``skills`` array at the route boundary in the FastAPI handler that emits ``ConversationInfo``. The default response shape is the trimmed one; the ``include_skills: bool = False`` query parameter accepts ``true`` to restore the legacy shape.
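A minimal sketch of what the trim itself could look like, written against a plain dict so it runs without FastAPI. It assumes the serialized ``ConversationInfo`` nests the list at ``agent.agent_context.skills``; the function name ``trim_skills`` is hypothetical and this is not the code from the PR. In the real handler, ``include_skills`` would arrive as the query parameter.

```python
from typing import Any


def trim_skills(info: dict[str, Any], include_skills: bool = False) -> dict[str, Any]:
    """Return a response-ready view of a serialized ConversationInfo dict.

    The input is never mutated, so persisted and in-memory state keep
    the full skill list.
    """
    if include_skills:
        return info  # legacy full-payload shape
    agent = info.get("agent")
    if not isinstance(agent, dict):
        return info
    context = agent.get("agent_context")
    if not isinstance(context, dict) or not context.get("skills"):
        return info
    # Shallow-copy only the path being changed; everything else is shared.
    return {**info, "agent": {**agent, "agent_context": {**context, "skills": []}}}


full = {
    "id": "conv-1",
    "agent": {"kind": "Agent", "agent_context": {"skills": [{"name": "example-skill"}]}},
}
wire = trim_skills(full)
print(wire["agent"]["agent_context"]["skills"])          # trimmed copy
print(len(full["agent"]["agent_context"]["skills"]))     # source still intact
```

Copying only the three dicts on the path keeps the trim cheap even when the skill list itself is large, since the skill bodies are never traversed.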

**Why route-boundary rather than in-model**: an earlier exploration (closed PR #3302) attempted to trim via a ``@field_serializer("skills")`` on ``AgentContext``. It worked but accumulated 5 commits / ~150 lines of guards (persistence opt-out, ``loading_from_snapshot`` flag, deep-copy snapshots, ``round_trip`` handling, config-drift detection, model_copy-merge in the resume path) because the model was trying to keep three different truths in sync (in-memory full, wire trimmed, persisted full). Moving the trim to the FastAPI route boundary collapses all of that: the model stays a single source of truth, and the optimization is a handful of lines that only fires when the route handler is told to.

**Why breaking-change default rather than opt-in**: an earlier revision of #3316 defaulted to include-skills and required callers to opt into the trim. Per maintainer feedback, since no client actually reads the field from HTTP, the opt-in shape was carrying ~260 KB of payload that nothing consumes. Flipping the default is the right cost/benefit: every consumer gets the slimmer payload for free, and the rare legacy caller pays a documented opt-in.

## Tracked across PRs

- **[software-agent-sdk#3316](https://github.com/OpenHands/software-agent-sdk/pull/3316)** — server-side breaking change. The load-bearing PR.
- **[typescript-client#172](https://github.com/OpenHands/typescript-client/pull/172)** (draft) — typed ``includeSkills`` option on ``ConversationClient`` for the rare legacy caller wanting to opt back to ``true``. Lands after #3316 cuts a release. Low-priority — most callers will never need it.
- ~~agent-canvas#651~~ — closed; canvas doesn't read the field and needs no change once the new SDK ships.

## Profile scripts

Drop these in ``tools/`` of a canvas worktree, then run from there:

### ``tools/profile-conversation-open.mjs``

REST waterfall in Node. Hits the three endpoints canvas calls on conversation open and times network / body-buffer / JSON.parse / structural-walk per phase. Reads the session API key from ``~/.openhands/agent-canvas/session-api-key.txt``.

```
node tools/profile-conversation-open.mjs <conversation-id>
```

### ``tools/profile-conversation-browser.mjs``

End-to-end via Playwright + Chromium with CDP for the network waterfall. Seeds canvas's ``openhands-backends`` / ``openhands-active-backend`` localStorage, navigates to the conversation, and measures wall-clock milestones (DOMContentLoaded → chat shell mounted → loading skeleton gone → first event painted). Distinguishes cold-cold full page reload from warm SPA switch.

For rigorous cold-cold-cold measurements:
1. ``kill $(lsof -i :18000 -t)`` — stop agent-server (clears in-memory ConversationService cache).
2. ``kill $(lsof -i :3001 -t)`` — stop canvas dev (clears Vite worker state).
3. ``find ~/worktrees/<canvas>/node_modules/.vite -mindepth 1 -delete`` — wipe Vite pre-bundle cache.
4. Start agent-server + canvas dev cold.
5. Run the probe.

```
node tools/profile-conversation-browser.mjs <conversation-id>
```
