perf: trim agent.agent_context.skills from /api/conversations responses by default (-98% wire bytes) #3301

@simonrosenberg

Summary

The agent-server's ConversationInfo response includes the full agent.agent_context.skills list. For an agent configured with load_user_skills=true / load_public_skills=true, the SDK's _load_auto_skills validator resolves the entire skill catalog (~40 entries in stock setups) and persists them inline. Every GET /api/conversations therefore carries ~260 KB of skill content that's only meaningful server-side — skill bodies are consumed at prompt-render time, never by API clients.

Breaking change: GET /conversations* now trims agent.agent_context.skills to [] by default. Callers that still need the legacy full-payload shape can opt back in with ?include_skills=true. Persisted state and the in-memory runtime are untouched.

Audit: no consumer actually reads this field over HTTP

  • agent-canvas — does not read agent.agent_context.skills from ConversationInfo (only reads agent.kind and agent.llm.model).
  • OpenHands/OpenHands (app-server) — does not read it from HTTP; consumed in-process via the agent's runtime.
  • SDK examples — use LocalConversation (in-process), not RemoteConversation.

No known caller is affected by the default trim. The ?include_skills=true escape hatch exists only for unknown custom integrations.
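For a custom integration that does depend on the field, opting back in amounts to adding one query parameter. A minimal sketch using only the standard library; the helper name with_include_skills and the conversation id abc123 are hypothetical, and only the include_skills=true parameter comes from this issue:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_include_skills(url: str) -> str:
    """Return `url` with include_skills=true set exactly once.

    Hypothetical client-side helper, not part of the SDK.
    """
    parts = urlsplit(url)
    # Drop any existing include_skills value so the parameter is not duplicated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "include_skills"
    ]
    query.append(("include_skills", "true"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# "abc123" is a placeholder conversation id.
legacy_url = with_include_skills("http://127.0.0.1:18000/api/conversations?ids=abc123")
print(legacy_url)
```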

Empirical measurements (honest)

Measured against agent-server on 127.0.0.1:18000 with a real 40-skill conversation. Two profile scripts are attached at the bottom of this issue.

Wire-level (rock solid, reproducible across 5 GETs each)

| Endpoint | Legacy (?include_skills=true) | Default (trimmed) | Delta |
| --- | --- | --- | --- |
| GET /api/conversations?ids=… (40-skill conv) | 265 KB, 3.3 ms server | 4.7 KB, 2.4 ms server | -98% bytes, -27% server time |

Browser cold-cold-cold (3 trials; servers killed, Vite cache wiped between trials)

| Trial | Baseline (full payload) | With trim | Per-trial delta |
| --- | --- | --- | --- |
| 1 | 5144 ms | 4889 ms | -255 ms |
| 2 | 4956 ms | 4990 ms | +34 ms |
| 3 | 4970 ms | 4883 ms | -87 ms |
| Median | 4970 ms | 4889 ms | -81 ms (-1.6%) |
| Mean | 5023 ms | 4921 ms | -102 ms (-2.0%) |

Run-to-run variance is ~200 ms and the trim's apparent effect is ~100 ms, so with only three trials the difference cannot be reliably distinguished from noise. The result is directionally consistent (2/3 trials favor the trim) but sits close to the noise floor.
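The summary rows and the noise comparison can be re-derived from the six trial timings. A small sketch using Python's statistics module; the variable names are illustrative and the numbers are copied from the trial table:

```python
from statistics import mean, median

# Cold-open wall-clock timings in milliseconds, one entry per trial.
baseline_ms = [5144, 4956, 4970]
trimmed_ms = [4889, 4990, 4883]

# Per-trial delta: negative means the trimmed payload was faster.
deltas = [t - b for b, t in zip(baseline_ms, trimmed_ms)]

median_delta = median(trimmed_ms) - median(baseline_ms)
# Computed from unrounded means this is about -102.7 ms; the table's
# -102 ms is the difference of the two rounded means (4921 - 5023).
mean_delta = mean(trimmed_ms) - mean(baseline_ms)

# Spread within each arm, as a rough stand-in for run-to-run noise.
baseline_spread = max(baseline_ms) - min(baseline_ms)
trimmed_spread = max(trimmed_ms) - min(trimmed_ms)

print("per-trial deltas:", deltas)
print("median delta:", median_delta, "ms")
print("mean delta:", round(mean_delta, 1), "ms")
print("spread (baseline, trimmed):", baseline_spread, trimmed_spread, "ms")
```

The baseline arm alone spans 188 ms, which is larger than either summary delta. That is why three trials are not enough to call the difference significant.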

Order-of-magnitude reconciliation:

  • Network transfer (260 KB → 5 KB on localhost): ~2-5 ms
  • JSON.parse (260 KB on V8): ~5-10 ms
  • React Query structural-share traversal: ~5-10 ms
  • Server-side serialization: ~1 ms
  • Expected total: ~15-25 ms. Observed mean: ~100 ms — bigger than back-of-envelope math, possibly due to React reconciliation / GC pressure amplification, possibly noise.
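Summing the per-component ranges makes the gap between estimate and observation explicit. A quick sketch; the component figures are the ones listed above, and 102 ms is the mean delta from the trial table:

```python
# (low, high) estimates in milliseconds, copied from the list above.
estimates_ms = {
    "network transfer": (2, 5),
    "JSON.parse": (5, 10),
    "structural-share traversal": (5, 10),
    "server-side serialization": (1, 1),
}

expected_low = sum(low for low, _ in estimates_ms.values())
expected_high = sum(high for _, high in estimates_ms.values())

observed_mean_ms = 102  # mean delta from the trial table

print(f"expected: {expected_low}-{expected_high} ms")
print(
    "observed / expected: "
    f"{observed_mean_ms / expected_high:.1f}x to {observed_mean_ms / expected_low:.1f}x"
)
```

The exact sum is 13-26 ms, which the list rounds to ~15-25 ms. The observed mean is several times larger than even the high end of that range.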

Where the trim actually wins

The browser cold-open wall-clock improvement is small on a fast localhost dev box. Where the optimization matters more:

  • Network bandwidth on cloud backends (100ms+ RTT) — 260 KB per fetch is a real cost; 4.7 KB is not. Big difference on slow/metered links.
  • Sustained polling load: useUserConversation refetches via refetchInterval. Over a long session the cumulative bytes drop ~98%.
  • Server-side CPU — 27% less serialization time per request. Adds up across many concurrent clients.
  • Client memory — 260 KB × N conversations held in React Query's cache → ~5 KB × N. Meaningful for users with many open conversations.
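To put a rough number on the polling case, the sketch below multiplies the measured payload sizes by the number of refetches in a session. The 5-second interval and one-hour session are assumptions chosen for illustration; this issue does not state the actual refetchInterval value.

```python
def session_transfer_kb(payload_kb: float, refetch_interval_s: float, session_s: float) -> float:
    """Total KB fetched by a poller over one session (hypothetical helper)."""
    fetches = int(session_s // refetch_interval_s)
    return payload_kb * fetches


REFETCH_INTERVAL_S = 5   # assumption: not taken from the canvas source
SESSION_S = 60 * 60      # assumption: a one-hour session

legacy_kb = session_transfer_kb(265, REFETCH_INTERVAL_S, SESSION_S)    # measured legacy payload
trimmed_kb = session_transfer_kb(4.7, REFETCH_INTERVAL_S, SESSION_S)   # measured trimmed payload
saved_fraction = 1 - trimmed_kb / legacy_kb

print(f"legacy:  {legacy_kb / 1000:.1f} MB per conversation per hour")
print(f"trimmed: {trimmed_kb / 1000:.1f} MB per conversation per hour")
print(f"saved:   {saved_fraction:.1%}")
```

The saved fraction does not depend on the assumed interval or session length, since both cancel out; only the absolute totals do.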

Implementation approach (route-boundary trim, breaking default)

Drop the skills array at the route boundary, in the FastAPI handler that emits ConversationInfo. The default response shape is the trimmed one; the include_skills: bool = False query parameter accepts true to restore the legacy shape.
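The core of a route-boundary trim can be sketched without any framework. The example below is not the actual handler code: it assumes the response is available as a plain dict shaped like agent.agent_context.skills, and the helper name trim_conversation_info is hypothetical. In a FastAPI handler the include_skills flag would come from the query parameter described above.

```python
from typing import Any


def trim_conversation_info(info: dict[str, Any], include_skills: bool = False) -> dict[str, Any]:
    """Return a wire-ready view of a serialized ConversationInfo.

    Hypothetical helper: the name and the dict-based shape are assumptions.
    """
    if include_skills:
        return info  # legacy shape, returned unchanged

    agent = info.get("agent")
    context = agent.get("agent_context") if isinstance(agent, dict) else None
    if not isinstance(context, dict) or not context.get("skills"):
        return info  # nothing to trim

    # Copy only the dicts along the path so the caller's object is never
    # mutated; stored and in-memory state keep the full skill list.
    return {
        **info,
        "agent": {**agent, "agent_context": {**context, "skills": []}},
    }


# Stand-in payload with a single placeholder skill.
stored = {
    "id": "abc123",
    "agent": {
        "kind": "Agent",
        "agent_context": {"skills": [{"name": "example-skill", "content": "..."}]},
    },
}

wire = trim_conversation_info(stored)
print(wire["agent"]["agent_context"]["skills"])
print(len(stored["agent"]["agent_context"]["skills"]))
```

Because the function builds new dicts only along the trimmed path, the source object stays the single source of truth and the trim applies only to what goes over the wire.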

Why route-boundary rather than in-model: an earlier exploration (closed PR #3302) attempted to trim via a @field_serializer("skills") on AgentContext. It worked but accumulated 5 commits / ~150 lines of guards (persistence opt-out, loading_from_snapshot flag, deep-copy snapshots, round_trip handling, config-drift detection, model_copy-merge in the resume path) because the model was trying to keep three different truths in sync (in-memory full, wire trimmed, persisted full). Moving the trim to the FastAPI route boundary collapses all of that: the model stays a single source of truth, and the optimization is a small handful of lines that only fires when the route handler is told to.

Why breaking-change default rather than opt-in: an earlier revision of #3316 defaulted to include-skills and required callers to opt into the trim. Per maintainer feedback, since no client actually reads the field from HTTP, the opt-in shape was carrying ~260 KB of payload that nothing consumes. Flipping the default is the right cost/benefit: every consumer gets the slimmer payload for free, and the rare legacy caller pays a documented opt-in.

Tracked across PRs

Profile scripts

Drop these in tools/ of a canvas worktree, then run from there:

tools/profile-conversation-open.mjs

REST waterfall in Node. Hits the three endpoints canvas calls on conversation open and times network / body-buffer / JSON.parse / structural-walk per phase. Reads the session API key from ~/.openhands/agent-canvas/session-api-key.txt.

node tools/profile-conversation-open.mjs <conversation-id>

tools/profile-conversation-browser.mjs

End-to-end via Playwright + Chromium with CDP for the network waterfall. Seeds canvas's openhands-backends / openhands-active-backend localStorage, navigates to the conversation, and measures wall-clock milestones (DOMContentLoaded → chat shell mounted → loading skeleton gone → first event painted). Distinguishes cold-cold full page reload from warm SPA switch.

For rigorous cold-cold-cold measurements:

  1. kill $(lsof -i :18000 -t) — stop agent-server (clears in-memory ConversationService cache).
  2. kill $(lsof -i :3001 -t) — stop canvas dev (clears Vite worker state).
  3. find ~/worktrees/<canvas>/node_modules/.vite -mindepth 1 -delete — wipe Vite pre-bundle cache.
  4. Start agent-server + canvas dev cold.
  5. Run the probe:

node tools/profile-conversation-browser.mjs <conversation-id>
