feat(acp): surface current_model_id + available_models from session responses#3347
Conversation
ACP servers report the model they're using via ``models.currentModelId``
on ``NewSessionResponse`` and ``LoadSessionResponse`` (UNSTABLE field in
the spec, but already implemented by claude-agent-acp, codex-acp and
gemini-cli-acp). The SDK was discarding this response field, leaving
clients with no way to know which model the ACP harness was actually
running.
Capture it during ``_init`` and expose it as ``ACPAgent.current_model_id``:
- If ``self.acp_model`` is set (caller forced an override via
``set_session_model`` or session ``_meta``), trust that.
- Otherwise, fall back to whatever the server reported on the response.
- ``None`` for older agents that don't surface the field — callers
treat the value as best-effort.
Motivates a follow-up in the OpenHands app_server to persist this as
``conversation.llm_model`` so the UI can display the model an ACP
conversation is actually running.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Python API breakage checks — ✅ PASSEDResult: ✅ PASSED |
REST API breakage checks (OpenAPI) — ✅ PASSEDResult: ✅ PASSED |
all-hands-bot
left a comment
There was a problem hiding this comment.
Taste Rating: 🟢 Good taste — focused, backward-compatible extraction of optional ACP session metadata.
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟢 LOW. This only surfaces already-returned ACP session metadata and does not change prompts, tool execution, dependencies, or the agent loop. Targeted tests passed locally (21 passed).
VERDICT: ✅ Worth merging.
This review was generated by an AI agent (OpenHands) on behalf of the user.
Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26229676594
Coverage Report •
|
|||||||||||||||||||||||||||||||||||||||||||||
The SDK exposes ``ACPAgent.current_model_id`` via a property backed by a
PrivateAttr, which is the right shape in-process — ``AgentBase`` is
frozen, and the value is per-session runtime state, not config. But
PrivateAttrs don't serialize, so the value can't ride out on the
``agent`` field of the API response, and downstream consumers (the
OpenHands app_server) currently have no way to read it.
Lift the value into a top-level ``current_model_id`` field on
``ConversationInfo``:
- Add the field to ``_ConversationInfoBase`` with ``default=None``.
- In ``_compose_conversation_info``, read ``state.agent.current_model_id``
via ``getattr`` (None for non-ACP agents and older ACP servers) and
pass it through.
- 6 new tests in ``tests/agent_server/test_conversation_info_model.py``
cover the lift logic, the no-model fallback, the native-agent path,
and the override-vs-server-report precedence.
Also: tighten docstrings on the SDK side to point readers at
``ConversationInfo.current_model_id`` when they need to cross the API
boundary.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
Verified ACPAgent against a local ACP stdio server: the PR now exposes server-reported and loaded session model IDs, and caller acp_model overrides take precedence.
Does this PR achieve its stated goal?
Yes. The PR’s goal is to surface the ACP models.currentModelId from NewSessionResponse and LoadSessionResponse via ACPAgent.current_model_id, while preserving explicit caller override precedence. Running real Conversation + ACPAgent flows against a fake ACP server showed base main had no current_model_id surface, while commit 7f5b2d73 returned qa-reported-model-42 for new sessions, qa-caller-override-model when overridden, and qa-loaded-session-model after resuming a persisted ACP session through load_session.
| Phase | Result |
|---|---|
| Environment Setup | ✅ uv sync --dev completed and created .venv; local ACP harness executed successfully. |
| CI Status | ⏳ At query time: 8 passing, 17 pending, 1 skipped; no failing checks observed. |
| Functional Verification | ✅ Real ACP stdio server + SDK conversation flows verified new, override, and resumed sessions. |
Functional Verification
Test 1: New ACP session surfaces server-reported current model
Step 1 — Establish baseline on origin/main:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_current_model.py reported after checking out origin/main:
scenario=reported
reported_property_before_run=<missing>
reported_agent_name=qa-acp-agent
reported_agent_version=1.0.0
reported_current_model_id=<missing>
This confirms the previous SDK could initialize and prompt the ACP server, but did not expose any current_model_id property/value.
Step 2 — Apply the PR's changes:
Checked out 7f5b2d73ca7f32decb4da2c14bbb798761ec3215.
Step 3 — Re-run with the fix in place:
Ran the same command:
scenario=reported
reported_property_before_run=None
reported_agent_name=qa-acp-agent
reported_agent_version=1.0.0
reported_current_model_id=qa-reported-model-42
This shows the new public property exists before init (None) and is populated from NewSessionResponse.models.currentModelId after a real conversation run.
Test 2: Caller acp_model override wins over server-reported model
Step 1 — Establish baseline on origin/main:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_current_model.py override:
scenario=override
override_property_before_run=<missing>
override_agent_name=qa-acp-agent
override_agent_version=1.0.0
override_current_model_id=<missing>
The old SDK accepted the ACP run but exposed no resolved model identifier.
Step 2 — Apply the PR's changes:
Checked out 7f5b2d73ca7f32decb4da2c14bbb798761ec3215.
Step 3 — Re-run with the fix in place:
Ran the same command with the fake server reporting qa-server-model-ignored and the SDK configured with acp_model="qa-caller-override-model":
scenario=override
override_property_before_run=None
override_agent_name=qa-acp-agent
override_agent_version=1.0.0
override_current_model_id=qa-caller-override-model
This confirms the resolved model follows the PR’s documented precedence: caller override beats the ACP server’s reported value.
Test 3: Resumed ACP session surfaces LoadSessionResponse current model
Step 1 — Establish baseline on origin/main:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_current_model.py load, which creates a persisted conversation, closes it, then reopens the same conversation ID with a fresh ACPAgent:
scenario=load
first_property_before_run=<missing>
first_agent_name=qa-acp-agent
first_agent_version=1.0.0
first_current_model_id=<missing>
loaded_property_before_run=<missing>
loaded_agent_name=qa-acp-agent
loaded_agent_version=1.0.0
loaded_current_model_id=<missing>
The log also showed the base SDK resumed the ACP session, but it still did not expose the loaded model value.
Step 2 — Apply the PR's changes:
Checked out 7f5b2d73ca7f32decb4da2c14bbb798761ec3215.
Step 3 — Re-run with the fix in place:
Ran the same persisted-conversation resume flow:
scenario=load
first_property_before_run=None
first_agent_name=qa-acp-agent
first_agent_version=1.0.0
first_current_model_id=qa-new-session-model
loaded_property_before_run=None
loaded_agent_name=qa-acp-agent
loaded_agent_version=1.0.0
loaded_current_model_id=qa-loaded-session-model
The SDK log included Resumed ACP session: qa-session-load, and the second run exposed the model returned by LoadSessionResponse, confirming the resumed-session path works end-to-end.
CI query
Ran gh pr checks 3347 --repo OpenHands/software-agent-sdk --json name,state,bucket,link,workflow:
CI summary: {'pending': 17, 'pass': 8, 'skipping': 1}
No test suites, linters, formatters, or type checkers were run locally; this was only a status query plus functional SDK execution.
Issues Found
None.
This review was created by an AI agent (OpenHands) on behalf of the user.
Verdict: PASS
Spins up each known ACP provider in turn (Claude Code, Codex) with and
without the ``acp_model`` override, runs ``init_state``, and prints the
resolved ``current_model_id`` so reviewers can confirm the end-to-end
extraction works against real ACP subprocesses.
Empirically verified against:
- claude-agent-acp 0.36.1:
no override → current_model_id = 'default' (Claude Code's
placeholder when no model is forced — honest
but not great UI text; tracked separately)
acp_model override → current_model_id = the override
- codex-acp 0.9.4:
no override → current_model_id = 'gpt-5.5/xhigh' (real model
the server is actually using)
acp_model override → current_model_id = the override
Requires the relevant ACP credentials in env / disk (CLAUDE_CODE_OAUTH_TOKEN
or ~/.claude.json for Claude Code, ~/.codex/auth.json or OPENAI_API_KEY for
Codex). Not wired into CI — those subprocesses pull real images and need
real auth.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
End-to-end validationRan
Raw script output: Worth flaggingClaude Code's session response carries
I'd lean toward (1) for now and (2) longer term, but happy to defer the call until this PR lands. To reproduce: uv run python scripts/validate_current_model_id.py(Needs |
Some ACP servers — notably claude-agent-acp — report ``currentModelId``
as an opaque alias (e.g. the literal string ``"default"``) and only
expose the human-readable name through the matching entry in
``models.availableModels``. Downstream clients that surface the model
in UI got either:
- the unhelpful raw id (``"default"``) by reading currentModelId, or
- had to implement the alias-resolution lookup themselves.
Do the lookup once in the SDK and expose it as a sibling property:
- ``_extract_session_models``: pulls both ``currentModelId`` and
``availableModels`` off a session response (replaces the
single-purpose ``_extract_current_model_id`` helper, which is kept
as a thin wrapper for backward compat).
- ``_resolve_model_name(current_id, available)``: maps the id to the
matching ``ModelInfo.name``; falls back to the raw id when the
lookup misses (codex-acp already returns concrete ids).
- ``ACPAgent._available_models`` (PrivateAttr) captured during
``_init`` alongside ``_current_model_id``.
- New ``ACPAgent.current_model_name`` read-only property; mirrors
``current_model_id``'s lifecycle and serialization caveats
(in-process; agent-server lifts to ``ConversationInfo``).
- ``ConversationInfo.current_model_name``: agent-server lifts the
resolved name onto the response so cross-process consumers don't
need to do the lookup themselves.
End-to-end against real ACP subprocesses (see updated
``scripts/validate_current_model_id.py``):
claude-agent-acp 0.36.1 default config:
current_model_id = 'default'
current_model_name = 'Default (recommended)' ← was 'default'
claude-agent-acp 0.36.1 + acp_model override:
current_model_id = 'claude-opus-4-1'
current_model_name = 'Opus 4.1' ← prettier
codex-acp 0.9.4 default config:
current_model_id = 'gpt-5.5/xhigh'
current_model_name = 'GPT-5.5 (xhigh)' ← prettier
codex-acp 0.9.4 + acp_model override:
current_model_id = 'gpt-5'
current_model_name = 'gpt-5' ← unchanged (no match)
Tracks agentclientprotocol/claude-agent-acp#699
which asks the upstream to consider returning the concrete model id
directly; this PR also unblocks the downstream UX in the meantime.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Updated end-to-end validation (commit `235fed007`)Added `_resolve_model_name` + `ACPAgent.current_model_name` + `ConversationInfo.current_model_name`. The lookup against `available_models` resolves opaque aliases at the SDK boundary, sidesteps agentclientprotocol/claude-agent-acp#699 without waiting on upstream, and also improves codex-acp where it turns out `availableModels` carries friendlier names too.
Raw output: ``` === Claude Code (acp_model='claude-opus-4-1' override) === === Codex (no override → server-reported) === === Codex (acp_model='gpt-5' override) === To reproduce: `uv run python scripts/validate_current_model_id.py` (needs the usual Claude Code / Codex auth). |
End-to-end validation with real LLM turnsRan a tiny prompt through 7 scenarios — 4 Claude Code variants, 3 Codex variants — and checked what the SDK reports vs. what the LLM provider actually billed.
✓ Every scenario populated both † Codex token counts are 0 because codex-acp doesn't emit the usage_update notifications the SDK tracks for Claude Code — out of scope for this PR; the prompts did execute (no Initial attempt + correctionFirst pass failed for Codex with Reproduce# In the SDK worktree, with CLAUDE_CODE_OAUTH_TOKEN / ~/.codex/auth.json set:
uv run python /tmp/validate_acp_e2e.pyThe script lives at |
The model fields surfaced on ``ConversationInfo.current_model_*`` were
only populated while the ACP subprocess was actively running — once the
conversation went idle (or the agent-server restarted), the live agent
instance reconstructed from disk had ``_current_model_id`` back at the
PrivateAttr default of ``None``. End-to-end this meant the conversation
chip on a freshly-listed sidebar showed the provider brand name
("Claude Code") instead of the model the harness had actually picked
("Default (recommended)" / "GPT-5.5 (xhigh)").
Mirror the existing ``acp_session_id`` / ``acp_session_cwd`` /
``acp_agent_name`` persistence pattern: at the end of ``ACPAgent._init``,
write ``acp_current_model_id`` and the pre-resolved
``acp_current_model_name`` into ``state.agent_state``. That dict IS
serialized to ``base_state.json`` and read back when the agent-server
rebuilds a ConversationState on demand.
In the agent-server's ``_compose_conversation_info``, prefer the live
agent attrs (freshest source within an active session) and fall back
to the persisted ``agent_state`` values, so the response carries the
right model regardless of whether the conversation is hot or cold.
Two new tests in ``test_conversation_info_model.py`` lock the new
behavior: cold conversations source from ``agent_state``; live
sessions take precedence when both are present.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pre-commit on CI flagged formatting drift in the three files touched by this PR — ruff prefers the compact single-line forms over the manually wrapped ones I introduced (the wrapping was redundant once the lines fit under the 88-column limit). No behavior change; 204 tests still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
all-hands-bot
left a comment
There was a problem hiding this comment.
Clean, well-structured addition — the dual-source fallback (live PrivateAttr → persisted agent_state) for cold reads is exactly the right pattern, and the serialization boundary documentation in the docstrings is clear. Two small suggestions below; neither is a blocker.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
The PR achieves its goal: ACP session responses now surface the active model through ACPAgent and ConversationInfo, including the live HTTP API path.
Does this PR achieve its stated goal?
Yes. I exercised a fake ACP server as a real ACP subprocess and confirmed that models.currentModelId was absent from usable SDK/server surfaces on main, then present on the PR branch. I also started the actual agent-server and made real HTTP requests to create/run an ACP conversation; after initialization, GET /api/conversations/{id} returned current_model_id: "default" and current_model_name: "Default (recommended)".
| Phase | Result |
|---|---|
| Environment Setup | ✅ uv sync --dev completed; agent-server started locally with uv run python -m openhands.agent_server |
| CI Status | 🟡 Latest check showed 20 successful, 0 failing, 6 pending, 3 skipped |
| Functional Verification | ✅ Verified SDK runtime property, caller override precedence, persisted server info, and live HTTP API response |
Functional Verification
Test 1: SDK ACP session response model extraction
Step 1 — Establish baseline without the fix:
Ran git switch --detach origin/main && uv run python /tmp/qa_acp_current_model.py against a temporary ACP subprocess returning models.currentModelId = "default":
{
"server_reported_model": {
"data": {
"current_model_id": "<missing>",
"current_model_name": "<missing>",
"agent_state_model_id": null,
"agent_state_model_name": null
},
"ok": true
},
"caller_override_model": {
"data": {
"current_model_id": "<missing>",
"current_model_name": "<missing>"
},
"ok": true
}
}This confirms the old behavior: the ACP server reported model state, but clients had no current_model_id/current_model_name surface to read.
Step 2 — Apply the PR's changes:
Checked out feat-acp-surface-current-model-id at e6e97713818dadd4d2f0ac0e327067240a7d2280.
Step 3 — Re-run with the fix in place:
Ran uv run python /tmp/qa_acp_current_model.py:
{
"server_reported_model": {
"data": {
"current_model_id": "default",
"current_model_name": "Default (recommended)",
"agent_state_model_id": "default",
"agent_state_model_name": "Default (recommended)"
},
"ok": true
},
"caller_override_model": {
"data": {
"current_model_id": "forced-model",
"current_model_name": "forced-model",
"agent_state_model_id": "forced-model",
"agent_state_model_name": "forced-model"
},
"ok": true
}
}This confirms the PR captures the server-reported model, resolves the display name from availableModels, persists it into agent_state, and respects caller-provided acp_model override precedence.
Test 2: Agent-server ConversationInfo JSON response
Step 1 — Establish baseline without the fix:
The same baseline run on main exercised server composition with cold persisted state:
{
"conversation_info_cold_state": {
"data": {
"has_current_model_id_field": false,
"current_model_id": "<missing>",
"current_model_name": "<missing>"
},
"ok": true
}
}This shows downstream HTTP clients could not receive a top-level current_model_id from ConversationInfo before the PR.
Step 2 — Apply the PR's changes:
Used the PR branch and started the real server: uv run python -m openhands.agent_server.
Step 3 — Exercise the live API with the fix in place:
Posted a real ACP conversation using a temporary ACP subprocess:
curl -X POST http://127.0.0.1:8000/api/conversations -H 'Content-Type: application/json' --data @/tmp/start_acp_conversation.json
# POST HTTP 201Then triggered/polled the live conversation and fetched GET /api/conversations/e933e901-2993-428b-90de-a46d597ccd06:
{
"id": "e933e901-2993-428b-90de-a46d597ccd06",
"execution_status": "running",
"current_model_id": "default",
"current_model_name": "Default (recommended)",
"agent_state_model_id": "default",
"agent_state_model_name": "Default (recommended)"
}This verifies the PR works through the real REST boundary that downstream consumers use, not just in-process SDK access.
Issues Found
None.
This review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
Taste Rating: 🟡 Acceptable — the metadata flow is mostly sound, but I found two issues to address before approval.
Risk: 🟡 Medium. This is low-risk API metadata plumbing overall, but resume-state correctness and a committed ad-hoc validation helper need cleanup.
This review was generated by an AI agent (OpenHands) on behalf of the user.
Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26234241675
…ic aliases
Surfacing ``ModelInfo.name`` for opaque ``currentModelId`` values like
claude-agent-acp's ``"default"`` produced equally-opaque chip text
("Default (recommended)") — the consumer wanted the *actual* underlying
model identity, which the same server already exposes one field over
in ``ModelInfo.description`` as the leading segment before " · "
("Opus 4.7 with 1M context · Most capable for complex work").
Treat a small set of generic-alias modelIds — ``default``, ``auto``,
``sonnet``, ``opus``, ``haiku`` — as the trigger: when ``current_model_id``
matches one of these, prefer the head of ``description`` over ``name``.
Concrete ids (codex-acp's ``gpt-5.5/xhigh``, explicit Anthropic versions,
…) continue to use ``name``, which is the right level of detail for
them.
Empirical resolution under the new logic:
claude-agent-acp default → "Opus 4.7 with 1M context"
claude-agent-acp acp_model=sonnet → "Sonnet 4.6"
claude-agent-acp acp_model=haiku → "Haiku 4.5"
codex-acp default → "GPT-5.5 (xhigh)" (unchanged)
codex-acp acp_model=gpt-5.4/low → "gpt-5.4 (low)" (unchanged)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The stats-streaming callback added in #3308 acquires the conversation state lock to construct a ConversationStateUpdateEvent: state = self._conversation._state with state: event = ConversationStateUpdateEvent(key="stats", value=state.stats) self._emit_event_from_thread(event) This callback is invoked synchronously by ``Telemetry.on_response`` (the regular Agent path) and ``ACPAgent._record_usage`` (the ACP path). Both call sites run inside ``LocalConversation.run()``'s ``with self._state:`` block, so the calling thread already owns ``state``'s FIFOLock. FIFOLock documents itself as reentrant, so on paper the re-acquisition should be a no-op — in practice, on the ACP code path it deadlocks (silently) before ``step()`` can emit the assistant's FinishAction / ObservationEvent, leaving every short text-only conversation hung in ``running`` status forever. Bisected by adding ``logger.info`` lines through ``step()`` and ``_record_usage`` until the last one to fire was ``"invoking stats_update_callback"`` — the next line ("returned") never landed. Adding logs inside ``stats_callback`` confirmed the hang is on the ``with state:`` re-entry. The fix is to drop the redundant lock acquisition: we're only reading ``state.stats`` once for event-construction (GIL-atomic) and the real persistence still happens via ``_emit_event_from_thread`` → a ``locked_on_event`` closure that takes the lock on its own executor thread. The inner ``with state:`` was both unnecessary and the source of the deadlock. Regression tests in ``TestStatsCallbackNoDeadlock`` verify the callback returns promptly both with and without a caller holding the state lock. The latter case pre-fix hangs forever — pytest timeout cap is set so CI doesn't sit on it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…odel-id Pull in the deadlock fix from PR #3349 so dev stacks pinned to this PR also get a working ACP path (otherwise every ACP conversation hangs on the redundant `with state:` in stats_callback before our chip data ever surfaces in the UI).
…ent-model-id # Conflicts: # tests/agent_server/test_event_service.py
- Preserve persisted acp_current_model_* on resume when load_session omits the UNSTABLE ``models`` field, so the chip survives restarts even on servers that only attach models to new_session responses. - Use the existing self.current_model_name property in init_state instead of re-calling _resolve_model_name with the raw args. - Remove unused _extract_current_model_id helper; production reads via _extract_session_models. Tests migrated to the same. - Delete scripts/validate_current_model_id.py — was an internal manual smoke test that always exited 0, gave no CI signal. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
all-hands-bot
left a comment
There was a problem hiding this comment.
The design here is solid: capturing models.currentModelId via getattr-based tolerance for the UNSTABLE ACP capability, backing current_model_id/current_model_name by PrivateAttr (correct given the frozen AgentBase), using the dual-source fallback (live PrivateAttr → persisted agent_state) to survive cold reads and server restarts, and lifting the values onto ConversationInfo through getattr so non-ACP agents are no-ops. One test-correctness bug to fix before merging.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.
Python's class shadowing silently hid the first class's 6 edge-case tests (None response, empty string id, non-string id, etc.) — they never ran. Merge into a single class so ruff F811 stops firing and all 9 cases execute. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
all-hands-bot
left a comment
There was a problem hiding this comment.
Taste Rating: 🟡 Acceptable — the metadata plumbing is useful, but there is one state-carryover bug to fix before I’d be comfortable approving.
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟡 MEDIUM. This is mostly low-risk API metadata surfacing, but incorrect persisted runtime state can mislead clients after ACP resume fallback.
VERDICT: COMMENT — please address the stale model-state path below.
This review was generated by an AI agent (OpenHands) on behalf of the user.
Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26278971337
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
Verified the ACP model-surfacing feature by running real SDK conversations and local agent-server HTTP flows against a temporary ACP JSON-RPC server; the PR surfaces and persists the current model as claimed.
Does this PR achieve its stated goal?
Yes. On main, the same fake ACP server reported models.currentModelId, but SDK/agent-server responses exposed no current_model_id/current_model_name; on this PR, a real Conversation.run() and /api/acp/conversations flow surfaced default plus the resolved display name Opus 4.7 with 1M context. I also verified caller override precedence (acp_model="gpt-5.5/xhigh") and that model metadata survives an agent-server restart via the persisted conversation state.
| Phase | Result |
|---|---|
| Environment Setup | ✅ uv sync --dev completed; no test suites/linters/pre-commit were run locally. |
| CI Status | unresolved-review-threads); pre-commit was in progress. |
| Functional Verification | ✅ SDK and agent-server behavior matched the PR goal end-to-end. |
Functional Verification
Test 1: SDK Conversation.run() surfaces ACP-reported model state
Step 1 — Establish baseline on origin/main:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_sdk_check.py against a temporary ACP server whose session/new response included models.currentModelId="default" and availableModels metadata:
QA_RESULT={"agent_state": {"acp_agent_name": "claude-agent-acp", "acp_agent_version": "qa-fake", "acp_session_cwd": "/tmp/qa-acp-workspace-5amso6be", "acp_session_id": "qa-session"}, "current_model_id": null, "current_model_name": null, "has_current_model_id_attr": false, "has_current_model_name_attr": false, "status": "ConversationExecutionStatus.FINISHED"}This confirms the previous behavior: ACP session metadata was accepted by the server, but the SDK agent exposed no current-model fields and did not persist them in agent_state.
Step 2 — Apply the PR's changes:
Checked out feat-acp-surface-current-model-id at 11d2e98f56c15738fbec00b03817a790c527fd5d.
Step 3 — Re-run with the fix in place:
Ran the same command:
QA_RESULT={"agent_state": {"acp_agent_name": "claude-agent-acp", "acp_agent_version": "qa-fake", "acp_current_model_id": "default", "acp_current_model_name": "Opus 4.7 with 1M context", "acp_session_cwd": "/tmp/qa-acp-workspace-qs6p6csz", "acp_session_id": "qa-session"}, "current_model_id": "default", "current_model_name": "Opus 4.7 with 1M context", "has_current_model_id_attr": true, "has_current_model_name_attr": true, "status": "ConversationExecutionStatus.FINISHED"}This shows the PR captures models.currentModelId, resolves the alias display name from availableModels, exposes both values on the running ACP agent, and persists them in agent_state.
Test 2: Caller acp_model override wins over server-reported model
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_sdk_override_check.py with ACPAgent(..., acp_model="gpt-5.5/xhigh") while the fake server still reported currentModelId="default":
QA_OVERRIDE_RESULT={"agent_state": {"acp_agent_name": "claude-agent-acp", "acp_agent_version": "qa-fake", "acp_current_model_id": "gpt-5.5/xhigh", "acp_current_model_name": "GPT-5.5 (xhigh)", "acp_session_cwd": "/tmp/qa-acp-override-workspace-shdjldmq", "acp_session_id": "qa-session"}, "current_model_id": "gpt-5.5/xhigh", "current_model_name": "GPT-5.5 (xhigh)", "status": "ConversationExecutionStatus.FINISHED"}This confirms the documented precedence: caller-provided acp_model overrides the ACP server's reported current model.
Test 3: agent-server HTTP API surfaces model fields after ACP initialization
Step 1 — Establish baseline on origin/main:
Started a local agent-server with uv run python -m uvicorn openhands.agent_server.api:api, created an ACP conversation via POST /api/acp/conversations, then polled GET /api/acp/conversations/{id} until finished:
GET_STATUS_FIELDS={"agent_state": {"acp_agent_name": "claude-agent-acp", "acp_agent_version": "qa-fake", "acp_session_cwd": "/tmp/qa-acp-http-workspace-rb93haz9", "acp_session_id": "qa-session"}, "current_model_id": null, "current_model_name": null, "execution_status": "finished", "id": "f4571fb7-ef4c-4dee-9a96-db04b624d906"}The API response did not surface model information before this PR.
Step 2 — Re-run with the PR branch:
Ran the same local HTTP flow on feat-acp-surface-current-model-id:
POST_STATUS_FIELDS={"agent_state": {}, "current_model_id": null, "current_model_name": null, "execution_status": "idle", "id": "2419d8ee-d390-420c-8986-c32cf195cf8e"}
GET_STATUS_FIELDS={"agent_state": {"acp_agent_name": "claude-agent-acp", "acp_agent_version": "qa-fake", "acp_current_model_id": "default", "acp_current_model_name": "Opus 4.7 with 1M context", "acp_session_cwd": "/tmp/qa-acp-http-workspace-t6t6_2cm", "acp_session_id": "qa-session"}, "current_model_id": "default", "current_model_name": "Opus 4.7 with 1M context", "execution_status": "finished", "id": "2419d8ee-d390-420c-8986-c32cf195cf8e"}The immediate create response is still null while initialization has not happened yet, but the completed conversation info response exposes the model fields as intended.
Test 4: model fields survive agent-server restart / cold read
Baseline on origin/main:
Created a conversation, stopped the server, restarted it, and fetched the same conversation:
LIVE_STATUS_FIELDS={"agent_state": {"acp_agent_name": "claude-agent-acp", "acp_agent_version": "qa-fake", "acp_session_cwd": "/tmp/qa-acp-restart-workspace-zvheuh8m", "acp_session_id": "qa-session"}, "current_model_id": null, "current_model_name": null, "execution_status": "finished", "id": "b1a4769a-793c-4eb0-8ad2-1d08e744f189"}
COLD_STATUS_FIELDS={"agent_state": {"acp_agent_name": "claude-agent-acp", "acp_agent_version": "qa-fake", "acp_session_cwd": "/tmp/qa-acp-restart-workspace-zvheuh8m", "acp_session_id": "qa-session"}, "current_model_id": null, "current_model_name": null, "execution_status": "finished", "id": "b1a4769a-793c-4eb0-8ad2-1d08e744f189"}PR branch:
Repeated the same restart flow:
LIVE_STATUS_FIELDS={"agent_state": {"acp_agent_name": "claude-agent-acp", "acp_agent_version": "qa-fake", "acp_current_model_id": "default", "acp_current_model_name": "Opus 4.7 with 1M context", "acp_session_cwd": "/tmp/qa-acp-restart-workspace-_vmx7636", "acp_session_id": "qa-session"}, "current_model_id": "default", "current_model_name": "Opus 4.7 with 1M context", "execution_status": "finished", "id": "8e962e32-680a-40da-9c5b-a947671da524"}
COLD_STATUS_FIELDS={"agent_state": {"acp_agent_name": "claude-agent-acp", "acp_agent_version": "qa-fake", "acp_current_model_id": "default", "acp_current_model_name": "Opus 4.7 with 1M context", "acp_session_cwd": "/tmp/qa-acp-restart-workspace-_vmx7636", "acp_session_id": "qa-session"}, "current_model_id": "default", "current_model_name": "Opus 4.7 with 1M context", "execution_status": "finished", "id": "8e962e32-680a-40da-9c5b-a947671da524"}This verifies the top-level ConversationInfo fields still populate after restart, not only while the original ACP subprocess is alive.
Issues Found
None from functional QA.
This review was created by an AI agent (OpenHands) on behalf of the user.
Final verdict: PASS
Previous fix preserved persisted acp_current_model_* whenever _current_model_id was None, which also kept stale values across cwd mismatches and load_session failures — both fall through to a fresh new_session whose response may omit the UNSTABLE models block. The chip would then describe the dead session. Distinguish using the session-id comparison: when init_state's prior acp_session_id matches what _start_acp_server settled on, we truly resumed and keep persisted model state; otherwise it was a fresh replacement and we clear the model keys. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
[Automatic Post]: It has been a while since there was any activity on this PR. @simonrosenberg, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up. This comment was created by an AI agent (OpenHands) on behalf of the user. |
Expose the ACP session's model list verbatim and let clients resolve the display label, rather than deriving a name server-side. Previously the SDK read both ``currentModelId`` and ``availableModels`` from the session response, but only used the list internally to prettify a single ``current_model_name`` string (via ``_resolve_model_name`` + a hardcoded ``_GENERIC_MODEL_ALIASES`` table) — then dropped the list. That curation table is the same kind of static knowledge that drifts, and it was redundant: a client with ``current_model_id`` + the full list resolves the label itself with a trivial id lookup. Changes: - New ``ACPModelInfo`` DTO (``openhands/sdk/agent/acp_models.py``): a stable, normalized mirror of the ACP protocol ``ModelInfo``. Lives in its own leaf module so the agent-server can import it for ``ConversationInfo`` without eagerly importing ``ACPAgent`` (which would register it in the agent DiscriminatedUnion). ``models`` is UNSTABLE upstream, so we re-map at the SDK boundary instead of leaking the vendored ``acp.schema`` type onto the public REST API. - ``ACPAgent``: drop ``current_model_name`` / ``_resolve_model_name`` / ``_GENERIC_MODEL_ALIASES``; add an ``available_models`` property. ``_extract_session_models`` now normalizes to ``list[ACPModelInfo]``. - agent-server: ``ConversationInfo.current_model_name`` -> ``available_models: list[ACPModelInfo]``; persist ``acp_available_models`` (was ``acp_current_model_name``) so the list survives cold reads, letting clients resolve the label (and render a picker) for idle/restored conversations. This is also the data source the unified ACP model picker (agent-canvas#769) needs, replacing the static client-side model lists. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Integrate the model-state surfacing (#3347) with the runtime model-switch work that landed on main (#3390). Conflicts in acp_agent.py, resolved by adapting #3347's model capture onto main's refactored `_init` (which now sets `_conn`/`_process` inside and returns only success-only fields): `_init` now also returns the resolved `current_model_id` + normalized `available_models`, and the fresh-session branch captures them from `new_session` while the resume branch keeps main's `_reapply_session_model_on_resume`. Integration fixes prompted by review of #3347 against the merged #3390: - ACPAgent.set_acp_model now refreshes `_current_model_id` on a successful live switch, so the chip/picker (ConversationInfo.current_model_id) reflects the new model instead of the stale session-start value. The shallow model_copy in switch_acp_model carries the updated PrivateAttr. - LocalConversation.switch_acp_model keeps the persisted `acp_current_model_id` hint in sync so cold list reads after a switch (pre re-init) aren't stale. - Add ACPAgent.supports_runtime_model_switch (resolved from the detected provider) and surface it on ConversationInfo (persisted in agent_state so cold reads are correct), telling the inline picker whether to offer a live-switch control. Regression tests: switch refreshes current_model_id + persisted hint; supports_runtime_model_switch property + ConversationInfo lift. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- P1: resume no longer erases persisted available_models. The list persistence is now gated on actually receiving a list this launch (independent of current_model_id, which can be non-null from a forced acp_model after a switch even when load_session omits the UNSTABLE models block). On a true resume the previously persisted list is preserved; it's cleared only on a fresh non-resumed replacement. - P1: switch_acp_model persists acp_current_model_id unconditionally. A successful live switch is authoritative even for an older/custom server that reported no models at init (so the hint key may not exist yet) — cold reads after restart now reflect the switch. - P2: _extract_session_models drops entries without a usable model_id instead of surfacing model_id="" (an invalid picker option / switch target). - Concern: supports_runtime_model_switch now mirrors set_acp_model's gate — refuse only for a known provider that declares no support, attempt optimistically for unknown/custom servers (so a switch-capable custom server isn't blocked from the picker), and False before a session exists. Regression tests for each. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ioms Address a follow-up review (most items already fixed in the prior commit): - _compose_conversation_info now falls back to the agent's authoritative acp_model as the final source for current_model_id, so a cold list read of a switched/overridden conversation whose server never surfaced `models` still resolves the chip instead of returning None. - Document the deliberate read strategies: available_models keeps the `or` idiom on purpose (the property returns [] — never None — for both cold-read and live-empty agents, so an `is None` check would drop the persisted picker payload on cold reads); supports_runtime_model_switch reads persisted-only because it's static per provider with no live-vs-persisted distinction. Regression test: cold read falls back to acp_model. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ionInfo Per review: state explicitly that current_model_id is not guaranteed to be a member of available_models (a forced acp_model override can name a model absent from the list), so clients must treat a lookup miss as "show the raw id"; and that some entries are opaque aliases whose human identity lives in description (e.g. claude-agent-acp's "default"). This is the contract the move to client-side label resolution hands to the canvas#769 picker. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
…tion The field description still claimed False for unknown/custom ACP servers, contradicting the now-optimistic behavior (the property mirrors set_acp_model, which attempts the switch for unknown/custom providers). Since this feeds the public API / OpenAPI / client contract, align it: True for known switch-capable providers and optimistically for unknown/custom (OpenHands attempts, a rejection surfaces as an error); False for native agents, known-unsupported providers, and before a session exists. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
all-hands-bot
left a comment
There was a problem hiding this comment.
Code Review: feat(acp): surface current_model_id + available_models from session responses
Taste Rating: 🟡 Acceptable — the design is sound and prior review issues are addressed, but there is one API documentation inconsistency to fix before this ships.
Summary:
The UNSTABLE-boundary isolation via a dedicated leaf module (), the dual-source fallback pattern () for cold reads, and the heuristic for correct persistence on non-resume resets are all well thought out. The helper is appropriately defensive. All six previously raised inline issues are resolved.
[IMPROVEMENT OPPORTUNITY]
The field description on mis-states the semantics — see inline comment.
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟢 LOW. This is purely additive metadata plumbing. Non-ACP agents and clients that don't read the new fields are completely unaffected. The only risk vector is the ACP session init/resume path, which is covered by the new tests.
VERDICT: ✅ Worth merging after fixing the one documentation issue.
KEY INSIGHT: Isolating the UNSTABLE protocol type behind a stable DTO at the SDK boundary is exactly the right move — clients get a stable contract regardless of upstream spec churn.
Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:
- Add a
.agents/skills/custom-codereview-guide.mdfile to your branch with the/codereviewtrigger and the context the reviewer is missing.- Re-request a review — the reviewer reads guidelines from the PR branch, so your changes take effect immediately.
Was this review helpful? React with 👍 or 👎 to give feedback.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
Verified the ACP model metadata feature end-to-end through SDK execution and the live agent-server HTTP API; the PR surfaces current/available model state and keeps it current after runtime switching and server restart.
Does this PR achieve its stated goal?
Yes. The PR set out to surface ACP currentModelId, availableModels, and runtime-switch capability to SDK/server clients. I exercised a real ACPAgent conversation against a local JSON-RPC ACP server, then used the live agent-server API to create/run/get/switch an ACP conversation; the PR branch returned current_model_id, available_models, and supports_runtime_model_switch, updated the surfaced model to model-b after switch_acp_model, and preserved those fields after restarting the server.
| Phase | Result |
|---|---|
| Environment Setup | ✅ uv run created/used the project env; SDK and ACP imports worked; local agent-server started on 127.0.0.1:8765. |
| CI Status | ⚪ Refreshed checks showed 12 successful, 1 skipped, 13 pending, 0 failing. |
| Functional Verification | ✅ SDK properties, ConversationInfo fields, HTTP API response fields, model switching, and cold persisted reads all behaved as intended. |
Functional Verification
Test 1: SDK + ConversationInfo before/after ACP model metadata surfacing
Step 1 — Establish baseline on main:
Ran git checkout main && uv run python /tmp/qa_acp_model_surface.py against a local ACP server that reports models.currentModelId = model-a and two available models.
Relevant output:
{
"initial": {
"agent_has_current_model_id_property": false,
"agent_current_model_id": null,
"agent_available_models": null,
"conversation_info_has_current_model_id": false,
"conversation_info_current_model_id": null,
"conversation_info_available_models": null,
"conversation_info_supports_runtime_model_switch": null,
"agent_state_model_fields": {}
},
"after_switch": {
"switch_call_succeeded": true,
"agent_current_model_id_after_switch": null,
"conversation_info_current_model_id_after_switch": null,
"agent_state_current_model_after_switch": null
}
}This confirms the old behavior: the ACP server reported model state, but SDK/server-facing objects did not expose it, and a successful runtime switch did not update any surfaced model field.
Step 2 — Apply the PR changes:
Checked out feat-acp-surface-current-model-id at bd2f553882a36420feb941b8009bb8348bf7646e.
Step 3 — Re-run with the fix in place:
Ran uv run python /tmp/qa_acp_model_surface.py with the same local ACP server.
Relevant output:
{
"initial": {
"agent_has_current_model_id_property": true,
"agent_current_model_id": "model-a",
"agent_available_models": [
{"model_id": "model-a", "name": "Model A", "description": "Primary QA model"},
{"model_id": "model-b", "name": "Model B", "description": "Switch target QA model"}
],
"conversation_info_current_model_id": "model-a",
"conversation_info_supports_runtime_model_switch": true,
"agent_state_model_fields": {
"acp_current_model_id": "model-a",
"acp_supports_runtime_model_switch": true
}
},
"after_switch": {
"switch_call_succeeded": true,
"agent_current_model_id_after_switch": "model-b",
"conversation_info_current_model_id_after_switch": "model-b",
"agent_state_current_model_after_switch": "model-b"
}
}This shows the SDK now exposes the reported model metadata, ConversationInfo carries it, and a real switch_acp_model("model-b") operation refreshes both live and persisted surfaced state.
Test 2: Live agent-server HTTP API behavior and cold restart persistence
Step 1 — Establish baseline without an initialized session:
Started the server with OPENHANDS_SUPPRESS_BANNER=1 uv run python -m openhands.agent_server --host 127.0.0.1 --port 8765, then posted a real ACP conversation:
curl -sS -X POST http://127.0.0.1:8765/api/conversations -H 'Content-Type: application/json' --data @/tmp/qa_start_conv.jsonInitial response before ACP session initialization:
{
"execution_status": "idle",
"current_model_id": null,
"available_models": [],
"supports_runtime_model_switch": false
}This is expected before the ACP session has started; no model state has been learned yet.
Step 2 — Exercise the PR path through HTTP:
The initial message auto-started the conversation; after it finished, I called the live switch endpoint:
curl -sS -X POST http://127.0.0.1:8765/api/conversations/$CID/switch_acp_model -H 'Content-Type: application/json' --data '{"model":"model-b"}'Then fetched the conversation:
curl -sS http://127.0.0.1:8765/api/conversations/$CIDRelevant response:
{
"execution_status": "finished",
"current_model_id": "model-b",
"available_models": [
{"model_id": "model-a", "name": "Model A", "description": "Primary QA model"},
{"model_id": "model-b", "name": "Model B", "description": "Switch target QA model"}
],
"supports_runtime_model_switch": true
}This verifies the user-facing REST response includes the metadata clients need for the chip/picker and reflects the switched model.
Step 3 — Verify cold persisted reads:
Stopped and restarted the local agent-server, then fetched the same conversation again:
curl -sS http://127.0.0.1:8765/api/conversations/$CIDRelevant response after restart:
{
"execution_status": "finished",
"current_model_id": "model-b",
"available_models": [
{"model_id": "model-a", "name": "Model A", "description": "Primary QA model"},
{"model_id": "model-b", "name": "Model B", "description": "Switch target QA model"}
],
"supports_runtime_model_switch": true
}This confirms the persisted agent_state fallback works: the model chip/picker data survives a server restart even before a new ACP init call repopulates private attrs.
Issues Found
None.
Final verdict: PASS.
This review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
Taste Rating: 🟡 Acceptable — useful additive metadata plumbing, but one resume-state edge case can surface stale picker options.
Risk: 🟡 Medium (ACP resume metadata persistence; no dependency or agent-loop changes).
Verdict: COMMENT — please fix the inline issue before approval.
This review was generated by an AI agent (OpenHands) on behalf of the user.
Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26517437952
Distinguish an absent `models` block from an explicit empty `availableModels` on resume. `_extract_session_models` now returns `available_models=None` when the (UNSTABLE) block is absent and `[]` when the server reports it but offers no models; `_available_models` and the persistence gate key off that `None` sentinel instead of a truthy check: - absent (None) on a true resume -> preserve the persisted list (the block is often omitted from load_session responses); - absent on a fresh non-resumed replacement -> clear (stale); - reported, incl. explicit [] -> overwrite, so a server that dropped its models stops advertising stale picker options. The public `available_models` property still always returns a list (coerces the None sentinel to []). Adds a regression test for the explicit-empty resume path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
all-hands-bot
left a comment
There was a problem hiding this comment.
🟢 Good taste — clean, backward-compatible extraction of optional ACP session metadata. All previously raised issues have been addressed in this version.
Summary
This PR surfaces current_model_id, available_models, and supports_runtime_model_switch from ACP session responses through a well-designed two-source fallback chain (live PrivateAttr → persisted agent_state) that handles warm reads, cold list reads, and resume correctly.
The fix in f511eb950 correctly addressed the final review concern: _extract_session_models now returns (None, None) when the models block is absent (vs (id, []) when present-but-empty), giving init_state the sentinel it needs to distinguish "preserve the persisted picker list on resume" from "clear the stale list on fresh session replacement".
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟢 LOW. Purely additive changes — three new fields onConversationInfo(all optional with safe defaults), no behavior change for native OpenHands agents, no agent-loop or protocol changes. ACP-specific runtime state lives inPrivateAttrs and does not affect serialization/deserialization of existing fields. The test suite adds 750+ lines covering the full persistence state machine.
VERDICT:
✅ Worth merging — the design is sound: ACPModelInfo correctly isolated to its own module to avoid discriminated-union registration side effects; the None/[] sentinel properly threads through _extract_session_models → _start_acp_server → init_state persistence; switch_acp_model unconditionally writes the persisted hint so cold reads after a live switch are correct.
KEY INSIGHT:
The or-chain fallback in _compose_conversation_info is safe precisely because init_state already overwrites acp_available_models with [] whenever the server explicitly reports an empty list — so "live empty" and "cold read of a session whose server reported none" both correctly yield [] without surfacing a prior session's stale model list.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
Verified with a real temporary ACP server, SDK conversation execution, and local agent-server HTTP requests that ACP model metadata now surfaces and updates as intended.
Does this PR achieve its stated goal?
Yes. The PR set out to expose ACP currentModelId, availableModels, and runtime-switch capability to SDK/server clients instead of discarding them. On main, the same fake ACP server reported model state but SDK/ConversationInfo consumers saw no model fields; on this PR, the SDK properties, persisted agent_state, ConversationInfo, /api/conversations/{id}, /api/conversations/search, and switch_acp_model all surfaced the expected model metadata and updated current_model_id after a live switch.
| Phase | Result |
|---|---|
| Environment Setup | ✅ uv sync --dev completed; local agent-server started successfully for HTTP verification. |
| CI Status | 🟡 Current checks show no failures; many checks are successful, with agent-server image/binary and REST API jobs still in progress at QA time. |
| Functional Verification | ✅ Real ACP session init/run, HTTP API reads, cold-read persistence simulation, forced override, omitted-models fallback, and runtime switch all behaved as described. |
Functional Verification
Test 1: SDK ACP session surfaces model state and preserves it for cold reads
Step 1 — Establish baseline on main:
Ran git checkout main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_model_surface.py against a temporary ACP server whose new_session response included models.currentModelId = "default" and two availableModels entries.
Observed excerpt:
{
"branch": "main",
"has_current_model_property": false,
"current_model_id": null,
"available_models": [],
"agent_state": {
"acp_agent_name": "qa-custom-acp",
"acp_agent_version": "1.0.0",
"acp_session_id": "qa-session-1"
},
"conversation_info_has_current_model_id": false,
"conversation_info_current_model_id": null,
"cold_conversation_info_current_model_id": null,
"after_switch_current_model_id": null,
"switch_ok": true
}This confirms the old behavior: even though the ACP server reported model state and runtime switching succeeded, SDK/server consumers had no surfaced current model or model list.
Step 2 — Apply PR changes:
Checked out feat-acp-surface-current-model-id at current PR head 7e0f74c0693e9f94d7152a2c778fb36a96d0a609.
Step 3 — Re-run with the PR:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_model_surface.py.
Observed excerpt:
{
"branch": "feat-acp-surface-current-model-id",
"has_current_model_property": true,
"current_model_id": "default",
"available_models": [
{"model_id": "default", "name": "Default (recommended)"},
{"model_id": "sonnet", "name": "Sonnet"}
],
"agent_state": {
"acp_current_model_id": "default",
"acp_supports_runtime_model_switch": true
},
"after_switch_current_model_id": "sonnet",
"after_switch_agent_state": {"acp_current_model_id": "sonnet"},
"conversation_info_current_model_id": "sonnet",
"cold_conversation_info_current_model_id": "sonnet",
"cold_conversation_info_supports_runtime_model_switch": true
}This verifies the feature end-to-end in the SDK path: the ACP response model data is captured, persisted, survives a JSON round-trip/cold-read simulation, and runtime switching updates the surfaced model id.
Test 2: Agent-server HTTP API exposes the new ConversationInfo fields
Step 1 — Baseline:
The main harness above established that ConversationInfo did not expose current_model_id / available_models fields to clients.
Step 2 — Apply PR changes and start the actual server:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python -m openhands.agent_server --port 8765, then created and ran a conversation through HTTP:
curl -H 'Content-Type: application/json' -d @/tmp/qa_create_conv.json http://127.0.0.1:8765/api/acp/conversations
curl -X POST http://127.0.0.1:8765/api/conversations/$CID/run
curl http://127.0.0.1:8765/api/conversations/$CIDObserved excerpt after run:
{
"agent_state": {
"acp_current_model_id": "default",
"acp_available_models": [
{"model_id": "default", "name": "Default (recommended)"},
{"model_id": "sonnet", "name": "Sonnet"}
],
"acp_supports_runtime_model_switch": true
},
"current_model_id": "default",
"available_models": [
{"model_id": "default", "name": "Default (recommended)"},
{"model_id": "sonnet", "name": "Sonnet"}
],
"supports_runtime_model_switch": true
}Then ran:
curl -X POST -H 'Content-Type: application/json' -d '{"model":"sonnet"}' http://127.0.0.1:8765/api/conversations/$CID/switch_acp_model
curl http://127.0.0.1:8765/api/conversations/$CID
curl 'http://127.0.0.1:8765/api/conversations/search?limit=5'Observed excerpt after switching:
{
"agent_state": {"acp_current_model_id": "sonnet"},
"current_model_id": "sonnet",
"available_models": [
{"model_id": "default", "name": "Default (recommended)"},
{"model_id": "sonnet", "name": "Sonnet"}
],
"supports_runtime_model_switch": true
}The same sonnet state also appeared in /api/conversations/search, confirming the user-facing list endpoint has the picker/chip data.
Test 3: Forced override and older/omitted-models behavior
Step 1 — Baseline:
On main, even successful ACP runtime switches left after_switch_current_model_id: null, so forced/current model metadata was not externally available.
Step 2 — Apply PR changes:
Ran the same fake ACP server on the PR branch with a forced model and with the server omitting the unstable models block:
QA_ACP_OMIT_MODELS=1 QA_FORCE_ACP_MODEL=legacy-model OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_model_surface.pyObserved excerpt:
{
"forced_model": "legacy-model",
"current_model_id": "legacy-model",
"available_models": [],
"agent_state": {"acp_current_model_id": "legacy-model"},
"after_switch_current_model_id": "sonnet",
"after_switch_agent_state": {"acp_current_model_id": "sonnet"},
"conversation_info_current_model_id": "sonnet",
"conversation_info_available_models": [],
"cold_conversation_info_current_model_id": "sonnet"
}This confirms the documented edge cases: an explicit acp_model is surfaced even when the server omits models, available_models remains empty rather than fabricated, and a later successful switch persists the new authoritative model for cold reads.
Issues Found
None.
This review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
Taste Rating: 🟡 Acceptable — the metadata plumbing is useful, but two resume/override paths can still expose stale or unverified current-model state.
Risk: 🟡 Medium (ACP session metadata correctness; additive API surface, but misleading model state affects picker/chip behavior).
Verdict: COMMENT — please address the inline correctness issues before approval.
This review was generated by an AI agent (OpenHands) on behalf of the user.
Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26519069349
| # resume whose ``load_session`` omitted the UNSTABLE ``models`` block. | ||
| if self._current_model_id is not None: | ||
| new_agent_state["acp_current_model_id"] = self._current_model_id | ||
| elif not truly_resumed: |
There was a problem hiding this comment.
🟠 Important: On a true resume, this preserves the old acp_current_model_id whenever the resumed response has a models block but no usable currentModelId (for example currentModelId: "" with availableModels: []). The list gets cleared via _available_models is not None, but the stale current id remains in agent_state, so ConversationInfo can still show the old chip even though the server explicitly reported no current/available model. When a models block was reported and no current id was extracted (and no forced acp_model supplied it), clear acp_current_model_id instead of preserving it.
| # forced one via ``acp_model``, trust that; otherwise fall back to | ||
| # whatever the server reported in ``models.currentModelId``. Older | ||
| # agents that don't surface the field leave it ``None``. | ||
| current_model_id = self.acp_model or reported_model_id |
There was a problem hiding this comment.
🟠 Important: This treats acp_model as the active model even on paths where it was not actually applied. Two concrete cases: _reapply_session_model_on_resume() catches ACPRequestError and explicitly warns the live session may keep the server default, and a fresh unknown/custom provider gets neither _meta nor _maybe_set_session_model() because both are provider-gated. In both cases the new current_model_id surface can claim the override while the ACP server is running something else. Track whether the override application succeeded (or make the unknown-provider initial path attempt it) and only prefer self.acp_model over reported_model_id after that success; otherwise surface the server-reported id/None.
Why
ACP servers (claude-agent-acp, codex-acp, gemini-cli-acp) report their model state on
NewSessionResponse/LoadSessionResponsevia the UNSTABLEmodelsfield —currentModelIdplus the fullavailableModelslist (each:modelId,name,description). The SDK readresponse.session_idand discarded the rest, so clients couldn't tell which model the harness is running, nor what it offers.This unblocks the OpenHands web client from (a) displaying the active model in a chip, and (b) building an inline model picker without maintaining a hardcoded model list (cf. agent-canvas#769).
Summary
Surface the live model state verbatim and let the client decide how to display it — no server-side name curation.
ACPModelInfo(openhands/sdk/agent/acp_models.py) — a small, stable DTO that normalizes the ACP protocol'sModelInfo. In its own leaf module so the agent-server can use it onConversationInfowithout importingACPAgent(which would eagerly register it in the agentDiscriminatedUnion). Becausemodelsis UNSTABLE upstream, we re-map into our own type at the SDK boundary rather than re-serializing the vendoredacp.schematype onto the public REST API. Entries without a usablemodel_idare dropped (an empty id is an unselectable picker option / unusable switch target).SDK (
ACPAgent) — capturecurrentModelId+availableModelsfrom the session response; exposecurrent_model_idandavailable_models(read-only,PrivateAttr-backed sinceAgentBaseis frozen). Caller-providedacp_modelstill overrides the reported id.None/[]for older agents.agent-server — lift both onto
ConversationInfo.current_model_idandConversationInfo.available_models. Persisted inagent_stateso they survive cold reads of the conversation list, letting the client resolve a label and render a picker even for idle/restored conversations.Design note — no
current_model_nameAn earlier revision derived a human-readable
current_model_nameserver-side (mapping opaque aliases like claude-agent-acp's"default"to a name via a hardcoded alias table + parsingModelInfo.description). Dropped per review: the alias table is exactly the kind of static knowledge that drifts, it's claude-specific, and it's redundant once the fullavailable_modelslist is on the wire — a client resolves the label with a trivialmodel_idlookup. Expose the data, let the client render.Client contract (for the canvas#769 picker)
Moving label resolution to the client means the consumer must handle two cases:
current_model_idis not guaranteed to be a member ofavailable_models— a forcedacp_modeloverride can name a model absent from the list; treat a lookup miss as "show the raw id."description(e.g. claude-agent-acp's"default"→"Opus 4.7 with 1M context · …").These are documented on the
ConversationInfo.available_modelsfield.Integration with #3390 (runtime model switch)
Merged
main, which includes #3390'sswitch_acp_model. Fixes so the surfaced state doesn't go stale right after a live switch (flagged in review):ACPAgent.set_acp_modelrefreshes_current_model_idon a successful switch;switch_acp_model's shallowmodel_copycarries the updated PrivateAttr onto the persisted agent.LocalConversation.switch_acp_modelpersists theacp_current_model_idhint unconditionally so cold list reads (before the next re-init) aren't stale — even for an older/custom server that reported nomodelsat init._compose_conversation_infofalls back to the agent's authoritativeacp_modelas the final source forcurrent_model_id, so a cold read of a switched/overridden conversation whose server never surfacedmodelsstill resolves the chip.Also added
supports_runtime_model_switchonConversationInfoso the picker knows whether to offer a live-switch control. It mirrorsset_acp_model's gate:Truefor known switch-capable providers, optimisticallyTruefor unknown/custom servers (whichset_acp_modelattempts rather than refuses),Falseonly for a known provider that declares no support and before a session exists.Resume / persistence correctness
available_modelspersistence is gated independently ofcurrent_model_id: the list is only replaced when a launch actually receives one. On a true resume whoseload_sessionomits the UNSTABLEmodelsblock, the previously persisted list is preserved (so the picker survives the restore) even thoughcurrent_model_idmay be set from a forcedacp_model; it's cleared only on a fresh, non-resumed replacement.Out of scope (intentionally)
modes/configOptions/available_commandsfrom the session response are not included. Unlike the model list (static per session), these change mid-session viacurrent_mode_update/config_option_update/available_commands_updatenotifications, so they belong as events, not a polledConversationInfosnapshot — a separate, event-aware PR. A polymorphicAgentBase.get_model_state()seam (vs. the current flat ACP-specific fields +getattr) is likewise deferred.Test plan
_extract_session_modelsnormalizes toACPModelInfoand drops entries without a usablemodel_id;available_modelsreturns a copy;supports_runtime_model_switchresolves per provider (optimistic for unknown); resume preserves persisted list (incl. with a forcedacp_model) / fresh-replacement clears it; switch refreshescurrent_model_id+ persists the hint unconditionally.current_model_id/available_models/supports_runtime_model_switchlifted from the live agent and from persistedagent_stateon cold reads, with anacp_modelfinal fallback;[]/Falsefor native agents.ruff,ruff format,pyrightclean on changed files; OpenAPI discriminator test green; net wire change vs main is additive.Follow-ups
current_model_id+available_models+supports_runtime_model_switch.available_modelsand retire the hardcoded/registry model lists.Type
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.13-nodejs22-slimgolang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:532bee2-pythonRun
All tags pushed for this build
About Multi-Architecture Support
532bee2-python) is a multi-arch manifest supporting both amd64 and arm64532bee2-python-amd64) are also available if needed