feat(acp): runtime mid-conversation model switching (switch_acp_model) by simonrosenberg · Pull Request #3390 · OpenHands/software-agent-sdk

simonrosenberg · 2026-05-26T14:37:15Z

What

Implements the "software-agent-sdk (runtime switch)" task from OpenHands/agent-canvas#769: a protocol-level path to switch an ACP agent's model on a live session, so the new model applies to subsequent turns of the same conversation without restarting the subprocess or losing context.

Changes

ACPAgent.set_acp_model(model) — issues session/set_model on the live connection (via the portal executor) and reflects the model on the sentinel LLM + metrics. Raises RuntimeError if the session isn't initialized yet, ValueError if the provider doesn't support the protocol call.
LocalConversation.switch_acp_model(model) — conversation-level entry point; takes the state lock so the switch can't race a running step() (mirrors switch_llm).
agent-server — POST /conversations/{id}/switch_acp_model → EventService.switch_acp_model (run_in_executor), mapping ValueError → 400, RuntimeError → 409, missing conversation → 404.
acp_providers.py — adds a new supports_runtime_model_switch field meaning "supports the session/set_model protocol call for runtime, mid-conversation switching". claude-code keeps supports_set_session_model=False (it still selects its initial model via _meta) and sets supports_runtime_model_switch=True; the two flags cover orthogonal concerns (initial selection vs. runtime switching). session_meta_key continues to govern only the initial model selection, and _maybe_set_session_model keeps session-creation behavior byte-for-byte identical (claude still uses _meta at creation).

Verification (real LLM calls, context preserved across the switch)

SDK path (Conversation.switch_acp_model → live session/set_model):
- Claude Code: sonnet → haiku ✅
- Codex: gpt-5.5/xhigh → gpt-5.5/low ✅
- Gemini CLI: default → gemini-2.5-flash ✅ (validated against the bundled @google/gemini-cli@0.38.0 — yolo init + switch work, no extra flags)
agent-server HTTP route: real POST /conversations/{id}/switch_acp_model {"model":"haiku"} against a live Claude conversation → 200, switched, unknown conversation → 404, context preserved ✅
301 relevant unit tests pass (new: provider metadata, set_acp_model, switch_acp_model guard, 4 router endpoint tests for 200/400/404/409).

Scope / follow-ups (per #769)

Uninitialized sessions: this PR returns 409 rather than persisting acp_model for next session creation. [Openhands] we should add test to test this init_state will actually modify state in-place #769's task describes the persist-on-uninitialized behavior — flagging as an open decision; happy to add it here or in a follow-up.
Capability detection: rather than runtime detection, this corrects the static flag (claude-code → True) and decouples it from the init mechanism. That fully resolves the "misleading flag" problem for the three built-in providers; attempt/runtime detection would only matter for unknown/custom servers (provider=None, which this PR already lets attempt the call).
Gemini autonomous mode works on the bundled @google/gemini-cli@0.38.0 as-is (no folder-trust gate). Newer gemini-cli (0.43+) adds a trust gate that would need --skip-trust — a future concern to pair with any bundled-CLI bump (see [Openhands] we should add test to test this init_state will actually modify state in-place #769; the premature fix(acp): trust gemini workspace so autonomous (yolo) mode works #3391 was closed).
Frontend/typescript-client still gate on their own copy of the registry fields (supports_runtime_model_switch); surfacing this in the UI + a typed switchAcpModel client method are the remaining [Openhands] we should add test to test this init_state will actually modify state in-place #769 tasks.
Touches acp_providers.py, which also changes in feat(acp): add available_models / default_model to ACP provider registry #3389 — expect a trivial rebase depending on merge order.

🤖 Generated with Claude Code

Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.13-nodejs22-slim`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:6094c64-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-6094c64-python \
  ghcr.io/openhands/agent-server:6094c64-python

All tags pushed for this build

ghcr.io/openhands/agent-server:6094c64-golang-amd64
ghcr.io/openhands/agent-server:6094c645dfb185445587425176012c365627c59a-golang-amd64
ghcr.io/openhands/agent-server:feat-acp-runtime-model-switch-golang-amd64
ghcr.io/openhands/agent-server:6094c64-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:6094c64-golang-arm64
ghcr.io/openhands/agent-server:6094c645dfb185445587425176012c365627c59a-golang-arm64
ghcr.io/openhands/agent-server:feat-acp-runtime-model-switch-golang-arm64
ghcr.io/openhands/agent-server:6094c64-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:6094c64-java-amd64
ghcr.io/openhands/agent-server:6094c645dfb185445587425176012c365627c59a-java-amd64
ghcr.io/openhands/agent-server:feat-acp-runtime-model-switch-java-amd64
ghcr.io/openhands/agent-server:6094c64-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:6094c64-java-arm64
ghcr.io/openhands/agent-server:6094c645dfb185445587425176012c365627c59a-java-arm64
ghcr.io/openhands/agent-server:feat-acp-runtime-model-switch-java-arm64
ghcr.io/openhands/agent-server:6094c64-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:6094c64-python-amd64
ghcr.io/openhands/agent-server:6094c645dfb185445587425176012c365627c59a-python-amd64
ghcr.io/openhands/agent-server:feat-acp-runtime-model-switch-python-amd64
ghcr.io/openhands/agent-server:6094c64-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:6094c64-python-arm64
ghcr.io/openhands/agent-server:6094c645dfb185445587425176012c365627c59a-python-arm64
ghcr.io/openhands/agent-server:feat-acp-runtime-model-switch-python-arm64
ghcr.io/openhands/agent-server:6094c64-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:6094c64-golang
ghcr.io/openhands/agent-server:6094c645dfb185445587425176012c365627c59a-golang
ghcr.io/openhands/agent-server:feat-acp-runtime-model-switch-golang
ghcr.io/openhands/agent-server:6094c64-golang_tag_1.21-bookworm
ghcr.io/openhands/agent-server:6094c64-java
ghcr.io/openhands/agent-server:6094c645dfb185445587425176012c365627c59a-java
ghcr.io/openhands/agent-server:feat-acp-runtime-model-switch-java
ghcr.io/openhands/agent-server:6094c64-eclipse-temurin_tag_17-jdk
ghcr.io/openhands/agent-server:6094c64-python
ghcr.io/openhands/agent-server:6094c645dfb185445587425176012c365627c59a-python
ghcr.io/openhands/agent-server:feat-acp-runtime-model-switch-python
ghcr.io/openhands/agent-server:6094c64-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim

About Multi-Architecture Support

Each variant tag (e.g., 6094c64-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., 6094c64-python-amd64) are also available if needed

Add a protocol-level path to switch an ACP agent's model on a live session, so the new model applies to subsequent turns of the *same* conversation without restarting the subprocess or losing context. - ACPAgent.set_acp_model(): issues session/set_model on the live connection (via the portal executor) and reflects the model on the sentinel LLM + metrics. Gated on provider.supports_set_session_model. - LocalConversation.switch_acp_model(): conversation-level entry point; takes the state lock so the switch can't race a running step(). - agent-server: POST /conversations/{id}/switch_acp_model -> EventService.switch_acp_model (run_in_executor), mapping ValueError->400 and RuntimeError->409. - acp_providers: claude-code now supports_set_session_model=True. The flag is repurposed to mean "supports the session/set_model protocol call" (runtime switching); session_meta_key continues to govern only the *initial* model selection. _maybe_set_session_model keeps init behavior identical (claude still uses _meta at creation). Verified end-to-end with real LLM calls (context preserved across the switch): Claude Code (sonnet->haiku) and Codex (gpt-5.5/xhigh->low) via LocalConversation, and the agent-server HTTP route for Claude Code. Note: the frontend/typescript-client still gate ACP model switching on their own copy of supports_set_session_model; surfacing this in the UI is a follow-up (agent-canvas + typescript-client release). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-05-26T14:37:49Z

Python API breakage checks — ✅ PASSED

Result: ✅ PASSED

Action log

github-actions · 2026-05-26T14:37:59Z

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: ✅ PASSED

Action log

all-hands-bot

🟡 QA Report: PARTIAL

SDK and agent-server switching entry points behaved correctly under exercised local conditions, but I could not verify a real live ACP subprocess/model-provider switch because no supported ACP provider CLI or provider credential environment was available.

Does this PR achieve its stated goal?

Partially verified. The PR’s SDK API and HTTP route now exist where origin/main did not, and exercising them showed the expected runtime session/set_model call path, sentinel LLM/metrics update, state-lock conversation entry point, and HTTP status mapping. I could not prove the full stated goal — a real Claude/Codex/Gemini live ACP session preserving context after an actual model switch — because this QA environment has no claude, codex, gemini, or related ACP CLI installed and no provider-specific credentials were present.

Phase	Result
Environment Setup	✅ `make build` completed successfully with uv-managed dependencies installed.
CI Status	🟡 Several checks are green; multiple build/agent-server/review checks were still pending when checked.
Functional Verification	🟡 SDK and HTTP behavior verified locally; real provider-backed ACP session not verified.

Functional Verification

Test 1: SDK runtime switching entry point

Step 1 — Establish baseline without the fix:
Checked out origin/main and ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_sdk_behavior.py:

SDK runtime ACP model switching smoke
plain_conversation_switch=AttributeError: 'LocalConversation' object has no attribute 'switch_acp_model'
has_set_acp_model=False

This shows the base branch has no conversation-level switch_acp_model entry point and no ACPAgent.set_acp_model method.

Step 2 — Apply the PR's changes:
Checked out 4755abb64ba48b3a9e08fe588c4dfab547fb215a.

Step 3 — Re-run with the fix in place:
Ran the same command:

SDK runtime ACP model switching smoke
plain_conversation_switch=ValueError: switch_acp_model is only supported for ACP conversations.
has_set_acp_model=True
uninitialized_switch=RuntimeError: ACP session is not initialized; the model can only be switched after the conversation has started (first run()).
live_conn_calls=[{'model_id': 'qa-live-switched-model', 'session_id': 'qa-session-1'}]
sentinel_llm_model=qa-live-switched-model
metrics_model_name=qa-live-switched-model
conversation_switch_calls=[{'model_id': 'qa-live-switched-model', 'session_id': 'qa-session-1'}, {'model_id': 'qa-conversation-switched-model', 'session_id': 'qa-session-1'}]
conversation_agent_model=qa-conversation-switched-model

This shows the PR adds the user-facing SDK methods, rejects non-ACP conversations, returns the expected uninitialized-session error, calls the live connection’s set_session_model with the requested model/session, and updates the sentinel LLM/metrics visible on the conversation after switching.

Test 2: Agent-server HTTP route behavior

Step 1 — Establish baseline without the fix:
Checked out origin/main and ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_http_behavior.py:

HTTP switch_acp_model endpoint smoke
success_status=404
success_body={"detail":"Not Found"}
unsupported_status=404
unsupported_body={"detail":"Not Found"}
uninitialized_status=404
uninitialized_body={"detail":"Not Found"}
missing_status=404
missing_body={"detail":"Not Found"}
event_service_calls=[]

This confirms the base branch does not expose POST /api/conversations/{id}/switch_acp_model.

Step 2 — Apply the PR's changes:
Checked out 4755abb64ba48b3a9e08fe588c4dfab547fb215a.

Step 3 — Re-run with the fix in place:
Ran the same API smoke script:

HTTP switch_acp_model endpoint smoke
success_status=200
success_body={"success":true}
unsupported_status=400
unsupported_body={"detail":"not ACP / unsupported provider"}
uninitialized_status=409
uninitialized_body={"detail":"ACP session is not initialized"}
missing_status=404
missing_body={"detail":"Not Found"}
event_service_calls=['qa-success-model', 'qa-unsupported-model', 'qa-uninitialized-model']

Then started the actual FastAPI app on localhost with the same conversation-service behavior and sent real HTTP requests using requests:

success: status=200 body={"success":true}
unsupported: status=400 body={"detail":"not ACP / unsupported provider"}
uninitialized: status=409 body={"detail":"ACP session is not initialized"}
missing: status=404 body={"detail":"Not Found"}

This shows the route is now reachable over HTTP, forwards the requested model, and maps success/unsupported/uninitialized/missing cases to the claimed statuses.

Unable to Verify

I could not verify an end-to-end live ACP subprocess model switch with preserved context. I first checked for real provider CLIs with command -v claude claude-code claude-agent-acp codex gemini; none were installed, and the environment only exposed generic LLM_API_KEY, LLM_BASE_URL, and LLM_MODEL variables, not provider-specific ACP CLI credentials.

Future QA runs would benefit from AGENTS.md guidance that provisions one supported ACP provider CLI plus credentials, and a small manual scenario for planting a word, switching models, and confirming the next turn recalls it.

Issues Found

None in the SDK/API behavior I could exercise. Full live-provider verification remains unverified due environment constraints.

This QA review was created by an AI agent (OpenHands) on behalf of the user.

github-actions · 2026-05-26T14:42:15Z

Coverage Report •

File	Stmts	Miss	Cover	Missing
openhands-agent-server/openhands/agent_server
conversation_router.py	178	11	93%	234, 334, 471–474, 486–489, 524
event_service.py	474	98	79%	87–88, 118, 121–122, 126–127, 134, 138, 144, 154–158, 161–164, 223, 244–245, 316, 367, 377, 401–402, 406, 414, 417, 475, 477, 481–483, 487, 496–497, 499, 503, 509, 511, 558, 588, 591, 642, 646, 838, 840–841, 845, 859–861, 863, 884, 889–892, 896–899, 907–910, 926, 939–942, 988–989, 991–998, 1000–1001, 1010–1011, 1013–1014, 1021–1022, 1024–1025, 1045, 1051, 1057, 1066–1067
openhands-sdk/openhands/sdk/agent
acp_agent.py	732	62	91%	400–402, 532–533, 566, 568, 572, 576, 584, 647–648, 653, 720, 873, 876–877, 894–895, 924, 929, 947, 957, 986–989, 1197–1200, 1204–1206, 1209–1213, 1215, 1397, 1750–1752, 1789, 1793–1794, 1797, 1805–1807, 1809, 1811, 1815, 1818, 1827–1829, 1831, 1850, 2044–2045
openhands-sdk/openhands/sdk/conversation/impl
local_conversation.py	541	44	91%	308, 313, 457, 503, 540, 556, 621, 899–900, 903, 1016, 1027–1030, 1037–1038, 1041, 1047–1048, 1051, 1057, 1072, 1075, 1079–1080, 1084–1086, 1093, 1179, 1184, 1294, 1296, 1300–1301, 1312–1313, 1338, 1533, 1537, 1607, 1614–1615
TOTAL	28470	6521	77%

all-hands-bot

🟡 Taste Rating: Acceptable, but I found a few correctness/API-compatibility issues around persistence, protocol error mapping, and timeout behavior.

Risk Assessment: 🟡 MEDIUM — this touches live ACP agent runtime behavior plus a new public REST endpoint, so resume/error-path correctness matters before merge.

This review was created by an AI agent (OpenHands) on behalf of the requester.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26454993683

Addresses the four review findings on the runtime switch_acp_model path: - Persist the switched model authoritatively. switch_acp_model now replaces the agent with a model_copy carrying the new acp_model, instead of re-assigning the same mutated object. The old approach was an autosave no-op (old == new) and left the frozen acp_model field stale, so reload / resume (model_post_init, _start_acp_server) reverted to the original model. model_copy preserves the live ACP connection in private attrs. - Bound the set_session_model round-trip with acp_prompt_timeout. It runs under the conversation state lock, so an unresponsive server could otherwise wedge the lock indefinitely. The sentinel LLM is only updated after the RPC succeeds. - Map ACP protocol failures to 400, not 500. acp.exceptions.RequestError derives from Exception (not RuntimeError); set_acp_model now translates it to ValueError (method-not-found / invalid model), which the agent-server route already maps to 400. The server stays decoupled from the acp package. - Stop repurposing the exported supports_set_session_model flag. Restored its original initial-selection semantics (claude-code = False) and added a separate supports_runtime_model_switch capability flag (all three providers = True). _maybe_set_session_model keeps using the original flag at session creation; set_acp_model gates runtime switches on the new flag. Tests: switch -> reload persistence regression, RPC timeout, ACPRequestError -> ValueError translation, and a router protocol-error -> 400 case. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…l-switch # Conflicts: # openhands-sdk/openhands/sdk/settings/acp_providers.py # tests/sdk/conversation/test_switch_model.py

all-hands-bot

Code Review — `feat(acp): runtime mid-conversation model switching`

All four issues flagged in the earlier review (persistence, lock-indefinite-hang, ACP error translation, and supports_set_session_model semantic breakage) are correctly addressed in the current diff. The separation of supports_runtime_model_switch from supports_set_session_model is clean, the model_copy approach for persistence is sound, the bounded acp_prompt_timeout on the round-trip prevents lock starvation, and translating ACPRequestError → ValueError at the SDK boundary keeps the router simple. Test coverage is thorough across all error paths and the persistence regression is well-designed.

One issue worth fixing before merge (details in the inline comment):

event_service.switch_acp_model raises ValueError("inactive_service") for a None conversation, which (a) maps to a 400 rather than the 409 used for every other uninitialized-state condition, and (b) surfaces an opaque machine identifier as the public error detail.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.

all-hands-bot

🟡 QA Report: PARTIAL

I verified the new SDK entry point and agent-server HTTP endpoint execute as claimed, but could not validate real Claude/Codex/Gemini ACP subprocess switching because provider CLIs/credentials are unavailable here.

Does this PR achieve its stated goal?

Partially. The PR delivers the exposed runtime-switch plumbing: on the PR branch, LocalConversation.switch_acp_model("model-b") invoked set_session_model for the live ACP session, updated the current agent/LLM model, and persisted acp_model=model-b; the new server endpoint returned the documented 200/400/404/409 responses. The remaining unverified part is the strongest product claim — that real provider subprocesses actually switch models mid-conversation while preserving context — because this environment has no claude, codex, or gemini CLI and no provider API keys.

Phase	Result
Environment Setup	✅ `make build` completed successfully with uv 0.11.16.
CI Status	⚠️ Most build/test jobs green, but `Python API` and `unresolved-review-threads` are failing; `qa-changes`/`pr-review` were in progress when checked.
Functional Verification	🟡 SDK and HTTP route behavior verified with live code execution; real ACP provider model/context switching could not be verified.

Functional Verification

Test 1: SDK `LocalConversation.switch_acp_model` path

Step 1 — Establish baseline on origin/main:
Ran OPENHANDS_SUPPRESS_BANNER=1 PYTHONPATH=/tmp/qa-main-pr3390/openhands-sdk:/tmp/qa-main-pr3390/openhands-tools:/tmp/qa-main-pr3390/openhands-workspace:/tmp/qa-main-pr3390/openhands-agent-server .venv/bin/python -u /tmp/qa_switch_sdk.py:

switch_result=error AttributeError: 'LocalConversation' object has no attribute 'switch_acp_model'
protocol_calls=[]
executor_calls=[]
conv_agent_acp_model=model-a
conv_agent_llm_model=model-a
state_agent_acp_model=model-a
base_state_files=1
persisted_agent_kind=ACPAgent
persisted_acp_model=model-a
persisted_llm_model=model-a
non_acp_guard=AttributeError: 'LocalConversation' object has no attribute 'switch_acp_model'
uninitialized_guard=AttributeError: 'LocalConversation' object has no attribute 'switch_acp_model'

This confirms the baseline has no conversation-level ACP model switch entry point.

Step 2 — Apply the PR's changes:
Used the checked-out PR branch at 6f324f73e9e498d580c5fc900ad8626802ad43bb after make build.

Step 3 — Re-run with the PR in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 .venv/bin/python -u /tmp/qa_switch_sdk.py:

switch_result=ok None
protocol_calls=[{"model_id": "model-b", "session_id": "session-123"}]
executor_calls=[{"timeout": 7.5}]
conv_agent_acp_model=model-b
conv_agent_llm_model=model-b
state_agent_acp_model=model-b
base_state_files=1
persisted_agent_kind=ACPAgent
persisted_acp_model=model-b
persisted_llm_model=model-b
non_acp_guard=ValueError: switch_acp_model is only supported for ACP conversations.
uninitialized_guard=RuntimeError: ACP session is not initialized; the model can only be switched after the conversation has started (first run()).

This shows the new SDK API sends the protocol call with the live session id, passes a bounded timeout, updates in-memory state, persists the switched model, and reports the documented guard errors.

Test 2: Agent-server `POST /api/conversations/{id}/switch_acp_model`

Step 1 — Establish baseline on origin/main:
Ran OPENHANDS_SUPPRESS_BANNER=1 PYTHONPATH=/tmp/qa-main-pr3390/openhands-sdk:/tmp/qa-main-pr3390/openhands-tools:/tmp/qa-main-pr3390/openhands-workspace:/tmp/qa-main-pr3390/openhands-agent-server .venv/bin/python -u /tmp/qa_switch_server.py:

success_request
status=404 body={"detail":"Not Found"} calls=[]
not_found_request
status=404 body={"detail":"Not Found"} calls=None
value_error_request
status=404 body={"detail":"Not Found"} calls=[]
runtime_error_request
status=404 body={"detail":"Not Found"} calls=[]

This confirms the endpoint did not exist on the baseline branch.

Step 2 — Apply the PR's changes:
Used the checked-out PR branch at 6f324f73e9e498d580c5fc900ad8626802ad43bb.

Step 3 — Re-run with the PR in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 .venv/bin/python -u /tmp/qa_switch_server.py:

success_request
status=200 body={"success":true} calls=['haiku']
not_found_request
status=404 body={"detail":"Not Found"} calls=None
value_error_request
status=400 body={"detail":"provider does not support runtime model switching"} calls=['bogus']
runtime_error_request
status=409 body={"detail":"ACP session is not initialized"} calls=['haiku']

This shows real HTTP requests through the FastAPI route now reach switch_acp_model and map success, missing conversation, unsupported/rejected switch, and uninitialized session outcomes as described.

Unable to Verify

I attempted to verify a real ACP provider session, but this environment has no provider CLIs or provider-specific credentials:

claude=
codex=
gemini=
ANTHROPIC_API_KEY_set=False
OPENAI_API_KEY_set=False
GEMINI_API_KEY_set=False
GOOGLE_API_KEY_set=False
LLM_API_KEY_set=True

As a result, I could not independently confirm the PR author's real Claude/Codex/Gemini claims or context preservation across an actual vendor-backed model switch. Future QA would benefit from AGENTS.md guidance for provisioning a disposable ACP provider CLI/API key and a minimal prompt sequence that demonstrates context survives after switch_acp_model.

Issues Found

None from functional QA. CI still reports Python API and unresolved review-thread failures, but I did not investigate those as part of this runtime QA pass.

This review was created by an AI agent (OpenHands) on behalf of the user.

Verdict: PARTIAL

all-hands-bot

🟡 Taste Rating: Acceptable, but I found a few correctness/API-compatibility issues around restart persistence, timeout mapping, and the provider metadata constructor.

Risk Assessment: 🟡 MEDIUM — this touches live ACP runtime behavior plus a new public REST endpoint, so restart and error-path semantics should be tightened before merge.

This review was created by an AI agent (OpenHands) on behalf of the requester.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26466851088

- Move supports_runtime_model_switch to the end of ACPProviderInfo with a default of False so it is a backward-compatible addition, not a moved positional / newly-required field (fixes Python API breakage check). - switch_acp_model: raise RuntimeError("Conversation is not active.") instead of ValueError("inactive_service") so an inactive/None conversation maps to 409 (per the documented endpoint contract) and surfaces a human-readable detail. Use an explicit `is None` guard. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

simonrosenberg · 2026-05-26T18:34:48Z

Addressed review feedback in e99aaa07c:

switch_acp_model now raises RuntimeError("Conversation is not active.") (→ 409, per the documented contract) with an explicit is None guard.
Moved supports_runtime_model_switch to the end of ACPProviderInfo with a default of False, making it a backward-compatible addition (fixes the Python API breakage check).

Ready for another look.

all-hands-bot

Code Review — `feat(acp): runtime mid-conversation model switching`

Solid implementation overall. The four previously-flagged issues (persistence via model_copy, bounded timeout on run_async, ACPRequestError translation to ValueError, and supports_set_session_model API break) are all properly addressed. Test coverage is thorough: provider metadata, set_acp_model unit tests (including the state-mutation-on-failure regression), the router's four HTTP status paths, and the persistence round-trip to base_state.json.

Two unresolved 🟠 Important threads from the previous review still need fixes before merge:

meta.json / restart persistence (event_service.py line ~923): switch_acp_model updates base_state.json via LocalConversation, but self.stored.agent and meta.json are not updated. On agent-server restart, EventService.start() rebuilds from meta.json and overwrites the switched model in the loaded state — the runtime switch is lost across restarts.
TimeoutError not caught in the router (conversation_router.py line ~426): set_acp_model now raises TimeoutError when the bounded session/set_model call expires, but the endpoint only catches ValueError (→ 400) and RuntimeError (→ 409). A timed-out ACP server will produce an HTTP 500 instead of a predictable 504. The responses dict should be updated to document this status as well.

On the unresolved ACPProviderInfo constructor thread: supports_runtime_model_switch: bool = False carries a default and is placed after the existing default fields, so keyword callers are unaffected and positional callers that don't pass it will default to False. The concern about a required field breaking callers appears addressed; that thread can likely be closed.

One new minor docstring gap noted inline below.

This review was generated by an AI agent (OpenHands) on behalf of the requester via OpenHands Automation.

Addresses review on switch_acp_model: - Persistence: EventService.switch_acp_model now mirrors the new acp_model into meta.json (self.stored). start() rebuilds the runtime agent from meta.json and ConversationState.create() copies it over the resumed base_state.json, so without this a restart silently reverted the switch. model_post_init re-derives the sentinel llm.model from acp_model on reload, so only acp_model needs mirroring. - TimeoutError -> 504: ACPAgent.set_acp_model bounds the session/set_model round-trip and raises TimeoutError on a wedged/slow server. The route only mapped ValueError (400) and RuntimeError (409), so a timeout surfaced as an opaque 500. Catch TimeoutError -> 504 and document it. Adds regression tests for both (meta.json persistence + 504 mapping). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

set_acp_model propagates TimeoutError from the bounded session/set_model round-trip; surface it in the public contract. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

all-hands-bot

Code Review — `feat(acp): runtime mid-conversation model switching`

This is a follow-up pass after the multi-round review cycle. All 10 previously-flagged threads are resolved, and my fresh analysis of the HEAD diff found no new issues to raise.

What was fixed across the review cycle

Issue	Fix commit	Status
Persistence no-op: same mutated agent object bypassed autosave	`a35d333b` — `model_copy` replaces agent with new identity	✅
Unbounded `session/set_model` wedging the state lock	`a35d333b` — `run_async(timeout=acp_prompt_timeout)`	✅
`ACPRequestError` leaking as a raw ACP exception (→ 500)	`a35d333b` — translated to `ValueError` at the SDK boundary	✅
`supports_set_session_model` API semantic breakage	`a35d333b` / `e99aaa07` — reverted to session-creation meaning; new `supports_runtime_model_switch` flag added with default `False`	✅
`ValueError("inactive_service")` for `None` conversation → wrong 400	`e99aaa07` — changed to `RuntimeError("Conversation is not active.")` → 409	✅
Truthiness check on `_conversation`	`e99aaa07` — `is None` guard	✅
`meta.json` not updated on restart (switch reverted after server bounce)	`d8a7d072` — `EventService.switch_acp_model` mirrors `acp_model` into `self.stored` and calls `save_meta()`	✅
`TimeoutError` not caught by router → opaque 500	`d8a7d072` — `TimeoutError → 504`, documented in `responses` dict	✅
`ACPProviderInfo` required field in the middle breaking positional callers	`e99aaa07` — `supports_runtime_model_switch: bool = False` placed at end	✅
`switch_acp_model` docstring missing `TimeoutError` in `Raises`	`d8a7d072` — added	✅

Fresh analysis of the HEAD diff

set_acp_model error sequencing is correct: the bounded run_async call raises before self.llm.model is mutated, so the sentinel LLM is only updated once the live session has actually switched. ACPRequestError is caught and re-raised as ValueError; TimeoutError propagates naturally.

Persistence chain is complete: local_conversation.switch_acp_model updates both self.agent and self._state.agent with a fresh model_copy (triggering autosave). EventService.switch_acp_model additionally mirrors acp_model into meta.json so EventService.start() rebuilds the agent with the correct model after a restart.

Router exception coverage is exhaustive: ValueError → 400, TimeoutError → 504, RuntimeError → 409, missing conversation → 404 — all documented in the responses dict.

supports_runtime_model_switch semantics are clean: the flag is independent from supports_set_session_model (which governs session-creation only), defaults to False, and the three built-in providers are correctly set to True.

Test coverage is thorough: 8 unit tests for set_acp_model (session guard, provider guard, unknown provider pass-through, error translation, timeout propagation, LLM mutation on success), regression test for base_state.json persistence, meta.json persistence test, and all four router status codes including 504.

Assessment

The implementation is correct, well-structured, and has no loose ends. Ready to merge pending CI green.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.

all-hands-bot

✅ QA Report: PASS

Verified the ACP runtime model switch works through both the SDK conversation API and the new agent-server HTTP route using a live local ACP-compatible subprocess.

Does this PR achieve its stated goal?

Yes. The PR set out to add a protocol-level way to switch an ACP agent's model on a live session without restarting the subprocess or losing context. I exercised that behavior end-to-end: the SDK path issued session/set_model, the second prompt stayed on the same ACP session (qa-session-1) and used model-b, and the persisted conversation state reported acp_model=model-b. I also verified the server route is new relative to origin/main and that the PR route returns 200, calls the live ACP set_session_model, updates the persisted/returned agent model, and returns 404 for an unknown conversation.

Phase	Result
Environment Setup	✅ Used existing `uv` environment; started local `agent-server` instances and a fake ACP subprocess.
CI Status	⚠️ GitHub checks at review time: 30 success, 3 skipped, 1 queued (`coverage-report`).
Functional Verification	✅ SDK runtime switch + HTTP route both behaved as claimed.

Functional Verification

Test 1: SDK runtime switch preserves session context

Step 1 — Establish pre-switch state:
Ran uv run python /tmp/qa_acp_switch.py against a local ACP-compatible subprocess. The first user turn completed on the initial model:

Finish with message:
session=qa-session-1; turn=1; model=model-a; heard=remember alpha
AFTER_FIRST_RUN_LLM_MODEL model-a

This confirms the live ACP session started normally with model-a.

Step 2 — Apply the PR's new runtime switch:
The script then called conv.switch_acp_model("model-b") on the same conversation.

Step 3 — Re-run with the switched model in the same session:
The same command continued with a second user turn:

AFTER_SWITCH_ACP_MODEL model-b
AFTER_SWITCH_LLM_MODEL model-b
Finish with message:
session=qa-session-1; turn=2; model=model-b; heard=use the same session
AFTER_SECOND_RUN_LLM_MODEL model-b
ACP_LOG_BEGIN
{"event": "prompt", "model": "model-a", "session_id": "qa-session-1", "turn": 1, "user_text": "remember alpha"}
{"event": "set_session_model", "new_model": "model-b", "old_model": "model-a", "session_id": "qa-session-1"}
{"event": "prompt", "model": "model-b", "session_id": "qa-session-1", "turn": 2, "user_text": "use the same session"}
ACP_LOG_END
PERSISTED_AGENT_ACP_MODEL model-b

This shows the switch happened through the ACP protocol, the subprocess was not restarted, the second prompt used the same session id, and persisted state now records the switched model.

Test 2: HTTP route is new and switches a live server conversation

Step 1 — Reproduce baseline without the PR route:
Checked out origin/main, started uv run agent-server --port 8768, and called the new route shape:

POST /api/conversations/00000000-0000-0000-0000-000000000000/switch_acp_model
HTTP/1.1 404 Not Found
{"detail":"Not Found"}

This establishes the server route is not present on the base branch.

Step 2 — Apply the PR's changes:
Switched back to feat-acp-runtime-model-switch at d8a7d0728544ab75cc014b4e60a477250307c16f, started uv run agent-server --port 8767, then created an ACP conversation through POST /api/acp/conversations with a local ACP subprocess and initial acp_model=model-a.

Step 3 — Re-run the route with the PR in place:
Called the new endpoint on the live conversation:

POST /api/conversations/3541ae8f-e7eb-423d-94c9-6b05c50179fd/switch_acp_model
HTTP/1.1 200 OK
{"success":true}

GET /api/acp/conversations/3541ae8f-e7eb-423d-94c9-6b05c50179fd
agent.acp_model= model-b
agent.llm.model= model-b
execution_status= finished

Fake ACP log:
{"event": "set_session_model", "new_model": "model-b", "old_model": "model-a", "session_id": "qa-session-1"}

This confirms the route exists, returns success, reaches the live ACP session, and reflects the switched model in server-visible conversation state. I also called the route for an unknown conversation and got the expected 404:

HTTP/1.1 404 Not Found
{"detail":"Not Found"}

Unable to Verify

I did not verify the switch against real Claude Code, Codex, or Gemini CLI accounts because the QA environment does not provide those interactive/provider credentials. The protocol-level behavior was verified with a real local ACP subprocess over stdin/stdout, including the server route and persisted state. Future QA guidance in AGENTS.md could list any available non-production ACP provider credentials or canned CLI login setup for provider-specific smoke tests.

Issues Found

None.

This review was created by an AI agent (OpenHands) on behalf of the user.

all-hands-bot

🟡 Taste Rating: Acceptable, but I found one live-resource ownership bug in the ACP switch path.

[RISK ASSESSMENT] 🟡 MEDIUM — this touches live ACP session runtime behavior, persistence, and a new REST endpoint; active-session resource ownership needs to be fixed before merge.

VERDICT: ❌ Needs one fix before merge.

This review was created by an AI agent (OpenHands) on behalf of the requester.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26467951722

switch_acp_model swaps the agent for a shallow model_copy that shares the live ACP runtime (_conn/_executor/_process) with the old agent. ACPAgent defines __del__ -> close(), so when the discarded old agent is garbage collected it would close the connection, kill the subprocess and shut down the executor — out from under the copy, breaking the next turn even though the live switch succeeded. Add ACPAgent.release_runtime() to disarm the finalizer (mark closed + clear the shared references) and call it on the old agent before dropping it. Adds a regression test asserting the discarded agent is disarmed and its close()/__del__ leaves the copy's shared connection/executor intact. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

all-hands-bot · 2026-05-26T21:15:21Z

✅ Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

all-hands-bot

Code Review — `feat(acp): runtime mid-conversation model switching`

Final review pass on the HEAD commit after a thorough multi-round cycle. All previously raised issues have been addressed and the latest commit delivers the last required fix.

`release_runtime()` — the ownership fix is correct ✅

The previous approach left model_copy()'s shallow clone sharing _conn/_executor/_process with the discarded old agent. Because ACPAgent defines __del__ → close(), GC of the old agent would silently tear down the live session the new copy depended on. The fix:

old_agent = self.agent
new_agent = old_agent.model_copy(update={"acp_model": model})
old_agent.release_runtime()   # ← disarms __del__ → close()
self.agent = new_agent
self._state.agent = new_agent

release_runtime() sets _closed = True and clears all live references, making the finalizer a no-op while leaving the copy fully operational. The ordering is correct: the runtime is released only after new_agent is safely constructed, so a failure in model_copy (unlikely but possible) would still leave the old agent in a closeable state.

Rest of the stack ✅

Layer	Observation
`ACPAgent.set_acp_model`	Correctly bounds the round-trip with `acp_prompt_timeout`; translates `ACPRequestError` → `ValueError` for clean HTTP mapping; only mutates sentinel LLM state after the live switch succeeds
`LocalConversation.switch_acp_model`	Holds state lock throughout; `model_copy` with a fresh identity triggers the autosave path (bypasses the old-object no-op); live protocol call before any local state mutation
`EventService.switch_acp_model`	Runs blocking sync call off the event loop via `run_in_executor`; persists to `meta.json` so restarts don't silently revert the model
HTTP route	Correct status codes: 400 for `ValueError` (non-ACP/unsupported provider/rejected call), 409 for `RuntimeError` (uninitialised session), 504 for `TimeoutError`
Tests	Covers success paths, all error paths, the finalizer disarm regression, and persistence across restart

Minor note (non-blocking)

set_acp_model will issue a session/set_model round-trip even when model == self.llm.model. A one-line early-return guard would make repeated same-model calls cheaper, but correctness is unaffected.

Summary

Solid implementation across all layers. The multi-round review cycle caught and fixed real bugs (persistence no-op, unbounded lock hold, TimeoutError mapping, resource ownership). The current state is correct and merge-ready.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

all-hands-bot

✅ QA Report: PASS

Runtime ACP model switching works through both the SDK and the agent-server API with a live local ACP-compatible subprocess.

Does this PR achieve its stated goal?

Yes. The PR set out to add a protocol-level way to switch an ACP agent's model on a live session without restarting the subprocess or losing context. I verified that main lacks the SDK entrypoint and HTTP route, while this PR switches the same ACP session from model-a to model-b; the next turn preserves the remembered kiwi context and reports turns=2 on the same fake-session-1.

Phase	Result
Environment Setup	✅ Dependencies available; actual SDK and agent-server processes launched locally
CI Status	✅ 32 passing, 3 skipped; only QA/pr-review automation still in progress at review time
Functional Verification	✅ SDK switch, HTTP switch, 404, and 409 paths exercised with real execution

Functional Verification

Test 1: SDK runtime ACP switch preserves the live session and context

Step 1 — Reproduce / establish baseline on main:
Ran cd /tmp/qa-sdk-main && uv run python /tmp/qa_acp_switch_sdk.py against a local ACP-compatible subprocess:

HAS_SWITCH False
BEFORE_SWITCH_AGENT_MODEL model-a model-a
BEFORE_SWITCH_LAST ... "fake-acp-response session=fake-session-1 model=model-a remembered=kiwi turns=1" ...
AttributeError: 'LocalConversation' object has no attribute 'switch_acp_model'

This establishes the prior state: an ACP conversation can run, but users have no SDK entrypoint to switch models mid-conversation.

Step 2 — Apply the PR's changes:
Checked out/reran in the PR worktree at commit 46a1cf4d069ab423b762bb9a898a524e5dd49970.

Step 3 — Re-run with the fix in place:
Ran uv run python /tmp/qa_acp_switch_sdk.py:

HAS_SWITCH True
BEFORE_SWITCH_AGENT_MODEL model-a model-a
BEFORE_SWITCH_LAST ... "fake-acp-response session=fake-session-1 model=model-a remembered=kiwi turns=1" ...
AFTER_SWITCH_AGENT_MODEL model-b model-b
AFTER_SWITCH_METRICS_MODEL model-b
AFTER_SWITCH_LAST ... "fake-acp-response session=fake-session-1 model=model-b remembered=kiwi turns=2" ...

This shows the new SDK entrypoint exists, updates the agent/sentinel LLM/metrics to model-b, and the subsequent prompt uses the same ACP session with preserved conversation context (remembered=kiwi, turns=2).

Test 2: Agent-server HTTP route switches a live ACP conversation

Step 1 — Reproduce / establish baseline on main:
Started the main-branch server and queried OpenAPI:

uv run python -m openhands.agent_server --host 127.0.0.1 --port 8766
python -c "... print('/api/conversations/{conversation_id}/switch_acp_model' in p['paths'])"
False

This confirms the HTTP route did not exist before the PR.

Step 2 — Apply the PR's changes:
Started the PR server with uv run python -m openhands.agent_server --host 127.0.0.1 --port 8765 and created an ACP conversation backed by /tmp/fake_acp_server.py.

Step 3 — Re-run with the fix in place:
Posted to the new route and then sent a follow-up message:

POST /api/conversations/a1d279b0-df9d-4a59-9909-b30bf0ef1c75/switch_acp_model
HTTP/1.1 200 OK
{"success":true}

GET /api/conversations/a1d279b0-df9d-4a59-9909-b30bf0ef1c75
"execution_status": "idle"
"model_name": "model-b"
"acp_model": "model-b"

GET /api/conversations/a1d279b0-df9d-4a59-9909-b30bf0ef1c75/events/search?limit=100
fake-acp-response session=fake-session-1 model=model-a remembered=kiwi turns=1
fake-acp-response session=fake-session-1 model=model-b remembered=kiwi turns=2

This verifies the real HTTP route returns success, persists/reports the switched model, and the next turn runs on model-b without losing the same-session context.

Test 3: HTTP error behavior

Ran concrete error requests against the PR server:

POST /api/conversations/00000000-0000-0000-0000-000000000000/switch_acp_model
HTTP/1.1 404 Not Found
{"detail":"Not Found"}

# ACP conversation created but not run/initialized yet
POST /api/conversations/cdea2062-5a29-4f95-a436-06706d0ae0e1/switch_acp_model
HTTP/1.1 409 Conflict
{"detail":"ACP session is not initialized; the model can only be switched after the conversation has started (first run())."}

This confirms the documented not-found and uninitialized-session behaviors through the actual API surface.

Issues Found

None.

This review was created by an AI agent (OpenHands) on behalf of the user.

all-hands-bot

Well-executed implementation of ACP mid-conversation model switching. The layered architecture (router → event_service → local_conversation → acp_agent) is clean and each layer has a clear responsibility. The two-flag separation (supports_set_session_model for session-creation vs supports_runtime_model_switch for runtime switching) is a good design that cleanly explains the previously-confusing claude behaviour. The cancellation-safe asyncio.shield pattern and the release_runtime() finalizer disarming are both thoughtful solutions to non-obvious correctness problems. Test coverage is comprehensive — the test_switch_acp_model_finishes_meta_on_cancel and test_switch_acp_model_disarms_discarded_agent_finalizer regression tests are particularly valuable.

A few items worth addressing before merge:

_reapply_session_model_on_resume has no timeout — if the ACP server hangs during resume the session startup blocks indefinitely (see inline comment).
Missing -> None return type on EventService.switch_acp_model (inconsistent with surrounding methods).
Minor TOCTOU on the isinstance guard in LocalConversation.switch_acp_model — not a real bug today but worth tightening.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

all-hands-bot

🟡 Taste Rating: Acceptable, but I found two resume-path correctness gaps that should be addressed or explicitly accepted by a maintainer. Risk: 🟡 MEDIUM because this changes live ACP runtime behavior and persistence.

This review was created by an AI agent (OpenHands) on behalf of the requester.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26478875702

all-hands-bot

🟡 QA Report: PARTIAL

The core SDK and agent-server ACP runtime switching path works end-to-end with a real local ACP-compatible subprocess, but I could not verify the external Claude/Codex/Gemini provider claims without provider credentials.

Does this PR achieve its stated goal?

Yes for the protocol-level SDK and HTTP behavior I could exercise. On the base commit, a running ACP conversation has no LocalConversation.switch_acp_model entry point; on this PR, the same workflow successfully sends session/set_model, keeps the same ACP session alive, and subsequent turns run on the new model with prior context still present (turns: 2). The agent-server route also returns 200 for a live ACP conversation, 404 for an unknown conversation, 409 before ACP session initialization, and the conversation state reflects acp_model=model-b after switching.

Phase	Result
Environment Setup	✅ Used existing uv/.venv environment; started local `openhands.agent_server` on `127.0.0.1:8124`
CI Status	🟡 32 success, 4 skipped, 1 in-progress (`qa-changes`) at time of review
Functional Verification	✅ SDK + HTTP route switching verified with a local ACP subprocess; external provider calls not verified

Functional Verification

Test 1: SDK runtime ACP model switch preserves the live session

Step 1 — Establish baseline without the fix:
Ran git checkout 58771eeb && OPENHANDS_SUPPRESS_BANNER=1 .venv/bin/python /tmp/qa_acp_switch_sdk.py --mode baseline:

BASE_SDK_EXIT=2
FIRST_RUN_OK
HAS_SWITCH_METHOD False
SWITCH_ERROR AttributeError 'LocalConversation' object has no attribute 'switch_acp_model'
LOG_AFTER_ERROR
{"event": "connect"}
{"event": "initialize", "protocol_version": 1}
{"cwd": "/tmp/qa-acp-switch-tbk6668e/workspace", "event": "new_session", "meta": {"mcp_servers": []}, "model": null, "session_id": "sess-1"}
{"event": "set_session_model", "model_id": "model-a", "session_id": "sess-1"}
{"event": "set_session_mode", "mode_id": "full-access", "session_id": "sess-1"}
{"event": "prompt", "model": "model-a", "session_id": "sess-1", "text": "remember token=blue", "turns": 1}

This confirms the old SDK path could start and run an ACP conversation, but exposed no runtime conversation-level model switch method.

Step 2 — Apply the PR's changes:
Checked out feat-acp-runtime-model-switch at 4760fa0860d9655f618984e469a3a850ab15c2c7.

Step 3 — Re-run with the fix in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 .venv/bin/python /tmp/qa_acp_switch_sdk.py --mode pr:

PR_SDK_EXIT=0
FIRST_RUN_OK
HAS_SWITCH_METHOD True
SWITCH_OK model-b model-b
SECOND_RUN_OK
AGENT_MODEL model-b model-b
LOG_FINAL
{"event": "connect"}
{"event": "initialize", "protocol_version": 1}
{"cwd": "/tmp/qa-acp-switch-ckas0kc7/workspace", "event": "new_session", "meta": {"mcp_servers": []}, "model": null, "session_id": "sess-1"}
{"event": "set_session_model", "model_id": "model-a", "session_id": "sess-1"}
{"event": "set_session_mode", "mode_id": "full-access", "session_id": "sess-1"}
{"event": "prompt", "model": "model-a", "session_id": "sess-1", "text": "remember token=blue", "turns": 1}
{"event": "set_session_model", "model_id": "model-b", "session_id": "sess-1"}
{"event": "prompt", "model": "model-b", "session_id": "sess-1", "text": "what token did I ask you to remember?", "turns": 2}

This shows the PR adds the SDK entry point, sends the live session/set_model request, updates the SDK-visible model fields, and preserves session context (sess-1, turns: 2) for the next turn.

Test 2: Agent-server HTTP route switches a live ACP conversation

Step 1 — Establish baseline / setup:
Started the actual server with uv run python -m openhands.agent_server --host 127.0.0.1 --port 8124 and created a live ACP conversation against the same local ACP subprocess.

Step 2 — Exercise the PR route:
Posted POST /api/conversations/{id}/switch_acp_model {"model":"model-b"} after one running turn. The server and ACP subprocess logs showed:

POST /api/conversations/40682af1-476c-48f0-b62a-502b6e0672b0/switch_acp_model HTTP/1.1" 200
Switched ACP session model to model-b (provider=codex, session=sess-1)

{"event": "prompt", "model": "model-a", "session_id": "sess-1", "text": "remember token=green", "turns": 1}
{"event": "set_session_model", "model_id": "model-b", "session_id": "sess-1"}
{"event": "prompt", "model": "model-b", "session_id": "sess-1", "text": "what token did I ask you to remember?", "turns": 2}

Then fetched conversation state:

STATE_AGENT_MODEL model-b model-b
STATUS finished

This verifies the HTTP route drives the same live ACP session switch and persisted/serialized conversation state reports the switched model.

Test 3: HTTP route error cases

Ran unknown-conversation and uninitialized-session requests:

POST /api/conversations/00000000-0000-0000-0000-000000000000/switch_acp_model
{"detail":"Not Found"}
HTTP 404

POST /api/conversations/0aad1a56-46c8-49ee-983e-ee2b0b29ea69/switch_acp_model before any run()
{"detail":"ACP session is not initialized; the model can only be switched after the conversation has started (first run())."}
HTTP 409

This matches the PR's advertised route behavior for missing conversations and not-yet-initialized ACP sessions.

Unable to Verify

I did not verify real Claude Code, Codex, or Gemini CLI provider behavior because this QA environment does not include those external provider credentials/subscriptions. The local ACP subprocess exercises the actual OpenHands SDK, ACP JSON-RPC connection, live session/set_model call, and agent-server HTTP path, but it cannot prove that each external bundled provider accepts the method in production.

Future QA guidance in AGENTS.md could list the exact provider credentials/environment variables and minimal safe prompts to use for real-provider ACP switching checks.

Issues Found

None.

This review was created by an AI agent (OpenHands) on behalf of the user.

Final verdict: PARTIAL

all-hands-bot

🟡 QA Report: PARTIAL

The SDK and agent-server ACP runtime switch work end-to-end with a real local ACP-compatible subprocess; external Claude/Codex/Gemini provider calls were not verified due missing provider credentials.

Does this PR achieve its stated goal?

Yes for the core protocol-level behavior I could exercise. Baseline 58771eeb could run an ACP conversation but had no LocalConversation.switch_acp_model; this PR at 4760fa0860d9655f618984e469a3a850ab15c2c7 successfully sent session/set_model, kept the same ACP session alive, and the next turn ran on model-b with turns=2. The agent-server route also switched a live conversation via HTTP, returned 404 for an unknown conversation, returned 409 before session initialization, and reported agent.acp_model=model-b after switching.

Phase	Result
Environment Setup	✅ Used existing uv/.venv environment; started local `openhands.agent_server` on `127.0.0.1:8124`
CI Status	🟡 32 success, 4 skipped, 1 in-progress (`qa-changes`) at review time
Functional Verification	✅ SDK + HTTP switch verified with local ACP subprocess; external providers not verified

Functional Verification

Test 1: SDK runtime ACP switch preserves the live session

Baseline: ran git checkout 58771eeb && OPENHANDS_SUPPRESS_BANNER=1 .venv/bin/python /tmp/qa_acp_switch_sdk.py --mode baseline:

BASE_SDK_EXIT=2
FIRST_RUN_OK
HAS_SWITCH_METHOD False
SWITCH_ERROR AttributeError 'LocalConversation' object has no attribute 'switch_acp_model'
{"event": "prompt", "model": "model-a", "session_id": "sess-1", "text": "remember token=blue", "turns": 1}

This establishes the prior state: ACP could run, but users had no runtime SDK model-switch entry point.

PR: checked out feat-acp-runtime-model-switch at 4760fa0860d9655f618984e469a3a850ab15c2c7 and ran OPENHANDS_SUPPRESS_BANNER=1 .venv/bin/python /tmp/qa_acp_switch_sdk.py --mode pr:

PR_SDK_EXIT=0
FIRST_RUN_OK
HAS_SWITCH_METHOD True
SWITCH_OK model-b model-b
SECOND_RUN_OK
AGENT_MODEL model-b model-b
{"event": "prompt", "model": "model-a", "session_id": "sess-1", "text": "remember token=blue", "turns": 1}
{"event": "set_session_model", "model_id": "model-b", "session_id": "sess-1"}
{"event": "prompt", "model": "model-b", "session_id": "sess-1", "text": "what token did I ask you to remember?", "turns": 2}

This shows the switch goes through ACP session/set_model, the same session continues, and SDK-visible model fields update to model-b.

Test 2: Agent-server HTTP route switches a live ACP conversation

Started uv run python -m openhands.agent_server --host 127.0.0.1 --port 8124, created an ACP conversation, sent one turn, then posted POST /api/conversations/{id}/switch_acp_model {"model":"model-b"}:

POST /api/conversations/40682af1-476c-48f0-b62a-502b6e0672b0/switch_acp_model HTTP/1.1" 200
Switched ACP session model to model-b (provider=codex, session=sess-1)
{"event": "prompt", "model": "model-a", "session_id": "sess-1", "text": "remember token=green", "turns": 1}
{"event": "set_session_model", "model_id": "model-b", "session_id": "sess-1"}
{"event": "prompt", "model": "model-b", "session_id": "sess-1", "text": "what token did I ask you to remember?", "turns": 2}
STATE_AGENT_MODEL model-b model-b
STATUS finished

This verifies the HTTP route drives the live ACP switch and returned conversation state reflects the switched model.

Test 3: HTTP error cases

POST /api/conversations/00000000-0000-0000-0000-000000000000/switch_acp_model
{"detail":"Not Found"}
HTTP 404

POST /api/conversations/0aad1a56-46c8-49ee-983e-ee2b0b29ea69/switch_acp_model before any run()
{"detail":"ACP session is not initialized; the model can only be switched after the conversation has started (first run())."}
HTTP 409

Unable to Verify

I did not verify real Claude Code, Codex, or Gemini CLI provider behavior because this QA environment does not provide those external provider credentials/subscriptions. The local ACP subprocess exercises the actual OpenHands SDK, ACP JSON-RPC connection, live session/set_model call, and agent-server HTTP path, but it cannot prove every external provider accepts the method in production.

Issues Found

None.

This review was created by an AI agent (OpenHands) on behalf of the user.

Final verdict: PARTIAL

all-hands-bot

🟡 QA Report: PARTIAL

The SDK and agent-server ACP runtime switch work end-to-end with a real local ACP-compatible subprocess; external Claude/Codex/Gemini provider calls were not verified due missing provider credentials.

Does this PR achieve its stated goal?

Yes for the core protocol-level behavior I could exercise. Baseline 58771eeb could run an ACP conversation but had no LocalConversation.switch_acp_model; this PR at 4760fa0860d9655f618984e469a3a850ab15c2c7 successfully sent session/set_model, kept the same ACP session alive, and the next turn ran on model-b with turns=2. The agent-server route also switched a live conversation via HTTP, returned 404 for an unknown conversation, returned 409 before session initialization, and reported agent.acp_model=model-b after switching.

Phase	Result
Environment Setup	✅ Used existing uv/.venv environment; started local `openhands.agent_server` on `127.0.0.1:8124`
CI Status	🟡 32 success, 4 skipped, 1 in-progress (`qa-changes`) at review time
Functional Verification	✅ SDK + HTTP switch verified with local ACP subprocess; external providers not verified

Functional Verification

Test 1: SDK runtime ACP switch preserves the live session

Baseline: ran git checkout 58771eeb && OPENHANDS_SUPPRESS_BANNER=1 .venv/bin/python /tmp/qa_acp_switch_sdk.py --mode baseline:

BASE_SDK_EXIT=2
FIRST_RUN_OK
HAS_SWITCH_METHOD False
SWITCH_ERROR AttributeError 'LocalConversation' object has no attribute 'switch_acp_model'
{"event": "prompt", "model": "model-a", "session_id": "sess-1", "text": "remember token=blue", "turns": 1}

This establishes the prior state: ACP could run, but users had no runtime SDK model-switch entry point.

PR: checked out feat-acp-runtime-model-switch at 4760fa0860d9655f618984e469a3a850ab15c2c7 and ran OPENHANDS_SUPPRESS_BANNER=1 .venv/bin/python /tmp/qa_acp_switch_sdk.py --mode pr:

PR_SDK_EXIT=0
FIRST_RUN_OK
HAS_SWITCH_METHOD True
SWITCH_OK model-b model-b
SECOND_RUN_OK
AGENT_MODEL model-b model-b
{"event": "prompt", "model": "model-a", "session_id": "sess-1", "text": "remember token=blue", "turns": 1}
{"event": "set_session_model", "model_id": "model-b", "session_id": "sess-1"}
{"event": "prompt", "model": "model-b", "session_id": "sess-1", "text": "what token did I ask you to remember?", "turns": 2}

This shows the switch goes through ACP session/set_model, the same session continues, and SDK-visible model fields update to model-b.

Test 2: Agent-server HTTP route switches a live ACP conversation

Started uv run python -m openhands.agent_server --host 127.0.0.1 --port 8124, created an ACP conversation, sent one turn, then posted POST /api/conversations/{id}/switch_acp_model {"model":"model-b"}:

POST /api/conversations/40682af1-476c-48f0-b62a-502b6e0672b0/switch_acp_model HTTP/1.1" 200
Switched ACP session model to model-b (provider=codex, session=sess-1)
{"event": "prompt", "model": "model-a", "session_id": "sess-1", "text": "remember token=green", "turns": 1}
{"event": "set_session_model", "model_id": "model-b", "session_id": "sess-1"}
{"event": "prompt", "model": "model-b", "session_id": "sess-1", "text": "what token did I ask you to remember?", "turns": 2}
STATE_AGENT_MODEL model-b model-b
STATUS finished

This verifies the HTTP route drives the live ACP switch and returned conversation state reflects the switched model.

Test 3: HTTP error cases

POST /api/conversations/00000000-0000-0000-0000-000000000000/switch_acp_model
{"detail":"Not Found"}
HTTP 404

POST /api/conversations/0aad1a56-46c8-49ee-983e-ee2b0b29ea69/switch_acp_model before any run()
{"detail":"ACP session is not initialized; the model can only be switched after the conversation has started (first run())."}
HTTP 409

Unable to Verify

I did not verify real Claude Code, Codex, or Gemini CLI provider behavior because this QA environment does not provide those external provider credentials/subscriptions. The local ACP subprocess exercises the actual OpenHands SDK, ACP JSON-RPC connection, live session/set_model call, and agent-server HTTP path, but it cannot prove every external provider accepts the method in production.

Issues Found

None.

This review was created by an AI agent (OpenHands) on behalf of the user.

Final verdict: PARTIAL

…comments Addresses review on switch_acp_model / resume: - _reapply_session_model_on_resume now bounds the set_session_model call with the agent's acp_prompt_timeout (a hung server can no longer block session startup forever) and mirrors set_acp_model's error classification: server-internal failures (-32603) propagate rather than silently leaving the resumed session on the wrong model; client/protocol rejections (method-not-found, invalid model id) are swallowed/logged so a stale-capability server can't break resume. - EventService.switch_acp_model: add -> None return annotation; expand the cancellation comment to note CancelledError (a BaseException) is never suppressed by suppress(Exception). Adds resume tests for -32603 propagation and the timeout bound. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

all-hands-bot · 2026-05-26T22:59:03Z

✅ Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

all-hands-bot

✅ QA Report: PASS

Runtime ACP model switching works through both the SDK entry point and the agent-server HTTP route, with subsequent turns using the new model while preserving the same session context.

Does this PR achieve its stated goal?

Yes. I exercised a live LocalConversation and a live local openhands.agent_server against an ACP-compatible subprocess named codex-acp; after calling switch_acp_model / POST /switch_acp_model, the ACP provider received set_session_model(model-b), the next prompt ran on model-b, and the provider still had the first turn in session history (turn=2 history=...). I also confirmed the base branch lacks the SDK method / REST route, establishing the before/after delta.

Phase	Result
Environment Setup	✅ `make build` completed successfully; no tests/linters were run locally.
CI Status	🟡 Completed checks observed green, with PR Review/QA automation and image publish jobs still in progress at refresh time.
Functional Verification	✅ SDK and HTTP API model switching both worked end-to-end with a live ACP subprocess.

Functional Verification

Test 1: SDK `LocalConversation.switch_acp_model` switches a live ACP session

Step 1 — Establish baseline on origin/main:
Ran git checkout --detach origin/main && OPENHANDS_SUPPRESS_BANNER=1 QA_ACP_AGENT_NAME=codex-acp QA_ACP_LOG=/tmp/qa-acp-base-codex.jsonl QA_ACP_STATE=/tmp/qa-acp-base-codex-state.json uv run python /tmp/qa_switch_acp_sdk.py:

EXIT:0
has_switch_acp_model False
QA_REPLY model=model-a turn=1 history=remember alpha
after_first_run_agent_model model-a model-a
switch_acp_model_unavailable
{"event": "set_session_model", "model_id": "model-a", "session_id": "session-1"}
{"event": "prompt", "model": "model-a", "prompt": "remember alpha", "reply": "QA_REPLY model=model-a turn=1 history=remember alpha", "session_id": "session-1", "turn": 1}
ACP_STATE {"session-1": {"model": "model-a", "prompts": ["remember alpha"]}}

This shows the existing SDK could start and prompt an ACP session, but had no user-facing switch_acp_model entry point.

Step 2 — Apply PR changes:
Checked out feat-acp-runtime-model-switch at 340cf451fc5704a9381742f745b8f25e98464fa9.

Step 3 — Re-run with the PR:
Ran OPENHANDS_SUPPRESS_BANNER=1 QA_ACP_AGENT_NAME=codex-acp QA_ACP_LOG=/tmp/qa-acp-pr-codex.jsonl QA_ACP_STATE=/tmp/qa-acp-pr-codex-state.json uv run python /tmp/qa_switch_acp_sdk.py:

EXIT:0
has_switch_acp_model True
QA_REPLY model=model-a turn=1 history=remember alpha
after_first_run_agent_model model-a model-a
after_switch_agent_model model-b model-b
QA_REPLY model=model-b turn=2 history=remember alpha | now report context
after_second_run_agent_model model-b model-b
{"event": "set_session_model", "model_id": "model-a", "session_id": "session-1"}
{"event": "prompt", "model": "model-a", "prompt": "remember alpha", "reply": "QA_REPLY model=model-a turn=1 history=remember alpha", "session_id": "session-1", "turn": 1}
{"event": "set_session_model", "model_id": "model-b", "session_id": "session-1"}
{"event": "prompt", "model": "model-b", "prompt": "now report context", "reply": "QA_REPLY model=model-b turn=2 history=remember alpha | now report context", "session_id": "session-1", "turn": 2}
ACP_STATE {"session-1": {"model": "model-b", "prompts": ["remember alpha", "now report context"]}}

This confirms the PR exposes the SDK switch method, sends session/set_model to the already-running ACP subprocess, updates the local agent model to model-b, and preserves session context across the switch.

Test 2: Agent-server HTTP route switches a live ACP conversation

Step 1 — Establish baseline on origin/main:
Started the base server with uv run python -m openhands.agent_server --host 127.0.0.1 --port 8766, confirmed /server_info returned 200, then ran:

curl -X POST http://127.0.0.1:8766/api/conversations/00000000-0000-0000-0000-000000000000/switch_acp_model   -H 'Content-Type: application/json' --data '{"model":"model-b"}'
SWITCH_HTTP:404
{"detail":"Not Found"}

This shows the route was not available before the PR.

Step 2 — Apply PR changes:
Returned to feat-acp-runtime-model-switch and started the PR server with uv run python -m openhands.agent_server --host 127.0.0.1 --port 8767.

Step 3 — Re-run with the PR:
Created an ACP conversation via POST /api/acp/conversations, called the new route, then sent another user message with run: true:

START_HTTP:201
SWITCH_HTTP:200
SECOND_HTTP:200
{"success":true}
{"event": "set_session_model", "model_id": "model-a", "session_id": "session-1"}
{"event": "prompt", "model": "model-a", "prompt": "remember api alpha", "reply": "QA_REPLY model=model-a turn=1 history=remember api alpha", "session_id": "session-1", "turn": 1}
{"event": "set_session_model", "model_id": "model-b", "session_id": "session-1"}
{"event": "prompt", "model": "model-b", "prompt": "now report api context", "reply": "QA_REPLY model=model-b turn=2 history=remember api alpha | now report api context", "session_id": "session-1", "turn": 2}

This confirms the HTTP endpoint returns success, performs the live ACP model switch, and the next HTTP-triggered turn uses model-b without losing prior context.

Test 3: Missing conversation error mapping

Ran against the PR server:

curl -X POST http://127.0.0.1:8767/api/conversations/00000000-0000-0000-0000-000000000000/switch_acp_model   -H 'Content-Type: application/json' --data '{"model":"model-b"}'
MISSING_HTTP:404
{"detail":"Not Found"}

This confirms the new route maps an unknown conversation to 404 as described.

Scope note

I used a local ACP-compatible subprocess implemented with the agent-client-protocol package and identified as codex-acp, rather than external Claude/Codex/Gemini CLIs, because those require provider credentials/subscriptions not available in this QA environment. This still exercises the real OpenHands SDK, real ACP subprocess transport, and real agent-server HTTP route.

Issues Found

None.

This review was created by an AI agent (OpenHands) on behalf of the user.

Final verdict: PASS

all-hands-bot

Code Review: `feat(acp): runtime mid-conversation model switching`

The HEAD commit (340cf451) brings the implementation to a clean, production-ready state. All issues identified in the prior review cycle are fully addressed:

-> None annotation on EventService.switch_acp_model ✓
Timeout bound (asyncio.wait_for) on _reapply_session_model_on_resume ✓
Server-internal error (-32603) propagation on resume (raises instead of swallowing) ✓
Explicit CancelledError is BaseException comment in the asyncio.shield handler ✓

Key design decisions that are correct:

release_runtime() correctly disarms the discarded agent's __del__ so it can't close the session the shallow-copy new agent now owns — the dedicated regression test (test_switch_acp_model_disarms_discarded_agent_finalizer) is exactly right.
The asyncio.shield + suppress(Exception) pattern ensures meta.json is mirrored before CancelledError propagates, keeping persistent metadata consistent with the live session.
The separation of supports_set_session_model (initial selection path only) from the new supports_runtime_model_switch (live mid-conversation switching) is the correct design — cleaner than repurposing the existing flag.
Succeed-then-update: the sentinel LLM is only mutated after the protocol call succeeds, so a rejected switch leaves local state unchanged.

One minor documentation note (see inline): the PR description describes an earlier design where supports_set_session_model was repurposed for claude-code. The actual (and better) implementation introduces a separate supports_runtime_model_switch flag; worth updating the description to avoid confusing future readers.

On the asymmetry concern from the prior review: The last review round noted that _reapply_session_model_on_resume "takes the opposite stance" on unknown providers vs set_acp_model. This characterization is not accurate for the current HEAD — both functions share identical gating (if provider is not None and not provider.supports_runtime_model_switch: return), so unknown providers (provider is None) attempt the call in both paths. The docstring already documents this correctly, and test_unknown_provider_attempts_reapply confirms it.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

all-hands-bot

🟡 Taste Rating: Acceptable. I found two lifecycle correctness gaps inline around ACP init-failure cleanup and switch-vs-close coordination. Risk: 🟡 MEDIUM because this changes live ACP runtime behavior and persisted server metadata, so I’m leaving this as COMMENT for maintainer judgment.

This review was created by an AI agent (OpenHands) on behalf of the requester.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26479958363

…h switch Addresses review on the resume reapply / switch lifecycle: - ACPAgent._init now assigns self._conn/_process/_filtered_reader as soon as the subprocess and connection are created, instead of only via the returned tuple at the end. A mid-init failure (e.g. the resume model reapply now raises on timeout / -32603) therefore leaves them reachable by init_state()'s _cleanup(), so the subprocess/connection are torn down rather than leaked. The "session initialized" gating keys off _session_id (assigned last), so an early _conn doesn't make the agent look ready prematurely. - EventService.close() now acquires _acp_model_switch_lock around the conversation teardown, so it waits for an in-flight (shielded) switch to finish mirroring meta.json instead of closing the ACP runtime out from under the worker. Adds a test asserting close() waits for an in-flight switch before tearing the conversation down. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

all-hands-bot · 2026-05-26T23:19:54Z

✅ Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

all-hands-bot

Code Review: feat(acp): runtime mid-conversation model switching

🟢 Good taste — This is a well-engineered, production-quality implementation. The dual-flag design (supports_set_session_model / supports_runtime_model_switch) cleanly separates concerns, the async cancel semantics in EventService are correctly handled, and the release_runtime() / shallow-model_copy pattern elegantly solves the double-close problem. The PR description is thorough and the test coverage is excellent.

[IMPROVEMENT OPPORTUNITIES]

event_service.py — cancel handler and save_meta consistency

The with suppress(Exception): await task in the cancel handler swallows all errors from the shielded task, including a failure from save_meta(). If save_meta() throws after the live switch completes, the cancel handler exits silently. At that point: the live ACP session has the new model, base_state.json has been updated by LocalConversation.switch_acp_model, but meta.json still has the old model. On the next restart, start() reads meta.json into self.stored.agent, and ConversationState.create() copies that back over base_state.json — meaning meta.json wins and silently reverts the switched model. This is a cancel-plus-save-failure compounding edge case, but a brief comment acknowledging the known inconsistency window would help future maintainers.

acp_agent.py — set_acp_model() is public but requires the state lock

The docstring says "Thread-safety is the caller's responsibility: drive this through LocalConversation.switch_acp_model." As a public method, this is a footgun — a caller who finds it in the SDK and calls it directly could race with a running step(). Consider naming it _set_acp_model to signal it is semi-private, or guard with an assertion. The current unit tests call it directly (fine for coverage), but the public API surface implies it is safe to call from any context.

conversation_router.py — no early validation on the model string

An empty string "" will pass through to the ACP server. The protocol rejection fires a ValueError → 400 correctly, but the error message from the ACP server (e.g., "set_session_model(model=''): method rejected") is less actionable than an early "model must not be empty" guard. Minor, non-blocking.

[TESTING GAPS]

Missing: cancel + save_meta failure path in test_event_service.py

test_switch_acp_model_finishes_meta_on_cancel covers the happy cancel path (worker completes, meta.json mirrored). It does not cover the case where save_meta() raises after the live switch completes during cancellation. That path is silently swallowed by suppress(Exception), leaving meta.json inconsistent with base_state.json. A dedicated test for this scenario would confirm the behavior is intentional and catch regressions if the suppress scope ever narrows.

[RISK ASSESSMENT]

Overall PR ⚠️ Risk: 🟡 MEDIUM
- New HTTP endpoint on the agent-server with novel exception-to-status-code mappings (ValueError→400, RuntimeError→409, TimeoutError→504) — low blast radius but careful to ensure the exception taxonomy stays consistent with existing routes.
- Changed _init() early-assignment pattern is a behavioral change to ACP session startup: _cleanup() now runs against partially-initialized connections on a mid-init failure. The _session_id-as-initialization-sentinel gate in set_acp_model() preserves correctness here ✓.
- The asyncio.shield + cancel pattern in EventService is non-trivial; correctness depends on run_in_executor thread-cancelability guarantees, which are correctly documented and tested.
- supports_runtime_model_switch defaults to False — backward-compatible for any external ACPProviderInfo instantiations ✓.

VERDICT

✅ Worth merging — Core logic is sound, design decisions are well-reasoned and well-documented. The items above are non-blocking suggestions.

KEY INSIGHT: The release_runtime() / shallow model_copy / _session_id-as-initialization-sentinel combination is the right pattern for handing a live ACP runtime to a new agent copy without risking double-close or premature teardown.

Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better by adding a .agents/skills/custom-codereview-guide.md file to your branch.

Was this review helpful? React with 👍 or 👎 to give feedback.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

all-hands-bot

🟡 QA Report: PARTIAL

The SDK and agent-server ACP model-switch paths work end-to-end against a live ACP-compatible stub process; real vendor ACP CLIs were not re-verified because this environment lacks their service credentials/subscriptions.

Does this PR achieve its stated goal?

Yes for the PR's protocol-level goal: a live ACP session can switch from model-a to model-b mid-conversation, and subsequent turns in the same conversation use the new model without restarting the ACP subprocess or losing the session. I verified this both through LocalConversation.switch_acp_model(...) and through the new POST /api/conversations/{id}/switch_acp_model route against an actual running agent-server. The only unverified part is the provider-specific matrix (Claude Code / Codex / Gemini with real accounts), which could not be exercised here without external credentials.

Phase	Result
Environment Setup	✅ `uv run` built/imported the local packages and started the actual agent-server successfully.
CI Status	🟡 Current checks are mostly green; manifest merge jobs and bot review/QA jobs were still in progress when checked.
Functional Verification	✅ SDK and HTTP protocol paths switch the live ACP session and persist the new model; 404/409 error cases match the PR description.

Functional Verification

Test 1: SDK `LocalConversation.switch_acp_model` against an ACP-compatible subprocess

Step 1 — Establish baseline on main:
Ran the same SDK script from a detached origin/main worktree:

$ cd /tmp/qa-sdk-main && uv run python /tmp/qa_acp_switch_flow.py
...
Finish with message:
QA_STUB_REPLY model=model-a prompt=first turn
...
AttributeError: 'LocalConversation' object has no attribute 'switch_acp_model'

This shows the runtime ACP switch entry point does not exist on the base branch; after the first ACP turn there is no user-facing SDK API to switch the live ACP model.

Step 2 — Apply the PR's changes:
Checked out PR branch feat-acp-runtime-model-switch at 0baa3128940d954c2981b428ce7fb323244d7ef3 and used a temporary ACP stub server (/tmp/qa_acp_stub.py) launched as a real subprocess through ACPAgent(acp_command=...). The stub runs through acp.core.run_agent(..., use_unstable_protocol=True) and records new_session, set_model, set_mode, and prompt calls.

Step 3 — Re-run with the fix in place:
Ran:

$ uv run python /tmp/qa_acp_switch_flow.py
...
Finish with message:
QA_STUB_REPLY model=model-a prompt=first turn
...
Switched ACP session model to model-b (provider=codex, session=sess-qa)
...
Finish with message:
QA_STUB_REPLY model=model-b prompt=second turn
...
{
  "ok": true,
  "after_first_run_llm_model": "model-a",
  "has_switch_acp_model": true,
  "after_switch_agent_acp_model": "model-b",
  "after_switch_llm_model": "model-b",
  "after_second_run_llm_model": "model-b",
  "persisted_agent_kind": "ACPAgent",
  "persisted_acp_model": "model-b"
}

The stub event log also showed the same live session (sess-qa) received set_model(model-a), prompted on model-a, then received set_model(model-b), then prompted on model-b. This confirms the PR-provided SDK method switches the running ACP session and persists acp_model as the authoritative model for reload/resume.

Test 2: Agent-server HTTP route on the real server

Step 1 — Establish baseline on main:
Started the base server and called the new route:

$ cd /tmp/qa-sdk-main && uv run python -m openhands.agent_server --host 127.0.0.1 --port 8125
$ curl -w '%{http_code}
' -X POST   http://127.0.0.1:8125/api/conversations/00000000-0000-0000-0000-000000000000/switch_acp_model   -H 'Content-Type: application/json' --data '{"model":"model-b"}'
404
{"detail":"Not Found"}

This confirms the base branch has no HTTP route for runtime ACP model switching.

Step 2 — Apply the PR's changes:
Started the PR server with:

$ uv run python -m openhands.agent_server --host 127.0.0.1 --port 8124

Then created an ACP conversation via POST /api/conversations using the same stub subprocess and an initial user message.

Step 3 — Re-run with the fix in place:
First turn over HTTP used model-a:

$ curl http://127.0.0.1:8124/api/conversations/<id>/agent_final_response
{"response":"QA_STUB_REPLY model=model-a prompt=first via http"}

Then switched and sent a second turn:

$ curl -X POST http://127.0.0.1:8124/api/conversations/<id>/switch_acp_model   -H 'Content-Type: application/json' --data '{"model":"model-b"}'
{"success":true}

$ curl -X POST http://127.0.0.1:8124/api/conversations/<id>/events   -H 'Content-Type: application/json'   --data '{"role":"user","content":[{"type":"text","text":"second via http"}],"run":true}'
{"success":true}

$ curl http://127.0.0.1:8124/api/conversations/<id>/agent_final_response
{"response":"QA_STUB_REPLY model=model-b prompt=second via http"}

The stub log for that same HTTP-created conversation showed:

{"event":"prompt","session_id":"sess-qa","model":"model-a","prompt":"first via http"}
{"event":"set_model","session_id":"sess-qa","model":"model-b"}
{"event":"prompt","session_id":"sess-qa","model":"model-b","prompt":"second via http"}

Fetching the stored conversation after the switch returned:

{
  "execution_status": "finished",
  "agent_acp_model": "model-b",
  "agent_llm_model": "model-b"
}

This confirms the new server endpoint switches the live ACP session and mirrors the switched model into conversation metadata.

Test 3: HTTP error behavior

Unknown conversation:

$ curl -w '%{http_code}
' -X POST   http://127.0.0.1:8124/api/conversations/00000000-0000-0000-0000-000000000000/switch_acp_model   -H 'Content-Type: application/json' --data '{"model":"model-b"}'
404
{"detail":"Not Found"}

Uninitialized ACP session:
Created an ACP conversation without running a first turn, then called the switch route:

$ curl -w '%{http_code}
' -X POST   http://127.0.0.1:8124/api/conversations/<uninitialized-id>/switch_acp_model   -H 'Content-Type: application/json' --data '{"model":"model-b"}'
409
{"detail":"ACP session is not initialized; the model can only be switched after the conversation has started (first run())."}

These match the PR's stated 404 and 409 behavior.

Unable to Verify

I did not run real Claude Code, Codex, or Gemini CLI sessions because this QA environment does not include the required provider credentials/subscription state. The local ACP stub exercises the real SDK ACP client, real subprocess transport, real session/set_model protocol route, and real agent-server HTTP route, but it cannot prove provider-specific model IDs are accepted by those external services.

Suggested future QA guidance: add an AGENTS.md/QA note documenting which environment variables or subscription login state are required to run a real ACP provider smoke test, plus a safe low-cost prompt/model pair for each provider.

Issues Found

None.

This review was created by an AI agent (OpenHands) on behalf of the user.

all-hands-bot

🟡 Taste Rating: Acceptable. I found two correctness gaps inline around direct ACP model-switch accounting and lease renewal while waiting for shielded switch teardown. Risk: 🟡 MEDIUM because this changes live ACP runtime behavior and persistence, so I’m leaving this as COMMENT for maintainer/eval judgment.

This review was created by an AI agent (OpenHands) on behalf of the requester.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26480738775

…d model Addresses review: - EventService.close() now waits for the switch lock with a bounded timeout (ACP_SWITCH_LOCK_CLOSE_TIMEOUT_SECONDS) instead of unconditionally: a switch is normally sub-second, but a wedged server could otherwise hold the lock for the worker's full acp_prompt_timeout (30 min) while lease renewal is already stopped. After the bound, close() proceeds with teardown. - The cancellation handler now logs a failure of the in-flight switch task (e.g. a meta.json write that fails after the live switch) instead of swallowing it silently, while still always re-raising CancelledError. - set_acp_model rejects empty/whitespace model ids early with a clear ValueError (-> 400) rather than forwarding them to the ACP server, and its docstring now warns that direct callers bypass acp_model persistence / cost attribution (use LocalConversation.switch_acp_model). Adds a test for the empty-model guard. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

all-hands-bot · 2026-05-26T23:50:10Z

✅ Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

all-hands-bot

🟡 QA Report: PARTIAL

Verified the SDK conversation switch path and agent-server HTTP route behavior locally; real provider LLM switching could not be exercised because ACP provider CLIs/API credentials are unavailable in this environment.

Does this PR achieve its stated goal?

Partially verified. The stated goal is a runtime switch_acp_model path that switches an ACP session model mid-conversation without restarting or losing session context. In local execution, the PR adds the missing SDK/API entry points, forwards the switch to session/set_model, keeps the same connection/session, updates local model/metrics state, and exposes the expected HTTP success/error behavior. I could not verify a real Claude/Codex/Gemini LLM turn after switching because claude, codex, and gemini CLIs plus their API keys are not present here.

Phase	Result
Environment Setup	✅ `make build` completed successfully
CI Status	⏳ 20 successful, 8 pending, 4 skipped, 0 failing at refresh time
Functional Verification	🟡 SDK/API behavior verified with a live-session harness; real provider end-to-end unavailable

Functional Verification

Test 1: SDK `LocalConversation.switch_acp_model` runtime path

Step 1 — Reproduce / establish baseline without the fix:
Ran git switch --detach origin/main && uv run python /tmp/qa_switch_acp_model.py:

has_conversation_switch=False
initial_agent_model=model-a
initial_llm_model=model-a
switch_result=AttributeError:AttributeError

This shows the base branch has no conversation-level runtime ACP model switching entry point.

Step 2 — Apply the PR's changes:
Checked out 00f4eb8c08c7216631e18ddced683cfe98f50a8a.

Step 3 — Re-run with the fix in place:
Ran uv run python /tmp/qa_switch_acp_model.py:

has_conversation_switch=True
initial_agent_model=model-a
initial_llm_model=model-a
switch_result=ok
protocol_calls=[('model-b', 'session-123')]
executor_timeouts=[7.0]
same_connection=True
same_session=True
old_agent_disarmed=True
new_agent_model=model-b
new_llm_model=model-b
metrics_model=model-b

This confirms the new public conversation method invokes the live set_session_model call, uses the configured timeout, preserves the same session/connection, and reflects the switched model in the agent/LLM/metrics state.

Test 2: agent-server `POST /api/conversations/{id}/switch_acp_model`

Step 1 — Reproduce / establish baseline without the fix:
Ran git switch --detach origin/main && uv run python /tmp/qa_switch_acp_model_api.py:

has_http_route=False
success_status=404
success_body={"detail":"Not Found"}
forwarded_models=[]

This shows the base branch does not expose the HTTP route.

Step 2 — Apply the PR's changes:
Checked out 00f4eb8c08c7216631e18ddced683cfe98f50a8a.

Step 3 — Re-run with the fix in place:
Ran uv run python /tmp/qa_switch_acp_model_api.py:

has_http_route=True
success_status=200
success_body={"success":true}
forwarded_models=['model-b']
missing_status=404
value_error_status=400
runtime_error_status=409
timeout_status=504

This confirms the HTTP endpoint is present, forwards the requested model to the event service, returns success on the happy path, and maps the user-visible error cases to the documented statuses.

Unable to Verify

Real provider end-to-end switching with Claude Code, Codex, or Gemini CLI could not be verified in this runner. I checked availability with command -v claude codex gemini and key presence for ANTHROPIC_API_KEY, OPENAI_API_KEY, GEMINI_API_KEY, and GOOGLE_API_KEY; all provider CLIs and keys were missing. Future QA would be stronger with AGENTS.md guidance for a local non-credential ACP test server or provider-specific setup instructions for runtime model-switch verification.

Issues Found

None.

This review was created by an AI agent (OpenHands) on behalf of the user.

all-hands-bot

Code Review — Post-Approval Follow-up (commits since `46a1cf4d`)

The eight incremental commits correctly address all five issues raised in the post-approval round:

✅ Concurrent switch serialization: _acp_model_switch_lock serializes concurrent switch_acp_model calls so the live-switch and meta.json mirror commit atomically.
✅ Cancellation safety: asyncio.shield + create_task prevents a client disconnect from leaving the live session and meta.json in inconsistent states; failures during cancellation are now logged rather than silently swallowed.
✅ close() deadlock bound: ACP_SWITCH_LOCK_CLOSE_TIMEOUT_SECONDS = 10 s prevents a wedged-server switch from blocking close() for the full acp_prompt_timeout while the lease has already expired.
✅ Resume correctness: _reapply_session_model_on_resume correctly reapplies the persisted model on session resume via the runtime-switch capability (not the initial-selection gate), with proper error classification: server-internal errors propagate, protocol rejections are logged and swallowed.
✅ Init-failure resource leak: Early assignment of _conn/_process/_filtered_reader inside _init() ensures _cleanup() can tear them down even when a later init step (e.g. the model reapply) fails.

Two minor observations on the new code are left as inline comments below.

The release_runtime() / finalizer-disarm pattern, the asyncio.shield cancel-handling block, and the two-flag capability split (supports_set_session_model vs supports_runtime_model_switch) are all clean, targeted solutions to legitimately subtle problems. Test coverage is solid and regression-driven (especially test_switch_acp_model_disarms_discarded_agent_finalizer and the resume error-boundary tests).

[RISK ASSESSMENT]

[Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM — Runtime protocol calls to live ACP sessions introduce a new code path that executes in production while a conversation is in progress. The implementation correctly guards against races, leaks, and partial failures, but the feature touches process-lifecycle code (_start_acp_server, close(), _init()) that is difficult to integration-test fully. The author's real-LLM verification (Claude/Codex/Gemini) provides confidence on the happy path; the edge cases (cancel+save_meta failure, close() lock timeout) are unlikely but acknowledged.

VERDICT:
✅ Worth merging: Concurrency model is correct, resource lifecycle is clean, test coverage is regression-driven, and all previous review feedback has been addressed.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

all-hands-bot

🟡 Taste Rating: Acceptable. Found one cancellation-safety gap inline; the existing unresolved reviewer threads should also be resolved or explicitly accepted by a maintainer. Risk: 🟡 MEDIUM because this changes live ACP runtime behavior and persistence, so I’m leaving this as COMMENT for maintainer/eval judgment.

This review was created by an AI agent (OpenHands) on behalf of the requester.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26481848513

The agent-server switch endpoint had accreted concurrency/cancellation machinery far beyond its sibling switch_llm (a one-liner). Strip it back: - EventService.switch_acp_model: drop the per-conversation lock, the shield/await-task cancellation handling, and the cancel-time logging. It is now guard + run_in_executor(live switch) + mirror acp_model into meta.json + save_meta (~12 lines). close() reverts to its original form (no lock coordination / bounded wait). The guarded scenarios (concurrent switches on one conversation; cancellation mid-switch) are not handled by switch_llm either and are not realistic for a user-driven picker. - _reapply_session_model_on_resume: drop the timeout param and the -32603-vs-client-error distinction; tolerate (log) any ACPRequestError on resume, exactly like the load_session fallback in the same function. This makes it consistent with the other unbounded _init protocol calls (initialize/new_session/set_session_mode). Removes obsolete tests for the deleted machinery. Behavior preserved on the real paths: re-validated end-to-end against the pinned @agentclientprotocol/claude-agent-acp@0.30.0 via the live agent-server (switch -> 200, edge cases 409/400/404), plus 373 unit tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

VascoSch92

LGTM

Integrate the model-state surfacing (#3347) with the runtime model-switch work that landed on main (#3390). Conflicts in acp_agent.py, resolved by adapting #3347's model capture onto main's refactored `_init` (which now sets `_conn`/`_process` inside and returns only success-only fields): `_init` now also returns the resolved `current_model_id` + normalized `available_models`, and the fresh-session branch captures them from `new_session` while the resume branch keeps main's `_reapply_session_model_on_resume`. Integration fixes prompted by review of #3347 against the merged #3390: - ACPAgent.set_acp_model now refreshes `_current_model_id` on a successful live switch, so the chip/picker (ConversationInfo.current_model_id) reflects the new model instead of the stale session-start value. The shallow model_copy in switch_acp_model carries the updated PrivateAttr. - LocalConversation.switch_acp_model keeps the persisted `acp_current_model_id` hint in sync so cold list reads after a switch (pre re-init) aren't stale. - Add ACPAgent.supports_runtime_model_switch (resolved from the detected provider) and surface it on ConversationInfo (persisted in agent_state so cold reads are correct), telling the inline picker whether to offer a live-switch control. Regression tests: switch refreshes current_model_id + persisted hint; supports_runtime_model_switch property + ConversationInfo lift. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

simonrosenberg mentioned this pull request May 26, 2026

Tracking: unify ACP model selection with the LLM-profile UX OpenHands/agent-canvas#769

Open

15 tasks