[codex] Accept ACP user messages during async turns by neubig · Pull Request #3376 · OpenHands/software-agent-sdk

neubig · 2026-05-24T11:55:13Z

Summary

include the condenser LLM usage-id/metrics fix from fix(sdk): assign condenser LLM usage id #3368 so condenser settings do not collide with the agent LLM in the agent server
stream ACP assistant text through transient StreamingDeltaEvents, including ACP providers that invoke token callbacks with plain string chunks
release the conversation state lock around long ACP astep() awaits so websocket/API user messages can be persisted immediately
wrap ACP async event callbacks with short state-lock acquires for emitted events, and send ACP session/cancel when an in-flight turn is interrupted
when a new user message arrives during a running ACP turn, interrupt the current prompt and restart so the new message is processed instead of staying stuck in "sending"

Regression Tests

test_acp_string_token_callback_publishes_delta fails on the prior implementation because ACP plain-string token callbacks raised before publishing any StreamingDeltaEvent
test_acp_arun_accepts_user_message_while_step_is_in_flight fails on main because send_message() waits behind the in-flight ACP prompt
server-level send-message tests cover interrupting and restarting ACP runs when a user message arrives during execution
tests/sdk/test_settings.py -k condenser covers the copied condenser LLM usage-id behavior from fix(sdk): assign condenser LLM usage id #3368

Validation

uv run pytest -q tests/sdk/test_settings.py -k condenser
uv run pytest -q tests/agent_server/test_event_streaming.py tests/agent_server/test_event_service.py::TestEventServiceSendMessage tests/sdk/agent/test_acp_agent.py::TestACPAgentAstep tests/sdk/conversation/local/test_conversation_send_message.py
uv run ruff check openhands-sdk/openhands/sdk/settings/model.py tests/sdk/test_settings.py openhands-agent-server/openhands/agent_server/event_service.py tests/agent_server/test_event_streaming.py

Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.13-nodejs22-slim`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:c3749ff-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-c3749ff-python \
  ghcr.io/openhands/agent-server:c3749ff-python

All tags pushed for this build

ghcr.io/openhands/agent-server:c3749ff-golang-amd64
ghcr.io/openhands/agent-server:c3749ffa83e04f8f64ff102aa5529ef5c65e591c-golang-amd64
ghcr.io/openhands/agent-server:codex-acp-live-message-deltas-golang-amd64
ghcr.io/openhands/agent-server:c3749ff-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:c3749ff-golang-arm64
ghcr.io/openhands/agent-server:c3749ffa83e04f8f64ff102aa5529ef5c65e591c-golang-arm64
ghcr.io/openhands/agent-server:codex-acp-live-message-deltas-golang-arm64
ghcr.io/openhands/agent-server:c3749ff-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:c3749ff-java-amd64
ghcr.io/openhands/agent-server:c3749ffa83e04f8f64ff102aa5529ef5c65e591c-java-amd64
ghcr.io/openhands/agent-server:codex-acp-live-message-deltas-java-amd64
ghcr.io/openhands/agent-server:c3749ff-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:c3749ff-java-arm64
ghcr.io/openhands/agent-server:c3749ffa83e04f8f64ff102aa5529ef5c65e591c-java-arm64
ghcr.io/openhands/agent-server:codex-acp-live-message-deltas-java-arm64
ghcr.io/openhands/agent-server:c3749ff-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:c3749ff-python-amd64
ghcr.io/openhands/agent-server:c3749ffa83e04f8f64ff102aa5529ef5c65e591c-python-amd64
ghcr.io/openhands/agent-server:codex-acp-live-message-deltas-python-amd64
ghcr.io/openhands/agent-server:c3749ff-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:c3749ff-python-arm64
ghcr.io/openhands/agent-server:c3749ffa83e04f8f64ff102aa5529ef5c65e591c-python-arm64
ghcr.io/openhands/agent-server:codex-acp-live-message-deltas-python-arm64
ghcr.io/openhands/agent-server:c3749ff-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:c3749ff-golang
ghcr.io/openhands/agent-server:c3749ffa83e04f8f64ff102aa5529ef5c65e591c-golang
ghcr.io/openhands/agent-server:codex-acp-live-message-deltas-golang
ghcr.io/openhands/agent-server:c3749ff-golang_tag_1.21-bookworm
ghcr.io/openhands/agent-server:c3749ff-java
ghcr.io/openhands/agent-server:c3749ffa83e04f8f64ff102aa5529ef5c65e591c-java
ghcr.io/openhands/agent-server:codex-acp-live-message-deltas-java
ghcr.io/openhands/agent-server:c3749ff-eclipse-temurin_tag_17-jdk
ghcr.io/openhands/agent-server:c3749ff-python
ghcr.io/openhands/agent-server:c3749ffa83e04f8f64ff102aa5529ef5c65e591c-python
ghcr.io/openhands/agent-server:codex-acp-live-message-deltas-python
ghcr.io/openhands/agent-server:c3749ff-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim

About Multi-Architecture Support

Each variant tag (e.g., c3749ff-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., c3749ff-python-amd64) are also available if needed

Co-authored-by: openhands <openhands@all-hands.dev>

github-actions · 2026-05-24T11:55:36Z

Python API breakage checks — ✅ PASSED

Result: ✅ PASSED

Action log

github-actions · 2026-05-24T11:55:48Z

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: ✅ PASSED

Action log

github-actions · 2026-05-24T11:57:35Z

Coverage Report •

File	Stmts	Miss	Cover	Missing
openhands-agent-server/openhands/agent_server
event_service.py	488	100	79%	88–89, 119, 122–123, 127–128, 135, 139, 145, 155–159, 162–165, 224, 245–246, 317, 368, 378, 402–403, 407, 415, 418, 444, 487, 489, 493–495, 499, 508–509, 511, 515, 521, 523, 570, 600, 603, 660, 681, 858–859, 868, 870–871, 875, 889–891, 893, 914, 919–922, 926–929, 937–940, 946–949, 995–996, 998–1005, 1007–1008, 1017–1018, 1020–1021, 1028–1029, 1031–1032, 1052, 1058, 1064, 1073–1074
openhands-sdk/openhands/sdk/agent
acp_agent.py	774	86	88%	351–353, 483–484, 517, 519, 523, 527, 535, 598–599, 604, 671, 826, 829–830, 847–848, 877, 882, 900, 910, 943–946, 1150–1153, 1157–1159, 1162–1166, 1168, 1321, 1335–1338, 1346, 1354–1356, 1363–1364, 1385, 1389–1391, 1395–1396, 1428, 1772, 1776, 1784–1786, 1826, 1830–1831, 1834, 1842–1844, 1846, 1848, 1852, 1855, 1864–1866, 1868, 1893–1898, 1994–1995
openhands-sdk/openhands/sdk/conversation/impl
local_conversation.py	570	52	90%	309, 314, 458, 504, 541, 557, 622, 771–772, 847–848, 851, 969, 980–983, 990–991, 994, 1000–1001, 1004, 1010, 1057, 1060, 1064–1065, 1069–1070, 1073, 1080, 1139, 1147, 1151–1153, 1160, 1246, 1251, 1361, 1363, 1367–1368, 1379–1380, 1405, 1600, 1604, 1674, 1681–1682
TOTAL	28511	6647	76%

all-hands-bot

Clean approach to the live-streaming problem: reusing the existing StreamingDeltaEvent / _pub_sub path for ACP deltas is exactly right — transient chunks don't belong in state.events, and the new helper _publish_streaming_delta_from_thread correctly encapsulates the cross-thread scheduling so neither the LLM token path nor the ACP path has to repeat the run_coroutine_threadsafe / suppress boilerplate. Test coverage is solid (wiring, unwiring, and actual delta delivery are all exercised). Two items worth a look before merge are called out below.

This review was generated by an AI agent (OpenHands) on behalf of the repository owner via OpenHands Automation.

Co-authored-by: openhands <openhands@all-hands.dev>

neubig · 2026-05-24T17:57:10Z

Merged main into this branch and resolved the conflict in EventService.send_message() by preserving the ACP interrupt behavior while keeping the newer rerun-request handling from main. Local verification: env -u GITHUB_TOKEN uv run pytest tests/agent_server/test_event_service.py tests/agent_server/test_event_streaming.py tests/sdk/agent/test_acp_agent.py tests/sdk/conversation/local/test_conversation_send_message.py -q → 289 passed.\n\n_This comment was generated by an AI agent (OpenHands) on behalf of neubig._

all-hands-bot

🟡 Acceptable, but I found two ACP concurrency races that should be addressed before merge: one can replay a just-arrived user prompt, and one can let cancelled prompt updates leak into the next turn. Risk: 🟡 medium, since this changes conversation loop/cancellation behavior.

This review was generated by an AI agent (OpenHands) on behalf of the user.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26368622109

Co-authored-by: openhands <openhands@all-hands.dev>

neubig · 2026-05-24T18:20:37Z

Addressed the two ACP async concurrency review items in 6c89026 and verified locally with ruff plus the full EventService test file (83 passed). Ready for another automated review.\n\n_This comment was generated by an AI agent (OpenHands) on behalf of the user._

Co-authored-by: openhands <openhands@all-hands.dev>

all-hands-bot

🟡 Acceptable direction, but I found a few ACP concurrency ordering gaps that should be addressed before merge. Risk: 🟡 medium because this changes async conversation loop/cancellation behavior.

This review was generated by an AI agent (OpenHands) on behalf of the user.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26369467316

Co-authored-by: openhands <openhands@all-hands.dev>

neubig · 2026-05-24T18:57:45Z

Addressed the latest ACP concurrency review in c9bbb68: atomic async finalization, finish-gap queued-message reconciliation, and cancellation drain-before-failure ordering. Verified locally with ruff, pyright, and targeted ACP async tests.\n\n_This comment was generated by an AI agent (OpenHands) on behalf of the user._

all-hands-bot

🟡 Acceptable direction, but I found a few ACP cancellation/queued-message edge cases plus one slow test that should be addressed before merge. Risk: 🟡 medium because this changes async conversation and cancellation behavior.

This review was generated by an AI agent (OpenHands) on behalf of the user.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26369954489

Co-authored-by: openhands <openhands@all-hands.dev>

neubig · 2026-05-24T19:37:37Z

Addressed the latest ACP review in ce68215: cancelled-prompt drain timeout now restarts/quarantines the ACP session, timeout ordering drains before terminal failure emission, queued messages at the iteration cap are left IDLE for a follow-up run, and the slow cancellation test now releases its fake prompt promptly. Verified locally with ruff, pyright, ACP arun tests, and the ACP cancellation test.\n\n_This comment was generated by an AI agent (OpenHands) on behalf of the user._

all-hands-bot

🟡 Acceptable direction, but I found several ACP cancellation/session-restart and queued-message edge cases that should be addressed before merge. Risk: 🟡 medium because this changes async conversation and cancellation behavior.

This review was generated by an AI agent (OpenHands) on behalf of the user.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26370828864

Co-authored-by: openhands <openhands@all-hands.dev>

neubig · 2026-05-24T20:03:16Z

Addressed the latest ACP cancellation/session review in f581a01: shielded interrupt waits, bounded ACP cancel send, suffix reset on forced restarts, deferred user-cancel restarts, FIFO processing for multiple queued messages, and immediate stop after ACP ERROR/STUCK. Verified locally with ruff, pyright, targeted ACP arun tests, and ACP cancellation regression.\n\n_This comment was generated by an AI agent (OpenHands) on behalf of the user._

…sage-deltas

all-hands-bot

🟡 Acceptable direction, but I found a few ACP async queue/cancellation races to address before merge. Risk: 🟡 medium because this changes agent/conversation loop behavior; a human maintainer should also decide on eval coverage before approval.

This review was generated by an AI agent (OpenHands) on behalf of the user.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26371395754

Co-authored-by: openhands <openhands@all-hands.dev>

neubig · 2026-05-24T20:21:29Z

Addressed the latest ACP queue/cancellation review in ffeb881: persisted FIFO cursor across arun invocations, kept explicit rerun requests through ACP PAUSED interrupts, and serialized timeout/error finalization under the state lock. Verified locally with ruff, pyright, ACP arun regression tests, and ACP cancellation regression.\n\n_This comment was generated by an AI agent (OpenHands) on behalf of the user._

Co-authored-by: openhands <openhands@all-hands.dev>

neubig · 2026-05-25T01:49:33Z

Added a follow-up fix in c3749ff after investigating conversation 1156d472-2173-4deb-bc07-ccd433337e92.

Root cause: after two rapid user interruptions, the ACP bridge timed out waiting for the cancelled prompt to drain. _restart_session_after_drain_timeout() then removed acp_session_id/acp_session_cwd from ConversationState.agent_state, so the restarted Codex ACP subprocess called new_session instead of load_session. That created a new Codex rollout with only the latest user message, losing the previous thread.

Fix: preserve the persisted ACP session id/cwd across restart-after-drain-timeout so the restarted subprocess attempts load_session first. The patch also tracks whether load_session actually succeeded, so stale acp_suffix_installed markers are cleared when we truly fall back to a replacement session.

Verification:

uv run pytest -q tests/sdk/agent/test_acp_agent.py -k "SessionIdPersistence or init_state_sets_installed_when_suffix_marker_persisted"
uv run pytest -q tests/sdk/agent/test_acp_agent.py
uv run ruff check openhands-sdk/openhands/sdk/agent/acp_agent.py tests/sdk/agent/test_acp_agent.py
uv run pre-commit run --files openhands-sdk/openhands/sdk/agent/acp_agent.py tests/sdk/agent/test_acp_agent.py

all-hands-bot

The iterative hardening across this PR (FIFO message tracking, shielded drain, deferred cancel-path restart, suffix-marker clearing on fallback) is solid work. Earlier rounds of review + fixes addressed the major concurrency races. Three remaining points below — two in production code, one in tests.

This review was generated by an AI agent (OpenHands) on behalf of the user via OpenHands Automation.

all-hands-bot · 2026-05-25T02:10:23Z

+            with state:
+                self._emit_turn_timeout(time.monotonic() - t0, state, on_event)
+                if not drained:
+                    self._restart_session_after_drain_timeout(state, on_event)


🟡 Suggestion: The cancel path defers subprocess restart to the next turn (_restart_session_on_next_turn = True) to avoid holding the state lock during slow I/O. The timeout path calls _restart_session_after_drain_timeout here while with state: is already held, which runs _cleanup() + init_state() (subprocess spawn + ACP connection) under that lock. This blocks send_message() for the duration of the restart.

For consistency you could mirror the cancel path: set _restart_session_on_next_turn = True here too, and let the flag be consumed at the top of the next astep(). The timeout path already sets ERROR via _emit_turn_timeout, so arun() will break; the restart would then be picked up by the subsequent init_state() call at the start of the next run (which re-opens the server anyway). If the inline-restart ordering is intentional for the timeout case, a brief comment explaining why it differs from the cancel path would help future readers.

all-hands-bot · 2026-05-25T02:10:23Z

+
+                    acp_user_message_changed = (
+                        self._state.last_user_message_id is not None
+                        and self._state.last_user_message_id != acp_step_user_message_id


🟡 Suggestion: acp_step_user_message_id is None when no queued message was found (e.g., last_acp_prompt_user_message_id already points to the newest event). In that case astep() is called with prompt_message=None and falls back to scanning state.events in reverse — but last_acp_prompt_user_message_id is not updated afterward (line 1096 has the if acp_step_user_message_id is not None: guard). Then this check evaluates as last_user_message_id is not None and last_user_message_id != None → True, spuriously signalling that a new message arrived and setting RUNNING for another iteration. On the next iteration the pattern repeats, so the loop burns through max_iteration_per_run re-prompting the last already-processed message.

In practice EventService only calls run() when a new message exists, so this edge case is unlikely to surface there. Standalone callers that loop arun() without a new message could hit it. A guard like if acp_step_user_message is None: break # no queued message before the await self.agent.astep(...) call would close the gap and make the intent explicit.

all-hands-bot · 2026-05-25T02:10:23Z

+
+        completed_promptly = True
+        try:
+            await asyncio.wait_for(send_done.wait(), timeout=0.2)


🟡 Suggestion: The 0.2 s deadline is a design assertion ("send_message must not block on the in-flight prompt"), not a performance benchmark, so the intent is good. However it can produce false negatives on heavily loaded CI runners where thread scheduling latency alone exceeds 200 ms. Consider replacing the wait_for + completed_promptly flag pattern with a threading.Event that send_message sets synchronously once the lock is acquired, then assert that the event was set before release_first_step. That approach is instant rather than time-bounded and cannot flake under load.

all-hands-bot

✅ QA Report: PASS

Verified ACP live message behavior through a real stdio ACP JSON-RPC harness; the PR achieves its stated goal with no QA issues found.

Does this PR achieve its stated goal?

Yes. I exercised the changed SDK and agent-server paths with a minimal ACP server instead of relying on unit tests: on the base commit, ACP send_message() waited behind an in-flight prompt and ACP string chunks did not produce streaming deltas; on the PR, the intervening user message persisted immediately, the queued ACP prompt ran, streaming deltas were published, and EventService sent session/cancel before restarting with the replacement prompt. The condenser settings path also constructs separate default and condenser LLM usage IDs on the PR.

Phase	Result
Environment Setup	✅ `uv sync --dev --frozen` completed (`Checked 235 packages in 2ms`)
CI Status	🟡 Core build/test/pre-commit checks are green; OpenHands `pr-review` and `qa-changes` automation checks are still in progress
Functional Verification	✅ ACP queued messages, streaming deltas, interrupt/cancel restart, and condenser usage IDs verified

Functional Verification

Test 1: Local ACP conversation accepts a user message while a prompt is in flight

Step 1 — Reproduce baseline without the PR:
Checked out origin/main (2aa5256e2147a3252be8d1f96600f627ec27abbb) and ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_harness.py local_queue:

QA_RESULT={"cancel_events": [], "completed_within_250ms": false, "mode": "local_queue", "prompts": ["initial request", "intervening request"], "second_prompt_auto": true, "send_elapsed_if_prompt": null, "send_elapsed_total": 2.7864337420000425}

This confirms the old behavior: the intervening send_message() did not persist promptly; it waited ~2.8s for the in-flight ACP prompt/usage wait to complete.

Step 2 — Apply the PR's changes:
Checked out c3749ffa83e04f8f64ff102aa5529ef5c65e591c.

Step 3 — Re-run with the fix in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_harness.py local_queue:

QA_RESULT={"cancel_events": [], "completed_within_250ms": true, "mode": "local_queue", "prompts": ["initial request", "intervening request"], "second_prompt_auto": true, "send_elapsed_if_prompt": 0.0020447240000294187, "send_elapsed_total": 0.0020454750000453714}

This shows the PR fixes the core async-turn issue: the new user message persisted in ~2ms while the first ACP prompt was still active, and the second ACP prompt was processed automatically in FIFO order.

Test 2: ACP plain-string token callbacks publish streaming deltas

Step 1 — Reproduce baseline without the PR:
Checked out origin/main and ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_harness.py event_streaming:

QA_RESULT={"delta_contents": [], "delta_count": 0, "mode": "event_streaming", "prompts": ["stream this"]}

This confirms the previous agent-server streaming path did not surface ACP plain-string chunks as StreamingDeltaEvents.

Step 2 — Apply the PR's changes:
Checked out c3749ffa83e04f8f64ff102aa5529ef5c65e591c.

Step 3 — Re-run with the fix in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_harness.py event_streaming:

QA_RESULT={"delta_contents": ["delta for stream this"], "delta_count": 1, "mode": "event_streaming", "prompts": ["stream this"]}

This shows the ACP string chunk reached subscribers as a transient streaming delta.

Test 3: Agent-server interrupts an in-flight ACP turn and sends `session/cancel`

Step 1 — Reproduce baseline without the PR:
Checked out origin/main and ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_harness.py event_interrupt:

QA_RESULT={"cancel_count": 0, "mode": "event_interrupt", "prompts": ["initial long request", "replacement request"], "replacement_prompt_seen": true, "send_elapsed": 7.073722684000018}

This shows the old EventService path did not send session/cancel; the second send_message(run=True) waited for the first long ACP prompt and then for the replacement run.

Step 2 — Apply the PR's changes:
Checked out c3749ffa83e04f8f64ff102aa5529ef5c65e591c.

Step 3 — Re-run with the fix in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_harness.py event_interrupt:

QA_RESULT={"cancel_count": 1, "mode": "event_interrupt", "prompts": ["initial long request", "replacement request"], "replacement_prompt_seen": true, "send_elapsed": 2.009888352999951}

This confirms the PR sends ACP session/cancel during the interrupted turn and restarts against the replacement user prompt instead of waiting for the old long prompt to complete.

Test 4: Condenser LLM usage IDs stay separated

Step 1 — Establish baseline:
On origin/main, ran a real settings construction script using OpenHandsAgentSettings, CondenserSettings(enabled=True), and create_agent_from_settings(...):

Agent
agent_llm_usage_id= default
condenser= LLMSummarizingCondenser
all_llm_usage_ids= ['default', 'condenser']

Step 2 — Apply the PR's changes and re-run:
On c3749ffa83e04f8f64ff102aa5529ef5c65e591c, ran the same script:

Agent
agent_llm_usage_id= default
condenser= LLMSummarizingCondenser
all_llm_usage_ids= ['default', 'condenser']

This verifies the PR branch exposes distinct runtime usage IDs for agent and condenser LLMs; the current origin/main already has the same behavior, so this was a no-regression check for the copied fix.

Issues Found

None.

This review was created by an AI agent (OpenHands) on behalf of the user.

all-hands-bot

🟡 Acceptable direction, but I found a few ACP async queue/cancellation edge cases that should be addressed before merge. Risk: 🟡 medium because this changes agent/conversation loop behavior; a human maintainer should decide eval coverage before approval.

This review was generated by an AI agent (OpenHands) on behalf of the user.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26379406959

all-hands-bot · 2026-05-25T02:22:49Z

+                    # the message and we are FINISHED here, so the guard avoids a
+                    # redundant run. A deliberate run=False append, or an IDLE
+                    # reached via another path, never sets the flag.
                    if self._rerun_requested:


🟠 Important: _rerun_requested now stays set when the current run finishes in a non-restartable state (for example, FINISHED after the still-running loop already absorbed the run=True message). That stale flag can be consumed by a later IDLE/ACP-PAUSED cleanup and start an unexpected run, including after a later run=False append. Please clear/consume the flag before checking status, and set it back only when the replacement run() attempt races with another active task.

all-hands-bot · 2026-05-25T02:22:49Z

+                        ]
+                        if last_acp_prompt_user_message_id is None:
+                            acp_step_user_message = (
+                                user_messages[-1] if user_messages else None


🟠 Important: With no persisted ACP cursor, this starts from the latest user message. If a client queued multiple user messages before the first arun() (e.g. run=False sends followed by run=True), every earlier message is skipped and the latest id is persisted as the cursor. Since this path is now responsible for FIFO ACP processing, the initial cursor case should also process the earliest unprocessed user message, or establish an explicit baseline before selecting the next prompt.

all-hands-bot · 2026-05-25T02:22:49Z

        except TimeoutError:
-            self._emit_turn_timeout(time.monotonic() - t0, state, on_event)
+            await self._arequest_session_cancel()
+            drained = await self._drain_cancelled_prompt(prompt_future)


🟠 Important: If the shielded prompt completes during _drain_cancelled_prompt(), drained is True but the completed PromptResponse is discarded and the code still emits a timeout while preserving the same ACP session. That can leave ACP server history with a completed turn that OpenHands records as a timeout. Please have the drain helper return the completed result/exception so it can be finalized, or restart/discard the session whenever a completed turn is intentionally ignored.

all-hands-bot · 2026-05-25T02:22:49Z

+            with state:
+                self._emit_turn_timeout(time.monotonic() - t0, state, on_event)
+                if not drained:
+                    self._restart_session_after_drain_timeout(state, on_event)


🟠 Important: On drain timeout, _restart_session_after_drain_timeout() replaces _client; the finally block then calls _clear_turn_callbacks() on the new bridge, leaving the old bridge (the one still owned by the undrained prompt) with on_event/on_token wired. Clear the old callbacks before cleanup/restart, or have the restart helper clear them before swapping clients, so late old-session updates cannot append after the synthetic timeout/failure events.

all-hands-bot · 2026-05-25T02:22:49Z

-                await asyncio.sleep(60)
+                # Block until cancellation releases the prompt, exercising
+                # prompt quiescing without leaving a long-running portal task.
+                await asyncio.to_thread(prompt_released.wait)


🟡 Suggestion: If the cancellation path regresses before _fake_cancel runs, this to_thread worker blocks forever and executor.close() can hang after the test times out, masking the actual failure. Release prompt_released in a cleanup finally (or use a bounded wait) before closing the executor so the test fails cleanly.

openhands-agent added 2 commits May 23, 2026 16:59

fix(sdk): assign condenser LLM usage id

0a180f1

Co-authored-by: openhands <openhands@all-hands.dev>

fix(sdk): reset condenser LLM metrics

43032a6

Co-authored-by: openhands <openhands@all-hands.dev>

neubig mentioned this pull request May 24, 2026

[codex] Render streaming assistant deltas OpenHands/agent-canvas#753

Open

neubig marked this pull request as ready for review May 24, 2026 12:25

neubig requested a review from simonrosenberg May 24, 2026 12:25

all-hands-bot reviewed May 24, 2026

View reviewed changes

Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py Outdated

Comment thread openhands-agent-server/openhands/agent_server/event_service.py Outdated

neubig changed the title ~~[codex] Stream ACP assistant message deltas~~ [codex] Accept ACP user messages during async turns May 24, 2026

fix(acp): accept user messages during async turns

ca04e8c

neubig force-pushed the codex/acp-live-message-deltas branch from dc3d0a3 to ca04e8c Compare May 24, 2026 13:51

neubig and others added 4 commits May 24, 2026 10:40

fix(agent-server): stream ACP text deltas

f50ed72

fix(acp): interrupt running turn on new user message

71f8148

fix(agent-server): handle ACP string token deltas

b9dc93b

merge main into PR #3376

b336d05

Co-authored-by: openhands <openhands@all-hands.dev>

neubig added the review-this This label triggers a PR review by OpenHands label May 24, 2026 — with OpenHands AI

all-hands-bot reviewed May 24, 2026

View reviewed changes

Comment thread openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py Outdated

Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py Outdated

fix: address ACP async turn races (#3376)

6c89026

Co-authored-by: openhands <openhands@all-hands.dev>

openhands-agent added 2 commits May 24, 2026 18:22

fix: satisfy ACP prompt future typing (#3376)

49583a2

Co-authored-by: openhands <openhands@all-hands.dev>

test: update ACP arun prompt snapshot test (#3376)

55b6c38

Co-authored-by: openhands <openhands@all-hands.dev>

neubig removed the review-this This label triggers a PR review by OpenHands label May 24, 2026

neubig added the review-this This label triggers a PR review by OpenHands label May 24, 2026 — with OpenHands AI

all-hands-bot reviewed May 24, 2026

View reviewed changes

Comment thread openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py

Comment thread openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py Outdated

Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py

fix: close ACP async ordering gaps (#3376)

c9bbb68

Co-authored-by: openhands <openhands@all-hands.dev>

neubig removed the review-this This label triggers a PR review by OpenHands label May 24, 2026

neubig added the review-this This label triggers a PR review by OpenHands label May 24, 2026 — with OpenHands AI

all-hands-bot reviewed May 24, 2026

View reviewed changes

Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py

Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py Outdated

Comment thread openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py

Comment thread tests/sdk/agent/test_acp_agent.py

fix: address ACP cancellation edge cases (#3376)

ce68215

Co-authored-by: openhands <openhands@all-hands.dev>

neubig removed the review-this This label triggers a PR review by OpenHands label May 24, 2026

neubig added the review-this This label triggers a PR review by OpenHands label May 24, 2026 — with OpenHands AI

all-hands-bot reviewed May 24, 2026

View reviewed changes

fix: address ACP review edge cases (#3376)

f581a01

Co-authored-by: openhands <openhands@all-hands.dev>

neubig removed the review-this This label triggers a PR review by OpenHands label May 24, 2026

neubig added the review-this This label triggers a PR review by OpenHands label May 24, 2026 — with OpenHands AI

Merge branch 'pr-3368-fix-condenser-usage-id' into codex/acp-live-mes…

f1ca28d

…sage-deltas

all-hands-bot reviewed May 24, 2026

View reviewed changes

Comment thread openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py

Comment thread openhands-agent-server/openhands/agent_server/event_service.py

Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py Outdated

fix: close ACP rerun race windows (#3376)

ffeb881

Co-authored-by: openhands <openhands@all-hands.dev>

neubig removed the review-this This label triggers a PR review by OpenHands label May 24, 2026

neubig added the review-this This label triggers a PR review by OpenHands label May 24, 2026 — with OpenHands AI

neubig and others added 3 commits May 24, 2026 16:30

Merge branch 'main' into codex/acp-live-message-deltas

c1c37ca

fix(acp): reassign agent state for prompt tracking

1c390af

fix(acp): resume session after cancel drain timeout

c3749ff

Co-authored-by: openhands <openhands@all-hands.dev>

neubig requested a review from all-hands-bot May 25, 2026 02:02

all-hands-bot reviewed May 25, 2026

View reviewed changes

Conversation

neubig commented May 24, 2026 • edited by github-actions Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Regression Tests

Validation

Uh oh!

github-actions Bot commented May 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Python API breakage checks — ✅ PASSED

Uh oh!

github-actions Bot commented May 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

REST API breakage checks (OpenAPI) — ✅ PASSED

Uh oh!

github-actions Bot commented May 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

neubig commented May 24, 2026

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

neubig commented May 24, 2026

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

neubig commented May 24, 2026

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

neubig commented May 24, 2026

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

neubig commented May 24, 2026

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

neubig commented May 24, 2026

Uh oh!

neubig commented May 25, 2026

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

all-hands-bot left a comment

neubig commented May 24, 2026 •

edited by github-actions Bot

Loading

github-actions Bot commented May 24, 2026 •

edited

Loading

github-actions Bot commented May 24, 2026 •

edited

Loading

github-actions Bot commented May 24, 2026 •

edited

Loading

Test 3: Agent-server interrupts an in-flight ACP turn and sends `session/cancel`