Add conversation observability metadata by neubig · Pull Request #3270 · OpenHands/software-agent-sdk

neubig · 2026-05-15T20:03:30Z

Summary

• Add validated conversation-level observability_metadata and observability_tags fields to start requests.
• Pass metadata through local/remote conversation construction and the agent server into Laminar root spans.
• Apply trace metadata and span tags when creating the Laminar root span.
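As a rough illustration of what "validated" could mean here, the sketch below accepts flat metadata (scalars or lists of scalars) and rejects nested objects, which matches the behavior the QA probe later in this thread reports. The function names and exact rules are assumptions for illustration; the PR's real type is ConversationObservabilityMetadata, and its implementation is not shown on this page.

```python
from typing import Any

# Assumption: metadata values are scalars or lists of scalars.
# These helper names are hypothetical, not the SDK's API.
_SCALARS = (str, bool, int, float)


def validate_observability_metadata(metadata: dict[str, Any]) -> dict[str, Any]:
    """Accept flat metadata; raise ValueError on nested objects."""
    for key, value in metadata.items():
        if not isinstance(key, str):
            raise ValueError(f"metadata key {key!r} must be a string")
        if isinstance(value, _SCALARS):
            continue
        if isinstance(value, list) and all(isinstance(v, _SCALARS) for v in value):
            continue
        raise ValueError(
            f"metadata value for {key!r} must be a scalar or a list of scalars"
        )
    return metadata


def validate_observability_tags(tags: list[str]) -> list[str]:
    """Accept a list of non-empty strings."""
    for tag in tags:
        if not isinstance(tag, str) or not tag:
            raise ValueError(f"invalid tag: {tag!r}")
    return tags
```

A payload such as `{"repo_name": "OpenHands/software-agent-sdk", "retry_count": 3}` passes under these assumed rules, while `{"nested": {"a": 1}}` raises ValueError.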

Tests

uv run pytest -q tests/sdk/conversation/test_base_span_management.py tests/sdk/conversation/test_tags.py tests/sdk/observability/test_laminar.py
uv run ruff check openhands-sdk/openhands/sdk/conversation/base.py openhands-sdk/openhands/sdk/conversation/conversation.py openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py openhands-sdk/openhands/sdk/conversation/impl/remote_conversation.py openhands-sdk/openhands/sdk/conversation/request.py openhands-sdk/openhands/sdk/conversation/types.py openhands-sdk/openhands/sdk/observability/laminar.py openhands-sdk/openhands/sdk/settings/model.py openhands-agent-server/openhands/agent_server/event_service.py tests/sdk/conversation/test_base_span_management.py tests/sdk/conversation/test_tags.py tests/sdk/observability/test_laminar.py

This PR was created by an AI agent (OpenHands) on behalf of the user.


Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.13-nodejs22-slim`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:725d8f5-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-725d8f5-python \
  ghcr.io/openhands/agent-server:725d8f5-python

All tags pushed for this build

ghcr.io/openhands/agent-server:725d8f5-golang-amd64
ghcr.io/openhands/agent-server:725d8f5bfab6e37d54c3f7edd47cd33fde8609d7-golang-amd64
ghcr.io/openhands/agent-server:openhands-laminar-conversation-metadata-golang-amd64
ghcr.io/openhands/agent-server:725d8f5-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:725d8f5-golang-arm64
ghcr.io/openhands/agent-server:725d8f5bfab6e37d54c3f7edd47cd33fde8609d7-golang-arm64
ghcr.io/openhands/agent-server:openhands-laminar-conversation-metadata-golang-arm64
ghcr.io/openhands/agent-server:725d8f5-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:725d8f5-java-amd64
ghcr.io/openhands/agent-server:725d8f5bfab6e37d54c3f7edd47cd33fde8609d7-java-amd64
ghcr.io/openhands/agent-server:openhands-laminar-conversation-metadata-java-amd64
ghcr.io/openhands/agent-server:725d8f5-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:725d8f5-java-arm64
ghcr.io/openhands/agent-server:725d8f5bfab6e37d54c3f7edd47cd33fde8609d7-java-arm64
ghcr.io/openhands/agent-server:openhands-laminar-conversation-metadata-java-arm64
ghcr.io/openhands/agent-server:725d8f5-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:725d8f5-python-amd64
ghcr.io/openhands/agent-server:725d8f5bfab6e37d54c3f7edd47cd33fde8609d7-python-amd64
ghcr.io/openhands/agent-server:openhands-laminar-conversation-metadata-python-amd64
ghcr.io/openhands/agent-server:725d8f5-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:725d8f5-python-arm64
ghcr.io/openhands/agent-server:725d8f5bfab6e37d54c3f7edd47cd33fde8609d7-python-arm64
ghcr.io/openhands/agent-server:openhands-laminar-conversation-metadata-python-arm64
ghcr.io/openhands/agent-server:725d8f5-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:725d8f5-golang
ghcr.io/openhands/agent-server:725d8f5bfab6e37d54c3f7edd47cd33fde8609d7-golang
ghcr.io/openhands/agent-server:openhands-laminar-conversation-metadata-golang
ghcr.io/openhands/agent-server:725d8f5-golang_tag_1.21-bookworm
ghcr.io/openhands/agent-server:725d8f5-java
ghcr.io/openhands/agent-server:725d8f5bfab6e37d54c3f7edd47cd33fde8609d7-java
ghcr.io/openhands/agent-server:openhands-laminar-conversation-metadata-java
ghcr.io/openhands/agent-server:725d8f5-eclipse-temurin_tag_17-jdk
ghcr.io/openhands/agent-server:725d8f5-python
ghcr.io/openhands/agent-server:725d8f5bfab6e37d54c3f7edd47cd33fde8609d7-python
ghcr.io/openhands/agent-server:openhands-laminar-conversation-metadata-python
ghcr.io/openhands/agent-server:725d8f5-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim

About Multi-Architecture Support

Each variant tag (e.g., 725d8f5-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., 725d8f5-python-amd64) are also available if needed

Issue

Related to OpenHands/OpenHands#14457

Related issue: #3344

Co-authored-by: openhands <openhands@all-hands.dev>

github-actions · 2026-05-15T20:04:01Z

Python API breakage checks — ✅ PASSED

Result: ✅ PASSED


github-actions · 2026-05-15T20:04:09Z

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: ✅ PASSED


github-actions · 2026-05-15T20:06:33Z

Coverage Report

File	Stmts	Miss	Cover	Missing
openhands-agent-server/openhands/agent_server
api.py	252	21	91%	100, 102–107, 109, 111, 113, 147, 159, 174, 180, 453, 456, 460–462, 464, 470
event_service.py	467	97	79%	87–88, 118, 121–122, 126–127, 134, 138, 144, 154–158, 161–164, 223, 244–245, 316, 367, 377, 401–402, 406, 414, 417, 475, 477, 481–483, 487, 496–497, 499, 503, 509, 511, 558, 588, 591, 642, 646, 840, 842–843, 847, 861–863, 865, 886, 891–894, 898–901, 909–912, 918–921, 967–968, 970–977, 979–980, 989–990, 992–993, 1000–1001, 1003–1004, 1024, 1030, 1036, 1045–1046
openhands-sdk/openhands/sdk/conversation
base.py	95	2	97%	209, 263
conversation.py	34	8	76%	150, 163–164, 170–173, 177
request.py	72	8	88%	65, 232, 235, 237–238, 241–242, 253
types.py	65	1	98%	100
openhands-sdk/openhands/sdk/conversation/impl
local_conversation.py	531	44	91%	319, 324, 468, 514, 551, 567, 632, 857–858, 861, 974, 985–988, 995–996, 999, 1005–1006, 1009, 1015, 1030, 1033, 1037–1038, 1042–1044, 1051, 1137, 1142, 1252, 1254, 1258–1259, 1270–1271, 1296, 1491, 1495, 1565, 1572–1573
remote_conversation.py	657	77	88%	144, 171, 184, 186–189, 199, 221–222, 227–230, 314, 324–326, 332, 406, 553–556, 558, 584–588, 593–596, 599, 615, 774–775, 779–780, 794, 840, 851–852, 872–875, 877–878, 909, 919, 923, 932–933, 972, 1103, 1175–1176, 1180, 1185–1189, 1195–1201, 1214, 1219, 1254, 1467–1468
openhands-sdk/openhands/sdk/observability
laminar.py	198	48	75%	37–41, 103, 170–171, 184–185, 211–212, 280–287, 303–304, 311–313, 332–333, 336–338, 342, 344–346, 361, 365, 417, 431, 475, 497–500, 533–535, 537–538
openhands-sdk/openhands/sdk/settings
model.py	569	51	91%	87, 112, 117, 356, 366–369, 372, 385, 389, 395, 405, 411, 416, 616, 629, 640, 650, 654, 656, 658, 660, 662, 664, 666, 668, 670, 945, 947, 1060, 1244, 1312, 1351, 1378, 1414–1417, 1443, 1612, 1644, 1654, 1656, 1661, 1679, 1692, 1694, 1696, 1698, 1705
TOTAL	28450	6522	77%

…versation-metadata

# Conflicts:
# openhands-agent-server/openhands/agent_server/event_service.py
# openhands-sdk/openhands/sdk/conversation/base.py
# openhands-sdk/openhands/sdk/conversation/conversation.py
# openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py
# openhands-sdk/openhands/sdk/conversation/impl/remote_conversation.py
# openhands-sdk/openhands/sdk/conversation/request.py
# openhands-sdk/openhands/sdk/observability/laminar.py
# tests/sdk/conversation/test_base_span_management.py

all-hands-bot

Taste Rating: 🟡 Acceptable. The core data flow looks reasonable, but I found two validation issues worth fixing before relying on this new public API surface.

Risk: 🟡 Medium — new request/settings fields can mishandle malformed metadata input.

This review was generated by an AI agent (OpenHands) on behalf of the user.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26411135703

Co-authored-by: openhands <openhands@all-hands.dev>

neubig · 2026-05-27T04:36:57Z

Addressed the open review feedback in dd37fff and resolved the review threads. I also reran local settings/tag validation: uv run pytest -q tests/sdk/conversation/test_tags.py tests/sdk/test_settings.py

This comment was created by an AI agent (OpenHands) on behalf of the user.

all-hands-bot · 2026-05-27T04:37:48Z

✅ Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

all-hands-bot

Taste Rating: 🟢 Looks Good

The implementation is clean and well-structured. The validated ConversationObservabilityMetadata type, the consistent threading through local/remote/server paths, and the targeted test suite all reflect solid design. Prior review issues are already addressed in this HEAD.

Two observations below — one on validation consistency and one on test coverage for the remote path.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.

all-hands-bot

⚠️ QA Report: PASS WITH ISSUES

Valid observability metadata now flows through the SDK/start-request paths and into Laminar span calls, but invalid metadata sent to the real agent-server API returns HTTP 500 instead of a validation response.

Does this PR achieve its stated goal?

Partially. I verified that the new conversation-level metadata/tags are absent on main, present on this PR in StartConversationRequest and ConversationSettings, and passed to Laminar when constructing a real local Conversation. I also verified a real POST /api/conversations accepts a valid metadata/tags payload. However, the new validation path currently turns invalid API input into an internal server error, so the “validated start requests” part is not fully user-safe yet.

Phase	Result
Environment Setup	✅ `make build` completed successfully; no tests/linters were run.
CI Status	⏳ Observed 20 successful, 2 skipped, 8 pending, 0 failing checks.
Functional Verification	⚠️ Valid SDK/API paths work; invalid HTTP validation returns 500.

Functional Verification

Test 1: SDK start request/settings/local conversation behavior before and after

Step 1 — Establish baseline on main:
Ran git switch main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_observability_probe.py with a script that constructs StartConversationRequest, ConversationSettings.create_request(...), and a local Conversation(...) using observability metadata/tags:

[
  {"label":"start_request","ok":true,"value":{"metadata_in_dump":null,"tags_in_dump":null,"invalid_nested_rejected":false}},
  {"label":"conversation_settings","ok":true,"value":{"settings_metadata":null,"settings_tags":null,"request_metadata":null,"request_tags":null}},
  {"label":"local_conversation_laminar_calls","ok":false,"error":"TypeError: Conversation.__new__() got an unexpected keyword argument 'observability_metadata'"}
]

This confirms the baseline did not expose or preserve the new metadata/tags inputs.

Step 2 — Apply the PR's changes:
Checked out openhands/laminar-conversation-metadata at dd37fffb334201c7a33493910b1992eb47298fa6.

Step 3 — Re-run with the fix in place:
Ran the same probe:

[
  {"label":"start_request","ok":true,"value":{"metadata_in_dump":{"repo_name":"OpenHands/software-agent-sdk","private":true,"retry_count":3,"scores":[0.1,0.2]},"tags_in_dump":["repo:OpenHands/software-agent-sdk","qa"],"invalid_nested_rejected":true}},
  {"label":"conversation_settings","ok":true,"value":{"settings_metadata":{"repo":"OpenHands/software-agent-sdk"},"settings_tags":["repo:OpenHands/software-agent-sdk"],"request_metadata":{"repo":"OpenHands/software-agent-sdk"},"request_tags":["repo:OpenHands/software-agent-sdk"]}},
  {"label":"local_conversation_laminar_calls","ok":true,"value":["set_trace_session_id","set_trace_user_id","set_trace_metadata","set_span_tags"]}
]

This shows the PR preserves valid metadata/tags in request models/settings and invokes Laminar trace metadata/span tag methods during real local conversation construction.

Test 2: Real agent-server HTTP start requests

Step 1 — Start the software:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python -m openhands.agent_server --port 18080 and waited for GET /server_info to return 200.

Step 2 — Create a conversation with valid metadata/tags:
Ran curl -H 'Content-Type: application/json' --data @/tmp/qa_valid_start.json http://127.0.0.1:18080/api/conversations:

{"id":"8bc0189f-6696-4d29-ae29-9afa4a11dc28","workspace":{"working_dir":"/tmp/qa-http-workspace","kind":"LocalWorkspace"},"execution_status":"idle"}
HTTP_STATUS:201

This confirms a real user can create an agent-server conversation with valid observability metadata/tags.

Step 3 — Send invalid metadata to exercise validation:
Ran curl -H 'Content-Type: application/json' --data @/tmp/qa_invalid_start.json http://127.0.0.1:18080/api/conversations, where observability_metadata contained a nested object:

{"detail":"Internal Server Error","exception":"Object of type ValueError is not JSON serializable"}
HTTP_STATUS:500

The server log shows FastAPI raised RequestValidationError, then the validation handler failed serializing ctx.error: ValueError(...). This confirms invalid observability metadata is detected, but the user-facing API response is incorrectly a 500.
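The failure mode described above can be reproduced without a server: a validation error entry whose `ctx` holds a raw exception object cannot be passed to `json.dumps`. The sketch below shows one possible way to make such error lists JSON-safe before building a 4xx response. The helper name `make_json_safe` and the sample error entry are hypothetical, and this is not necessarily how the later fix commit addresses the issue.

```python
import json
from typing import Any


def make_json_safe(value: Any) -> Any:
    """Recursively convert exceptions and unknown objects to strings."""
    if isinstance(value, BaseException):
        return str(value)
    if isinstance(value, dict):
        return {str(k): make_json_safe(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [make_json_safe(v) for v in value]
    if value is None or isinstance(value, (str, int, float, bool)):
        return value
    return repr(value)


# Hypothetical error entry shaped like a validation error whose
# ctx carries the original ValueError instance.
errors = [
    {
        "type": "value_error",
        "loc": ["body", "observability_metadata"],
        "msg": "Value error, nested objects are not allowed",
        "ctx": {"error": ValueError("nested objects are not allowed")},
    }
]

# json.dumps(errors) would raise TypeError here; the sanitized copy does not.
body = json.dumps({"detail": make_json_safe(errors)})
```

With a handler that serializes the sanitized list, the client would see the validation details in a 4xx body rather than a 500.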

Issues Found

🟠 Issue: Invalid observability_metadata in POST /api/conversations returns HTTP 500 instead of a 4xx validation response.

This review was created by an AI agent (OpenHands) on behalf of the user.

Co-authored-by: openhands <openhands@all-hands.dev>

all-hands-bot

Taste Rating: 🟡 Acceptable. The data flow is mostly sound, but I found two observability issues worth fixing before merge.

Risk: 🟡 Medium — new root-span metadata plumbing can produce incomplete or misleading traces if these edge cases hit.

This review was generated by an AI agent (OpenHands) on behalf of the user.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26491056754

all-hands-bot · 2026-05-27T04:50:53Z

+                # Include tags and observability metadata if provided
                "tags": tags or {},
+                "observability_metadata": observability_metadata or {},
+                "observability_tags": observability_tags or [],


🟠 Important: The server-side LocalConversation now consumes these observability fields from StoredConversation, but the remote create payload still omits the existing user_id field. That means a remote SDK caller gets user_id on the client proxy root span only, while the server root span covering agent execution loses it. Please include "user_id": user_id in this payload and add a payload propagation assertion alongside the metadata/tags coverage.

all-hands-bot · 2026-05-27T04:50:53Z

                    if user_id:
                        Laminar.set_trace_user_id(user_id)
+                    if metadata:
+                        Laminar.set_trace_metadata(metadata)


🟠 Important: These new metadata/tag setters run inside Laminar.use_span() with its default record_exception=True / set_status_on_exception=True. If a best-effort setter raises (for example due to Laminar API or serialization incompatibility), contextlib.suppress hides the failure from the SDK caller but Laminar still records the exception and marks the conversation root span as errored. Please use Laminar.use_span(self.span, record_exception=False, set_status_on_exception=False) for this setup block so observability setup failures don't corrupt trace status.
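To make the interaction concrete, the sketch below models it with stand-ins rather than the Laminar library: `FakeSpan` and this `use_span` are hypothetical stubs that only mimic the two flags named in the comment. It assumes `contextlib.suppress` wraps the `use_span` block, so the exception passes through the span context manager before it is swallowed.

```python
import contextlib


class FakeSpan:
    """Stand-in for a root span; not a Laminar type."""

    def __init__(self) -> None:
        self.errored = False
        self.exceptions: list[Exception] = []


@contextlib.contextmanager
def use_span(span, record_exception=True, set_status_on_exception=True):
    """Stub that records exceptions escaping the block, per its flags."""
    try:
        yield span
    except Exception as exc:
        if record_exception:
            span.exceptions.append(exc)
        if set_status_on_exception:
            span.errored = True
        raise


def failing_setter() -> None:
    raise RuntimeError("metadata could not be serialized")


def setup_span(span: FakeSpan, **flags) -> FakeSpan:
    # The caller never sees the error, but the span may still be marked.
    with contextlib.suppress(Exception):
        with use_span(span, **flags):
            failing_setter()
    return span


default_span = setup_span(FakeSpan())
quiet_span = setup_span(
    FakeSpan(), record_exception=False, set_status_on_exception=False
)
```

In this model `default_span` ends up errored with one recorded exception, while `quiet_span` stays clean, which is the difference the reviewer is pointing at.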

Add conversation observability metadata

8467f37

Co-authored-by: openhands <openhands@all-hands.dev>

neubig mentioned this pull request May 15, 2026

Pass repository metadata to observability traces OpenHands/OpenHands#14431

Draft

This was referenced May 18, 2026

Pass repository metadata to observability traces OpenHands/OpenHands#14457

Open

Track PR #3270: Add conversation observability metadata #3344

Open

neubig added 2 commits May 24, 2026 16:34

test(observability): satisfy pyright for metadata tests

9d8db8f

neubig added the review-this label (this label triggers a PR review by OpenHands) May 25, 2026

all-hands-bot reviewed May 25, 2026


Comment thread openhands-sdk/openhands/sdk/conversation/types.py

Comment thread openhands-sdk/openhands/sdk/settings/model.py Outdated

openhands-agent added 2 commits May 27, 2026 04:32

chore: address PR review feedback (#3270)

015b17c

Co-authored-by: openhands <openhands@all-hands.dev>

fix: remove unused metadata type import

dd37fff

Co-authored-by: openhands <openhands@all-hands.dev>

neubig requested a review from all-hands-bot May 27, 2026 04:36

all-hands-bot reviewed May 27, 2026


Comment thread openhands-sdk/openhands/sdk/conversation/request.py Outdated

Comment thread openhands-sdk/openhands/sdk/conversation/impl/remote_conversation.py

all-hands-bot reviewed May 27, 2026


Comment thread openhands-sdk/openhands/sdk/conversation/types.py

fix: make observability validation API-safe

725d8f5

Co-authored-by: openhands <openhands@all-hands.dev>

all-hands-bot reviewed May 27, 2026


3 participants
