Add conversation observability metadata #3270

Draft
neubig wants to merge 6 commits into main from openhands/laminar-conversation-metadata

neubig (Member) commented May 15, 2026

Summary

  • add validated conversation-level observability_metadata and observability_tags fields to start requests
  • pass metadata through local/remote conversation construction and the agent server into Laminar root spans
  • apply trace metadata and span tags when creating the Laminar root span
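
As a rough illustration of what "validated" metadata could mean here, the sketch below accepts flat values (strings, numbers, booleans, and lists of those) and rejects nested objects, which matches the behavior the QA probe later in this thread reports. The function name and the exact rules are assumptions for illustration, not the PR's ConversationObservabilityMetadata implementation.

```python
from typing import Any

# Hypothetical helper: the accepted scalar types and rules are assumptions,
# not the PR's actual validation code.
_SCALARS = (str, int, float, bool)


def validate_observability_metadata(metadata: dict[str, Any]) -> dict[str, Any]:
    """Return metadata unchanged if every value is a scalar or a list of scalars."""
    for key, value in metadata.items():
        if not isinstance(key, str):
            raise ValueError(f"metadata key {key!r} must be a string")
        if isinstance(value, _SCALARS):
            continue
        if isinstance(value, list) and all(isinstance(v, _SCALARS) for v in value):
            continue
        # Nested dicts (and any other type) are rejected.
        raise ValueError(
            f"metadata value for {key!r} must be a scalar or a list of scalars"
        )
    return metadata
```

Under these assumed rules, a payload such as {"retry_count": 3, "scores": [0.1, 0.2]} passes, while {"nested": {"a": 1}} raises ValueError.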

Tests

  • uv run pytest -q tests/sdk/conversation/test_base_span_management.py tests/sdk/conversation/test_tags.py tests/sdk/observability/test_laminar.py
  • uv run ruff check openhands-sdk/openhands/sdk/conversation/base.py openhands-sdk/openhands/sdk/conversation/conversation.py openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py openhands-sdk/openhands/sdk/conversation/impl/remote_conversation.py openhands-sdk/openhands/sdk/conversation/request.py openhands-sdk/openhands/sdk/conversation/types.py openhands-sdk/openhands/sdk/observability/laminar.py openhands-sdk/openhands/sdk/settings/model.py openhands-agent-server/openhands/agent_server/event_service.py tests/sdk/conversation/test_base_span_management.py tests/sdk/conversation/test_tags.py tests/sdk/observability/test_laminar.py

This PR was created by an AI agent (OpenHands) on behalf of the user.


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant | Architectures | Base Image | Docs / Tags
java | amd64, arm64 | eclipse-temurin:17-jdk | Link
python | amd64, arm64 | nikolaik/python-nodejs:python3.13-nodejs22-slim | Link
golang | amd64, arm64 | golang:1.21-bookworm | Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:725d8f5-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-725d8f5-python \
  ghcr.io/openhands/agent-server:725d8f5-python

All tags pushed for this build

ghcr.io/openhands/agent-server:725d8f5-golang-amd64
ghcr.io/openhands/agent-server:725d8f5bfab6e37d54c3f7edd47cd33fde8609d7-golang-amd64
ghcr.io/openhands/agent-server:openhands-laminar-conversation-metadata-golang-amd64
ghcr.io/openhands/agent-server:725d8f5-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:725d8f5-golang-arm64
ghcr.io/openhands/agent-server:725d8f5bfab6e37d54c3f7edd47cd33fde8609d7-golang-arm64
ghcr.io/openhands/agent-server:openhands-laminar-conversation-metadata-golang-arm64
ghcr.io/openhands/agent-server:725d8f5-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:725d8f5-java-amd64
ghcr.io/openhands/agent-server:725d8f5bfab6e37d54c3f7edd47cd33fde8609d7-java-amd64
ghcr.io/openhands/agent-server:openhands-laminar-conversation-metadata-java-amd64
ghcr.io/openhands/agent-server:725d8f5-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:725d8f5-java-arm64
ghcr.io/openhands/agent-server:725d8f5bfab6e37d54c3f7edd47cd33fde8609d7-java-arm64
ghcr.io/openhands/agent-server:openhands-laminar-conversation-metadata-java-arm64
ghcr.io/openhands/agent-server:725d8f5-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:725d8f5-python-amd64
ghcr.io/openhands/agent-server:725d8f5bfab6e37d54c3f7edd47cd33fde8609d7-python-amd64
ghcr.io/openhands/agent-server:openhands-laminar-conversation-metadata-python-amd64
ghcr.io/openhands/agent-server:725d8f5-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:725d8f5-python-arm64
ghcr.io/openhands/agent-server:725d8f5bfab6e37d54c3f7edd47cd33fde8609d7-python-arm64
ghcr.io/openhands/agent-server:openhands-laminar-conversation-metadata-python-arm64
ghcr.io/openhands/agent-server:725d8f5-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:725d8f5-golang
ghcr.io/openhands/agent-server:725d8f5bfab6e37d54c3f7edd47cd33fde8609d7-golang
ghcr.io/openhands/agent-server:openhands-laminar-conversation-metadata-golang
ghcr.io/openhands/agent-server:725d8f5-golang_tag_1.21-bookworm
ghcr.io/openhands/agent-server:725d8f5-java
ghcr.io/openhands/agent-server:725d8f5bfab6e37d54c3f7edd47cd33fde8609d7-java
ghcr.io/openhands/agent-server:openhands-laminar-conversation-metadata-java
ghcr.io/openhands/agent-server:725d8f5-eclipse-temurin_tag_17-jdk
ghcr.io/openhands/agent-server:725d8f5-python
ghcr.io/openhands/agent-server:725d8f5bfab6e37d54c3f7edd47cd33fde8609d7-python
ghcr.io/openhands/agent-server:openhands-laminar-conversation-metadata-python
ghcr.io/openhands/agent-server:725d8f5-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim

About Multi-Architecture Support

  • Each variant tag (e.g., 725d8f5-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 725d8f5-python-amd64) are also available if needed

Issue

Related to OpenHands/OpenHands#14457

Related issue: #3344

Co-authored-by: openhands <openhands@all-hands.dev>

github-actions (Bot, Contributor) commented May 15, 2026

Python API breakage checks — ✅ PASSED

Result: PASSED

github-actions (Bot, Contributor) commented May 15, 2026

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: PASSED

github-actions (Bot, Contributor) commented May 15, 2026

Coverage

Coverage Report

File | Stmts | Miss | Cover | Missing
openhands-agent-server/openhands/agent_server
   api.py | 252 | 21 | 91% | 100, 102–107, 109, 111, 113, 147, 159, 174, 180, 453, 456, 460–462, 464, 470
   event_service.py | 467 | 97 | 79% | 87–88, 118, 121–122, 126–127, 134, 138, 144, 154–158, 161–164, 223, 244–245, 316, 367, 377, 401–402, 406, 414, 417, 475, 477, 481–483, 487, 496–497, 499, 503, 509, 511, 558, 588, 591, 642, 646, 840, 842–843, 847, 861–863, 865, 886, 891–894, 898–901, 909–912, 918–921, 967–968, 970–977, 979–980, 989–990, 992–993, 1000–1001, 1003–1004, 1024, 1030, 1036, 1045–1046
openhands-sdk/openhands/sdk/conversation
   base.py | 95 | 2 | 97% | 209, 263
   conversation.py | 34 | 8 | 76% | 150, 163–164, 170–173, 177
   request.py | 72 | 8 | 88% | 65, 232, 235, 237–238, 241–242, 253
   types.py | 65 | 1 | 98% | 100
openhands-sdk/openhands/sdk/conversation/impl
   local_conversation.py | 531 | 44 | 91% | 319, 324, 468, 514, 551, 567, 632, 857–858, 861, 974, 985–988, 995–996, 999, 1005–1006, 1009, 1015, 1030, 1033, 1037–1038, 1042–1044, 1051, 1137, 1142, 1252, 1254, 1258–1259, 1270–1271, 1296, 1491, 1495, 1565, 1572–1573
   remote_conversation.py | 657 | 77 | 88% | 144, 171, 184, 186–189, 199, 221–222, 227–230, 314, 324–326, 332, 406, 553–556, 558, 584–588, 593–596, 599, 615, 774–775, 779–780, 794, 840, 851–852, 872–875, 877–878, 909, 919, 923, 932–933, 972, 1103, 1175–1176, 1180, 1185–1189, 1195–1201, 1214, 1219, 1254, 1467–1468
openhands-sdk/openhands/sdk/observability
   laminar.py | 198 | 48 | 75% | 37–41, 103, 170–171, 184–185, 211–212, 280–287, 303–304, 311–313, 332–333, 336–338, 342, 344–346, 361, 365, 417, 431, 475, 497–500, 533–535, 537–538
openhands-sdk/openhands/sdk/settings
   model.py | 569 | 51 | 91% | 87, 112, 117, 356, 366–369, 372, 385, 389, 395, 405, 411, 416, 616, 629, 640, 650, 654, 656, 658, 660, 662, 664, 666, 668, 670, 945, 947, 1060, 1244, 1312, 1351, 1378, 1414–1417, 1443, 1612, 1644, 1654, 1656, 1661, 1679, 1692, 1694, 1696, 1698, 1705
TOTAL | 28450 | 6522 | 77% |

neubig added 2 commits May 24, 2026 16:34
…versation-metadata

# Conflicts:
#	openhands-agent-server/openhands/agent_server/event_service.py
#	openhands-sdk/openhands/sdk/conversation/base.py
#	openhands-sdk/openhands/sdk/conversation/conversation.py
#	openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py
#	openhands-sdk/openhands/sdk/conversation/impl/remote_conversation.py
#	openhands-sdk/openhands/sdk/conversation/request.py
#	openhands-sdk/openhands/sdk/observability/laminar.py
#	tests/sdk/conversation/test_base_span_management.py
neubig added the review-this label May 25, 2026

all-hands-bot (Collaborator) left a comment

Taste Rating: 🟡 Acceptable. The core data flow looks reasonable, but I found two validation issues worth fixing before relying on this new public API surface.

Risk: 🟡 Medium — new request/settings fields can mishandle malformed metadata input.

This review was generated by an AI agent (OpenHands) on behalf of the user.


Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26411135703

Comment thread openhands-sdk/openhands/sdk/conversation/types.py
Comment thread openhands-sdk/openhands/sdk/settings/model.py Outdated
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>

neubig (Member, Author) commented May 27, 2026

Addressed the open review feedback in dd37fff and resolved the review threads. I also reran local settings/tag validation: uv run pytest -q tests/sdk/conversation/test_tags.py tests/sdk/test_settings.py.

This comment was created by an AI agent (OpenHands) on behalf of the user.

neubig requested a review from all-hands-bot May 27, 2026 04:36

all-hands-bot (Collaborator) commented May 27, 2026

Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

all-hands-bot (Collaborator) left a comment

Taste Rating: 🟢 Looks Good

The implementation is clean and well-structured. The validated ConversationObservabilityMetadata type, the consistent threading through local/remote/server paths, and the targeted test suite all reflect solid design. Prior review issues are already addressed in this HEAD.

Two observations below — one on validation consistency and one on test coverage for the remote path.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

Comment thread openhands-sdk/openhands/sdk/conversation/request.py Outdated

all-hands-bot (Collaborator) left a comment

⚠️ QA Report: PASS WITH ISSUES

Valid observability metadata now flows through the SDK/start-request paths and into Laminar span calls, but invalid metadata sent to the real agent-server API returns HTTP 500 instead of a validation response.

Does this PR achieve its stated goal?

Partially. I verified that the new conversation-level metadata/tags are absent on main, present on this PR in StartConversationRequest and ConversationSettings, and passed to Laminar when constructing a real local Conversation. I also verified a real POST /api/conversations accepts a valid metadata/tags payload. However, the new validation path currently turns invalid API input into an internal server error, so the “validated start requests” part is not fully user-safe yet.

Phase | Result
Environment Setup | make build completed successfully; no tests/linters were run.
CI Status | ⏳ Observed 20 successful, 2 skipped, 8 pending, 0 failing checks.
Functional Verification | ⚠️ Valid SDK/API paths work; invalid HTTP validation returns 500.

Functional Verification

Test 1: SDK start request/settings/local conversation behavior before and after

Step 1 — Establish baseline on main:
Ran git switch main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_observability_probe.py with a script that constructs StartConversationRequest, ConversationSettings.create_request(...), and a local Conversation(...) using observability metadata/tags:

[
  {"label":"start_request","ok":true,"value":{"metadata_in_dump":null,"tags_in_dump":null,"invalid_nested_rejected":false}},
  {"label":"conversation_settings","ok":true,"value":{"settings_metadata":null,"settings_tags":null,"request_metadata":null,"request_tags":null}},
  {"label":"local_conversation_laminar_calls","ok":false,"error":"TypeError: Conversation.__new__() got an unexpected keyword argument 'observability_metadata'"}
]

This confirms the baseline did not expose or preserve the new metadata/tags inputs.

Step 2 — Apply the PR's changes:
Checked out openhands/laminar-conversation-metadata at dd37fffb334201c7a33493910b1992eb47298fa6.

Step 3 — Re-run with the fix in place:
Ran the same probe:

[
  {"label":"start_request","ok":true,"value":{"metadata_in_dump":{"repo_name":"OpenHands/software-agent-sdk","private":true,"retry_count":3,"scores":[0.1,0.2]},"tags_in_dump":["repo:OpenHands/software-agent-sdk","qa"],"invalid_nested_rejected":true}},
  {"label":"conversation_settings","ok":true,"value":{"settings_metadata":{"repo":"OpenHands/software-agent-sdk"},"settings_tags":["repo:OpenHands/software-agent-sdk"],"request_metadata":{"repo":"OpenHands/software-agent-sdk"},"request_tags":["repo:OpenHands/software-agent-sdk"]}},
  {"label":"local_conversation_laminar_calls","ok":true,"value":["set_trace_session_id","set_trace_user_id","set_trace_metadata","set_span_tags"]}
]

This shows the PR preserves valid metadata/tags in request models/settings and invokes Laminar trace metadata/span tag methods during real local conversation construction.

Test 2: Real agent-server HTTP start requests

Step 1 — Start the software:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python -m openhands.agent_server --port 18080 and waited for GET /server_info to return 200.

Step 2 — Create a conversation with valid metadata/tags:
Ran curl -H 'Content-Type: application/json' --data @/tmp/qa_valid_start.json http://127.0.0.1:18080/api/conversations:

{"id":"8bc0189f-6696-4d29-ae29-9afa4a11dc28","workspace":{"working_dir":"/tmp/qa-http-workspace","kind":"LocalWorkspace"},"execution_status":"idle"}
HTTP_STATUS:201

This confirms a real user can create an agent-server conversation with valid observability metadata/tags.

Step 3 — Send invalid metadata to exercise validation:
Ran curl -H 'Content-Type: application/json' --data @/tmp/qa_invalid_start.json http://127.0.0.1:18080/api/conversations, where observability_metadata contained a nested object:

{"detail":"Internal Server Error","exception":"Object of type ValueError is not JSON serializable"}
HTTP_STATUS:500

The server log shows FastAPI raised RequestValidationError, then the validation handler failed serializing ctx.error: ValueError(...). This confirms invalid observability metadata is detected, but the user-facing API response is incorrectly a 500.
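
The failure described above comes from an exception object ending up inside the error payload. One common way to avoid that is to reduce non-serializable values to strings before building the response body. The sketch below uses only the standard library; make_json_safe is a hypothetical helper, and the error entry shape (a ctx dict holding an error exception) is modeled on what the server log describes, not copied from the agent-server's handler.

```python
import json
from typing import Any


def make_json_safe(value: Any) -> Any:
    """Recursively convert a value into something json.dumps can encode."""
    if isinstance(value, dict):
        return {str(k): make_json_safe(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [make_json_safe(v) for v in value]
    if isinstance(value, (str, int, float, bool)) or value is None:
        return value
    # Exceptions and other arbitrary objects are reduced to their string form.
    return str(value)


# Assumed shape of one validation error entry, with the exception kept in ctx.
errors = [
    {
        "type": "value_error",
        "loc": ["body", "observability_metadata"],
        "msg": "Value error, nested objects are not allowed",
        "ctx": {"error": ValueError("nested objects are not allowed")},
    }
]

# json.dumps(errors) would raise TypeError; the sanitized copy encodes cleanly.
body = json.dumps({"detail": make_json_safe(errors)})
```

A handler built along these lines would return the sanitized body with a 4xx status (FastAPI's default for request validation errors is 422) rather than letting serialization fail into a 500.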

Issues Found

  • 🟠 Issue: Invalid observability_metadata in POST /api/conversations returns HTTP 500 instead of a 4xx validation response.

This review was created by an AI agent (OpenHands) on behalf of the user.

Comment thread openhands-sdk/openhands/sdk/conversation/types.py
Co-authored-by: openhands <openhands@all-hands.dev>

all-hands-bot (Collaborator) left a comment

Taste Rating: 🟡 Acceptable. The data flow is mostly sound, but I found two observability issues worth fixing before merge.

Risk: 🟡 Medium — new root-span metadata plumbing can produce incomplete or misleading traces if these edge cases hit.

This review was generated by an AI agent (OpenHands) on behalf of the user.


Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26491056754

# Include tags and observability metadata if provided
"tags": tags or {},
"observability_metadata": observability_metadata or {},
"observability_tags": observability_tags or [],

Review comment (Collaborator):

🟠 Important: The server-side LocalConversation now consumes these observability fields from StoredConversation, but the remote create payload still omits the existing user_id field. That means a remote SDK caller gets user_id on the client proxy root span only, while the server root span covering agent execution loses it. Please include "user_id": user_id in this payload and add a payload propagation assertion alongside the metadata/tags coverage.
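
A minimal sketch of the requested change, assuming the payload is assembled from plain keyword arguments. The function build_create_payload is hypothetical and only covers the fields discussed in this thread; the real remote create payload contains other fields that are omitted here.

```python
from __future__ import annotations

from typing import Any


def build_create_payload(
    tags: dict[str, str] | None = None,
    observability_metadata: dict[str, Any] | None = None,
    observability_tags: list[str] | None = None,
    user_id: str | None = None,
) -> dict[str, Any]:
    """Assemble only the fields discussed here; other payload fields are omitted."""
    return {
        "tags": tags or {},
        "observability_metadata": observability_metadata or {},
        "observability_tags": observability_tags or [],
        # Forward user_id so the server-side root span can carry it too.
        "user_id": user_id,
    }


payload = build_create_payload(user_id="user-123", observability_tags=["qa"])
```

A propagation test could then assert on payload["user_id"] next to the existing metadata and tag assertions.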

if user_id:
    Laminar.set_trace_user_id(user_id)
if metadata:
    Laminar.set_trace_metadata(metadata)

Review comment (Collaborator):

🟠 Important: These new metadata/tag setters run inside Laminar.use_span() with its default record_exception=True / set_status_on_exception=True. If a best-effort setter raises (for example due to Laminar API or serialization incompatibility), contextlib.suppress hides the failure from the SDK caller but Laminar still records the exception and marks the conversation root span as errored. Please use Laminar.use_span(self.span, record_exception=False, set_status_on_exception=False) for this setup block so observability setup failures don't corrupt trace status.

Labels

review-this: This label triggers a PR review by OpenHands

3 participants