Add conversation observability metadata#3270
Conversation
Co-authored-by: openhands <openhands@all-hands.dev>
Python API breakage checks — ✅ PASSEDResult: ✅ PASSED |
REST API breakage checks (OpenAPI) — ✅ PASSEDResult: ✅ PASSED |
Coverage Report •
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
…versation-metadata # Conflicts: # openhands-agent-server/openhands/agent_server/event_service.py # openhands-sdk/openhands/sdk/conversation/base.py # openhands-sdk/openhands/sdk/conversation/conversation.py # openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py # openhands-sdk/openhands/sdk/conversation/impl/remote_conversation.py # openhands-sdk/openhands/sdk/conversation/request.py # openhands-sdk/openhands/sdk/observability/laminar.py # tests/sdk/conversation/test_base_span_management.py
all-hands-bot
left a comment
There was a problem hiding this comment.
Taste Rating: 🟡 Acceptable. The core data flow looks reasonable, but I found two validation issues worth fixing before relying on this new public API surface.
Risk: 🟡 Medium — new request/settings fields can mishandle malformed metadata input.
This review was generated by an AI agent (OpenHands) on behalf of the user.
Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26411135703
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
|
Addressed the open review feedback in dd37fff and resolved the review threads. I also reran local settings/tag validation: |
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
all-hands-bot
left a comment
There was a problem hiding this comment.
Taste Rating: 🟢 Looks Good
The implementation is clean and well-structured. The validated ConversationObservabilityMetadata type, the consistent threading through local/remote/server paths, and the targeted test suite all reflect solid design. Prior review issues are already addressed in this HEAD.
Two observations below — one on validation consistency and one on test coverage for the remote path.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
all-hands-bot
left a comment
There was a problem hiding this comment.
⚠️ QA Report: PASS WITH ISSUES
Valid observability metadata now flows through the SDK/start-request paths and into Laminar span calls, but invalid metadata sent to the real agent-server API returns HTTP 500 instead of a validation response.
Does this PR achieve its stated goal?
Partially. I verified that the new conversation-level metadata/tags are absent on main, present on this PR in StartConversationRequest and ConversationSettings, and passed to Laminar when constructing a real local Conversation. I also verified a real POST /api/conversations accepts a valid metadata/tags payload. However, the new validation path currently turns invalid API input into an internal server error, so the “validated start requests” part is not fully user-safe yet.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully; no tests/linters were run. |
| CI Status | ⏳ Observed 20 successful, 2 skipped, 8 pending, 0 failing checks. |
| Functional Verification |
Functional Verification
Test 1: SDK start request/settings/local conversation behavior before and after
Step 1 — Establish baseline on main:
Ran git switch main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_observability_probe.py with a script that constructs StartConversationRequest, ConversationSettings.create_request(...), and a local Conversation(...) using observability metadata/tags:
[
{"label":"start_request","ok":true,"value":{"metadata_in_dump":null,"tags_in_dump":null,"invalid_nested_rejected":false}},
{"label":"conversation_settings","ok":true,"value":{"settings_metadata":null,"settings_tags":null,"request_metadata":null,"request_tags":null}},
{"label":"local_conversation_laminar_calls","ok":false,"error":"TypeError: Conversation.__new__() got an unexpected keyword argument 'observability_metadata'"}
]This confirms the baseline did not expose or preserve the new metadata/tags inputs.
Step 2 — Apply the PR's changes:
Checked out openhands/laminar-conversation-metadata at dd37fffb334201c7a33493910b1992eb47298fa6.
Step 3 — Re-run with the fix in place:
Ran the same probe:
[
{"label":"start_request","ok":true,"value":{"metadata_in_dump":{"repo_name":"OpenHands/software-agent-sdk","private":true,"retry_count":3,"scores":[0.1,0.2]},"tags_in_dump":["repo:OpenHands/software-agent-sdk","qa"],"invalid_nested_rejected":true}},
{"label":"conversation_settings","ok":true,"value":{"settings_metadata":{"repo":"OpenHands/software-agent-sdk"},"settings_tags":["repo:OpenHands/software-agent-sdk"],"request_metadata":{"repo":"OpenHands/software-agent-sdk"},"request_tags":["repo:OpenHands/software-agent-sdk"]}},
{"label":"local_conversation_laminar_calls","ok":true,"value":["set_trace_session_id","set_trace_user_id","set_trace_metadata","set_span_tags"]}
]This shows the PR preserves valid metadata/tags in request models/settings and invokes Laminar trace metadata/span tag methods during real local conversation construction.
Test 2: Real agent-server HTTP start requests
Step 1 — Start the software:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python -m openhands.agent_server --port 18080 and waited for GET /server_info to return 200.
Step 2 — Create a conversation with valid metadata/tags:
Ran curl -H 'Content-Type: application/json' --data @/tmp/qa_valid_start.json http://127.0.0.1:18080/api/conversations:
{"id":"8bc0189f-6696-4d29-ae29-9afa4a11dc28","workspace":{"working_dir":"/tmp/qa-http-workspace","kind":"LocalWorkspace"},"execution_status":"idle"}
HTTP_STATUS:201This confirms a real user can create an agent-server conversation with valid observability metadata/tags.
Step 3 — Send invalid metadata to exercise validation:
Ran curl -H 'Content-Type: application/json' --data @/tmp/qa_invalid_start.json http://127.0.0.1:18080/api/conversations, where observability_metadata contained a nested object:
{"detail":"Internal Server Error","exception":"Object of type ValueError is not JSON serializable"}
HTTP_STATUS:500The server log shows FastAPI raised RequestValidationError, then the validation handler failed serializing ctx.error: ValueError(...). This confirms invalid observability metadata is detected, but the user-facing API response is incorrectly a 500.
Issues Found
- 🟠 Issue: Invalid
observability_metadatainPOST /api/conversationsreturns HTTP 500 instead of a 4xx validation response.
This review was created by an AI agent (OpenHands) on behalf of the user.
Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
There was a problem hiding this comment.
Taste Rating: 🟡 Acceptable. The data flow is mostly sound, but I found two observability issues worth fixing before merge.
Risk: 🟡 Medium — new root-span metadata plumbing can produce incomplete or misleading traces if these edge cases hit.
This review was generated by an AI agent (OpenHands) on behalf of the user.
Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26491056754
| # Include tags and observability metadata if provided | ||
| "tags": tags or {}, | ||
| "observability_metadata": observability_metadata or {}, | ||
| "observability_tags": observability_tags or [], |
There was a problem hiding this comment.
🟠 Important: The server-side LocalConversation now consumes these observability fields from StoredConversation, but the remote create payload still omits the existing user_id field. That means a remote SDK caller gets user_id on the client proxy root span only, while the server root span covering agent execution loses it. Please include "user_id": user_id in this payload and add a payload propagation assertion alongside the metadata/tags coverage.
| if user_id: | ||
| Laminar.set_trace_user_id(user_id) | ||
| if metadata: | ||
| Laminar.set_trace_metadata(metadata) |
There was a problem hiding this comment.
🟠 Important: These new metadata/tag setters run inside Laminar.use_span() with its default record_exception=True / set_status_on_exception=True. If a best-effort setter raises (for example due to Laminar API or serialization incompatibility), contextlib.suppress hides the failure from the SDK caller but Laminar still records the exception and marks the conversation root span as errored. Please use Laminar.use_span(self.span, record_exception=False, set_status_on_exception=False) for this setup block so observability setup failures don't corrupt trace status.
Summary
observability_metadataandobservability_tagsfields to start requestsTests
uv run pytest -q tests/sdk/conversation/test_base_span_management.py tests/sdk/conversation/test_tags.py tests/sdk/observability/test_laminar.pyuv run ruff check openhands-sdk/openhands/sdk/conversation/base.py openhands-sdk/openhands/sdk/conversation/conversation.py openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py openhands-sdk/openhands/sdk/conversation/impl/remote_conversation.py openhands-sdk/openhands/sdk/conversation/request.py openhands-sdk/openhands/sdk/conversation/types.py openhands-sdk/openhands/sdk/observability/laminar.py openhands-sdk/openhands/sdk/settings/model.py openhands-agent-server/openhands/agent_server/event_service.py tests/sdk/conversation/test_base_span_management.py tests/sdk/conversation/test_tags.py tests/sdk/observability/test_laminar.pyThis PR was created by an AI agent (OpenHands) on behalf of the user.
@neubig can click here to continue refining the PR
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.13-nodejs22-slimgolang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:725d8f5-pythonRun
All tags pushed for this build
About Multi-Architecture Support
725d8f5-python) is a multi-arch manifest supporting both amd64 and arm64725d8f5-python-amd64) are also available if neededIssue
Related to OpenHands/OpenHands#14457
Related issue: #3344