Add Python agent run trace #87
Merged
Add structured tracing for custom Python agents so their execution surfaces
on the Narada observability dashboard alongside GUI-built custom agents.
narada-core:
- New PythonAgentRunTrace step type + PythonTraceEvent discriminated union
covering stdout, stderr, sub-agent calls, extension actions, and side
effects. Added to the ApaStepTrace union; parse_action_trace handles it
transparently.
narada-pyodide:
- New private _trace.py module with bounded-size summarisation of
extension action requests/responses and per-event emitters
(emit_sub_agent_call, emit_extension_action, emit_side_effect).
- Instrument dispatch_request() to emit one subAgentCall event per
invocation, covering success/error/timeout paths.
- Instrument _run_extension_action() to emit one extensionAction event
per call, with action_name keyed off the request discriminator.
- Instrument download_file / render_html in utils.py to emit sideEffect
events.
- 38 unit tests exercise summarisation, truncation, emitter shapes, and
Pydantic round-trip via parse_action_trace.
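The bounded-size summarisation listed above could look roughly like the following. This is a minimal sketch, not the real `_trace.py`: `MAX_STRING_CHARS`, `truncate_text`, and `summarize_fields` are hypothetical names, and the cap value is an assumption since the commit does not state one.

```python
from typing import Any

# Hypothetical cap; the actual limit used by _trace.py is not stated in this PR.
MAX_STRING_CHARS = 200


def truncate_text(value: str, limit: int = MAX_STRING_CHARS) -> str:
    """Cap a free-form string and note how many characters were dropped."""
    if len(value) <= limit:
        return value
    dropped = len(value) - limit
    return f"{value[:limit]}... [{dropped} chars truncated]"


def summarize_fields(request: dict[str, Any], keep: tuple[str, ...]) -> dict[str, Any]:
    """Keep an allow-listed subset of request fields, truncating string values."""
    summary: dict[str, Any] = {}
    for key in keep:
        if key not in request:
            continue
        value = request[key]
        summary[key] = truncate_text(value) if isinstance(value, str) else value
    return summary
```

An allow-list keeps large payload fields (page HTML, screenshots) out of the persisted trace by default, so a newly added field cannot inflate it by accident.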
Version bumps (coupled to avoid parse_action_trace ValidationError for
external narada users whose traces may contain pythonAgentRun nodes):
- narada-core: 0.0.17 -> 0.0.18
- narada-pyodide: 0.0.43 -> 0.0.44
- narada: 0.1.42 -> 0.1.43 (repin narada-core==0.0.18 only)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses the four ship-blocker findings from the cross-dimensional review:
Robustness — trace emission must not break user code (_trace.py):
- `emit_trace_event` now wraps the serialise + forward in try/except and
logs the failure instead of propagating it. Previously a stray non-
serialisable value in a summary (a datetime, a Pydantic model leak)
would raise TypeError out of `_run_extension_action` and abort the
user's agent mid-run.
- `json.dumps(event, default=str)` stringifies unknown types defensively.
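The defensive behaviour described in these two bullets can be sketched as below. This is an illustration only: the `forward` parameter is a hypothetical stand-in for whatever callable actually delivers the JSON string, and the logger name is an assumption.

```python
import json
import logging
from datetime import datetime, timezone
from typing import Any, Callable

logger = logging.getLogger("trace_sketch")  # hypothetical logger name


def emit_trace_event(event: dict[str, Any], forward: Callable[[str], None]) -> bool:
    """Serialise and forward an event without ever raising into user code.

    Returns True when the event was forwarded, False when it was dropped.
    """
    try:
        # default=str stringifies values json cannot encode (datetime, etc.).
        payload = json.dumps(event, default=str)
        forward(payload)
        return True
    except Exception:
        # Log and swallow: tracing is best-effort and must not abort the agent.
        logger.exception("trace emission failed; continuing without this event")
        return False
```

Usage: `emit_trace_event({"kind": "stdout", "ts": datetime.now(timezone.utc)}, print)` prints a JSON string with the timestamp rendered via `str()` instead of raising `TypeError`.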
Scalability — bound recursive trace size (_trace.py):
- `emit_sub_agent_call` now strips the `events` list from any nested
`pythonAgentRun` node in the forwarded action trace, replacing it with
a `truncated_event_count` marker. Previously a custom Python agent
that delegated to another custom Python agent embedded the sub-run's
full event timeline in the parent's persisted JSON, producing
O(breadth^depth) growth.
Robustness — code-quality cleanup (window.py):
- Collapsed the duplicated `except asyncio.TimeoutError` / `except
NaradaAgentTimeoutError_INTERNAL_DO_NOT_USE` blocks in
`dispatch_request` into a single `except (A, B):` branch. Removes
~12 lines and the divergence risk.
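Collapsing two identical handlers into one tuple-based `except` looks roughly like this. A sketch under assumptions: `AgentTimeoutError` is a hypothetical stand-in for the SDK's internal timeout exception, and `classify_call` returns a status string instead of re-raising, which the real `dispatch_request` may not do.

```python
import asyncio
from typing import Any, Awaitable, Callable


class AgentTimeoutError(Exception):
    """Hypothetical stand-in for the SDK's internal timeout exception."""


async def classify_call(make_call: Callable[[], Awaitable[Any]], timeout_s: float) -> str:
    """Run a call and report which of the three traced paths it took."""
    try:
        await asyncio.wait_for(make_call(), timeout=timeout_s)
        return "success"
    except (asyncio.TimeoutError, AgentTimeoutError):
        # One branch handles both timeout flavours, so they cannot diverge.
        return "timeout"
    except Exception:
        return "error"
```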
Robustness — side-effect tracing on failure (utils.py):
- `download_file` and `render_html` now emit a "failed" side-effect
trace when the underlying JS call raises, then re-raise. Previously
a failed download produced no trace at all — users saw silence
rather than the actual error.
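The emit-then-re-raise pattern for side effects can be sketched as follows. The helper name `traced_side_effect`, the `emit` callback, and the `"status": "failed"` field are assumptions for illustration; the commit does not show the exact shape of the failure event.

```python
import time
from typing import Any, Callable


def traced_side_effect(
    effect_type: str,
    description: str,
    action: Callable[[], Any],
    emit: Callable[[dict[str, Any]], None],
) -> Any:
    """Run a side-effecting action, tracing both success and failure."""
    event: dict[str, Any] = {"kind": "sideEffect", "effect_type": effect_type}
    try:
        result = action()
    except Exception as exc:
        # Record the failure first, then let the original error propagate.
        event.update(
            ts=int(time.time() * 1000),
            status="failed",  # assumed field name
            description=f"{description} failed: {exc}",
        )
        emit(event)
        raise
    event.update(ts=int(time.time() * 1000), description=description)
    emit(event)
    return result
```

The bare `raise` preserves the original exception and traceback, so user code sees the same error it would have seen without tracing.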
Type safety — schema invariants (narada-core/actions/models.py):
- `PythonAgentRunTrace.duration_ms` and `truncated_event_count` now
use `NonNegativeInt` — Pydantic rejects negative values at parse
time rather than letting `-42ms` reach the dashboard formatter.
- New `@model_validator` on `PythonSubAgentCallEvent` and
`PythonExtensionActionEvent` rejects `ts_end < ts_start`; clock
skew on the Pyodide clock can no longer produce negative-duration
events that the renderer would display as `-5ms`.
- `parse_action_trace` now dispatches deterministically based on the
first item's discriminator (`step_type` vs `action`+`url`) rather
than try/except-falling-through two adapters. Eliminates the risk
of silently misrouting a homogeneity-violated trace.
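The two invariants above (no negative-duration events, and dispatch on the first item's discriminator) can be illustrated with the standard library alone. This is an analogue, not the Pydantic code in `models.py`: `TimedEvent`, `pick_trace_kind`, and the returned labels are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class TimedEvent:
    """Stdlib analogue of an event with a start and end timestamp (ms)."""

    ts_start: int
    ts_end: int

    def __post_init__(self) -> None:
        # Reject at construction time rather than at render time.
        if self.ts_end < self.ts_start:
            raise ValueError("ts_end must not be earlier than ts_start")

    @property
    def duration_ms(self) -> int:
        return self.ts_end - self.ts_start


def pick_trace_kind(items: list[dict[str, Any]]) -> str:
    """Choose a parser from the first item's keys instead of try/except fallthrough."""
    if not items:
        return "empty"
    first = items[0]
    if "step_type" in first:
        return "step_trace"  # hypothetical label
    if "action" in first and "url" in first:
        return "action_trace"  # hypothetical label
    raise ValueError("unrecognised trace item shape")
```

Deciding once, up front, means a malformed list fails with a single clear error instead of being accepted by whichever adapter happens to tolerate it.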
Tests:
- 13 new unit tests across `TestEmitDefensive`,
`TestStripNestedPythonEvents`, `TestPythonEventInvariants`, and
`TestParseActionTraceDispatch`. Full suite is now 51 tests, all
passing under `uv run --package narada-pyodide pytest`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Resolve merge conflicts:
- pyproject.toml: keep both pytest and pytest-asyncio dev deps
- window.py: combine _get_auth_headers() refactor with trace instrumentation
- uv.lock: regenerated
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolve conflict in window.py: combine _get_auth_headers() signature change from #91 with trace instrumentation from this PR.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…-trace # Conflicts: # packages/narada-pyodide/src/narada/window.py
Removes _strip_nested_python_events.

The function dropped events from any nested pythonAgentRun node and stamped truncated_event_count on it, citing "deep recursion blowing up persisted JSON size" as the reason. In practice the policy was always-on and uniform: a 1-event nested trace got stripped just as readily as a 10K-event one. The frontend already owns size enforcement via MAX_NESTED_ACTION_TRACE_BYTES in python.worker.ts plus the workflow-run-detail consumer caps.

Two layers of stripping is strictly worse: small nested traces lose their events for no benefit, and the dashboard's CollapsibleNestedTrace can't recover them (it does not lazy-fetch by request_id).

Now: emit_sub_agent_call forwards action_trace_raw as-is. The frontend caps when actually over budget. Tests updated to assert events flow through unmodified.
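The difference between always-on stripping and capping only when over budget can be shown in a few lines. This is a Python illustration of the policy only: the actual enforcement lives in the frontend's TypeScript and is not shown in this PR, and `cap_nested_trace` and its `max_bytes` parameter are hypothetical.

```python
import json
from typing import Any


def cap_nested_trace(node: dict[str, Any], max_bytes: int) -> dict[str, Any]:
    """Drop a nested run's events only when its serialised size exceeds the budget."""
    encoded = json.dumps(node).encode("utf-8")
    if len(encoded) <= max_bytes:
        return node  # small traces pass through with their events intact
    capped = {key: value for key, value in node.items() if key != "events"}
    capped["events"] = []
    capped["truncated_event_count"] = len(node.get("events", []))
    return capped
```

With a budget check, a 1-event nested trace keeps its event, while a very large one is reduced to a count.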
xTRam1
commented
Apr 30, 2026
    return {}

def summarize_response(
Since we are removing the hard caps, we probably don't need these right?
xTRam1
commented
Apr 30, 2026
    description: str

# ---------------------------------------------------------------------------
Move these to a model.py file inside a tracing folder.
…-trace # Conflicts: # packages/narada-core/src/narada_core/actions/models.py
zizhengtai
approved these changes
Apr 30, 2026
Summary
Adds structured tracing for custom Python agents. The SDK emits events for every stdout/stderr line, sub-agent call, extension action, and side effect; the frontend (NaradaAI/frontend#1592) consumes these events and renders them on the observability dashboard.

What changes
`narada-core`: new Pydantic models
- `PythonAgentRunTrace` (`step_type: pythonAgentRun`) wrapping a run's full execution
- `PythonTraceEvent` discriminated union: `stdout`, `stderr`, `subAgentCall`, `extensionAction`, `sideEffect`
- Added to `ApaStepTrace`; `parse_action_trace` round-trips the new shape

`narada-pyodide`: chokepoint-based instrumentation, so adding a new SDK method needs no trace-layer changes
- `window.py`: `dispatch_request()` wrapped → every `window.agent(...)` emits a `subAgentCall` event (covers success / error / timeout)
- `window.py`: `_run_extension_action()` wrapped → every extension action emits an `extensionAction` event
- `utils.download_file` / `utils.render_html` → `sideEffect` events
- `_narada_emit_trace_event(json_str)`, a Pyodide global the frontend registers

Payload safety: per-action summaries in `_trace.summarize_request` bound the persisted JSON size (hard caps on free-form strings; structured subsets for `write_google_sheet`, screenshot, `agentic_selector`, etc.).

Version bumps (coupled)
- `narada-core`: 0.0.17 → 0.0.18
- `narada-pyodide`: 0.0.43 → 0.0.45a2
- `narada`: 0.1.42 → 0.1.43 (repin `narada-core==0.0.18` only, no code change)

Why bump `narada` too: if `narada-core@0.0.18` (new variant) shipped without a matching `narada` bump, external Python SDK users would silently break on a `parse_action_trace` ValidationError the moment a response they polled contained a nested Python custom agent run.

Example
This custom Python agent:
emits this trace (rendered by the frontend PR):
{
  "step_type": "pythonAgentRun",
  "status": "error",
  "duration_ms": 16028,
  "events": [
    { "kind": "extensionAction", "ts_start": 1777063298934, "ts_end": 1777063302993, "action_name": "go_to_url", "request_summary": { "url": "https://example.com", "new_tab": true }, "status": "success" },
    { "kind": "sideEffect", "ts": 1777063302994, "effect_type": "render_html", "description": "Rendered HTML in a new tab" },
    { "kind": "stdout", "ts": 1777063302995, "text": "[A] after render_html" },
    { "kind": "sideEffect", "ts": 1777063302995, "effect_type": "download_file", "description": "Downloaded file: ordering-test.txt" },
    { "kind": "stdout", "ts": 1777063302996, "text": "[B] after download_file" },
    { "kind": "stderr", "ts": 1777063303004, "text": "Traceback (most recent call last):\n  File \"<exec>\", line 11, in <module>\nValueError: intentional final traceback\n" }
  ],
  "error_message": "Traceback ..."
}

Tests
38 unit tests in `packages/narada-pyodide/tests/test_trace.py`:
- `summarize_request` / `summarize_response` per action type
- Pydantic round-trip via `parse_action_trace` into a valid `PythonAgentRunTrace`

Run (the `narada` / `narada-pyodide` namespace collision requires uninstalling narada first; see `packages/narada-pyodide/tests/README.md`):

Publish steps (manual, post-merge)
Frontend PR NaradaAI/frontend#1592 pins `narada-pyodide==0.0.45a2` and micropips it at Pyodide runtime, so the new versions must be on PyPI before that PR ships.
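The chokepoint idea from "What changes" can be sketched as below: public helpers all funnel through one function, and only that function emits events, so a new helper is traced without touching the trace layer. This is an assumption-laden illustration, not the SDK's code: `run_extension_action`, the module-level `EVENTS` list, the `"name"` request key, and the synchronous handlers are all hypothetical.

```python
import time
from typing import Any, Callable

EVENTS: list[dict[str, Any]] = []  # hypothetical in-memory sink for the sketch


def _now_ms() -> int:
    return int(time.time() * 1000)


def run_extension_action(
    request: dict[str, Any], handler: Callable[[dict[str, Any]], Any]
) -> Any:
    """Single chokepoint: emits exactly one extensionAction event per call."""
    ts_start = _now_ms()
    status = "success"
    try:
        return handler(request)
    except Exception:
        status = "error"
        raise
    finally:
        EVENTS.append(
            {
                "kind": "extensionAction",
                "ts_start": ts_start,
                # Guard against the wall clock stepping backwards mid-call.
                "ts_end": max(ts_start, _now_ms()),
                # Keyed off the request discriminator (assumed to be "name").
                "action_name": request.get("name", "unknown"),
                "status": status,
            }
        )


def go_to_url(url: str) -> None:
    """Public helper: contains no tracing code of its own."""
    run_extension_action({"name": "go_to_url", "url": url}, lambda req: None)
```

The `finally` block guarantees one event on both the success and the error path, which is the property the PR's instrumentation points aim for.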