feat: add OpenTelemetry support across all SDKs #785

Merged
SteveSandersonMS merged 3 commits into main from stoub/otel-support
Mar 13, 2026

@stephentoub
Collaborator

Summary

Adds OpenTelemetry integration to all four language SDKs (Node.js, Python, Go, .NET), enabling distributed tracing between SDK consumers and the Copilot CLI.

What's included

TelemetryConfig type (all SDKs)

New configuration object on CopilotClientOptions that maps to CLI environment variables:

  • otlpEndpoint → OTEL_EXPORTER_OTLP_ENDPOINT
  • filePath → COPILOT_OTEL_FILE_EXPORTER_PATH
  • exporterType → COPILOT_OTEL_EXPORTER_TYPE
  • sourceName → COPILOT_OTEL_SOURCE_NAME
  • captureContent → OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT

When provided, COPILOT_OTEL_ENABLED=true is set on the spawned CLI process.
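
As a rough illustration of the mapping above, here is a minimal Python sketch. The `telemetry_env` helper and the `_ENV_BY_FIELD` table are hypothetical names, not the SDK's actual code, and serializing the boolean as `"true"`/`"false"` is an assumption; only the field names and environment variable names come from this PR.

```python
from typing import Optional, TypedDict


class TelemetryConfig(TypedDict, total=False):
    """Optional telemetry settings (Python field names as listed in this PR)."""
    otlp_endpoint: str
    file_path: str
    exporter_type: str
    source_name: str
    capture_content: bool


# Config field -> CLI environment variable, as listed in the PR description.
_ENV_BY_FIELD = {
    "otlp_endpoint": "OTEL_EXPORTER_OTLP_ENDPOINT",
    "file_path": "COPILOT_OTEL_FILE_EXPORTER_PATH",
    "exporter_type": "COPILOT_OTEL_EXPORTER_TYPE",
    "source_name": "COPILOT_OTEL_SOURCE_NAME",
    "capture_content": "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT",
}


def telemetry_env(config: Optional[TelemetryConfig]) -> dict:
    """Hypothetical helper: build the extra env vars for the spawned CLI."""
    if config is None:
        return {}  # no telemetry config provided: leave the environment alone
    env = {"COPILOT_OTEL_ENABLED": "true"}  # set whenever a config is provided
    for field, var in _ENV_BY_FIELD.items():
        if field not in config:
            continue  # unset fields are simply not exported
        value = config[field]
        if isinstance(value, bool):
            # Assumption: booleans are serialized as lowercase strings.
            value = "true" if value else "false"
        env[var] = str(value)
    return env
```

A caller would typically merge the result over the parent environment when spawning the process, for example `{**os.environ, **telemetry_env(cfg)}`.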

W3C Trace Context propagation

traceparent/tracestate fields are now sent on:

  • session.create
  • session.resume
  • session.send
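
For readers unfamiliar with the header, a `traceparent` value in the W3C Trace Context format has four dash-separated lowercase-hex fields: version, trace ID, parent (span) ID, and flags. The sketch below validates the version `00` layout only; `parse_traceparent` and `TraceParent` are illustrative names and are not part of any SDK in this PR.

```python
import re
from typing import NamedTuple, Optional

# version(2 hex) - trace-id(32 hex) - parent-id(16 hex) - flags(2 hex)
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    flags: str

    @property
    def sampled(self) -> bool:
        # The lowest flag bit is the "sampled" flag.
        return bool(int(self.flags, 16) & 0x01)


def parse_traceparent(header: str) -> Optional[TraceParent]:
    """Return the parsed header, or None if it is malformed."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    parsed = TraceParent(*match.groups())
    # Version "ff" and all-zero IDs are invalid per the specification.
    if parsed.version == "ff":
        return None
    if parsed.trace_id == "0" * 32 or parsed.parent_id == "0" * 16:
        return None
    return parsed
```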

Trace context restoration in tool handlers

Both v2 RPC (tool.call) and v3 broadcast (tool.call.requested) tool call paths restore the inbound trace context before invoking user tool handlers, so tool execution is linked to the originating trace.

Telemetry helper modules

Each SDK has a new telemetry module (telemetry.ts, telemetry.py, telemetry.go, Telemetry.cs) with unit tests.

Updated generated types

Regenerated RPC and session-event types from the latest schema to include traceparent/tracestate fields.

Documentation

Added OpenTelemetry configuration docs and per-language README sections.

⚠️ Blocked: requires next CLI version

This PR depends on the Copilot CLI supporting the traceparent/tracestate fields in the RPC protocol. The SDKs will need to update to the next version of the CLI before this can move forward.

Known limitation

The Go ToolHandler type does not accept a context.Context parameter, so while trace context is restored around the handler call (for the HandlePendingToolCall RPC), it cannot be passed directly into user tool code. A comment has been added noting this; a future breaking change to the handler signature would fully resolve it.

@github-actions
Contributor

Cross-SDK Consistency Review: OpenTelemetry Support

I've reviewed this PR for cross-SDK consistency across all four language implementations (Node.js, Python, Go, .NET). Overall, the implementation is very well done with excellent feature parity! 🎉

✅ Consistent Across All SDKs

The following features are implemented consistently across all languages:

  1. TelemetryConfig type - All SDKs define equivalent configuration with the same fields (accounting for naming conventions):

    • otlpEndpoint / otlp_endpoint / OTLPEndpoint / OtlpEndpoint
    • filePath / file_path / FilePath / FilePath
    • exporterType / exporter_type / ExporterType / ExporterType
    • sourceName / source_name / SourceName / SourceName
    • captureContent / capture_content / CaptureContent / CaptureContent
  2. Environment variable mapping - All SDKs correctly map config fields to CLI environment variables:

    • COPILOT_OTEL_ENABLED=true (when telemetry config is present)
    • OTEL_EXPORTER_OTLP_ENDPOINT
    • COPILOT_OTEL_FILE_EXPORTER_PATH
    • COPILOT_OTEL_EXPORTER_TYPE
    • COPILOT_OTEL_SOURCE_NAME
    • OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT
  3. W3C Trace Context propagation - All SDKs send traceparent/tracestate on:

    • session.create
    • session.resume
    • session.send
  4. Telemetry helper modules - Each SDK has appropriate helper functions:

    • Node.js: telemetry.ts with getTraceContext() / withTraceContext()
    • Python: telemetry.py with get_trace_context() / trace_context() context manager
    • Go: telemetry.go with getTraceContext() / contextWithTraceParent()
    • .NET: Telemetry.cs with GetTraceContext() / RestoreTraceContext()
  5. Documentation - All four SDK READMEs have telemetry sections with consistent examples

  6. V3 broadcast event handling - All SDKs restore trace context before invoking tool handlers for tool.call.requested events

⚠️ Inconsistency Found: Go V2 Tool.Call Handler

There is one inconsistency in the Go SDK's v2 RPC handler (handleToolCallRequestV2):

Issue: The Go SDK receives traceparent and tracestate in the v2 tool.call request struct but does not restore the trace context before calling the tool handler.

Other SDKs for comparison:

  • Node.js (client.ts:1588-1592): Uses withTraceContext(traceparent, tracestate, () => handler(...))
  • Python (client.py:1646-1653): Uses with trace_context(tp, ts): result = handler(...)
  • .NET (Client.cs:1346): Uses using var _ = TelemetryHelpers.RestoreTraceContext(traceparent, tracestate);
  • Go (client.go:1519-1526): Calls handler(invocation) directly without context restoration

Why this matters: While the Go ToolHandler signature doesn't accept context.Context (which is documented as a known limitation), the other SDKs still restore the ambient trace context even when they can't pass it directly to the handler. This allows any OpenTelemetry instrumentation that reads from the ambient context to work correctly.

Suggested fix: In go/client.go around line 1519, before calling the handler:

invocation := ToolInvocation{
    SessionID:  req.SessionID,
    ToolCallID: req.ToolCallID,
    ToolName:   req.ToolName,
    Arguments:  req.Arguments,
}

// Restore trace context around the handler invocation. ToolHandler does not
// accept a context.Context, and Go has no goroutine-local storage, so ctx
// cannot reach user code; it is built here only to show intent.
ctx := contextWithTraceParent(context.Background(), req.Traceparent, req.Tracestate)
_ = ctx // avoids "declared and not used" until ToolHandler accepts a context
result, err := handler(invocation)

However, since Go doesn't have goroutine-local storage and the ToolHandler signature doesn't accept context.Context, there may be limited benefit to this change. The current approach is acceptable given the documented limitation, but it would be more consistent with the other SDKs to at least create the context (even if unused) to show the intent.

Summary

This is an excellent, consistent implementation across all SDKs! The only minor inconsistency is in the Go v2 handler's trace context restoration, which is somewhat mitigated by the documented limitation around Go's ToolHandler signature. The feature parity, API design, and documentation are all outstanding. Great work! 🚀

Generated by SDK Consistency Review Agent for issue #785

Contributor

@github-actions github-actions bot left a comment

Generated by SDK Consistency Review Agent for issue #785

Arguments: req.Arguments,
}

result, err := handler(invocation)
Contributor

Consistency suggestion: For consistency with the other SDK implementations, consider restoring the trace context before calling the handler here:

// Restore trace context (even though ToolHandler can't receive it)
ctx := contextWithTraceParent(context.Background(), req.Traceparent, req.Tracestate)
_ = ctx // avoids "declared and not used" until ToolHandler accepts a context
// Note: ToolHandler signature doesn't accept context.Context, so any spans created
// by the handler won't be automatically parented unless the handler manually propagates context.
result, err := handler(invocation)

The other SDKs all restore trace context in their v2 handlers:

  • Node.js uses withTraceContext(traceparent, tracestate, () => handler(...))
  • Python uses with trace_context(tp, ts): result = handler(...)
  • .NET uses using var _ = RestoreTraceContext(traceparent, tracestate);

While Go's limitation is well-documented in the README, adding the context restoration (even if unused) would make the intent clearer and align with the other implementations.

Contributor

Addressed in an earlier commit — we now populate ToolInvocation.TraceContext with the restored context in both the v2 (client.go) and v3 (session.go) paths. The handler receives a fully usable context.Context.

… documentation

Add telemetry documentation across all SDK docs:

- getting-started.md: New 'Telemetry & Observability' section with
  per-language examples, TelemetryConfig options table, file export
  example, and trace context propagation explanation
- Per-SDK READMEs (Node.js, Python, Go, .NET): Add telemetry option
  to constructor/options lists and new Telemetry sections with
  language-specific examples and dependency notes
- observability/opentelemetry.md: Add 'Built-in Telemetry Support'
  section at top with multi-language examples, options table,
  propagation details, and dependency matrix
- docs/index.md: Update Observability description

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Contributor

✅ Cross-SDK Consistency Review: Excellent

This PR demonstrates exemplary cross-language consistency in adding OpenTelemetry support to all four SDK implementations. Here's what I verified:

✅ Consistent Implementation Across All SDKs

1. TelemetryConfig Structure — All SDKs expose the same five configuration options with appropriate naming conventions:

  • otlpEndpoint (Node), otlp_endpoint (Python), OTLPEndpoint (Go), OtlpEndpoint (.NET)
  • filePath / file_path / FilePath / FilePath
  • exporterType / exporter_type / ExporterType / ExporterType
  • sourceName / source_name / SourceName / SourceName
  • captureContent / capture_content / CaptureContent / CaptureContent

2. Environment Variable Mapping — Consistent across all languages:

  • COPILOT_OTEL_ENABLED=true (set when telemetry config present)
  • OTEL_EXPORTER_OTLP_ENDPOINT, COPILOT_OTEL_FILE_EXPORTER_PATH, etc.

3. Trace Context Propagation — All SDKs propagate W3C Trace Context (traceparent/tracestate) in three places:

  • session.create / CreateSession / create_session / CreateSessionAsync
  • session.resume / ResumeSession / resume_session / ResumeSessionAsync
  • session.send / Send / send / SendAsync

4. Tool Handler Context Restoration — All SDKs restore trace context before invoking user tool handlers in both v2 RPC (tool.call) and v3 broadcast (tool.call.requested) code paths:

  • Node: withTraceContext(traceparent, tracestate, () => handler(...))
  • Python: with trace_context(traceparent, tracestate): handler(...)
  • Go: ctx := contextWithTraceParent(context.Background(), traceparent, tracestate)
  • .NET: using (TelemetryHelpers.RestoreTraceContext(traceparent, tracestate)) await ExecuteToolAndRespondAsync(...)

5. Telemetry Modules — Each SDK has a dedicated telemetry helper module (telemetry.ts, telemetry.py, telemetry.go, Telemetry.cs) with unit tests.

6. Documentation — Comprehensive and consistent:

  • All language READMEs include a "Telemetry" or "Telemetry & Observability" section
  • Shared docs/getting-started.md section with side-by-side examples
  • New docs/observability/opentelemetry.md guide
  • Clear dependency notes per language

📝 Known Limitation (Documented)

The Go SDK has a documented limitation: the ToolHandler signature doesn't accept a context.Context, so while trace context is restored around the handler (for the HandlePendingToolCall RPC), it cannot be passed into user tool code. The Go README explicitly notes this and mentions a future breaking change may address it. This is an acceptable trade-off given Go's current API design.

🎯 Test Coverage

All four SDKs include telemetry unit tests:

  • nodejs/test/telemetry.test.ts (122 lines)
  • python/test_telemetry.py (128 lines)
  • go/telemetry_test.go (57 lines)
  • dotnet/test/TelemetryTests.cs (65 lines)

Summary: This PR maintains feature parity and API consistency across all language implementations. No cross-SDK inconsistencies found. 🚀

Generated by SDK Consistency Review Agent for issue #785

@github-actions
Contributor

Cross-SDK Consistency Review ✅

I've completed a thorough review of PR #785 for cross-language consistency across all four SDK implementations (Node.js, Python, Go, and .NET). This is an excellent example of consistent feature implementation!

Summary

This PR adds OpenTelemetry support across all SDKs with excellent consistency. The implementation maintains API parity while respecting language-specific idioms and conventions.

✅ What's Consistent

1. TelemetryConfig API Surface

All SDKs expose the same five configuration fields with equivalent semantics:

  • otlpEndpoint / otlp_endpoint / OTLPEndpoint / OtlpEndpoint
  • filePath / file_path / FilePath / FilePath
  • exporterType / exporter_type / ExporterType / ExporterType
  • sourceName / source_name / SourceName / SourceName
  • captureContent / capture_content / CaptureContent / CaptureContent

Naming follows language conventions (camelCase, snake_case, PascalCase) appropriately.

2. Environment Variable Mapping

All SDKs map the same config fields to the same CLI environment variables:

  • COPILOT_OTEL_ENABLED=true (set when telemetry config is present)
  • OTEL_EXPORTER_OTLP_ENDPOINT
  • COPILOT_OTEL_FILE_EXPORTER_PATH
  • COPILOT_OTEL_EXPORTER_TYPE
  • COPILOT_OTEL_SOURCE_NAME
  • OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT

3. W3C Trace Context Propagation

All SDKs propagate traceparent/tracestate consistently:

  • Outbound (SDK → CLI): On session.create, session.resume, and session.send
  • Inbound (CLI → SDK): On both v2 (tool.call) and v3 (tool.call.requested) tool call paths

4. Telemetry Helper Modules

Each SDK has a dedicated telemetry module with equivalent functionality:

  • Extract current trace context: getTraceContext() / get_trace_context() / getTraceContext() / GetTraceContext()
  • Restore trace context: withTraceContext() / trace_context() (context manager) / contextWithTraceParent() / RestoreTraceContext()

5. Graceful Degradation

All SDKs handle missing OpenTelemetry dependencies gracefully:

  • Node.js: Dynamic import with .catch(() => undefined)
  • Python: Try/except ImportError
  • Go: Uses required dependency go.opentelemetry.io/otel
  • .NET: Uses built-in System.Diagnostics.Activity (no external dep)
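
The Python bullet above (try/except ImportError) might look roughly like the sketch below. It assumes the optional `opentelemetry-api` package and its `opentelemetry.propagate.inject` function; the body of `get_trace_context` here is illustrative and not copied from the SDK's `telemetry.py`.

```python
try:
    # Optional dependency: only available if opentelemetry-api is installed.
    from opentelemetry import propagate
except ImportError:
    propagate = None


def get_trace_context() -> dict:
    """Return the current traceparent/tracestate, or {} if unavailable."""
    if propagate is None:
        return {}  # OpenTelemetry is not installed: degrade to a no-op
    carrier = {}
    # Writes the active context into the carrier using the global propagator.
    propagate.inject(carrier)
    # Keep only the W3C Trace Context fields that the RPC payload carries.
    return {
        key: value
        for key, value in carrier.items()
        if key in ("traceparent", "tracestate")
    }
```

Callers can spread the result into an outgoing request unconditionally, since an empty dict adds no fields.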

6. Documentation

  • Comprehensive docs added to docs/getting-started.md and docs/observability/opentelemetry.md
  • Per-language README sections updated consistently
  • All SDKs document the same features and behavior

✅ Language-Specific Design Choices (Appropriate)

These differences are intentional and correct based on language idioms:

  1. Go's CaptureContent uses *bool instead of bool to distinguish unset from false (idiomatic Go pattern for optional fields)
  2. Python uses TypedDict with total=False for optional fields (Pythonic)
  3. .NET uses nullable reference types (string?, bool?) (C# convention)
  4. Node.js uses optional properties with ?: (TypeScript convention)
  5. Python uses context manager (with trace_context()) while others use function callbacks (matches language patterns)

📝 Notable Design Decision

Go ToolHandler Limitation: The PR includes an appropriate comment in go/types.go noting that the ToolHandler signature doesn't accept context.Context, so trace context can't be passed directly into user tool code. This is documented as a known limitation for a future breaking change. This is good practice—acknowledging the limitation rather than introducing a breaking change now.

🎯 Verdict

No consistency issues found. This PR exemplifies how to add a feature across multiple language SDKs while maintaining:

  • Functional parity
  • API design consistency
  • Language-appropriate idioms
  • Comprehensive documentation

The implementation is ready from a cross-SDK consistency perspective. Great work! 🎉

Generated by SDK Consistency Review Agent for issue #785

Add TelemetryConfig to all four SDKs (Node, Python, Go, .NET) to configure
OpenTelemetry instrumentation on the Copilot CLI process. This includes:

- TelemetryConfig type with OTLP endpoint, file exporter, source name, and
  capture-content options, mapped to CLI environment variables
- W3C Trace Context propagation (traceparent/tracestate) on session.create,
  session.resume, and session.send RPC calls
- Trace context restoration in tool call handlers (v2 RPC and v3 broadcast)
  so user tool code executes within the correct distributed trace
- Telemetry helper modules (telemetry.ts, telemetry.py, telemetry.go,
  Telemetry.cs) with unit tests
- Updated generated types from latest schema

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Contributor

✅ Cross-SDK Consistency Review: PASS

I've completed a comprehensive consistency review of PR #785 across all four SDK implementations (Node.js, Python, Go, .NET). This PR demonstrates excellent cross-language consistency in implementing OpenTelemetry support.

✅ Consistent Implementation Across All SDKs

1. TelemetryConfig API - Fully Consistent

All SDKs expose equivalent configuration options following language conventions:

| Property | Node.js | Python | Go | .NET |
| --- | --- | --- | --- | --- |
| OTLP Endpoint | otlpEndpoint | otlp_endpoint | OTLPEndpoint | OtlpEndpoint |
| File Path | filePath | file_path | FilePath | FilePath |
| Exporter Type | exporterType | exporter_type | ExporterType | ExporterType |
| Source Name | sourceName | source_name | SourceName | SourceName |
| Capture Content | captureContent | capture_content | CaptureContent | CaptureContent |

2. W3C Trace Context Propagation - Fully Consistent

All SDKs propagate traceparent/tracestate on:

  • session.create / CreateSession / create_session
  • session.resume / ResumeSession / resume_session
  • session.send / Send / send

3. Tool Handler Trace Context Restoration - Consistent

All SDKs restore inbound trace context before invoking tool handlers:

  • Node.js: withTraceContext(traceparent, tracestate, () => handler(...))
  • Python: with trace_context(traceparent, tracestate): handler(...)
  • Go: ctx := contextWithTraceParent(context.Background(), traceparent, tracestate) ⚠️ *
  • .NET: using (TelemetryHelpers.RestoreTraceContext(data.Traceparent, data.Tracestate))

* Known Limitation Properly Documented: Go's current ToolHandler signature doesn't accept context.Context, so while trace context is restored around the handler, it can't be passed into user code. This limitation is clearly documented in the Go README with a note about future enhancement.

4. CLI Environment Variable Mapping - Fully Consistent

All SDKs set identical environment variables when spawning the CLI:

  • COPILOT_OTEL_ENABLED=true
  • OTEL_EXPORTER_OTLP_ENDPOINT
  • COPILOT_OTEL_FILE_EXPORTER_PATH
  • COPILOT_OTEL_EXPORTER_TYPE
  • COPILOT_OTEL_SOURCE_NAME
  • OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT

5. Test Coverage - Fully Consistent

All SDKs include telemetry unit tests:

  • nodejs/test/telemetry.test.ts
  • python/test_telemetry.py
  • go/telemetry_test.go
  • dotnet/test/TelemetryTests.cs

6. Documentation - Fully Consistent

  • ✅ Per-language README sections with examples
  • ✅ Shared documentation in docs/getting-started.md and docs/observability/opentelemetry.md
  • ✅ Configuration table comparing all four languages
  • ✅ Dependency requirements clearly stated

7. Generated Types Updated - Fully Consistent

RPC request/response types updated to include traceparent/tracestate fields in all SDKs.

Summary

This PR maintains exceptional feature parity across all four SDK implementations. The API design is parallel, accounting appropriately for language idioms (camelCase, snake_case, PascalCase). The one limitation in Go's tool handler signature is both unavoidable (would require breaking change) and properly documented.

No consistency issues found. 🎉

Generated by SDK Consistency Review Agent for issue #785

@SteveSandersonMS
Contributor

SteveSandersonMS commented Mar 13, 2026

@stephentoub Thanks for this!

It generally looks great. I made two small alterations - hope this seems OK to you:

  • For Go, passed the trace context into tool calls via the copilot.ToolInvocation object
  • For Node, removed the peerDependency and dynamic import of @opentelemetry/api in favour of an optional callback that developers can use to wire up our context into @opentelemetry/api themselves
    • The reason is I prefer to avoid the peer-dependency and dynamic-import approach whenever possible as bundlers can get into trouble with it or produce errors by default if you don't have the optional dependency
    • If this turns out to make people unhappy we can come back to adding the optional dependency and dynamically detect it in the future

@github-actions
Contributor

Cross-SDK Consistency Review

I've reviewed this PR for consistency across all four SDK implementations. Overall, this is an excellent and very thorough piece of work that adds OpenTelemetry support comprehensively across all languages. The implementation is remarkably consistent, but I found one important inconsistency:

✅ Consistent Across All SDKs

  1. TelemetryConfig structure: All four SDKs have equivalent configuration with proper naming conventions:

    • otlpEndpoint/otlp_endpoint/OTLPEndpoint/OtlpEndpoint
    • filePath/file_path/FilePath/FilePath
    • exporterType/exporter_type/ExporterType/ExporterType
    • sourceName/source_name/SourceName/SourceName
    • captureContent/capture_content/CaptureContent/CaptureContent
  2. Trace context propagation on RPC calls: All SDKs consistently propagate traceparent/tracestate on session.create, session.resume, and session.send operations

  3. Telemetry helper modules: Each SDK has a dedicated telemetry module with appropriate unit tests

  4. v2 backward compatibility: All SDKs handle the v2 tool.call RPC and restore trace context in those handlers

  5. Generated types updated: RPC types properly include traceparent/tracestate fields

⚠️ Inconsistency Found: v3 Tool Handler Trace Context Restoration

Issue: When handling v3 broadcast external_tool.requested events, Node.js does not automatically restore trace context while the other three SDKs do.

Current behavior:

  • Python (session.py:337): ✅ Automatically restores via with trace_context(traceparent, tracestate):
  • Go (session.go:542): ✅ Automatically restores via contextWithTraceParent() and passes in ToolInvocation.TraceContext
  • .NET (Session.cs:419): ✅ Automatically restores via using (TelemetryHelpers.RestoreTraceContext(...))
  • Node.js (session.ts:386-393): ❌ Only passes traceparent/tracestate fields in the ToolInvocation object, requiring users to manually restore

Why this matters: Automatic trace context restoration means that if a user's tool handler creates child spans using their OpenTelemetry SDK, those spans will automatically be parented to the CLI's span. Without automatic restoration, users must manually extract and restore the context, which is more error-prone and creates an inconsistent developer experience.

Documentation inconsistency: The OpenTelemetry guide (docs/observability/opentelemetry.md:139) states: "For Go, the ToolInvocation.TraceContext field is a context.Context with the trace already restored — use it directly as the parent for your spans. For Python and .NET, extract from the raw traceparent/tracestate strings using the respective APIs."

However, the actual implementation shows Python and .NET also automatically restore context (not just Go), making the documentation outdated.

📝 Recommendations

  1. For consistency: Consider updating Node.js to automatically restore trace context in v3 tool handlers (similar to Python/Go/.NET), OR document this as an intentional design difference with a clear rationale

  2. Update documentation: The OpenTelemetry guide should accurately reflect that Python, Go, and .NET all automatically restore trace context, not just Go

  3. Consider adding: A brief comment in the Node.js session.ts explaining why manual restoration is chosen (if intentional)

💯 What's Great

  • Comprehensive coverage across all four languages
  • Consistent API naming following each language's conventions
  • Proper handling of both v2 and v3 protocol paths
  • Good test coverage for telemetry helpers
  • Clear, well-structured documentation

This is high-quality work! The inconsistency noted above is the only cross-SDK issue I found.

Generated by SDK Consistency Review Agent for issue #785

Contributor

@github-actions github-actions bot left a comment

Generated by SDK Consistency Review Agent for issue #785

arguments: args,
traceparent,
tracestate,
});
Contributor

Cross-SDK consistency note: Node.js passes traceparent/tracestate to the user handler but doesn't automatically restore the trace context. In contrast:

  • Python automatically restores via with trace_context(traceparent, tracestate):
  • Go automatically restores via contextWithTraceParent()
  • .NET automatically restores via using (TelemetryHelpers.RestoreTraceContext(...))

This means Node.js users must manually restore the context (as shown in the docs), while other SDK users get automatic parent-child span linking. Consider whether this is intentional or if Node.js should also automatically restore for consistency.

Contributor

This is intentional. Node.js removed the @opentelemetry/api dependency entirely, so the SDK can't auto-restore context without pulling in that package. Instead, raw traceparent/tracestate strings are passed on ToolInvocation and users who want span linkage restore it themselves (documented in the OpenTelemetry guide). Python, Go, and .NET all have OTel/Activity dependencies already, so they auto-restore.

});
```

For **Go**, the `ToolInvocation.TraceContext` field is a `context.Context` with the trace already restored — use it directly as the parent for your spans. For **Python** and **.NET**, extract from the raw `traceparent`/`tracestate` strings using the respective APIs.
Contributor

Documentation accuracy issue: This states that only Go has trace context already restored, and that Python and .NET need to "extract from the raw traceparent/tracestate strings."

However, reviewing the actual implementations shows:

  • Python (session.py:337): Automatically restores via with trace_context(traceparent, tracestate):
  • .NET (Session.cs:419): Automatically restores via using (TelemetryHelpers.RestoreTraceContext(...))
  • Go (session.go:542): Automatically restores via contextWithTraceParent()

So all three of these SDKs automatically restore trace context, not just Go. Only Node.js requires manual restoration by the user.

Suggest updating this section to accurately reflect the current implementations across all four languages.

Contributor

Good catch — fixed. The docs now accurately describe that Go, Python, and .NET all auto-restore trace context, and only Node.js requires manual restoration (since it has no OTel dependency).

Replace the optional @opentelemetry/api peer dependency with a
user-provided callback approach:

- Add TraceContext interface and TraceContextProvider type
- Add onGetTraceContext callback to CopilotClientOptions
- Pass traceparent/tracestate directly on ToolInvocation for inbound context
- Remove @opentelemetry/api from peerDependencies and devDependencies
- Rewrite telemetry.ts to a simple callback-based helper (~27 lines)
- Update tests, README, and OpenTelemetry docs with wire-up examples

Users who want distributed trace propagation provide a callback:

  const client = new CopilotClient({
    onGetTraceContext: () => {
      const carrier = {};
      propagation.inject(context.active(), carrier);
      return carrier;
    },
  });

TelemetryConfig (CLI env vars) is unchanged and requires no dependency.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@SteveSandersonMS SteveSandersonMS marked this pull request as ready for review March 13, 2026 15:44
@SteveSandersonMS SteveSandersonMS requested a review from a team as a code owner March 13, 2026 15:44
Copilot AI review requested due to automatic review settings March 13, 2026 15:44
Contributor

Copilot AI left a comment

Pull request overview

Adds OpenTelemetry integration across Node.js, Python, Go, and .NET SDKs to enable distributed tracing between SDK consumers and the Copilot CLI, including W3C Trace Context propagation and tool-handler context restoration.

Changes:

  • Introduces TelemetryConfig in each SDK and maps it to CLI environment variables to enable/export telemetry.
  • Propagates traceparent/tracestate on key RPC calls (create/resume/send) and restores inbound context for tool execution.
  • Adds new telemetry helper modules + unit tests, updates documentation, and updates Go scenario module files to reflect new deps.

Reviewed changes

Copilot reviewed 63 out of 95 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| test/scenarios/transport/tcp/go/go.mod | Adds indirect OTel deps for Go scenario module. |
| test/scenarios/transport/tcp/go/go.sum | Updates dependency checksums for Go scenario module. |
| test/scenarios/transport/stdio/go/go.mod | Adds indirect OTel deps for Go scenario module. |
| test/scenarios/transport/stdio/go/go.sum | Updates dependency checksums for Go scenario module. |
| test/scenarios/transport/reconnect/go/go.mod | Adds indirect OTel deps for Go scenario module. |
| test/scenarios/transport/reconnect/go/go.sum | Updates dependency checksums for Go scenario module. |
| test/scenarios/tools/virtual-filesystem/go/go.mod | Adds indirect OTel deps for Go scenario module. |
| test/scenarios/tools/virtual-filesystem/go/go.sum | Updates dependency checksums for Go scenario module. |
| test/scenarios/tools/tool-overrides/go/go.mod | Adds indirect OTel deps for Go scenario module. |
| test/scenarios/tools/tool-overrides/go/go.sum | Updates dependency checksums for Go scenario module. |
| test/scenarios/tools/tool-filtering/go/go.mod | Adds indirect OTel deps for Go scenario module. |
| test/scenarios/tools/tool-filtering/go/go.sum | Updates dependency checksums for Go scenario module. |
| test/scenarios/tools/skills/go/go.mod | Adds indirect OTel deps for Go scenario module. |
| test/scenarios/tools/skills/go/go.sum | Updates dependency checksums for Go scenario module. |
| test/scenarios/tools/no-tools/go/go.mod | Adds indirect OTel deps for Go scenario module. |
| test/scenarios/tools/no-tools/go/go.sum | Updates dependency checksums for Go scenario module. |
| test/scenarios/tools/mcp-servers/go/go.mod | Adds indirect OTel deps for Go scenario module. |
| test/scenarios/tools/mcp-servers/go/go.sum | Updates dependency checksums for Go scenario module. |
| test/scenarios/tools/custom-agents/go/go.mod | Adds indirect OTel deps for Go scenario module. |
| test/scenarios/tools/custom-agents/go/go.sum | Updates dependency checksums for Go scenario module. |
| test/scenarios/sessions/streaming/go/go.mod | Adds indirect OTel deps for Go scenario module. |
| test/scenarios/sessions/streaming/go/go.sum | Updates dependency checksums for Go scenario module. |
| test/scenarios/sessions/session-resume/go/go.mod | Adds indirect OTel deps for Go scenario module. |
| test/scenarios/sessions/session-resume/go/go.sum | Updates dependency checksums for Go scenario module. |
| test/scenarios/sessions/infinite-sessions/go/go.mod | Adds indirect OTel deps for Go scenario module. |
| test/scenarios/sessions/infinite-sessions/go/go.sum | Updates dependency checksums for Go scenario module. |
| test/scenarios/sessions/concurrent-sessions/go/go.mod | Adds indirect OTel deps for Go scenario module. |
| test/scenarios/sessions/concurrent-sessions/go/go.sum | Updates dependency checksums for Go scenario module. |
| test/scenarios/prompts/system-message/go/go.mod | Adds indirect OTel deps for Go scenario module. |
| test/scenarios/prompts/system-message/go/go.sum | Updates dependency checksums for Go scenario module. |
| test/scenarios/prompts/reasoning-effort/go/go.mod | Adds indirect OTel deps for Go scenario module. |
| test/scenarios/prompts/reasoning-effort/go/go.sum | Updates dependency checksums for Go scenario module. |
| test/scenarios/prompts/attachments/go/go.mod | Adds indirect OTel deps for Go scenario module. |
| test/scenarios/prompts/attachments/go/go.sum | Updates dependency checksums for Go scenario module. |
| test/scenarios/modes/minimal/go/go.mod | Adds indirect OTel deps for Go scenario module. |
| test/scenarios/modes/minimal/go/go.sum | Updates dependency checksums for Go scenario module. |
| test/scenarios/modes/default/go/go.mod | Adds indirect OTel deps for Go scenario module. |
| test/scenarios/modes/default/go/go.sum | Updates dependency checksums for Go scenario module. |
| test/scenarios/callbacks/user-input/go/go.mod | Adds indirect OTel deps for Go scenario module. |
| test/scenarios/callbacks/user-input/go/go.sum | Updates dependency checksums for Go scenario module. |
| test/scenarios/callbacks/permissions/go/go.mod | Adds indirect OTel deps for Go scenario module. |
| test/scenarios/callbacks/permissions/go/go.sum | Updates dependency checksums for Go scenario module. |
| test/scenarios/callbacks/hooks/go/go.mod | Adds indirect OTel deps for Go scenario module. |
test/scenarios/callbacks/hooks/go/go.sum Updates dependency checksums for Go scenario module.
test/scenarios/bundling/fully-bundled/go/go.mod Adds indirect OTel deps for Go scenario module.
test/scenarios/bundling/fully-bundled/go/go.sum Updates dependency checksums for Go scenario module.
test/scenarios/bundling/container-proxy/go/go.mod Adds indirect OTel deps for Go scenario module.
test/scenarios/bundling/container-proxy/go/go.sum Updates dependency checksums for Go scenario module.
test/scenarios/bundling/app-direct-server/go/go.mod Adds indirect OTel deps for Go scenario module.
test/scenarios/bundling/app-direct-server/go/go.sum Updates dependency checksums for Go scenario module.
test/scenarios/bundling/app-backend-to-server/go/go.mod Adds indirect OTel deps for Go scenario module.
test/scenarios/bundling/app-backend-to-server/go/go.sum Updates dependency checksums for Go scenario module.
test/scenarios/auth/gh-app/go/go.mod Adds indirect OTel deps for Go scenario module.
test/scenarios/auth/gh-app/go/go.sum Updates dependency checksums for Go scenario module.
test/scenarios/auth/byok-openai/go/go.mod Adds indirect OTel deps for Go scenario module.
test/scenarios/auth/byok-openai/go/go.sum Updates dependency checksums for Go scenario module.
test/scenarios/auth/byok-ollama/go/go.mod Adds indirect OTel deps for Go scenario module.
test/scenarios/auth/byok-ollama/go/go.sum Updates dependency checksums for Go scenario module.
test/scenarios/auth/byok-azure/go/go.mod Adds indirect OTel deps for Go scenario module.
test/scenarios/auth/byok-azure/go/go.sum Updates dependency checksums for Go scenario module.
test/scenarios/auth/byok-anthropic/go/go.mod Adds indirect OTel deps for Go scenario module.
test/scenarios/auth/byok-anthropic/go/go.sum Updates dependency checksums for Go scenario module.
python/test_telemetry.py Adds Python unit tests for trace context helpers and env-var mapping behavior.
python/pyproject.toml Adds telemetry optional dependency group for opentelemetry-api.
python/copilot/types.py Adds TelemetryConfig type and telemetry field on SubprocessConfig.
python/copilot/telemetry.py Adds OpenTelemetry-based trace context injection/restoration helpers (optional import).
python/copilot/session.py Injects outbound trace context on session.send and restores inbound context around tool handlers.
python/copilot/client.py Injects trace context on create/resume; maps telemetry config to CLI env vars; restores inbound context for v2 tool calls.
python/copilot/__init__.py Exports TelemetryConfig in the Python package API.
python/README.md Documents Python telemetry configuration and install instructions.
nodejs/test/telemetry.test.ts Adds Node unit tests for trace context provider and telemetry env-var mapping.
nodejs/test/client.test.ts Adds Node tests asserting trace context inclusion in create/resume/send RPCs.
nodejs/src/types.ts Adds TelemetryConfig, TraceContext, and onGetTraceContext typing; adds trace context fields to tool invocation.
nodejs/src/telemetry.ts Adds helper to safely call user trace-context provider without SDK OTel deps.
nodejs/src/session.ts Propagates outbound trace context on session.send; forwards inbound trace fields to tool handlers.
nodejs/src/index.ts Re-exports telemetry-related types.
nodejs/src/client.ts Maps telemetry config to CLI env vars; injects trace context into session.create/resume; forwards inbound trace fields for v2 tool calls.
nodejs/README.md Documents Node telemetry config and advanced trace context propagation setup.
go/types.go Adds Go TelemetryConfig and trace context fields on RPC payload structs; adds ToolInvocation.TraceContext.
go/telemetry.go Adds Go helpers to inject/extract W3C trace context via global OTel propagator.
go/telemetry_test.go Adds unit tests for Go trace context helpers and ToolInvocation trace restoration.
go/session.go Injects trace context into session.send; restores inbound trace context for tool handlers (broadcast path).
go/client.go Injects trace context into create/resume; maps telemetry config to CLI env vars; restores inbound trace context for v2 tool calls.
go/go.mod Adds OTel dependency to Go module.
go/go.sum Adds checksums for new Go dependencies.
go/README.md Documents Go telemetry config and trace context propagation behavior.
dotnet/test/TelemetryTests.cs Adds .NET unit tests for TelemetryConfig and options cloning behavior.
dotnet/src/Types.cs Adds .NET TelemetryConfig and adds it to CopilotClientOptions.
dotnet/src/Telemetry.cs Adds .NET helpers to read current trace context and restore inbound trace context for tool handlers.
dotnet/src/Session.cs Injects trace context into session.send payload; restores inbound trace context around tool execution.
dotnet/src/Client.cs Injects trace context into create/resume; maps telemetry config to CLI env vars; restores inbound trace context for v2 tool calls; updates request types.
dotnet/README.md Documents .NET telemetry configuration and trace propagation.
docs/observability/opentelemetry.md Adds a cross-language guide for built-in telemetry config + trace propagation and advanced examples.
docs/getting-started.md Adds telemetry/observability section and configuration examples across languages.
docs/index.md Updates observability docs index entry to reflect new telemetry content.

))
```

Install with telemetry extras: `pip install copilot-sdk[telemetry]` (provides `opentelemetry-api`)

| Language | Dependency | Notes |
|---|---|---|
| Node.js | — | No dependency; provide `onGetTraceContext` callback for outbound propagation |
| Python | `opentelemetry-api` | Install with `pip install copilot-sdk[telemetry]` |

Trace context (`traceparent`/`tracestate`) is automatically propagated between the SDK and CLI on `CreateSession`, `ResumeSession`, and `Send` calls, and inbound when the CLI invokes tool handlers.

> **Note:** The current `ToolHandler` signature does not accept a `context.Context`, so the inbound trace context cannot be passed to handler code. Spans created inside a tool handler will not be automatically parented to the CLI's `execute_tool` span. A future version may add a context parameter.

Comment on lines +21 to +24
// Set up W3C propagator
otel.SetTextMapPropagator(propagation.TraceContext{})
defer otel.SetTextMapPropagator(propagation.NewCompositeTextMapPropagator())

Comment on lines +13 to +15
return Activity.Current is { } activity
? (activity.Id, activity.TraceStateString)
: (null, null);

Trace context (`traceparent`/`tracestate`) is automatically propagated between the SDK and CLI on `create_session`, `resume_session`, and `send` calls, and inbound when the CLI invokes tool handlers.

Install with telemetry extras: `pip install copilot-sdk[telemetry]` (provides `opentelemetry-api`)
@github-actions
Contributor

✅ Cross-SDK Consistency Review

I've reviewed this PR for consistency across all four SDK implementations (Node.js/TypeScript, Python, Go, and .NET). Here are my findings:

Excellent Consistency Across SDKs

This PR successfully implements OpenTelemetry support consistently across all four languages with appropriate adaptations for each ecosystem. All major features are present in all SDKs:

1. TelemetryConfig Structure

All four SDKs have equivalent configuration with the same 5 fields:

  • otlpEndpoint / otlp_endpoint / OTLPEndpoint / OtlpEndpoint
  • filePath / file_path / FilePath / FilePath
  • exporterType / exporter_type / ExporterType / ExporterType
  • sourceName / source_name / SourceName / SourceName
  • captureContent / capture_content / CaptureContent / CaptureContent

All map to the same environment variables and have equivalent semantics.
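
As a rough illustration of that mapping, here is a minimal Python sketch using only the standard library. The environment variable names come from the PR summary; `TelemetryConfigSketch` and `telemetry_env` are hypothetical names for this example, not the SDK's actual API, and the lowercase `"true"`/`"false"` encoding of `capture_content` is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TelemetryConfigSketch:
    """Hypothetical stand-in for TelemetryConfig, using the Python field names."""
    otlp_endpoint: Optional[str] = None
    file_path: Optional[str] = None
    exporter_type: Optional[str] = None
    source_name: Optional[str] = None
    capture_content: Optional[bool] = None


def telemetry_env(config: Optional[TelemetryConfigSketch]) -> Dict[str, str]:
    """Return the environment variables to set on the spawned CLI process."""
    if config is None:
        return {}  # telemetry not configured: leave the environment untouched

    # Providing any config at all switches telemetry on in the CLI.
    env = {"COPILOT_OTEL_ENABLED": "true"}
    if config.otlp_endpoint is not None:
        env["OTEL_EXPORTER_OTLP_ENDPOINT"] = config.otlp_endpoint
    if config.file_path is not None:
        env["COPILOT_OTEL_FILE_EXPORTER_PATH"] = config.file_path
    if config.exporter_type is not None:
        env["COPILOT_OTEL_EXPORTER_TYPE"] = config.exporter_type
    if config.source_name is not None:
        env["COPILOT_OTEL_SOURCE_NAME"] = config.source_name
    if config.capture_content is not None:
        # Assumption: booleans are serialized as lowercase "true"/"false".
        env["OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"] = (
            "true" if config.capture_content else "false"
        )
    return env
```

Only fields that are explicitly set produce a variable, so an unset field leaves the CLI's own default in place.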

2. Trace Context Propagation (SDK → CLI)

All SDKs propagate traceparent/tracestate on:

  • session.create / create_session / CreateSession
  • session.resume / resume_session / ResumeSession
  • session.send (all SDKs)

Design differences (documented and intentional):

  • Node.js: Uses explicit onGetTraceContext callback (no OpenTelemetry dependency)
  • Python: Automatic via opentelemetry-api package
  • Go: Automatic via go.opentelemetry.io/otel package
  • .NET: Automatic via built-in System.Diagnostics.Activity
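
The outbound step can be sketched as follows. This is a simplified, dependency-free Python illustration of the callback style described for Node.js, not code from any of the SDKs: `with_trace_context` and the `provider` callable are hypothetical names, and the real SDKs obtain the current context from their own tracing libraries.

```python
from typing import Any, Callable, Dict, Optional

# A provider returns the caller's current W3C trace context, e.g.
# {"traceparent": "00-<trace-id>-<span-id>-01", "tracestate": "vendor=value"}.
TraceContextProvider = Callable[[], Optional[Dict[str, str]]]


def with_trace_context(
    params: Dict[str, Any],
    provider: Optional[TraceContextProvider],
) -> Dict[str, Any]:
    """Return a copy of RPC params with traceparent/tracestate attached."""
    merged = dict(params)  # never mutate the caller's payload
    if provider is None:
        return merged
    try:
        context = provider() or {}
    except Exception:
        # A failing provider must not break the RPC; send without trace fields.
        return merged
    for key in ("traceparent", "tracestate"):
        value = context.get(key)
        if value:
            merged[key] = value
    return merged
```

The same helper would be applied to the payloads of `session.create`, `session.resume`, and `session.send`, so all three calls carry the caller's trace context when one is available.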

3. Trace Context Restoration (CLI → SDK)

Both v2 RPC (tool.call) and v3 broadcast (tool.call.requested) handlers properly restore trace context:

  • Python: with trace_context(traceparent, tracestate): (lines 337, 1638 in client.py/session.py)
  • Go: contextWithTraceParent(ctx, traceparent, tracestate) (lines 542, 1533 in client.go/session.go)
  • .NET: using TelemetryHelpers.RestoreTraceContext() (lines 419, 1355 in Client.cs/Session.cs)
  • Node.js: Intentionally different - provides raw traceparent/tracestate strings on ToolInvocation for manual extraction (documented in docs/observability/opentelemetry.md:115-142)

This design difference is appropriate because Node.js has no OpenTelemetry dependency, maintaining the SDK's zero-dependency philosophy for users who don't need telemetry.
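
To make the restoration pattern concrete, here is a minimal standard-library sketch of a context manager that parses an inbound `traceparent` and exposes it for the duration of a tool handler. It only mirrors the shape of the `with trace_context(...)` call mentioned above: `restored_trace_context`, `RemoteSpan`, and `current_remote_span` are hypothetical names, the parser handles only the version `00` layout from the W3C Trace Context specification, and the real SDKs attach the context through their tracing libraries rather than a bare `ContextVar`.

```python
import contextvars
import re
from contextlib import contextmanager
from typing import Iterator, NamedTuple, Optional


class RemoteSpan(NamedTuple):
    trace_id: str
    span_id: str
    sampled: bool
    tracestate: Optional[str]


# version-traceid-parentid-flags, all lowercase hex (W3C Trace Context, version 00).
_TRACEPARENT = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")
_current: contextvars.ContextVar = contextvars.ContextVar("remote_span", default=None)


def parse_traceparent(traceparent: Optional[str], tracestate: Optional[str] = None) -> Optional[RemoteSpan]:
    """Parse a traceparent header value; return None if it is missing or invalid."""
    match = _TRACEPARENT.fullmatch(traceparent or "")
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # The spec forbids version "ff" and all-zero trace or parent IDs.
    if version == "ff" or set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    return RemoteSpan(trace_id, span_id, bool(int(flags, 16) & 0x01), tracestate)


@contextmanager
def restored_trace_context(traceparent: Optional[str], tracestate: Optional[str] = None) -> Iterator[None]:
    """Make the inbound trace context current while the body runs."""
    token = _current.set(parse_traceparent(traceparent, tracestate))
    try:
        yield
    finally:
        _current.reset(token)  # always restore the previous context


def current_remote_span() -> Optional[RemoteSpan]:
    return _current.get()
```

A tool dispatcher would wrap the handler call in `with restored_trace_context(traceparent, tracestate):`, so any span the handler starts can be parented to the span that issued the tool call.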

4. Documentation

  • Comprehensive shared documentation in docs/observability/opentelemetry.md
  • Per-language README sections explaining the telemetry features
  • Clear documentation of the intentional design differences between Node.js and other SDKs

5. Testing

All SDKs have telemetry tests:

  • nodejs/test/telemetry.test.ts
  • python/test_telemetry.py
  • go/telemetry_test.go
  • dotnet/test/TelemetryTests.cs

🎯 Conclusion

This PR maintains excellent cross-SDK consistency while respecting language-specific conventions and dependencies. The design differences (particularly Node.js's manual trace context extraction vs. automatic restoration in other SDKs) are intentional, documented, and appropriate for each language's ecosystem.

No consistency issues found. This PR is ready from a cross-SDK consistency perspective. Great work! 🚀

Generated by SDK Consistency Review Agent for issue #785

@SteveSandersonMS SteveSandersonMS merged commit f2d21a0 into main Mar 13, 2026
41 of 42 checks passed
@SteveSandersonMS SteveSandersonMS deleted the stoub/otel-support branch March 13, 2026 15:52

3 participants