Confirm this is a feature request for the Node library and not the underlying OpenAI API.
Describe the feature or improvement you're requesting
I'd like to propose adding first-class TracingChannel support to the OpenAI Node.js SDK, following the pattern established by undici in Node.js core and adopted across the npm ecosystem.
TracingChannel is a higher-level API built on top of diagnostics_channel. It is built into Node.js, is also implemented by other runtimes such as Bun and Deno, and is designed specifically for tracing async operations. It provides structured lifecycle channels (start, end, error, asyncStart, asyncEnd) and handles async context propagation correctly — which is exactly the piece that monkey-patching approaches get wrong in real-world async applications.
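To make the lifecycle concrete, here is a minimal self-contained demo of the TracingChannel API (the channel name `demo:op` is purely illustrative, not part of any SDK):

```javascript
import dc from 'node:diagnostics_channel';

// Illustrative channel; 'demo:op' is not an existing SDK channel name.
const channel = dc.tracingChannel('demo:op');
const events = [];

// subscribe() attaches a handler to each lifecycle sub-channel at once.
channel.subscribe({
  start() { events.push('start'); },
  end() { events.push('end'); },
  asyncStart() { events.push('asyncStart'); },
  asyncEnd() { events.push('asyncEnd'); },
  error() { events.push('error'); },
});

// tracePromise publishes start/end around the synchronous portion of the
// call, then asyncStart/asyncEnd when the returned promise settles.
const result = await channel.tracePromise(async () => 42, { input: 'x' });
```

After the `await`, `events` is `['start', 'end', 'asyncStart', 'asyncEnd']` — the full lifecycle with no wrapping or patching of the traced function.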
I work at Sentry, so we have first-hand experience with the problems of the current monkey-patching approach. Current APM instrumentations use IITM (import-in-the-middle) for ESM and RITM (require-in-the-middle) for CJS to monkey-patch SDK internals. This has several fragility concerns:
- Runtime lock-in: both RITM and IITM rely on Node.js-specific module loader internals (`Module._resolveFilename`, `module.register()`). They don't work on Bun or Deno, which implement the Node.js API surface but not the module loader internals. The OpenAI SDK explicitly supports Node.js, Deno, and Bun, making monkey-patching especially inadequate.
- ESM fragility: IITM is built on Node.js's module customization hooks, which are still evolving and have been a persistent source of breakage in the OTel JS ecosystem.
- Initialization ordering: both require instrumentation to be set up before the SDK is first `require()`'d / `import`ed. Get the order wrong and instrumentation silently does nothing, which is very hard to debug in production.
- Bundling and Externalization: Users have to ensure their instrumented modules are externalized, which is becoming very difficult to guarantee with more and more frameworks bundling server-side code into single executables, binaries, or deployment files.
All of these are friction points that cannot be solved on the APM's side, resulting in a less-than-ideal DX and constant bugs whenever the user switches runtimes, platforms, or frameworks.
Taking Sentry as an example, our OpenAI instrumentation uses RITM/IITM to intercept require('openai'), replaces the OpenAI constructor, and creates a deep recursive Proxy on the resulting client instance to intercept method calls at arbitrary nesting depth. This spans ~1,200 lines across 6 files:
- instrumentation.ts (125 lines): IITM module patching, constructor wrapping
- core/index.ts (269 lines): deep Proxy creation, method interception, span lifecycle
- streaming.ts (236 lines): async iterable wrapping, chunk accumulation, stream error handling
- utils.ts (194 lines): attribute extraction from requests and responses
- constants.ts (31 lines): method registry mapping API paths to operation types
- types.ts (366 lines): type definitions for responses, streaming events, options
All of which is tightly coupled to the internal workings of the SDK and is therefore very fragile.
This is why we started an initiative asking the top libraries in the ecosystem to implement diagnostics channels. We got a lot of projects on board and would like the top AI SDKs to implement them as well.
Proposed Tracing Channels
All channels use the Node.js TracingChannel API, which provides start, end, asyncStart, asyncEnd, and error sub-channels automatically.
| TracingChannel | Tracks | Context fields |
| --- | --- | --- |
| `openai:chat` | Chat completion and response generation, from request to full response (or stream completion) | `method`, `model`, `stream`, `params`, `response` |
| `openai:embeddings` | Embedding generation | `model`, `params`, `response` |
Two channels rather than one because chat and embeddings have fundamentally different context shapes (messages, tools, and streaming for chat vs. input text and dimensions for embeddings). A method discriminator on the chat channel distinguishes between the Chat Completions API and the Responses API.
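As a sketch of how the discriminator would be used on the consumer side — note that the `method` field and its values are part of this proposal, not an existing SDK contract:

```javascript
import dc from 'node:diagnostics_channel';

const chat = dc.tracingChannel('openai:chat');
const ops = [];

chat.subscribe({
  start(ctx) {
    // ctx.method is the proposed discriminator; the exact field name and
    // values would be finalized in the spec PR.
    ops.push(ctx.method === 'responses.create' ? 'responses' : 'chat.completions');
  },
});

// Simulated SDK calls publishing on the proposed channel.
await chat.tracePromise(async () => ({}), { method: 'chat.completions.create', model: 'demo-model' });
await chat.tracePromise(async () => ({}), { method: 'responses.create', model: 'demo-model' });
```

A single subscriber can then branch on the API flavor without needing a third channel.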
The exact message payloads passed on these channels can be specced out properly in a PR.
The main gains from baking this in at the openai library level are:
- Zero performance cost when no one is listening; the code path can be short-circuited with `hasSubscribers`.
- Works across server runtimes: Deno, Bun, and Cloudflare Workers all support it today.
- Implementation here should be straightforward because, AFAIK, all methods return a promise; streaming may need a workaround.
- Anyone can use those channels to feed into other observability solutions (e.g: APMs, OpenTelemetry, and custom monitoring).
- APMs get a public API that isn't subject to frequent changes, unlike the implementation internals that monkey-patching forces them to depend on.
That means no hacks with RITM or IITM are needed anymore: any APM can subscribe dynamically at any time and report traces, metrics, or even logs.
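For the streaming case mentioned above, one possible workaround is to drive the sub-channels manually so that `asyncEnd` fires only once the stream is fully consumed. This is a hypothetical sketch, not an existing SDK API; the exact mapping of stream phases to lifecycle events would need to be agreed on:

```javascript
import dc from 'node:diagnostics_channel';

const channel = dc.tracingChannel('openai:chat');

// Hypothetical helper: maps the TracingChannel lifecycle onto a stream.
// start/end cover dispatch; asyncEnd fires once consumption finishes.
async function* traceStream(context, stream) {
  channel.start.publish(context);
  channel.end.publish(context);
  channel.asyncStart.publish(context);
  try {
    for await (const chunk of stream) {
      yield chunk;
    }
  } catch (err) {
    context.error = err;
    channel.error.publish(context);
    throw err;
  } finally {
    channel.asyncEnd.publish(context);
  }
}

// Fake stream standing in for an SDK streaming response.
async function* fakeStream() {
  yield 'a';
  yield 'b';
}

const events = [];
channel.subscribe({
  start() { events.push('start'); },
  end() { events.push('end'); },
  asyncStart() { events.push('asyncStart'); },
  asyncEnd() { events.push('asyncEnd'); },
  error() { events.push('error'); },
});

const chunks = [];
for await (const c of traceStream({ model: 'demo-model' }, fakeStream())) {
  chunks.push(c);
}
```

A subscriber observing this sees `start`, `end`, `asyncStart`, and then `asyncEnd` only after the last chunk, so span duration reflects the whole stream rather than just the initial request.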
I think #1563 can also be easily solved in userland once the openai package implements tracing channels; some users have already started doing so, because OTel instrumentations only work on Node.js and aren't the right abstraction layer to own first-party.
Additional context
Examples
This is a simplified sketch of the logic that would be added to the openai SDK:
```js
import dc from 'node:diagnostics_channel';

const chatChannel = dc.tracingChannel('openai:chat');

// Inside chat.completions.create / responses.create
async function create(params) {
  if (!chatChannel.hasSubscribers) {
    return this._makeRequest(params);
  }

  const context = {
    method: 'chat.completions.create',
    model: params.model,
    stream: !!params.stream,
    params,
  };

  return chatChannel.tracePromise(() => this._makeRequest(params), context);
}
```
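One property of the sketch above worth highlighting: because publishing checks for subscribers at call time, a consumer can attach at any point — even long after the SDK module has loaded — and immediately start receiving events, which removes the initialization-ordering problem entirely. A minimal demonstration, simulating an SDK call with the same pattern (the channel name is the one proposed here, not an existing contract):

```javascript
import dc from 'node:diagnostics_channel';

const channel = dc.tracingChannel('openai:chat');

// Stand-in for an SDK method using the pattern sketched above.
const call = (params) =>
  channel.tracePromise(async () => ({ ok: true }), { params });

await call({ model: 'a' }); // no subscribers yet: nothing is recorded

const seen = [];
channel.subscribe({
  start(ctx) { seen.push(ctx.params.model); },
});

await call({ model: 'b' }); // a late subscriber still sees this call
```

Only the second call is observed (`seen` ends up as `['b']`) — no load-order gymnastics required.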
APMs and any interested parties can then listen and react to those async executions like so:
```js
import dc from 'node:diagnostics_channel';

dc.tracingChannel('openai:chat').subscribe({
  start(ctx) {
    // Create a span, inject timings, or whatever is needed.
  },
  asyncEnd(ctx) {
    // End the span, record metrics...
  },
  error(ctx) {
    // Report unhandled failures.
  },
});
```
All of which is pretty minimal on both sides, yet it enables observing SDK calls, which is invaluable for monitoring AI-powered software today.
I'm working full time on helping teams ship this, so I'm more than happy to submit a PR and co-own this with the team. We can discuss it further if you think a PR is a good first step, and I can also jump on a call to showcase the importance of adopting tracing channels in the Node SDK.
Prior Art

Like I mentioned, we worked with the teams of these high-usage ecosystem libraries to adopt tracing channels, which allows each library to have its own observability story at little to no cost:

- mysql2: sidorares/node-mysql2#4178
- node-redis: redis/node-redis#3195
- ioredis: redis/ioredis#2089
- h3: h3js/h3#1251
- srvx: h3js/srvx#141

Other projects that already shipped them:

- fastify: ships TracingChannel support natively (`tracing:fastify.request.handler`)
- undici (Node.js core): ships TracingChannel support since Node 20.12 (`undici:request`)