Confirm this is a feature request for the Node library and not the underlying OpenAI API.
Describe the feature or improvement you're requesting
I'd like to propose adding first-class TracingChannel support to the OpenAI Node.js SDK, following the pattern established by undici in Node.js core and adopted across the npm ecosystem.
TracingChannel is a higher-level API built on top of diagnostics_channel. It is built into Node.js, is also implemented by other runtimes such as Bun and Deno, and is designed specifically for tracing async operations. It provides structured lifecycle channels (start, end, error, asyncStart, asyncEnd) and handles async context propagation correctly — which is exactly the piece that monkey-patching approaches get wrong in real-world async applications.
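To make the lifecycle concrete, here is a minimal self-contained demo of the TracingChannel API (the channel name `demo:op` is purely illustrative, not part of any SDK):

```javascript
import dc from 'node:diagnostics_channel';

// Illustrative channel; 'demo:op' is not an existing SDK channel name.
const channel = dc.tracingChannel('demo:op');
const events = [];

// subscribe() attaches a handler to each lifecycle sub-channel at once.
channel.subscribe({
  start() { events.push('start'); },
  end() { events.push('end'); },
  asyncStart() { events.push('asyncStart'); },
  asyncEnd() { events.push('asyncEnd'); },
  error() { events.push('error'); },
});

// tracePromise publishes start/end around the synchronous portion of the
// call, then asyncStart/asyncEnd when the returned promise settles.
const result = await channel.tracePromise(async () => 42, { input: 'x' });
```

After the `await`, `events` is `['start', 'end', 'asyncStart', 'asyncEnd']` — the full lifecycle with no wrapping or patching of the traced function.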
I work at Sentry, so we have first-hand experience with the problems of the current monkey-patching approach. Current APM instrumentations use IITM (import-in-the-middle) for ESM and RITM (require-in-the-middle) for CJS to monkey-patch SDK internals. This has several fragility concerns:
- Runtime lock-in: both RITM and IITM rely on Node.js-specific module loader internals (`Module._resolveFilename`, `module.register()`). They don't work on Bun or Deno, which implement the Node.js API surface but not the module loader internals. The OpenAI SDK explicitly supports Node.js, Deno, and Bun, making monkey-patching especially inadequate.
- ESM fragility: IITM is built on Node.js's module customization hooks, which are still evolving and have been a persistent source of breakage in the OTel JS ecosystem.
- Initialization ordering: both require instrumentation to be set up before the SDK is first `require()`'d / `import`ed. Get the order wrong and instrumentation silently does nothing, which is very hard to debug in production.
- Bundling and Externalization: Users have to ensure their instrumented modules are externalized, which is becoming very difficult to guarantee with more and more frameworks bundling server-side code into single executables, binaries, or deployment files.
All of these are friction points that cannot be solved on the APM's side, resulting in a less-than-ideal DX and constant bugs whenever the user switches runtimes, platforms, or frameworks.
Taking Sentry as an example, our OpenAI instrumentation uses RITM/IITM to intercept require('openai'), replaces the OpenAI constructor, and creates a deep recursive Proxy on the resulting client instance to intercept method calls at arbitrary nesting depth. This spans ~1,200 lines across 6 files:
- instrumentation.ts (125 lines): IITM module patching, constructor wrapping
- core/index.ts (269 lines): deep Proxy creation, method interception, span lifecycle
- streaming.ts (236 lines): async iterable wrapping, chunk accumulation, stream error handling
- utils.ts (194 lines): attribute extraction from requests and responses
- constants.ts (31 lines): method registry mapping API paths to operation types
- types.ts (366 lines): type definitions for responses, streaming events, options
All of which is tightly coupled to the internal workings of the SDK and is therefore very fragile.
This is why we started an initiative asking the top libraries in the ecosystem to implement diagnostics channels. We got a lot of projects on board and would like the top AI SDKs to implement them as well.
Proposed Tracing Channels
All channels use the Node.js TracingChannel API, which provides start, end, asyncStart, asyncEnd, and error sub-channels automatically.
| TracingChannel | Tracks | Context fields |
| --- | --- | --- |
| `openai:chat` | Chat completion and response generation, from request to full response (or stream completion) | `method`, `model`, `stream`, `params`, `response` |
| `openai:embeddings` | Embedding generation | `model`, `params`, `response` |
Two channels rather than one because chat and embeddings have fundamentally different context shapes (messages, tools, and streaming for chat vs. input text and dimensions for embeddings). A method discriminator on the chat channel distinguishes between the Chat Completions API and the Responses API.
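As a sketch of how the discriminator would be used on the consumer side — note that the `method` field and its values are part of this proposal, not an existing SDK contract:

```javascript
import dc from 'node:diagnostics_channel';

const chat = dc.tracingChannel('openai:chat');
const ops = [];

chat.subscribe({
  start(ctx) {
    // ctx.method is the proposed discriminator; the exact field name and
    // values would be finalized in the spec PR.
    ops.push(ctx.method === 'responses.create' ? 'responses' : 'chat.completions');
  },
});

// Simulated SDK calls publishing on the proposed channel.
await chat.tracePromise(async () => ({}), { method: 'chat.completions.create', model: 'demo-model' });
await chat.tracePromise(async () => ({}), { method: 'responses.create', model: 'demo-model' });
```

A single subscriber can then branch on the API flavor without needing a third channel.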
The exact message payloads passed on these channels can be specced out properly in a PR.
The main gains from baking this in at the openai library level are:
- Zero performance cost when no one is listening; the code path can be short-circuited with `hasSubscribers`.
- Works across server runtimes: Deno, Bun, and Cloudflare Workers all support it today.
- Implementation here should be straightforward because, AFAIK, all methods return a promise; streaming may need a workaround.
- Anyone can use those channels to feed into other observability solutions (e.g: APMs, OpenTelemetry, and custom monitoring).
- APMs get a public API that isn't subject to frequent changes, unlike the implementation internals that monkey-patching forces them to depend on.
That means no hacks with RITM or IITM are needed anymore: any APM can subscribe dynamically at any time and report traces, metrics, or even logs.
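For the streaming case mentioned above, one possible workaround is to drive the sub-channels manually so that `asyncEnd` fires only once the stream is fully consumed. This is a hypothetical sketch, not an existing SDK API; the exact mapping of stream phases to lifecycle events would need to be agreed on:

```javascript
import dc from 'node:diagnostics_channel';

const channel = dc.tracingChannel('openai:chat');

// Hypothetical helper: maps the TracingChannel lifecycle onto a stream.
// start/end cover dispatch; asyncEnd fires once consumption finishes.
async function* traceStream(context, stream) {
  channel.start.publish(context);
  channel.end.publish(context);
  channel.asyncStart.publish(context);
  try {
    for await (const chunk of stream) {
      yield chunk;
    }
  } catch (err) {
    context.error = err;
    channel.error.publish(context);
    throw err;
  } finally {
    channel.asyncEnd.publish(context);
  }
}

// Fake stream standing in for an SDK streaming response.
async function* fakeStream() {
  yield 'a';
  yield 'b';
}

const events = [];
channel.subscribe({
  start() { events.push('start'); },
  end() { events.push('end'); },
  asyncStart() { events.push('asyncStart'); },
  asyncEnd() { events.push('asyncEnd'); },
  error() { events.push('error'); },
});

const chunks = [];
for await (const c of traceStream({ model: 'demo-model' }, fakeStream())) {
  chunks.push(c);
}
```

A subscriber observing this sees `start`, `end`, `asyncStart`, and then `asyncEnd` only after the last chunk, so span duration reflects the whole stream rather than just the initial request.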
I think #1563 can also be easily solved in userland once the openai package implements tracing channels; some users have already started doing so, because OTel instrumentations only work on Node.js and aren't the right abstraction layer to own first-party.
Additional context
Examples
This is a simplified sketch of the logic that would be added to the openai SDK:
```js
import dc from 'node:diagnostics_channel';

const chatChannel = dc.tracingChannel('openai:chat');

// Inside chat.completions.create / responses.create
async function create(params) {
  if (!chatChannel.hasSubscribers) {
    return this._makeRequest(params);
  }

  const context = {
    method: 'chat.completions.create',
    model: params.model,
    stream: !!params.stream,
    params,
  };

  return chatChannel.tracePromise(() => this._makeRequest(params), context);
}
```
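One property of the sketch above worth highlighting: because publishing checks for subscribers at call time, a consumer can attach at any point — even long after the SDK module has loaded — and immediately start receiving events, which removes the initialization-ordering problem entirely. A minimal demonstration, simulating an SDK call with the same pattern (the channel name is the one proposed here, not an existing contract):

```javascript
import dc from 'node:diagnostics_channel';

const channel = dc.tracingChannel('openai:chat');

// Stand-in for an SDK method using the pattern sketched above.
const call = (params) =>
  channel.tracePromise(async () => ({ ok: true }), { params });

await call({ model: 'a' }); // no subscribers yet: nothing is recorded

const seen = [];
channel.subscribe({
  start(ctx) { seen.push(ctx.params.model); },
});

await call({ model: 'b' }); // a late subscriber still sees this call
```

Only the second call is observed (`seen` ends up as `['b']`) — no load-order gymnastics required.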
APMs and any interested parties can then listen and react to those async executions like so:
```js
import dc from 'node:diagnostics_channel';

dc.tracingChannel('openai:chat').subscribe({
  start(ctx) {
    // Create a span, inject timings, or whatever is needed.
  },
  asyncEnd(ctx) {
    // End the span, record metrics...
  },
  error(ctx) {
    // Report unhandled failures.
  },
});
```
All of which is pretty minimal on both sides, yet it enables observing SDK calls, which is invaluable for monitoring AI-powered software today.
I'm working full time on helping teams ship this, so I'm more than happy to submit a PR and co-own this with the team. We can discuss it further if you think a PR is a good first step, and I can also jump on a call to showcase the importance of adopting tracing channels in the Node SDK.
Prior Art

Like I mentioned, we worked with the teams of these high-usage ecosystem libraries to adopt tracing channels, which allows each library to have its own observability story at little to no cost:

- mysql2: sidorares/node-mysql2#4178
- node-redis: redis/node-redis#3195
- ioredis: redis/ioredis#2089
- h3: h3js/h3#1251
- srvx: h3js/srvx#141

Other projects that already shipped them:

- fastify: ships TracingChannel support natively (`tracing:fastify.request.handler`)
- undici (Node.js core): ships TracingChannel support since Node 20.12 (`undici:request`)