Skip to content

feat: add request layer lifecycle TracingChannels#196

Open
logaretm wants to merge 11 commits intopillarjs:masterfrom
logaretm:tracing-channel
Open

feat: add request layer lifecycle TracingChannels#196
logaretm wants to merge 11 commits intopillarjs:masterfrom
logaretm:tracing-channel

Conversation

@logaretm
Copy link
Copy Markdown

@logaretm logaretm commented Apr 15, 2026

This PR adds a single tracing channel that emits structured lifecycle events (start, end, asyncStart, asyncEnd, error) for every middleware, route handler, and error handler execution.

This builds on #96 but uses tracing channels instead which enables context propagation out of the box.

Context shape

The context object is pretty minimal, providing both req and res enables APMs to figure out how to react to that specific payload.

{ req, res, layer }

The layer is the Layer instance. Subscribers use layer.name, layer.handle.length (4 for error handlers), and req.route (set when inside a matched route) to classify and name spans however they see fit.

Usage Example:

This is an example I've set up locally using a Sentry setup with this PR patched in:

const Sentry = require('@sentry/node')
// otel-tracing-channel is a simple tracing channel wrapper that patches in the OTEL context propagation
// atm otel doesn't support tracing channels, so this workaround is required for OTEL-based APMs
// the user wouldn't need this and typically they can set their own thing with ALS.
const { tracingChannel } = require('otel-tracing-channel')

Sentry.init({
  // ...
})

// Subscribe to express:request TracingChannel and create Sentry spans
const channel = tracingChannel('pillarjs.router.request', (ctx) => {
  const { layer, req } = ctx
  const isErrorHandler = layer.handle.length === 4
  const isRouteHandler = Boolean(req.route)

  const spanName = isRouteHandler
    ? `${req.method} ${req.route.path}`
    : isErrorHandler
      ? `error_handler - ${layer.name}`
      : `middleware - ${layer.name}`

  return Sentry.startSpanManual(
    {
      name: spanName,
      op: isRouteHandler ? 'http.handler' : isErrorHandler ? 'middleware.error' : 'middleware',
      attributes: {
        'express.name': layer.name,
        'http.request.method': req.method,
        'url.path': req.url,
        'url.full': req.originalUrl,
        ...(req.route && { 'http.route': req.route.path }),
        ...(req.query && Object.keys(req.query).length > 0 && {
          'url.query': JSON.stringify(req.query)
        }),
        ...(req.params && Object.keys(req.params).length > 0 && {
          'express.params': JSON.stringify(req.params)
        })
      }
    },
    (span) => span
  )
})

channel.subscribe({
  asyncEnd (ctx) {
    if (ctx.req.route) {
      ctx.span?.setAttribute('http.response.status_code', ctx.res.statusCode)
    }
    ctx.span?.end()
  },
  error (ctx) {
    ctx.span?.setStatus({ code: 2, message: ctx.error?.message })
    ctx.span?.end()
  }
})

// Express App
const express = require('express')
const app = express()

// Middleware
app.use(express.json())

app.use(function requestLogger (req, res, next) {
  console.log(`${req.method} ${req.url}`)
  next()
})

app.use(function addRequestId (req, res, next) {
  req.requestId = Math.random().toString(36).slice(2, 10)
  res.setHeader('x-request-id', req.requestId)
  next()
})

// Routes
app.get('/users/:id', function validateId (req, res, next) {
  if (isNaN(req.params.id)) {
    return res.status(400).json({ error: 'id must be a number' })
  }
  next()
}, function loadUser (req, res, next) {
  req.user = { id: req.params.id, name: 'Test User' }
  next()
}, function getUser (req, res) {
  res.json({ ...req.user, requestId: req.requestId })
})

This produces a waterfall view looking like this:

CleanShot 2026-04-15 at 08 36 22@2x

Note: The three spans that look the same at the bottom actually represent the three handlers executing for that route, it is up to the subscriber to decide which one is the route handler and how to name the other ones since it has access to layer.

Related work

Caveats

  • The channel name right now is pillarjs.router.request, in the original PR it was request but I think channel names need to have a relationship with the module name otherwise we risk duplication in the ecosystem.
  • Depending on the exact version of Node 18 certain behaviors are expected in this PR:
    • Node versions without hasSubscribers will assume that there is a subscriber and will execute the traced path (with context object allocation). Multiple libraries like redis and mysql2 accepted this limitation.
    • Node versions without tracingChannel support will just not offer tracing of any kind. The fallback here is NOOP.
  • It is up to APMs to represent the onion shape with nested handlers or flatten certain spans which is relatively easy if APMs have their own ALS context propagation strategy.
  • Tests running a node version that do not have tracingChannel will be skipped via describe.skip.

Add a single `express:request` TracingChannel that emits structured
lifecycle events (start, end, asyncStart, asyncEnd, error) for every
middleware, route handler, and error handler execution.

Context shape: { req, res, layer } where layer is the Layer instance,
giving subscribers access to layer.name, layer.handle.length (for error
handler detection), and layer.route. Consumers decide how to classify
and name spans based on these properties.

Route dispatch wrapper layers (internal glue that calls route.dispatch)
are excluded from tracing to avoid duplicate events. The actual user
handlers inside the route are traced individually.

Zero overhead when no subscribers are registered. The hasSubscribers
check gates all context allocation and tracePromise wrapping.

Refs: pillarjs#96
Refs: expressjs/express#6353
Extract handlePromise and unify the traced/untraced paths into a
single try/catch and .then handler.
Copilot AI review requested due to automatic review settings April 15, 2026 12:45
logaretm added a commit to getsentry/js-tracing-channels-proposals that referenced this pull request Apr 15, 2026
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds request-layer lifecycle instrumentation using Node.js diagnostics_channel TracingChannel, emitting structured events for middleware, route handlers, and error handlers to support APM-style tracing with context propagation.

Changes:

  • Add express:request TracingChannel integration in Layer request/error handling.
  • Introduce shared invocation helper to wrap handler execution and promise handling with tracing hooks.
  • Add a new test suite validating emitted events, context shape, and basic ordering/behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
lib/layer.js Adds TracingChannel wiring and refactors handler invocation through a tracing-aware helper.
test/tracing.js New mocha tests covering tracing context, handler types, and lifecycle event expectations.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread lib/layer.js Outdated
Comment thread lib/layer.js Outdated
Copy link
Copy Markdown

@Qard Qard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM. I think we might want to validate if handled errors are still captured on the span prior to the handler though.

Comment thread test/tracing.js Outdated
})

describe('error handler tracing', function () {
it('should trace error handlers (fn.length === 4)', function (done) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this one capture the error even though it is considered handled overall?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not entirely sure, APMs will usually have a global error handler or whatever to catch them so having them traced might be an overkill and may duplicate error emissions for those who don't dedup them.

WDYT?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But does it having a handler here result in the error not being present on the span for the layer that lead to the error handler? Whatever the behaviour is, I think it would be useful to validate it more explicitly in this particular test. We should have a clear model of what errors happen where.

Copy link
Copy Markdown
Author

@logaretm logaretm Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I added more tests to pin down various behaviors. Here is how errors flow right now:

No error handler (no app.use):

  • Route throws sync -> error channel publishes for the route layer. Router responds 500.
  • Route promise rejects -> error channel publishes for the route layer. Router responds 500.

With error handler:

  • Route throws sync -> error channel publishes for the route, handler runs, nothing on the error channel for the handler.
  • Route rejects async -> same, error channel publishes for the route, handler clean.

With error handler that also throws/rejects:

  • Route throws sync, handler throws -> error channel publishes twice, once per layer. Two legitimately failed executions.
  • Route rejects async, handler throws -> same.

For an APM: both throw cases produce two error events on the channel — one with ctx.layer pointing at the route, one at the error handler. Each event targets its own span, so the APM just marks both spans as failed.

In the case for next(err) then the child error handler span will be marked, not the parent one. Which is fine I suppose since it technically recovered as you've said.

Tests in 1ad1b9c.

Copy link
Copy Markdown

@Qard Qard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm...this handling seems a bit incorrect. If an error is unhandled it should definitely be reported. And even if it is handled we should still probably be reporting it so we can capture cases like a too permissive error handler silencing legitimate failures.

I think probably what we should be doing is always capturing the error and just adding some handled flag to the context to mark when it has been handled somehow. That way we can still report it with the detail that the application thinks it has been handled while still allowing human judgement to look at the span and see if that assessment actually makes sense.

Comment thread test/tracing.js Outdated
})
})

it('should not emit error on the originating layer when next(err) is unhandled', function (done) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems problematic. Unhandled errors should definitely always be observable.

Copy link
Copy Markdown
Author

@logaretm logaretm Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay agreed, this could've become a real gap and it's always easier to dedup rather than miss events.

I pushed a fix to that in 90ed7e2 and adjusted the tests to pin that behavior.

@bjohansebas bjohansebas requested review from efekrskl and krzysdz April 21, 2026 03:07
Copy link
Copy Markdown

@Qard Qard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's in a reasonable state now. Might be good to also get a review from @rochdev or @Flarna for some more perspectives from other APM vendors. The specific error semantics I think might be a bit up to interpretation how they should be handled, but I think this shape should generally provide the context that is needed for the different ways we think about route errors. 🤔

Comment thread lib/layer.js Outdated
* @private
*/

const requestChannel = dc.tracingChannel && dc.tracingChannel('express:request')
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is express:request channel used here?
I think it would be better to use a dedicated channel for router to allow consumers to decide on what to receive and how to handle it.
I think router component can be also used without express.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I don't think it has to be express prefixed but it needs a unique prefix for sure which is why I wanted some opinions on what it should be. My concern is router:request is too generic and can easily conflict with other names in the ecosystem.

I had this in the PR description:

The channel name right now is express:request, in the original PR it was request but I think channel names need to have a relationship with the module name otherwise we risk duplication in the ecosystem.

WDYT? Maybe pillarjs:router:request?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be a guideline.
So I guess the string to use would be pillarjs.router.request as tracing: prefix and event name postfix is added by dc.tracingChannel().

Copy link
Copy Markdown
Author

@logaretm logaretm Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't seem like there is a consistent standard, docs do suggest that but others just went with lib:something:action formats.

Anyways I have no strong opinions here and I think as long as the name is unique enough, it is arbitrary. I renamed it to your suggestion 👍

Follows the Node.js TracingChannel naming guideline (dot-separated,
module-scoped) and avoids the express-specific prefix since router
is usable outside express.
@Flarna
Copy link
Copy Markdown

Flarna commented Apr 23, 2026

The actual data published by the tracing channel is the Layer instance, req and res objects.

If I understand the router module correct layer is an internal object which gets exposed here. Is there any guarantee that the relevant fields on that object stay intact? In fact they become part of the public interface here.

Also I don't find in the docs here that req.route is a public property. Is it safe for consumers to rely on that as it is seen part of public API and following Semver regarding changes?

In general I would love some section in the docs describing the diag channel and it's emitted data.

@logaretm
Copy link
Copy Markdown
Author

Is it safe for consumers to rely on that as it is seen part of public API and following Semver regarding changes?

I can't answer that as it usually up to maintainers to decide this but I would say it does imply a public API contract and should be subject to smever.

As for the req.route, I'm not sure the AI created a snippet with it and it worked 🤷‍♂️

The layer object is strictly private today so this woulda actually expose it publicly.

In general I would love some section in the docs describing the diag channel and it's emitted data.

I think we have 3 options here:

  • Define a curated list of properties and expose those instead as public API.
  • Layer objects become a public API.
  • Continue to emit layer, but document a stable partial of it.

I think that would be up to the maintainers to decide, but all options work for us.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support diagnostic channels

4 participants