feat(api): add OpenTelemetry tracing (HTTP + DB spans)#18

Merged
jmgilman merged 1 commit into master from feat/otel-tracing on Jun 24, 2026

@jmgilman

Summary

Adds opt-in OpenTelemetry distributed tracing. When --tracing-enabled is set, the server exports spans over OTLP/HTTP, configured entirely through the standard OTEL_* environment variables (no bespoke endpoint/sampler flags). It covers inbound HTTP requests and PostgreSQL queries, with W3C trace-context propagation and a flush on graceful shutdown.

Design (decisions confirmed with the maintainer)

  • Config: standard OTEL_* env + a single --tracing-enabled master switch. The exporter and SDK read OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_TRACES_SAMPLER, etc. natively (the endpoint in otlptracehttp.New, the sampler in the SDK tracer provider), so deployments tune tracing the OpenTelemetry way. service.name/service.version default to the app name and build version, overridable via OTEL_SERVICE_NAME / OTEL_RESOURCE_ATTRIBUTES.
  • Depth: HTTP + DB spans. Inbound HTTP server spans via otelhttp (edge handler, W3C propagation, named by operation for low cardinality, infra routes filtered out) and PostgreSQL query spans via otelpgx on the pool — a trace shows the SQL under the request that issued it.
  • Default OFF. Tracing needs an external collector; defaulting on with no endpoint would spam connection errors. This deliberately differs from the self-contained authz/rate-limit tiers (on by default).
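
A minimal sketch of the single master switch described above, using only the standard `flag` package. The `Config` type and `parseConfig` function are hypothetical stand-ins; the real `internal/config` package may parse flags differently. Everything else (endpoint, sampler, service name) is deliberately absent here because it comes from the `OTEL_*` environment.

```go
package main

import (
	"flag"
	"fmt"
)

// Config is a hypothetical stand-in for the tracing part of internal/config.
type Config struct {
	TracingEnabled bool
}

// parseConfig registers the single tracing flag and parses args.
// The flag defaults to false, so tracing stays off unless requested.
func parseConfig(args []string) (Config, error) {
	var cfg Config
	fs := flag.NewFlagSet("server", flag.ContinueOnError)
	fs.BoolVar(&cfg.TracingEnabled, "tracing-enabled", false,
		"export OpenTelemetry spans over OTLP/HTTP (tuned via OTEL_* env vars)")
	if err := fs.Parse(args); err != nil {
		return Config{}, err
	}
	return cfg, nil
}

func main() {
	off, _ := parseConfig(nil)
	on, _ := parseConfig([]string{"--tracing-enabled"})
	fmt.Println(off.TracingEnabled, on.TracingEnabled)
}
```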

Layering

  • New internal/observability/tracing.go: NewTracerProvider(ctx, cfg) (OTLP exporter + resource + batching provider + global registration + W3C propagator; returns a flush-on-shutdown func, or nil when disabled), and TraceSpanNamer — a router-agnostic Huma middleware that renames the active server span to the operation ID.
  • internal/adapter/http: RouterDeps.Tracing wraps the mux with otelhttp.NewHandler (filtering /healthz, /readyz, /metrics) and installs the span namer before registration.
  • internal/adapter/postgres: Config.Tracing installs the otelpgx query tracer (gated, zero-overhead when off).
  • internal/app: builds the provider, wires Tracing through the router and the pool, and flushes the provider on shutdown (mirrors closePool/stopRateLimiter) on a fresh grace-bounded context.
  • internal/config: --tracing-enabled (default false).

When disabled, nothing is installed: the global no-op provider stays in place, the mux is not wrapped with otelhttp, and the pgx tracer is absent, so there is zero overhead.

Testing

  • Unit: NewTracerProvider enabled/disabled (globals saved/restored); TraceSpanNamer renames the active span to the operation ID (humatest + an in-memory tracetest exporter); traceableRequest excludes infra routes; NewRouter with RouterDeps{Tracing: true} still serves requests.
  • Config: tracing defaults off.
  • moon run root:check green (incl. openapi-check — tracing is middleware/transport-level, so the spec is unchanged); moon run root:test-integration green against postgres:17-alpine (default tracing-off path, no regression).

🤖 Generated with Claude Code

Add opt-in OpenTelemetry distributed tracing. When --tracing-enabled is
set, the server exports spans over OTLP/HTTP, configured entirely
through the standard OTEL_* environment variables with no bespoke
endpoint/sampler flags. It covers inbound HTTP requests (otelhttp server
spans, named by operation, infra routes excluded) and PostgreSQL queries
(otelpgx), with W3C trace-context propagation and a flush on graceful
shutdown.

Tracing defaults off because it needs an external collector, unlike the
self-contained authz and rate-limit tiers. The bootstrap lives in
internal/observability; service.name/version default to the app name and
build version and are overridable via OTEL_SERVICE_NAME /
OTEL_RESOURCE_ATTRIBUTES. The infrastructure routes (/healthz, /readyz,
/metrics) are filtered out so health checks and scrapes are not traced.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@jmgilman jmgilman merged commit 6625ab1 into master Jun 24, 2026
7 checks passed
@jmgilman jmgilman deleted the feat/otel-tracing branch June 24, 2026 20:05