feat(datadog-aws-lambda): add distributed tracing for Rust Lambda functions #190

Open
Dogbu-cyber wants to merge 26 commits into david.ogbureke/aws-sdk-rust from david.ogbureke/lambda-integration

Dogbu-cyber commented Mar 17, 2026

PR Stack: #194 (workspace setup) -> #189 (aws-sdk injection) -> #190 (lambda consumer)

What does this PR do?

Adds datadog-aws-lambda (instrumentation/datadog-aws-lambda/), a crate that provides Datadog distributed tracing for Rust Lambda functions. Wrap your handler with WrappedHandler and each invocation automatically extracts upstream trace context, creates inferred trigger spans, and instruments the invocation with an aws.lambda root span.

Trigger detection and carrier extraction are delegated to libdd-trace-inferrer, an experimental shared crate in development in libdatadog. This crate is a PoC implementation based on the work outlined in the Serverless Rust tracing design doc, originally started by @duncanista on the jordan.gonzalez/libdd-trace-inferrer branch. This PR depends on a fork of that work at david.ogbureke/libdd-trace-inferrer to unblock the consumer side while the upstream crate matures.

Dependency note: libdd-trace-inferrer is pulled via a git dependency on david.ogbureke/libdd-trace-inferrer in libdatadog. This will be updated to a stable release once the crate lands in libdatadog main.

Supported triggers (as implemented by libdd-trace-inferrer):

| Trigger | Inferred span(s) |
| --- | --- |
| SQS | aws.sqs |
| SNS | aws.sns |
| EventBridge | aws.eventbridge |
| SNS -> SQS | aws.sns -> aws.sqs |
| EventBridge -> SQS | aws.eventbridge -> aws.sqs |
| EventBridge -> SNS | aws.eventbridge -> aws.sns |
| API Gateway REST (v1) | aws.apigateway |
| API Gateway HTTP (v2) | aws.httpapi |
| API Gateway WebSocket | aws.apigateway.websocket |
| Lambda Function URL | aws.lambda.url |
| Kinesis | aws.kinesis |
| DynamoDB | aws.dynamodb |
| S3 | aws.s3 |
| MSK (Kafka) | aws.msk |

For all trigger types, trace context carrier extraction is also handled by libdd-trace-inferrer. A header-based fallback covers payloads not matched by any known trigger shape.

Motivation

Completes the consumer side of distributed tracing through AWS managed services for Rust Lambdas. The producer side is handled by datadog-aws (#189).

Notes

  • MSRV is 1.85.0 rather than the repo-wide 1.84.1, as required by the lambda_runtime crate.
  • Handler accepts LambdaEvent<Box<RawValue>> — the runtime passes raw JSON bytes without deserializing into a Value, eliminating a redundant allocation before the user's type is constructed.
  • The propagator (global::get_text_map_propagator) is acquired once per invocation, not once per carrier branch.
  • SpanInferrer and SdkTracer are constructed once at cold start and reused across invocations.
  • datadog-opentelemetry is pulled in with features = ["test-utils"] because set_trace_writer_synchronous_write is currently gated behind that feature. Synchronous writes flush spans from the handler's in-process buffer to the local Datadog extension before the handler returns, which reduces span loss when the process is frozen. The downside is that test-only dependencies (criterion, the gRPC and HTTP exporters) are compiled into the production binary, which increases binary size and can slow cold starts.

@Dogbu-cyber Dogbu-cyber added the enhancement New feature or request label Mar 17, 2026

We should have all the code for inferred spans and trace extraction in libdatadog. cc @duncanista

Author

It's not currently ready to be in libdatadog so the relevant trace extraction code was taken from the extension and placed here.


So we now have the same code in two locations? I think it makes more sense to work on moving the code from the extension to libdatadog than it does to duplicate all of this code.

Also, pretty sure that @duncanista moved the trace extraction code to libdatadog already.

Author

When I spoke to him he said his branch would not be ready in time, and that he thinks it's fine if this is duplicated here and then removed later.

@Dogbu-cyber Dogbu-cyber marked this pull request as ready for review March 17, 2026 22:01
@Dogbu-cyber Dogbu-cyber requested a review from a team as a code owner March 17, 2026 22:01

ygree commented Mar 18, 2026

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9f0960ba57

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread instrumentation/aws/datadog-aws-lambda/src/lib.rs Outdated
Comment thread integrations/aws/datadog-lambda-rs/src/span_inferrer/triggers/sns.rs Outdated
Comment thread integrations/aws/datadog-lambda/src/span_inferrer/mod.rs Outdated
@Dogbu-cyber Dogbu-cyber force-pushed the david.ogbureke/aws-sdk-rust branch from ee01b82 to 612838b Compare March 22, 2026 00:55
@Dogbu-cyber Dogbu-cyber force-pushed the david.ogbureke/lambda-integration branch 2 times, most recently from aa4a9db to decba09 Compare March 23, 2026 14:54
@Dogbu-cyber Dogbu-cyber force-pushed the david.ogbureke/aws-sdk-rust branch from 612838b to 4f7442a Compare March 23, 2026 14:54
@Dogbu-cyber Dogbu-cyber changed the title AWS Lambda Integration feat(datadog-lambda-rs): add distributed tracing for Rust Lambda functions Mar 23, 2026
@Dogbu-cyber Dogbu-cyber force-pushed the david.ogbureke/lambda-integration branch from b1d472b to fd24910 Compare March 26, 2026 13:29
@Dogbu-cyber Dogbu-cyber force-pushed the david.ogbureke/aws-sdk-rust branch 3 times, most recently from 80fedd0 to 18c7914 Compare March 30, 2026 20:48
@Dogbu-cyber Dogbu-cyber force-pushed the david.ogbureke/lambda-integration branch from f5a3f43 to 5fd8f52 Compare March 30, 2026 20:51
@Dogbu-cyber
Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5fd8f52f48

.as_ref()
.map(|s| s.is_async)
.unwrap_or(false);
let upstream_cx = if validate_carrier(&result.carrier).is_some() {

P1: Extract context without requiring Datadog trace-id header

TriggerExtractor::extract only calls the configured propagator when validate_carrier finds a numeric x-datadog-trace-id. Valid upstream contexts propagated as W3C-only headers (traceparent/tracestate) are therefore dropped, and a new root trace is created. This breaks distributed tracing for setups that use tracecontext-only injection (for example, via the propagation-style config), even though global::get_text_map_propagator can parse those headers. Extraction should not be gated on a Datadog-specific key.
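
One way to read this suggestion is a carrier check that accepts either header family before handing the carrier to the propagator. The sketch below uses only the standard library. `has_trace_context` is a hypothetical name (not the PR's `validate_carrier`), and it assumes carrier keys have already been lowercased.

```rust
use std::collections::HashMap;

/// Hypothetical helper: reports whether the carrier holds trace context
/// in either Datadog or W3C form. Assumes lowercase header names.
fn has_trace_context(carrier: &HashMap<String, String>) -> bool {
    // Datadog style: a non-zero decimal trace id.
    let datadog = carrier
        .get("x-datadog-trace-id")
        .and_then(|v| v.trim().parse::<u64>().ok())
        .is_some_and(|id| id != 0);

    // W3C style: version-traceid-spanid-flags, where the trace id is
    // 32 hex digits and not all zeros.
    let w3c = carrier.get("traceparent").is_some_and(|v| {
        let parts: Vec<&str> = v.trim().split('-').collect();
        parts.len() >= 4
            && parts[1].len() == 32
            && parts[1].chars().all(|c| c.is_ascii_hexdigit())
            && parts[1].chars().any(|c| c != '0')
    });

    datadog || w3c
}
```

With a check like this, a carrier that has only `traceparent` would still reach the propagator instead of starting a new root trace. Whether to pre-validate at all, rather than always calling the propagator, is a design choice left to the PR.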

@Dogbu-cyber Dogbu-cyber force-pushed the david.ogbureke/lambda-integration branch from 27e53f9 to 53797ee Compare April 2, 2026 14:54
@Dogbu-cyber Dogbu-cyber force-pushed the david.ogbureke/aws-sdk-rust branch from bc510bc to 35b2e4f Compare April 2, 2026 14:54
@Dogbu-cyber Dogbu-cyber force-pushed the david.ogbureke/lambda-integration branch from 3debc49 to 0b7351d Compare April 2, 2026 15:32
@Dogbu-cyber Dogbu-cyber force-pushed the david.ogbureke/aws-sdk-rust branch from 35b2e4f to 7d1fbbe Compare April 2, 2026 15:32
@Dogbu-cyber Dogbu-cyber changed the title feat(datadog-lambda-rs): add distributed tracing for Rust Lambda functions feat(datadog-aws-lambda): add distributed tracing for Rust Lambda functions Apr 2, 2026
@paullegranddc paullegranddc changed the base branch from david.ogbureke/aws-sdk-rust to main April 7, 2026 12:26
@paullegranddc paullegranddc changed the base branch from main to david.ogbureke/aws-sdk-rust April 7, 2026 12:27
@Dogbu-cyber Dogbu-cyber force-pushed the david.ogbureke/aws-sdk-rust branch from 29211f0 to d999596 Compare April 13, 2026 12:51
@Dogbu-cyber Dogbu-cyber force-pushed the david.ogbureke/lambda-integration branch 2 times, most recently from c856bff to ad914b1 Compare April 13, 2026 13:17
@Dogbu-cyber Dogbu-cyber force-pushed the david.ogbureke/aws-sdk-rust branch from 5cf7cc6 to 778b9a8 Compare April 16, 2026 21:38
…r inference

Adds `datadog-lambda` (`integrations/aws/datadog-lambda/`), a crate that
provides Datadog distributed tracing for Rust Lambda functions. Wrap your
handler with `WrappedHandler` and each invocation automatically extracts
upstream trace context, creates inferred trigger spans, and instruments
the invocation with an `aws.lambda` root span.

Supported triggers:
- SQS: `_datadog` MessageAttribute (String, JSON)
- SNS: `_datadog` MessageAttribute (Binary or String, JSON)
- EventBridge: `_datadog` key in `Detail` JSON
- SNS -> SQS: SQS body contains SNS notification
- EventBridge -> SQS: SQS body contains EB event
- EventBridge -> SNS: SNS message contains EB event
- API Gateway REST v1 / HTTP v2: `headers` object (case-insensitive)
- Lambda Function URL: `headers` object (case-insensitive)

Add the dd_resource_key meta tag to inferred spans for API Gateway HTTP
(v2) and REST (v1) triggers, matching the Datadog Lambda Extension
behavior. This tag is used by the Datadog backend to link inferred spans
to AWS resources (e.g. the API Gateway "Invoked Functions" view).

Also adds trigger_arn computation for both API Gateway trigger types and
aws_region/get_aws_partition_by_region helpers mirroring the extension.

- Allow disallowed_methods on aws_region(): AWS_REGION / AWS_DEFAULT_REGION
  are Lambda platform variables, not application config
- Allow type_complexity on test_handler() helper
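
For readers unfamiliar with the helpers named in these commits, the following is a minimal sketch of what region and partition helpers of this kind could look like. The prefix rules, the ARN layout (the documented API Gateway REST stage ARN format), and the `api_gateway_stage_arn` name are assumptions for illustration, not code taken from this PR or from the extension.

```rust
/// Map an AWS region to its ARN partition (assumed prefix rules).
fn get_aws_partition_by_region(region: &str) -> &'static str {
    if region.starts_with("us-gov-") {
        "aws-us-gov"
    } else if region.starts_with("cn-") {
        "aws-cn"
    } else {
        "aws"
    }
}

/// Read the region from the Lambda platform variables, preferring
/// AWS_REGION and falling back to AWS_DEFAULT_REGION.
fn aws_region() -> Option<String> {
    std::env::var("AWS_REGION")
        .or_else(|_| std::env::var("AWS_DEFAULT_REGION"))
        .ok()
}

/// Hypothetical helper: build a stage ARN of the kind that could back
/// a trigger ARN or dd_resource_key tag for an API Gateway trigger.
fn api_gateway_stage_arn(region: &str, api_id: &str, stage: &str) -> String {
    let partition = get_aws_partition_by_region(region);
    format!("arn:{partition}:apigateway:{region}::/restapis/{api_id}/stages/{stage}")
}
```

Reading the two environment variables directly is what would trip a `disallowed_methods` lint in a workspace that bans `std::env::var`, which is presumably why the commit adds an allow on `aws_region()`.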
… backend

Replace the hand-rolled trigger detection logic (~1800 lines across
triggers/) with libdd-trace-inferrer from libdatadog. The shared crate
handles event parsing, carrier extraction, and inferred span construction;
this crate now owns only the OTel span lifecycle on top of those results.

- Add libdd-trace-inferrer workspace dependency (jordan.gonzalez/libdd-trace-inferrer branch)
- Replace InferredSpan + triggers/ with InferenceResult from the shared crate
- Delete the triggers/ directory (kept locally for reference, not compiled)
- Drop DD_RESOURCE_KEY constant (no longer emitted)
- Update tests to construct InferenceResult directly
…lue>

Accept LambdaEvent<Box<RawValue>> in the Service impl so the runtime
copies the payload bytes once without parsing them into a Value tree.
Trigger extraction and typed deserialization both call .get() on the
same RawValue, eliminating the intermediate serde_json::Value allocation.

- Enable serde_json raw_value feature
- Switch Service impl from LambdaEvent<Value> to LambdaEvent<Box<RawValue>>
- Propagate &str through Invocation::start and TriggerExtractor::extract
- Replace extract_from_headers(&Value) with RawHeaders<'_> zero-copy struct
- Call inferrer.infer_span(&str) instead of infer_span_from_value(&Value)
…InferredSpanScope

InferenceResult has at most one level of wrapping, so the span chain
is always 0-2 elements. Replace the Vec with explicit outer and inner
Option fields to reflect that constraint in the type.

global::get_text_map_propagator acquires a RwLock on every call.
Resolve both the inferrer carrier and the header fallback carrier
before entering the closure so the lock is taken exactly once.

SpanInferrer and AWS_REGION were previously reconstructed on every
invocation. Store the inferrer in WrappedHandler (built once in new())
and thread it through Invocation::start and TriggerExtractor::extract.
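
The build-once pattern described in this commit can be sketched as follows. The types here are simplified stand-ins: the real `SpanInferrer` comes from libdd-trace-inferrer and the real `WrappedHandler` implements a `Service`, so the fields, the `call` signature, and the `INFERRER_BUILDS` counter are all assumptions made for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts constructions so the reuse is observable (illustration only).
static INFERRER_BUILDS: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for the shared crate's SpanInferrer; the field is hypothetical.
struct SpanInferrer {
    region: Option<String>,
}

impl SpanInferrer {
    fn new() -> Self {
        INFERRER_BUILDS.fetch_add(1, Ordering::SeqCst);
        // The environment lookup happens here, once, not per invocation.
        Self { region: std::env::var("AWS_REGION").ok() }
    }
}

/// Simplified wrapper that owns the inferrer for as long as it lives.
struct WrappedHandler<F> {
    inner: F,
    inferrer: SpanInferrer,
}

impl<F: Fn(&str) -> String> WrappedHandler<F> {
    /// Runs at cold start: the inferrer is built exactly once.
    fn new(inner: F) -> Self {
        Self { inner, inferrer: SpanInferrer::new() }
    }

    /// Runs per invocation: borrows the stored inferrer instead of
    /// constructing a new one.
    fn call(&self, payload: &str) -> String {
        let _region = self.inferrer.region.as_deref();
        (self.inner)(payload)
    }
}
```

Calling `call` any number of times leaves `INFERRER_BUILDS` at one per wrapper, which is the property the commit is after.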
@Dogbu-cyber Dogbu-cyber force-pushed the david.ogbureke/lambda-integration branch from bacc2de to 81e8e60 Compare April 16, 2026 21:48
libdd-trace-inferrer already handles carrier extraction for all trigger
types including API Gateway (REST/HTTP/WebSocket), ALB, and Lambda
Function URLs — it populates result.carrier directly from the event
headers. The local extract_from_headers_str fallback and carrier.rs
module were dead code.

Also replace format!("{err}") with err.to_string() in set_error.

Improves documentation density to match the style established in
libdd-trace-inferrer:

- Module-level `//!` docs on `invocation` and `span_inferrer` explaining
  each module's role and typical call sequence
- Field-level `///` docs on `LambdaSpan`, `Invocation`, `TriggerContext`,
  `ActiveInferredSpan`, `TriggerExtraction`
- Function docs on `LambdaSpan::start` (parent fallback logic),
  `Invocation::handler_context`, `finish`, `finish_spans`,
  `build_inferred_span` (start_ns fallback), `InferredSpanScope::start`,
  and `TriggerExtractor::extract` (zero trace-id sentinel)
- `#[must_use]` on `handler_context`
- `#![cfg_attr(not(test), deny(clippy::panic/unwrap_used/expect_used))]`
  to prevent tracing from silently crashing Lambda invocations
…th free function

TriggerExtractor carried no state and its single method was a free
function in disguise. Replace it with extract_trigger() at module level.

Move the trigger_tags key lookups ("function_trigger.event_source" /
"function_trigger.event_source_arn") from Invocation::start into
extract_trigger(), exposing them as typed fields on TriggerExtraction.
Invocation no longer reaches into InferenceResult's internal tag map.
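
A minimal sketch of the refactor this commit describes, assuming the tag map is a `HashMap<String, String>`. `InferenceResult` is reduced to a stand-in with a single field, and the field names on `TriggerExtraction` are guesses. Only the two tag keys are taken from the commit message.

```rust
use std::collections::HashMap;

/// Stand-in for the shared crate's result type; only the tag map is modeled.
struct InferenceResult {
    trigger_tags: HashMap<String, String>,
}

/// Typed view handed to the invocation (field names are assumptions).
#[derive(Debug, PartialEq)]
struct TriggerExtraction {
    event_source: Option<String>,
    event_source_arn: Option<String>,
}

/// Free function that does the string-keyed lookups in one place, so
/// callers never touch the tag map directly.
fn extract_trigger(result: &InferenceResult) -> TriggerExtraction {
    TriggerExtraction {
        event_source: result
            .trigger_tags
            .get("function_trigger.event_source")
            .cloned(),
        event_source_arn: result
            .trigger_tags
            .get("function_trigger.event_source_arn")
            .cloned(),
    }
}
```

Keeping the key strings inside one function means a rename in the shared crate would need a change in a single place.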
…add innermost_context

The tuple return (Context, Self) forced callers to juggle the innermost
context as a separate value even though Self already owns it via
self.inner. Drop the context from the return type and add
innermost_context(&self, fallback) -> Context so the scope is
self-contained. Callers pass the upstream context as the fallback for
the no-inferred-spans case.
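
Taken together with the earlier change from a Vec to explicit outer and inner fields, the scope type could look roughly like the sketch below. `Context` and `ActiveInferredSpan` are simplified stand-ins (the real code uses OpenTelemetry contexts), and which of `outer` or `inner` is populated for a single inferred span is an assumption, so the lookup checks both.

```rust
/// Stand-in for an OpenTelemetry Context; here it only records a span name.
#[derive(Clone, Debug, PartialEq)]
struct Context(String);

/// Stand-in for an inferred span that has been started.
struct ActiveInferredSpan {
    context: Context,
}

/// At most two inferred spans: an optional wrapper (e.g. aws.sns) and an
/// optional wrapped span closest to the function (e.g. aws.sqs).
struct InferredSpanScope {
    outer: Option<ActiveInferredSpan>,
    inner: Option<ActiveInferredSpan>,
}

impl InferredSpanScope {
    /// Context the aws.lambda span should be parented to: the innermost
    /// inferred span if one exists, otherwise the caller's fallback
    /// (the upstream context).
    fn innermost_context(&self, fallback: Context) -> Context {
        self.inner
            .as_ref()
            .or(self.outer.as_ref())
            .map(|span| span.context.clone())
            .unwrap_or(fallback)
    }
}
```

Because the scope owns its spans, callers no longer need a `(Context, Self)` tuple. They ask the scope for the parent context when they need it.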

4 participants