agents,lib,src,test: add traceSampleRate support #430
santigimeno wants to merge 1 commit into node-v24.x-nsolid-v6.x from
Add end-to-end traceSampleRate handling across config, runtime propagation, tracing decisions, and regression tests.

Why:
- Enable configurable probabilistic trace sampling with predictable behavior.
- Ensure consistent semantics across all config entry points.
- Prevent invalid updates from corrupting current sampling behavior.
- Keep transaction consistency by deciding sampling at the root span only.

What changed:
- Added traceSampleRate parsing and normalization in JS config paths with explicit default fallback and finite/range validation in [0, 1].
- Added native config sanitization for traceSampleRate to reject invalid values before merge, preserving previous valid configuration.
- Ensured runtime sampling state is synchronized from effective current config after updates to avoid stale shared-memory sample rates.
- Added gRPC reconfigure support for traceSampleRate in proto and agent mapping, including generated protobuf updates.
- Updated tracing logic so root spans perform the sampling decision and child spans inherit parent traceFlags.
- Extended tests for:
  - invalid value handling (including NaN/Infinity)
  - env/package bootstrap behavior
  - partial updates preserving existing traceSampleRate
  - gRPC invalid-update fallback behavior
  - sampling behavior at 0%, 50% (tolerance), and 100%
  - worker-thread sampling behavior
  - explicit parent/child trace consistency assertions
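The "reject invalid values before merge, preserving previous valid configuration" behavior can be illustrated with a minimal JavaScript sketch. This is not the PR's implementation: `normalizeRate`, `applyUpdate`, and the `tracingEnabled` key are hypothetical names used only for illustration.

```javascript
// Hypothetical helper: returns a usable rate, or undefined when the input is invalid.
function normalizeRate(value) {
  if (typeof value !== 'number') return undefined;
  if (!Number.isFinite(value) || value < 0 || value > 1) return undefined;
  return value;
}

// Hypothetical merge step: an invalid traceSampleRate is replaced by the
// currently effective value, so a bad update cannot corrupt sampling.
function applyUpdate(current, update) {
  const next = { ...current, ...update };
  if ('traceSampleRate' in update) {
    const rate = normalizeRate(update.traceSampleRate);
    next.traceSampleRate = rate === undefined ? current.traceSampleRate : rate;
  }
  return next;
}

let config = { traceSampleRate: 0.4, tracingEnabled: true };
config = applyUpdate(config, { traceSampleRate: 2 });     // out of range: rejected
config = applyUpdate(config, { tracingEnabled: false });  // partial update
console.log(config.traceSampleRate); // 0.4
```

The same shape covers both cases listed in the tests above: an invalid value falls back to the previous rate, and a partial update that omits traceSampleRate leaves it untouched.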
Walkthrough

This PR introduces probabilistic trace sampling control to the NSolid/OpenTelemetry integration.
Sequence Diagram
sequenceDiagram
participant App as JavaScript App
participant Config as nsolid.config
participant gRPC as gRPC Agent
participant EnvList as EnvList (C++)
participant Tracer as OpenTelemetry Tracer
participant Span as Root Span
App->>Config: updateConfig({traceSampleRate: 0.5})
Config->>Config: parseTraceSampleRate(0.5)
Config->>gRPC: serialize & send reconfigure
gRPC->>EnvList: validate_trace_sample_rate(config)
EnvList->>EnvList: sanitize & check [0, 1]
EnvList->>EnvList: update_tracing_sample_rate(0.5)
EnvList->>EnvList: store in trace_sample_rate_
EnvList->>Config: propagate via binding
Note over Config: binding.trace_sample_rate = 0.5
App->>Tracer: create root span (no parent)
Tracer->>Span: sample decision: MathRandom() < 0.5?
alt Sampled (50% probability)
Span->>Span: spanContext.flags = SAMPLED
else Not Sampled (50% probability)
Span->>Span: spanContext.flags = NOT_SAMPLED
end
Span-->>Tracer: return span
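The root-span decision in the diagram, together with the child-span inheritance described in the PR summary, might look roughly like the sketch below. It is an assumption-laden illustration, not the code in lib/internal/otel/trace.js: `startSpan` and its injectable `random` parameter are hypothetical, and the flag values follow the W3C trace-flags convention where bit 0 means "sampled".

```javascript
const SAMPLED = 0x1;      // W3C trace-flags "sampled" bit
const NOT_SAMPLED = 0x0;

// Hypothetical span factory. `random` is injectable so the decision is testable.
function startSpan(name, parent, sampleRate, random = Math.random) {
  // Child spans copy the parent's flags; only a root span makes the decision.
  const traceFlags = parent ?
    parent.traceFlags :
    (random() < sampleRate ? SAMPLED : NOT_SAMPLED);
  return { name, traceFlags };
}

const root = startSpan('request', null, 0.5);
const child = startSpan('db.query', root, 0.5);
console.log(child.traceFlags === root.traceFlags); // true
```

Because `Math.random()` returns a value in [0, 1), a rate of 0 never samples and a rate of 1 always samples under this comparison, which matches the 0% and 100% cases the tests exercise.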
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~30 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 3
🧹 Nitpick comments (1)
test/agents/test-grpc-reconfigure.mjs (1)
171-208: Consider adding NaN/Infinity invalid-rate cases in this gRPC path test.
This block currently checks only out-of-range finite numbers. Extending it with Number.NaN and ±Infinity would better guard the gRPC reconfigure edge cases already covered in non-gRPC tests.
✅ Minimal test extension
- const invalidRates = [2, -0.5];
+ const invalidRates = [2, -0.5, Number.NaN, Number.POSITIVE_INFINITY, Number.NEGATIVE_INFINITY];
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/agents/test-grpc-reconfigure.mjs` around lines 171 - 208, Update the test "should preserve previous traceSampleRate for invalid values" to include NaN and Infinity cases: add Number.NaN, Number.POSITIVE_INFINITY and Number.NEGATIVE_INFINITY (or +/-Infinity) to the invalidRates array used with grpcServer.reconfigure and client.config assertions so grpcServer.reconfigure(agentId, { traceSampleRate: invalidRate }) is exercised for NaN/Infinity and the existing assert.strictEqual checks that nsolidConfig.traceSampleRate remained 0.4 still apply.
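To see why NaN deserves its own test case, here is a short sketch comparing a plain range check with one that also requires finiteness. Both validators are hypothetical and only illustrate the comparison semantics; they are not taken from the PR.

```javascript
// A plain range check: every comparison with NaN is false, so NaN is not rejected.
const naiveIsValid = (rate) => !(rate < 0 || rate > 1);

// Adding a finiteness check rejects NaN and both infinities.
const strictIsValid = (rate) =>
  Number.isFinite(rate) && rate >= 0 && rate <= 1;

const invalidRates = [
  2, -0.5, Number.NaN, Number.POSITIVE_INFINITY, Number.NEGATIVE_INFINITY,
];
for (const rate of invalidRates) {
  console.log(rate, naiveIsValid(rate), strictIsValid(rate));
}
```

Of the five values, only NaN slips past the naive check; the infinities are caught by the range comparison alone, but listing them keeps the test aligned with the non-gRPC coverage.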
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
- agents/grpc/proto/reconfigure.proto
- agents/grpc/src/grpc_agent.cc
- agents/grpc/src/proto/reconfigure.pb.cc
- agents/grpc/src/proto/reconfigure.pb.h
- lib/internal/otel/trace.js
- lib/nsolid.js
- src/nsolid/nsolid_api.cc
- src/nsolid/nsolid_api.h
- test/addons/nsolid-tracing/test-otel-basic2.js
- test/agents/test-grpc-reconfigure.mjs
- test/fixtures/nsolid-trace-sample-rate-package.json
- test/fixtures/test-nsolid-config-trace-sample-rate-env-script.js
- test/parallel/test-nsolid-config-trace-sample-rate-env.js
- test/parallel/test-nsolid-config-trace-sample-rate.js
- test/parallel/test-nsolid-trace-sample-rate-sampling.js
  if (body.has_tracesamplerate()) {
    out["traceSampleRate"] = body.tracesamplerate();
  }
traceSampleRate is accepted but not returned in reconfigure event body.
Lines 1761-1763 map inbound updates, but PopulateReconfigureEvent does not set body.traceSampleRate. This makes reconfigure responses incomplete even when the update is applied.
💡 Suggested fix
diff --git a/agents/grpc/src/grpc_agent.cc b/agents/grpc/src/grpc_agent.cc
@@ void PopulateReconfigureEvent(grpcagent::ReconfigureEvent* reconfigure_event,
   it = config.find("assetsEnabled");
   if (it != config.end()) {
     body->set_assetsenabled(*it);
   }
+  it = config.find("traceSampleRate");
+  if (it != config.end()) {
+    body->set_tracesamplerate(*it);
+  }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@agents/grpc/src/grpc_agent.cc` around lines 1761 - 1763,
PopulateReconfigureEvent currently omits the traceSampleRate field when building
the outgoing reconfigure event body; update PopulateReconfigureEvent to set
body.traceSampleRate from the internal config so the outgoing event mirrors
inbound updates (i.e., when you previously read body.tracesamplerate() on
incoming updates, ensure PopulateReconfigureEvent calls the corresponding setter
to populate body.traceSampleRate in the event). Locate the
PopulateReconfigureEvent function in grpc_agent.cc and add the traceSampleRate
assignment using the same source/field used for other mapped settings so
reconfigure responses include traceSampleRate.
function parseTraceSampleRate(value) {
  if (value === undefined || value === null || value === '') {
    return undefined;
  }

  const normalized = +value;
  if (!NumberIsFinite(normalized) || normalized < 0 || normalized > 1) {
    return undefined;
  }

  return normalized;
}
🧩 Analysis chain
🌐 Web query:
JavaScript unary plus operator coercion behavior true false empty string symbols
💡 Result:
In JavaScript, unary plus (+x) coerces its operand using ToNumber (spec: it evaluates the operand, gets its value, then applies ToNumber). [3]
Common coercions
| Operand | Result of +operand | Notes |
|---|---|---|
| true | 1 | Boolean → number coercion [1] |
| false | 0 | Boolean → number coercion [1] |
| "" (empty string) | 0 | String numeric coercion; empty string becomes 0 [1] |
| Symbol("x") (any Symbol) | throws TypeError | ToNumber throws on Symbol (and also on BigInt) [2] |
Examples: +true === 1, +false === 0, +"" === 0, +"hello" === NaN, +Symbol() throws TypeError. [1][2]
Sources:
[1] MDN: Unary plus (+) operator coercion examples and behavior
[2] ECMAScript spec: ToNumber throws on Symbol (and BigInt)
[3] ECMAScript spec: unary + evaluation uses ToNumber(GetValue(...))
Reject non-numeric, non-string inputs and properly validate whitespace in traceSampleRate.
+value coerces unintended inputs (true → 1, false → 0) and whitespace-only strings (e.g., ' ' → 0) into valid rates, causing invalid configuration to silently alter sampling instead of being rejected. Type-check to accept only numbers and strings, trim and reject empty/whitespace-only strings, and reject booleans and other types.
Suggested fix
 const {
   ArrayIsArray,
   Date,
   DateNow,
   JSONParse,
   JSONStringify,
   NumberIsFinite,
   NumberParseInt,
   ObjectAssign,
   ObjectDefineProperty,
   ObjectGetOwnPropertyNames,
   ObjectPrototype,
+  StringPrototypeTrim,
 } = primordials;
 ...
 function parseTraceSampleRate(value) {
-  if (value === undefined || value === null || value === '') {
+  if (value === undefined || value === null) {
     return undefined;
   }
-  const normalized = +value;
+  if (typeof value !== 'number' && typeof value !== 'string') {
+    return undefined;
+  }
+  if (typeof value === 'string' && StringPrototypeTrim(value) === '') {
+    return undefined;
+  }
+
+  const normalized = Number(value);
   if (!NumberIsFinite(normalized) || normalized < 0 || normalized > 1) {
     return undefined;
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lib/nsolid.js` around lines 1144 - 1154, The parseTraceSampleRate function
currently coerces unintended types via unary +; update parseTraceSampleRate to
only accept inputs that are typeof 'number' or 'string', explicitly reject
booleans and other types, trim string inputs and return undefined for
empty/whitespace-only strings, then convert the trimmed numeric string (or the
number input) to a numeric value and validate with NumberIsFinite and range
checks (>=0 && <=1); ensure the function returns undefined for invalid types,
whitespace-only strings, NaN, non-finite numbers, or values outside the 0–1
range while returning the numeric rate for valid inputs.
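For readers who want to try the stricter behavior outside the runtime's internals, here is a standalone sketch of the same checks. It assumes plain `Number.isFinite` and `String.prototype.trim` in place of the primordials used in lib/nsolid.js, and it is an illustration of the suggestion rather than the code that was merged.

```javascript
// Standalone sketch of the stricter parser (plain built-ins, no primordials).
function parseTraceSampleRate(value) {
  // Only numbers and strings are candidates; booleans, null, objects are rejected.
  if (typeof value !== 'number' && typeof value !== 'string') return undefined;
  // Empty and whitespace-only strings would otherwise coerce to 0.
  if (typeof value === 'string' && value.trim() === '') return undefined;

  const normalized = Number(value);
  if (!Number.isFinite(normalized) || normalized < 0 || normalized > 1) {
    return undefined;
  }
  return normalized;
}

const inputs = [0.25, '0.25', ' 1 ', true, false, '   ', '', null, Number.NaN, 1.5];
for (const input of inputs) {
  console.log([input, parseTraceSampleRate(input)]);
}
```

Note that a padded numeric string such as ' 1 ' is still accepted, because `Number()` ignores surrounding whitespace; only strings that are empty after trimming are rejected.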
    env: {
      ...process.env,
      ...envVars,
    },
Isolate child test env from ambient NSOLID_* variables.
Inheriting process.env directly can make this test nondeterministic if the parent already exports NSOLID_TRACE_SAMPLE_RATE or NSOLID_PACKAGE_JSON.
Suggested hardening
 function runWithEnv(envVars) {
+  const baseEnv = { ...process.env };
+  delete baseEnv.NSOLID_TRACE_SAMPLE_RATE;
+  delete baseEnv.NSOLID_PACKAGE_JSON;
+
   const result = spawnSync(process.execPath, [script], {
     env: {
-      ...process.env,
+      ...baseEnv,
       ...envVars,
     },
     encoding: 'utf8',
   });
📝 Committable suggestion
‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Original:
    env: {
      ...process.env,
      ...envVars,
    },
Suggested:
function runWithEnv(envVars) {
  const baseEnv = { ...process.env };
  delete baseEnv.NSOLID_TRACE_SAMPLE_RATE;
  delete baseEnv.NSOLID_PACKAGE_JSON;
  const result = spawnSync(process.execPath, [script], {
    env: {
      ...baseEnv,
      ...envVars,
    },
    encoding: 'utf8',
  });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/parallel/test-nsolid-config-trace-sample-rate-env.js` around lines 15 -
18, The test currently spreads process.env into the child env (env:
{...process.env, ...envVars}), which can leak ambient NSOLID_* variables and
make the test nondeterministic; update the env construction in
test/parallel/test-nsolid-config-trace-sample-rate-env.js to explicitly filter
out any keys starting with "NSOLID_" (e.g., build a filteredEnv =
Object.fromEntries(Object.entries(process.env).filter(([k]) =>
!k.startsWith('NSOLID_'))) and then use env: { ...filteredEnv, ...envVars } so
the child process is isolated from ambient NSOLID_* variables while still
allowing envVars to override.
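The prefix-filter variant described in the prompt above could be sketched as follows. `buildChildEnv` is a hypothetical helper name, and the child command here simply echoes one variable instead of running the PR's fixture script, so treat it as an illustration of the isolation idea only.

```javascript
const { spawnSync } = require('node:child_process');

// Drop every ambient NSOLID_* variable, then layer the test's own values on top.
function buildChildEnv(envVars, parentEnv = process.env) {
  const filteredEnv = Object.fromEntries(
    Object.entries(parentEnv).filter(([key]) => !key.startsWith('NSOLID_')),
  );
  return { ...filteredEnv, ...envVars };
}

// Hypothetical runner: the child prints the variable it actually received.
function runWithEnv(envVars) {
  return spawnSync(
    process.execPath,
    ['-p', 'String(process.env.NSOLID_TRACE_SAMPLE_RATE)'],
    { env: buildChildEnv(envVars), encoding: 'utf8' },
  );
}
```

Compared with deleting two named variables, filtering by prefix also shields the test from any NSOLID_* setting added later, at the cost of hiding variables a future test might want to inherit on purpose.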