feat(chore): declarative configuration for trace_provider#2161
feat(chore): declarative configuration for trace_provider#2161xuan-cao-swi wants to merge 4 commits into
Conversation
thompson-tomo
left a comment
There was a problem hiding this comment.
One other thing, how are we ensuring the loading of the the env variables are disabled when covered by declarative config?
Will the current approach support dynamic reloading of the config which is also a part of the spec?
| YARD::Rake::YardocTask.new do |t| | ||
| t.stats_options = ['--list-undoc'] | ||
| end |
There was a problem hiding this comment.
| YARD::Rake::YardocTask.new do |t| | |
| t.stats_options = ['--list-undoc'] | |
| end | |
| YARD::Rake::YardocTask.new |
| def build_span_exporter(exp_cfg) | ||
| raise ArgumentError, 'no exporter config' unless exp_cfg | ||
|
|
||
| configured = 0 | ||
| exporter = nil | ||
|
|
||
| if exp_cfg.key?('console') | ||
| configured += 1 | ||
| exporter = OpenTelemetry::SDK::Trace::Export::ConsoleSpanExporter.new | ||
| end | ||
|
|
||
| if exp_cfg['otlp_http'] | ||
| configured += 1 | ||
| exporter = build_otlp_http_span_exporter(exp_cfg['otlp_http']) | ||
| end | ||
|
|
||
| raise ArgumentError, 'must not specify multiple exporters' if configured > 1 | ||
| raise ArgumentError, 'no valid span exporter' if exporter.nil? | ||
|
|
||
| exporter | ||
| end |
There was a problem hiding this comment.
How will this scale as exporter is an extension point meaning additional exporters can be added?
There was a problem hiding this comment.
It doesn't. Let's keep this simple and comply with what the current Ruby trace SDK can do.
| def build_span_processor(proc_cfg) | ||
| raise ArgumentError, 'must not specify multiple span processor type' if proc_cfg['batch'] && proc_cfg['simple'] | ||
|
|
||
| if proc_cfg['batch'] | ||
| build_batch_span_processor(proc_cfg['batch']) | ||
| elsif proc_cfg['simple'] | ||
| build_simple_span_processor(proc_cfg['simple']) | ||
| else | ||
| raise ArgumentError, 'unsupported span processor type, must be one of simple or batch' | ||
| end | ||
| end |
There was a problem hiding this comment.
How will this as scale to support additional processors the as argument error doesn't match spec.
There was a problem hiding this comment.
It's better to include a link or reference to the spec so people can check it quickly. If you are referring to SDK extension components, then it's an SDK-level feature. The Ruby trace SDK doesn't have this feature.
To answer your questions about custom components:
- It's an SDK-level feature (e.g., ComponentProvider.java, PHP ComponentProvider). It also requires language-level support for the Service Provider Interface, which Ruby (and Go) do not provide.
Given the inherent differences across languages, the details of extension
component mechanisms are likely to vary to a greater degree than is the case
with other APIs defined by OpenTelemetry. This is to be expected and is
acceptable so long as the implementation results in the defined behaviors.
- Even if someone found a way to support it, to keep this PR minimal (already 27 files), it's better to keep it simple.
There was a problem hiding this comment.
I am all for keeping it simple, small and manageable while also ensuring that we are doing something which scales and will serve us long term.
For instance could we not load declarative config into the global options after parsing/validating the file, inject an options node into the providers which uses them accordingly. The process of inject & then using them is repeated.
In effect each component is responsible for getting their own options, hence it avoids all of the switch options and decouples config from implementations. It also means we can follow the same path regardless of how it is configured ie env or file.
| def build_sampler(sampler_cfg) | ||
| s = OpenTelemetry::SDK::Trace::Samplers | ||
|
|
||
| # Default: parent-based with always_on root | ||
| return s.parent_based(root: s::ALWAYS_ON) unless sampler_cfg | ||
|
|
||
| if sampler_cfg['parent_based'] | ||
| build_parent_based_sampler(sampler_cfg['parent_based']) | ||
| elsif sampler_cfg.key?('always_on') | ||
| s::ALWAYS_ON | ||
| elsif sampler_cfg.key?('always_off') | ||
| s::ALWAYS_OFF | ||
| elsif sampler_cfg['trace_id_ratio_based'] | ||
| ratio = sampler_cfg['trace_id_ratio_based']['ratio'] || 1.0 | ||
| s.trace_id_ratio_based(ratio.to_f) | ||
| else | ||
| s.parent_based(root: s::ALWAYS_ON) | ||
| end | ||
| end |
There was a problem hiding this comment.
How will this scale to support custom/additional samplers?
| return {} unless instrumentation_cfg.is_a?(Hash) | ||
|
|
||
| general = instrumentation_cfg['general'] | ||
| return {} unless general.is_a?(Hash) |
There was a problem hiding this comment.
We should be returning the default
There was a problem hiding this comment.
OpenTelemetry::Instrumentation.registry.install_all({}) is the default
There was a problem hiding this comment.
Yes that installs all instrumentation, I am more talking about the default options as defined in declarative config aka within https://github.com/open-telemetry/opentelemetry-configuration/blob/main/schema-docs.md#experimentalgeneralinstrumentation-.
| # frozen_string_literal: true | ||
|
|
||
| # Copyright The OpenTelemetry Authors | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
|
|
||
| module OpenTelemetry | ||
| module OtelConfig | ||
| # Trace component builder for configuring TracerProvider from declarative config. | ||
| module Trace | ||
| module_function | ||
|
|
||
| # Builds a TracerProvider from the parsed YAML tracer_provider config. | ||
| # Returns a noop-configured provider if config is nil. | ||
| def build_tracer_provider(config, resource) | ||
| return OpenTelemetry::Trace::TracerProvider.new unless config | ||
|
|
||
| sampler = build_sampler(config['sampler']) | ||
| span_limits = build_span_limits(config['limits']) | ||
|
|
||
| tp = OpenTelemetry::SDK::Trace::TracerProvider.new( | ||
| resource: resource, | ||
| sampler: sampler, | ||
| span_limits: span_limits | ||
| ) | ||
|
|
||
| Array(config['processors']).each do |proc_cfg| | ||
| processor = build_span_processor(proc_cfg) | ||
| tp.add_span_processor(processor) if processor | ||
| rescue StandardError => e | ||
| OpenTelemetry.logger.warn("Failed to build span processor: #{e.message}") | ||
| end | ||
|
|
||
| tp | ||
| end | ||
|
|
||
| # Builds a span processor (simple or batch) from config hash. | ||
| def build_span_processor(proc_cfg) | ||
| raise ArgumentError, 'must not specify multiple span processor type' if proc_cfg['batch'] && proc_cfg['simple'] | ||
|
|
||
| if proc_cfg['batch'] | ||
| build_batch_span_processor(proc_cfg['batch']) | ||
| elsif proc_cfg['simple'] | ||
| build_simple_span_processor(proc_cfg['simple']) | ||
| else | ||
| raise ArgumentError, 'unsupported span processor type, must be one of simple or batch' | ||
| end | ||
| end | ||
|
|
||
| # Builds a BatchSpanProcessor with exporter and optional tuning options. | ||
| def build_batch_span_processor(cfg) | ||
| exporter = build_span_exporter(cfg['exporter']) | ||
| opts = { | ||
| schedule_delay: cfg['schedule_delay']&.to_f, | ||
| exporter_timeout: cfg['export_timeout']&.to_f, | ||
| max_queue_size: cfg['max_queue_size']&.to_i, | ||
| max_export_batch_size: cfg['max_export_batch_size']&.to_i | ||
| }.compact | ||
|
|
||
| OpenTelemetry::SDK::Trace::Export::BatchSpanProcessor.new(exporter, **opts) | ||
| end | ||
|
|
||
| # Builds a SimpleSpanProcessor wrapping the configured exporter. | ||
| def build_simple_span_processor(cfg) | ||
| exporter = build_span_exporter(cfg['exporter']) | ||
| OpenTelemetry::SDK::Trace::Export::SimpleSpanProcessor.new(exporter) | ||
| end | ||
|
|
||
| # Builds a span exporter from config; supports console and otlp_http. | ||
| def build_span_exporter(exp_cfg) | ||
| raise ArgumentError, 'no exporter config' unless exp_cfg | ||
|
|
||
| configured = 0 | ||
| exporter = nil | ||
|
|
||
| if exp_cfg.key?('console') | ||
| configured += 1 | ||
| exporter = OpenTelemetry::SDK::Trace::Export::ConsoleSpanExporter.new | ||
| end | ||
|
|
||
| if exp_cfg['otlp_http'] | ||
| configured += 1 | ||
| exporter = build_otlp_http_span_exporter(exp_cfg['otlp_http']) | ||
| end | ||
|
|
||
| raise ArgumentError, 'must not specify multiple exporters' if configured > 1 | ||
| raise ArgumentError, 'no valid span exporter' if exporter.nil? | ||
|
|
||
| exporter | ||
| end | ||
|
|
||
| # Builds an OTLP HTTP span exporter from the given endpoint/headers config. | ||
| def build_otlp_http_span_exporter(cfg) | ||
| opts = { | ||
| endpoint: cfg['endpoint'], | ||
| headers: cfg['headers'] || cfg['headers_list'] ? parse_headers(cfg) : nil, | ||
| compression: cfg['compression'], | ||
| timeout: cfg['timeout'] && cfg['timeout'] / 1000.0 # YAML ms → Ruby seconds | ||
| }.compact | ||
|
|
||
| OpenTelemetry::Exporter::OTLP::Exporter.new(**opts) | ||
| end | ||
|
|
||
| # Builds a sampler from config; defaults to ParentBased(ALWAYS_ON). | ||
| def build_sampler(sampler_cfg) | ||
| s = OpenTelemetry::SDK::Trace::Samplers | ||
|
|
||
| # Default: parent-based with always_on root | ||
| return s.parent_based(root: s::ALWAYS_ON) unless sampler_cfg | ||
|
|
||
| if sampler_cfg['parent_based'] | ||
| build_parent_based_sampler(sampler_cfg['parent_based']) | ||
| elsif sampler_cfg.key?('always_on') | ||
| s::ALWAYS_ON | ||
| elsif sampler_cfg.key?('always_off') | ||
| s::ALWAYS_OFF | ||
| elsif sampler_cfg['trace_id_ratio_based'] | ||
| ratio = sampler_cfg['trace_id_ratio_based']['ratio'] || 1.0 | ||
| s.trace_id_ratio_based(ratio.to_f) | ||
| else | ||
| s.parent_based(root: s::ALWAYS_ON) | ||
| end | ||
| end | ||
|
|
||
| # Builds a ParentBased sampler with configurable root and remote/local delegates. | ||
| def build_parent_based_sampler(cfg) | ||
| s = OpenTelemetry::SDK::Trace::Samplers | ||
|
|
||
| root = cfg['root'] ? build_sampler(cfg['root']) : s::ALWAYS_ON | ||
|
|
||
| opts = { | ||
| root: root, | ||
| remote_parent_sampled: cfg['remote_parent_sampled'] && build_sampler(cfg['remote_parent_sampled']), | ||
| remote_parent_not_sampled: cfg['remote_parent_not_sampled'] && build_sampler(cfg['remote_parent_not_sampled']), | ||
| local_parent_sampled: cfg['local_parent_sampled'] && build_sampler(cfg['local_parent_sampled']), | ||
| local_parent_not_sampled: cfg['local_parent_not_sampled'] && build_sampler(cfg['local_parent_not_sampled']) | ||
| }.compact | ||
|
|
||
| s.parent_based(**opts) | ||
| end | ||
|
|
||
| # Builds SpanLimits from config; returns the SDK default when config is nil. | ||
| def build_span_limits(limits_cfg) | ||
| return OpenTelemetry::SDK::Trace::SpanLimits::DEFAULT unless limits_cfg | ||
|
|
||
| opts = { | ||
| attribute_count_limit: limits_cfg['attribute_count_limit'], | ||
| attribute_length_limit: limits_cfg['attribute_value_length_limit'], | ||
| event_count_limit: limits_cfg['event_count_limit'], | ||
| link_count_limit: limits_cfg['link_count_limit'], | ||
| event_attribute_count_limit: limits_cfg['event_attribute_count_limit'], | ||
| link_attribute_count_limit: limits_cfg['link_attribute_count_limit'] | ||
| }.compact | ||
|
|
||
| OpenTelemetry::SDK::Trace::SpanLimits.new(**opts) | ||
| end | ||
|
|
||
| # Parses headers from YAML array format or headers_list string. | ||
| # Array format takes precedence over headers_list. | ||
| def parse_headers(cfg) | ||
| headers = {} | ||
|
|
||
| if cfg['headers'].is_a?(Array) | ||
| cfg['headers'].each do |h| | ||
| headers[h['name']] = h['value'] if h['name'] && h['value'] | ||
| end | ||
| end | ||
|
|
||
| # Fall back to headers_list only if headers array produced nothing | ||
| if headers.empty? && cfg['headers_list'].is_a?(String) | ||
| cfg['headers_list'].split(',').each do |pair| | ||
| key, value = pair.strip.split('=', 2) | ||
| headers[key] = value if key && value | ||
| end | ||
| end | ||
|
|
||
| headers | ||
| end | ||
| end | ||
| end | ||
| end |
There was a problem hiding this comment.
We seem to be missing the tracerConfig.
There was a problem hiding this comment.
What is tracerConfig? Could you be more specific?
There was a problem hiding this comment.
It is one of the types defined at https://github.com/open-telemetry/opentelemetry-configuration/blob/main/schema-docs.md#experimentaltracerconfigurator-
| # Returns a Resource for the given detector name. | ||
| def run_detector(name) | ||
| case name | ||
| when 'container' | ||
| detect_resource('OpenTelemetry::Resource::Detector::Container') | ||
| when 'aws' | ||
| # Run all AWS sub-detectors; each returns an empty resource if not on that platform. | ||
| detect_resource('OpenTelemetry::Resource::Detector::AWS', %i[ec2 ecs eks lambda]) | ||
| when 'azure' | ||
| detect_resource('OpenTelemetry::Resource::Detector::Azure') | ||
| when 'google_cloud_platform' | ||
| detect_resource('OpenTelemetry::Resource::Detector::GoogleCloudPlatform') | ||
| else | ||
| OpenTelemetry.logger.warn("OtelConfig: unknown resource detector '#{name}'; skipping.") | ||
| OpenTelemetry::SDK::Resources::Resource.create({}) | ||
| end |
There was a problem hiding this comment.
How will this scale as mentioned in the spec.
| spec.add_development_dependency 'opentelemetry-api', '~> 1.10.0' | ||
| spec.add_development_dependency 'opentelemetry-common', '~> 0.25.0' | ||
| spec.add_development_dependency 'opentelemetry-exporter-otlp', '~> 0.34.0' | ||
| spec.add_development_dependency 'opentelemetry-instrumentation-all', '~> 0.91.0' | ||
| spec.add_development_dependency 'opentelemetry-propagator-google_cloud_trace_context', '~> 0.4.0' | ||
| spec.add_development_dependency 'opentelemetry-propagator-ottrace', '~> 0.25.0' | ||
| spec.add_development_dependency 'opentelemetry-propagator-xray', '~> 0.27.0' | ||
| spec.add_development_dependency 'opentelemetry-resource-detector-aws', '~> 0.5.0' | ||
| spec.add_development_dependency 'opentelemetry-resource-detector-azure', '~> 0.3.0' | ||
| spec.add_development_dependency 'opentelemetry-resource-detector-container', '~> 0.3.0' | ||
| spec.add_development_dependency 'opentelemetry-resource-detector-google_cloud_platform', '~> 0.4.0' | ||
| spec.add_development_dependency 'opentelemetry-sdk', '~> 1.12' |
| spec.add_development_dependency 'opentelemetry-sdk', '~> 1.12' | ||
|
|
||
| if spec.respond_to?(:metadata) | ||
| spec.metadata['changelog_uri'] = "https://open-telemetry.github.io/opentelemetry-ruby/opentelemetry-logs-sdk/v#{OpenTelemetry::OtelConfig::VERSION}/file.CHANGELOG.html" |
| spec.metadata['changelog_uri'] = "https://open-telemetry.github.io/opentelemetry-ruby/opentelemetry-logs-sdk/v#{OpenTelemetry::OtelConfig::VERSION}/file.CHANGELOG.html" | ||
| spec.metadata['source_code_uri'] = "https://github.com/open-telemetry/opentelemetry-ruby/tree/#{spec.name}/v#{spec.version}/logs_sdk" | ||
| spec.metadata['bug_tracker_uri'] = 'https://github.com/open-telemetry/opentelemetry-ruby/issues' | ||
| spec.metadata['documentation_uri'] = |
Description
This PR introduces an initial declarative configuration implementation for OpenTelemetry Ruby under otelconfig, intentionally scoped to tracing only.
I followed the Go otelconfig convention for config structure and behavior where it maps well to Ruby.
Minimalist approach: no hard dependency on optional components, which optional propagators and resource detectors are resolved only if the corresponding gem has been required by the user. If an optional component is not available, configuration continues safely with warnings instead of failing startup.
Scope in this PR
Try
/example