support generating OpenTelemetry load by Haleygo · Pull Request #32 · VictoriaMetrics/prometheus-benchmark

Haleygo · 2026-05-14T08:06:24Z

fix #31
This pull request adds optional OpenTelemetry workload support for benchmarking the storage. It currently supports configuring targetsCount and scrapeInterval, but not churn rate yet (this may be supported in the future if needed).

This pull request introduces breaking changes by refactoring the chart values file into separate prometheusLoad and openTelemetryLoad sections, and bumps the chart version to v0.3.0.

Haleygo · 2026-05-14T08:12:21Z

It is tricky to add corresponding rules for the new OpenTelemetry workload, since storage systems differ in whether they convert metric names to Prometheus naming conventions. For example, VictoriaMetrics does not convert the name by default, so the metric is ingested as system.cpu.time. But if -opentelemetry.usePrometheusNaming is enabled, or in other systems such as Prometheus, the metric is converted to system_cpu_time_seconds_total.
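
To make the naming difference concrete, here is a minimal Python sketch of the kind of conversion described above. It is a simplified illustration, not the implementation used by VictoriaMetrics or Prometheus: the function name `to_prometheus_name`, the small unit table, and the assumption that `system.cpu.time` is a monotonic counter with unit `s` are all made up for this example.

```python
import re

# Hypothetical, partial unit table (OTLP unit -> Prometheus-style suffix).
UNIT_SUFFIXES = {"s": "seconds", "ms": "milliseconds", "By": "bytes"}


def to_prometheus_name(name: str, unit: str = "", monotonic_counter: bool = False) -> str:
    """Simplified sketch of OTLP -> Prometheus metric name conversion."""
    # Replace characters that are not valid in a classic Prometheus metric name.
    result = re.sub(r"[^a-zA-Z0-9_:]", "_", name)
    # Append the unit suffix if the unit is known and not already present.
    suffix = UNIT_SUFFIXES.get(unit)
    if suffix and not result.endswith("_" + suffix):
        result += "_" + suffix
    # Monotonic counters get a `_total` suffix.
    if monotonic_counter and not result.endswith("_total"):
        result += "_total"
    return result


original = "system.cpu.time"
converted = to_prometheus_name(original, unit="s", monotonic_counter=True)
print(original, "->", converted)
```

A rule or query written against one form of the name will not match the other, which is why a single shared rule set is hard to ship for every storage system.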

hagen1778 · 2026-05-15T09:54:26Z

        args:
        - --httpListenAddr=:8436
-        - --targetsCount={{ $.Values.targetsCount }}
+        - --targetsCount={{ $.Values.prometheusLoad.targetsCount }}


Would it make sense to define $.Values.prometheusLoad as a variable and use it instead of the full path?

Do you mean in the chart values file, or only in the vmagent deployment yaml?
In vmagent deployment yaml, sure, we can create a variable at the top of the file as:

{{- $pl := $.Values.prometheusLoad }}

and use it below as --targetsCount={{ $pl.targetsCount }}.
But tbh, I'd prefer to keep the original full value path when it is not too long to read, to avoid having to check the variable context.

hagen1778 · 2026-05-15T09:56:00Z

 description: A Helm chart for Prometheus benchmark
 type: application
-version: 0.2.0
+version: 0.3.0


Should it be a major bump?

As far as I know from semver, if the version is still in the 0.y.z range, it is considered to be in development, and breaking changes can still happen in y.
And JFYI, all our official charts follow that.
Ofc we can bump it to 1.0.0 this time if you feel it's ready.
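
The 0.y.z convention mentioned above can be sketched as a small Python helper. This only illustrates the rule as described in this thread; the function name `next_version` is hypothetical and is not part of any chart tooling, and treating every non-breaking change as a patch bump is a simplification.

```python
def next_version(current: str, breaking: bool) -> str:
    """Return the next version for a change, following the 0.y.z convention."""
    major, minor, patch = (int(part) for part in current.split("."))
    if breaking and major > 0:
        # Stable range: a breaking change bumps the major version.
        return f"{major + 1}.0.0"
    if breaking:
        # 0.y.z is still initial development, so a breaking change bumps y.
        return f"0.{minor + 1}.0"
    # Simplification: any non-breaking change is treated as a patch bump.
    return f"{major}.{minor}.{patch + 1}"


print(next_version("0.2.0", breaking=True))
print(next_version("1.4.2", breaking=True))
```

Under this convention, the breaking values refactoring moves the chart from 0.2.0 to 0.3.0 rather than to 1.0.0.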

Seems like you have a better understanding than me. @f41gh7 @zekker6 what would you say?

I agree with bumping to 0.3.0 since the major version is "0" here, but the version defined here is almost useless.
Currently, we do not publish this chart to a registry, which means users always pull it from main directly. So changing version here will not stop environments from updating and having a breaking change.

To avoid that, it would be great to do either of the following:

- Publish tags in the repo so that it is possible to pin to a specific tag. This is a quick and easy way to address the problem, but it is not widely used for Helm charts.

- Publish in a proper registry format, for example to GitHub Pages or GitHub Packages. This requires more work upfront but is the proper solution for publishing Helm charts.

> So changing version here will not stop environments from updating and having a breaking change.

Yeah, but I don’t think users should automatically pull updates for this chart directly from main just because we don’t publish versions for it right now, it means unknown upgrade.

I don’t think proper publishing for this chart is strictly necessary, because I don’t expect users to run this benchmark constantly or upgrade to newer versions frequently. Most likely, this chart is used either for a one-time benchmark test or for running long-term stable loads, like in our sandbox, and upgrades are only needed if users know they want the new changes.

Of course, it would still be nice to have.

> Yeah, but I don’t think users should automatically pull updates for this chart directly from main just because we don’t publish versions for it right now, it means unknown upgrade.

While this is true, we do not provide any other options today. So even a configuration saved from a one-time benchmark for future reference will now become invalid due to a breaking change.
Even the example configs in our blog posts that use this benchmark will no longer be valid.

hagen1778 · 2026-05-15T10:04:50Z


 # remoteStorages contains a named list of Prometheus-compatible systems to test.
-# These systems must support data ingestion via Prometheus remote_write protocol.
+# These systems must support data ingestion via Prometheus remote_write protocol, and OTLP if openTelemetryLoad.enabled is true.


Can this part be within the openTelemetryLoad and prometheusLoad sections?
It is confusing that the user has to define them outside that scope.

It could, but the current options look better to me:
Users specify the target-related options in the xxLoad sections to generate different workloads.
Users specify all storage system-related options under the remoteStorages section.

This also avoids defining the storage address twice under each xxLoad. I don’t see a case where a user would need to generate two kinds of workload and write them to different storage systems in one install.

So I would still prefer to keep it this way. Why does defining the storage address outside the xxxLoad sections feel confusing to you?

My thinking is bound to the scope. When I am defining otel-workload - everything related to that workload should be within the scope. The hierarchy is what gives me logical confidence in the settings change. We discussed this and it seems like we have different points of view.
@zekker6 @f41gh7 wdyt?

> everything related to that workload should be within the scope.

Everything related to the workload, I agree, but from my perspective, backend storage is not within the scope of the workload.

It is also reflected in the topology: workloads are only inputs, while the storage system is independent from them and shared across them.

I think we could even make openTelemetryLoad and prometheusLoad subsections under workloads or input, if that makes things clearer.
If there is a new type of metric workload in the future, we can add it under workloads. If we later introduce log workloads, we can rename the current section to something like metricsWorkloads.

I am opposed to remoteWrites as a separate sub-structure, as you still have to define it twice: globally and then within the scope of the workload. It is at least confusing to me, because it is not clear at first glance what inherits what, what's required and what's not.

> I think we could even make openTelemetryLoad and prometheusLoad subsections under workloads or input, if that makes things clearer.

I like the idea. What do you think of the following structure?

    disableMonitoring: false
    nodeSelector: {}
    tolerations: []
    writes:
      prometheusRW:
        enabled: false
        tag: "v1.143.0"
        extraFlags: []
        extraEnvs: []
        bearerToken:
        headers:
        targetsCount: 1000
        scrapeInterval: 10s
        writeReplicas: 1
        remoteStorages:
          vm:
            writeURL: "http://<vminsert-cluster-1>:8480/insert/0/prometheus/api/v1/write"
      openTelemetry:
        enabled: false
        targetsCount: 1000
        scrapeInterval: 10s
        tag: ""
        extraFlags: []
        extraEnvs: []
        bearerToken:
        headers:
        remoteStorages:
          vm:
            writeURL: "http://<vminsert-cluster-1>:8480/insert/0/prometheus/opentelemetry/v1/metrics"
    reads:
      enabled: false
      tag: "v1.143.0"
      queryInterval: 10s
      extraFlags: []
      extraEnvs: []
      bearerToken:
      remoteStorages:
        vm:
          readURL: "http://<vmselect-cluster-1>:8482/select/0/prometheus"

hagen1778 · 2026-05-15T10:16:46Z

WDYT about the following scheme?

See source https://excalidraw.com/#json=RBbwkaVqWplqh2VxCFfNQ,tOklyf18QgIhTPTTo6NSUA

Please note, I dropped Alertmanager as we don't actually need it. Just set -notifier.blackhole on vmalert.

Haleygo force-pushed the add-otel-load branch 2 times, most recently from e7d5890 to 8e45d23 on May 14, 2026 08:17

Haleygo requested review from f41gh7, hagen1778, makasim and zekker6 on May 14, 2026 14:36

support geneating OpenTelemetry load

2483c0c

Haleygo force-pushed the add-otel-load branch from 8e45d23 to 2483c0c on May 15, 2026 07:21

hagen1778 requested changes on May 15, 2026

fix otel load if resource attribute promotion is restricted in storage

f676ea8

Haleygo force-pushed the add-otel-load branch from 73c5d36 to f676ea8 on May 15, 2026 16:19
