Skip to content

Add BQ insertion traces#3293

Open
bblaszkow06 wants to merge 12 commits intoLogflare:mainfrom
bblaszkow06:bb/bq-insert-instrumentation
Open

Add BQ insertion traces#3293
bblaszkow06 wants to merge 12 commits intoLogflare:mainfrom
bblaszkow06:bb/bq-insert-instrumentation

Conversation

@bblaszkow06
Copy link
Copy Markdown
Contributor

@bblaszkow06 bblaszkow06 commented Mar 17, 2026

Closes O11Y-1580

@bblaszkow06 bblaszkow06 marked this pull request as ready for review March 17, 2026 14:29
Comment on lines +383 to +388
defp compute_batch_attrs(messages, bq_api_tag) do
event_count = length(messages)
bytes = messages |> Enum.map(&:erlang.external_size(&1.data.body)) |> Enum.sum()

%{insert_method: bq_api_tag, batch_event_count: event_count, batch_bytes: bytes}
end
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming that the thought process for adding this attr in is to compare whether payload size plays a role in the performance? might be better to compute this lower down in the request payload construction if so, this is before any manipulations are performed and wouldn't be as informative.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is before any manipulations are performed

That's exactly the reason for adding that. Since there are multiple options for serialization (JSON for legacy API, ArrowIPC & protobuf for Write API), the size ratio of serialization input and output is one of the comparison aspects.

To make it easier to compare, I've switched to flat_size instead of external_size and moved the calculation to the serialization span

"/api/events" <> _ -> ingest_config()
"/endpoints/query" <> _ -> endpoint_config()
"/api/endpoints/query" <> _ -> endpoint_config()
case {span_name, Map.get(attributes, :"url.path")} do
Copy link
Copy Markdown
Contributor

@Ziinc Ziinc Mar 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we probably can check if the span name contains _pipeline_ or an ingest_ prefix that would clearly indicate that it is part of the ingestion pipeline.

@bblaszkow06 bblaszkow06 force-pushed the bb/bq-insert-instrumentation branch from d9e7202 to 1150454 Compare March 24, 2026 12:46
@bblaszkow06 bblaszkow06 requested a review from Ziinc March 24, 2026 13:00
Copy link
Copy Markdown
Contributor

@Ziinc Ziinc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also verify that we don't get any unexpected spans not getting dropped? An ingest test that will verify that only spans with the ingest prefix is emitted?

|> Enum.map(&log_event_to_df_struct(&1))
|> normalize_df_struct_fields()
arrow_data =
OpenTelemetry.Tracer.with_span "bigquery.serialize", %{
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does not have the ingest prefix, should add

context,
table_id
)
OpenTelemetry.Tracer.with_span "bigquery.api_call", %{
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does not have ingest prefix.

def append_rows({:arrow, data_frame}, context, table) do
@spec encode_ndjson([map()]) :: binary()
def encode_ndjson(data_frames) do
OpenTelemetry.Tracer.with_span "bigquery.ndjson_encode" do
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does not have ingest prefix.


def encode_arrow_data(ndjson) do
{arrow_schema, batch_msgs} =
OpenTelemetry.Tracer.with_span "bigquery.ipc_encode" do
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ingest prefix

@Ziinc
Copy link
Copy Markdown
Contributor

Ziinc commented Mar 30, 2026

We don't want to accidentally being down our trace collection with the potentially high volume of spans.

@bblaszkow06 bblaszkow06 requested a review from Ziinc March 31, 2026 12:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants