Skip to content

feat(charts): make the series limit opt-in and consistent across chunks#2449

Open
wrn14897 wants to merge 7 commits into
mainfrom
warren/fix-series-limit-empty-group
Open

feat(charts): make the series limit opt-in and consistent across chunks#2449
wrn14897 wants to merge 7 commits into
mainfrom
warren/fix-series-limit-empty-group

Conversation

@wrn14897

@wrn14897 wrn14897 commented Jun 11, 2026

Copy link
Copy Markdown
Member

Summary

Reworks the top-N series cap added in #2429: the team setting becomes opt-in, and when enabled, the cap now holds across chunked queries and is reported accurately in the SQL preview.

The series limit is now opt-in (default: disabled). The team seriesLimit setting no longer falls back to 100: when unset, charts fetch every series and no __hdx_series_limit CTE is emitted — the pre-#2429 behavior. The team settings page shows "Time Chart Series Limit" as Disabled by default, and a configured value can be cleared back to disabled via a new "Disable" button (the generic setting form gained allowUnset/resetLabel props for settings without a server-side default). No API changes were needed — the update schema already accepted null to unset, and the mongoose field has no default.

When a limit is set, chunked queries now respect it. Time charts fetch data in time-windowed chunks (useQueriedChartConfigfetchDataInChunks), and each chunk's CTE ranked the top N within its own window — so a 46-minute chart with seriesLimit: 3 split into a 15m + 31m chunk could render 5 series, and alert previews returned inconsistent group sets between adjacent windows. Chunked queries now carry a shared ranking range as a runtime-only seriesLimitDateRange field — the newest chunk window, which also bounds the ranking scan instead of re-scanning the full chart range in every chunk — and renderSeriesLimitCte renders the ranking CTE's WHERE/GROUP BY against that pinned range (with normalized boundary inclusivity) while the outer query still fetches only its window. The CTE is byte-identical across chunks and its ORDER BY max(rank) DESC, group tiebreak is deterministic, so every chunk keeps the exact same top-N set and the union equals N. Trade-off: series are selected by activity in the newest window, so a group with no events there is dropped from the chart even if it was active earlier in the range. Unchunked consumers (alert evaluation, API) never set the field and are unchanged.

Generated SQL preview shows the real limit. The chart editor's "Generated SQL" section always rendered LIMIT 100 regardless of the team setting, because buildChartConfigForExplanations called convertToTimeChartConfig without the team value. It now passes the same me.team.seriesLimit that DBTimeChart uses, so the preview matches the executed query — including no CTE at all when the limit is disabled.

Verified by unit tests (no CTE / no seriesLimit when the team setting is unset; CTE pinned and byte-identical across two windowed renders; explanation config reflects the team limit), hook tests (serial and parallel chunked fetches pin the ranking to the newest window; unchunked fetches don't), and a ClickHouse integration test where two windows with different local winners both return the newest window's top-1.

How to test on Vercel preview

Preview routes: /team, /chart

Steps:

  1. Open /team and scroll to the "ClickHouse Client Settings" section.
  2. Verify "Time Chart Series Limit" shows "Disabled" by default.
  3. Change it to 5 and save.
  4. Open /chart, pick the Logs source, add a Group By (e.g. ServiceName), and run the query.
  5. Expand the "Generated SQL" accordion below the chart.
  6. Verify the __hdx_series_limit CTE in the SQL shows LIMIT 5 (not LIMIT 100).
  7. Back on /team, click "Disable" next to the setting and verify it shows "Disabled" again.

References

Each time-window chunk's __hdx_series_limit CTE ranked the top N within
its own window, so the union across chunks could exceed seriesLimit and
adjacent windows disagreed on which series to keep. Chunked queries now
carry the full chart range as seriesLimitDateRange and the CTE ranks
over that pinned range, so every chunk keeps the identical top-N set.
Follow-up to HDX-4499.
@changeset-bot

changeset-bot Bot commented Jun 11, 2026

Copy link
Copy Markdown

🦋 Changeset detected

Latest commit: e40772c

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 4 packages
Name Type
@hyperdx/common-utils Patch
@hyperdx/app Patch
@hyperdx/api Patch
@hyperdx/otel-collector Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@vercel

vercel Bot commented Jun 11, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
hyperdx-oss Ready Ready Preview, Comment Jun 11, 2026 10:14pm
hyperdx-storybook Ready Ready Preview, Comment Jun 11, 2026 10:14pm

Request Review

@github-actions github-actions Bot added the review/tier-3 Standard — full human review required label Jun 11, 2026
@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

🟡 Tier 3 — Standard

Introduces new logic, modifies core functionality, or touches areas with non-trivial risk.

Why this tier:

  • Cross-layer change: touches frontend (packages/app) + shared utils (packages/common-utils)

Review process: Full human review — logic, architecture, edge cases.
SLA: First-pass feedback within 1 business day.

Stats
  • Production files changed: 7
  • Production lines changed: 134 (+ 295 in test files, excluded from tier calculation)
  • Branch: warren/fix-series-limit-empty-group
  • Author: wrn14897

To override this classification, remove the review/tier-3 label and apply a different review/tier-* label. Manual overrides are preserved on subsequent pushes.

@greptile-apps

greptile-apps Bot commented Jun 11, 2026

Copy link
Copy Markdown

Greptile Summary

This PR fixes three issues with the __hdx_series_limit CTE introduced in #2429: chunked queries now share a single ranking range (the newest window) so the union of per-chunk top-N sets never exceeds the configured limit; the team seriesLimit setting is made opt-in (disabled by default, clearable via a new "Disable" button); and the chart editor's "Generated SQL" preview is fixed to pass the team's actual limit instead of always showing LIMIT 100.

  • Chunked ranking pin (useChartConfig.tsx, renderChartConfig.ts): fetchDataInChunks attaches seriesLimitDateRange = windows[0].dateRange (newest window) to every windowed config; renderSeriesLimitCte re-renders its WHERE/GROUP BY against that pinned range with normalized inclusivity, making the CTE byte-identical across all chunks.
  • Opt-in series limit (ChartUtils.tsx, TeamQueryConfigSection.tsx): convertToTimeChartConfig omits seriesLimit entirely when the team setting is unset, reverting to pre-fix(charts): cap group-by time-series to top-N series to prevent OOM #2429 "fetch all series" behavior; the settings UI gains allowUnset/resetLabel props and shows "Disabled" by default.
  • SQL preview fix (EditTimeChartForm.tsx, utils.ts): buildChartConfigForExplanations now receives me.team.seriesLimit, so the generated SQL accordion reflects the query that will actually execute.

Confidence Score: 5/5

Safe to merge. The chunked-query ranking fix is correct, well-tested (unit + integration), and the render layer change is isolated to the series-limit CTE path.

All three fixes are targeted and well-tested: the CTE pinning has a ClickHouse integration test that directly verifies the newest-window winner wins in both chunks, hook tests cover serial, parallel, and unchunked paths, and the SQL preview fix is covered by a new unit test. No data paths outside the series-limit feature are touched. The only observation is a cosmetic UI glitch in the Disable button visibility when the server returns null for an unset field.

packages/app/src/components/TeamSettings/TeamQueryConfigSection.tsx — isCustomValue logic may show the Disable button when the setting is already null/unset.

Important Files Changed

Filename Overview
packages/common-utils/src/core/renderChartConfig.ts Adds seriesLimitDateRange override in renderSeriesLimitCte: re-renders WHERE and GROUP BY against the pinned ranking range while the outer query stays windowed. Logic is correct; auto-granularity bucket size varies with the CTE range but doesn't affect group selection.
packages/app/src/hooks/useChartConfig.tsx fetchDataInChunks pins seriesLimitDateRange to windows[0] (newest window) for all chunks when seriesLimit is set; correctly skips when unchunked (w == null). Window ordering confirmed as newest-first from getGranularityAlignedTimeWindows.
packages/common-utils/src/types.ts Adds optional seriesLimitDateRange to DateRange with a clear runtime-only, never persisted comment; field is only set by fetchDataInChunks and never flows into chart save paths.
packages/app/src/components/TeamSettings/TeamQueryConfigSection.tsx Adds allowUnset/resetLabel props to ClickhouseSettingForm and wires the seriesLimit field to show Disabled by default; isCustomValue = currentValue !== undefined may show Disable button when server returns null for an already-unset field.
packages/app/src/ChartUtils.tsx convertToTimeChartConfig now omits seriesLimit when teamSeriesLimit is unset (undefined), reverting to pre-#2429 behavior. Change is correct and matches the opt-in UX requirement.
packages/app/src/components/DBEditTimeChartForm/EditTimeChartForm.tsx Adds api.useMe() call and passes me?.team?.seriesLimit to buildChartConfigForExplanations so the SQL preview matches the executed query. Dependency array updated correctly.
packages/app/src/components/DBEditTimeChartForm/utils.ts buildChartConfigForExplanations gains optional teamSeriesLimit param; threaded to convertToTimeChartConfig, fixing the SQL preview showing stale LIMIT 100.
packages/common-utils/src/tests/renderChartConfig.test.ts New tests verify: CTE is byte-identical across two windowed renders, pinned timestamps appear in CTE while chunked window timestamps appear in outer query, and seriesLimitDateRange-only (no seriesLimit) emits no CTE.
packages/app/src/hooks/tests/useChartConfig.test.tsx Adds serial-chunked, parallel-chunked, and unchunked tests; all verify that seriesLimitDateRange is (or is not) pinned to the newest window as expected.
packages/common-utils/src/tests/queryChartConfig.int.test.ts Integration test creates a table with two windows where different services dominate; confirms both windows return only the newest-window winner (svcB), proving CTE pinning works end-to-end.

Sequence Diagram

sequenceDiagram
    participant UC as useQueriedChartConfig
    participant FDC as fetchDataInChunks
    participant CH as ClickHouse
    participant RCC as renderChartConfig
    participant SLC as renderSeriesLimitCte

    UC->>FDC: "config (seriesLimit=N, dateRange=full)"
    Note over FDC: windows = getGranularityAlignedTimeWindows()<br/>rankingDateRange = windows[0].dateRange (newest)

    loop Each window chunk
        FDC->>FDC: "windowedConfig = { dateRange: chunkWindow, seriesLimitDateRange: rankingDateRange }"
        FDC->>CH: queryChartConfig(windowedConfig)
        CH->>RCC: renderChartConfig(windowedConfig)
        RCC->>SLC: "renderSeriesLimitCte(chartConfig, {from, where, groupBy})"
        Note over SLC: cteConfig = {dateRange: seriesLimitDateRange,<br/>  dateRangeStartInclusive: true, dateRangeEndInclusive: true}<br/>cteWhere = renderWhere(cteConfig)<br/>cteGroupBy = renderGroupBy(cteConfig)
        SLC-->>RCC: CTE identical across all chunks
        RCC-->>CH: SQL with pinned CTE + windowed outer query
        CH-->>FDC: chunk result (top-N from newest window)
    end

    FDC-->>UC: merged results (consistent N series)
Loading

Fix All in Claude Code Fix All in Conductor Fix All in Cursor Fix All in Codex

Reviews (6): Last reviewed commit: "chore: update changeset for opt-in serie..." | Re-trigger Greptile

@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

E2E Test Results

All tests passed • 199 passed • 3 skipped • 1319s

Status Count
✅ Passed 199
❌ Failed 0
⚠️ Flaky 4
⏭️ Skipped 3

Tests ran across 4 shards in parallel.

View full report →

buildChartConfigForExplanations called convertToTimeChartConfig without
the team's seriesLimit, so the Generated SQL section always rendered the
default LIMIT 100 even when the team setting was lower. Pass the same
value DBTimeChart uses so the preview matches the executed query.
Pinning the __hdx_series_limit ranking to the full chart range made
every chunk re-scan the whole range. Rank over the newest window
instead: still one fixed range shared by all chunks (so the top-N set
stays consistent), but the scan is bounded by the smallest window.
Trade-off: series are picked by recent activity, so groups with no
events in the newest window are dropped from the chart.
wrn14897 added 2 commits June 11, 2026 14:40
The team seriesLimit setting no longer falls back to 100: when unset,
charts fetch every series and no __hdx_series_limit CTE is emitted.
The team settings page shows the setting as "Disabled" by default and
adds a Disable button to clear a configured value back to undefined.
@wrn14897 wrn14897 changed the title fix(charts): keep a consistent top-N series set across chunked queries feat(charts): make the series limit opt-in and consistent across chunks Jun 11, 2026

@pulpdrew pulpdrew left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change itself LGTM in isolation. I have some thoughts though:

  1. I think opt-in is a good idea, in the future I could see it being useful to opt-in at the chart level instead of team-wide, so that charts without too many series don't pay the performance penalty of this new approach, while charts that do need series limiting can opt-in individually. This could be another "display setting"
  2. Also I am wondering if you've considered a two-query solution: One query to fetch the top-N series over the whole time range; then the main query, without a CTE, just an additional WHERE <series> IN (...) condition?
    • This would address "Series are selected by activity in the newest window, so a group with no events there is dropped from the chart even if it was active earlier in the range."
    • I would imagine this could also make the query building a bit simpler too, but there are probably some complexities around making sure that the tuples identifying the top N series are rendered correctly, so maybe not.
    • This could also be used to fix the complaint that the chart seems to change drastically as new chunks load. The "pre flight" query could be used to get the max series value that will be rendered, and the chart y-axis could be scaled according to that value, instead of being re-scaled as each new chunk is loaded in.
  3. Not related to this change but something I notice about original implementation which is still present: When using multiple series, or when using ratio mode, only the first series is aggregated to choose the "top" series, instead of the max of the series or the ratio value.

@wrn14897

Copy link
Copy Markdown
Member Author

Change itself LGTM in isolation. I have some thoughts though:

  1. I think opt-in is a good idea, in the future I could see it being useful to opt-in at the chart level instead of team-wide, so that charts without too many series don't pay the performance penalty of this new approach, while charts that do need series limiting can opt-in individually. This could be another "display setting"

  2. Also I am wondering if you've considered a two-query solution: One query to fetch the top-N series over the whole time range; then the main query, without a CTE, just an additional WHERE <series> IN (...) condition?

    • This would address "Series are selected by activity in the newest window, so a group with no events there is dropped from the chart even if it was active earlier in the range."
    • I would imagine this could also make the query building a bit simpler too, but there are probably some complexities around making sure that the tuples identifying the top N series are rendered correctly, so maybe not.
    • This could also be used to fix the complaint that the chart seems to change drastically as new chunks load. The "pre flight" query could be used to get the max series value that will be rendered, and the chart y-axis could be scaled according to that value, instead of being re-scaled as each new chunk is loaded in.
  3. Not related to this change but something I notice about original implementation which is still present: When using multiple series, or when using ratio mode, only the first series is aggregated to choose the "top" series, instead of the max of the series or the ratio value.

  1. I thought about moving the series limit configuration to the Display Settings. One concern I have is the migration path, since in Cloud v1 the limit is enforced by default. We can defer that discussion for now, though. I agree that from a UX perspective, it makes more sense there.
  2. Good call on this. Selecting groups from the local maximum can be problematic because those groups may not always exist. My main concern is the performance impact of scanning groups across the entire time range, which could defeat the purpose of chunking. I'll put more thought into separating the queries and come back with some ideas.
  3. I think this is exactly the issue this PR is trying to address. The groups need to remain fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

review/tier-3 Standard — full human review required

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants