Skip to content

Conversation

@SungJin1212
Copy link
Member

This PR adds -blocks-storage.bucket-store.honor-projection-hints CLI flag. If enabled, Store Gateway in Parquet mode will honor projection hints and only materialize requested labels.

Which issue(s) this PR fixes:
Fixes #

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
f.Float64Var(&cfg.TokenBucketBytesLimiter.TouchedChunksTokenFactor, "blocks-storage.bucket-store.token-bucket-bytes-limiter.touched-chunks-token-factor", 1, "Multiplication factor used for touched chunks token")
f.IntVar(&cfg.MatchersCacheMaxItems, "blocks-storage.bucket-store.matchers-cache-max-items", 0, "Maximum number of entries in the regex matchers cache. 0 to disable.")
cfg.ParquetShardCache.RegisterFlagsWithPrefix("blocks-storage.bucket-store.", f)
f.BoolVar(&cfg.HonorProjectionHints, "blocks-storage.bucket-store.honor-projection-hints", false, "[Experimental] If enabled, Store Gateway will honor projection hints and only materialize requested labels. It is only effect when `-blocks-storage.bucket-store.bucket-store-type` is a parquet.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is parquet Remove a.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to double check, if I want to actually use parquet store gateway projection, I need to also honor projection hints in Querier (but not enable parquet querier)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, even if the Parquet Querier is disabled (using the standard Querier), the Parquet Store Gateway relies on these hints to perform projection pushdown (Querier send the projection hints).

storageHints.ProjectionLabels = queryHints.ProjectionLabels

// Reset projection hints if not all parquet shard have the hash column.
if !allParquetBlocksHaveHashColumn(shards) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should only project if it is projection include. Add that check first can avoid checking all blocks of hash column when unnecessary


// HasHashColumn checks if the parquet block contains the schema.SeriesHashColumn column.
// This is used to determine if projection pushdown can be enabled.
func (b *parquetBlock) HasHashColumn() bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make it private. No need to expose a public method for a private field.

I think we can change this check to simply just checking the parquet converter marker version. We can change it after we have the change to Parquet store gateway to load the bucket index

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, we need to change it later

assert.Equal(t, expectedVector1, result.(model.Vector))

if testCfg.honorProjectionHints {
aggQuery := `sum by (series_1) (series_1)`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is series_1 a valid label to be used in the by clause?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, series_1 is actually a label key included in the series_1 metric. In this test, we generated the series this way

generateSeries("series_1", series1Timestamp, prompb.Label{Name: "series_1", Value: "series_1"})

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants