-
Notifications
You must be signed in to change notification settings - Fork 847
Add projectionHint to SG Parquet mode #7206
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Add projectionHint to SG Parquet mode #7206
Conversation
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
pkg/storage/tsdb/config.go
Outdated
| f.Float64Var(&cfg.TokenBucketBytesLimiter.TouchedChunksTokenFactor, "blocks-storage.bucket-store.token-bucket-bytes-limiter.touched-chunks-token-factor", 1, "Multiplication factor used for touched chunks token") | ||
| f.IntVar(&cfg.MatchersCacheMaxItems, "blocks-storage.bucket-store.matchers-cache-max-items", 0, "Maximum number of entries in the regex matchers cache. 0 to disable.") | ||
| cfg.ParquetShardCache.RegisterFlagsWithPrefix("blocks-storage.bucket-store.", f) | ||
| f.BoolVar(&cfg.HonorProjectionHints, "blocks-storage.bucket-store.honor-projection-hints", false, "[Experimental] If enabled, Store Gateway will honor projection hints and only materialize requested labels. It is only effect when `-blocks-storage.bucket-store.bucket-store-type` is a parquet.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is parquet Remove a.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to double check, if I want to actually use parquet store gateway projection, I need to also honor projection hints in Querier (but not enable parquet querier)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, even if the Parquet Querier is disabled (using the standard Querier), the Parquet Store Gateway relies on these hints to perform projection pushdown (Querier send the projection hints).
| storageHints.ProjectionLabels = queryHints.ProjectionLabels | ||
|
|
||
| // Reset projection hints if not all parquet shard have the hash column. | ||
| if !allParquetBlocksHaveHashColumn(shards) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should only project if it is projection include. Add that check first can avoid checking all blocks of hash column when unnecessary
|
|
||
| // HasHashColumn checks if the parquet block contains the schema.SeriesHashColumn column. | ||
| // This is used to determine if projection pushdown can be enabled. | ||
| func (b *parquetBlock) HasHashColumn() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make it private. No need to expose a public method for a private field.
I think we can change this check to simply just checking the parquet converter marker version. We can change it after we have the change to Parquet store gateway to load the bucket index
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, we need to change it later
| assert.Equal(t, expectedVector1, result.(model.Vector)) | ||
|
|
||
| if testCfg.honorProjectionHints { | ||
| aggQuery := `sum by (series_1) (series_1)` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is series_1 a valid label to be used in the by clause?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, series_1 is actually a label key included in the series_1 metric. In this test, we generated the series this way
generateSeries("series_1", series1Timestamp, prompb.Label{Name: "series_1", Value: "series_1"})
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
This PR adds
-blocks-storage.bucket-store.honor-projection-hintsCLI flag. If enabled, Store Gateway in Parquet mode will honor projection hints and only materialize requested labels.Which issue(s) this PR fixes:
Fixes #
Checklist
CHANGELOG.mdupdated - the order of entries should be[CHANGE],[FEATURE],[ENHANCEMENT],[BUGFIX]