Flashring externalize#267
Flashring externalize#267nileshsolankimeesho wants to merge 53 commits intofeat/ssd-cache-multi-shard-vmfrom
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. 🗂️ Base branches to auto review (1)
Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Use the checkbox below for a quick retry:
Comment |
change filesize multiplier in plans
|
|
||
| // Init initializes the metrics client | ||
| func Init() { | ||
| if initialized { |
There was a problem hiding this comment.
we can simply use once, another variable initialized is not needed
| } | ||
|
|
||
| func (ma *MetricAverager) Add(value float64) { | ||
| ma.mu.Lock() |
There was a problem hiding this comment.
I think instead of this MetricAverager, We should directly publish metric, This is just adding 1 more unnecessary locking step.
flashring/pkg/metrics/runmetrics.go
Outdated
| // Start a goroutine for each metric channel | ||
| mc.wg.Add(12) | ||
|
|
||
| go mc.collectShardMetric(mc.channels.RP99, "RP99") |
There was a problem hiding this comment.
These go routines wont be needed if we publish metric directly, saving cpu in the pod
flashring/pkg/metrics/runmetrics.go
Outdated
| RecordRewrites(shardIdx int, value int64) | ||
|
|
||
| // Per-shard observation metrics | ||
| RecordRP99(shardIdx int, value time.Duration) |
There was a problem hiding this comment.
one we do a metric.Inc datadog itself provides the histogram values, we do not need to Record them explicitely
flashring/pkg/metrics/runmetrics.go
Outdated
| Rewrites int64 | ||
| RP99 time.Duration | ||
| RP50 time.Duration | ||
| RP25 time.Duration |
There was a problem hiding this comment.
No need to record explicitely for both Read and Write
flashring/pkg/metrics/runmetrics.go
Outdated
| Puts chan ShardMetricValue | ||
| Hits chan ShardMetricValue | ||
| ActiveEntries chan ShardMetricValue | ||
| ExpiredEntries chan ShardMetricValue |
There was a problem hiding this comment.
if we directly publish the metric, we do not need to create these channels, Again this will help in saving CPU
flashring/pkg/metrics/runmetrics.go
Outdated
| mc.instantMetrics[shardIdx][name] = 0 | ||
| } | ||
|
|
||
| mc.instantMetrics[shardIdx]["KeyNotFoundCount"] = 0 |
There was a problem hiding this comment.
This can also be directly Incr from the method itself
flashring/pkg/metrics/runmetrics.go
Outdated
| if !ok { | ||
| return | ||
| } | ||
| mc.mu.Lock() |
There was a problem hiding this comment.
Here also we are taking lock which should be needed
flashring/pkg/metrics/runmetrics.go
Outdated
| shardMetrics[shardIdx] = ShardMetrics{ | ||
| RP99: time.Duration(instants["RP99"]), | ||
| RP50: time.Duration(instants["RP50"]), | ||
| RP25: time.Duration(instants["RP25"]), |
| select { | ||
| case <-metricsCollector.stopCh: | ||
| return | ||
| case <-ticker.C: |
There was a problem hiding this comment.
here we are running self ticker, datadog itself publishes metric in every 30s or so, so we do not need to explicitely run a ticker and leave it to be managed by datadog
a33c7b7 to
f16d4a6
Compare
f87c39a to
8c510a3
Compare
e6fc41c to
af09e73
Compare
🔁 Pull Request Template – BharatMLStack
📌 Summary
📂 Modules Affected
horizon(Real-time systems / networking)online-feature-store(Feature serving infra)trufflebox-ui(Admin panel / UI)infra(Docker, CI/CD, GCP/AWS setup)docs(Documentation updates)___________✅ Type of Change
___________📊 Benchmark / Metrics (if applicable)