Skip to content

Comments

Flashring externalize#267

Open
nileshsolankimeesho wants to merge 53 commits intofeat/ssd-cache-multi-shard-vmfrom
flashring-externalize
Open

Flashring externalize#267
nileshsolankimeesho wants to merge 53 commits intofeat/ssd-cache-multi-shard-vmfrom
flashring-externalize

Conversation

@nileshsolankimeesho
Copy link

🔁 Pull Request Template – BharatMLStack

Please fill out the following sections to help us review your changes efficiently.


📌 Summary

e.g., Adds optimizes Redis fetch latency in online-feature-store, or improves search UI responsiveness in trufflebox-ui.


📂 Modules Affected

  • horizon (Real-time systems / networking)
  • online-feature-store (Feature serving infra)
  • trufflebox-ui (Admin panel / UI)
  • infra (Docker, CI/CD, GCP/AWS setup)
  • docs (Documentation updates)
  • Other: ___________

✅ Type of Change

  • Feature addition
  • Bug fix
  • Infra / build system change
  • Performance improvement
  • Refactor
  • Documentation
  • Other: ___________

📊 Benchmark / Metrics (if applicable)

@coderabbitai
Copy link

coderabbitai bot commented Jan 12, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

🗂️ Base branches to auto review (1)
  • develop

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review

Comment @coderabbitai help to get the list of available commands and usage tips.


// Init initializes the metrics client
func Init() {
if initialized {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can simply use once, another variable initialized is not needed

}

func (ma *MetricAverager) Add(value float64) {
ma.mu.Lock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think instead of this MetricAverager, We should directly publish metric, This is just adding 1 more unnecessary locking step.

// Start a goroutine for each metric channel
mc.wg.Add(12)

go mc.collectShardMetric(mc.channels.RP99, "RP99")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These go routines wont be needed if we publish metric directly, saving cpu in the pod

RecordRewrites(shardIdx int, value int64)

// Per-shard observation metrics
RecordRP99(shardIdx int, value time.Duration)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one we do a metric.Inc datadog itself provides the histogram values, we do not need to Record them explicitely

Rewrites int64
RP99 time.Duration
RP50 time.Duration
RP25 time.Duration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to record explicitely for both Read and Write

Puts chan ShardMetricValue
Hits chan ShardMetricValue
ActiveEntries chan ShardMetricValue
ExpiredEntries chan ShardMetricValue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we directly publish the metric, we do not need to create these channels, Again this will help in saving CPU

mc.instantMetrics[shardIdx][name] = 0
}

mc.instantMetrics[shardIdx]["KeyNotFoundCount"] = 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can also be directly Incr from the method itself

if !ok {
return
}
mc.mu.Lock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here also we are taking lock which should be needed

shardMetrics[shardIdx] = ShardMetrics{
RP99: time.Duration(instants["RP99"]),
RP50: time.Duration(instants["RP50"]),
RP25: time.Duration(instants["RP25"]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

managed by datadog

select {
case <-metricsCollector.stopCh:
return
case <-ticker.C:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here we are running self ticker, datadog itself publishes metric in every 30s or so, so we do not need to explicitely run a ticker and leave it to be managed by datadog

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants