Emit replication and heartbeat lag #40

Merged
forge33 merged 3 commits into master from pb/emit_rep_hb_lag_metrics
Jun 2, 2026

Conversation

@forge33 forge33 commented May 28, 2026

Related issue: github#1672
Closes: https://github.com/Shopify/schema-migrations/issues/5455

Description

This PR emits both replication lag and heartbeat lag as metrics so dashboards can correlate throttling with its upstream cause.

On each status tick, it emits two metrics:

  • gh_ost.lag.replication_seconds (histogram) — migrationContext.GetCurrentLagDuration().Seconds()
  • gh_ost.lag.heartbeat_seconds (histogram) — migrationContext.TimeSinceLastHeartbeatOnChangelog().Seconds()

Both metrics are tagged with throttled:true|false.

The resulting lag visualizations show at a glance whether gh-ost is throttled:
[Screenshot 2026-05-28 at 11 37 21 AM]

In case this PR introduced Go code changes:

  • contributed code is using same conventions as original code
  • script/cibuild returns with no formatting errors, build errors or unit test errors.

@forge33 forge33 force-pushed the pb/emit_rep_hb_lag_metrics branch from 176c695 to 7b25a86 Compare June 1, 2026 17:32
@forge33 forge33 marked this pull request as ready for review June 1, 2026 17:36
@forge33 forge33 requested review from coding-chimp and grodowski June 1, 2026 17:36
@forge33 forge33 added the #gsd:50633 Data Storage: gh-ost Observability Instrumentation label Jun 1, 2026
@forge33 forge33 merged commit 08b96c5 into master Jun 2, 2026
9 checks passed

Labels

#gsd:50633 Data Storage: gh-ost Observability Instrumentation, to-upstream

2 participants