Emit per-stage sleep metrics #47

Open
grodowski wants to merge 6 commits into pb/consolidate_metrics_emitters from grodowski/issue-5459-sleep-metrics

@grodowski grodowski commented Jun 2, 2026

Closes https://github.com/Shopify/schema-migrations/issues/5459

Add sleep metric helpers and instrument the main migration sleep/wait paths.

Metrics emitted:

  • sleep.duration_milliseconds tagged with stage
  • sleep.total_milliseconds tagged with stage

Stages covered:

  • cut_over_postpone
  • chunk_throttle
  • retry_backoff
  • replica_wait

Use millisecond units so sub-second waits, such as replica polling and
nice-ratio throttling, are not truncated to zero. Document the sleep stage
catalog and add unit coverage for the helper, including sub-second durations.
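The unit choice above can be illustrated with a minimal sketch. It is not the PR's implementation: the `sleepEmitter` interface, the `recorder` fake, the `emitSleep` helper, the `stage:<name>` tag format, and the idea that `sleep.total_milliseconds` is emitted as a counter are all assumptions made for illustration. Only the `Histogram(name string, value float64, tags ...string)` signature and the metric and stage names come from this PR.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// sleepEmitter is a hypothetical, minimal client contract for this sketch.
// Only the Histogram signature appears in the PR; Count is an assumption.
type sleepEmitter interface {
	Histogram(name string, value float64, tags ...string)
	Count(name string, value float64, tags ...string)
}

// recorder is an in-memory sleepEmitter that stores samples keyed by "name|tags".
type recorder struct {
	histograms map[string][]float64
	counts     map[string]float64
}

func newRecorder() *recorder {
	return &recorder{
		histograms: map[string][]float64{},
		counts:     map[string]float64{},
	}
}

// sampleKey builds the lookup key used by recorder.
func sampleKey(name string, tags []string) string {
	return name + "|" + strings.Join(tags, ",")
}

func (r *recorder) Histogram(name string, value float64, tags ...string) {
	k := sampleKey(name, tags)
	r.histograms[k] = append(r.histograms[k], value)
}

func (r *recorder) Count(name string, value float64, tags ...string) {
	r.counts[sampleKey(name, tags)] += value
}

// durationMillis converts d to fractional milliseconds without truncation.
func durationMillis(d time.Duration) float64 {
	return float64(d) / float64(time.Millisecond)
}

// emitSleep records one sleep of length d for the given stage.
// The "stage:<name>" tag format is an assumption, not taken from the PR.
func emitSleep(e sleepEmitter, stage string, d time.Duration) {
	ms := durationMillis(d)
	tag := "stage:" + stage
	e.Histogram("sleep.duration_milliseconds", ms, tag)
	e.Count("sleep.total_milliseconds", ms, tag)
}

func main() {
	d := 250 * time.Millisecond
	// Converting to whole seconds loses a sub-second wait entirely.
	fmt.Println("whole seconds:", int64(d.Seconds()))
	fmt.Println("milliseconds:", durationMillis(d))

	r := newRecorder()
	emitSleep(r, "replica_wait", d)
	emitSleep(r, "replica_wait", 500*time.Microsecond)
	fmt.Println("total ms:", r.counts["sleep.total_milliseconds|stage:replica_wait"])
}
```

In this sketch a 250 ms wait becomes 0 when truncated to whole seconds but stays 250 when reported in milliseconds, which is the failure mode the millisecond units are meant to avoid.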

🎩

Cherry-picked this commit plus a few others onto a separate branch and ran a dev tophat run to add name to fake_data:
[Screenshot 2026-06-02 at 9 56 09 PM]
The histogram Prometheus query still needs to be updated, but we have data.

In case this PR introduced Go code changes:

  • contributed code uses the same conventions as the original code
  • script/cibuild returns with no formatting errors, build errors, or unit test errors

forge33 added 5 commits June 1, 2026 13:29
Use a single metrics emitter abstraction for gauge, count, and histogram samples so metric helpers share one testable client contract.
Keep the small metric emission helpers, runtime reporter, and their tests together so the metrics package has fewer one-function files.
@grodowski grodowski added the #gsd:50633 Data Storage: gh-ost Observability Instrumentation label Jun 2, 2026
@grodowski grodowski changed the base branch from grodowski/metrics-client-histogram to pb/refactor_metrics_emitter June 2, 2026 09:52
@grodowski grodowski force-pushed the grodowski/issue-5459-sleep-metrics branch from f129c8d to c0aa1c3 Compare June 2, 2026 13:01
@grodowski grodowski changed the base branch from pb/refactor_metrics_emitter to pb/consolidate_metrics_emitters June 2, 2026 13:01
@grodowski grodowski force-pushed the grodowski/issue-5459-sleep-metrics branch from c0aa1c3 to 684e645 Compare June 2, 2026 13:18
@grodowski grodowski marked this pull request as ready for review June 2, 2026 13:42
Comment thread go/metrics/emit.go
}

type sleepHistogramEmitter interface {
	Histogram(name string, value float64, tags ...string)

Same here with the separate interfaces: it would be good to collapse them all into a single Emitter in client.go.
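A rough sketch of what that consolidation could look like follows. Everything here is hypothetical: the `Emitter` method set (the `Gauge` and `Count` signatures mirror the `Histogram` signature shown in the diff, which is an assumption), the `fakeEmitter` test double, the `emitSleepDuration` helper, and the `stage:<name>` tag format are illustrative and not taken from client.go.

```go
package main

import "fmt"

// Emitter is a hypothetical consolidated contract covering gauge, count, and
// histogram samples, replacing narrow per-helper interfaces such as
// sleepHistogramEmitter.
type Emitter interface {
	Gauge(name string, value float64, tags ...string)
	Count(name string, value float64, tags ...string)
	Histogram(name string, value float64, tags ...string)
}

// sample is one recorded metric call.
type sample struct {
	kind  string
	name  string
	value float64
	tags  []string
}

// fakeEmitter records every call so tests can assert on emitted samples.
type fakeEmitter struct {
	samples []sample
}

func (f *fakeEmitter) record(kind, name string, value float64, tags []string) {
	f.samples = append(f.samples, sample{kind: kind, name: name, value: value, tags: tags})
}

func (f *fakeEmitter) Gauge(name string, value float64, tags ...string) {
	f.record("gauge", name, value, tags)
}

func (f *fakeEmitter) Count(name string, value float64, tags ...string) {
	f.record("count", name, value, tags)
}

func (f *fakeEmitter) Histogram(name string, value float64, tags ...string) {
	f.record("histogram", name, value, tags)
}

// Compile-time check that fakeEmitter satisfies the shared contract.
var _ Emitter = (*fakeEmitter)(nil)

// emitSleepDuration only calls Histogram, but it accepts the shared Emitter
// so that every helper depends on the same interface.
func emitSleepDuration(e Emitter, stage string, ms float64) {
	e.Histogram("sleep.duration_milliseconds", ms, "stage:"+stage)
}

func main() {
	f := &fakeEmitter{}
	emitSleepDuration(f, "retry_backoff", 12.5)
	fmt.Println(len(f.samples), f.samples[0].kind, f.samples[0].name)
}
```

The trade-off is that a single wider interface means one fake serves every helper test, at the cost of each helper depending on methods it does not call.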

Comment thread go/metrics/catalog.md
@@ -0,0 +1,15 @@
# gh-ost metrics catalog

I was waiting to do this until the end, but sure: https://github.com/Shopify/schema-migrations/issues/5465

@forge33 forge33 mentioned this pull request Jun 3, 2026
2 tasks
@forge33 forge33 force-pushed the pb/consolidate_metrics_emitters branch 2 times, most recently from bd9bdb1 to c4e2b06 Compare June 3, 2026 14:41
@forge33 forge33 mentioned this pull request Jun 3, 2026
2 tasks
Labels

#gsd:50633 Data Storage: gh-ost Observability Instrumentation to-upstream

2 participants