feat(canopy): send health_details (sizes, fixes, duration) with verification #88
Merged
…ication

Canopy now accepts an arbitrary `health_details` map on /restore-verification, and bestool-canopy 0.4.4 adds a generic request escape hatch. The typed RestoreVerification struct has no such field, so pgro serializes the typed report, splices in `health_details`, and POSTs the merged body via CanopyClient::request to the same endpoint.

health_details (snake_case) carries:
- sizes: per-database on-disk bytes (pg_database_size), keyed by db name.
- fixes: an arbitrary jsonb map of the fix steps the restore applied (locale, reindex, reset_wal, recreated_pg_wal). Stored in _pgro.restore_info.fixes by the init script and forwarded verbatim, so adding a fix is one shell line + its flag, with no schema or operator change.
- restore_duration_sec: wall-clock from the restore CR's createdAt to report time.

sizes + fixes come from one read-only connection to the restore's postgres (done in the switchover block, before any ephemeral teardown destroys the DB). Gathering is best-effort: on the failure path, or if postgres never came up, those pieces are omitted and the verification still sends. Duration is independent of postgres.

Records pg_resetwal / pg_wal-recreation via flag files so they surface in fixes alongside the existing locale/reindex flags.
🤖 Stacks on #87.

Canopy accepts an arbitrary `health_details` map on `/restore-verification`, and bestool-canopy 0.4.4 adds a generic `request` escape hatch (bestool#626). The typed `RestoreVerification` struct has no `health_details` field, so pgro serializes the typed report, splices in `health_details`, and POSTs the merged body via `CanopyClient::request` to the same endpoint.
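The assemble-and-splice step can be sketched roughly as follows. This is an illustration under stated assumptions, not pgro's actual code: the `Json` enum is a dependency-free stand-in for a JSON value (the real code more likely uses `serde_json::Value`), and the function names `restore_duration_sec`, `build_health_details`, and `splice_health_details` are hypothetical. It shows the best-effort pieces (`sizes`, `fixes`) being omitted when they could not be gathered, while the duration is always present.

```rust
use std::collections::BTreeMap;
use std::time::{Duration, SystemTime};

/// Minimal stand-in for a JSON value (assumption: real code would likely
/// use `serde_json::Value`; this keeps the sketch dependency-free).
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Num(u64),
    Bool(bool),
    Str(String),
    Obj(BTreeMap<String, Json>),
}

/// Whole seconds from the restore CR's creation time to report time.
/// Assumption: clamp to 0 if the clocks disagree, rather than failing.
pub fn restore_duration_sec(created_at: SystemTime, report_time: SystemTime) -> u64 {
    report_time
        .duration_since(created_at)
        .unwrap_or(Duration::ZERO)
        .as_secs()
}

/// Build `health_details`, leaving out whichever best-effort pieces are
/// missing (`None` models "postgres never came up" or the failure path).
pub fn build_health_details(
    sizes: Option<BTreeMap<String, u64>>,
    fixes: Option<BTreeMap<String, Json>>,
    duration_sec: u64,
) -> Json {
    let mut details: BTreeMap<String, Json> = BTreeMap::new();
    if let Some(sizes) = sizes {
        // Per-database byte counts, keyed by database name.
        let by_db: BTreeMap<String, Json> = sizes
            .into_iter()
            .map(|(db, bytes)| (db, Json::Num(bytes)))
            .collect();
        details.insert("sizes".to_string(), Json::Obj(by_db));
    }
    if let Some(fixes) = fixes {
        // Forwarded verbatim: no per-fix knowledge is needed here.
        details.insert("fixes".to_string(), Json::Obj(fixes));
    }
    details.insert("restore_duration_sec".to_string(), Json::Num(duration_sec));
    Json::Obj(details)
}

/// Splice `health_details` into an already-serialized report object.
/// A report that is not a JSON object is returned unchanged.
pub fn splice_health_details(report: Json, details: Json) -> Json {
    match report {
        Json::Obj(mut fields) => {
            fields.insert("health_details".to_string(), details);
            Json::Obj(fields)
        }
        other => other,
    }
}
```

Because `fixes` is passed through as an opaque map, a new fix key added by the init script would flow through this shape without any change on the sending side, which matches the "one shell line + its flag file" claim.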
health_details (snake_case)

- `sizes`: per-database on-disk bytes (`pg_database_size`), keyed by db name.
- `fixes`: an arbitrary jsonb map of the fix steps the restore applied: `locale`, `reindex`, `reset_wal`, `recreated_pg_wal`. Stored in `_pgro.restore_info.fixes` by the init script and forwarded verbatim, so adding a fix later is one shell line + its flag file, with no schema change and no operator change. (`reindex`/`locale` were the example; this carries whatever we apply.)
- `restore_duration_sec`: wall-clock from the restore CR's `createdAt` to report time (≈ activation).

Gathering

`sizes` + `fixes` come from a single read-only connection to the restore's postgres, done in the switchover block before #87's ephemeral teardown destroys the DB. Best-effort: on the failure path (or if postgres never came up) those pieces are omitted and the verification still sends. `restore_duration_sec` is independent of postgres.

Records `pg_resetwal` / `pg_wal`-recreation via flag files so they surface in `fixes` next to the existing locale/reindex flags.

Follow-up: typed schema (bestool#628)
This uses the arbitrary-JSON `request` path. Once bestool#628 lands and releases (build-time `bestool_canopy::schema::types` generated from canopy's OpenAPI), a follow-up will swap the hand-built `Value` for the generated `RestoreVerification` type + `request_json`, for compile-time safety against canopy's spec. Deliberately not blocking this PR on it.
More stats?
Easy follow-ups if you want them (all cheap from the same connection): per-db table count, largest-relation size, `pg_stat_database` xact/blks counters, or the snapshot→restored size ratio.