feat: parallelize ci publish with retry via a unified imagetools plugin #599
Draft
ianpittwood wants to merge 2 commits into
Apply the parallel execution module to `bakery ci publish`: each target's oras index-create -> soci convert -> index-copy -> verify sequence now runs as one job on the parallel executor, so independent targets publish concurrently, and every registry command is wrapped in retry-with-backoff to absorb the transient GHCR eventual-consistency failures described in #591 (not found / manifest unknown / 5xx / timeouts; permanent auth/reference errors fail fast).

Consolidate the oras and soci plugins into a single `imagetools` plugin that owns the full pipeline. `bakery oras merge`, `soci convert`, `ci publish`, and `ci merge` all route through it (command names preserved); the soci tool options still parse via `tool: soci`.

The parallel module gains `RetryPolicy`, `CommandResult`, `CommandRunner`, `ShellJob`, `JobResult`, and `run_jobs()` alongside the existing one-command `ShellTask` path, sharing one tracked-spawn primitive so timeout + process-group termination + Ctrl-C safety are identical.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
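The retry and per-target job behavior described in this commit can be illustrated with a minimal sketch. This is not the code from the PR: the class names `RetryPolicy` and `CommandResult` and the function name `run_jobs` are borrowed from the commit message, but every field, signature, and error marker below is an assumption made for illustration, and a thread pool stands in for whatever executor the parallel module actually uses.

```python
import random
import re
import time
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, Sequence

# Assumed error markers; the real classification rules are not shown in this PR text.
PERMANENT_MARKERS = ("unauthorized", "denied", "invalid reference")
TRANSIENT_MARKERS = ("not found", "manifest unknown", "timeout", "timed out")


@dataclass(frozen=True)
class RetryPolicy:  # hypothetical fields
    max_attempts: int = 4
    base_delay: float = 1.0
    max_delay: float = 30.0


@dataclass(frozen=True)
class CommandResult:  # hypothetical fields
    returncode: int
    stderr: str = ""


Step = Callable[[], CommandResult]


def is_transient(result: CommandResult) -> bool:
    """Permanent markers win; otherwise look for transient text or a 5xx status."""
    text = result.stderr.lower()
    if any(marker in text for marker in PERMANENT_MARKERS):
        return False
    if any(marker in text for marker in TRANSIENT_MARKERS):
        return True
    return re.search(r"\b5\d\d\b", text) is not None


def run_with_retry(step: Step, policy: RetryPolicy, sleep=time.sleep) -> CommandResult:
    """Retry transient failures with capped exponential backoff and jitter."""
    if policy.max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, policy.max_attempts + 1):
        result = step()
        if result.returncode == 0:
            return result
        if not is_transient(result) or attempt == policy.max_attempts:
            return result  # permanent error or retries exhausted: fail now
        delay = min(policy.max_delay, policy.base_delay * 2 ** (attempt - 1))
        sleep(delay * random.uniform(0.5, 1.0))
    raise AssertionError("unreachable")


def run_job(steps: Sequence[Step], policy: RetryPolicy, sleep=time.sleep) -> bool:
    """Run one target's steps in order, stopping at the first failed step."""
    for step in steps:
        if run_with_retry(step, policy, sleep).returncode != 0:
            return False
    return True


def run_jobs(jobs: Dict[str, Sequence[Step]], policy: RetryPolicy,
             max_workers: int = 4, sleep=time.sleep) -> Dict[str, bool]:
    """Run each target's job concurrently; steps inside a job stay sequential."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(run_job, steps, policy, sleep)
                   for name, steps in jobs.items()}
        return {name: future.result() for name, future in futures.items()}
```

In this sketch each step is a zero-argument callable, so a real caller would wrap a subprocess invocation and map its exit code and stderr into `CommandResult`. Keeping the steps of one target sequential while targets run side by side mirrors the "one job per target" shape the commit describes.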
Absorb the read-after-write wait from #598: `OrasWaitForSourcesWorkflow` polls every per-platform source digest with `oras manifest fetch --descriptor` until all are readable (10 min timeout), naming any laggard. `ImageToolsPlugin.execute` runs it once up front on create-bearing flows (publish / oras merge), aborting the publish if a digest never propagates, so GHCR eventual-consistency lag becomes condition-based waiting instead of an opaque downstream #591 failure. The soci-only path skips it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
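The condition-based wait described in this commit might look roughly like the sketch below. It is an illustration only, not `OrasWaitForSourcesWorkflow` itself: the function names `oras_can_fetch` and `wait_for_sources`, the 5-second polling interval, and the injectable `probe` / `clock` / `sleep` parameters are all assumptions. Only the `oras manifest fetch --descriptor` command and the 10-minute timeout come from the commit message.

```python
import subprocess
import time
from typing import Callable, Iterable, List


def oras_can_fetch(ref: str) -> bool:
    """Probe one reference; True when `oras manifest fetch --descriptor` exits 0."""
    proc = subprocess.run(
        ["oras", "manifest", "fetch", "--descriptor", ref],
        capture_output=True,
        text=True,
        check=False,
    )
    return proc.returncode == 0


def wait_for_sources(
    refs: Iterable[str],
    probe: Callable[[str], bool] = oras_can_fetch,
    timeout: float = 600.0,   # 10 minutes, as stated in the commit message
    interval: float = 5.0,    # assumed polling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll until every ref is readable.

    Returns the refs still unreadable when the timeout expires (the
    "laggards"); an empty list means every source became readable.
    """
    pending = list(dict.fromkeys(refs))  # de-duplicate, keep order
    deadline = clock() + timeout
    while pending:
        # Re-probe only the refs that have not been seen readable yet.
        pending = [ref for ref in pending if not probe(ref)]
        if not pending or clock() >= deadline:
            break
        sleep(interval)
    return pending
```

A caller could abort the publish when the returned list is non-empty and report those references by name, which matches the "naming any laggard" behavior the commit describes.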
Supersedes #598
What & why
`bakery ci publish` ran its 4-phase pipeline (oras index-create → soci convert → oras index-copy → verify) serially per target, with no retry, so the transient GHCR eventual-consistency failures in #591 ("not found" / "manifest unknown" on freshly-pushed digests) failed the whole job and only self-healed on a manual re-run.

Building on the `feat/parallel-calls` parallel module (#588), this:

- … `parallel` module so future flaky commands can reuse it.
- … `imagetools` plugin. A single plugin owns the whole pipeline and registers `oras merge`, `soci convert`, and `imagetools publish` (command names preserved). `ci publish` / `ci merge` route through it. `tool: soci` options still parse (the tool-options registry now keys by the options class's `tool` discriminator).

Parallel module additions

`RetryPolicy`, `CommandResult`, `CommandRunner`, `ShellJob`, `JobResult`, and `run_jobs()` were added alongside the existing `ShellTask` / `run()` (dgoss) path, sharing one `_spawn_and_communicate` primitive so timeout/termination/interrupt behavior is identical.

Verification

- `just test`: 1769 passed; ruff check + format clean.
- … `soci`.
- `oras merge`, `soci convert`, `ci publish`, `ci merge`, and `imagetools publish` all register.
- `bakery ci publish --dry-run` is blocked by local command policy; the BDD `ci merge` scenario + no-mock dry-run integration tests cover the end-to-end path.

🤖 Generated with Claude Code