From 923b8d7a799a7e407371a6939d31be2dc1b0af57 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20Gonz=C3=A1lez=20Di=20Antonio?=
Date: Sun, 24 May 2026 11:23:49 +0200
Subject: [PATCH] fix(ci): cosign sign container manifest by tag, not by local podman digest

The cosign step of the release workflow failed with MANIFEST_UNKNOWN on
every release attempt after multi-arch builds were restored, e.g. on
v0.45.0:

    Error: signing [ghcr.io/slashdevops/idp-scim-sync@sha256:2925...]:
    GET https://ghcr.io/v2/.../manifests/sha256:2925...: MANIFEST_UNKNOWN

The previous logic resolved the digest to sign by piping
`podman manifest inspect ghcr.io/...:TAG` into
`jq -r '.digest // .manifests[0].digest'`. Two compounding bugs:

1. A manifest list's own JSON has no top-level `.digest` (the list's
   digest is computed by hashing the JSON, not stored inside it), so
   the `//` fallback always returned `.manifests[0].digest` -- which is
   the digest of the first per-arch image (arm64), not of the manifest
   list itself.

2. Podman re-serializes manifests on push (media-type conversion
   between Docker manifest.v2+json and OCI image.manifest.v1+json), so
   the locally computed digest does not match what GHCR stores.
   Cosign's lookup at that local digest therefore returned 404.

Switch to `cosign sign --recursive ${IMAGE}:${TAG}`. Cosign internally
HEAD-resolves the tag to the authoritative on-registry digest and signs
that digest, so the resulting artifact is identical to what the broken
code intended to produce. The classic "signing by tag races with
concurrent pushes" caveat does not apply: this job exclusively owns the
v and latest tags and has just pushed them sequentially in the previous
step.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .github/workflows/container-image.yml | 26 +++++++++++++++-----------
 docs/Whats-New.md                     | 15 +++++++++++++++
 2 files changed, 30 insertions(+), 11 deletions(-)

diff --git a/.github/workflows/container-image.yml b/.github/workflows/container-image.yml
index 6a0da64..7e5ca44 100644
--- a/.github/workflows/container-image.yml
+++ b/.github/workflows/container-image.yml
@@ -90,16 +90,20 @@ jobs:
           echo "${{ secrets.GITHUB_TOKEN }}" \
             | cosign login ghcr.io --username "${{ github.actor }}" --password-stdin
           IMAGE="ghcr.io/${{ github.repository }}"
-          # podman manifest inspect → resolve the multi-arch manifest tag we just
-          # pushed to its content digest, then sign by digest. Cosign best-practice:
-          # signing by tag races with subsequent pushes; signing by digest does not.
-          # --recursive covers the manifest and every per-platform image under it.
+          # Sign by tag: cosign internally HEADs the registry to resolve the
+          # tag to its authoritative on-registry digest and signs that digest;
+          # the signature is stored under that digest, not under the tag. We
+          # previously resolved the digest locally with `podman manifest
+          # inspect`, but (a) that JSON has no top-level `.digest` for a
+          # manifest list and the `.manifests[0].digest` fallback returns the
+          # *first per-arch* image's digest, not the list's, and (b) podman
+          # re-serializes during push, so the local digest does not exist on
+          # GHCR. Result: cosign got MANIFEST_UNKNOWN.
+          # --recursive covers the manifest and every per-platform image
+          # under it. The classic "signing by tag races with concurrent
+          # pushes" caveat doesn't apply here: this job exclusively owns
+          # these tags and has just pushed them.
           for TAG in "${GIT_VERSION}" "latest"; do
-            DIGEST=$(podman manifest inspect "${IMAGE}:${TAG}" | jq -r '.digest // .manifests[0].digest')
-            if [ -z "${DIGEST}" ] || [ "${DIGEST}" = "null" ]; then
-              echo "::error::Could not resolve digest for ${IMAGE}:${TAG}"
-              exit 1
-            fi
-            cosign sign --recursive "${IMAGE}@${DIGEST}" | tee -a $GITHUB_STEP_SUMMARY
-            echo "**Signed:** \`${IMAGE}@${DIGEST}\` (tag: ${TAG})" >> $GITHUB_STEP_SUMMARY
+            cosign sign --recursive "${IMAGE}:${TAG}" | tee -a $GITHUB_STEP_SUMMARY
+            echo "**Signed:** \`${IMAGE}:${TAG}\`" >> $GITHUB_STEP_SUMMARY
           done
diff --git a/docs/Whats-New.md b/docs/Whats-New.md
index 348c80f..07b63ef 100644
--- a/docs/Whats-New.md
+++ b/docs/Whats-New.md
@@ -4,6 +4,21 @@ This document tracks notable changes, new features, and bug fixes across release
 
 ## Unreleased
 
+### CI fix: cosign now signs the published container manifest by tag (closes the v0.45.0 signing failure)
+
+Fixes the `Cosign sign published container manifest (keyless / Sigstore)` step of the release workflow, which failed with **`MANIFEST_UNKNOWN: manifest unknown`** on every release attempt after the multi-arch build was restored (see ["CI fix: restore multi-arch container builds"](#ci-fix-restore-multi-arch-container-builds-in-the-release-workflow)).
+
+**Root cause.** The step resolved the digest to sign by piping `podman manifest inspect` into `jq -r '.digest // .manifests[0].digest'`. Two compounding problems:
+
+1. **A manifest list's own JSON has no top-level `.digest`** (its digest is computed by hashing the JSON, not stored inside it). So the `//` fallback always wins and returns `.manifests[0].digest`, which is the digest of the **first per-arch image** (arm64), not of the manifest list.
+2. **Podman re-serializes manifests when pushing** (media-type conversion between Docker `vnd.docker.distribution.manifest.v2+json` and OCI `vnd.oci.image.manifest.v1+json`). The locally computed digest therefore does not match what GHCR stores, so cosign's lookup of `ghcr.io/…@sha256:` returns 404.
+
+Result: cosign was asked to sign a digest that exists nowhere on the registry.
+
+**Fix.** Sign by tag (`cosign sign --recursive ghcr.io/…:TAG`). Cosign internally HEAD-resolves the tag to its authoritative on-registry digest and signs that digest. The signature is still stored *by digest*, so the resulting artifact is identical to what the previous (broken) code intended to produce. The classic "signing-by-tag races with concurrent pushes" caveat does not apply here: this job exclusively owns the `v` and `latest` tags and has just pushed them sequentially in the previous step.
+
+No code or release-artifact changes.
+
 ### CI fix: restore multi-arch container builds in the release workflow
 
 Fixes the `Publish Container Images` job (failing since v0.44.1, surfaced again on the v0.45.0 release as ["Could not resolve digest for ghcr.io/slashdevops/idp-scim-sync:v0.45.0"](https://github.com/slashdevops/idp-scim-sync/actions/runs/26356807875/job/77585211704)).