
WIP: OCPBUGS-81712: fixes race condition when mirroring operator catalogs#1390

Open
aguidirh wants to merge 1 commit into openshift:main from
aguidirh:fix/OCPBUGS-81712

Conversation

@aguidirh
Contributor

@aguidirh aguidirh commented Apr 16, 2026

Description

This PR fixes a race condition in oc-mirror v2's operator catalog mirroring that causes diskToMirror operations to fail in air-gapped environments.

Root Cause: During mirrorToDisk, catalog images are resolved by tag at two different times:

  1. During the collection phase (CollectAll()) - creates working-dir structure based on resolved digest
  2. During the mirroring phase (Batch.Worker()) - mirrors the catalog image

If Red Hat updates a catalog between these phases (the tag now points to a different digest), the working-dir structure is created with one digest but the tarball contains a different digest. During diskToMirror, oc-mirror cannot find the catalog in the working-dir and attempts to fetch it from the source registry (registry.redhat.io), which fails in air-gapped environments.

Solution: Pin catalog images to their resolved digest during collection for mirrorToDisk and mirrorToMirror operations. This ensures the same digest is used throughout the entire workflow, preventing the race condition. Additionally, fix the pinned ImageSetConfiguration generator to handle digest-based CatalogToFBCMap keys when generating pinned ISC/DISC files.

GitHub / Jira issue: OCPBUGS-81712

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Code Improvements (Refactoring, Performance, CI upgrades, etc)
  • Internal repo assets (diagrams / docs on github repo)
  • This change requires a documentation update on openshift docs

Code Changes

Primary Fix: Catalog Digest Pinning

File: internal/pkg/operator/filtered_collector.go

  1. Lines 89-100: Updated CatalogToFBCMap key generation

    • In M2D/M2M modes, use a digest-based key to match the Origin reference

    if imgSpec.Transport != consts.OciProtocol && (o.Opts.IsMirrorToDisk() || o.Opts.IsMirrorToMirror()) {
        mapKey = consts.DockerProtocol + image.WithDigest(imgSpec.Name, result.Digest)
    }

  2. Lines 301-310: Pin catalog to digest during collection

    • In M2D/M2M modes, pin the catalog reference to the resolved digest
    • Preserve the original tag for the destination if the user didn't provide targetTag

    if imgSpec.Transport != consts.OciProtocol && (o.Opts.IsMirrorToDisk() || o.Opts.IsMirrorToMirror()) {
        catalogImage = image.WithDigest(imgSpec.Name, catalogDigest)
        if len(targetTag) == 0 && len(imgSpec.Tag) > 0 {
            targetTag = imgSpec.Tag
        }
    }

File: internal/pkg/operator/filtered_collector_test.go

  • Updated test expectations for M2D and M2M modes
  • Source references now use digest format
  • Destination references still use tag format (user-facing)

Secondary Fix: Pinned ISC/DISC Generation

File: internal/pkg/config/pin_catalogs.go

  1. Lines 86-101: Added findCatalogByOriginalRef() helper function

    • Searches CatalogToFBCMap by matching OperatorFilter.Catalog
    • Enables lookup when map keys use digest format but the ISC has tag references

    func findCatalogByOriginalRef(
        catalog string,
        catalogToFBCMap map[string]v2alpha1.CatalogFilterResult,
        log clog.PluggableLoggerInterface,
    ) (v2alpha1.CatalogFilterResult, bool)

  2. Lines 116-126: Updated pinSingleCatalogDigest() lookup logic

    • Try a direct lookup first (tag-based key for D2M)
    • Fall back to findCatalogByOriginalRef() for M2D/M2M digest-based keys

    filterResult, ok := catalogToFBCMap[imgSpec.ReferenceWithTransport]
    if !ok {
        filterResult, ok = findCatalogByOriginalRef(catalog, catalogToFBCMap, log)
    }

File: internal/pkg/config/pin_catalogs_test.go

  • Added TestPinSingleCatalog_M2DMapKeyFormat
  • Added TestPinCatalogDigests_M2DMapKeyFormat
  • Tests verify pinned ISC generation works with digest-based map keys

How Has This Been Tested?

Unit Tests

All existing unit tests pass:

# Operator collector tests (catalog pinning during collection)
go test -v ./internal/pkg/operator -run TestFilterCollector

# Pinned ISC/DISC generation tests (handles digest-based map keys)
go test -v ./internal/pkg/config -run TestPin

Manual Reproduction Test

A complete reproduction kit is provided below to verify the bug and validate the fix.

Prerequisites

  • Quay.io account with push access
  • podman, skopeo, and jq installed
  • oc-mirror built from this branch

Setup (One-Time)

  1. Set your Quay username:
export QUAY_USERNAME=your-quay-username
  2. Create the fake catalog structure:
mkdir -p /tmp/fake-catalog/configs
cd /tmp/fake-catalog
  3. Create the files below (Dockerfile, configs, scripts)

  4. Run the reproduction script:

chmod +x /tmp/fake-catalog/reproduce-bug.sh
cd /tmp/fake-catalog
./reproduce-bug.sh

This will:

  • Build and push the initial catalog (v1.0) to your Quay repository
  • Create an ImageSetConfiguration
  • Create an update script that simulates Red Hat updating the catalog
  • Start oc-mirror in mirrorToDisk mode
  5. When oc-mirror pauses (note: the pause requires uncommenting lines 880-881 in executor.go), it will display instructions. Open a second terminal and run:
/tmp/bug-reproduction/update-catalog.sh

This simulates Red Hat pushing a new catalog version (moving the tag to a different digest).

  6. Press Enter in the first terminal to let oc-mirror continue.

Verification

Without the fix:

  • DiskToMirror fails while attempting to fetch the catalog from the source registry over the internet (which is unreachable in an air-gapped environment)

With the fix:

  • Catalog is pinned to digest during collection
  • DiskToMirror succeeds without internet access.

Cleanup for Re-testing

chmod +x /tmp/fake-catalog/cleanup.sh
/tmp/fake-catalog/cleanup.sh

This removes:

  • /tmp/bug-reproduction/ directory
  • ~/.oc-mirror cache
  • Active catalog config file
  • Optionally: local podman images

Reproduction Kit Files

1. Dockerfile

# Use the File-Based Catalog (FBC) format
FROM scratch

# Copy the declarative config
COPY configs /configs

# Metadata for the catalog
LABEL operators.operatorframework.io.index.configs.v1=/configs

2. configs/catalog.json (Initial catalog - v1.0)

{
  "schema": "olm.package",
  "name": "my-test-operator",
  "defaultChannel": "stable"
}
{
  "schema": "olm.channel",
  "name": "stable",
  "package": "my-test-operator",
  "entries": [
    {
      "name": "my-test-operator.v1.0.0"
    }
  ]
}
{
  "schema": "olm.bundle",
  "name": "my-test-operator.v1.0.0",
  "package": "my-test-operator",
  "image": "quay.io/operatorhubio/prometheus:v0.47.0",
  "properties": [
    {
      "type": "olm.package",
      "value": {
        "packageName": "my-test-operator",
        "version": "1.0.0"
      }
    }
  ],
  "relatedImages": [
    {
      "name": "prometheus",
      "image": "quay.io/prometheus/prometheus:v2.22.1"
    }
  ]
}

3. configs/catalog-v2.json (Updated catalog - v2.0)

{
  "schema": "olm.package",
  "name": "my-test-operator",
  "defaultChannel": "stable"
}
{
  "schema": "olm.channel",
  "name": "stable",
  "package": "my-test-operator",
  "entries": [
    {
      "name": "my-test-operator.v1.0.0"
    },
    {
      "name": "my-test-operator.v1.1.0",
      "replaces": "my-test-operator.v1.0.0"
    }
  ]
}
{
  "schema": "olm.bundle",
  "name": "my-test-operator.v1.0.0",
  "package": "my-test-operator",
  "image": "quay.io/operatorhubio/prometheus:v0.47.0",
  "properties": [
    {
      "type": "olm.package",
      "value": {
        "packageName": "my-test-operator",
        "version": "1.0.0"
      }
    }
  ],
  "relatedImages": [
    {
      "name": "prometheus",
      "image": "quay.io/prometheus/prometheus:v2.22.1"
    }
  ]
}
{
  "schema": "olm.bundle",
  "name": "my-test-operator.v1.1.0",
  "package": "my-test-operator",
  "image": "quay.io/operatorhubio/prometheus:v0.50.0",
  "properties": [
    {
      "type": "olm.package",
      "value": {
        "packageName": "my-test-operator",
        "version": "1.1.0"
      }
    }
  ],
  "relatedImages": [
    {
      "name": "prometheus",
      "image": "quay.io/prometheus/prometheus:v2.30.0"
    },
    {
      "name": "alertmanager",
      "image": "quay.io/prometheus/alertmanager:v0.23.0"
    }
  ]
}

4. reproduce-bug.sh

#!/bin/bash
set -e

# OCPBUGS-81712 Bug Reproduction Script
# This script helps you reproduce the race condition bug

QUAY_USERNAME="${QUAY_USERNAME:-YOUR_USERNAME_HERE}"

if [ "$QUAY_USERNAME" = "YOUR_USERNAME_HERE" ]; then
    echo "❌ Please set your Quay username:"
    echo "  export QUAY_USERNAME=your-quay-username"
    exit 1
fi

CATALOG_IMAGE="quay.io/${QUAY_USERNAME}/test-catalog:v1.0"
CATALOG_IMAGE_OLD="quay.io/${QUAY_USERNAME}/test-catalog:v1.0-old"
TEST_DIR="/tmp/bug-reproduction"

echo "🔧 OCPBUGS-81712 Bug Reproduction"
echo "=================================="
echo ""

# Step 1: Build and push initial catalog
echo "📦 Step 1: Building initial catalog (v1.0)..."
cd /tmp/fake-catalog
cp configs/catalog.json configs/catalog-active.json
podman build -t test-catalog:v1.0 .
podman tag test-catalog:v1.0 ${CATALOG_IMAGE}
podman push ${CATALOG_IMAGE}
# Preserve the original digest under the v1.0-old tag so it stays reachable on Quay
podman push ${CATALOG_IMAGE} ${CATALOG_IMAGE_OLD}

DIGEST_V1=$(skopeo inspect docker://${CATALOG_IMAGE} | jq -r .Digest)
echo "   ✅ Pushed catalog v1.0 with digest: ${DIGEST_V1}"
echo ""

# Step 2: Create ImageSetConfig
echo "📝 Step 2: Creating ImageSetConfig..."
mkdir -p ${TEST_DIR}
cat > ${TEST_DIR}/imageset-config.yaml <<EOF
kind: ImageSetConfiguration
apiVersion: mirror.openshift.io/v2alpha1
mirror:
  operators:
    - catalog: ${CATALOG_IMAGE}
      packages:
        - name: my-test-operator
EOF
echo "   ✅ Created ${TEST_DIR}/imageset-config.yaml"
echo ""

# Step 3: Create the update script (before running oc-mirror!)
echo "📝 Step 3: Creating catalog update script..."
cat > ${TEST_DIR}/update-catalog.sh <<'EOF2'
#!/bin/bash
echo "🔄 Updating catalog v1.0 (simulating Red Hat catalog update)..."
cd /tmp/fake-catalog
cp configs/catalog-v2.json configs/catalog-active.json

# Step 1: Build v1.0 with NEW content
podman build -t test-catalog:v1.0 .

# Step 2: Push as v1.0 (creates NEW_DIGEST on Quay)
CATALOG_IMAGE_V1="quay.io/${QUAY_USERNAME}/test-catalog:v1.0"
podman tag test-catalog:v1.0 ${CATALOG_IMAGE_V1}
podman push ${CATALOG_IMAGE_V1}

DIGEST_V2=$(skopeo inspect docker://${CATALOG_IMAGE_V1} | jq -r .Digest)
echo "✅ Updated catalog with NEW digest: ${DIGEST_V2}"
echo ""
echo "⚠️  The v1.0 tag now points to NEW_DIGEST (different from collection phase)"
echo "   OLD_DIGEST is still reachable on Quay (under the v1.0-old tag)"
echo "   The fix ensures oc-mirror uses the pinned OLD_DIGEST, not the moved tag."
EOF2

sed -i "s/\${QUAY_USERNAME}/${QUAY_USERNAME}/g" ${TEST_DIR}/update-catalog.sh
chmod +x ${TEST_DIR}/update-catalog.sh
echo "   ✅ Created ${TEST_DIR}/update-catalog.sh"
echo ""

# Step 4: Run oc-mirror
echo "▶️  Step 4: Starting oc-mirror (will pause after collection)"
echo ""
echo "   When it pauses, open another terminal and run:"
echo "   ${TEST_DIR}/update-catalog.sh"
echo ""
cd ~/go/src/github.com/aguidirh/oc-mirror
./bin/oc-mirror --v2 -c ${TEST_DIR}/imageset-config.yaml file://${TEST_DIR}
echo ""

# Step 5: Verification instructions
echo "🔍 Step 5: Verification steps"
echo ""
echo "After oc-mirror completes, verify the bug:"
echo ""
echo "  # Check working-dir (will have OLD digest path)"
echo "  ls ${TEST_DIR}/working-dir/operator-catalogs/test-catalog/"
echo ""
echo "  # Extract tarball and check (will have NEW digest)"
echo "  cd ${TEST_DIR}"
echo "  mkdir -p extracted"
echo "  tar -xf mirror_*.tar -C extracted"
echo "  ls extracted/working-dir/operator-catalogs/test-catalog/"
echo ""
echo "  # Compare digests - they should be DIFFERENT!"
echo ""

# Step 6: DiskToMirror test
echo "🎯 Step 6: Test DiskToMirror failure"
echo ""
echo "  # Remove credentials to simulate air-gap"
echo "  mv ~/.docker/config.json ~/.docker/config.json.bak"
echo ""
echo "  # Try diskToMirror - it will try to contact quay.io and fail"
echo "  ./bin/oc-mirror --v2 -c ${TEST_DIR}/imageset-config.yaml \\"
echo "    --from file://${TEST_DIR} \\"
echo "    docker://localhost:5000"
echo ""
echo "  # Expected error: 'unable to retrieve auth token'"
echo ""

echo "✅ Setup complete!"
echo ""
echo "Summary:"
echo "  - Initial digest: ${DIGEST_V1}"
echo "  - Catalog image: ${CATALOG_IMAGE}"
echo "  - Test directory: ${TEST_DIR}"
echo "  - Update script: ${TEST_DIR}/update-catalog.sh"
echo ""
echo "Follow the instructions above to reproduce the bug!"

5. cleanup.sh

#!/bin/bash

# Cleanup script for OCPBUGS-81712 bug reproduction

QUAY_USERNAME="${QUAY_USERNAME:-YOUR_USERNAME_HERE}"

echo "🧹 Cleaning up bug reproduction environment..."
echo ""

# 1. Clean test directory
echo "1️⃣  Removing /tmp/bug-reproduction/..."
rm -rf /tmp/bug-reproduction/
echo "   ✅ Done"

# 2. Clean oc-mirror cache
if [ -d "$HOME/.oc-mirror" ]; then
    echo "2️⃣  Removing $HOME/.oc-mirror cache..."
    rm -rf "$HOME/.oc-mirror"
    echo "   ✅ Done"
else
    echo "2️⃣  No cache at $HOME/.oc-mirror, skipping"
fi

# 3. Clean active catalog config (gets modified during test)
if [ -f "/tmp/fake-catalog/configs/catalog-active.json" ]; then
    echo "3️⃣  Removing /tmp/fake-catalog/configs/catalog-active.json..."
    rm -f /tmp/fake-catalog/configs/catalog-active.json
    echo "   ✅ Done"
else
    echo "3️⃣  No active catalog config, skipping"
fi

# 4. Optional: Clean local podman images
echo ""
echo "4️⃣  Optional: Clean local podman images?"
echo "   This will remove test-catalog:v1.0 from local storage"
read -p "   Clean local images? (y/N): " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
    podman rmi localhost/test-catalog:v1.0 2>/dev/null || true
    if [ "$QUAY_USERNAME" != "YOUR_USERNAME_HERE" ]; then
        podman rmi quay.io/${QUAY_USERNAME}/test-catalog:v1.0 2>/dev/null || true
    fi
    echo "   ✅ Done"
else
    echo "   ⏭️  Skipped"
fi

echo ""
echo "✅ Cleanup complete! Ready to run reproduce-bug.sh again."
echo ""
echo "Note: The catalog on Quay (quay.io/${QUAY_USERNAME}/test-catalog:v1.0)"
echo "      will be reset to v1.0 when you run reproduce-bug.sh again."

6. README.md

# OCPBUGS-81712 Bug Reproduction Kit

This directory contains everything you need to reproduce the race condition bug.

## Quick Start

### 1. Set your Quay username
export QUAY_USERNAME=your-quay-username

### 2. Create the directory structure
mkdir -p /tmp/fake-catalog/configs
cd /tmp/fake-catalog

### 3. Create all files
Create the following files in `/tmp/fake-catalog/`:
- `Dockerfile`
- `configs/catalog.json`
- `configs/catalog-v2.json`
- `reproduce-bug.sh`
- `cleanup.sh`
- `README.md`

(All file contents are provided in the PR description above)

### 4. Run the reproduction script
chmod +x reproduce-bug.sh cleanup.sh
./reproduce-bug.sh

### 5. When oc-mirror pauses
Open a second terminal and run:
/tmp/bug-reproduction/update-catalog.sh

### 6. Continue oc-mirror
Press Enter in the first terminal.

### 7. Verify the fix works
The diskToMirror workflow should succeed without internet access, and no attempt is made to contact the original registry.

### 8. Clean up for re-testing
./cleanup.sh

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

* **Bug Fixes**
  * Improved catalog resolution: fall back to matching original tag references when digests are present.
  * Catalog handling now uses an origin-digest value (instead of the previous digest field) to build pinned references and component names, ensuring consistent digest-pinning for mirror-to-disk and mirror-to-mirror flows.
  * Abort and skip operators when a required catalog digest cannot be resolved.

* **Tests**
  * Added and updated tests covering digest-keyed catalog maps and end-to-end pinning; expectations use digest-pinned image references.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 16, 2026
@openshift-ci-robot openshift-ci-robot added jira/severity-critical Referenced Jira bug's severity is critical for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Apr 16, 2026
@openshift-ci-robot

@aguidirh: This pull request references Jira Issue OCPBUGS-81712, which is invalid:

  • expected the bug to target the "5.0.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

[Quoted copy of the PR description above, omitted here as a duplicate.]
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai Bot commented Apr 16, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Catalog handling now uses resolved catalog digests for non‑OCI catalogs: collector writes digest‑keyed entries and rewrites catalog refs to digest‑pinned form; pinning falls back to a linear search by original tag when direct digest-key lookup fails. Tests updated; one file contains commented test pauses.

Changes

Cohort / File(s) Summary
Test Infrastructure
internal/pkg/cli/executor.go
Added commented-out test-only pause statements (fmt.Println / fmt.Scanln) after RebuildCatalogs; no active runtime change.
Catalog Pinning Logic & Tests
internal/pkg/config/pin_catalogs.go, internal/pkg/config/pin_catalogs_test.go
Added findCatalogByOriginalRef fallback search by OperatorFilter.Catalog; pinSingleCatalogDigest falls back to that when direct map lookup by ReferenceWithTransport fails; switched usage from Digest to OriginDigest; tests updated and new cases added for digest-keyed catalogMap scenarios.
Operator Collection Logic
internal/pkg/operator/filtered_collector.go
For non‑OCI transports in mirror-to-disk/mirror-to-mirror, CatalogToFBCMap keys are built from Docker protocol + resolved catalog digest; catalogImage rewritten to digest-pinned ref (image.WithDigest(...)); component naming and targetTag handling updated to use OriginDigest and preserve original tag when appropriate.
Operator Collection Tests
internal/pkg/operator/filtered_collector_test.go
Updated expected Source/Origin values to digest-pinned references (replacing tag-based refs) across M2D and M2M tests to match new digest-pinning behavior.
API/Internal Types
internal/pkg/api/v2alpha1/type_internal.go
Renamed/repurposed CatalogFilterResult.Digest to CatalogFilterResult.OriginDigest, changing the shape of catalog filter results consumed/produced.

Sequence Diagram(s)

sequenceDiagram
    participant Collector as Collector\n(internal/pkg/operator)
    participant Pinning as PinningLogic\n(internal/pkg/config)
    participant CatalogMap as CatalogMap\n(in-memory map)
    participant Mirror as Mirroring\n(disk/mirror)

    Collector->>Pinning: request catalog resolution (imgSpec)
    alt direct digest-key exists
        Pinning->>CatalogMap: lookup by ReferenceWithTransport (digest key)
        CatalogMap-->>Pinning: CatalogFilterResult (OriginDigest)
    else fallback to original tag
        Pinning->>CatalogMap: linear search by OperatorFilter.Catalog (original tag)
        CatalogMap-->>Pinning: CatalogFilterResult (OriginDigest) or not found
    end
    Pinning-->>Collector: resolved digest-pinned catalog ref or not found
    Collector->>Mirror: mirror using digest-pinned catalogImage (set targetTag from original tag if unset)
    Mirror-->>Collector: mirror result / error

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage — ⚠️ Warning: Docstring coverage is 35.29%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
  • Stable And Deterministic Test Names — ❓ Inconclusive: Test files referenced in the PR summary (pin_catalogs_test.go, filtered_collector_test.go) could not be located in the repository to verify Ginkgo test name stability. Resolution: provide test file content or confirm repository state to verify that test names follow Ginkgo guidelines and avoid dynamic values.
✅ Passed checks (10 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Test Structure And Quality ✅ Passed Repository uses standard Go testing framework, not Ginkgo. PR tests follow the established codebase pattern with proper structure and cleanup.
Microshift Test Compatibility ✅ Passed This pull request does not add any new Ginkgo e2e tests. All test modifications are limited to standard Go unit tests using the testing package.
Single Node Openshift (Sno) Test Compatibility ✅ Passed This pull request does not add any Ginkgo e2e tests. The repository is the oc-mirror tool, and modified test files use standard Go testing framework, not Ginkgo patterns.
Topology-Aware Scheduling Compatibility ✅ Passed Changes are exclusively to internal utility code for operator catalog image mirroring and digest pinning, with no scheduling constraints.
Ote Binary Stdout Contract ✅ Passed Commented-out fmt.Println and fmt.Scanln in RunMirrorToDisk method are not process-level code and do not execute.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed This PR does not introduce any new Ginkgo e2e tests. The modified test files are standard Go unit tests using the testing.T interface with testify assertions and mock objects, not Ginkgo-style tests.
Title check ✅ Passed The title accurately describes the main change: fixing a race condition in operator catalog mirroring by pinning catalogs to resolved digests during collection.

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci Bot requested review from adolfo-ab and r4f4 April 16, 2026 17:24
@openshift-ci

openshift-ci Bot commented Apr 16, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aguidirh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 16, 2026
@aguidirh
Contributor Author

This PR is a work in progress. There are still a lot of tests to be performed.

Reminder for myself: test the OCI scenario, the full: true scenario, and the mirrorToMirror scenario.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
internal/pkg/cli/executor.go (1)

879-881: Remove the test-only pause block before merge.

Even commented out, this is dead debug code in the main M2D flow and is easy to accidentally re-enable in a follow-up/cherry-pick.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/pkg/cli/executor.go` around lines 879 - 881, Remove the dead
test-only pause block that starts with the TODO comment and the commented
println/Scanln lines (the lines containing "PAUSED - Update catalog, press
Enter..." and the preceding "//TODO REMOVE ME, THIS IS ONLY FOR TEST
PURPOSES."). Delete those three commented lines from
internal/pkg/cli/executor.go so no debug pause code remains in the main M2D
flow.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/pkg/operator/filtered_collector.go`:
- Around line 89-94: The map key is being set to consts.DockerProtocol +
image.WithDigest(..., result.Digest) which uses the rebuilt catalog digest and
causes lookup misses; change the assignment of mapKey in filtered_collector.go
to use the source/catalog digest (the same value used as Origin in
collectOperator and ExecutorSchema.RebuildCatalogs) instead of result.Digest,
preserving the transport prefix (e.g., consts.DockerProtocol +
image.WithDigest(catalogDigest, ...)) so CatalogToFBCMap is indexed by the
original catalog digest that filterOperator/collectOperator expect.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: de6c182e-9972-4c7e-bb09-f7c24a3b6d45

📥 Commits

Reviewing files that changed from the base of the PR and between ca5eebd and cb58054.

📒 Files selected for processing (5)
  • internal/pkg/cli/executor.go
  • internal/pkg/config/pin_catalogs.go
  • internal/pkg/config/pin_catalogs_test.go
  • internal/pkg/operator/filtered_collector.go
  • internal/pkg/operator/filtered_collector_test.go

Comment thread internal/pkg/operator/filtered_collector.go Outdated

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
internal/pkg/operator/filtered_collector.go (1)

89-100: Consider returning the source catalog digest from collectOperator to avoid duplicate registry lookups.

For M2D/M2M non-OCI cases, getCatalogDigest is called twice: once here at line 92, and again within collectOperator at line 252. Both calls make the same registry request. The duplicate call is necessary here because result.Digest from collectOperator may contain the rebuilt/filtered catalog digest rather than the source digest.

This is functionally correct but inefficient. Consider extending CatalogFilterResult to include a SourceDigest field, populated by collectOperator, so it can be reused here without a second registry call.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/pkg/operator/filtered_collector.go` around lines 89 - 100,
collectOperator currently performs the registry lookup but only returns
result.Digest (which may be the rebuilt digest), causing filtered_collector.go
to call getCatalogDigest again; extend CatalogFilterResult with a new
SourceDigest field, populate it inside collectOperator with the original/source
catalog digest when you perform the initial registry lookup, and then in
filtered_collector.go (where mapKey is built using getCatalogDigest) use
result.SourceDigest instead of calling getCatalogDigest; keep a safe fallback to
call getCatalogDigest only if result.SourceDigest is empty to avoid breaking
behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 4afdd5a5-2d8d-45e1-92ac-a630a499c8f4

📥 Commits

Reviewing files that changed from the base of the PR and between 6625780 and 3b3aff1.

📒 Files selected for processing (1)
  • internal/pkg/operator/filtered_collector.go


@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
internal/pkg/operator/filtered_collector.go (1)

89-100: Duplicate getCatalogDigest call creates minor performance overhead.

getCatalogDigest is called here (line 92) and again inside collectOperator (line 252). For M2D/M2M with non-OCI catalogs, this results in two registry calls per catalog when resolving tag-based references.

Consider refactoring collectOperator to return the resolved catalogDigest alongside result, allowing the map key construction to reuse it:

// Example: modify collectOperator signature
result, catalogDigest, err := o.collectOperator(ctx, op, relatedImages, copyImageSchemaMap)
// Then use catalogDigest for mapKey construction

This would eliminate the duplicate network call and any theoretical TOCTOU window.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/pkg/operator/filtered_collector.go` around lines 89 - 100, The code
currently calls getCatalogDigest twice for non-OCI catalogs (once here and again
inside collectOperator), causing duplicate registry calls; refactor
collectOperator to return the resolved catalogDigest along with its existing
result so the caller can reuse it when building mapKey instead of invoking
getCatalogDigest again — update collectOperator signature to return (result,
catalogDigest, error) and change the call site to use the returned catalogDigest
when constructing mapKey (consts.DockerProtocol + image.WithDigest(imgSpec.Name,
catalogDigest)), removing the extra getCatalogDigest invocation and keeping
existing behavior for other code paths.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: f532bb76-49a7-48c0-9452-fa753b51c1f1

📥 Commits

Reviewing files that changed from the base of the PR and between 3b3aff1 and 9a78022.

📒 Files selected for processing (4)
  • internal/pkg/api/v2alpha1/type_internal.go
  • internal/pkg/config/pin_catalogs.go
  • internal/pkg/config/pin_catalogs_test.go
  • internal/pkg/operator/filtered_collector.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/pkg/config/pin_catalogs_test.go

@aguidirh
Contributor Author

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Apr 21, 2026
@openshift-ci-robot

@aguidirh: This pull request references Jira Issue OCPBUGS-81712, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

In response to this:

/jira refresh


@aguidirh aguidirh changed the title WIP: OCPBUGS-81712: fixes race condition when mirroring operator catalogs OCPBUGS-81712: fixes race condition when mirroring operator catalogs Apr 22, 2026
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 22, 2026
Contributor

@adolfo-ab adolfo-ab left a comment


I think this PR introduces a regression here:
https://github.com/openshift/oc-mirror/blob/main/internal/pkg/clusterresources/clusterresources.go#L214

When a user provides a CatalogSource template, oc-mirror compares copyImage.Origin against op.Catalog to find the template. If the user was passing the catalog by tag in the ISC, getCSTemplate cannot find it (since .Origin is by digest now), and we ignore the template. I've reproduced this locally by running m2m with a dummy CS template with a custom label, and the resulting CS in the working-dir doesn't have the label.

Comment thread internal/pkg/config/pin_catalogs.go Outdated
log.Warn("Catalog %s not found in CatalogToFBCMap, skipping pin", catalog)
return "", nil
// OCPBUGS-81712: In M2D/M2M, the map key uses digest format.
// Fall back to searching by matching the original catalog reference.
Contributor


Would it make sense to add a debug log here mentioning that the digest-keyed lookup didn't find the catalog and that we're falling back to searching by the original catalog reference?

@aguidirh
Contributor Author

I think this PR introduces a regression here: https://github.com/openshift/oc-mirror/blob/main/internal/pkg/clusterresources/clusterresources.go#L214

When a user provides a CatalogSource template, oc-mirror compares copyImage.Origin against op.Catalog to find the template. If the user was passing the catalog by tag in the ISC, getCSTemplate cannot find it (since .Origin is by digest now), and we ignore the template. I've reproduced this locally by running m2m with a dummy CS template with a custom label, and the resulting CS in the working-dir doesn't have the label.

This is fixed in the last commit. But to be honest, I don't like where we are going here; let me explain why:

  • We are now pinning the operator catalog image by digest, but o.Config.Mirror.Operators still holds tag-based references, so every place it is consumed needs this kind of handling (R251-R269 and R89-R102).

So I was wondering if it would be better to also update o.Config.Mirror.Operators (in memory only, without touching the original file) to avoid all the special handling mentioned above.

I have not evaluated the impact of changing o.Config.Mirror.Operators yet.

I would like to hear your perspective about it @r4f4, @adolfo-ab and @dorzel

@aguidirh
Contributor Author

/test unit

@aguidirh
Contributor Author

/hold

Because there are still some concerns to be addressed on this PR.

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 24, 2026
@adolfo-ab
Contributor

Regarding your comment @aguidirh: I think unifying by digest is the right move, but I'm not sure about the impact or whether it would delay this fix by a lot. Do we want this in 4.22?

@aguidirh aguidirh changed the title OCPBUGS-81712: fixes race condition when mirroring operator catalogs WIP: OCPBUGS-81712: fixes race condition when mirroring operator catalogs May 7, 2026
@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 7, 2026
@aguidirh aguidirh force-pushed the fix/OCPBUGS-81712 branch from 68a3b43 to 46c16c5 Compare May 7, 2026 14:52
This architectural refactoring moves catalog digest resolution to the
beginning of mirrorToDisk and mirrorToMirror workflows instead of
resolving digests throughout the code.

Key changes:
- Added ResolveCatalogDigests() function to resolve all catalog tags to
  digests at the start of M2D/M2M workflows
- Added resolveSingleCatalogDigest() helper to reduce cyclomatic
  complexity (cyclop linter)
- Added pinOperatorCatalogs() helper method in executor.go to eliminate
  code duplication between M2D and M2M
- Modified executor.go to call pinOperatorCatalogs() before operator
  collection in both workflows
- Simplified filtered_collector.go by removing digest resolution logic
  (catalogs are now already pinned to digests)
- Updated M2D and M2M tests to call ResolveCatalogDigests() before
  OperatorImageCollector() to match the real workflow
- Removed obsolete tests (TestPinSingleCatalog_M2DMapKeyFormat and
  TestPinCatalogDigests_M2DMapKeyFormat) that tested the old approach
- Fixed pinned ISC/DISC generation to use digest-based catalog keys in
  CatalogToFBCMap (CLID-513)

This ensures consistent catalog references throughout the workflow and
prevents race conditions when multiple processes access the same catalog.

Signed-off-by: Alex Guidi <aguidi@redhat.com>
@aguidirh aguidirh force-pushed the fix/OCPBUGS-81712 branch from 46c16c5 to d98216d Compare May 7, 2026 14:57
@openshift-ci

openshift-ci Bot commented May 7, 2026

@aguidirh: all tests passed!

Full PR test history. Your PR dashboard.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.


Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. jira/severity-critical Referenced Jira bug's severity is critical for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.
