
finer-grained retry for docker pull images#6785

Draft
smola wants to merge 2 commits into main from smola/pull-images-ci

Conversation

@smola
Member

@smola smola commented Apr 22, 2026

Motivation

Pulling images is flaky, especially from Microsoft's MCR, which sometimes blocks our pulls. When it does, it keeps blocking them across all 3 attempts with the current 10s delay.

Changes

  • Change the script that lists images so that it pulls the images itself.
  • Pull up to 4 images in parallel per registry.
  • If one image pull fails, the other pulls continue uninterrupted, and the failed pull is attempted up to 3 times, with an escalating timeout with jitter (5, 10, 20 seconds). Hopefully this is gentle enough to keep Microsoft's MCR from blocking our pulls repeatedly.
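
For reviewers who want a mental model of the behaviour listed above, here is a minimal sketch. It is not the code in `utils/scripts/pull-images.py`. The names `registry_of`, `pull_with_retry` and `pull_all` are hypothetical, the jitter range (up to +50% of the base delay) is an assumption, and so is reading "up to 3 times" as one initial attempt followed by up to 3 retries.

```python
import random
import subprocess
import time
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL_PER_REGISTRY = 4
BASE_DELAYS = (5, 10, 20)  # seconds to wait before retry 1, 2 and 3


def registry_of(image: str) -> str:
    """Best-effort guess of the registry host in an image reference."""
    first, _, rest = image.partition("/")
    if rest and ("." in first or ":" in first or first == "localhost"):
        return first
    return "docker.io"  # no explicit host, so assume Docker Hub


def docker_pull(image: str) -> bool:
    """Run `docker pull` once and report whether it succeeded."""
    return subprocess.run(["docker", "pull", image], check=False).returncode == 0


def pull_with_retry(image, pull=docker_pull, sleep=time.sleep, delays=BASE_DELAYS) -> bool:
    """Try once, then retry after each escalating, jittered delay."""
    if pull(image):
        return True
    for delay in delays:
        # Assumed jitter: add up to 50% so parallel retries do not line up.
        sleep(delay + random.uniform(0, delay / 2))
        if pull(image):
            return True
    return False


def pull_all(images, pull=docker_pull, sleep=time.sleep) -> dict:
    """Pull every image, with one thread pool of 4 workers per registry."""
    by_registry = defaultdict(list)
    for image in images:
        by_registry[registry_of(image)].append(image)

    pools, futures = [], {}
    try:
        for registry_images in by_registry.values():
            pool = ThreadPoolExecutor(max_workers=MAX_PARALLEL_PER_REGISTRY)
            pools.append(pool)
            for image in registry_images:
                futures[image] = pool.submit(pull_with_retry, image, pull, sleep)
        # A failure of one image only affects its own entry in the result.
        return {image: future.result() for image, future in futures.items()}
    finally:
        for pool in pools:
            pool.shutdown()
```

Because each image retries inside its own worker, a blocked MCR pull backs off on its own while pulls from other registries keep going. The `pull` and `sleep` parameters exist only so the sketch can be exercised without Docker or real waiting.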

Workflow

  1. ⚠️ Create your PR as draft ⚠️
  2. Work on your PR until the CI passes
  3. Mark it as ready for review
    • Test logic is modified? -> Get a review from RFC owner.
    • Framework is modified, or non obvious usage of it -> get a review from R&P team

🚀 Once your PR is reviewed and the CI green, you can merge it!

🛟 #apm-shared-testing 🛟

Reviewer checklist

  • Anything but tests/ or manifests/ is modified ? I have the approval from R&P team
  • A docker base image is modified?
    • the relevant build-XXX-image label is present
  • A scenario is added, removed or renamed?

@github-actions
Contributor

github-actions Bot commented Apr 22, 2026

CODEOWNERS have been resolved as:

utils/scripts/pull-images.py                                            @DataDog/system-tests-core
.github/actions/pull_images/action.yml                                  @DataDog/system-tests-core
utils/scripts/get-image-list.py                                         @DataDog/system-tests-core

@smola smola force-pushed the smola/pull-images-ci branch from 9774d17 to 81eefeb Compare April 22, 2026 10:43
@smola smola changed the title from "Smola/pull images ci" to "finer-grained retry for docker pull images" Apr 22, 2026
@smola smola force-pushed the smola/pull-images-ci branch from 81eefeb to 98d7729 Compare April 22, 2026 12:09
@datadog-prod-us1-5

datadog-prod-us1-5 Bot commented Apr 22, 2026

Tests


⚠️ Warnings

🧪 12 Tests failed

tests.integrations.test_db_integrations_sql.Test_MsSql.test_db_instance[spring-boot] from system_tests_suite
ValueError: Span is not found for http://localhost:7777/db?service=mssql&operation=select

self = <tests.integrations.test_db_integrations_sql.Test_MsSql object at 0x7f2ddd8f94c0>
excluded_operations = ()

    def test_db_instance(self, excluded_operations: tuple[str, ...] = ()):
        """The name of the database being connected to. Database instance name. Formerly db.name"""
        db_container = context.get_container_by_dd_integration_name(self.db_service)
    
>       for _, span_meta in self.get_spans_meta(excluded_operations=excluded_operations):
...
tests.integrations.test_db_integrations_sql.Test_MsSql.test_db_operation[spring-boot] from system_tests_suite
ValueError: Span is not found for http://localhost:7777/db?service=mssql&operation=select

self = <tests.integrations.test_db_integrations_sql.Test_MsSql object at 0x7f2ddd8f8f50>
excluded_operations = ()

    def test_db_operation(self, excluded_operations: tuple[str, ...] = ()):
        """The name of the operation being executed"""
    
>       for db_operation, span_meta in self.get_spans_meta(excluded_operations=excluded_operations + ("select_error",)):

...
tests.integrations.test_db_integrations_sql.Test_MsSql.test_db_password[spring-boot] from system_tests_suite
ValueError: Span is not found for http://localhost:7777/db?service=mssql&operation=select

self = <tests.integrations.test_db_integrations_sql.Test_MsSql object at 0x7f2ddd8f9310>
excluded_operations = ()

    def test_db_password(self, excluded_operations: tuple[str, ...] = ()):
        """The database password should not show in the traces"""
        db_container = context.get_container_by_dd_integration_name(self.db_service)
    
>       for db_operation, span_meta in self.get_spans_meta(excluded_operations=excluded_operations):
...

ℹ️ Info

No other issues found (see more)

❄️ No new flaky tests detected


This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 87a40be

