mariadb_copy: replace sleep workarounds with persistent pod retry pattern #1361

Open
ciecierski wants to merge 1 commit into openstack-k8s-operators:main from ciecierski:osprh-27386-mariadb-copy-persistent-pod

@ciecierski ciecierski commented Apr 10, 2026

Summary

  • Replace the temporary mariadb_client_timeout sleep and wait bgp pause workarounds in mariadb_copy with a retry loop that polls the podified MariaDB through the already-running mariadb-copy-data pod. This eliminates the race condition in which commands ran before the network was fully programmed (get_services_configuration uses the same approach).
  • Align pre_checks.bash with the docs by replacing the ephemeral oc run mariadb-client --rm ... sleep && mysql pattern with oc rsh mariadb-copy-data, restoring docs-tests parity.
  • Add a corresponding "wait for podified MariaDB reachable" step to the docs procedure with a BGP/IPv6 NOTE.
  • Remove the mariadb_client_timeout variable from common_defaults as it has no remaining references.
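
For illustration, the retry pattern described in the bullets above could look roughly like the following. This is a minimal sketch, not the code from this PR: the helper names `retry_until` and `check_podified_mariadb`, the attempt count, and the delay are hypothetical, and `PODIFIED_MARIADB_IP` and `PODIFIED_DB_ROOT_PASSWORD` are assumed to be set by the caller.

```shell
#!/usr/bin/env bash
# Sketch of a bounded retry loop (hypothetical helper, not the PR's code).

# retry_until MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it exits 0 or MAX_ATTEMPTS is exhausted.
# Returns 0 on the first success, 1 if every attempt failed.
retry_until() {
    local max_attempts=$1 delay=$2
    shift 2
    local attempt
    for ((attempt = 1; attempt <= max_attempts; attempt++)); do
        if "$@"; then
            return 0
        fi
        echo "attempt ${attempt}/${max_attempts} failed: $*" >&2
        # Do not sleep after the final failed attempt.
        if ((attempt < max_attempts)); then
            sleep "$delay"
        fi
    done
    return 1
}

# Hypothetical probe: run a trivial query from the already-running
# mariadb-copy-data pod instead of starting an ephemeral client pod.
# PODIFIED_MARIADB_IP and PODIFIED_DB_ROOT_PASSWORD are assumptions.
check_podified_mariadb() {
    oc rsh mariadb-copy-data mysql -h "$PODIFIED_MARIADB_IP" -uroot \
        -p"$PODIFIED_DB_ROOT_PASSWORD" -e 'SELECT 1;' >/dev/null
}

# Example invocation (commented out so the sketch can be sourced
# without a cluster): up to 30 attempts, 5 seconds apart.
# retry_until 30 5 check_podified_mariadb
```

Compared with a fixed sleep, a loop of this shape returns as soon as the database is reachable and fails with a clear exit status when it never becomes reachable.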

Resolves: OSPRH-27386

Made with Cursor

openshift-ci Bot commented Apr 10, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign holser for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

…tern

Replace the temporary mariadb_client_timeout sleep and bgp wait pauses
with a proper retry loop that polls the podified MariaDB via the
already-running mariadb-copy-data pod, eliminating the race condition
where commands were executed before the network was fully programmed.
Also align pre_checks.bash with the docs by using oc rsh mariadb-copy-data
instead of the ephemeral mariadb-client pod, and add a corresponding
wait step to the docs procedure.

Resolves: OSPRH-27386
Made-with: Cursor
@ciecierski ciecierski force-pushed the osprh-27386-mariadb-copy-persistent-pod branch from 4755dc8 to 7982e13 Compare April 10, 2026 12:19
Comment thread tests/roles/mariadb_copy/tasks/main.yaml

jistr commented Apr 16, 2026

/lgtm

github-actions Bot commented May 2, 2026

This PR is stale because it has been open for over 15 days with no activity.
Remove the stale label or add a comment, or this PR will be closed in 7 days.

@github-actions github-actions Bot added the Stale label May 2, 2026
@jistr jistr removed the Stale label May 4, 2026
@ciecierski
Contributor Author

Still busy testing this change in downstream

@klgill klgill left a comment

I have one minor edit and a question.

done
----

. Wait for the `mariadb-copy-data` pod to be able to reach the podified MariaDB:

Suggested change
- . Wait for the `mariadb-copy-data` pod to be able to reach the podified MariaDB:
+ . Wait for the `mariadb-copy-data` pod to reach the podified MariaDB:

[NOTE]
====
For BGP-enabled environments, this command might take a few moments to succeed while BGP routes are advertised and propagated through the network. The `mariadb-copy-data` pod needs to receive the route to the podified MariaDB IP address through BGP before it can establish a connection. If the command fails, wait a few seconds and retry. The connection should succeed once the BGP route advertisement is complete.

What should the customer do if the command fails on the next try?

4 participants