mariadb_copy: replace sleep workarounds with persistent pod retry pattern #1361
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.
Details
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
…tern

Replace the temporary mariadb_client_timeout sleep and bgp wait pauses with a proper retry loop that polls the podified MariaDB via the already-running mariadb-copy-data pod, eliminating the race condition where commands were executed before the network was fully programmed.

Also align pre_checks.bash with the docs by using oc rsh mariadb-copy-data instead of the ephemeral mariadb-client pod, and add a corresponding wait step to the docs procedure.

Resolves: OSPRH-27386
Made-with: Cursor
4755dc8 to 7982e13
|
/lgtm |
|
This PR is stale because it has been open for over 15 days with no activity. |
|
Still busy testing this change in downstream |
klgill
left a comment
I have one minor edit and a question.
| done | ||
| ---- | ||
|
|
||
| . Wait for the `mariadb-copy-data` pod to be able to reach the podified MariaDB: |
| . Wait for the `mariadb-copy-data` pod to be able to reach the podified MariaDB: | |
| . Wait for the `mariadb-copy-data` pod to reach the podified MariaDB: |
| [NOTE] | ||
| ==== | ||
| For BGP-enabled environments, this command might take a few moments to succeed while BGP routes are advertised and propagated through the network. The `mariadb-copy-data` pod needs to receive the route to the podified MariaDB IP address through BGP before it can establish a connection. If the command fails, wait a few seconds and retry. The connection should succeed once the BGP route advertisement is complete. | ||
|
|
What should the customer do if the command fails on the next try?
Summary
- Replace the `mariadb_client_timeout` sleep and `wait bgp` pause workarounds in `mariadb_copy` with a retry loop that polls the podified MariaDB via the already-running `mariadb-copy-data` pod, eliminating the race condition where commands were executed before the network was fully programmed (same approach used by `get_services_configuration`).
- Align `pre_checks.bash` with the docs by replacing the ephemeral `oc run mariadb-client --rm ... sleep && mysql` pattern with `oc rsh mariadb-copy-data`, restoring docs-tests parity.
- Remove the `mariadb_client_timeout` variable from `common_defaults` as it has no remaining references.

Resolves: OSPRH-27386
Made with Cursor
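For readers who want to see the shape of the retry pattern described in the summary, here is a minimal bash sketch. It is not the code from this PR: the helper name `retry_until_ok`, the attempt count, the delay, and the environment variable names in the commented usage line are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from the PR):
#   retry_until_ok MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND repeatedly until it exits 0 or MAX_ATTEMPTS is reached.
retry_until_ok() {
    local max_attempts=$1 delay=$2
    shift 2
    local attempt=1
    until "$@"; do
        if (( attempt >= max_attempts )); then
            echo "giving up after ${attempt} attempts: $*" >&2
            return 1
        fi
        echo "attempt ${attempt}/${max_attempts} failed; retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$(( attempt + 1 ))
    done
}

# Possible usage against the already-running pod. The variable names and the
# query are assumptions; adjust them to the actual environment:
# retry_until_ok 30 5 oc rsh mariadb-copy-data \
#     mysql -h "$PODIFIED_MARIADB_IP" -uroot -p"$PODIFIED_DB_ROOT_PASSWORD" -e 'SELECT 1;'
```

Bounding the number of attempts means a network that never converges produces a clear failure instead of an indefinite hang, which is the main practical difference from a fixed sleep.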