Add exponential backoff when scm update is failing#107
Conversation
|
|
||
| def update | ||
| @cmd.pull.stdout | ||
| retry_count = 0 |
There was a problem hiding this comment.
I don't inherently have a problem with the idea, but I have a few concerns with the implementation:
-
Most software that acts as a client does some sort of exponential backoff, so us doing it is fine, but also, some of the clients we're calling here probably also do this internally or might in the future, and double-exponentially-backing-off is likely to paper over problems and thus hide real infrastructure issues from those people who care about their stuff working. So, we should definitely make this configurable, with the default being either no retries or a single retry. I recommend the config keys being
retriesandretry_backoff_intervalor some such. It also needs to log when this is happening so people know. -
This copy-pasta is sad-making, we can do better. One, I suggest a function in the super class that does something like:
def retriable(name, max_retries, backoff)
retry_count
begin
yield
rescue StandardError => e
@logger.error("Something went wrong runing #{name}!")
@logger.error(e)
if retry_count > max_retries
raise
end
retry_count += 1
@logger.warn("Retrying #{name}: retry #{retry_count}")
sleep(2 ** retry_backoff)
retry
end
endAnd then you can use that from wherever like:
retriable("update repo", desired_retries, desired_backoff) do
@cmd.pull.stdout
end- Further, this
updateand HG'supdateare the same, and simple, so I suggest we make the super class have this, and we can override it from SVN's class.
If update fails, the whole run fails. Try 5 times with delays 2 ^ retry_attempt between tries.