Add a back-off for Muzzle Version Range#11433
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 35b427b2ef
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
This comment has been minimized.
This comment has been minimized.
35b427b to
6585632
Compare
AlexeyKuznetsov-DD
left a comment
There was a problem hiding this comment.
LGTM, but WDYT about adding detection of 429 errors and log them for more visibility?
Something like this:
private fun Throwable.isRateLimited(): Boolean {
var t: Throwable? = this
while (t != null) {
val msg = t.message ?: ""
if ("429" in msg || "Too Many Requests" in msg) return true
t = t.cause
}
return false
}
And also WDYT about fallback to real Maven Central if empty result from proxy received, Claude generated something like this:
/**
* Resolves the version range for a given MuzzleDirective using the provided RepositorySystem and RepositorySystemSession.
* Retries up to 4 times with a 5 s linear backoff on both empty results and HTTP 429 responses.
* When a 429 carries a Retry-After header, that value overrides the calculated wait.
* If all attempts against the configured repos return empty (not rate-limited) and
* MAVEN_REPOSITORY_PROXY is set, one additional attempt is made directly against Maven Central
* to handle cases where the proxy has a cache miss but Central has the artifact.
*/
fun resolveVersionRange(
muzzleDirective: MuzzleDirective,
system: RepositorySystem,
session: RepositorySystemSession,
defaultRepos: List<RemoteRepository> = defaultMuzzleRepos()
): VersionRangeResult {
val directiveArtifact: Artifact = DefaultArtifact(
muzzleDirective.group,
muzzleDirective.module,
muzzleDirective.classifier ?: "",
"jar",
muzzleDirective.versions
)
val repos = muzzleDirective.getRepositories(defaultRepos)
var rateLimited = false
var lastRateLimitException: Exception? = null
for (attempt in 0..3) {
val rangeRequest = VersionRangeRequest().apply {
repositories = repos
artifact = directiveArtifact
}
try {
val range = system.resolveVersionRange(session, rangeRequest)
if (range.lowestVersion != null && range.highestVersion != null) return range
// empty result from proxy — wait before retrying, unless this was the last attempt
if (attempt < 3) Thread.sleep((attempt + 1) * 5_000L)
} catch (e: VersionRangeResolutionException) {
if (!e.isRateLimited()) throw e
rateLimited = true
lastRateLimitException = e
if (attempt < 3) {
// honour Retry-After from the 429 response if present, otherwise linear backoff
val retryAfter = e.retryAfterSeconds()
val waitMs = retryAfter?.times(1000L) ?: ((attempt + 1) * 5_000L)
logger.warn(
"Muzzle: HTTP 429 rate limit detected for " +
"${muzzleDirective.group}:${muzzleDirective.module} ${muzzleDirective.versions} " +
"(attempt ${attempt + 1}/4). " +
(if (retryAfter != null) "Retry-After: ${retryAfter}s. " else "") +
"Waiting ${waitMs / 1000}s before retrying."
)
Thread.sleep(waitMs)
} else {
logger.warn(
"Muzzle: HTTP 429 rate limit detected for " +
"${muzzleDirective.group}:${muzzleDirective.module} ${muzzleDirective.versions} " +
"(attempt ${attempt + 1}/4). All retries exhausted."
)
}
}
}
// Explicit Maven Central fallback: when the proxy returned empty results (not a rate-limit
// error), try Central directly — the proxy may have a cache miss while Central has the data.
if (!rateLimited && System.getenv("MAVEN_REPOSITORY_PROXY") != null) {
logger.warn(
"Muzzle: proxy returned empty version range for " +
"${muzzleDirective.group}:${muzzleDirective.module} ${muzzleDirective.versions} " +
"after 4 attempts. Falling back to Maven Central directly."
)
val central = RemoteRepository.Builder("central", "default", "https://repo1.maven.org/maven2/").build()
val fallbackRequest = VersionRangeRequest().apply {
repositories = muzzleDirective.getRepositories(listOf(central))
artifact = directiveArtifact
}
val fallback = system.resolveVersionRange(session, fallbackRequest)
if (fallback.lowestVersion != null && fallback.highestVersion != null) return fallback
}
if (rateLimited) {
throw IllegalStateException(
"The version range resolution failed due to rate limiting (HTTP 429). " +
"Advised course of action: Restart the job later.",
lastRateLimitException
)
}
throw IllegalStateException(
"The version range resolution failed during report, this is not expected. " +
"Advised course of action: Restart the job later."
)
}
private fun Throwable.isRateLimited(): Boolean {
var t: Throwable? = this
while (t != null) {
val msg = t.message ?: ""
if ("429" in msg || "Too Many Requests" in msg) return true
t = t.cause
}
return false
}
/** Returns the Retry-After value in seconds if the 429 response included it. */
private fun Throwable.retryAfterSeconds(): Long? {
var t: Throwable? = this
while (t != null) {
Regex("""[Rr]etry-[Aa]fter:\s*(\d+)""").find(t.message ?: "")
?.groupValues?.get(1)?.toLongOrNull()
?.let { return it }
t = t.cause
}
return null
}
AlexeyKuznetsov-DD
left a comment
There was a problem hiding this comment.
Approving, but consider to apply any ideas of improvements I suggested in my comment.
|
@AlexeyKuznetsov-DD I'm not sure, the suggest code assumes there's a |
@bric3 Let's try simple approach first. |
What Does This Do
Adds a back-off for muzzle version range checks.
Motivation
Several muzzle jobs are failing and just need to be retried. This creates the burden to watch for transient network failures that could be automatically recovered.
Additional Notes
Contributor Checklist
type:and (comp:orinst:) labels in addition to any other useful labelsclose,fix, or any linking keywords when referencing an issueUse
solvesinstead, and assign the PR milestone to the issueJira ticket: [PROJ-IDENT]
Note: Once your PR is ready to merge, add it to the merge queue by commenting
/merge./merge -ccancels the queue request./merge -f --reason "reason"skips all merge queue checks; please use this judiciously, as some checks do not run at the PR-level. For more information, see this doc.