fix: add asyncio.Lock to prevent concurrent refresh races#286
Open
nix-tkobayashi wants to merge 6 commits into
Open
fix: add asyncio.Lock to prevent concurrent refresh races#286nix-tkobayashi wants to merge 6 commits into
nix-tkobayashi wants to merge 6 commits into
Conversation
… hook `_sign_request_hook` is an async httpx event hook, but it calls `session_holder.session.get_credentials()` and `session_holder.refresh_if_needed()` synchronously. For profiles that use assumed IAM roles (chained credentials), `get_credentials()` triggers a blocking STS `AssumeRole` call that stalls the event loop and causes connection timeouts (~60 s). Add `SessionHolder.async_get_credentials()` and `SessionHolder.async_refresh_if_needed()` which delegate to `asyncio.to_thread`, keeping the event loop responsive. Update `_sign_request_hook` to call the async variants. Fixes aws#176 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Address review feedback: - Clarify that get_credentials() is the primary blocking path; async_refresh_if_needed() is a defensive measure - Replace timeout-based non-blocking test with ticker coroutine that proves the event loop stays responsive during credential resolution Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Prevent concurrent refresh races when multiple in-flight requests trigger credential refresh simultaneously. Uses a double-check pattern: the outer check avoids lock overhead on the fast path, the inner check ensures only one caller performs the refresh. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When the task holding _refresh_lock is cancelled during asyncio.to_thread(), the lock would be released while the worker thread is still running. A second caller could then acquire the lock, see _needs_refresh == True, and start a duplicate refresh. Shield the to_thread call so that cancellation waits for the refresh to complete before releasing the lock and re-raising CancelledError. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Changes
Follow-up to #270 (per review feedback).
SessionHolder.async_refresh_if_needed()can be called concurrently by multiple in-flight requests when the session is marked for refresh. Without synchronization, all callers race to create a new boto3 session simultaneously — wasting STS calls and risking inconsistent state.This PR:
asyncio.LocktoSessionHolderwith a double-check pattern: the outer_needs_refreshcheck avoids lock overhead on the fast path; the inner check ensures only one caller actually performs the refreshto_threadcall from task cancellation so that a cancelled caller does not release the lock while the worker thread is still running (prevents duplicate refresh on cancellation)threading.Eventbarriers to assertcreate_aws_sessionis called exactly onceUser experience
No user-visible behavior change. This is a correctness hardening for the credential refresh path introduced in #270.
Checklist
Is this a breaking change? (Y/N)
Please add details about how this change was tested.
Acknowledgment
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.