[WIP]: Support IAM role for access to S3#951
Draft
TomAugspurger wants to merge 1 commit intorapidsai:mainfrom
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
87cf303 to
ae3d3b8
Compare
/ok to test ae3d3b8 |
(I need to perform a self-review first).
This PR adds support for IAM role-based authentication to S3 for KvikIO. Briefly, this is a way for compute resources like an EC2 instance to authenticate with S3 without the application / user needing to explicitly handle credentials. Instead, they assign a role to the EC2 instance, and applications / libraries (like kvikio) can follow a protocol to get temporary credentials.
User API for specifying credentials
This PR includes a user-facing change in how users should provide credentials to kvikio. Previously, we accepted named arguments like `aws_access_key_id=..., aws_secret_access_key=...` or environment variables like `AWS_ACCESS_KEY_ID=...`. This is fine when all of your authentication methods (static tokens, env vars) ultimately map to the same bits of data (access key ID, secret access key, etc.). It breaks down when you have different arguments that configure how you get your credentials.

So, as a precursor to the main change, we introduce a new `credential=...` argument, which is responsible for describing how kvikio should get the credentials:

- `credential=AwsStaticCredential(aws_access_key_id=..., ...)`
- `credential=AwsEnvironmentCredential()`
- `credential=AwsIamRoleCredential()`

The new default tries various authentication methods, including environment variables, which preserves backwards compatibility. The previous method of specifying named arguments is deprecated.
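To illustrate the idea of a default that "tries various authentication methods", here is a minimal, self-contained Python sketch of a provider chain. It is not kvikio's implementation: the names `ResolvedCredentials`, `StaticProvider`, `EnvironmentProvider`, and `ChainProvider` are hypothetical and were chosen only for this example. The only assumption about AWS is the standard environment variable names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ResolvedCredentials:
    """The data every method ultimately produces (hypothetical type)."""
    access_key_id: str
    secret_access_key: str
    session_token: Optional[str] = None


class StaticProvider:
    """Credentials passed in explicitly by the caller."""

    def __init__(self, access_key_id, secret_access_key, session_token=None):
        self._creds = ResolvedCredentials(
            access_key_id, secret_access_key, session_token
        )

    def resolve(self) -> Optional[ResolvedCredentials]:
        return self._creds


class EnvironmentProvider:
    """Credentials read from the standard AWS environment variables."""

    def __init__(self, environ: Optional[Mapping[str, str]] = None):
        # Allow injecting a mapping so the provider can be tested.
        self._environ = os.environ if environ is None else environ

    def resolve(self) -> Optional[ResolvedCredentials]:
        key = self._environ.get("AWS_ACCESS_KEY_ID")
        secret = self._environ.get("AWS_SECRET_ACCESS_KEY")
        if not key or not secret:
            return None  # This method is not configured; let the chain move on.
        return ResolvedCredentials(
            key, secret, self._environ.get("AWS_SESSION_TOKEN")
        )


class ChainProvider:
    """Try each provider in order and return the first that succeeds."""

    def __init__(self, providers):
        self._providers = list(providers)

    def resolve(self) -> Optional[ResolvedCredentials]:
        for provider in self._providers:
            creds = provider.resolve()
            if creds is not None:
                return creds
        return None
```

The point of the design is that each provider owns its own configuration (explicit keys, an environment, a metadata endpoint), while callers only see the resolved result. That is why a single `credential=...` argument scales better than a growing set of named arguments.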
IAM Role Credentials
See the linked issue for more detail. In brief, this new credential provider retrieves temporary credentials from the AWS-provided Instance Metadata Service (IMDS) endpoint. Once fetched, these tokens should be cached and reused until (close to) their expiry, at which point the application (or library, like kvikio) should refresh them.
The actual tokens retrieved from the IMDS are used in exactly the same way as we currently use them.
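The fetch-cache-refresh cycle described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code in this PR: `CachedCredentials`, `fetch_from_imds`, and the five-minute refresh margin are hypothetical choices. The IMDSv2 request sequence (a `PUT` for a session token, then `GET`s for the role name and its credentials) and the JSON field names follow the public AWS documentation.

```python
import datetime as dt
import json
import urllib.request

IMDS = "http://169.254.169.254/latest"


def fetch_from_imds(timeout=2.0):
    """Fetch temporary role credentials via IMDSv2 (only works on EC2)."""
    # Step 1: obtain a short-lived session token for the metadata service.
    req = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "21600"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        token = resp.read().decode()
    headers = {"X-aws-ec2-metadata-token": token}

    def get(path):
        r = urllib.request.Request(f"{IMDS}/{path}", headers=headers)
        with urllib.request.urlopen(r, timeout=timeout) as resp:
            return resp.read().decode()

    # Step 2: discover the role attached to the instance, then its credentials.
    role = get("meta-data/iam/security-credentials/").splitlines()[0]
    return json.loads(get(f"meta-data/iam/security-credentials/{role}"))


class CachedCredentials:
    """Cache fetched credentials and refresh them shortly before expiry."""

    def __init__(self, fetch=fetch_from_imds,
                 refresh_margin=dt.timedelta(minutes=5), now=None):
        self._fetch = fetch
        self._margin = refresh_margin
        # The clock is injectable so the refresh logic can be tested.
        self._now = now or (lambda: dt.datetime.now(dt.timezone.utc))
        self._cached = None

    def _expiry(self):
        # IMDS reports "Expiration" as a UTC timestamp like 2024-01-01T12:00:00Z.
        parsed = dt.datetime.strptime(
            self._cached["Expiration"], "%Y-%m-%dT%H:%M:%SZ"
        )
        return parsed.replace(tzinfo=dt.timezone.utc)

    def get(self):
        """Return cached credentials, refetching if missing or near expiry."""
        if self._cached is None or self._now() >= self._expiry() - self._margin:
            self._cached = self._fetch()
        return self._cached
```

On an EC2 instance with a role attached, `CachedCredentials().get()` would return a dictionary containing `AccessKeyId`, `SecretAccessKey`, and `Token`, which can then be used like any other set of credentials. Refreshing a few minutes early avoids starting a request with a token that expires mid-flight.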
Closes #950