Skip to content

Conversation

@TrevorBurgoyne
Copy link
Member

@TrevorBurgoyne TrevorBurgoyne commented Jan 20, 2025

  • Catch specific use case where periods are used in a folder to represent a number and not a file extension (eg v0.1.2)
  • Rework of is_file_on_s3 util to use head_object which seems to be more consistent. This does change behavior of that function to not raise an exception if the key is not found, since the 404 error can also just mean that the key is not a valid file (read: its a folder). Instead, users should use a combination of exists_on_s3 and is_file_on_s3 to properly handle all cases.
  • Moved continuation logic to s3_utils instead of mirror_path

@github-actions github-actions bot added the bug Something isn't working label Jan 20, 2025
@TrevorBurgoyne TrevorBurgoyne merged commit 3413c10 into master Jan 20, 2025
@TrevorBurgoyne TrevorBurgoyne deleted the fix/period-in-folder branch January 20, 2025 20:36
Comment on lines 37 to +44
"""Get s3 key."""
ret_key = "/".join([str(s.name) for s in self.key_segments])
# We'll infer folder/file based on extension
return ret_key if '.' in self.key_segments[-1].name else f"{ret_key}/"
# HACK to catch case where the "extension" is actually a part of the folder name
# (eg, a folder named "v0.1.0"), we check if the extension is actually a number
name = self.key_segments[-1].name
ext = name.split('.')[-1]
return ret_key if (ext is not name and not ext.isdigit()) else f"{ret_key}/"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we just check the is_file property on the last key segment, or is it possible that it's inconsistent?
Technically S3 objects don't need extensions at all, so checking for "." seems problematic (which is my own fault).

Copy link
Member Author

@TrevorBurgoyne TrevorBurgoyne Jan 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah as far as i can tell the is_file property has to be explicitly set currently (it isn't set when initing a MirrorPath)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😭

Comment on lines +120 to +123
except Exception as e:
# 404 occurs if the key is a "folder" or does not exist
if "404" in str(e):
return False
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a specific exception that could be caught?
Unlikely case but a different error with a key that contains "404" could still hit this, maybe.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably yeah, i initially tried using the botocore ClientError exception but for some reason it wasn't catching properly

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll just tag #23 and let it slow-cook on the backburner

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants