Fix multilevel artifact downloads #43

Open
imrehg wants to merge 2 commits into facultyai:master from imrehg:multilevel-download


Conversation

@imrehg imrehg commented Jan 22, 2020

Member

Models can be registered with subfolders, for example:

test
├── subdir
│   └── sub.file
└── top.file

Currently, when this model is requested, the library downloads all files, but files in subdirectories are downloaded multiple times: once into the root, and again into their own subfolder (and into every intermediate subfolder, if there are more levels):

test
├── subdir
│   └── sub.file
├── sub.file
└── top.file

This happens because mlflow expects only the files and directories directly under a path, not every item nested recursively within its subdirectories, since it traverses those subdirectories itself in separate steps.

With this change, when mlflow requests the list of artifacts at a given path, only the files and directories directly at that path are returned; all deeper items are filtered out (mlflow requests them later, as it traverses each subdirectory).
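
The filtering described above could look roughly like the sketch below. This is an illustration only, not the code in this PR: the function name `direct_children` and the flat list-of-paths input are assumptions made for the example, and the real change lives in `mlflow_faculty/artifacts.py`.

```python
def direct_children(paths, root=""):
    """Split a flat, recursive listing into the entries directly under `root`.

    `paths` is a hypothetical flat list of file paths, as a backend might
    return when listing recursively. Returns (directories, files), each
    sorted, containing only first-level entries below `root`.
    """
    root = root.strip("/")
    prefix = root + "/" if root else ""
    dirs, files = set(), set()
    for path in paths:
        path = path.strip("/")
        if not path.startswith(prefix):
            continue  # not under the requested root at all
        remainder = path[len(prefix):]
        if not remainder:
            continue  # nothing left below the root
        head, sep, _ = remainder.partition("/")
        if sep:
            # Deeper item: report only its first-level directory.
            dirs.add(prefix + head)
        else:
            # File directly under the root.
            files.add(prefix + head)
    return sorted(dirs), sorted(files)


listing = ["test/top.file", "test/subdir/sub.file"]
top_level = direct_children(listing, "test")
nested = direct_children(listing, "test/subdir")
```

With the example tree from this description, listing `test` yields the directory `test/subdir` and the file `test/top.file`, while `sub.file` only appears once `test/subdir` itself is listed.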

Signed-off-by: Gergely Imreh <gergely.imreh@faculty.ai>

@imrehg imrehg force-pushed the multilevel-download branch from 9b3df6f to 30f13f2 Compare January 22, 2020 17:37
@imrehg imrehg marked this pull request as ready for review January 22, 2020 17:39
@imrehg imrehg requested review from acroz, pbugnion and zblz January 22, 2020 17:39
Comment thread mlflow_faculty/artifacts.py Outdated
Comment thread mlflow_faculty/artifacts.py Outdated
Comment thread tests/test_artifacts.py Outdated
@imrehg imrehg requested a review from srstevenson January 23, 2020 13:59
@imrehg imrehg force-pushed the multilevel-download branch from 7d6eb6a to e5d56de Compare January 30, 2020 14:14
imrehg added 2 commits July 6, 2020 13:09
@imrehg imrehg force-pushed the multilevel-download branch from e5d56de to 7e6edb8 Compare July 6, 2020 12:10
2 participants