Load Image Dataset interface

Alright, Round 2!

Main reason I wanted to build out this library a few weeks ago was to have a streamlined way to:

- ~Quickly label arbitrarily-many images~ We're killing it here
- Have a simple interface for loading all of your image data for ML purposes

## Proposed Flow

After using the `quickLabel` CLI and spitting out our resulting `labels.csv`, we should be able to use that file to create a one-call loader for all of our data.

Maybe something like

``` python
>>> from quickLabel.data.loader import load_data
>>> PATH = '/usr/my_proj/data/labels.csv'
>>> X, y = load_data(PATH)
>>> X.shape
(NUM_IMAGES, X_DIM, Y_DIM, 3)
>>> y.shape
(NUM_IMAGES,)
```

And then you're off doing whatever `keras`/`pytorch`/`sklearn`/etc implementation you're used to doing.

## Particulars

This would essentially mean creating a file under `quickLabel/data` called `loader.py` that

- Creates an `X` and `y` of type `np.ndarray`
- Iterates through each record of the `.csv` and per-row:
    - Loads up the image in `PIL` for `X`, converting it to RGB (`PIL` already reads in RGB order, so a BGR → RGB swap is only needed if we read with OpenCV instead)
    - Appends the label value to `y`
- Finally returning the two to the user
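
A rough sketch of what the steps above could look like. The CSV layout is an assumption on my part (a header row, image path in the first column, label in the second, relative paths resolved against the CSV's folder), and the `path_col`, `label_col`, and `has_header` parameters are hypothetical. It also assumes every image has the same dimensions, which is exactly the hangup discussed below.

``` python
import csv
from pathlib import Path

import numpy as np
from PIL import Image


def load_data(csv_path, path_col=0, label_col=1, has_header=True):
    """Load every image listed in a labels CSV into (X, y) arrays.

    Assumed layout: one row per image, with the image path in
    `path_col` and the label in `label_col`.
    """
    csv_path = Path(csv_path)
    images, labels = [], []

    with csv_path.open(newline="") as f:
        reader = csv.reader(f)
        if has_header:
            next(reader, None)  # skip the header row
        for row in reader:
            if not row:
                continue  # tolerate blank lines
            img_path = Path(row[path_col])
            if not img_path.is_absolute():
                # Assumption: relative paths are relative to the CSV file
                img_path = csv_path.parent / img_path
            with Image.open(img_path) as img:
                # convert("RGB") normalizes grayscale/RGBA/palette images
                images.append(np.asarray(img.convert("RGB")))
            labels.append(row[label_col])

    # np.stack raises ValueError if the images differ in shape
    X = np.stack(images)
    y = np.array(labels)
    return X, y
```

Note that `X` comes back as `(NUM_IMAGES, height, width, 3)`, since that's how `numpy` lays out a `PIL` image.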

## One Hangup

How do we want to handle variable-sized images? 

For instance, say our data is of all shapes and sizes: rectangles, squares, similar shapes but different resolutions, etc.

I see three possible solutions, but they all involve a preliminary scan through the data (before loading anything into `X` or `y`) to get some `max_X`, `min_X`, `max_Y`, `min_Y` values, then we use these to:

1. Upscale all images to the max using the [same function you called here](https://github.com/avlaskin/QuickLabel/blob/6ace67e728c6cbc84d2597e2b15f0d8fff2d4bff/quickLabel/ui/tkinter.py#L272)
2. Downscale all images to the min the same way
3. Determine the max dimensions in both directions and just pad everything to fit the space
   - (This is the one I'm considering most)

Or any other ideas you might have here
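
To make option 3 concrete, here's a minimal sketch of the padding idea. The function name `pad_to_common_shape` and the `fill` parameter are hypothetical, and anchoring each image at the top-left corner is just one choice (centering would work too). For simplicity it finds the max dimensions from already-loaded arrays rather than from a separate scan over the files.

``` python
import numpy as np


def pad_to_common_shape(images, fill=0):
    """Pad a list of (H, W, 3) arrays to the largest height and width.

    Each image is placed in the top-left corner; the rest of the
    canvas is filled with `fill`.
    """
    # The "preliminary scan": find the largest extent in each direction
    max_h = max(img.shape[0] for img in images)
    max_w = max(img.shape[1] for img in images)

    # One pre-filled canvas for the whole batch
    out = np.full((len(images), max_h, max_w, 3), fill, dtype=images[0].dtype)
    for i, img in enumerate(images):
        h, w = img.shape[:2]
        out[i, :h, :w] = img
    return out
```

The upside over options 1 and 2 is that no pixels get resampled; the downside is that a single huge image inflates the memory footprint of every other image in `X`.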


----------

I'm happy to knock this out over the next week or so, but want to make sure you think this is a good idea before I dive in.
