Train postprocess model in docker + docs #128
ilmcconnell wants to merge 5 commits into UW-COSMOS:master
ankur-gos left a comment:
Looks mostly good. Just anonymize the paths and expose them as CLI arguments.
    --logdir logs/ --modelcfg /ssd/ankur/Cosmos/deployment/configs/model_config.yaml \
    --detect-weights /ssd/ankur/Cosmos/deployment/weights/model_weights.pth \
    --logdir logs/ --modelcfg /ssd/iain/Cosmos/deployment/configs/model_config.yaml \
    --detect-weights /ssd/iain/Cosmos/deployment/weights/model_weights.pth \
All of these paths need to be configurable from Docker.
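One way to make these paths configurable from Docker, sketched here in Python, is to let each flag fall back to an environment variable that can be set with `docker run -e`. The flag names are the ones in the diff; the environment variable names (`LOGDIR`, `MODEL_CFG`, `DETECT_WEIGHTS`) and the use of `argparse` are assumptions for illustration, not what the PR actually does.

```python
import argparse
import os


def build_parser(env=None):
    """Build a parser whose path flags default to environment variables."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(
        description="Illustrative CLI for the training paths."
    )
    # LOGDIR, MODEL_CFG and DETECT_WEIGHTS are hypothetical variable names.
    parser.add_argument("--logdir", default=env.get("LOGDIR", "logs/"))
    # A flag is only required when its environment variable is not set.
    parser.add_argument(
        "--modelcfg",
        default=env.get("MODEL_CFG"),
        required="MODEL_CFG" not in env,
    )
    parser.add_argument(
        "--detect-weights",
        default=env.get("DETECT_WEIGHTS"),
        required="DETECT_WEIGHTS" not in env,
    )
    return parser
```

With something like this, the script itself carries no user-specific path: the values arrive either as flags or through `-e` variables on the `docker run` line.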
    @@ -0,0 +1,13 @@
    #!/bin/sh
I would replace train_postprocess.sh with this script. Don't have a separate dockerized version.
    -it \
    --name test_train_postprocess \
    -e CUDA_VISIBLE_DEVICES=1 \
    -v /hdd/iaross/train_dir:/train_dir \
Make these paths arguments that are passed in, and anonymize them.
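As a rough illustration of passing the paths in, the sketch below assembles the same `docker run` options from arguments instead of hard-coding them. It is written in Python rather than as the shell script under review; the image name, the function names, and the positional-argument layout are all hypothetical. Only `-it`, `--name test_train_postprocess`, `-e CUDA_VISIBLE_DEVICES`, and the `/train_dir` mount point come from the diff.

```python
import argparse
import os
import shlex


def build_docker_command(image, mounts, gpu=None, name="test_train_postprocess"):
    """Return the docker run command as a list of arguments."""
    cmd = ["docker", "run", "-it", "--name", name]
    if gpu is not None:
        # Omit the variable entirely when the caller has no GPU preference.
        cmd += ["-e", f"CUDA_VISIBLE_DEVICES={gpu}"]
    for host_path, container_path in mounts.items():
        # Bind mounts need an absolute host path.
        cmd += ["-v", f"{os.path.abspath(host_path)}:{container_path}"]
    cmd.append(image)
    return cmd


def main(argv=None):
    """Parse the (hypothetical) launcher arguments and return the command line."""
    parser = argparse.ArgumentParser(description="Illustrative docker launcher.")
    parser.add_argument("image", help="Docker image to run (placeholder)")
    parser.add_argument("train_dir", help="Host directory with training data")
    parser.add_argument("--gpu", help="Value for CUDA_VISIBLE_DEVICES")
    args = parser.parse_args(argv)
    cmd = build_docker_command(args.image, {args.train_dir: "/train_dir"}, gpu=args.gpu)
    return shlex.join(cmd)
```

Calling `main(["<image>", "<path_to_training_data>", "--gpu", "1"])` returns a command string with no user-specific path baked into the script; the same idea carries over to positional parameters in a shell launcher.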
    # copy files beginning with that name to the destination folder of the same name

    subset_training_data()
    \ No newline at end of file
Add an `if __name__ == '__main__':` guard around this call.
You don't want an accidental import of the module to run it.
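A minimal sketch of the guard follows. The function name `subset_training_data` is from the diff, but its signature and body here are assumptions made only so the example is self-contained; the real function in the PR may look different.

```python
import shutil
import sys
from pathlib import Path


def subset_training_data(src_dir, dest_root, names):
    """Copy files beginning with each name into a destination folder of the same name.

    Hypothetical signature: the PR's function takes no arguments.
    """
    copied = []
    for name in names:
        dest = Path(dest_root) / name
        dest.mkdir(parents=True, exist_ok=True)
        for path in sorted(Path(src_dir).glob(f"{name}*")):
            if path.is_file():
                shutil.copy2(path, dest / path.name)
                copied.append(dest / path.name)
    return copied


if __name__ == "__main__":
    # Runs only when the file is executed directly, never on import.
    if len(sys.argv) < 4:
        print("usage: SCRIPT SRC_DIR DEST_ROOT NAME [NAME ...]")
    else:
        subset_training_data(sys.argv[1], sys.argv[2], sys.argv[3:])
```

With the guard in place, another module can `import` the file to reuse the function without triggering any copying.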
    @@ -0,0 +1,28 @@
    import click
Looks like this file is a placeholder; it just prints things.
    You can choose which GPU to use with the -e CUDA_VISIBLE_DEVICES=1 argument or omit that argument if it doesn't matter to you.
    Specify bind mounts for the training and validation images plus the trained model output .pth file, with the -v arguments:

    1. -v <local_full_path_to_training_data>:/train_dir \
Make these paths arguments to docker_launch.sh, as noted above.
@ankur-gos and @iross
Here's my proposal to train the postprocess model using a docker image plus the docs explaining how to set it up. Any feedback you have would be great!