Description
There's a nice fraud detection blueprint that uses cuDF for data preprocessing, cuGraph and XGBoost for model training, and Dynamo/Triton for model serving.
There is a notebook in the blueprint that covers the whole workflow end-to-end. I would love to take this example apart and explore how each piece would be deployed in production. We could create a derived deployment workflow example showing how you might do this.
Preprocessing
I expect the preprocessing would be submitted as a job via some kind of workflow engine, something along the lines of Airflow, Dagster, or Prefect.
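As a rough illustration of what that job might look like, here is a minimal sketch of the preprocessing step written as a plain Python function, so the same callable could be registered as a task in Airflow, Dagster, or Prefect. The function name, field names, and the feature derived here are all hypothetical, not taken from the blueprint (which uses cuDF rather than plain Python).

```python
# Hypothetical sketch: a preprocessing step as a plain function that a
# workflow engine (Airflow/Dagster/Prefect) could wrap as a task.
# Field names (account_id, amount, is_large) are illustrative only.

def preprocess_transactions(raw_rows):
    """Clean raw transaction records and derive simple features."""
    cleaned = []
    for row in raw_rows:
        # Drop records with missing amounts rather than imputing them.
        if row.get("amount") is None:
            continue
        cleaned.append({
            "account_id": row["account_id"],
            "amount": float(row["amount"]),
            # Example derived feature: flag unusually large transactions.
            "is_large": float(row["amount"]) > 10_000,
        })
    return cleaned

if __name__ == "__main__":
    raw = [
        {"account_id": "a1", "amount": 120.0},
        {"account_id": "a2", "amount": None},   # dropped
        {"account_id": "a3", "amount": 25_000.0},
    ]
    print(preprocess_transactions(raw))
```

Keeping the task a pure function of its inputs makes it easy to test locally and to swap the scheduler later.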
Training
The model training would also be run via a workflow engine. I expect the metrics would be stored in a tool like MLflow.
It would be nice to show how you could iterate on the model and train new ones, either by retraining nightly with new data or by running ad-hoc experiments, tracking all the metrics in the tracking server.
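To make the iteration loop concrete, here is a hedged sketch of logging metrics across repeated training runs. A real setup would call the MLflow tracking API (e.g. logging params and metrics per run); an in-memory list stands in here so the control flow stays self-contained, and the trainer is a stand-in for the XGBoost step with made-up numbers.

```python
# Hypothetical sketch of tracking metrics across training runs.
# A real setup would log to an MLflow tracking server; an in-memory
# list of run records stands in here.

runs = []

def train_and_log(run_name, params, train_fn):
    """Train a model and record its params and metrics for comparison."""
    metrics = train_fn(params)
    runs.append({"name": run_name, "params": params, "metrics": metrics})
    return metrics

def fake_trainer(params):
    # Stand-in for the XGBoost training step: returns a made-up AUC
    # that improves with more boosting rounds (capped at 0.99).
    return {"auc": min(0.99, 0.80 + 0.001 * params["n_rounds"])}

# A nightly retrain and an ad-hoc experiment, both tracked the same way.
train_and_log("nightly-retrain", {"n_rounds": 100}, fake_trainer)
train_and_log("adhoc-experiment", {"n_rounds": 50}, fake_trainer)

best = max(runs, key=lambda r: r["metrics"]["auc"])
print(best["name"])  # the run with the highest tracked AUC
```

The point is only the shape of the loop: every run, scheduled or ad-hoc, goes through the same logging path, so runs stay comparable.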
Deployment
We could also put together an example of how to handle continuous deployment of the models, so that if a newly trained model scored better than the current one, it would be promoted to production.