What problem does this solve?
After packaging a model (#26), there is no way to export it to a specific deployment platform. Users targeting SageMaker, BentoML, Ray Serve, or standalone Docker must hand-write platform-specific boilerplate (inference.py, service.py, deployment.py, or serve.py), each with different calling conventions.
Proposed solution
New command: inference-engine export models/sentiment/v1/ --target sagemaker|bentoml|ray|docker
Reads deployment.json for metadata. Fails with a clear message if deployment.json is absent (run inference-engine package first).
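A minimal sketch of the metadata check described above, assuming deployment.json sits at the top level of the model directory. The function and exception names are hypothetical and only illustrate the "fail with a clear message" behaviour:

```python
import json
from pathlib import Path


class MissingDeploymentMetadata(FileNotFoundError):
    """Raised when the model directory has not been packaged yet (hypothetical name)."""


def load_deployment_metadata(model_dir):
    """Read deployment.json from model_dir, or fail with an actionable message."""
    path = Path(model_dir) / "deployment.json"
    if not path.is_file():
        # Tell the user exactly which step is missing.
        raise MissingDeploymentMetadata(
            f"{path} not found. Run `inference-engine package` first."
        )
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)
```

Raising a dedicated exception would let the CLI layer turn it into a non-zero exit code without a traceback.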
sagemaker
export/sentiment-v1-sagemaker/
├── model.tar.gz (artifact + inference.py, tarred for S3 upload)
└── inference.py (model_fn loads via definition.py; predict_fn calls pipeline.run())
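A rough sketch of what the generated inference.py could look like. model_fn and predict_fn are the standard SageMaker handler names; how definition.py exposes the pipeline is not specified in this issue, so the build_pipeline() factory below is an assumption made purely for illustration:

```python
import importlib.util
from pathlib import Path


def model_fn(model_dir):
    """Called once by SageMaker at container start; loads the pipeline."""
    definition_path = Path(model_dir) / "definition.py"
    spec = importlib.util.spec_from_file_location("definition", definition_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    # Hypothetical factory name; the real definition.py contract may differ.
    return module.build_pipeline()


def predict_fn(input_data, model):
    """Called per request with the deserialized payload."""
    return model.run(input_data)
```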
bentoml
export/sentiment-v1-bentoml/
├── service.py (BentoML Service wrapping InferencePipeline.run())
├── bentofile.yaml
└── requirements.txt
ray
export/sentiment-v1-ray/
├── deployment.py (Ray Serve Deployment; handle() calls pipeline.run())
└── requirements.txt
docker (standalone, no platform SDK)
export/sentiment-v1-docker/
├── Dockerfile
├── requirements.txt
└── serve.py (~30-line FastAPI app: POST /predict → pipeline.run())
All exporters implement a common BaseExporter ABC. The LLM is used only when the target format requires a calling convention that differs from pipeline.run(x). In practice this is rare, since all templates call pipeline.run() directly.
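One possible shape for the BaseExporter ABC, sketched under assumptions: the method names (render, export, output_dir), the constructor arguments, and the placeholder file contents are all hypothetical. Only the class name and the export/<model>-<version>-<target>/ layout come from this proposal:

```python
from abc import ABC, abstractmethod
from pathlib import Path


class BaseExporter(ABC):
    """Common interface for all export targets (hypothetical method names)."""

    target = ""  # e.g. "sagemaker", "bentoml", "ray", "docker"

    def __init__(self, model_name, version):
        self.model_name = model_name
        self.version = version

    def output_dir(self, root="export"):
        # Mirrors the layout in this proposal: export/sentiment-v1-ray/
        return Path(root) / f"{self.model_name}-{self.version}-{self.target}"

    @abstractmethod
    def render(self):
        """Return a dict mapping relative file names to file contents."""

    def export(self, root="export"):
        out = self.output_dir(root)
        out.mkdir(parents=True, exist_ok=True)
        for name, content in self.render().items():
            (out / name).write_text(content, encoding="utf-8")
        return out


class RayExporter(BaseExporter):
    target = "ray"

    def render(self):
        # Placeholder contents; real templates would call pipeline.run().
        return {
            "deployment.py": "# Ray Serve deployment template goes here\n",
            "requirements.txt": "ray[serve]\n",
        }
```

With this split, each target only supplies its file templates, and the directory handling stays in one place.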
Alternatives considered
Replicate export: deferred. Replicate's format changes frequently, and its target audience doesn't align with the platform's production focus.
Area
CLI (deploy / fix)