What problem does this solve?
After deploying a model, there is no way to produce a portable, self-contained artifact that can be shipped to another environment. Users must manually write Dockerfiles, pin dependencies, and assemble metadata — the same boilerplate every time.
Proposed solution
New command: inference-engine package models/sentiment/v1/
Generates in the model directory:
models/sentiment/v1/
├── definition.py (existing)
├── Dockerfile (generated from template)
├── requirements.txt (generated — pinned from current venv)
└── deployment.json (metadata: name, version, framework, load_format, device, created_at, sample_input)
Dockerfile is template-based, not LLM-generated. Parameterised by Python version (sys.version_info), device (cpu → python:3.x-slim, gpu → nvidia/cuda:12.x-runtime), and port (default 8000, --port to override).
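The template parameterisation described above could look roughly like the sketch below. The template body, the `base_image` and `render_dockerfile` helper names, and the `definition:app` entry point are all assumptions made for illustration; the proposal does not specify them. The GPU tag is kept as the `12.x` placeholder from the proposal because no exact tag is given.

```python
import sys
from string import Template

# Hypothetical template: the real template is not shown in this proposal.
# "definition:app" is an assumed entry point, not something the proposal states.
DOCKERFILE_TEMPLATE = Template("""\
FROM $base_image
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE $port
CMD ["uvicorn", "definition:app", "--host", "0.0.0.0", "--port", "$port"]
""")


def base_image(device, python_version=None):
    """Pick the base image from the device and the running Python version."""
    major, minor = (python_version or sys.version_info)[:2]
    if device == "cpu":
        return f"python:{major}.{minor}-slim"
    if device == "gpu":
        # Placeholder tag copied from the proposal; the exact CUDA tag is unspecified.
        return "nvidia/cuda:12.x-runtime"
    raise ValueError(f"unsupported device: {device!r}")


def render_dockerfile(device="cpu", port=8000, python_version=None):
    """Fill the template; port defaults to 8000 and would come from --port."""
    return DOCKERFILE_TEMPLATE.substitute(
        base_image=base_image(device, python_version),
        port=port,
    )


if __name__ == "__main__":
    print(render_dockerfile(device="cpu", port=8000))
```

Because the output depends only on the Python version, device, and port, the same inputs always produce the same Dockerfile, which is the main benefit of a template over generated text.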
requirements.txt pins framework-relevant packages from the current environment using importlib.metadata.version(), with no LLM involved. It always includes fastapi, uvicorn, and inference-engine, plus the packages for the model's framework:
| Framework | Packages pinned |
| --- | --- |
| sklearn | scikit-learn, joblib, numpy |
| pytorch | torch |
| transformers | transformers, torch, tokenizers |
| xgboost | xgboost, numpy |
| lightgbm | lightgbm, numpy |
| catboost | catboost, numpy |
| onnx | onnxruntime, numpy |
| sentence_transformers | sentence-transformers, torch |
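A minimal sketch of the pinning step, assuming the framework-to-package mapping above is held in a plain dictionary. The names `FRAMEWORK_PACKAGES`, `ALWAYS_INCLUDED`, and `pinned_requirements` are hypothetical, and raising an error when a package is missing from the environment is an assumption; the proposal does not say how that case should be handled.

```python
from importlib.metadata import PackageNotFoundError, version

# Mapping taken from the table in this proposal.
FRAMEWORK_PACKAGES = {
    "sklearn": ["scikit-learn", "joblib", "numpy"],
    "pytorch": ["torch"],
    "transformers": ["transformers", "torch", "tokenizers"],
    "xgboost": ["xgboost", "numpy"],
    "lightgbm": ["lightgbm", "numpy"],
    "catboost": ["catboost", "numpy"],
    "onnx": ["onnxruntime", "numpy"],
    "sentence_transformers": ["sentence-transformers", "torch"],
}

# Packages the proposal says are always included.
ALWAYS_INCLUDED = ["fastapi", "uvicorn", "inference-engine"]


def pinned_requirements(framework, get_version=version):
    """Return requirements.txt content with exact pins from the current environment.

    get_version is injectable so the function can be exercised without the
    packages being installed; by default it is importlib.metadata.version.
    """
    lines = []
    for name in ALWAYS_INCLUDED + FRAMEWORK_PACKAGES[framework]:
        try:
            lines.append(f"{name}=={get_version(name)}")
        except PackageNotFoundError as exc:
            # Assumed behaviour: fail loudly rather than emit an unpinned line.
            raise RuntimeError(f"{name} is not installed in this environment") from exc
    return "\n".join(lines) + "\n"
```

For example, `pinned_requirements("sklearn")` run inside the model's virtual environment would return one `name==version` line per package, ready to be written to requirements.txt.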
deployment.json includes sample_input from DeployAnswers so downstream commands (benchmark, snippets, metadata API) can use it without requiring the user to re-specify it.
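The metadata file could be assembled along these lines. The field list comes from the proposal, but `DeployAnswersSketch` is only a hypothetical stand-in: the real `DeployAnswers` type and its attribute names are not shown here, and the ISO 8601 UTC format for `created_at` is likewise an assumption.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Any


@dataclass
class DeployAnswersSketch:
    """Hypothetical stand-in for DeployAnswers; real attribute names may differ."""
    name: str
    version: str
    framework: str
    load_format: str
    device: str
    sample_input: Any


def build_deployment_metadata(answers, now=None):
    """Collect the fields listed for deployment.json into a plain dict."""
    created = now or datetime.now(timezone.utc)
    return {
        "name": answers.name,
        "version": answers.version,
        "framework": answers.framework,
        "load_format": answers.load_format,
        "device": answers.device,
        # Assumed format: ISO 8601 timestamp in UTC.
        "created_at": created.isoformat(),
        "sample_input": answers.sample_input,
    }


def write_deployment_json(answers, model_dir):
    """Write deployment.json into the model directory and return its path."""
    path = Path(model_dir) / "deployment.json"
    path.write_text(json.dumps(build_deployment_metadata(answers), indent=2) + "\n")
    return path
```

Downstream commands could then read `sample_input` back with `json.loads(path.read_text())["sample_input"]` instead of asking the user for it again.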
Alternatives considered
Manually writing Dockerfiles and requirements files. Error-prone and not reproducible across environments.
Area
CLI (deploy / fix)