What problem does this solve?
When inspection confidence is low or framework is None, codegen fires immediately with incomplete metadata. The LLM guesses at load strategy and input type, producing broken code. All three retries burn on the same wrong assumption. The user gets a failed deploy with no explanation of what was ambiguous.
This is especially common for XGBoost/LightGBM models (optional deps not installed in inspection env), PyTorch state dicts (ambiguous without knowing the model class), and ONNX models with dynamic axes.
Proposed solution
Add a dedicated LLM interpretation stage between inspection and codegen. It fires when confidence < high or framework is None.
Input to the call: raw_facts + inspection_errors + sample_input + --framework hint
System prompt instructs the LLM to return:
{
"framework": "xgboost",
"load_format": "joblib",
"input_hint": "numpy array, shape (n_samples, n_features)",
"output_hint": "integer class label",
"confidence": "medium",
"question": "Is this an XGBModel or a raw Booster?",
"question_field": "load_format"
}
question is one short specific question if a critical unknown remains, else null. Max 2 clarifying questions in interactive mode. Skipped entirely in --yes / non-interactive mode.
High-confidence artifacts (clean sklearn .pkl with all attributes present) skip Stage 2 entirely — no extra API call.
The interpretation result patches ArtifactMetadata before generate() is called. The generate() and fix() prompts receive the enriched metadata.
Alternatives considered
Relying solely on the rule-based inspector and hoping codegen recovers via the repair loop. This works for simple sklearn models but fails consistently for ambiguous artifacts — the LLM can't fix what it doesn't know is wrong.
Area
CLI (deploy / fix)
What problem does this solve?
When inspection confidence is low or
frameworkisNone, codegen fires immediately with incomplete metadata. The LLM guesses at load strategy and input type, producing broken code. All three retries burn on the same wrong assumption. The user gets a failed deploy with no explanation of what was ambiguous.This is especially common for XGBoost/LightGBM models (optional deps not installed in inspection env), PyTorch state dicts (ambiguous without knowing the model class), and ONNX models with dynamic axes.
Proposed solution
Add a dedicated LLM interpretation stage between inspection and codegen. It fires when
confidence < highorframework is None.Input to the call:
raw_facts + inspection_errors + sample_input + --framework hintSystem prompt instructs the LLM to return:
{ "framework": "xgboost", "load_format": "joblib", "input_hint": "numpy array, shape (n_samples, n_features)", "output_hint": "integer class label", "confidence": "medium", "question": "Is this an XGBModel or a raw Booster?", "question_field": "load_format" }questionis one short specific question if a critical unknown remains, elsenull. Max 2 clarifying questions in interactive mode. Skipped entirely in--yes/ non-interactive mode.High-confidence artifacts (clean sklearn
.pklwith all attributes present) skip Stage 2 entirely — no extra API call.The interpretation result patches
ArtifactMetadatabeforegenerate()is called. Thegenerate()andfix()prompts receive the enriched metadata.Alternatives considered
Relying solely on the rule-based inspector and hoping codegen recovers via the repair loop. This works for simple sklearn models but fails consistently for ambiguous artifacts — the LLM can't fix what it doesn't know is wrong.
Area
CLI (deploy / fix)