Skip to content

_parse_methods conflates load() and predict() into one block when LLM wraps output in a class #21

@AK11105

Description

@AK11105

What happened?

_parse_methods in agent.py uses a lookahead regex (?=\ndef ) to split load() and predict() from LLM output. When the LLM omits the blank line between methods, the lookahead \ndef still matches \ndef predict — but when the LLM wraps output in a class body (despite the system prompt), both regexes match the same outer block. Both load_body and predict_body contain both methods.

Validation passes if predict happens to work, but load is silently broken — self._model is never set. The error only surfaces at runtime when the server loads the model.

Additionally, when the LLM adds trailing text after predict, \Z is the only terminator and predict_match captures everything to end-of-string.

Steps to reproduce

  1. Mock the LLM to return both methods wrapped in a class body
  2. Call _parse_methods on the output
  3. Observe both load_body and predict_body contain both methods
  4. Deploy succeeds; first prediction raises AttributeError: '_GeneratedModel' object has no attribute '_model'

Expected behavior

Split on def boundaries explicitly rather than using a lookahead:

def _parse_methods(raw: str) -> tuple[str, str]:
    raw = re.sub(r"```(?:python)?", "", raw).replace("```", "").strip()
    blocks = re.split(r"(?=^def )", raw, flags=re.MULTILINE)
    methods = {}
    for block in blocks:
        block = block.strip()
        if block.startswith("def load(self)"):
            methods["load"] = block
        elif block.startswith("def predict(self,"):
            methods["predict"] = block
    if "load" not in methods or "predict" not in methods:
        raise ValueError(f"Could not parse load() and predict() from LLM output:\n{raw}")
    return methods["load"], methods["predict"]

Environment

  • Area: app/cli/core/agent.py
  • Phase: 8 (fix before Phase 9 begins)
  • Priority: Medium

Relevant logs or error output

AttributeError: '_GeneratedModel' object has no attribute '_model'
# at runtime, not at deploy time — deploy reports success

Metadata

Metadata

Assignees

Labels

bugSomething isn't working

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions