Cache LLM generation results by artifact hash + interpreted metadata + prompt version #35

### What problem does this solve?

Re-deploying the same artifact (e.g. after a server restart or a `fix` attempt) burns LLM API calls on identical codegen. The artifact hasn't changed, the prompt hasn't changed, but the full generate → validate → repair loop runs again from scratch.

### Proposed solution

Cache LLM generation results keyed by a SHA-256 hash of:
- artifact file bytes
- interpreted metadata JSON (from the LLM interpretation stage, A3) — captures user answers to clarifying questions
- `PROMPT_VERSION` constant in `agent.py` — bumped manually when system prompts change

This is more precise than hashing the artifact alone: two deploys of the same artifact with different answers to clarifying questions (e.g. "state_dict" vs "full model") produce different cache keys and different cached code.
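A minimal sketch of how such a key could be computed, assuming the metadata is available as a plain dict and `PROMPT_VERSION` is a string. The function name `cache_key`, the `save_format` field, and the length-prefixing of each component are illustrative assumptions, not part of the existing code:

```python
import hashlib
import json

PROMPT_VERSION = "3"  # assumed to mirror the constant in agent.py


def cache_key(artifact_bytes: bytes, metadata: dict,
              prompt_version: str = PROMPT_VERSION) -> str:
    """Return a SHA-256 hex digest over the three key components."""
    h = hashlib.sha256()
    # Canonical JSON (sorted keys, fixed separators) so that dict
    # ordering does not change the key.
    meta = json.dumps(metadata, sort_keys=True,
                      separators=(",", ":")).encode("utf-8")
    for part in (artifact_bytes, meta, prompt_version.encode("utf-8")):
        # Length-prefix each part so component boundaries are unambiguous.
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.hexdigest()


# Hypothetical inputs: same artifact, different clarifying answers.
artifact = b"fake model bytes"
key_state_dict = cache_key(artifact, {"save_format": "state_dict"})
key_full_model = cache_key(artifact, {"save_format": "full model"})
# Same inputs, bumped prompt version.
key_new_prompt = cache_key(artifact, {"save_format": "state_dict"},
                           prompt_version="4")
```

All three keys differ, so neither a different answer nor a prompt bump can match a stale entry. Serializing the metadata canonically matters: without `sort_keys=True`, two semantically identical metadata dicts could hash differently and cause spurious misses.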

Cache stored at `~/.inference-engine/cache/<hash>.json`:
```json
{
  "key": "<sha256>",
  "load_body": "...",
  "predict_body": "...",
  "framework": "pytorch",
  "created_at": "2026-05-14T07:00:00Z",
  "prompt_version": "3"
}
```
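A rough sketch of reading and writing entries in this layout. The helper names `write_entry` and `read_entry`, the `cache_dir` parameter, and the write-to-temp-then-rename step are assumptions for illustration; only the path and JSON fields come from the proposal above:

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

CACHE_DIR = Path.home() / ".inference-engine" / "cache"


def write_entry(key: str, load_body: str, predict_body: str, framework: str,
                prompt_version: str, cache_dir: Path = CACHE_DIR) -> Path:
    """Write one cache entry as <cache_dir>/<key>.json and return its path."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry = {
        "key": key,
        "load_body": load_body,
        "predict_body": predict_body,
        "framework": framework,
        "created_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "prompt_version": prompt_version,
    }
    path = cache_dir / f"{key}.json"
    tmp = cache_dir / f"{key}.json.tmp"
    # Write to a temp file first, then rename, so a crash mid-write
    # does not leave a truncated entry behind.
    tmp.write_text(json.dumps(entry, indent=2), encoding="utf-8")
    tmp.replace(path)
    return path


def read_entry(key: str, cache_dir: Path = CACHE_DIR) -> Optional[dict]:
    """Return the cached entry for key, or None on a miss or corrupt file."""
    path = cache_dir / f"{key}.json"
    try:
        entry = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return None
    # Treat a file whose stored key disagrees with its name as a miss.
    if not isinstance(entry, dict) or entry.get("key") != key:
        return None
    return entry


# Demo against a throwaway directory instead of the real cache.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_dir = Path(tmp_dir)
    write_entry("abc123", "...", "...", "pytorch", "3", cache_dir=demo_dir)
    hit = read_entry("abc123", cache_dir=demo_dir)
    miss = read_entry("does-not-exist", cache_dir=demo_dir)
```

Treating a corrupt or mismatched file as a miss keeps the failure mode cheap: the worst case is one extra generation run rather than a failed deploy.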

On deploy, if a cache hit exists:
- interactive: `"Cached generation found (pytorch, created 2026-05-14). Use it? [Y/n]"`
- `--yes` mode: always use cache, print `"Using cached generation."`
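The two modes above could be sketched roughly as follows. The function `should_use_cache` and its `ask`/`echo` parameters are hypothetical, injected here only so the prompt can be exercised without a terminal; the message strings follow the proposal:

```python
from typing import Callable


def should_use_cache(entry: dict, assume_yes: bool,
                     ask: Callable[[str], str] = input,
                     echo: Callable[[str], None] = print) -> bool:
    """Decide whether to reuse a cached generation on a cache hit."""
    if assume_yes:
        # --yes mode: never prompt, always reuse the cached result.
        echo("Using cached generation.")
        return True
    created = entry["created_at"][:10]  # date part of the ISO timestamp
    answer = ask(
        f"Cached generation found ({entry['framework']}, "
        f"created {created}). Use it? [Y/n] "
    )
    # Empty input accepts the default, which is "yes".
    return answer.strip().lower() in ("", "y", "yes")


# Demo with canned answers standing in for user input.
entry = {"framework": "pytorch", "created_at": "2026-05-14T07:00:00Z"}
seen_prompts = []


def fake_ask_default(prompt: str) -> str:
    seen_prompts.append(prompt)
    return ""  # user just presses Enter


accepted_default = should_use_cache(entry, assume_yes=False, ask=fake_ask_default)
declined = should_use_cache(entry, assume_yes=False, ask=lambda prompt: "n")
auto = should_use_cache(entry, assume_yes=True, echo=lambda msg: None)
```

Declining would fall through to the normal generate → validate → repair loop; whether the fresh result should then overwrite the cached entry is left open here.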

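Cache entries never expire by time: the artifact bytes are part of the key, so a given key always refers to the same artifact. An entry is effectively invalidated only when `PROMPT_VERSION` changes (the key differs, so the old entry is simply never matched).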

### Alternatives considered

No cache. Acceptable for one-off deploys but wasteful in CI pipelines that re-deploy the same artifact on every run.

### Area

CLI (deploy / fix)
