Skip to content

Add GET /models/{name}/{version}/metadata endpoint to expose runtime model info #34

@AK11105

Description

@AK11105

What problem does this solve?

There is no API to inspect what the engine knows about a deployed model at runtime — its framework, expected input shape, routing strategy, executor, or whether it's currently loaded in the cache. This information exists across multiple internal components but is not surfaced anywhere.

Proposed solution

New endpoint: GET /models/{name}/{version}/metadata

Response:

{
  "name": "sentiment",
  "version": "v1",
  "framework": "sklearn",
  "load_format": "joblib",
  "device": "cpu",
  "input_hint": "raw text string",
  "output_hint": "integer class label",
  "sample_input": "great movie",
  "executor": "cpu",
  "routing_strategy": "static",
  "loaded": true,
  "artifact_size_mb": 2.1
}

Data sources, in priority order:

  1. deployment.json in the model directory (written by inference-engine package) — provides framework, load_format, device, input_hint, output_hint, sample_input, artifact_size_mb
  2. ExecutionPolicy config — provides executor
  3. RoutingService config — provides routing_strategy
  4. Registry cache state — provides loaded

Fields absent from deployment.json are returned as null. The endpoint does not fail if deployment.json is absent — it returns what it can from the other sources.

Alternatives considered

Reading deployment.json directly from disk. This doesn't include runtime state (loaded, routing_strategy) and requires filesystem access from the client.

Area

Model loading / registry

Metadata

Metadata

Assignees

Labels

enhancementNew feature or request

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions