Add POST /admin/reload endpoint to hot-reload models and routing config without restart #31

### What problem does this solve?

After `inference-engine deploy` writes a new model definition or updates routing config, the running server must be restarted to pick up the changes. In production this means downtime or a rolling restart for a change that should be instantaneous.


### Proposed solution

New admin endpoint: `POST /admin/reload` (required scope: `admin`)

Behaviour:
1. Calls `registry.clear_cache()` — evicts all loaded pipelines from the LRU cache
2. Re-runs `registry.warm_up()` — reloads all registered definitions from disk
3. Re-reads routing config via `importlib.reload` — picks up any routing changes written by `inference-engine deploy`
4. Returns:
```json
{
  "reloaded_models": ["sentiment:v1", "classifier:v2"],
  "routing_updated": true,
  "duration_ms": 142
}
```
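
The four steps above can be sketched as a single framework-agnostic function. This is only an illustration under stated assumptions, not the project's implementation: `reload_all` is a hypothetical name, `warm_up()` is assumed to return the identifiers it loaded, the routing config module is assumed to expose a `ROUTES` mapping, and `routing_updated` is assumed to mean "the routes differ from the previous ones".

```python
import importlib
import time
from types import ModuleType
from typing import Any


def reload_all(registry: Any, routing: Any, config_module: ModuleType) -> dict:
    """Run the reload steps in order and build the response body.

    `registry` and `routing` stand in for ModelRegistry and RoutingService;
    `config_module` is the already-imported routing config module.
    """
    started = time.perf_counter()

    registry.clear_cache()                  # step 1: evict every cached pipeline
    reloaded = list(registry.warm_up())     # step 2: assumed to return model ids

    old_routes = dict(routing.routes)
    fresh = importlib.reload(config_module)  # step 3: re-execute the module in place
    new_routes = dict(fresh.ROUTES)          # ROUTES is an assumed attribute name
    routing.update_routes(new_routes)

    return {                                 # step 4: response payload
        "reloaded_models": reloaded,
        "routing_updated": new_routes != old_routes,
        "duration_ms": round((time.perf_counter() - started) * 1000),
    }
```

An HTTP handler for `POST /admin/reload` would check the `admin` scope first and then return this dictionary as the JSON body.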

`ModelRegistry.clear_cache()` evicts all entries from `self._cache` and resets LRU order. `warm_up()` is already idempotent, so step 2 can call it as-is.
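
A minimal sketch of what `clear_cache()` could look like, assuming `self._cache` is a `collections.OrderedDict` used as an LRU and that definitions map a model key to a loader callable. The constructor arguments, `get()`, and the lock are assumptions for illustration; the real `ModelRegistry` may differ.

```python
import threading
from collections import OrderedDict
from typing import Any, Callable, Dict, List


class ModelRegistry:
    """Toy registry: each definition maps a model key to a loader callable."""

    def __init__(self, definitions: Dict[str, Callable[[], Any]], max_size: int = 8):
        self._definitions = definitions
        self._max_size = max_size
        self._cache = OrderedDict()   # insertion order doubles as LRU order
        self._lock = threading.Lock()

    def get(self, key: str) -> Any:
        with self._lock:
            if key in self._cache:
                self._cache.move_to_end(key)     # mark as most recently used
                return self._cache[key]
            pipeline = self._definitions[key]()  # load on a cache miss
            self._cache[key] = pipeline
            if len(self._cache) > self._max_size:
                self._cache.popitem(last=False)  # evict the least recently used
            return pipeline

    def clear_cache(self) -> None:
        # Clearing the OrderedDict drops every entry and its ordering at once.
        with self._lock:
            self._cache.clear()

    def warm_up(self) -> List[str]:
        # Already-cached keys are hits, so calling this repeatedly is idempotent.
        for key in self._definitions:
            self.get(key)
        return list(self._definitions)
```

With this shape, a second `warm_up()` call triggers no loads, while `clear_cache()` followed by `warm_up()` loads every definition again.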

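`RoutingService` needs a new `update_routes()` method that replaces `self.routes` atomically after the config module is reloaded. `importlib.reload` re-executes the module in place, but an object that already holds the old routes, such as the live `RoutingService`, keeps them until they are reassigned. The swap should be atomic so that a request in flight sees either the old routes or the new ones, never a mix.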
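
One way `update_routes()` might be written is to build the new table completely and then rebind `self.routes` in a single assignment. This is a sketch under assumptions: the route schema (a name-to-model-key mapping), `resolve()`, and the write lock are all hypothetical, and the "readers never see a half-updated table" property relies on readers taking one snapshot of `self.routes` per request.

```python
import threading
from types import MappingProxyType
from typing import Mapping


class RoutingService:
    def __init__(self, routes: Mapping[str, str]):
        self._write_lock = threading.Lock()
        # A read-only view over a private copy, so callers cannot mutate it.
        self.routes = MappingProxyType(dict(routes))

    def resolve(self, name: str) -> str:
        routes = self.routes   # one snapshot; a concurrent swap cannot alter it
        return routes[name]

    def update_routes(self, new_routes: Mapping[str, str]) -> None:
        # Build the replacement fully before publishing it.
        snapshot = MappingProxyType(dict(new_routes))
        with self._write_lock:        # serialises concurrent reloads
            self.routes = snapshot    # single attribute rebind


service = RoutingService({"default": "sentiment:v1"})
service.update_routes({"default": "classifier:v2"})
```

Mutating the existing dictionary in place (for example `self.routes.clear()` followed by `update()`) would expose an empty or partial table to concurrent readers, which is why the sketch replaces the object instead.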


### Alternatives considered

Restarting the server. This causes downtime and drops in-flight requests, which is not acceptable in production.

### Area

Inference / prediction
