Skip to content

Add POST /admin/reload endpoint to hot-reload models and routing config without restart #31

@AK11105

Description

@AK11105

What problem does this solve?

After inference-engine deploy writes a new model definition or updates routing config, the running server must be restarted to pick up the changes. In production this means downtime or a rolling restart for a change that should be instantaneous.

Proposed solution

New admin endpoint: POST /admin/reload (required scope: admin)

Behaviour:

  1. Calls registry.clear_cache() — evicts all loaded pipelines from the LRU cache
  2. Re-runs registry.warm_up() — reloads all registered definitions from disk
  3. Re-reads routing config via importlib.reload — picks up any routing changes written by inference-engine deploy
  4. Returns:
{
  "reloaded_models": ["sentiment:v1", "classifier:v2"],
  "routing_updated": true,
  "duration_ms": 142
}

ModelRegistry.clear_cache() evicts all entries from self._cache and resets LRU order. warm_up() is already idempotent.

RoutingService needs a new update_routes() method that replaces self.routes atomically after the config module is reloaded.

Alternatives considered

Server restart. Causes downtime and loses in-flight requests. Not acceptable for production use.

Area

Inference / prediction

Metadata

Metadata

Assignees

Labels

enhancementNew feature or request

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions