OmniRouter: synthetic 'hal0' upstream causes 502 PoolTimeout on /v1/chat/completions #412

@thinmintdev

Summary

Calls to POST /v1/chat/completions for a model that is served by a local slot (agent-hermes in our case) consistently return 502 with:

{"error":{"code":"dispatch.upstream_unavailable","message":"upstream 'hal0' unreachable: PoolTimeout","details":{"upstream":"hal0","target":"http://127.0.0.1:8080/v1/chat/completions","error":""}}}

The dispatch target is hal0-api's own bind URL. From the api startup log:

slots.autoregistered_composite upstream=hal0
omni_router.attached base_url=http://127.0.0.1:13305

/api/slots lists a kind: "slot" entry named hal0 with _synthetic: true, advertising url=http://127.0.0.1:8080/v1. Even when every advertised model is matched by a real local slot (agent-hermes, primary), the synthetic hal0 slot wins routing, and the request is dispatched to a URL that resolves to hal0-api itself.

Reproduction

On hal0 v0.3.0a1, with agent-hermes slot loaded and /v1/models advertising qwen3-coder-reap-25b-a3b-q5km:

curl -sS -X POST http://127.0.0.1:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"qwen3-coder-reap-25b-a3b-q5km","messages":[{"role":"user","content":"hi"}],"max_tokens":5}' \
  --max-time 60

→ 502 dispatch.upstream_unavailable after ~30-60s.

The same request against the llama-server slot directly works:

curl http://127.0.0.1:8001/v1/chat/completions ...   # → 200, generates response

Expected

Either the synthetic upstream should not exist when a same-model local slot is configured, or _synthetic_reason: "install a local slot of the same name to take over" should describe how to actually achieve that. Creating a slot with name = hal0 via hal0 slot create doesn't take over the synthetic; the real slot still loses routing.

Workaround

Point OpenAI-compatible clients at the slot's port directly (http://127.0.0.1:8001/v1) instead of the omnirouter. This loses model swap routing across slots but at least chat works.

Suggested fix area

  • omni_router should refuse to dispatch to a URL that resolves to its own bind.
  • slots.autoregistered_composite should defer to same-named real slots when present, instead of being preferred.
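
A rough sketch of both suggestions, in case it helps triage. This is not hal0's actual code: targets_self, pick_slot, and the "models" key on a slot entry are hypothetical names made up for illustration. Only _synthetic, url, and the slot names come from the /api/slots output described above. It also assumes that every loopback spelling (127.0.0.1, localhost, ::1, and a 0.0.0.0 bind) refers to the same listener.

```python
from urllib.parse import urlsplit

# Assumption: all of these spellings reach the same local listener.
LOOPBACK_HOSTS = {"127.0.0.1", "localhost", "::1", "0.0.0.0"}


def _endpoint(url):
    """Reduce a URL to a comparable (host, port) pair."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    port = parts.port or (443 if parts.scheme == "https" else 80)
    if host in LOOPBACK_HOSTS:
        host = "loopback"
    return host, port


def targets_self(target_url, bind_url):
    """True when a dispatch target points back at the router's own bind."""
    return _endpoint(target_url) == _endpoint(bind_url)


def pick_slot(slots, model, bind_url):
    """Pick a slot for `model`: never the router itself, real slots first.

    `slots` is a list of dicts. The "models" key is a hypothetical field
    holding the model ids a slot serves.
    """
    candidates = [
        s for s in slots
        if model in s.get("models", []) and not targets_self(s["url"], bind_url)
    ]
    # Stable sort: non-synthetic (False) entries come before synthetic (True).
    candidates.sort(key=lambda s: bool(s.get("_synthetic", False)))
    return candidates[0] if candidates else None
```

With the two slots from this report (synthetic hal0 on :8080, agent-hermes on :8001) and a bind of http://127.0.0.1:8080, pick_slot would return agent-hermes. With only the synthetic entry present it returns None, so the router could fail fast with a clear error instead of waiting for a PoolTimeout.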

Environment

  • hal0 v0.3.0a1, CT 105 (LXC), AMD Strix Halo
  • Discovered while wiring NousResearch/hermes-agent via openrouter spawn
