Slot config idle_timeout_s not honored by lemond — slots evict aggressively regardless #414

@thinmintdev

Summary

Setting idle_timeout_s in a slot's config (via PUT /api/slots/{name}/config) does not change lemond's eviction behaviour. Even with it set to 86400 (24h), the slot is evicted roughly 60-90s after the last request.

Reproduction

curl -X PUT http://127.0.0.1:8080/api/slots/agent-hermes/config \
  -H 'Content-Type: application/json' \
  -d '{"name":"agent-hermes","type":"llm","device":"gpu-vulkan","group":"chat",
       "port":8001,"idle_timeout_s":86400,
       "model":{"default":"qwen3-coder-reap-25b-a3b-q5km","ctx_size":65536}}'

# Confirm the config was accepted:
curl -sS http://127.0.0.1:8080/api/slots/agent-hermes/config
# → ...,"idle_timeout_s": 86400,...

# Load it:
curl -X POST http://127.0.0.1:8080/api/slots/agent-hermes/load \
  -d '{"model_id":"qwen3-coder-reap-25b-a3b-q5km"}'

# Wait 90s, then re-check:
sleep 90
curl http://127.0.0.1:8080/api/slots/agent-hermes
# → state: offline, message: "model evicted from lemond (auto-reloads on next request)"
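
To pin down the actual eviction delay instead of relying on a fixed sleep 90, the status endpoint can be polled until the slot goes offline. The following is a minimal sketch, not part of hal0. It assumes that GET /api/slots/{name} returns a JSON object with a top-level "state" key (as the output above suggests) and that polling this endpoint does not itself count as activity on the slot. The function names and defaults are hypothetical.

```python
import json
import time
import urllib.request

BASE_URL = "http://127.0.0.1:8080"  # same host/port as the reproduction


def fetch_state(slot, base_url=BASE_URL):
    """Return the slot's reported state string.

    Assumption: the response is a JSON object with a top-level "state" key.
    """
    url = f"{base_url}/api/slots/{slot}"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp).get("state")


def seconds_until_offline(slot, fetch=fetch_state, max_wait_s=300.0,
                          poll_s=5.0, clock=time.monotonic, sleep=time.sleep):
    """Poll until the slot reports "offline" and return the elapsed seconds.

    Returns None if the slot is still not offline after max_wait_s.
    The result is only accurate to within one poll interval.
    """
    start = clock()
    while True:
        elapsed = clock() - start
        if fetch(slot) == "offline":
            return elapsed
        if elapsed >= max_wait_s:
            return None
        sleep(poll_s)
```

Calling seconds_until_offline("agent-hermes") right after the load request returns the observed delay; a value near 60-90 with idle_timeout_s set to 86400 confirms the bug, while None means the slot survived the whole wait window.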

Expected

With idle_timeout_s: 86400 configured, the slot should remain loaded for up to 24h of idle, not 60-90s.

Impact

For agent runtimes that depend on multi-turn tool-call flows (hermes-agent here), the slot is evicted mid-conversation, between the first inference pass (which generates the tool-call JSON) and the second pass (which processes the tool result). The second hop then fails with "Connection error", even though the slot was healthy 30s earlier.

Workaround

A systemd timer that re-issues hal0 slot load <name> every 4 minutes. It is functional, but it adds load to lemond and races against the eviction, so the slot can still be offline when a request arrives.

Suggested fix area

The slot config's idle_timeout_s needs to actually reach lemond's eviction loop. Right now the value is stored and returned by the config endpoint, but it appears to be ignored when lemond decides what to evict.
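
To make the expected behaviour concrete, here is a purely illustrative sketch of an eviction check that honours a per-slot override. It is not lemond's code: the Slot type, the should_evict function, and the 60s default are all hypothetical, chosen only to mirror the timings observed above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical global default; lemond's real value is not known.
DEFAULT_IDLE_TIMEOUT_S = 60.0


@dataclass
class Slot:
    name: str
    last_request_at: float                  # timestamp of last request, in seconds
    idle_timeout_s: Optional[float] = None  # per-slot override from the slot config


def should_evict(slot, now, default_timeout_s=DEFAULT_IDLE_TIMEOUT_S):
    """Return True once the slot has been idle for at least its timeout.

    The per-slot idle_timeout_s wins when set; otherwise the default applies.
    """
    if slot.idle_timeout_s is not None:
        timeout = slot.idle_timeout_s
    else:
        timeout = default_timeout_s
    return (now - slot.last_request_at) >= timeout
```

Under this sketch, a slot configured with idle_timeout_s=86400 would survive a 90s idle period, while a slot without an override would be evicted. The reported behaviour matches what would happen if the override were never consulted.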

Environment

  • hal0 v0.3.0a1, CT 105
