Summary
When the gateway source cannot bind its port (EADDRINUSE), the daemon exits and the launchd/systemd service respawns it immediately, producing an unbounded crash loop with no backoff and no detection of who owns the port.
Evidence
Observed on a dev machine: 161 distinct daemon PIDs in ~30 minutes (one respawn roughly every 10s), each dying with:
listen EADDRINUSE: address already in use 127.0.0.1:8787
The port was held by an unrelated long-lived process. The loop ran for days, burning CPU, and was invisible at the CLI: every hyp command just printed a generic warning.
Problems
- No backoff or respawn cap; launchd KeepAlive + an always-failing daemon = hot loop.
- No detection/report of which process owns the port.
- No circuit breaker to park in a diagnosable stopped state.
Proposed
- Classify EADDRINUSE as a distinct, terminal-ish error class.
- Add respawn throttling (launchd ThrottleInterval / in-process exponential backoff) and a cap that parks the daemon in a clearly-stopped, diagnosable state after N fast failures.
- On bind failure, identify and report the owning PID/command.
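A rough sketch of how the classification and the throttle/cap could fit together, to make the proposal concrete. Everything here is hypothetical: the names (classifyBindError, nextDecision, ThrottleConfig), the thresholds, and the assumption that the daemon retries the bind in-process instead of exiting are illustrative, not taken from the gateway code.

```typescript
// Hypothetical sketch: names and thresholds are assumptions, not existing code.

type BindErrorClass = "port-in-use" | "other";

// Node.js reports a failed listen() with err.code === "EADDRINUSE".
function classifyBindError(err: unknown): BindErrorClass {
  const code = (err as { code?: unknown } | null | undefined)?.code;
  return code === "EADDRINUSE" ? "port-in-use" : "other";
}

interface ThrottleConfig {
  baseDelayMs: number;         // delay after the first fast failure
  maxDelayMs: number;          // upper bound on any single delay
  fastFailureWindowMs: number; // an attempt that dies sooner than this is "fast"
  maxFastFailures: number;     // the "N" after which the daemon parks
}

interface ThrottleState {
  consecutiveFastFailures: number;
}

type Decision =
  | { action: "retry"; delayMs: number; state: ThrottleState }
  | { action: "park"; state: ThrottleState };

// Decide what to do after a failed attempt that lasted uptimeMs.
// A slow failure resets the counter; N consecutive fast failures park the daemon.
function nextDecision(
  state: ThrottleState,
  uptimeMs: number,
  cfg: ThrottleConfig,
): Decision {
  const fast = uptimeMs < cfg.fastFailureWindowMs;
  const failures = fast ? state.consecutiveFastFailures + 1 : 0;
  const next: ThrottleState = { consecutiveFastFailures: failures };

  if (failures >= cfg.maxFastFailures) {
    return { action: "park", state: next };
  }

  // Exponential backoff: base, 2x base, 4x base, ... capped at maxDelayMs.
  const exponent = Math.max(0, failures - 1);
  const delayMs = Math.min(cfg.maxDelayMs, cfg.baseDelayMs * 2 ** exponent);
  return { action: "retry", delayMs, state: next };
}
```

With this shape, a "port-in-use" classification could skip straight to "park" (or use a lower N), since retrying quickly is unlikely to help while another process holds the port. The parked state is where the owning PID/command would be recorded for the CLI to report.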
Note
The user-facing error text for this case is being improved separately on branch hypaware-version (actionable EADDRINUSE message). This issue tracks the loop/backoff/detection behavior.