fix(dashboard): detect dead tunnels as unhealthy #180
Merged
The tunnel health check was reporting healthy=true for dead WireGuard
tunnels because it only checked if an endpoint was configured (static
config) and if the link was UP. It did not verify the tunnel was
actually passing traffic.
Added two helper functions:
- _parse_transfer_bytes: parses WG transfer strings (e.g. '5.69 KiB')
- _has_valid_handshake: checks latest-handshake is not 0/none/idle
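For illustration, the two helpers could look roughly like the sketch below. It is not the code from this PR: the function names drop the leading underscore to mark them as stand-ins, the unit table assumes the binary suffixes that `wg show` prints, and treating an empty value as "no handshake" is an extra assumption.

```python
import re

# Binary unit multipliers (assumed to match the suffixes printed by `wg show`).
_UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}


def parse_transfer_bytes(text):
    """Convert a string such as '5.69 KiB' to an integer byte count.

    Returns 0 when the string cannot be parsed.
    """
    match = re.match(r"\s*([0-9]+(?:\.[0-9]+)?)\s*([KMGT]iB|B)\b", text or "")
    if not match:
        return 0
    value, unit = match.groups()
    return int(float(value) * _UNITS[unit])


def has_valid_handshake(value):
    """Return False when latest-handshake is 0/none/idle (or missing)."""
    normalized = (value or "").strip().lower()
    # The empty-string case is an assumption; the PR only names 0/none/idle.
    return normalized not in ("", "0", "none", "idle")
```

With these definitions, `parse_transfer_bytes('5.69 KiB')` truncates 5.69 × 1024 to a whole number of bytes, and `has_valid_handshake('0')` is false.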
_detect_tunnel_interface now requires a completed handshake (and
ideally rx > 0) before marking a tunnel as active. A new handshake_ok
field is exposed in the tunnel status dict.
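A minimal sketch of that detection rule follows. The input dict and its keys (`endpoint`, `latest_handshake`, `rx_bytes`) are hypothetical; the real function works on `awg show` output and may be structured differently. Because the message only says rx > 0 is "ideally" required, this sketch reports `rx_bytes` but does not gate `active` on it.

```python
def detect_tunnel_status(peer):
    """Build a tunnel status dict from one peer's already-parsed fields.

    `peer` is a hypothetical dict; the key names are assumptions made
    for this example only.
    """
    endpoint = peer.get("endpoint")
    handshake = str(peer.get("latest_handshake") or "").strip().lower()
    handshake_ok = handshake not in ("", "0", "none", "idle")
    rx_bytes = int(peer.get("rx_bytes") or 0)
    return {
        "endpoint": endpoint,
        "handshake_ok": handshake_ok,
        "rx_bytes": rx_bytes,
        # An endpoint alone is static config, so a handshake is also required.
        "active": bool(endpoint) and handshake_ok,
    }
```

A peer with an endpoint but a handshake value of `0` therefore comes back with `active` and `handshake_ok` both false.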
_routing_status healthy check for tunnel mode now also requires
active=true and handshake_ok=true on the primary tunnel.
Before: dead tunnel with endpoint configured → healthy: true
After: dead tunnel with no handshake/rx → healthy: false
with reason: 'endpoint configured, no handshake (VPN server unreachable)'
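The before/after behaviour can be sketched as a small decision function. The function name, the `(healthy, reason)` return shape, and every reason string except the no-handshake one are placeholders chosen for this example, not taken from the PR.

```python
def tunnel_health(tunnel):
    """Return (healthy, reason) for the primary tunnel's status dict."""
    if not tunnel.get("endpoint"):
        # Placeholder reason string, not from the PR.
        return False, "no endpoint configured"
    if not tunnel.get("handshake_ok"):
        # Reason string quoted from the commit message.
        return False, "endpoint configured, no handshake (VPN server unreachable)"
    if not tunnel.get("active"):
        # Placeholder reason string, not from the PR.
        return False, "tunnel not active"
    return True, "ok"
```

The point of the ordering is that a configured endpoint no longer short-circuits to healthy: both `handshake_ok` and `active` must hold first.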
Problem
The dashboard health check reported healthy: true for dead WireGuard tunnels. It only verified that an endpoint was configured and the link was UP; both are static kernel state and remain true even when the VPN server is completely unreachable.

This led to a false "healthy" indicator on the dashboard while the proxy was unable to reach any Telegram DC (all upstream connections stuck in SYN-SENT).
Root Cause
_detect_tunnel_interface() set active = True solely based on the presence of an endpoint field in awg show output. But endpoint is a static config value: it is always present regardless of whether the tunnel is actually passing traffic.
Fix
- Added _parse_transfer_bytes() to parse WG transfer strings (5.69 KiB → bytes)
- Added _has_valid_handshake() to check latest-handshake is not 0/none/idle
- _detect_tunnel_interface() now requires a completed handshake before marking active = True
- handshake_ok field exposed in tunnel status
- _routing_status() healthy check for tunnel mode now requires active and handshake_ok
- Unhealthy reason reported: endpoint configured, no handshake (VPN server unreachable)
Before / After
| Before | After |
| --- | --- |
| healthy: true ❌ | healthy: false ✅ |
| healthy: true ✅ | healthy: true ✅ |
| healthy: false ✅ | healthy: false ✅ |

Already deployed and verified on production.