Skip to content

add additional check for internet connectivity#1776

Merged
jasonlyc merged 1 commit intofirewalla:masterfrom
chgchi:my_dev
Mar 10, 2026
Merged

add additional check for internet connectivity#1776
jasonlyc merged 1 commit intofirewalla:masterfrom
chgchi:my_dev

Conversation

@chgchi
Copy link
Contributor

@chgchi chgchi commented Mar 6, 2026

No description provided.

@MelvinTo
Copy link
Contributor

Quick review for PR #1776, overall direction is good (avoid false WAN-down when ping targets are unreachable), but I see 2 correctness risks worth fixing before merge:

  1. Fallback can still mark WAN down on idle links (false negative)

    • _checkWanStatusByConntrack() relies on observing conntrack -E -e UPDATE events within 5s.
    • If the WAN is healthy but there’s no qualifying traffic in that short window, it returns false and WAN is marked inactive.
    • This changes failure mode from "ICMP unreachable" to "no recent traffic during sample window".
  2. Only first IPv4 is checked

    • const replyDst = ip4s[0]...
    • On WAN with multiple IPv4s/aliases, fallback may miss active flows bound to other IPs and report down incorrectly.

Suggestions

  • Check all interface IPv4s (loop each reply-dst).
  • Prefer a more deterministic signal than a short event window (e.g., snapshot query of recent conntrack entries for this interface/IPs, or longer/parameterized observation window).
  • Optional: keep active unchanged when fallback has low confidence (instead of forcing down), depending on desired fail-open/fail-closed behavior.

@chgchi
Copy link
Contributor Author

chgchi commented Mar 10, 2026

For PR #1776 comments:

  1. Regarding WANs with multiple IPv4 addresses/aliases:
  • The code has been modified to use Promise.all to check each IP individually.
  1. Regarding querying a snapshot of recent connection tracking entries for this interface/IP address, this was the initial solution considered, but it has the following drawbacks:
  • Stale Connections: Even if the WAN disconnects, entries in the Conntrack table do not immediately disappear. TCP connections are maintained in the table based on ip_conntrack_tcp_timeout_established (86400s) if no FIN/RST packets are received.

  • In high-concurrency environments, the Conntrack table can become very large. Traversing the entire table to filter active connections for a specific interface is very CPU-intensive.

  1. Parameterize monitor time, combined with this._wanConnState for judgment:
  • default → 5s (initial check)
  • ON_OFF_THRESHOLD(2) >= this._wanConnState.failureCount > 0 → 10s, longer after initial failures to reduce false "down"
  • otherwise OFF_ON_THRESHOLD(5) >= this._wanConnState.successCount > 0 → 8s, longer after initial success for stability
  1. The fallback check is an additional check performed after a ping test fails. If the fallback check fails, the result of the ping test is preserved.

@MelvinTo
Copy link
Contributor

Re-reviewed latest PR #1776 update.

Good news: the two key risks previously raised are now addressed.

  • Multi-IPv4 handled: _checkWanStatusByConntrack() now checks all interface IPv4s (replyDsts) instead of only ip4s[0].
  • Idle-window false negatives reduced: conntrack monitor window is now tunable/adaptive (5/8/10s based on state), instead of fixed 5s.

Current verdict: LGTM from correctness perspective for the originally reported issues.

Optional follow-up (not blocker):

  • conntrack fallback can still miss very-low-traffic healthy links (inherently sampling-based). If needed later, consider adding a snapshot-based signal in addition to event-stream monitoring.

@MelvinTo MelvinTo requested a review from jasonlyc March 10, 2026 08:34
@jasonlyc jasonlyc merged commit b6a8a0b into firewalla:master Mar 10, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants