
Holepunch monitor falsely reports direct success when peer endpoint is reachable via the tunnel's own routed subnet, causing connection flap #60

@krmbluek

Describe the Bug

Summary

When a Newt site registers with an endpoint IP that falls inside a subnet the CLI also routes through the WireGuard tunnel, the holepunch monitor's test packets are routed through the tunnel itself (via the relay), receive responses, and are misclassified as successful direct holepunches. The CLI then switches the peer to direct mode, the data plane fails because the registered endpoint isn't reachable over the underlying network, and the connection cycles between relay and broken-direct indefinitely.

The --holepunch=false flag is documented but does not stop this behavior — only the initial rapid startup test honors the flag, while the ongoing holepunch monitor continues to run and trigger the cycle.

Topology

  • Two Newt sites configured on the Pangolin server:
    • Site 3 registered with the office WAN IP — works fine, stays on relay, no flap
    • Site 6 registered with a LAN IP 10.1.10.16:63047 — flaps continuously
  • Site 6's LAN IP got registered automatically because the Newt host shares a LAN with the Pangolin server, so gerbil saw the registration arrive from the LAN-side source IP. No explicit endpoint advertisement was set on Newt — this is default-configuration behavior.
  • The CLI is running on a roaming client on a completely separate network (10.0.120.0/24) with no underlying-network path to either the Pangolin server's WAN IP or to 10.1.10.16.
  • The CLI installs a route for 10.1.0.0/16 via the pangolin interface, which covers the misregistered Site 6 endpoint.
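
The overlap described in the last bullet can be checked mechanically. Below is a minimal sketch using Python's standard `ipaddress` module. The helper name `endpoint_in_tunnel_routes` is hypothetical and not part of the Pangolin CLI; the endpoint and route values are the ones from this report.

```python
import ipaddress

def endpoint_in_tunnel_routes(endpoint, tunnel_routes):
    """Return the tunnel routes (CIDR strings) that cover an IPv4 'ip:port' endpoint."""
    host = endpoint.rsplit(":", 1)[0]  # strip the port; IPv4 endpoints only
    ip = ipaddress.ip_address(host)
    return [r for r in tunnel_routes if ip in ipaddress.ip_network(r)]

# Site 6's registered endpoint is covered by the route installed on the pangolin interface.
print(endpoint_in_tunnel_routes("10.1.10.16:63047", ["10.1.0.0/16"]))
```

A non-empty result means probes sent to that endpoint through the normal routing table will enter the tunnel instead of the underlying network.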

Environment

  • Pangolin CLI version: 0.6.2 (latest at time of filing)
  • OS: Ubuntu 24.04 LTS, x86_64 (Surface Pro 9, linux-surface kernel)
  • Pangolin server version: 1.17.0
  • Newt version on peer site: 1.11.0, native binary running as systemd service, no explicit endpoint flag (default args: --id <redacted> --secret <redacted> --endpoint https://pangolin.example.com)

To Reproduce

  1. Configure a Newt site such that its registered endpoint is a private IP. This happens automatically when the Newt host shares a LAN with the Pangolin server.
  2. Configure a resource that pushes a route covering that private IP through the tunnel (e.g. 10.1.0.0/16).
  3. From a network with no underlying path to either the Pangolin server's WAN or the registered LAN IP, run pangolin up (or pangolin up --holepunch=false).
  4. Run pangolin logs client --follow and observe the cycle.

Expected Behavior

The holepunch test fails because there is no underlying-network path to the endpoint. The peer stays on relay, and the connection is stable.

Additionally, --holepunch=false should disable holepunch attempts entirely, including the ongoing monitor.
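
One way to read this expectation is that a single setting should gate every holepunch code path. The sketch below is purely illustrative and does not reflect how the CLI is actually structured; `HolepunchController`, `startup_test`, and `monitor_tick` are hypothetical names.

```python
from dataclasses import dataclass, field

@dataclass
class HolepunchController:
    """Hypothetical model: one flag gates both the startup test and the ongoing monitor."""
    enabled: bool
    probes_sent: list = field(default_factory=list)

    def startup_test(self, peer):
        if not self.enabled:
            return False  # flag off: no probe is sent, peer stays on relay
        self.probes_sent.append(("startup", peer))
        return True

    def monitor_tick(self, peer):
        if not self.enabled:
            return False  # the same gate applies to the ongoing monitor
        self.probes_sent.append(("monitor", peer))
        return True

ctl = HolepunchController(enabled=False)
ctl.startup_test("site-6")
ctl.monitor_tick("site-6")
print(ctl.probes_sent)  # stays empty when the flag is off
```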

Actual behavior

The startup rapid-test correctly fails (it tests via the underlying socket before the tunnel is up). But once the tunnel is up and the relay is working, the holepunch connection monitor sends test packets to the registered endpoint via the standard kernel routing table. Those packets match the 10.1.0.0/16 tunnel route, traverse the tunnel via the relay, reach the legitimate Newt peer (which actually is at 10.1.10.16 on its own LAN), and get a response. The monitor reports "CONNECTED" with a relay-roundtrip RTT and switches the peer to direct mode.
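
The misclassification follows from longest-prefix-match routing. The sketch below models a simplified routing table to show why the probe leaves via the tunnel. The interface name `wlan0` and the default route are assumptions for illustration; `pangolin`, 10.1.0.0/16, 10.0.120.0/24, and 10.1.10.16 come from this report. `is_genuinely_direct` is a hypothetical guard, not an existing CLI function.

```python
import ipaddress

# Simplified routing table: (prefix, outgoing interface).
ROUTES = [
    ("0.0.0.0/0", "wlan0"),        # assumed underlying default route
    ("10.0.120.0/24", "wlan0"),    # client's local network
    ("10.1.0.0/16", "pangolin"),   # route installed by the CLI
]

def egress_interface(dest, routes=ROUTES):
    """Pick the outgoing interface using longest-prefix match."""
    ip = ipaddress.ip_address(dest)
    matches = [(ipaddress.ip_network(p), dev) for p, dev in routes
               if ip in ipaddress.ip_network(p)]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]

def is_genuinely_direct(probe_ok, endpoint_ip, tunnel_dev="pangolin"):
    """Count a probe reply as direct only if the probe left via the underlying network."""
    return probe_ok and egress_interface(endpoint_ip) != tunnel_dev

print(egress_interface("10.1.10.16"))
print(is_genuinely_direct(True, "10.1.10.16"))
```

With a guard of this kind, a reply that arrived through the tunnel would not be counted as a direct holepunch. Binding the probe socket to the underlying interface might achieve the same effect.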

The data plane then fails because the WireGuard endpoint can't be reached over the underlying network. After a timeout of about 5 seconds, the CLI falls back to relay. The monitor immediately succeeds again (same recursive path), switches the peer back to direct, and the data plane fails again. The cycle repeats indefinitely, every 15-30 seconds.

This happens whether --holepunch is true or false.
