Skip to content

fix: review fixes for tunnel auto-reconnect, uptime & SOCKS detection#13

Merged
rekurt merged 6 commits into
masterfrom
claude/serene-babbage-tlean2
Jun 12, 2026
Merged

fix: review fixes for tunnel auto-reconnect, uptime & SOCKS detection#13
rekurt merged 6 commits into
masterfrom
claude/serene-babbage-tlean2

Conversation

@rekurt

@rekurt rekurt commented Jun 12, 2026

Copy link
Copy Markdown
Owner

Что это

Carries the feature work from #12 (SOCKS/proxy detection, tunnel uptime/copy, auto-reconnect) plus the fixes for the defects raised in code review of that PR. The single fix commit (f1b9138) sits on top of #12's three commits, so it can be merged as-is or cherry-picked into #12.

Исправленные замечания ревью

Критичное

  1. Backoff теперь реально растёт для недоступного хоста. Сброс backoff больше не происходит при каждом respawn (где ssh переживает 150 мс spawn-check, но умирает через секунды на TCP-таймауте). Сброс перенесён в refresh_status и срабатывает только когда туннель продержался AliveSTABILITY_THRESHOLD (30s). Backoff удваивается на каждой попытке до MAX_BACKOFF. → 2s → ×2 → 60s, как и заявлено.
  2. Авто-реконнект больше не блокирует UI. Respawn идёт через неблокирующий spawn_ssh_args_nowait (без thread::sleep(150ms)). Валидирующий spawn оставлен для интерактивного пути (первичный запуск + ручной restart).
  3. s (save) больше не удаляет реконнектящийся туннель. drop_failed прунит только туннели Failed И уже не авто-реконнектящиеся (исчерпавшие ретраи / отключённые).

Средней важности

  1. Нет ложного no listener. В grace-окне после (ре)старта и при паузе авто-обновления (скан устарел) показываем alive вместо жёлтого предупреждения.
  2. Зафиксировано ограничение health-check — он advisory/view-only, reconnect-loop его не лечит (документировано в коде).
  3. command_string() экранирует аргументы — identity-файл с пробелом в пути вставляется как один аргумент.

Мелочи

  1. fmt_uptimeprt_core::core::scanner::format_uptime рядом с format_duration (дедуп).
  2. auto_reconnect — теперь живой флаг: сбрасывается после MAX_RECONNECT_ATTEMPTS, заодно решая п.3; восстанавливается при ручном restart.
  3. Единый приватный конструктор from_child для spawn/spawn_with_host.
  4. Перекрёстные ссылки между PROXY_PORTS и known_ports.

Тесты

  • Новые юнит-тесты shell_quote; тест fmt_uptime переехал в scanner как format_uptime_buckets.
  • cargo test --workspace зелёный (241), clippy --all-targets и fmt --check чисты.

https://claude.ai/code/session_01V2L7zMrNT4vcJFjAHyuShp


Generated by Claude Code

claude added 4 commits June 12, 2026 16:53
Add a ProxyListening heuristic that flags processes listening on
well-known SOCKS/proxy ports (1080, 1081, 3128, 9050, 9150). 8080/8443
are deliberately excluded to avoid false positives on http-alt servers.
The new reason plugs into the existing enrichment pipeline, so proxy
listeners get the magenta + [!] treatment and match the 'suspicious'
filter automatically. Also adds tor-socks/socks names to known_ports.
- Add an Uptime column (compact 45s/12m/3h04m/2d05h formatting), reset
  on restart.
- Layer a listener health check over the Alive status: an alive ssh
  child with no local listener on its port (broken -D/-L bind) now shows
  a yellow 'no listener' warning instead of a misleading green 'alive'.
  Reuses already-scanned LISTEN entries — opens no new connections.
- Add 'c' key to copy the tunnel's full ssh command to the clipboard,
  reusing the existing copy_to_clipboard helper.
- New i18n strings (uptime column, no-listener, copy hint) in en/ru/zh.
Tunnels that die on their own are now automatically restarted with
exponential backoff (2s doubling up to 60s) so an unreachable host
isn't hammered. reconnect_failed() runs each loop iteration right after
cleanup(); gating is by next_retry_at rather than the tick, and a
successful restart resets the backoff. Manual kill removes a tunnel
entirely, so only self-inflicted failures reconnect.
Fixes the defects raised in code review of the auto-reconnect / uptime /
SOCKS-detection work:

Critical
- Exponential backoff now actually grows for an unreachable host. The reset
  no longer happens on every respawn (where `ssh` survives the 150ms spawn
  check but dies seconds later on TCP timeout); it moves to refresh_status and
  only fires once the tunnel has stayed Alive for STABILITY_THRESHOLD (30s).
  Backoff doubles per attempt up to MAX_BACKOFF.
- Auto-reconnect no longer blocks the UI thread: it respawns via a
  non-blocking spawn_ssh_args_nowait (no 150ms liveness sleep). The validating
  spawn is kept for the interactive path (initial spawn + manual restart).
- save no longer silently deletes a tunnel that's mid-reconnect: drop_failed
  prunes only tunnels that are Failed AND no longer auto-reconnecting.

Medium
- No more false "no listener": within a grace window after (re)start, or while
  auto-refresh is paused (stale scan), trust the Alive status instead.
- Documented that the listener health check is advisory/view-only and not acted
  on by the reconnect loop.
- command_string now shell-quotes each argument so identity paths with spaces
  paste back as a single argument.

Cleanup
- format_uptime moved into prt-core scanner next to format_duration (dedup).
- auto_reconnect is now a live flag: cleared after MAX_RECONNECT_ATTEMPTS so a
  permanently unreachable host stops being retried; re-enabled on manual restart.
- Single private from_child constructor shared by spawn/spawn_with_host.
- Cross-referenced PROXY_PORTS and known_ports so the two lists stay in sync.

Adds shell_quote unit tests; moves the fmt_uptime test to scanner as
format_uptime_buckets. cargo test/clippy/fmt all clean.

https://claude.ai/code/session_01V2L7zMrNT4vcJFjAHyuShp
@rekurt rekurt marked this pull request as ready for review June 12, 2026 19:37

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f1b91387aa

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread crates/prt/src/views/tunnels.rs Outdated
Comment on lines +24 to +25
app.session.entries.iter().any(|e| {
e.entry.state == ConnectionState::Listen && e.entry.local_addr.port() == local_port

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Confirm the listener belongs to this tunnel

In a local-port conflict, this reports the tunnel as healthy even when ssh failed to install the forward: OpenSSH defaults to exitonforwardfailure no (checked with ssh -G example.com), so the ssh child can keep running while another process owns local_port. Since this check only matches LISTEN + port, any unrelated or recently retained listener on the same port makes the status green instead of no listener; compare against the spawned ssh child's PID/process or start ssh with ExitOnForwardFailure=yes so bind failures cannot be masked.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed in 79dd4c1. has_local_listener now also requires the LISTEN socket's PID to be this tunnel's ssh child (SshTunnel::pid()), so an unrelated/retained process holding the same port no longer masks a bind failure as green. For -L/-D the ssh child is the process that binds the local port, so the PID match is exact.

I considered ExitOnForwardFailure=yes too — it's stronger (ssh exits on bind failure → tunnel goes Failed → the auto-reconnect loop heals it, which also addresses the "health check is view-only" limitation). But it changes ssh-arg generation globally in prt-core and its tests, so I kept this contained PID fix here and left that as a possible follow-up.


Generated by Claude Code

Address Codex review: OpenSSH defaults to `ExitOnForwardFailure no`, so on a
local-port conflict the ssh child keeps running while another process owns the
port. The previous `LISTEN + port` match would then report the tunnel green
even though the forward never bound. Require the listening socket's PID to be
this tunnel's ssh child so a masked bind failure correctly shows `no listener`.

https://claude.ai/code/session_01V2L7zMrNT4vcJFjAHyuShp

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 79dd4c1e4b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread crates/prt/src/forward.rs
Comment on lines +394 to +395
if tunnel.retry_count >= MAX_RECONNECT_ATTEMPTS {
tunnel.auto_reconnect = false;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep auto-reconnect enabled after a stable final retry

When the retry that reaches MAX_RECONNECT_ATTEMPTS actually spawns successfully, this still flips auto_reconnect to false before that child has a chance to prove stable. refresh_status later resets the backoff and retry_count after STABILITY_THRESHOLD, but it never turns auto_reconnect back on, so a tunnel that recovers on its 10th attempt and then drops later is left permanently failed instead of reconnecting again. Only disable after a failed respawn, or re-enable it when the tunnel becomes stably alive.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 5ced54e. refresh_status now also sets auto_reconnect = true inside the STABILITY_THRESHOLD reset block (alongside resetting backoff/retry_count). So a tunnel that recovers on its final attempt and stays Alive ≥ 30s is re-armed and will reconnect again if it later drops; a tunnel that never proves stable stays given-up (auto_reconnect = false) as intended.

I kept the give-up keyed on attempt count rather than only on a failed respawn, because auto-reconnect uses the non-blocking spawn — respawn() returns Ok as soon as the ssh binary launches, and the real "host unreachable" failure surfaces a few seconds later via cleanup(), not from respawn's return. So gating the give-up on Err alone would never trip for the main unreachable-host case; the re-arm-on-stable path is what keeps a genuine recovery from being stranded.


Generated by Claude Code

Address Codex review: the give-up at MAX_RECONNECT_ATTEMPTS flips
auto_reconnect to false even when that attempt's respawn succeeds. If the host
actually recovers on the final attempt, refresh_status resets the backoff and
retry_count after STABILITY_THRESHOLD but previously left auto_reconnect false,
so a recovered tunnel that later dropped was stuck permanently failed. Re-enable
auto_reconnect alongside the stability reset so a proven-healthy tunnel is always
eligible to reconnect again.

https://claude.ai/code/session_01V2L7zMrNT4vcJFjAHyuShp
@rekurt rekurt merged commit ba0aad2 into master Jun 12, 2026
3 checks passed
@rekurt rekurt deleted the claude/serene-babbage-tlean2 branch June 12, 2026 20:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants