Skip to content

fix(security): harden OAuth login, server-side SSRF, audit logging, and agent token#133

Merged
ZingerLittleBee merged 13 commits into
mainfrom
secuity
Jun 8, 2026
Merged

fix(security): harden OAuth login, server-side SSRF, audit logging, and agent token#133
ZingerLittleBee merged 13 commits into
mainfrom
secuity

Conversation

@ZingerLittleBee

@ZingerLittleBee ZingerLittleBee commented Jun 8, 2026

Copy link
Copy Markdown
Owner

Summary

Security-hardening pass for the auth, service-monitor, and admin surfaces. Findings came from a multi-role audit plus a regression-focused review; each fix is committed independently.

Auth

  • OAuth login-CSRF / session-fixation: bind each login flow to the initiating browser via a short-lived HttpOnly pre-auth nonce cookie (constant-time compared, single-use), and adopt PKCE S256. Redact the nonce and PKCE verifier in Debug.
  • Agent WebSocket token: prefer the Authorization header over the ?token= query string (CWE-598), keeping the query param as a fallback for proxies that strip the header.

Server-side SSRF (service-monitor checkers)

  • Move the SSRF guard into a shared serverbee_common::ssrf module. Block loopback / link-local / cloud-metadata (169.254.169.254) / NAT64 and IPv4-mapped forms, while still allowing RFC1918/ULA private ranges for legitimate internal monitoring. Connect only to the validated addresses to close the DNS-rebinding window.
  • Allow custom ports for HTTP keyword checks (validate_monitor_url) so non-standard service ports keep working; the strict 80/443 validator stays on the agent's global-only path.
  • Re-validate the guard on every HTTP redirect hop (reqwest's auto-follow previously bypassed the pin on cross-host redirects).
  • Reject loopback/metadata targets at monitor create/update time (422) so the restriction surfaces at configuration time instead of as silent runtime check failures / false alerts.

Admin surface

  • Audit-log DB backup/restore and user create/update/delete. The restore audit row is written into the freshly restored database (the live pool still holds the pre-restore inode), with a tracing fallback.

Test Plan

  • cargo test -p serverbee-common — 132 passed
  • cargo test -p serverbee-server --lib — 571 passed
  • cargo test -p serverbee-server --test integration — 66 passed
  • cargo clippy --workspace -- -D warnings — 0 warnings
  • Manual: configure an OAuth provider and verify login still works end-to-end (nonce cookie + PKCE)
  • Manual: confirm existing agents reconnect (Authorization header preferred, query fallback)

Move the agent's SSRF guard into serverbee_common::ssrf (single source of
truth) and apply it to the server-side service-monitor checkers. A new
is_monitor_safe_addr / resolve_and_check_monitor blocks loopback, link-local
(incl. cloud metadata 169.254.169.254) and NAT64 while allowing RFC1918
private ranges so legitimate internal monitoring keeps working. HTTP/TCP/SSL
checkers connect only to validated addresses (closing DNS-rebind); the DNS
checker rejects metadata/loopback nameservers.
The SSRF guard restricted HTTP keyword checks to ports 80/443, which broke
monitoring of services on non-standard ports (e.g. :8080, :3000) — a common
internal-monitoring case the address-level guard already supports. Add
validate_monitor_url (any port; scheme + embedded-credentials checks retained)
and use it in the HTTP keyword checker. is_monitor_safe_addr still blocks
loopback/link-local/metadata regardless of port. The strict validate_url
(80/443 only) stays in use on the agent's global-only IP-quality path.
restore_backup renames the live DB to .pre-restore and moves the uploaded
file onto db_path, but state.db's pooled connections still hold the
pre-restore inode. Writing the settings.restore audit row through state.db
therefore lands it in the discarded database, which is lost after the
mandatory restart — exactly the record that matters most. Open a short-lived
connection to the freshly restored file so the entry persists into the DB the
server reopens, and emit a tracing warning as a durable fallback.
The HTTP keyword checker built a single reqwest client pinned to the original
host via resolve_to_addrs but left reqwest's default redirect policy, which
auto-follows up to 10 hops. A monitored endpoint could 3xx-redirect to
169.254.169.254 or an internal host and bypass the SSRF guard entirely. Disable
auto-redirect and follow manually, re-running validate_monitor_url +
resolve_and_check_monitor and re-pinning the client on every hop (mirrors the
agent ip_quality fetcher and widget_module). Redirects are followed as GET
without the body or custom headers so neither is replayed to a redirected host.
The SSRF guard added to the tcp/ssl/http_keyword checkers silently flips any
pre-existing loopback/localhost monitor from passing to failing on upgrade and
can page the notification group after two checker cycles, since targets were
never validated at config time. Reject targets whose literal host is loopback,
link-local, the cloud-metadata endpoint, or "localhost" in
ServiceMonitorService::create/update with a 422, so the restriction surfaces at
configuration time instead of as runtime false-alerts. RFC1918/ULA private
ranges stay allowed for internal monitoring; dns (nameserver-only guard) and
whois are not address-checked. Document the restriction and update the
service-monitor integration test to assert the create-time rejection.
@ZingerLittleBee ZingerLittleBee merged commit 37bbe19 into main Jun 8, 2026
3 checks passed
@ZingerLittleBee ZingerLittleBee deleted the secuity branch June 8, 2026 14:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant