Implement socket-activated zero-downtime deploy switchover #67
Merged
Member
🤔 Not sure this is entirely correct or solves the problem. I think it's partly because we're wrapping the Go command with [...] Assuming we separate them, I'm not even sure the readiness checks in Ansible or the Caddy retries are needed 🤔
Member
Though the readiness checks aren't bad anyway, just to be safe.
Systemd socket activation keeps the listening socket open across service restarts so connections queue at the kernel instead of getting 503s from Caddy. The Go server detects LISTEN_FDS and uses the inherited fd, falling back to normal listen for local dev. Caddy retry window bumped as a safety net.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
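The detection and fallback described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the PR's actual `systemdListener()`: the names `socketActivated` and `listener` are hypothetical, it assumes systemd passes exactly one socket (fd 3, the first descriptor under the socket activation protocol), and the fallback uses `net.Listen` where the PR describes `ListenAndServe`.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"strconv"
)

// listenFDsStart is the first descriptor systemd hands to an activated
// service (SD_LISTEN_FDS_START in the sd_listen_fds protocol).
const listenFDsStart = 3

// socketActivated reports whether the environment describes at least one
// socket passed by systemd to this exact process. LISTEN_PID must match
// our own pid so that a child process does not mistake inherited
// variables for its own sockets.
// (Hypothetical helper; getenv and pid are parameters to keep it testable.)
func socketActivated(getenv func(string) string, pid int) bool {
	p, err := strconv.Atoi(getenv("LISTEN_PID"))
	if err != nil || p != pid {
		return false
	}
	n, err := strconv.Atoi(getenv("LISTEN_FDS"))
	return err == nil && n >= 1
}

// listener returns the inherited systemd socket when one was passed, and
// otherwise binds addr itself (the local dev path). The bool result says
// which path was taken, so the caller can log it.
func listener(addr string) (net.Listener, bool, error) {
	if socketActivated(os.Getenv, os.Getpid()) {
		f := os.NewFile(uintptr(listenFDsStart), "systemd-socket")
		ln, err := net.FileListener(f)
		// FileListener works on a duplicate of the descriptor, so the
		// original file can be closed either way.
		f.Close()
		if err != nil {
			return nil, false, fmt.Errorf("inherited socket: %w", err)
		}
		return ln, true, nil
	}
	ln, err := net.Listen("tcp", addr)
	return ln, false, err
}
```

A server would then pass the returned listener to `http.Serve` (or `(*http.Server).Serve`) instead of calling `ListenAndServe` directly. Because the socket unit owns the listening socket, connections arriving while the service restarts wait in the kernel's accept queue until the new process starts accepting.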
e8fc77d to a62be24
swalkinshaw approved these changes Apr 4, 2026
Summary
- `LISTEN_FDS` detection with fallback to normal listen for local dev
- Caddy retry tuning (`lb_try_duration` 5s→10s, `lb_try_interval` 250ms→100ms) as safety net

Why
We were seeing brief 502/503 responses during deploy because restarting the service drops the listening socket. Socket activation (`wppackages.socket`) keeps the socket open across service restarts, so incoming connections queue at the kernel instead of failing.

Builds on #95, which separated Litestream into its own service, unblocking socket activation (the old `litestream -exec` wrapper wouldn't pass through the socket fd).

Changes
- `internal/http/server.go`: `systemdListener()` consumes the fd passed by systemd via `LISTEN_FDS`/`LISTEN_PID`; falls back to `ListenAndServe` when not socket-activated (local dev)
- `templates/wppackages.socket.j2`: new systemd socket unit listening on `{{ go_listen_addr }}`
- `templates/wppackages.service.j2`: adds `Requires=wppackages.socket`
- `tasks/main.yml`: deploys and enables the socket unit before the service
- `Caddyfile.j2`: retry tuning as additional safety net

Test plan
- `provision` and verify `wppackages.socket` is active (`systemctl status wppackages.socket`)
- `wppackages.service` starts via socket activation (`journalctl -u wppackages` shows "using systemd socket activation")
- `deploy` and monitor for 502/503 elimination during switchover
- [...] (`make dev`)

🤖 Generated with Claude Code
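For the "monitor for 502/503 elimination during switchover" step, one possible approach is a small probe that polls the site while a deploy runs and tallies the responses. The sketch below is an assumption about how that check could be done and is not part of this PR; the names `result`, `record` and `probe` are hypothetical.

```go
package main

import (
	"net/http"
	"time"
)

// result tallies probe outcomes over a deploy window.
type result struct {
	OK            int // 2xx and 3xx responses
	GatewayErrors int // 502 and 503, the responses this PR aims to eliminate
	Other         int // any other status code
	Failed        int // transport errors (refused, reset, timeout)
}

// record classifies a single HTTP status code into the tally.
func (r *result) record(status int) {
	switch {
	case status == http.StatusBadGateway || status == http.StatusServiceUnavailable:
		r.GatewayErrors++
	case status >= 200 && status < 400:
		r.OK++
	default:
		r.Other++
	}
}

// probe issues n GET requests to url, sleeping interval after each one,
// and returns the tally. Requests are sequential, so a request that waits
// out a restart simply delays the next one.
func probe(client *http.Client, url string, n int, interval time.Duration) result {
	var r result
	for i := 0; i < n; i++ {
		resp, err := client.Get(url)
		if err != nil {
			r.Failed++
		} else {
			resp.Body.Close()
			r.record(resp.StatusCode)
		}
		time.Sleep(interval)
	}
	return r
}
```

Running something like `probe(&http.Client{Timeout: 15 * time.Second}, "https://example.test/", 200, 50*time.Millisecond)` while a deploy is in progress (the URL is a placeholder) should report zero `GatewayErrors` if the switchover behaves as intended. The client timeout is chosen to exceed the 10s `lb_try_duration`, so a request that Caddy is still retrying is not counted as a client-side failure.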