Commit b1a69b7

Sbussiso and claude committed
ci: switch back to depot.dev builder (Fly remote builder regression)
Fly's standard remote builder started returning 'unauthorized' on the WireGuard heartbeat at ~06:56 UTC on 2026-05-04 for valid deploy tokens. Pattern:

    WARN Failed to start remote builder heartbeat: unauthorized
    Error: failed to fetch an image or build from source: unauthorized (Request ID 01KQTE4AHWKB2PAAS8A372EKNP-iad) (Trace ID e238a0166452cd04726a196f0d785b06)

Reproduced after rotating FLY_API_TOKEN to a brand-new token (also from `fly tokens create deploy`). Both old + new tokens work fine for HTTPS API calls (validate fly.toml, list machines, etc.) but get rejected by the WireGuard auth backend. flyctl 0.4.45's wireguardless fallback path then bugs out on a malformed heartbeat URL.

Either Fly tightened deploy-token scopes or there's an active platform incident; either way, our CI was wedged with no token rotation able to recover.

Switching --depot=false → --depot=true routes the build through depot.dev's own builder infrastructure, bypassing the Fly remote builder path entirely.

Why depot was off before: 2026-04-28 had two consecutive 5-minute timeouts that wedged CI ~10 min before failing. If those return, the next escalation is local BuildKit on the GitHub runner with docker/build-push-action, which bypasses both Fly's builder AND depot's but adds infrastructure to maintain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 71fcd21 commit b1a69b7
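The "local BuildKit on the runner" escalation named in the commit message could look roughly like the sketch below. It is an illustration only, not what this commit does: it calls `docker buildx build` directly instead of the `docker/build-push-action` wrapper, the tag `example-tag` is hypothetical, and it assumes that `flyctl auth docker` accepts the same deploy token the workflow already uses. `DRY_RUN` defaults to `1`, so the commands are printed rather than executed.

```shell
#!/usr/bin/env bash
# Sketch: build on the runner with BuildKit, push to Fly's registry, deploy the tag.
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# build_and_push APP TAG
#   APP: Fly app name (the workflow's image path uses opensentry-command)
#   TAG: image tag to push (hypothetical naming; pick your own scheme)
build_and_push() {
  local app="$1" tag="$2"
  local image="registry.fly.io/${app}:${tag}"

  # Log the local Docker client in to registry.fly.io
  # (assumption: the CI deploy token is sufficient for this).
  run flyctl auth docker

  # Build locally with BuildKit and push, bypassing both remote builders.
  run docker buildx build --push --tag "$image" .

  # Deploy the already-pushed image instead of building again.
  run flyctl deploy --image "$image"
}

build_and_push opensentry-command example-tag
```

Setting `DRY_RUN=0` would run the three commands for real; that requires Docker with buildx and flyctl on the runner, which is the extra infrastructure the commit message warns about.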

1 file changed

Lines changed: 18 additions & 5 deletions

.github/workflows/deploy.yml

@@ -94,10 +94,20 @@ jobs:
       # create a new machine. ~30-60s of downtime per deploy (same as
       # `strategy = "immediate"` on a good day) but reliably works.
       #
-      # Note: `--depot=false` is intentional — depot.dev timed out
-      # twice in a row on 2026-04-28 (5 min each), wedging CI for
-      # ~10 min before failing. Standard remote builder is slower
-      # (~3 min cached vs ~30s depot) but reliable.
+      # Builder choice has flipped twice now:
+      #   - 2026-04-28: tried depot.dev, timed out 5 min × 2 in a row
+      #     (~10 min wasted per deploy). Switched to --depot=false
+      #     (Fly's standard remote builder). ~100 deploys worked.
+      #   - 2026-05-04: Fly's standard remote builder started returning
+      #     `unauthorized` on the WireGuard heartbeat for valid deploy
+      #     tokens (Request ID 01KQTE4AHWKB2PAAS8A372EKNP-iad).
+      #     Swapped tokens — same failure. Server-side scope change
+      #     or platform incident; either way, our CI was wedged.
+      #     Switched back to depot.dev which uses its own builder
+      #     infrastructure, bypassing the broken Fly path.
+      # If depot.dev's timeouts return, the next move is probably
+      # Buildkit-on-runner via `docker/build-push-action` and pushing
+      # the resulting tag to registry.fly.io ourselves.
       - name: Build + push image to Fly registry
         id: build
         run: |
@@ -107,7 +117,10 @@ jobs:
           # printed on a line like:
           #   image: registry.fly.io/opensentry-command:deployment-XXXX
           # Capture it so the next step can target it explicitly.
-          OUT=$(flyctl deploy --remote-only --depot=false --build-only --push 2>&1 | tee /dev/stderr)
+          # --depot=true: route the build through depot.dev rather
+          # than Fly's standard remote builder. See the long comment
+          # above for why we're back on depot.
+          OUT=$(flyctl deploy --remote-only --depot=true --build-only --push 2>&1 | tee /dev/stderr)
           IMAGE=$(echo "$OUT" | grep -oP 'image:\s+\K[^\s]+' | tail -1)
           if [ -z "$IMAGE" ]; then
             echo "::error::Could not find image tag in flyctl output"
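The unchanged context lines of the second hunk depend on pulling the image reference out of flyctl's output, so the extraction has to keep working whichever builder is selected. A minimal sketch of that step in isolation follows, using the same `grep -oP` pattern as the workflow (this needs GNU grep with PCRE support). The function name `extract_image` and the sample text are hypothetical; the sample is not a real flyctl log.

```shell
#!/usr/bin/env bash
# extract_image: read build output on stdin and print the last "image: <ref>" value.
# Returns 1, with a GitHub Actions error annotation on stderr, if no such line exists.
extract_image() {
  local image
  # \K drops the "image:" prefix from the match; tail -1 keeps the last match.
  # "|| true" keeps this safe under "set -eo pipefail" when grep finds nothing.
  image=$(grep -oP 'image:\s+\K[^\s]+' | tail -1) || true
  if [ -z "$image" ]; then
    echo "::error::Could not find image tag in flyctl output" >&2
    return 1
  fi
  printf '%s\n' "$image"
}

# Hypothetical sample output, made up for illustration.
SAMPLE_OUT='==> Building image
image: registry.fly.io/opensentry-command:deployment-AAAA
image: registry.fly.io/opensentry-command:deployment-BBBB
image size: 120 MB'

IMAGE=$(printf '%s\n' "$SAMPLE_OUT" | extract_image)
echo "$IMAGE"   # prints registry.fly.io/opensentry-command:deployment-BBBB
```

The `image size:` line is skipped because the pattern requires the literal `image:` followed by whitespace, and `tail -1` selects the last of the two matching lines.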
