feat(dind): implement POST /containers/{id}/attach for docker run#72
Merged
Conversation
Docker CLI's `docker run <image>` (non-detached) does three calls:
POST /containers/create → returns container ID
POST /containers/{id}/attach → HTTP/1.1 Upgrade hijacks the conn
POST /containers/{id}/start → kicks off the task
We previously returned 501 for any /attach because the route fell
through to handleNotImplemented. The CLI surfaced this as:
unable to upgrade to tcp, received 501
so even after fixing the alpine:3.20 ref-qualifier path, the second
fault in the same `docker run` invocation kept jobs broken.
Implementation, modeled on handleExecStartHijack:
- handleContainerAttach in new pkg/dind/attach.go. Verifies the
Upgrade headers, drains the request body, hijacks the conn, sends
HTTP/1.1 101 UPGRADED, then waits on entry.started for the task to
be running.
- After the task is up, tails entry.LogPath to the upgraded conn.
Output is wrapped in stdcopy framing (8-byte header per chunk)
when Tty=false so the CLI demultiplexes correctly. Because
cio.LogFile merges stdout and stderr we emit everything as stream
type 1 (stdout); perfect-fidelity stdout/stderr split would
require splitting the log files at task-create time and is
flagged out of scope in the file comment.
- waitForStart blocks on entry.started, request-context cancellation,
or a 60s deadline. Defensive fallback path polls entry.Task in case
started is somehow nil.
- tailLogToWriter is a generic file-tailer driven by a chan struct{}
so the unit tests don't have to depend on containerd types. The
handler bridges from Task.Wait()'s ExitStatus channel onto a plain
done signal via a small goroutine.
containerEntry grows one field — `started chan struct{}` — initialized
by handleContainerCreate and closed by handleContainerStart after
task.Start succeeds. Closes unblocks any pending attach handlers.
Stdin-attach (`docker run -i`) is accepted at the protocol level
(hijacked conn is writable) but bytes sent by the client go nowhere
yet — that requires a containerd IO mode set at task-create time,
which dind doesn't currently expose. The most common ephpm/ephemerd
docker-run flows are non-interactive so this unblocks them today;
follow-up tracked in the file comment.
Tests in pkg/dind/attach_test.go:
- 404 when container id unknown (route lookup precedes hijack).
- 400 when client didn't send Upgrade+Connection headers.
- waitForStart unblocks on signal, respects deadline, respects
request-context cancellation.
- tailLogToWriter forwards seeded content + appended bytes and
exits promptly after done is closed.
- stdcopy frame layout regression: 1-byte type, 3 NUL padding, 4-byte
BE size, payload.
Full pkg/dind suite stays green (TestPushHandlerEndToEnd and
TestCleanup_DindNamespaces unaffected).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Docker CLI's
docker run <image>(non-detached) issues three calls in sequence:We previously returned 501 Not Implemented for any
/attachbecause the route inrouteContainerfell through tohandleNotImplemented. The CLI surfaced this as:so even after the
alpine:3.20ref-qualifier fix landed, the second fault in the samedocker runinvocation kept ephpm's dind-based jobs broken. This PR implements the missing handler.Implementation
Modeled on the existing
handleExecStartHijackinpkg/dind/exec.go:pkg/dind/attach.go(new):handleContainerAttach— verifies Upgrade+Connection headers, drains the request body, hijacks the conn, sendsHTTP/1.1 101 UPGRADED, then waits onentry.startedfor the task to be running. Once started, tailsentry.LogPathto the upgraded conn, multiplexed withstdcopyWriterwhenTty=false.waitForStart(reqCtx, entry, deadline)— blocks onentry.started, request-context cancellation, or a 60s deadline. Defensive polling fallback ifstartedis somehow nil.tailLogToWriter(logPath, dst, done)— generic file tailer driven by achan struct{}so unit tests don't pull in containerd types. The handler bridges fromTask.Wait()'sExitStatuschannel via a small goroutine.pkg/dind/containers.go:containerEntrygains one field,started chan struct{}, initialized inhandleContainerCreateand closed byhandleContainerStartaftertask.Start()succeeds.routeContainernow dispatchesattachto the new handler.Output framing
When the container was started without a TTY (
Tty=false), every chunk goes back with Docker's stdcopy 8-byte header (stream-type[1] + padding[3] + size[4]big-endian, then payload). The Docker CLI demultiplexes those into its own stdout and stderr.Because
cio.LogFilemerges container stdout and stderr into a single file, we emit everything on stream 1 (stdout) under non-TTY mode. Perfect-fidelity split would require splitting the log files at task-create time. Flagged in the file comment as a future-improvement.Tests
pkg/dind/attach_test.go(7 cases):TestHandleContainerAttach_404Unknown— route lookup precedes hijack; unknown container id returns 404.TestHandleContainerAttach_400WithoutUpgrade— plain POST withoutUpgrade: tcp+Connection: upgradeheaders returns 400 instead of dropping into the hijack path.TestWaitForStart_SignalUnblocks— closingstartedreleases the wait promptly.TestWaitForStart_DeadlineExpires— never-closedstartedhits the deadline.TestWaitForStart_RequestCancelled—r.Context()cancellation short-circuits (handles client disconnect after/attachbut before/start).TestTailLogToWriter_ForwardsThenExitsOnStatus— pre-seeded content streams, appended bytes stream, and the function exits after the done channel closes.TestStdcopyWriter_FrameLayout— regression assertion on the 8-byte header format the CLI demuxes.Full
pkg/dindsuite green (TestPushHandlerEndToEnd,TestCleanup_DindNamespaces, all the existing exec/buildkit/auth tests still pass).Out of scope (documented in the file comment)
docker run -i): hijacked conn is writable but the bytes go nowhere yet. Requires a containerd IO mode set at task-create time, which dind doesn't currently expose. The vast majority ofdocker runinvocations in CI workloads (docker run alpine:3.20,docker run -d ..., build helpers, etc.) are non-interactive and unblocked by this PR.Test plan
pkg/dindunit tests pass locally withCGO_ENABLED=0 go test -tags containers_image_openpgp ./pkg/dind/....docker run alpine:3.20 sh -c "echo hi"inside dind, confirm the container runs, output streams back, exit code propagates. Pairs with thealpine:3.20ref-qualifier fix.Related
fix/dind-qualify-tag-parsePR which fixed the first failure (docker run alpine:3.20→invalid port ":3.20"). Both bugs surface in the samedocker runinvocation but have unrelated root causes. Either alone leaves the other still broken; they're separate PRs for reviewability, both needed to actually unblock the workflow.