Skip to content

feat(dind): implement POST /containers/{id}/attach for docker run#72

Merged
luthermonson merged 1 commit into
mainfrom
fix/dind-container-attach
May 22, 2026
Merged

feat(dind): implement POST /containers/{id}/attach for docker run#72
luthermonson merged 1 commit into
mainfrom
fix/dind-container-attach

Conversation

@luthermonson
Copy link
Copy Markdown
Contributor

Summary

Docker CLI's docker run <image> (non-detached) issues three calls in sequence:

POST /containers/create        → returns container ID
POST /containers/{id}/attach   → HTTP/1.1 Upgrade hijacks the connection
POST /containers/{id}/start    → kicks off the task

We previously returned 501 Not Implemented for any /attach because the route in routeContainer fell through to handleNotImplemented. The CLI surfaced this as:

unable to upgrade to tcp, received 501

so even after the alpine:3.20 ref-qualifier fix landed, the second fault in the same docker run invocation kept ephpm's dind-based jobs broken. This PR implements the missing handler.

Implementation

Modeled on the existing handleExecStartHijack in pkg/dind/exec.go:

  • pkg/dind/attach.go (new):

    • handleContainerAttach — verifies Upgrade+Connection headers, drains the request body, hijacks the conn, sends HTTP/1.1 101 UPGRADED, then waits on entry.started for the task to be running. Once started, tails entry.LogPath to the upgraded conn, multiplexed with stdcopyWriter when Tty=false.
    • waitForStart(reqCtx, entry, deadline) — blocks on entry.started, request-context cancellation, or a 60s deadline. Defensive polling fallback if started is somehow nil.
    • tailLogToWriter(logPath, dst, done) — generic file tailer driven by a chan struct{} so unit tests don't pull in containerd types. The handler bridges from Task.Wait()'s ExitStatus channel via a small goroutine.
  • pkg/dind/containers.go:

    • containerEntry gains one field, started chan struct{}, initialized in handleContainerCreate and closed by handleContainerStart after task.Start() succeeds.
    • routeContainer now dispatches attach to the new handler.

Output framing

When the container was started without a TTY (Tty=false), every chunk goes back with Docker's stdcopy 8-byte header (stream-type[1] + padding[3] + size[4] big-endian, then payload). The Docker CLI demultiplexes those into its own stdout and stderr.

Because cio.LogFile merges container stdout and stderr into a single file, we emit everything on stream 1 (stdout) under non-TTY mode. Perfect-fidelity split would require splitting the log files at task-create time. Flagged in the file comment as a future-improvement.

Tests

pkg/dind/attach_test.go (7 cases):

  • TestHandleContainerAttach_404Unknown — route lookup precedes hijack; unknown container id returns 404.
  • TestHandleContainerAttach_400WithoutUpgrade — plain POST without Upgrade: tcp + Connection: upgrade headers returns 400 instead of dropping into the hijack path.
  • TestWaitForStart_SignalUnblocks — closing started releases the wait promptly.
  • TestWaitForStart_DeadlineExpires — never-closed started hits the deadline.
  • TestWaitForStart_RequestCancelledr.Context() cancellation short-circuits (handles client disconnect after /attach but before /start).
  • TestTailLogToWriter_ForwardsThenExitsOnStatus — pre-seeded content streams, appended bytes stream, and the function exits after the done channel closes.
  • TestStdcopyWriter_FrameLayout — regression assertion on the 8-byte header format the CLI demuxes.

Full pkg/dind suite green (TestPushHandlerEndToEnd, TestCleanup_DindNamespaces, all the existing exec/buildkit/auth tests still pass).

Out of scope (documented in the file comment)

  • Stdin attach (docker run -i): hijacked conn is writable but the bytes go nowhere yet. Requires a containerd IO mode set at task-create time, which dind doesn't currently expose. The vast majority of docker run invocations in CI workloads (docker run alpine:3.20, docker run -d ..., build helpers, etc.) are non-interactive and unblocked by this PR.
  • True stdout/stderr split: see the framing note above.

Test plan

  • All pkg/dind unit tests pass locally with CGO_ENABLED=0 go test -tags containers_image_openpgp ./pkg/dind/....
  • Live verification: trigger an ephpm CI job that does docker run alpine:3.20 sh -c "echo hi" inside dind, confirm the container runs, output streams back, exit code propagates. Pairs with the alpine:3.20 ref-qualifier fix.

Related

  • Follows the fix/dind-qualify-tag-parse PR which fixed the first failure (docker run alpine:3.20invalid port ":3.20"). Both bugs surface in the same docker run invocation but have unrelated root causes. Either alone leaves the other still broken; they're separate PRs for reviewability, both needed to actually unblock the workflow.

Docker CLI's `docker run <image>` (non-detached) does three calls:

  POST /containers/create        → returns container ID
  POST /containers/{id}/attach   → HTTP/1.1 Upgrade hijacks the conn
  POST /containers/{id}/start    → kicks off the task

We previously returned 501 for any /attach because the route fell
through to handleNotImplemented. The CLI surfaced this as:

  unable to upgrade to tcp, received 501

so even after fixing the alpine:3.20 ref-qualifier path, the second
fault in the same `docker run` invocation kept jobs broken.

Implementation, modeled on handleExecStartHijack:

  - handleContainerAttach in new pkg/dind/attach.go. Verifies the
    Upgrade headers, drains the request body, hijacks the conn, sends
    HTTP/1.1 101 UPGRADED, then waits on entry.started for the task to
    be running.
  - After the task is up, tails entry.LogPath to the upgraded conn.
    Output is wrapped in stdcopy framing (8-byte header per chunk)
    when Tty=false so the CLI demultiplexes correctly. Because
    cio.LogFile merges stdout and stderr we emit everything as stream
    type 1 (stdout); perfect-fidelity stdout/stderr split would
    require splitting the log files at task-create time and is
    flagged out of scope in the file comment.
  - waitForStart blocks on entry.started, request-context cancellation,
    or a 60s deadline. Defensive fallback path polls entry.Task in case
    started is somehow nil.
  - tailLogToWriter is a generic file-tailer driven by a chan struct{}
    so the unit tests don't have to depend on containerd types. The
    handler bridges from Task.Wait()'s ExitStatus channel onto a plain
    done signal via a small goroutine.

containerEntry grows one field — `started chan struct{}` — initialized
by handleContainerCreate and closed by handleContainerStart after
task.Start succeeds. Closes unblocks any pending attach handlers.

Stdin-attach (`docker run -i`) is accepted at the protocol level
(hijacked conn is writable) but bytes sent by the client go nowhere
yet — that requires a containerd IO mode set at task-create time,
which dind doesn't currently expose. The most common ephpm/ephemerd
docker-run flows are non-interactive so this unblocks them today;
follow-up tracked in the file comment.

Tests in pkg/dind/attach_test.go:

  - 404 when container id unknown (route lookup precedes hijack).
  - 400 when client didn't send Upgrade+Connection headers.
  - waitForStart unblocks on signal, respects deadline, respects
    request-context cancellation.
  - tailLogToWriter forwards seeded content + appended bytes and
    exits promptly after done is closed.
  - stdcopy frame layout regression: 1-byte type, 3 NUL padding, 4-byte
    BE size, payload.

Full pkg/dind suite stays green (TestPushHandlerEndToEnd and
TestCleanup_DindNamespaces unaffected).
@luthermonson luthermonson merged commit a2f6109 into main May 22, 2026
4 checks passed
@luthermonson luthermonson deleted the fix/dind-container-attach branch May 22, 2026 14:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant