fix(dind): translate sibling bind sources to runner snapshot upperdir #82
Merged
The dind shim used to silently drop any `-v` source that didn't os.Stat on the daemon's filesystem. GHA `container:` workflows ask for sources inside the runner container's mount namespace (/home/runner/_work/_temp/<uuid>.sh), which the dind daemon can't see, so every bind was dropped, the sibling started without the script directory, and `docker exec sh -e /__w/_temp/<uuid>.sh` failed with "cannot open".

Resolve each requested source against (1) the non-rootfs bind table ephemerd installed into the runner (/var/run/docker.sock and friends), then (2) the runner snapshot's overlayfs upperdir (rw: runner-written workspace files), then (3) the lowerdirs (ro: image layers, shared across jobs so writes would corrupt the cache). Anything that doesn't match returns 400 with a clear "bind mount X -> Y rejected" message instead of being quietly dropped.

Linux-only for v1; Windows-native jobs use a different snapshotter and mount model and are deferred. Architecture doc at docs/arch/dind-bind-translation.md.

Closes the GHA `container:` regression for jobs running on ephemerd self-hosted runners.
Prepare(ctx, key, "") with an empty parent returns a plain bind mount, not an overlay — so the e2e never saw an upperdir= option to extract. Stage an empty committed parent first, then Prepare on top, so the active snapshot is a real overlayfs mount with upperdir/lowerdir layout the translator can walk. Also fix the lease-delete callback to set the runtime namespace before calling the LeasesService, which was failing on cleanup with "namespace is required".
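The difference between the two mount shapes described above can be sketched as follows. This is a minimal illustration, not the PR's code: `parseOverlayOptions` and `overlayDirs` are hypothetical names, and the option strings in `main` are made-up paths in the style an overlayfs snapshotter returns.

```go
package main

import (
	"fmt"
	"strings"
)

// overlayDirs holds the directories extracted from an overlayfs mount's options.
// Hypothetical type, for illustration only.
type overlayDirs struct {
	Upper  string   // writable layer
	Lowers []string // read-only layers, in the order listed in lowerdir=
}

// parseOverlayOptions scans mount options such as "upperdir=/a" and
// "lowerdir=/b:/c". It reports ok=false when no upperdir= option is
// present, which is what a plain bind mount looks like.
func parseOverlayOptions(opts []string) (dirs overlayDirs, ok bool) {
	for _, o := range opts {
		switch {
		case strings.HasPrefix(o, "upperdir="):
			dirs.Upper = strings.TrimPrefix(o, "upperdir=")
			ok = true
		case strings.HasPrefix(o, "lowerdir="):
			dirs.Lowers = strings.Split(strings.TrimPrefix(o, "lowerdir="), ":")
		}
	}
	return dirs, ok
}

func main() {
	// Shape returned when the snapshot has no parent: a bind mount.
	bind := []string{"rw", "rbind"}
	// Shape returned when the snapshot sits on a committed parent.
	overlay := []string{"workdir=/snap/2/work", "upperdir=/snap/2/fs", "lowerdir=/snap/1/fs"}

	_, ok := parseOverlayOptions(bind)
	fmt.Println("bind has upperdir:", ok)

	dirs, ok := parseOverlayOptions(overlay)
	fmt.Println("overlay has upperdir:", ok, dirs.Upper, dirs.Lowers)
}
```

With only the bind-shaped options there is nothing for a translator to extract, which is why the e2e needs a committed parent underneath the active snapshot.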
Problem
GitHub Actions workflows that use the `container:` directive fail immediately on ephemerd self-hosted runners.

The dind shim accepted every `-v` in the API request but silently dropped any bind whose source didn't `os.Stat` on the dind daemon's filesystem. Because the sources arrive from inside the runner container's mount namespace, the dind daemon (running outside that namespace) saw none of them. Every bind was dropped. The sibling container started but its `_temp` mountpoint was empty, so the step's `docker exec sh -e /__w/_temp/<uuid>.sh` failed with "cannot open".

This broke every workflow using `container:`: ephpm, Anthropic-style workflows, anything wanting a reproducible toolchain image.
Fix
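A container-to-container bind translation layer. The runtime registers the runner snapshot key plus the non-rootfs bind table (e.g. `/var/run/docker.sock` → the per-job dind socket file) with the dind server right after `NewContainer` succeeds. When a sibling-create request arrives, each `-v` source resolves against:

1. The non-rootfs bind table ephemerd installed into the runner.
2. The runner snapshot's overlayfs upperdir (runner-written workspace files), mounted `rw`.
3. The snapshot's lowerdirs, mounted `ro` (image layers are shared across jobs; a rw mount on top would corrupt the cache).

Anything that doesn't match returns 400 with `bind mount X -> Y rejected: source not visible to ephemerd dind`. Loud failure replaces the previous silent drop.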
Security envelope
Siblings can only see what the runner could already see. Bind table
entries are paths ephemerd itself installed into the runner; snapshot
upperdir/lowerdir entries are inside the runner's rootfs. There is no
code path that resolves attacker-supplied sources against the real host
filesystem — the silent-drop bug accidentally provided this property
and the loud-fail fix preserves it explicitly. See the arch doc.
Lifecycle
`pkg/runtime.Destroy` calls `env.Dind.Stop()` (which kills every sibling and drops the dind namespace) before `container.Delete(WithSnapshotCleanup)` removes the runner snapshot. Siblings are gone before the upperdir disappears, so there are no stale mounts in normal teardown.
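The ordering guarantee can be illustrated with a small sketch. The interfaces, fakes, and `destroy` function here are hypothetical stand-ins rather than the real `pkg/runtime` types, and aborting when the stop step fails is an assumption of this sketch, not something the PR states.

```go
package main

import "fmt"

// stopper and deleter are hypothetical stand-ins for the dind server and
// the runner container handle.
type stopper interface{ Stop() error }
type deleter interface{ Delete() error }

// destroy stops dind (and with it every sibling) before the runner
// container and its snapshot are removed.
func destroy(dind stopper, runner deleter) error {
	if err := dind.Stop(); err != nil {
		return fmt.Errorf("stop dind: %w", err)
	}
	if err := runner.Delete(); err != nil {
		return fmt.Errorf("delete runner container: %w", err)
	}
	return nil
}

// recorder captures the order in which the fakes are called.
type recorder struct{ calls []string }

type fakeDind struct{ r *recorder }

func (f fakeDind) Stop() error {
	f.r.calls = append(f.r.calls, "dind.Stop")
	return nil
}

type fakeRunner struct{ r *recorder }

func (f fakeRunner) Delete() error {
	f.r.calls = append(f.r.calls, "container.Delete")
	return nil
}

func main() {
	r := &recorder{}
	if err := destroy(fakeDind{r}, fakeRunner{r}); err != nil {
		fmt.Println("teardown failed:", err)
		return
	}
	fmt.Println(r.calls)
}
```

Since siblings mount paths inside the runner's upperdir, deleting the snapshot first would leave them pointing at removed directories; stopping dind first avoids that window.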
Scope
- Linux-only for v1. A `goruntime.GOOS != "windows"` guard skips registration only on Windows-native runner code paths. Linux-on-Windows jobs run inside a Hyper-V Linux VM where the in-VM ephemerd process is Linux and takes the registration branch normally.
- Windows-native `container:` is deferred: different snapshotter (`windowsfilter`), different bind semantics.
- [...] scenarios; rejected because ephemerd's teardown order makes that impossible.
Tests
`pkg/dind/bindtranslate_test.go`:
- `translateBindSource`.
- `buildBindMounts`, including the full 8-bind set from a real ephpm failure log: asserts docker.sock translation, `_temp` lands in upperdir rw, `externals` lands in lowerdir ro.

`pkg/dind/bindtranslate_e2e_test.go`:
- `overlayfs` snapshotter via shared embedded containerd.
- Write `_temp/marker.sh` in the actual upperdir, register the snapshot with dind, translate, then `os.ReadFile` the marker through the translated source path. Proves that what we hand containerd points at the right bytes on disk.
Test plan
- Run a workflow that uses `container:` and observe `actions/checkout` + step scripts run inside the sibling container successfully.
Docs
`docs/arch/dind-bind-translation.md`: problem, two-container model, resolution policy, security envelope, lifecycle, wiring, Windows reasoning, deferred follow-ups.