GitHub Actions Workflow for Multi-Architecture Container Builds #51

@jbearak

Problem Statement

The current GitHub Actions workflow has several inefficiencies and will fail due to storage constraints:

  1. Incomplete file watching - Missing dotfiles dependencies in paths configuration
  2. Inefficient parallel builds - No cache sharing between r-container and full-container jobs
  3. Storage constraints - 58GB+ images exceed GitHub Actions 10GB cache limit
  4. Resource waste - Duplicate work building shared base layers

Requirements

Primary Goals

  • Build both r-container and full-container for linux/amd64 and linux/arm64
  • Maximize cache efficiency between builds (r-container layers should benefit full-container)
  • Handle 58GB+ total storage requirements (25GB + 25GB + 4GB + 4GB + intermediate layers)
  • Maintain current functionality (push to GHCR with proper tagging)

Technical Constraints

  • Must use native GitHub runners for each architecture (to avoid the overhead of QEMU emulation)
  • Must work within GitHub Actions limits and quotas
  • Must be reliable for production use

Proposed Solution

1. Fix File Dependencies

Update paths configuration to include all Dockerfile dependencies:

paths:
  - 'Dockerfile'
  - 'build.sh' 
  - 'R_packages.txt'
  - 'install_r_packages.sh'
  - 'r-shell-config'
  - 'dotfiles/*'
  - '.github/workflows/build-and-push.yml'
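
One caveat on the glob, offered as a sketch rather than a confirmed fix: in GitHub Actions path filters, `*` does not match the `/` character, so `dotfiles/*` covers only files directly inside dotfiles/. If that directory contains subdirectories (an assumption, since the repository layout is not shown in this issue), a recursive pattern would be needed:

```yaml
on:
  push:
    paths:
      # '**' matches across '/' and so includes nested files;
      # 'dotfiles/*' would match only direct children of dotfiles/.
      - 'dotfiles/**'
```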

2. Sequential Builds with Registry Cache

Replace the current parallel matrix strategy with a single job that builds r-container first, then full-container.

Rationale:

  • r-container is a subset of full-container (the two share base layers up through R package installation)
  • Registry cache is not subject to the 10GB GitHub Actions cache limit
  • Sequential execution ensures optimal layer reuse
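
To make the second point concrete, the two cache backends differ only in the `cache-from`/`cache-to` values passed to docker/build-push-action. The step below is a sketch. Whether the current workflow uses the `gha` backend is an assumption, and `cache-r` is the tag name proposed in this issue.

```yaml
- name: Build r-container
  uses: docker/build-push-action@v5
  with:
    context: .
    target: r-container
    # GitHub Actions cache backend: counts against the 10GB cache limit.
    # cache-from: type=gha
    # cache-to: type=gha,mode=max
    # Registry backend: cache layers are stored under an image tag in GHCR instead.
    cache-from: type=registry,ref=ghcr.io/${{ github.repository_owner }}/base-container:cache-r
    cache-to: type=registry,ref=ghcr.io/${{ github.repository_owner }}/base-container:cache-r,mode=max
```

With `mode=max`, intermediate layers are exported as well as the final ones, which is what lets a later build of a different target reuse them.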

3. Implementation Approach

Replace current workflow with:

name: Build and Push Containers

on:
  push:
    branches: [ main ]
    paths:
      - 'Dockerfile'
      - 'build.sh'
      - 'R_packages.txt'
      - 'install_r_packages.sh'
      - 'r-shell-config'
      - 'dotfiles/*'
      - '.github/workflows/build-and-push.yml'
  workflow_dispatch: {}

permissions:
  contents: read
  packages: write

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository_owner }}/base-container

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Log in to GHCR
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Generate metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=ref,event=branch
            type=ref,event=pr
            type=sha,prefix={{branch}}-
            type=raw,value=latest,enable={{is_default_branch}}

      - name: Build and push r-container
        uses: docker/build-push-action@v5
        with:
          context: .
          platforms: linux/amd64,linux/arm64
          target: r-container
          push: true
          tags: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:r-${{ steps.meta.outputs.version }}
          cache-from: type=registry,ref=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:cache-r
          cache-to: type=registry,ref=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:cache-r,mode=max

      - name: Build and push full-container
        uses: docker/build-push-action@v5
        with:
          context: .
          platforms: linux/amd64,linux/arm64
          target: full-container
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: |
            type=registry,ref=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:cache-r
            type=registry,ref=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:cache-full
          cache-to: type=registry,ref=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:cache-full,mode=max

Expected Benefits

  1. Faster builds (estimated ~50%) - full-container reuses all r-container layers
  2. Reliable storage - Registry cache is not bound by the 10GB GitHub Actions cache limit, so it can hold the 58GB+ of layers
  3. Correct triggering - All file dependencies properly watched
  4. Resource efficiency - Single runner, no duplicate work
  5. Better debugging - Sequential execution shows exactly where failures occur

Implementation Notes

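  • Remove the QEMU setup only if each architecture is built on a native runner. As written, the single ubuntu-latest job above builds linux/arm64 on an amd64 host, which would need QEMU emulation (docker/setup-qemu-action) to execute the Dockerfile's RUN steps
  • Use docker/metadata-action@v5 for consistent tagging
  • Registry cache persists until it is deleted (GitHub Actions cache entries are evicted after 7 days without access)
  • No additional secrets needed (GITHUB_TOKEN is provided automatically, and the workflow grants it packages: write)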

Acceptance Criteria

  • Workflow triggers on changes to any Dockerfile dependency
  • r-container builds successfully for both architectures
  • full-container builds successfully reusing r-container cache
  • Images pushed to GHCR with correct tags (latest, r-latest, etc.)
  • Build time reduced compared to current parallel approach
  • No storage-related failures during builds
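
The architecture and tagging criteria could be checked automatically with a final workflow step along these lines. This is a sketch that assumes the workflow-level `REGISTRY` and `IMAGE_NAME` variables from the proposed workflow and checks only the `latest` tag, which the metadata step produces only on the default branch.

```yaml
- name: Verify both architectures were pushed
  run: |
    # Print the manifest list for the tag; each platform is listed in the output.
    manifest="$(docker buildx imagetools inspect "${REGISTRY}/${IMAGE_NAME}:latest")"
    echo "$manifest"
    # Each grep exits non-zero if its platform is missing, which fails the step.
    echo "$manifest" | grep -q 'linux/amd64'
    echo "$manifest" | grep -q 'linux/arm64'
```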

Files to Modify

  • .github/workflows/build-and-push.yml (complete replacement)

This issue addresses all identified problems while providing a robust, scalable solution for the container build pipeline.

Git Workflow (MANDATORY)

  • Create a new branch based off of the amd64 branch
  • Submit a stacked pull request
