
AVRO-4272: Modernize share/docker/Dockerfile with BuildKit and reduce image size #3826

Open
iemejia wants to merge 1 commit into apache:main from iemejia:avro-dockerfixes

@iemejia iemejia commented Jun 21, 2026

Member

Summary

Modernize the Docker build image (share/docker/Dockerfile) to leverage BuildKit features, reduce image size, and improve rebuild speed.

Changes

BuildKit cache mounts

  • Add # syntax=docker/dockerfile:1 directive for portability
  • Use --mount=type=cache for the apt, npm, cpanm, and bundler caches; this eliminates repeated apt-get update/apt-get clean cycles, produces smaller layers, and speeds up rebuilds
  • Use --mount=type=bind for Ruby gem resolution (replaces COPY layers)
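
The cache and bind mounts listed above could look roughly like the following. This is a minimal sketch and not the PR's actual Dockerfile: the base image tag, the ruby-bundler package, and the lang/ruby paths are assumptions chosen for illustration.

```dockerfile
# syntax=docker/dockerfile:1
FROM ubuntu:24.04

# Official Ubuntu images ship an apt hook (docker-clean) that deletes downloaded
# .deb files after every install; drop it so the cache mount can retain them.
RUN rm -f /etc/apt/apt.conf.d/docker-clean \
 && echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' \
      > /etc/apt/apt.conf.d/keep-cache

# Cache mounts persist between builds but are not written into the image
# layer, so no trailing "apt-get clean" or "rm -rf /var/lib/apt/lists/*" is
# needed. In this sketch "apt-get update" still runs, but the package lists
# it refreshes live in the cache mount rather than in the layer.
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt/lists,sharing=locked \
    apt-get -qqy update \
 && apt-get -qqy install --no-install-recommends ruby-bundler

# A bind mount exposes files from the build context for the duration of one
# RUN step without a COPY layer. "rw" permits writes (for example an updated
# Gemfile.lock), which are discarded when the step ends.
# Hypothetical path: assumes the build context contains lang/ruby/Gemfile.
RUN --mount=type=bind,source=lang/ruby,target=/tmp/lang/ruby,rw \
    cd /tmp/lang/ruby \
 && bundle install
```

Because neither mount type adds its contents to the image, the resulting layers contain only the installed packages and gems.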

Package trimming

  • Replace libboost-all-dev (~70+ sub-packages) with only the four Boost packages needed by the C++ test suite: libboost-dev, libboost-test-dev, libboost-random-dev, libboost-math-dev
  • Remove redundant packages already provided by build-essential (g++, gcc, make) or by other -dev packages (libsnappy1v5)
  • Remove apt-transport-https (unnecessary on Ubuntu 24.04)
  • Add --no-install-recommends to all apt-get install calls
  • Add php-zip and unzip (previously pulled as recommends, needed by Composer)
  • Remove Rust toolchain (no longer needed)
  • Add tmux to ease interactive work inside the container (e.g. running multiple build/test sessions in parallel)
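
As an illustration of the trimming described above, an install step restricted to the named packages might be sketched as follows. The base image tag and the exact package grouping are assumptions; the real Dockerfile installs many more packages.

```dockerfile
# syntax=docker/dockerfile:1
FROM ubuntu:24.04

# --no-install-recommends stops apt from pulling in optional "Recommends"
# packages, so anything that previously arrived only that way (here php-zip
# and unzip, which Composer needs) has to be listed explicitly.
# build-essential already brings in g++, gcc, and make, so they are not repeated.
RUN apt-get -qqy update \
 && apt-get -qqy install --no-install-recommends \
      build-essential \
      libboost-dev \
      libboost-test-dev \
      libboost-random-dev \
      libboost-math-dev \
      php-zip \
      unzip \
      tmux \
 && rm -rf /var/lib/apt/lists/*
```

This sketch cleans the package lists by hand because it does not use cache mounts; with the cache mounts from this PR that last line would be unnecessary.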

Housekeeping

  • Convert build-only variables (MAVEN_VERSION, APACHE_DIST_URLS, PHP8_VERSION) from ENV to ARG so they are not persisted into the final image environment
  • Remove dead PIP_NO_CACHE_DIR=off (misleading and moot with uv)
  • Fix .NET SDK install to use && instead of ; (fail-fast on errors)
  • Fix PHP extension build to use /tmp/lang/php (consistent naming, proper cleanup)
  • Move Ruby bundle install after .NET/Java layers for better cache efficiency
  • Move libbz2-dev, libzstd-dev, libyaml-dev to main packages section

Python packaging fix (lang/py/pyproject.toml)

  • Fix license-files path to point to avro/LICENSE (actual location)
  • Add [tool.setuptools.packages.find] with include = ["avro*"] to ensure correct package discovery
  • Bump mypy upper bound to <2.2.0 in uv.lock

Requirements

Requires Docker BuildKit (default builder since Docker 23.0).

Copilot AI left a comment

Pull request overview

This PR modernizes the Avro build Docker image to leverage Docker BuildKit features (cache/bind mounts) while trimming installed packages to reduce image size and improve rebuild speed. It also adjusts Python packaging metadata and dependency locking for the Python implementation.

Changes:

  • Updated share/docker/Dockerfile to use BuildKit cache/bind mounts, trim apt packages, and reorder language/tool installs for better caching.
  • Updated Python packaging metadata in lang/py/pyproject.toml and bumped the mypy upper bound in lang/py/uv.lock.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 3 comments.

File | Description
share/docker/Dockerfile | BuildKit cache/bind mounts, package trimming, and install-flow changes for faster rebuilds and smaller layers.
lang/py/pyproject.toml | Python packaging metadata tweaks (license files path, setuptools package discovery).
lang/py/uv.lock | Updates the mypy version constraint upper bound.

Comment thread share/docker/Dockerfile Outdated
Comment thread share/docker/Dockerfile Outdated
Comment thread lang/py/pyproject.toml Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated 6 comments.

Comment thread share/docker/Dockerfile Outdated
Comment thread lang/py/pyproject.toml
Comment thread share/docker/Dockerfile
Comment thread share/docker/Dockerfile
Comment thread share/docker/Dockerfile
Comment thread share/docker/Dockerfile Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated 3 comments.

Comment thread share/docker/Dockerfile
Comment thread share/docker/Dockerfile Outdated
Comment thread share/docker/Dockerfile Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated 1 comment.

Comment thread share/docker/Dockerfile Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.

Comment thread share/docker/Dockerfile Outdated
Comment thread share/docker/Dockerfile
… image size

Use BuildKit cache mounts (--mount=type=cache) for apt, npm, cpanm,
and bundler. Eliminates repeated apt-get update/clean cycles, produces
smaller layers, and speeds up rebuilds. Use bind mounts for Ruby gem
resolution, replacing COPY layers. Add # syntax=docker/dockerfile:1
directive for portability.

Replace libboost-all-dev (~70+ sub-packages) with the four Boost
packages actually needed by the C++ test suite: libboost-dev,
libboost-test-dev, libboost-random-dev, libboost-math-dev.

Remove redundant packages provided by build-essential (g++, gcc, make)
or by other -dev packages (libsnappy1v5). Remove apt-transport-https
(unnecessary on Ubuntu 24.04). Add --no-install-recommends to all
apt-get install calls. Add php-zip and unzip (previously pulled as
recommends, needed by Composer).

Move libbz2-dev and libzstd-dev from PHP section to main packages.
Move libyaml-dev from Ruby section to main packages. Remove Rust
toolchain (no longer needed).

Fix .NET SDK install to use && instead of ; (fail-fast on errors).
Convert build-only variables (MAVEN_VERSION, APACHE_DIST_URLS,
PHP8_VERSION) from ENV to ARG. Remove dead PIP_NO_CACHE_DIR=off
(misleading and moot with uv). Move Ruby bundle install after .NET/Java
layers for better cache efficiency. Fix PHP extension build to use
/tmp/lang/php (consistent naming, proper cleanup).

Requires Docker BuildKit (default builder since Docker 23.0).

Assisted-by: GitHub Copilot:claude-opus-4.6

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.

Comment thread share/docker/Dockerfile
Comment on lines +109 to +112
curl -fsSL -o /tmp/nodesource_setup.sh https://deb.nodesource.com/setup_24.x \
&& bash /tmp/nodesource_setup.sh \
&& rm /tmp/nodesource_setup.sh \
&& apt-get -qqy install --no-install-recommends nodejs \
Comment thread share/docker/Dockerfile
Comment on lines +206 to +208
&& curl -fsSL -o /tmp/uv-install.sh https://astral.sh/uv/0.11.23/install.sh \
&& sh /tmp/uv-install.sh \
&& rm /tmp/uv-install.sh
3 participants