Part of #715
Problem
Our build artifacts are not reproducible. Building the same source twice may produce different checksums due to timestamps embedded in the tarball/wheel metadata. This makes voter verification harder and prevents bit-for-bit comparison of locally rebuilt packages against the RC artifacts.
Solution
Set SOURCE_DATE_EPOCH to the commit timestamp of the release tag before building. Build tools that honor this variable use it in place of the current time, so file timestamps inside the archive become deterministic.
Airflow does this in its release tooling, which allows voters to rebuild from source and binary-compare the result against the SVN artifacts.
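To illustrate why pinning the timestamp matters, here is a minimal standard-library sketch. It is not how hatch or build work internally. The file name, payload, and timestamps are made up for the example, and it only shows that identical content archived with different mtimes hashes differently, while a pinned mtime gives identical bytes.

```python
import hashlib
import io
import tarfile


def tar_digest(payload: bytes, mtime: int) -> str:
    """Return the SHA-256 of an in-memory tar holding one file with the given mtime."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name="pkg/module.py")  # hypothetical file name
        info.size = len(payload)
        info.mtime = mtime
        tar.addfile(info, io.BytesIO(payload))
    return hashlib.sha256(buf.getvalue()).hexdigest()


source = b"print('hello')\n"

# Same content, timestamps one minute apart: the archives differ.
differs = tar_digest(source, 1_700_000_000) != tar_digest(source, 1_700_000_060)

# Same content, same pinned timestamp: the archives are byte-identical.
matches = tar_digest(source, 1_700_000_000) == tar_digest(source, 1_700_000_000)
```

Real sdists are gzip-compressed, and the gzip header carries its own timestamp, so the build backend has to pin that as well. The sketch uses an uncompressed tar to keep the effect isolated.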
Implementation
- In scripts/apache_release.py, before calling hatch build or python -m build:
    import os
    import subprocess
    # Commit time of the release tag, in Unix seconds
    epoch = subprocess.check_output(["git", "log", "-1", "--format=%ct", tag], text=True).strip()
    os.environ["SOURCE_DATE_EPOCH"] = epoch
- Document in scripts/README.md that builds are reproducible
- Add a test that rebuilds from a tagged commit and compares checksums
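The checksum comparison for such a test could look roughly like the sketch below. It assumes the two builds have already been written to separate output directories. The helper names `sha256_of` and `mismatched_artifacts` are hypothetical and do not exist in scripts/apache_release.py.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_artifacts(first_dist: Path, second_dist: Path) -> list[str]:
    """Return names of artifacts that are missing on one side or differ in checksum."""
    first = {p.name: sha256_of(p) for p in first_dist.iterdir() if p.is_file()}
    second = {p.name: sha256_of(p) for p in second_dist.iterdir() if p.is_file()}
    return sorted(
        name
        for name in first.keys() | second.keys()
        if first.get(name) != second.get(name)
    )
```

A test would build the tagged commit twice into two directories with the same SOURCE_DATE_EPOCH and assert that `mismatched_artifacts(dist_a, dist_b)` is empty. Returning the offending file names, rather than a bare boolean, makes a failure easier to diagnose.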
References
breeze release-management prepare-airflow-distributions