Fix cross-platform builds, PyPI upload, and Beam 2.53+ compatibility #97

Open
czgdp1807 wants to merge 7 commits into tensorflow:master from czgdp1807:macos-build

czgdp1807 commented Jan 28, 2026

Problem

Building and publishing tfx-bsl fails on both Linux and macOS due to multiple issues:

  1. macOS linker error: ThinLTO assumes Apple's libLTO.dylib, causing failures with conda-provided compilers
  2. zlib compilation: Macro conflicts between zlib 1.2.11 and modern macOS SDK headers
  3. Module symbols: Python 2-era extension symbols cause linker failures on Python 3
  4. HDF5 dependency: Missing system library for h5py (transitive dependency) on Python 3.10+ (Linux) and 3.11+ (all platforms)
  5. Test failures: Apache Beam 2.53+ metrics API changes break telemetry tests on Python 3.11+
  6. Missing dependencies: dill not explicitly listed despite being required
  7. C++ compilation: Missing absl/strings/str_cat.h header causes build failure
  8. PyPI upload failure: Incorrect repository URL (pypi.org/legacy/ instead of upload.pypi.org/legacy/) causes 404 errors
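
Item 8 comes down to a host-name mismatch between the two URLs. A minimal sketch of a pre-upload sanity check follows; the helper name `is_pypi_upload_url` is hypothetical and is not part of this PR or of any upload tool.

```python
from urllib.parse import urlparse

# The upload endpoint named in this PR.
PYPI_UPLOAD_URL = "https://upload.pypi.org/legacy/"


def is_pypi_upload_url(url):
    """Return True only if `url` points at the upload host and legacy path."""
    parsed = urlparse(url)
    return (
        parsed.scheme == "https"
        and parsed.netloc == "upload.pypi.org"
        and parsed.path == "/legacy/"
    )


# The URL that was configured before the fix lacks the `upload.` host prefix.
print(is_pypi_upload_url("https://pypi.org/legacy/"))
print(is_pypi_upload_url(PYPI_UPLOAD_URL))
```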

Solution

This PR addresses all build and test failures across platforms:

Changes

.bazelrc

  • Disable ThinLTO on macOS/macOS arm64 to avoid linker errors with conda compilers
  • Define HAVE_UNISTD_H=1 for zlib compatibility with modern macOS headers

.github/workflows/build.yml

WORKSPACE

  • Update zlib to version 1.3.1 (from implicit 1.2.11)
  • Define explicit http_archive for zlib before tensorflow dependency loads it
  • Prevents fdopen macro conflicts with macOS SDK _stdio.h

pyproject.toml

  • Linux: Upgrade from manylinux2014 to manylinux_2_28 for HDF5 >= 1.10.7
  • Linux: Add hdf5-devel installation in before-build step
  • macOS: Add brew install hdf5 in before-build step
  • Fixes h5py build failures for Python 3.10+ on Linux and Python 3.11+ on all platforms

setup.py

  • Add explicit dill>=0.3.1,<1.0.0 dependency (required by Apache Beam but not declared)
  • Pin tensorflow-serving-api to >=2.17.1,<3 (was >=2.13.0,<3)
  • Update all selectors (default and nightly)
  • Ensures consistent protobuf/grpc versions across all build configurations
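
As a rough illustration of the setup.py change, the two pins named above could appear in a requirements list like the one below. The list name `REQUIRED_PACKAGES` and the helper `lower_bound` are hypothetical and not taken from the repository; only the version specifiers come from this PR.

```python
import re

# Only the two specifiers below come from this PR; the list name is hypothetical.
REQUIRED_PACKAGES = [
    "dill>=0.3.1,<1.0.0",
    "tensorflow-serving-api>=2.17.1,<3",
]


def lower_bound(requirement):
    """Return the '>=' bound of a requirement string as a tuple of ints."""
    match = re.search(r">=\s*(\d+(?:\.\d+)*)", requirement)
    if match is None:
        raise ValueError("no lower bound in %r" % requirement)
    return tuple(int(part) for part in match.group(1).split("."))


# The new floor is strictly above the previous one (>=2.13.0).
old_floor = lower_bound("tensorflow-serving-api>=2.13.0,<3")
new_floor = lower_bound(REQUIRED_PACKAGES[1])
assert old_floor < new_floor
```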

tfx_bsl/build_macros.bzl

  • Remove Python 2 module initialization symbols (init_*, init*)
  • Keep only PyInit_<module> for Python 3 compatibility
  • Fixes undefined symbol errors during extension module linking

tfx_bsl/cc/sketches/misragries_sketch.h

  • Add missing #include "absl/strings/str_cat.h" header
  • Fixes compilation error: no member named 'StrCat' in namespace 'absl'

tfx_bsl/telemetry/collection_test.py

  • Update tests for Apache Beam 2.53+ metrics API compatibility
  • Remove with context manager pattern and explicitly call pipeline_result.wait_until_finish()
  • Ensures metrics are committed before assertions run
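
The pattern described above can be sketched as follows. This is not the code from collection_test.py: the function name, the metric namespace, and the counter name are all made up for illustration, and the sketch assumes the default local runner.

```python
def count_with_committed_metrics(values):
    """Run a tiny pipeline and return the committed value of its counter."""
    # Imported inside the function so the sketch loads even where Beam is absent.
    import apache_beam as beam
    from apache_beam.metrics import Metrics, MetricsFilter

    def count_element(element):
        # Hypothetical namespace and counter name, chosen for this example.
        Metrics.counter("example", "elements_seen").inc()
        return element

    # No `with` block: build the pipeline, run it, then wait explicitly.
    pipeline = beam.Pipeline()
    _ = pipeline | beam.Create(list(values)) | beam.Map(count_element)
    pipeline_result = pipeline.run()
    pipeline_result.wait_until_finish()

    # Query metrics only after the pipeline has finished.
    query = pipeline_result.metrics().query(
        MetricsFilter().with_name("elements_seen"))
    return sum(counter.committed for counter in query["counters"])
```

Holding on to `pipeline_result` is the point of the change: assertions on metrics are made against a result object that is known to have finished.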

tfx_bsl/coders/csv_decoder_test.py

  • Update exception handling for Apache Beam 2.53+ compatibility
  • Accept both ValueError and RuntimeError (newer Beam wraps exceptions)
  • Add explicit pipeline_result.wait_until_finish() call
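
Accepting either exception type can be expressed by passing a tuple to `assertRaises` or `assertRaisesRegex`. The sketch below uses plain `unittest` with stand-in functions rather than Beam; `parse_row` and `run_wrapped` are hypothetical and only mimic a runner that re-raises a user error as `RuntimeError`.

```python
import unittest


def parse_row(row):
    """Hypothetical stand-in for a decoding step that rejects bad input."""
    if len(row) != 2:
        raise ValueError("expected 2 columns, got %d" % len(row))
    return row


def run_wrapped(fn, arg):
    """Mimic a runner that re-raises user errors as RuntimeError."""
    try:
        return fn(arg)
    except ValueError as err:
        raise RuntimeError("step failed: %s" % err) from err


class DecoderErrorTest(unittest.TestCase):

    def test_direct_error(self):
        # The unwrapped ValueError is accepted.
        with self.assertRaises((ValueError, RuntimeError)):
            parse_row(["a"])

    def test_wrapped_error(self):
        # The wrapped RuntimeError is accepted too; the message still matches.
        with self.assertRaisesRegex(
                (ValueError, RuntimeError), "expected 2 columns"):
            run_wrapped(parse_row, ["a"])


if __name__ == "__main__":
    unittest.main()
```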

Testing

Local testing: macOS arm64, Python 3.12, Bazel 6.5.0

  • Build completes without compiler or linker errors
  • All previously failing tests now pass
  • Package installs via pip install -e ".[test]"

CI validation: All checks passing on fork

PyPI Upload verification: Tested and working

Impact

  • Enables macOS arm64 builds with conda environments
  • Fixes CI failures on Python 3.10+ (Linux) and 3.11+ (all platforms)
  • Resolves HDF5 dependency issues for h5py transitive dependency
  • Updates tests for Apache Beam 2.53+ compatibility
  • Maintains Python 3.9+ compatibility
  • Drops Python 2 support (already EOL)

Root Cause Analysis

The CI failures emerged because:

  • 6 months ago: older package versions didn't require h5py or use Beam 2.53+
  • Today: same commit resolves to newer package versions that:
    • Pull in h5py as transitive dependency (requires HDF5 C library)
    • Use Apache Beam 2.53+ with changed metrics API behavior

This is why Python 3.9 still passes (uses older dependency ranges) while 3.10+ fails.
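
One way to confirm this kind of drift is to print the resolved versions in each CI environment and compare them across Python versions. A minimal sketch using only the standard library; the helper name `installed_versions` and the choice of packages to inspect are assumptions made for this example.

```python
import sys
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    print("Python %d.%d" % sys.version_info[:2])
    for name, version in installed_versions(
            ["apache-beam", "h5py", "dill"]).items():
        print(name, version or "not installed")
```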

This commit addresses several build failures specific to macOS with
Xcode Command Line Tools clang 17:

- Disable ThinLTO on macOS to avoid libLTO.dylib mismatch errors
  between Bazel and conda-provided compilers
- Pin tensorflow-serving-api to version 2.17.1
- Update zlib to 1.3.1 and define HAVE_UNISTD_H to prevent fdopen
  macro conflicts with macOS SDK headers
- Remove deprecated Python 2 module initialization symbols (init_*),
  keeping only PyInit_* for Python 3 compatibility

These changes enable successful builds on macOS arm64 systems using
conda environments without requiring Apple's native toolchain.

Tested on: macOS arm64, Python 3.12, Bazel 6.5.0

czgdp1807 changed the title from "Fix macOS arm64 build compatibility with Xcode compiler toolchain clang 17" to "Fix cross-platform build issues and update tests for Beam 2.53+" on Feb 2, 2026
czgdp1807 changed the title from "Fix cross-platform build issues and update tests for Beam 2.53+" to "Fix cross-platform builds, PyPI upload, and Beam 2.53+ compatibility" on Feb 3, 2026

czgdp1807 (Author) commented:

@vkarampudi @aktech This is ready to go. Please feel free to merge or let me know any changes needed in this PR.

czgdp1807 (Author) commented:

ModuleNotFoundError: No module named 'pkg_resources'

A minor error, I guess. I will fix it once I am free from tfx work.
