Prebuilt wheels and HDF5 plugin archives are available on the GitHub Releases page for Linux (amd64, arm64) and macOS (arm64).
Python wheel (recommended — includes everything):
pip install ebcc-*.whl
HDF5 plugin only (for use with CDO or other HDF5 applications):
unzip ebcc-h5plugin-<platform>.zip
export HDF5_PLUGIN_PATH=$(pwd)
Install from source (editable):
pip install -e .
# or with optional dependencies
pip install -e ".[dev]" # Development dependencies
pip install -e ".[zarr]" # Zarr support (requires Python 3.11+ for zarr 3.0+)
Build the HDF5 filter from source:
git clone --recurse-submodules https://github.com/spcl/EBCC.git
mkdir compression-filter/src/build
cd compression-filter/src/build
cmake -DCMAKE_INSTALL_PREFIX=. ..
make && make install
# the compiled filter is stored in `src/build/lib/libh5z_ebcc.so`
Our compression algorithm is implemented as an HDF5 filter. The filter expects a chunk size equal to a single 2D “frame” and scales to any number of dimensions greater than or equal to 2.
Its base functionality compresses the data using JPEG2000, with a "base" compression ratio provided by the user. The user can also enable compression of the residual to improve accuracy. The residual is the difference between the original frame and the decompressed base frame. It is compressed independently of the base frame and added to the decompressed base frame during decompression. We support three modes of operation:
NONE: there is no residual.
MAX_ERROR: the residual is wavelet encoded and sparsified. The sparsification factor is found through an iterative process that tries out several factors and selects the largest one that keeps the max error below the selected threshold.
RELATIVE_ERROR: same as MAX_ERROR, but using the (data-range) relative error instead, rel_error = (x - ref) / (ref.max - ref.min).
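The error metrics and the factor search described above can be sketched in plain Python. This is only an illustration under stated assumptions: the residual is sparsified here by zeroing small entries (a stand-in for the wavelet-domain sparsification the filter actually performs), and the names `sparsify`, `pick_cutoff` and the candidate list are hypothetical, not part of EBCC.

```python
def max_abs_error(data, ref):
    """Largest absolute pointwise difference between data and ref."""
    return max(abs(x - r) for x, r in zip(data, ref))


def max_relative_error(data, ref):
    """Max absolute error divided by the data range of ref."""
    data_range = max(ref) - min(ref)
    if data_range == 0:
        raise ValueError("reference has zero data range")
    return max_abs_error(data, ref) / data_range


def sparsify(residual, cutoff):
    """Toy sparsification (assumption): zero entries smaller than cutoff."""
    return [r if abs(r) >= cutoff else 0.0 for r in residual]


def pick_cutoff(original, base, candidates, max_error):
    """Return the largest candidate whose reconstruction stays within max_error."""
    residual = [x - b for x, b in zip(original, base)]
    best = None
    for cutoff in sorted(candidates):
        # Decompression adds the (sparsified) residual back onto the base frame.
        recon = [b + r for b, r in zip(base, sparsify(residual, cutoff))]
        if max_abs_error(recon, original) <= max_error:
            best = cutoff
    return best


original = [10.0, 12.0, 15.0, 11.0]
base = [10.5, 12.0, 14.0, 11.25]  # stand-in for the decompressed JPEG2000 frame
cutoff = pick_cutoff(original, base, [0.1, 0.3, 0.6, 1.5], max_error=0.5)
base_rel_error = max_relative_error(base, original)
```

In RELATIVE_ERROR mode the same search would compare `max_relative_error` against the threshold instead of `max_abs_error`.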
The input parameters depend on the chosen mode of operation. Because HDF5 filters support only integer parameters, float and double values must be translated to an integer representation. We provide a Python wrapper, EBCC_Filter, in filter_wrapper.py to simplify the process. An example of how to use the filter with the Python h5py library can be found in test.py.
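The float-to-integer translation can be illustrated with a short sketch. It assumes each float parameter is passed as the bit pattern of its IEEE 754 single-precision value. That assumption matches the integers 1128792064 and 1008981770 in the CDO example, which decode to 200.0 and roughly 0.01. The helper names are hypothetical, and EBCC_Filter in filter_wrapper.py remains the reference for the real encoding.

```python
import struct


def float_to_uint32(value):
    """Reinterpret the float32 bit pattern of value as an unsigned 32-bit int."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def uint32_to_float(bits):
    """Inverse of float_to_uint32: recover the float32 value from its bits."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


base_cr_bits = float_to_uint32(200.0)
error_bits = float_to_uint32(0.01)
print(base_cr_bits, error_bits)
```

Decoding with `uint32_to_float` is a quick way to check what a filter specification string contains before passing it to another tool.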
The latest version of CDO can be downloaded here: https://code.mpimet.mpg.de/projects/cdo/files. It can be installed with:
tar -xvf <CDO-VERSION>.tar.gz
cd <CDO-VERSION>
./configure --enable-netcdf4 --with-netcdf=yes --with-hdf5=yes
make
sudo make install
Make sure the necessary libraries are loaded with export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH. Installation paths and the variables CFLAGS and CPPFLAGS may need to be set for the configuration and the compilation to succeed.
CDO can now be used with the --filter options as output by filter_wrapper.py. For HDF5 to find the filter, HDF5_PLUGIN_PATH needs to be set.
As an example, with the default filter configuration and tiles of 721x1440 (used by ERA5):
HDF5_PLUGIN_PATH=<path/to/filter> cdo -b F32 --filter 308,721,1440,1128792064,3,1008981770 copy temperature.nc compressed.nc
# or
HDF5_PLUGIN_PATH=<path/to/filter> cdo -b F32 --filter $(python filter_wrapper.py --base_cr 30 --height 721 --width 1440 -m 0.5) copy temperature.nc compressed.nc
As an alternative, the setfilter function is also supported. It lets the user specify a filter for every variable in a NetCDF file. Prepare a file myfilter containing the filter specification:
temperature="308,721,1440,1128792064,3,1008981770"
Then run:
HDF5_PLUGIN_PATH=<path/to/filter> cdo -b F32 setfilter,filename=myfilter temperature.nc compressed.nc
CAUTION: Make sure to set the output precision to float32 in CDO using -b F32. Otherwise, undefined behavior will occur (segmentation fault or incorrect results). Also make sure the chunk size of the input NetCDF file is a multiple of the tile size (height, width). If it is not, either change height and width or rechunk the file using nccopy -c ... .
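The chunk-size requirement can be checked before running CDO. The sketch below assumes the last two chunk dimensions correspond to the tile height and width, and that the chunk shape has already been read from the file (for example from the `_ChunkSizes` attribute shown by `ncdump -hs`). The function name is hypothetical.

```python
def chunks_fit_tiles(chunk_shape, height, width):
    """True if the last two chunk dimensions are multiples of the tile size."""
    if len(chunk_shape) < 2:
        return False  # the filter works on 2D frames, so two dims are needed
    chunk_h, chunk_w = chunk_shape[-2], chunk_shape[-1]
    return chunk_h % height == 0 and chunk_w % width == 0


# ERA5-sized tiles as in the example above.
ok = chunks_fit_tiles((1, 721, 1440), height=721, width=1440)
bad = chunks_fit_tiles((1, 361, 720), height=721, width=1440)
```

If the check fails, rechunking the file or choosing a different height and width, as described above, would be the next step.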
EBCC_LOG_LEVEL: valid value int [0, 5], default 3. 0 - TRACE, 1 - DEBUG, 2 - INFO, 3 - WARN, 4 - ERROR, 5 - FATAL.
EBCC_INIT_BASE_ERROR_QUANTILE: valid value float [0, 1), default 1e-6. For max-error modes, the base JP2 layer is optimized so that at least 1 - EBCC_INIT_BASE_ERROR_QUANTILE of values are within the optimization error target before residual compression is considered. Set to 0 to require the base layer itself to satisfy the target everywhere, which can make residual compression unnecessary.
EBCC_DISABLE_PURE_BASE_COMPRESSION_FALLBACK: when set, turn off the final pure JP2 fallback that can replace JP2 + residual if pure JP2 is smaller or residual compression cannot satisfy the target (not recommended).
EBCC_DISABLE_PURE_BASE_COMPRESSION_FALLBACK_CONSISTENCY: when set, skip resetting the pure JP2 fallback search to the original base_cr before enforcing the final max-error target. This can save work but may make fallback behavior less consistent with a fresh pure-base search.
EBCC_DISABLE_MEAN_ADJUSTMENT: when set, do not shift the stored minval and maxval by the measured mean compression error. Disabling this keeps the selected compressed representation unchanged by any final mean-bias correction.
EBCC_ERROR_BOUND_STRICT_MODE: valid value 1 to enable. When enabled, abort compression if the selected pure JP2 or lossless JP2 fallback still exceeds the configured max-error bound. By default, EBCC emits a warning and keeps the selected fallback.
EBCC_ERROR_BOUND_SLACK: valid value float [0, 1), default 0.01. When mean adjustment is enabled, max-error and relative-error modes optimize compression against (1 - EBCC_ERROR_BOUND_SLACK) * error_target, leaving the remaining fraction of the configured bound for the final mean adjustment. The final minval/maxval adjustment is still applied only if the adjusted reconstruction stays within the original configured error bound.
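The slack arithmetic can be made concrete with a small sketch. It reproduces only the formula (1 - EBCC_ERROR_BOUND_SLACK) * error_target and the documented default and range. Two points are assumptions rather than documented behavior: that the full target is used when mean adjustment is disabled, and that any value counts as "set". The function names are hypothetical.

```python
import os


def read_unit_interval(name, default, env=None):
    """Read a float in [0, 1) from the environment, falling back to default."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None:
        return default
    value = float(raw)
    if not 0.0 <= value < 1.0:
        raise ValueError(f"{name} must be in [0, 1), got {value}")
    return value


def optimization_target(error_target, env=None):
    """Error target used during optimization, after reserving the slack."""
    env = os.environ if env is None else env
    if "EBCC_DISABLE_MEAN_ADJUSTMENT" in env:
        # Assumption: without mean adjustment no slack needs to be reserved.
        return error_target
    slack = read_unit_interval("EBCC_ERROR_BOUND_SLACK", 0.01, env)
    return (1.0 - slack) * error_target


default_target = optimization_target(0.5, env={})
custom_target = optimization_target(0.5, env={"EBCC_ERROR_BOUND_SLACK": "0.1"})
```

With the default slack of 0.01, a configured max error of 0.5 is optimized against roughly 0.495, leaving about 0.005 of the bound for the final mean adjustment.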