Release v1.1.0: Pyproject migration, Pixi support, and CMIP7 CVs #231
Draft
- Add missing imports to config.py doctest examples
- Convert file-writing example to code-block to avoid side effects
- Add proper imports (xarray, numpy) to bounds.py doctest examples
- Add Rule object creation in __init__.py doctest example for add_vertical_bounds
- Change print() assertions to direct boolean checks for cleaner output

- Update config.py expected xarray_engine from netcdf4 to h5netcdf (matches Dockerfile env)
- Add +ELLIPSIS directive to bounds functions to ignore INFO log output
- Keeps tests strict on actual functionality while allowing log format variations
…n pipelines
- Replace matrix-based jobs with individual named jobs per Python version
- Each version now flows independently: build-X-Y → meta-X-Y → [unit, integration, doctest]-X-Y
- Python 3.9 can complete entire pipeline while 3.12 is still building
- Reduces pipeline latency and improves parallelization
- Total jobs: 4 builds + 16 tests (4 versions × 4 test types)
- Add CMIP7_DReq_Software and cmip6-cmor-tables to flake8 exclusions
- Add both submodules to isort skip list
- Update black exclude pattern to cover all three submodules
- Prevents linting failures from third-party code in git submodules

- Use ellipsis wildcards in expected output lines instead of bare '...'
- Match actual logging output structure with '...INFO → message...'
- Avoids doctest ambiguity where '...' is interpreted as continuation prompt
- Properly validates that bounds are added while allowing variable formatting

- Set PYTHONLOGLEVEL=CRITICAL for all doctest jobs in CI
- Prevents logging output from interfering with doctest expected output
- Cleaner solution than modifying doctest examples or pytest config
- Applies to all four Python versions (3.9, 3.10, 3.11, 3.12)

- Add Docker login step to authenticate with ghcr.io
- Push images with two tags per Python version:
  - ghcr.io/esm-tools/pycmor-testground:py3.X-<commit-sha>
  - ghcr.io/esm-tools/pycmor-testground:py3.X-<branch-name>
- Upload images as workflow artifacts for same-run access
- Enables reproducible test environments via container registry
- Infrastructure as Code: Dockerfile.test defines test infrastructure
Document the Infrastructure as Code approach for test environments:
- Container image publishing to GitHub Container Registry
- Tagging scheme for reproducibility (commit SHA, branch, semver)
- CI/CD workflow for building and distributing testgrounds
- Local usage examples for developers
- Future improvements (conditional publishing, cleanup policies, multi-arch)
- Infrastructure as Code principles and traceability
- Troubleshooting guide for common issues

The testground system treats test infrastructure as code, with Dockerfile.test as the declarative specification and container images as infrastructure artifacts.
- Use substring(github.sha, 0, 7) for 7-character short SHA
- Use github.head_ref || github.ref_name to get actual branch name in both PR and push contexts (avoids '231/merge' format)

This fixes the invalid tag error caused by github.ref_name returning '231/merge' for pull requests instead of the source branch name.
Remove tar-based artifact workflow in favor of direct GHCR pulls.

Changes:
- Remove load: true and local unprefixed tags from all build jobs
- Remove tar export, cache, and artifact upload steps
- Update all test jobs to pull directly from ghcr.io
- Add GHCR login to all test jobs

Benefits:
- Fixes Docker Hub authentication error (no unprefixed tags)
- Simplifies workflow (-60 lines)
- Better performance (GHCR layer caching vs tar artifacts)
- Perfect CI/local parity: same images available locally
Bumps [pypa/gh-action-pypi-publish](https://github.com/pypa/gh-action-pypi-publish) from 1.9.0 to 1.13.0.
- [Release notes](https://github.com/pypa/gh-action-pypi-publish/releases)
- [Commits](pypa/gh-action-pypi-publish@v1.9.0...v1.13.0)

---
updated-dependencies:
- dependency-name: pypa/gh-action-pypi-publish
  dependency-version: 1.13.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
…github/workflows/pypa/gh-action-pypi-publish-1.13.0 chore(deps): bump pypa/gh-action-pypi-publish from 1.9.0 to 1.13.0 in /.github/workflows
The >>> prompts in the .. code-block:: python directives were being interpreted by pytest's --doctest-modules as actual doctests, even though they're inside Sphinx code-block directives. This caused doctest failures due to the ... continuation lines being misinterpreted as doctest continuation markers.

Changed the examples to plain Python code without the interactive prompts, which is more appropriate for Sphinx code-block directives anyway. The examples are now for documentation purposes only, not executable doctests.

Fixes doctest errors in:
- add_bounds_from_coords()
- add_vertical_bounds()
The logger was hardcoded to INFO level, which meant that legitimate INFO log statements (side effects of normal operation) would appear in doctest output even when PYTHONLOGLEVEL=CRITICAL was set in CI.

Now the logger respects the PYTHONLOGLEVEL environment variable, allowing doctests to run with logging suppressed while keeping the logging statements in the actual code (which is correct: logging is a valid side effect).

Changes:
- Read PYTHONLOGLEVEL from environment, default to INFO if not set
- Apply the log level when configuring the RichHandler
- This allows CI doctest runs to suppress all logs below CRITICAL
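The environment-driven log level described above could look roughly like the following. This is a minimal sketch, not the pycmor logger itself: `resolve_log_level` and `build_logger` are hypothetical names, and a standard-library `StreamHandler` stands in for the `RichHandler` the commit mentions.

```python
import logging
import os


def resolve_log_level(default: str = "INFO") -> int:
    """Return the numeric level named by PYTHONLOGLEVEL, or the default."""
    name = os.environ.get("PYTHONLOGLEVEL", default).upper()
    level = logging.getLevelName(name)
    # For unknown names getLevelName returns a string like "Level FOO",
    # so fall back to the default level in that case.
    if not isinstance(level, int):
        level = logging.getLevelName(default.upper())
    return level


def build_logger(name: str = "pycmor_sketch") -> logging.Logger:
    """Hypothetical helper: attach one handler at the resolved level."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()  # stand-in for rich's RichHandler
    level = resolve_log_level()
    handler.setLevel(level)
    logger.addHandler(handler)
    logger.setLevel(level)
    return logger
```

With `PYTHONLOGLEVEL=CRITICAL` exported in a CI job, a logger built this way would drop INFO records, so they would not leak into doctest output.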
Re-added doctest prompts (>>>) to bounds.py examples now that logging is properly suppressed via PYTHONLOGLEVEL. The examples now show both input and output datasets with structured representations, making it much easier to understand what the functions do.

Changes:
- Restored >>> prompts for executable doctests
- Added print() statements for input datasets before transformation
- Added print() statements for output datasets after transformation
- Used doctest directives (+ELLIPSIS, +NORMALIZE_WHITESPACE) for flexibility
- Shows full xarray Dataset structure: dimensions, coordinates, data variables

This provides clear before/after visualization while maintaining executable tests that verify the functions work correctly.
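To illustrate how the +ELLIPSIS directive tolerates variable output, here is a small self-contained sketch using only the standard library. `describe_bounds` and `run_examples` are hypothetical and unrelated to the real bounds.py functions; the point is that the wildcard sits inside the expected line, so it is not mistaken for a continuation prompt.

```python
import doctest
from datetime import datetime


def describe_bounds(n_cells: int) -> str:
    """Return a log-like line whose timestamp changes on every run.

    >>> print(describe_bounds(3))  # doctest: +ELLIPSIS
    [... INFO] added bounds for 3 cells
    """
    return f"[{datetime.now():%H:%M:%S} INFO] added bounds for {n_cells} cells"


def run_examples() -> doctest.TestResults:
    """Parse and run the docstring example above, returning (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(
        describe_bounds.__doc__,
        {"describe_bounds": describe_bounds},  # globals visible to the example
        "describe_bounds",
        None,
        0,
    )
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)
```

Calling `run_examples()` reports zero failures even though the timestamp differs between runs, because `...` matches any text between `[` and ` INFO]`.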
ARM64 builds take 3-4x longer due to QEMU emulation, so make them optional to speed up CI. Builds now default to linux/amd64 only.

To build ARM64 images:
1. Go to Actions tab in GitHub
2. Select 'Run Basic Tests' workflow
3. Click 'Run workflow'
4. Check the 'Build ARM64 images' option

This allows:
- Fast CI for most PRs and commits (amd64 only)
- Manual ARM64 builds when needed for M1/M2/M3 Mac users
- ARM64 builds still happen on tags (for releases)

Changes:
- Add workflow_dispatch trigger with build_arm64 boolean input
- Conditionally set platforms based on input (defaults to amd64 only)
- Applied to all 4 Python version build jobs

- Implement three chunking algorithms (simple, even_divisor, iterative) inspired by dynamic_chunks library
- Add chunking module (src/pycmor/std_lib/chunking.py) with functions for calculating optimal chunk sizes based on target size and access patterns
- Integrate chunking into save_dataset() with automatic encoding generation
- Add 7 new configuration options for chunking and compression control
- Support global and per-rule chunking configuration via YAML
- Include comprehensive test suite (13 tests, all passing)
- Add user documentation with examples and troubleshooting guide
- Default: 100MB chunks, time-dimension preference, level 4 compression

This enables users to optimize NetCDF file I/O performance by configuring internal chunking strategies that match their data access patterns.
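As a rough illustration of what an "even divisor" strategy might compute, the sketch below picks the largest chunk length along one dimension that divides it evenly and stays within a byte budget. This is an assumption about the general idea only; `even_divisor_chunk` and its parameters are hypothetical and do not reproduce the code in src/pycmor/std_lib/chunking.py.

```python
def even_divisor_chunk(dim_len: int, bytes_per_step: int, target_bytes: int) -> int:
    """Largest divisor of dim_len whose chunk size is at most target_bytes.

    dim_len:        length of the dimension being chunked (e.g. time steps)
    bytes_per_step: bytes occupied by one step along that dimension
    target_bytes:   desired upper bound on the chunk size in bytes
    """
    best = 1  # a single step is always a valid (if small) chunk
    for size in range(1, dim_len + 1):
        if dim_len % size == 0 and size * bytes_per_step <= target_bytes:
            best = size
    return best


# Example: 120 time steps of 10 MiB each with the 100 MB default from the
# commit message, interpreted here as 100 MiB for simplicity.
MIB = 1024 ** 2
chunk_len = even_divisor_chunk(120, 10 * MIB, 100 * MIB)
```

Choosing a divisor keeps every chunk the same length, which avoids a short trailing chunk at the end of the dimension.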
…ss()
- Replace auto-import with enable_xarray_accessor() for lazy registration
- Add _build_rule() helper for interactive Rule construction
- Add StdLibAccessor with tab-completable std_lib steps via ds.pycmor.stdlib
- Add .process() method for running full pipelines interactively
- Add BaseModelRun ABC in pycmor.tutorial for test infrastructure
- Update existing tests to use enable_xarray_accessor()
- Add comprehensive test suite in test_accessor_api.py
# Conflicts: # src/pycmor/core/cmorizer.py
- Add required compound_name field to all CMIP7 test config rules (validator requires it for cmor_version=CMIP7)
- Add setuptools to Dockerfile.test (pyfesom2 imports pkg_resources)
The vendored all_var_info.json does not populate cmip7_compound_name or cmip6_compound_name on DRVs. So variable_id falls back to the short name (e.g., "tas"). The matching logic compared the full compound name "Amon.tas" against the plain "tas" when only one side had a dot, which always failed. Fix: always extract the short name from compound_name for comparison, regardless of whether the DRV also has dots. Also add a fallback match against drv.name directly. Add CMIP7 DRV fixtures (dr_cmip7_tas, dr_cmip7_thetao) for testing.
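The matching rule described above can be sketched as follows. This assumes the two-part "table.variable" form shown in the commit message ("Amon.tas"); longer dotted names may need a different extraction. `short_name` and `matches` are hypothetical helpers, not the functions in pycmor.

```python
from typing import Optional


def short_name(name: str) -> str:
    """Return the part after the last dot; a name without dots is unchanged."""
    return name.rsplit(".", 1)[-1]


def matches(compound_name: str, variable_id: str, drv_name: Optional[str] = None) -> bool:
    """Compare short names on both sides, then fall back to the DRV name."""
    wanted = short_name(compound_name)
    # Always reduce both sides to the short name, whether or not each has a dot.
    if wanted == short_name(variable_id):
        return True
    # Fallback: compare against the data request variable's own name.
    return drv_name is not None and wanted == short_name(drv_name)
```

Under this scheme `matches("Amon.tas", "tas")` succeeds, which is the case the commit says used to fail when only one side carried a dot.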
Pipeline._run_prefect() now uses return_state=True and checks for failures, re-raising the original exception. Previously, Prefect swallowed exceptions via on_failure callbacks that only logged. CMORizer._parallel_process_prefect() also checks both the flow-level state and individual rule future states for failures. This ensures integration tests correctly fail when pipeline steps raise exceptions.
DefaultPipeline had both handle_unit_conversion (correct pipeline step taking data+rule) and units.convert (low-level function taking da+from_unit+to_unit). The latter was called with (data, rule) args, causing ParameterBindError: missing required argument 'to_unit'. handle_unit_conversion already calls convert() internally, so the duplicate step was both wrong and redundant.
- dimension_mapping.py: use getattr(rule, "dimension_mapping") instead
of rule._pycmor_cfg("dimension_mapping", default={}) -- dimension_mapping
is a rule attribute, not a config option, and everett rejects non-string
defaults
- CMIP7 test configs: add activity_id="CMIP" to rules that need it for
global attribute generation
- cmorizer.py: fix parallel error checking to handle both PrefectFuture
and State objects from different Prefect versions
…_run
- dimension_mapping.py: check isinstance(user_mapping, dict) to handle Mock objects in tests (getattr on Mock returns Mock, not None)
- base_model_run.py: convert doctest example to code-block to prevent pytest from trying to execute it
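The Mock pitfall mentioned above is easy to reproduce with the standard library. In this sketch, `get_dimension_mapping` is a hypothetical helper illustrating the isinstance guard; it is not the code in dimension_mapping.py.

```python
from unittest.mock import Mock


def get_dimension_mapping(rule) -> dict:
    """Return the rule's dimension mapping, or an empty dict if it is unusable."""
    user_mapping = getattr(rule, "dimension_mapping", None)
    # A Mock auto-creates any attribute, so getattr never returns None here;
    # checking the type is what filters out Mock placeholders.
    if not isinstance(user_mapping, dict):
        return {}
    return user_mapping


bare_rule = Mock()  # attribute access yields another Mock
real_rule = Mock(dimension_mapping={"lev": "plev19"})  # illustrative mapping
```

A plain `is None` check would let the auto-created Mock through, while the type check treats it the same as a missing mapping.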
Cherry-picked from PR #194 by @mzapponi (adapted for src/pycmor/ paths):
- gather_inputs.py: if rule has time_dimname and dataset uses that dimension instead of "time", rename it automatically on load
- pipeline.py: defensive getattr for _cluster attribute

Co-authored-by: Martina Zapponi <mzapponi@users.noreply.github.com>
fix: accessor API with lazy registration and BaseModelRun infrastructure
Summary
This PR consolidates several major improvements for the pycmor 1.1.0 release:
Major Changes
✅ PR #212 - Pyproject Migration
setup.py/setup.cfg to modern pyproject.toml configuration
✅ PR #224 - Pixi Support
pixi.lock for reproducible environments
✅ PR #222 - CMIP7 Controlled Vocabularies Implementation
Already Incorporated
The following PRs were already merged into prep-release:
Breaking Changes
Testing
Checklist
pixi conda environment