Guard the datashader import in the rasterizer benchmark#3560
Merged
Conversation
Wrap the datashader (and spatialpandas, which only feeds the datashader path) imports in a try/except so the benchmark runs whether or not datashader is installed. The datashader column is skipped when absent and included when present. Claude-Session: https://claude.ai/code/session_01TF3CsJ7nee9agjajEWLw5d
brendancol
commented
Jun 27, 2026
brendancol
left a comment
Contributor
Author
There was a problem hiding this comment.
PR Review: Guard the datashader import in the rasterizer benchmark
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
None.
Nits (optional improvements)
benchmarks/rasterizer_benchmarks.py(generate_markdown): when datashader is absent, the polygon timing and ratio tables still print a "datashader" column header filled with--. The issue wording says "skip the datashader column". Dropping the column outright for polygons would mean threadingHAS_DATASHADERintogenerate_markdown. The--fallback matches how the cupy column already handles a missing backend, so I read this as a reasonable consistency choice rather than a defect. Flagging it so the maintainer can decide.benchmarks/rasterizer_benchmarks.py:39: the guard catches onlyImportError. If datashader is installed but breaks on import for some other reason, that will propagate. Standard practice and fine; noting for completeness.
What looks good
- spatialpandas and datashader are folded into one guard. That is correct, since spatialpandas only feeds the datashader path, so there is no half-guarded state.
- Every reference to
ds.andspatialpandas(therun_datashaderbody, thespd_gdfbuild) sits behinduse_datashader, which now requiresHAS_DATASHADER. No NameError path when datashader is missing. use_datashader = kind == "polygon" and HAS_DATASHADERis the minimal correct gate.
Checklist
- Algorithm matches reference/paper: n/a (benchmark infra)
- All implemented backends produce consistent results: n/a
- NaN handling is correct: n/a
- Edge cases covered: both datashader-present and datashader-absent paths verified
- Dask chunk boundaries handled correctly: n/a
- No premature materialization or unnecessary copies: yes
- Benchmark exists or not needed: this is the benchmark
- README feature matrix updated: not applicable, no new function
- Docstrings present and accurate: the guard carries an explanatory comment
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #3452.
benchmarks/rasterizer_benchmarks.pykeeps datashader as a deliberate A/B comparison, so it stays referenced. But once datashader is no longer installed by default, the unconditionalimport datashader as dsmade the whole script fail to run. This wraps that import (plus spatialpandas, which only feeds the datashader path) in a try/except.HAS_DATASHADERis True and the full comparison runs unchanged.HAS_DATASHADERis False, the polygon path skips the datashader timing run, and the datashader cells show--(same as how the cupy column already handles a missing backend).Backend coverage: not applicable, this is a benchmark script rather than a spatial op.
Test plan:
HAS_DATASHADERis True and datashader takes part in both the timing and consistency runs.HAS_DATASHADERis False, the script imports, the xrspatial/geocube/rasterio runners and consistency checks still work, and the markdown report builds.py_compilepasses.https://claude.ai/code/session_01TF3CsJ7nee9agjajEWLw5d