Daskify features by ckmah · Pull Request #179 · YeoLab/bento-tools

ckmah · 2025-02-25T03:48:42Z

- point features
- shape features (Extend shape features #168)
- rename "features" to something less generic

- Introduced `pytest.ini` for test discovery and logging configuration.
- Updated `_utils.py` to persist points in Dask.
- Enhanced image measurement module with new functions: `mean_intensity`, `regionprops`, and `moments_optimized`.
- Added clustering functionality in `_cluster.py` for unsupervised learning on point features.
- Implemented density and distance calculations in `_density.py` and `_distance.py`.
- Introduced optimized measurement engine in `_measure_optimized.py` using Polars and Numba.
- Added tests for optimized measure functions in `test_measure_optimized.py` to check functionality and performance benchmarks.
- Updated shape measures in `_measure.py` to support parallel processing and improved logging.
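To make concrete what a per-label image measure such as `mean_intensity` computes, here is a minimal sketch in plain Python. It is not the code from this PR: the helper name `mean_intensity_by_label`, its signature, and the convention that label 0 marks background are all assumptions made for illustration, and a real implementation would more likely operate on chunked arrays.

```python
from collections import defaultdict


def mean_intensity_by_label(labels, image, background=0):
    """Return {label: mean pixel intensity} for a 2D label mask and image.

    Hypothetical helper for illustration only; not the bento-tools API.
    """
    if len(labels) != len(image):
        raise ValueError("labels and image must have the same number of rows")

    totals = defaultdict(float)
    counts = defaultdict(int)
    for label_row, image_row in zip(labels, image):
        if len(label_row) != len(image_row):
            raise ValueError("labels and image rows must have the same length")
        for label, value in zip(label_row, image_row):
            if label == background:
                continue  # assumed convention: 0 marks unlabeled pixels
            totals[label] += value
            counts[label] += 1
    return {label: totals[label] / counts[label] for label in totals}


# Toy 2x3 label mask with two regions (1 and 2) and a matching image.
labels = [
    [0, 1, 1],
    [2, 2, 0],
]
image = [
    [9.0, 2.0, 4.0],
    [1.0, 3.0, 9.0],
]
means = mean_intensity_by_label(labels, image)
```

Accumulating sums and counts per label in a single pass keeps the sketch independent of how many labels the mask contains.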

review-notebook-app · 2025-06-22T02:36:59Z

Check out this pull request on ReviewNB. See visual diffs and provide feedback on Jupyter Notebooks.

… ripley metrics
- Created SVG files for distance, polarity, moments, and ripley benchmarks to visualize performance metrics.
- Added corresponding JSON files to store benchmark results and machine information.
- Introduced new plotting scripts to generate benchmark analysis and summary plots.
- Updated dependencies in `pyproject.toml` to include `polars` for enhanced data processing capabilities.

… data
- Removed `xgboost` from dependencies and updated `scikit-learn` to a more flexible version.
- Consolidated development dependencies under a new `dev` section in `pyproject.toml`.
- Enhanced the synthetic dataset creation function to unify shapes, points, and optional images/labels for testing.
- Updated tests to utilize the new synthetic data generation approach, ensuring consistency across shape and point feature tests.
- Improved logging in test fixtures for better traceability during test execution.

- Keep data in sparse format throughout computation where possible
- Add chunk_size parameter for controlling memory usage during writes
- Update set_points_metadata to preserve sparse DataFrame columns
- Add automatic zarr writing for efficient storage via ome-zarr
- Improve memory efficiency with sparse-aware operations

Co-authored-by: ckmah <3103744+ckmah@users.noreply.github.com>
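The idea behind a `chunk_size` parameter can be sketched as follows: rows are handed to the writer in bounded batches, so only one batch needs to be materialized at a time. This is a minimal standard-library sketch under assumptions, not the PR's implementation; `iter_chunks`, `write_in_chunks`, and the `write` callback are hypothetical names, and the actual zarr writing is replaced by a plain callback.

```python
def iter_chunks(n_rows, chunk_size):
    """Yield (start, stop) row ranges that cover n_rows in order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, n_rows, chunk_size):
        yield start, min(start + chunk_size, n_rows)


def write_in_chunks(rows, chunk_size, write):
    """Pass at most chunk_size rows at a time to the write callback.

    `write` stands in for whatever storage backend is used (assumption).
    Returns the number of rows written.
    """
    n_written = 0
    for start, stop in iter_chunks(len(rows), chunk_size):
        write(rows[start:stop])
        n_written += stop - start
    return n_written


# Collect the batches in a list instead of writing to disk.
batches = []
total = write_in_chunks(list(range(10)), chunk_size=4, write=batches.append)
```

A smaller `chunk_size` lowers peak memory at the cost of more write calls; the final batch may be shorter than the rest.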

- Create raster grid programmatically from shape bounds
- Store flux results as multi-channel 2D image with chunked dask arrays
- Channels include: gene values, embeddings, color RGB, and counts
- Leverage ome-zarr native chunked writing via Image2DModel
- Update tests to validate image-based storage
- Remove dependency on pre-existing raster points

Co-authored-by: ckmah <3103744+ckmah@users.noreply.github.com>
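Creating a raster grid from shape bounds, as described in the first bullet, might look roughly like the sketch below. It assumes bounds in the `(minx, miny, maxx, maxy)` order and a square pixel of size `step`; the function name `make_raster_grid` and the choice to return pixel-center coordinates are illustrative assumptions, not details taken from the PR. Plain Python is used to keep the example dependency-free.

```python
import math


def make_raster_grid(bounds, step):
    """Return (xs, ys) pixel-center coordinates covering the bounds.

    bounds is (minx, miny, maxx, maxy); step is the pixel size.
    Hypothetical helper for illustration only.
    """
    minx, miny, maxx, maxy = bounds
    if step <= 0:
        raise ValueError("step must be positive")
    if maxx < minx or maxy < miny:
        raise ValueError("bounds must satisfy minx <= maxx and miny <= maxy")

    # Always produce at least one pixel, even for degenerate bounds.
    n_cols = max(1, math.ceil((maxx - minx) / step))
    n_rows = max(1, math.ceil((maxy - miny) / step))

    xs = [minx + (i + 0.5) * step for i in range(n_cols)]
    ys = [miny + (j + 0.5) * step for j in range(n_rows)]
    return xs, ys


xs, ys = make_raster_grid((0.0, 0.0, 10.0, 5.0), step=2.5)
```

Deriving the grid from the bounds means the image shape follows from the data extent and pixel size alone, with no pre-existing raster points required.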

- Add adaptive scale factors based on image dimensions
- Only create multiscale pyramid if image is large enough
- Prevents errors with small images in tests
- Ensures proper ome-zarr metadata for all image sizes

Co-authored-by: ckmah <3103744+ckmah@users.noreply.github.com>
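One way to read "adaptive scale factors" is sketched below: a pyramid level is added only while the downscaled image would still be at least some minimum size, so small test images get no pyramid at all. The function name and the `min_size`, `factor`, and `max_levels` defaults are hypothetical values chosen for the example; the thresholds used in the PR are not stated.

```python
def adaptive_scale_factors(height, width, min_size=256, factor=2, max_levels=4):
    """Return per-level downscale factors, or [] if no pyramid level fits.

    A level is added only while the downscaled image keeps at least
    min_size pixels along its shorter side. Illustrative sketch only.
    """
    if factor < 2:
        raise ValueError("factor must be at least 2")

    factors = []
    h, w = height, width
    while len(factors) < max_levels and min(h, w) // factor >= min_size:
        factors.append(factor)
        h //= factor
        w //= factor
    return factors


small = adaptive_scale_factors(64, 64)      # too small: single-scale image
large = adaptive_scale_factors(2048, 1024)  # large enough for two levels
```

Returning an empty list for small images lets the caller fall back to writing a single-scale image instead of failing while building the pyramid.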

…ning
- Introduced functions for creating raster grids and computing cell flux values.
- Implemented SVD model training for dimensionality reduction of flux data.
- Added support for streaming computation with zarr-backed SpatialData.
- Updated flux function to handle batch processing and efficient writing to zarr.
- Adjusted parameters for training size and improved memory management during processing.

Co-authored-by: ckmah <3103744+ckmah@users.noreply.github.com>

ckmah added 13 commits February 23, 2025 19:40

Fix Xenium data transformation scaling in prep function

2cfba0d

experimental/simpler dask bags implementation for point features

dbc8e14

Add experimental point features module with catalog decorator

f8add44

rename distance functions

e5d52f9

simplify api for point features, now renamed to measure

7c2f698

add more point measures

df24a7b

add untested shape measures

00318e5

wip image measures

c7a1df0

add logger

169237e

generalized dask parallel for labeled images

b8c1ead

sklearn dep regression

156e868

testing datasets, need to cleanup

8aacbf4

ckmah and others added 12 commits July 25, 2025 21:02

feat: replace current implementation of features with experimental

da56fc3

Initial plan

e681e40

Initial setup: Fix scipy version constraint for Python 3.12 compatibility

ed510a2

Co-authored-by: ckmah <3103744+ckmah@users.noreply.github.com>

Fix function name collision (radius parameter vs imported function)

acbe0f7

Co-authored-by: ckmah <3103744+ckmah@users.noreply.github.com>

replace test data with synthetic

7b3a033

Merge pull request #192 from YeoLab/copilot/refactor-bt-tl-flux-algorithm

18aa1d1

Refactor flux algorithm to use Image2DModel with chunked ome-zarr storage

ckmah wants to merge 25 commits into master from daskify-features
