Conversation
There was a problem hiding this comment.
Pull request overview
Exposes Rust tree-balance indices (Sackin, B2, leaf depth variance) to the Python phylo2vec.stats API via PyO3 bindings, and adds Python tests ported from the Rust implementation.
Changes:
- Added PyO3 bindings for
sackin,leaf_depth_variance, andb2in the Rust extension module. - Introduced
phylo2vec.stats.balancePython wrapper module and exported the new metrics fromphylo2vec.stats. - Added Python tests covering expected values on small, ladder, and balanced trees.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
py-phylo2vec/src/lib.rs |
Adds PyO3-exposed functions wired to Rust balance metrics and registers them in _phylo2vec_core. |
py-phylo2vec/phylo2vec/stats/balance.py |
New Python-facing wrappers that accept vector/matrix inputs and call into the Rust extension. |
py-phylo2vec/phylo2vec/stats/__init__.py |
Updates docstring and exports to include the new balance metrics in __all__. |
py-phylo2vec/tests/test_stats.py |
Adds test cases for Sackin, leaf depth variance, and B2. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def sackin(vector_or_matrix): | ||
| if vector_or_matrix.ndim == 2: | ||
| v = vector_or_matrix[:, 0].astype(int) | ||
| elif vector_or_matrix.ndim == 1: | ||
| v = vector_or_matrix | ||
| else: | ||
| raise ValueError( | ||
| "vector_or_matrix should either be a vector (ndim == 1) or matrix (ndim == 2)" | ||
| ) | ||
| return core.sackin(v) | ||
|
|
||
|
|
||
| def b2(vector_or_matrix): | ||
| if vector_or_matrix.ndim == 2: | ||
| v = vector_or_matrix[:, 0].astype(int) | ||
| elif vector_or_matrix.ndim == 1: | ||
| v = vector_or_matrix | ||
| else: | ||
| raise ValueError( | ||
| "vector_or_matrix should either be a vector (ndim == 1) or matrix (ndim == 2)" | ||
| ) | ||
| return core.b2(v) | ||
|
|
||
|
|
||
| def leaf_depth_variance(vector_or_matrix): | ||
| if vector_or_matrix.ndim == 2: | ||
| v = vector_or_matrix[:, 0].astype(int) | ||
| elif vector_or_matrix.ndim == 1: | ||
| v = vector_or_matrix | ||
| else: | ||
| raise ValueError( | ||
| "vector_or_matrix should either be a vector (ndim == 1) or matrix (ndim == 2)" | ||
| ) |
There was a problem hiding this comment.
The ndim-check + topology extraction logic is duplicated across sackin/b2/leaf_depth_variance. Consider factoring this into a small private helper (e.g., _extract_topology_vector) to reduce repetition and ensure any future changes (dtype casting, validation, error message) stay consistent across metrics.
| def sackin(vector_or_matrix): | |
| if vector_or_matrix.ndim == 2: | |
| v = vector_or_matrix[:, 0].astype(int) | |
| elif vector_or_matrix.ndim == 1: | |
| v = vector_or_matrix | |
| else: | |
| raise ValueError( | |
| "vector_or_matrix should either be a vector (ndim == 1) or matrix (ndim == 2)" | |
| ) | |
| return core.sackin(v) | |
| def b2(vector_or_matrix): | |
| if vector_or_matrix.ndim == 2: | |
| v = vector_or_matrix[:, 0].astype(int) | |
| elif vector_or_matrix.ndim == 1: | |
| v = vector_or_matrix | |
| else: | |
| raise ValueError( | |
| "vector_or_matrix should either be a vector (ndim == 1) or matrix (ndim == 2)" | |
| ) | |
| return core.b2(v) | |
| def leaf_depth_variance(vector_or_matrix): | |
| if vector_or_matrix.ndim == 2: | |
| v = vector_or_matrix[:, 0].astype(int) | |
| elif vector_or_matrix.ndim == 1: | |
| v = vector_or_matrix | |
| else: | |
| raise ValueError( | |
| "vector_or_matrix should either be a vector (ndim == 1) or matrix (ndim == 2)" | |
| ) | |
| def _extract_topology_vector(vector_or_matrix): | |
| """Extract a 1D topology vector from either a vector or matrix.""" | |
| if vector_or_matrix.ndim == 2: | |
| return vector_or_matrix[:, 0].astype(int) | |
| if vector_or_matrix.ndim == 1: | |
| return vector_or_matrix | |
| raise ValueError( | |
| "vector_or_matrix should either be a vector (ndim == 1) or matrix (ndim == 2)" | |
| ) | |
| def sackin(vector_or_matrix): | |
| v = _extract_topology_vector(vector_or_matrix) | |
| return core.sackin(v) | |
| def b2(vector_or_matrix): | |
| v = _extract_topology_vector(vector_or_matrix) | |
| return core.b2(v) | |
| def leaf_depth_variance(vector_or_matrix): | |
| v = _extract_topology_vector(vector_or_matrix) |
and phylo2vec.stats.balance module
small trees