feat: aggregate_sample_features saves .pt tensor alongside H5 by raylim · Pull Request #112 · pathology-data-mining/Mussel

raylim · 2026-04-02T21:45:46Z

Update desc_doc in aggregate_sample_features.py to mention H5 and PT tensor outputs
Add opt-out example (save_pt=false) to CLI help text
Move new parameters (save_pt, output_pt_suffix) to end of signature for backward compatibility (resolved)
Handle stale PT file when save_pt=False (resolved)

Copilot

Pull request overview

This PR extends aggregate_sample_features to save a PyTorch .pt tensor alongside the existing per-sample HDF5 output, enabling downstream consumers (e.g., mussel-nf slide-level inputs) to consume features without an extra conversion step.

Changes:

Add optional .pt output to aggregate_sample_features via save_pt (default True) and output_pt_suffix.
Expose save_pt / output_pt_suffix through the aggregate_sample_features CLI config and pass-through.
Expand tests to assert .pt outputs exist and have expected shapes, plus add an opt-out test for save_pt=False.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File	Description
`mussel/utils/feature_extract.py`	Add optional `.pt` writing for aggregated sample features.
`mussel/cli/aggregate_sample_features.py`	Surface new `.pt`-output options in the CLI config and wire them into the call.
`tests/mussel/cli/test_aggregate_sample_features.py`	Add `.pt` existence/shape assertions and a `save_pt=False` test.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

mussel/utils/feature_extract.py

Copilot · 2026-04-02T21:49:10Z

mussel/cli/aggregate_sample_features.py

+        output_dir (str): Directory where output files are written.
+        output_h5_suffix (str): Filename suffix for H5 output files (default "features.h5").
            Each sample writes to "{output_dir}/{sample_id}.{output_h5_suffix}".
+        output_pt_suffix (str): Filename suffix for PT output files (default "features.pt").
+            Only used when save_pt=True.
+        save_pt (bool): Whether to also save a PyTorch .pt tensor alongside the H5
+            (default True).


The CLI help text in desc_doc below still says it “writes one output H5 per unique sample”, but the default behavior now also writes a .pt tensor unless save_pt=false. Please update the CLI description/help output to mention the additional PT file and how to opt out, so the CLI’s usage docs match the actual outputs.

@copilot apply changes based on this feedback

Done in a761faa. The desc_doc now says "writes one output H5 and one output PT tensor per unique sample" and includes an H5-only example using save_pt=false.

Already addressed during the rebase onto #115 — desc_doc now reads "writes one output H5 and one output PT tensor per unique sample. To write only the H5 (no PT file), pass save_pt=false" and includes an H5-only example.

Add save_pt=True (default) and output_pt_suffix='features.pt' parameters to aggregate_sample_features and AggregateSampleFeaturesConfig. Motivation: downstream pipelines (mussel-nf MERGE_SAMPLE_FEATURES) need a .pt tensor for slide-level model compatibility. Previously callers had to manually convert H5 → PT; this makes it native. Changes: - mussel/utils/feature_extract.py: add save_pt, output_pt_suffix params; save torch.from_numpy(features) via save_torch_tensor when save_pt=True - mussel/cli/aggregate_sample_features.py: expose save_pt, output_pt_suffix in AggregateSampleFeaturesConfig and pass through to the function - tests: add PT output assertions to existing tests; add test_aggregate_sample_features_save_pt_false for opt-out path Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings April 2, 2026 21:45

Copilot started reviewing on behalf of raylim April 2, 2026 21:46 View session

Copilot AI reviewed Apr 2, 2026

View reviewed changes

Copilot started work on behalf of raylim April 2, 2026 22:00 View session

Copilot finished work on behalf of raylim April 2, 2026 22:02

raylim force-pushed the feat/aggregate-save-pt branch from a761faa to 7381074 Compare April 3, 2026 17:04

raylim changed the base branch from main to refactor/aggregate-sample-features-use-objects April 3, 2026 17:04

raylim changed the base branch from refactor/aggregate-sample-features-use-objects to main April 3, 2026 17:06

raylim force-pushed the feat/aggregate-save-pt branch from 7381074 to 88d5881 Compare April 3, 2026 17:06

raylim merged commit 46e4b23 into main Apr 3, 2026
2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: aggregate_sample_features saves .pt tensor alongside H5#112

feat: aggregate_sample_features saves .pt tensor alongside H5#112
raylim merged 1 commit intomainfrom
feat/aggregate-save-pt

raylim commented Apr 2, 2026 •

edited by Copilot AI

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Copilot AI Apr 2, 2026

Uh oh!

raylim Apr 2, 2026

Uh oh!

Copilot AI Apr 2, 2026

Uh oh!

raylim Apr 3, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

raylim commented Apr 2, 2026 • edited by Copilot AI Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

raylim Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

raylim Apr 3, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

raylim commented Apr 2, 2026 •

edited by Copilot AI

Loading