
Conversation

@alejoe91
Member

@alejoe91 alejoe91 commented Jan 7, 2026

Depends on #4316

Slices a sorting object based on an array of valid periods. Periods are defined with a structured dtype:

base_period_dtype = [
    ("start_sample_index", "int64"),
    ("end_sample_index", "int64"),
    ("segment_index", "int64"),
    ("unit_index", "int64"),
]
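A minimal sketch of building a periods array with this dtype (the sample values here are made up for illustration):

```python
import numpy as np

# Structured dtype from the PR description above
base_period_dtype = [
    ("start_sample_index", "int64"),
    ("end_sample_index", "int64"),
    ("segment_index", "int64"),
    ("unit_index", "int64"),
]

# Two valid periods for unit 0 in segment 0, and one for unit 1
periods = np.array(
    [(0, 30000, 0, 0), (60000, 90000, 0, 0), (0, 30000, 0, 1)],
    dtype=base_period_dtype,
)
print(periods["unit_index"])  # [0 0 1]
```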

EDIT:

Refactored the computation of spike train metrics to make sure that periods are consistently taken into account. Added two utility functions to compute durations per unit and bin edges per unit, which optionally use the provided periods.

@alejoe91 alejoe91 added the core Changes to core module label Jan 7, 2026
@alejoe91 alejoe91 mentioned this pull request Jan 7, 2026
1 task
@alejoe91 alejoe91 marked this pull request as ready for review January 8, 2026 07:36
@alejoe91 alejoe91 requested a review from chrishalcrow January 8, 2026 07:36
@samuelgarcia
Member

This is OK for me.
Would clear documentation somewhere help?

Co-authored-by: Chris Halcrow <57948917+chrishalcrow@users.noreply.github.com>
@alejoe91
Member Author

@chrishalcrow I refactored a few metrics to make sure durations, spike counts, and bins are properly accounted for when slicing with periods. Happy to discuss this!

@alejoe91 alejoe91 changed the title from "Implement select_sorting_periods" to "Implement select_sorting_periods in metrics" Jan 13, 2026
@alejoe91
Member Author

Don't we want to extend this beyond quality metrics to the extensions? What if someone wants to get the ISI, CCG, or anything else only on the periods? Would it be easy to slice the sorting and then compute only on the sub-sorting? Are the extensions robust w.r.t. periods?

I think that extension by extension we could use valid_unit_periods when computed :)

@alejoe91
Member Author

@samuelgarcia @chrishalcrow tests added! all good now :)

num_segments = len(segment_samples)
bin_duration_samples = int(bin_duration_s * sorting.sampling_frequency)

if periods is not None:
Member

When there are no good periods, then periods will be an empty array? Should we have a general strategy for when this is true - empty units often cause pain/nans...

Member Author

I added a test for empty periods in num spikes and firing rates. I could add similar tests for all metrics. What do you think?

"""

sorting = sorting_analyzer.sorting
sorting = sorting.select_periods(periods)
Member

When I choose periods so that only one unit is good and get the num spikes, I get:

import spikeinterface.full as si
import numpy as np
from spikeinterface.core.base import unit_period_dtype

rec, sort = si.generate_ground_truth_recording(seed=1205)
sa = si.create_sorting_analyzer(sort, rec)
periods = np.array((0,1000,4000,0), dtype=unit_period_dtype)
quality_metrics = si.compute_quality_metrics(sa, periods=periods, metric_names = ['num_spikes'])

print(quality_metrics['num_spikes'].values)
>>> [162 173 135 154 158 144 151 141 153 162]

But this sorting only has spikes from unit index 0, right? Is count_num_spikes_per_unit below using an old cache or something?

Member Author

good point! Thanks for testing! Let me check ;)

Member Author

Fixed now! I was missing the supports_periods class attribute on the spike train metrics.

Also added similar tests there :)

cumulative_segment_samples = np.cumsum([0] + segment_samples[:-1])
for unit_id in unit_ids:
firing_histograms = []
if num_spikes[unit_id] == 0:
Member

Hello - this num_spikes seems to be the number of spikes of the original sorting before unit periods are selected, which causes problems down the line.

Member Author

oups!

Member Author

if cls.needs_job_kwargs:
args += (job_kwargs,)
if cls.supports_periods:
args += (periods,)
Member

Suggested change
args += (periods,)
args += (periods,)
else:
    raise ValueError("This metric does not support periods")
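A hypothetical sketch (class and method names assumed, not the actual SpikeInterface code) of the dispatch pattern in the excerpt above: each metric declares a `supports_periods` class attribute, and the caller only forwards `periods` when the metric supports it, raising otherwise:

```python
class BaseMetric:
    # Class attributes that control which extra arguments get forwarded
    needs_job_kwargs = False
    supports_periods = False

    @classmethod
    def build_args(cls, base_args, job_kwargs=None, periods=None):
        args = tuple(base_args)
        if cls.needs_job_kwargs:
            args += (job_kwargs,)
        if periods is not None:
            if cls.supports_periods:
                args += (periods,)
            else:
                raise ValueError("This metric does not support periods")
        return args


class NumSpikes(BaseMetric):
    # Opt in to period-aware computation
    supports_periods = True


print(NumSpikes.build_args(("analyzer",), periods="my_periods"))
# ('analyzer', 'my_periods')
```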

if metric_names is None:
metric_names = self.params["metric_names"]

periods = self.params.get("periods", None)
Member

I guess this is for backward compatibility.
I think it would be better to handle the compatibility in _handle_backward_compatibility_on_load, no?


# can't use _misc_metric_name_to_func as some functions compute several qms
# e.g. isi_violation and synchrony
quality_metrics = [
Member

this could be inferred from the class, no?

spike_locations_in_bin = spike_locations_array[i0:i1][direction]

unit_index = sorting_analyzer.sorting.id_to_index(unit_id)
mask = spikes_in_bin["unit_index"] == unit_index
Member

Is this fast?

import numpy as np


def create_ground_truth_pc_distributions(center_locations, total_points):
Member

Not needed anymore?

sorting = sorting_analyzer.sorting
for unit_id in sorting.unit_ids:
unit_index = sorting.id_to_index(unit_id)
periods_unit = periods[periods["unit_index"] == unit_index]
Member

I have the intuition that looping only once over periods and summing directly into a prepared vector will be way faster than this repeated masking in the loop. No?
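A minimal sketch of the single-pass accumulation suggested here (dtype taken from the PR description; the period values are made up): one `np.bincount` call sums the valid samples per unit, instead of masking `periods[periods["unit_index"] == unit_index]` once per unit.

```python
import numpy as np

unit_period_dtype = [
    ("start_sample_index", "int64"),
    ("end_sample_index", "int64"),
    ("segment_index", "int64"),
    ("unit_index", "int64"),
]

# Two periods for unit 0, one for unit 1 (illustrative values)
periods = np.array(
    [(0, 1000, 0, 0), (2000, 3000, 0, 0), (0, 500, 0, 1)],
    dtype=unit_period_dtype,
)
num_units = 2

# Single pass: accumulate per-period sample counts into a per-unit vector
samples = periods["end_sample_index"] - periods["start_sample_index"]
total_samples = np.bincount(periods["unit_index"], weights=samples, minlength=num_units)
print(total_samples)  # [2000.  500.]
```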

num_samples_in_period += period["end_sample_index"] - period["start_sample_index"]
total_samples[unit_id] = num_samples_in_period
else:
total_samples = {unit_id: sorting_analyzer.get_total_samples() for unit_id in sorting_analyzer.unit_ids}
Member

Suggested change
total_samples = {unit_id: sorting_analyzer.get_total_samples() for unit_id in sorting_analyzer.unit_ids}
total = sorting_analyzer.get_total_samples()
total_samples = {unit_id: total for unit_id in sorting_analyzer.unit_ids}

samples_per_period = sorting_analyzer.get_num_samples(segment_index) // num_periods
if bin_size_s is not None:
bin_size_samples = int(bin_size_s * sorting_analyzer.sampling_frequency)
print(samples_per_period / bin_size_samples)
Member

oups

return total_durations


def compute_periods(sorting_analyzer, num_periods, bin_size_s=None):
Member

I think we should find a better name for this.
Like create_regular_periods or something.
The compute_ prefix suggests an algorithmic method. No?
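A hypothetical sketch of what the proposed `create_regular_periods` could look like (name from the suggestion above; the signature and single-segment simplification are assumptions, not the PR's implementation): split a segment into equal chunks and emit one period per chunk per unit.

```python
import numpy as np

unit_period_dtype = [
    ("start_sample_index", "int64"),
    ("end_sample_index", "int64"),
    ("segment_index", "int64"),
    ("unit_index", "int64"),
]


def create_regular_periods(num_samples, num_periods, num_units, segment_index=0):
    """Create evenly spaced periods covering one segment, for every unit."""
    edges = np.linspace(0, num_samples, num_periods + 1, dtype="int64")
    periods = np.empty(num_periods * num_units, dtype=unit_period_dtype)
    i = 0
    for unit_index in range(num_units):
        for p in range(num_periods):
            periods[i] = (edges[p], edges[p + 1], segment_index, unit_index)
            i += 1
    return periods


periods = create_regular_periods(num_samples=1000, num_periods=4, num_units=2)
print(len(periods))  # 8
```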

return np.concatenate(all_periods)


def create_ground_truth_pc_distributions(center_locations, total_points):
Member

ok
moved here

@samuelgarcia
Member

Globally OK for me.
I made a few comments and some of them should be discussed.
