Implement select_sorting_periods in metrics
#4302
This is OK for me.
Co-authored-by: Chris Halcrow <57948917+chrishalcrow@users.noreply.github.com>
@chrishalcrow I refactored a few metrics to make sure durations, spike counts, and bins are properly accounted for when slicing with periods. Happy to discuss this!
I think that extension by extension we could use |
@samuelgarcia @chrishalcrow tests added! All good now :)
| num_segments = len(segment_samples) | ||
| bin_duration_samples = int(bin_duration_s * sorting.sampling_frequency) | ||
|
|
||
| if periods is not None: |
When there are no good periods, will periods be an empty array? Should we have a general strategy for this case? Empty units often cause pain/NaNs...
I added a test for empty periods in num spikes and firing rates. I could add similar tests for all metrics. What do you think?
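One possible strategy for the empty-periods case, as a minimal sketch: return NaN whenever a unit has no valid duration, so the rate is marked undefined instead of dividing by zero. The helper name `firing_rate_hz` and its arguments are hypothetical and are not part of this PR.

```python
import numpy as np


def firing_rate_hz(num_spikes, num_samples, sampling_frequency):
    """Return spikes per second, or NaN when the unit has no valid duration.

    Hypothetical helper for illustration only; not the PR's implementation.
    """
    if num_samples == 0:
        # No good periods for this unit: the rate is undefined, not zero.
        return np.nan
    duration_s = num_samples / sampling_frequency
    return num_spikes / duration_s


rate_ok = firing_rate_hz(150, 300_000, 30_000.0)  # 150 spikes over 10 s
rate_empty = firing_rate_hz(0, 0, 30_000.0)  # unit with empty periods
```

Whether NaN or 0 is the better sentinel probably depends on the metric; the point is only that the choice is made explicitly in one place.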
| """ | ||
|
|
||
| sorting = sorting_analyzer.sorting | ||
| sorting = sorting.select_periods(periods) |
When I choose periods so that only one unit is good and compute num_spikes, I get:
import spikeinterface.full as si
import numpy as np
from spikeinterface.core.base import unit_period_dtype
rec, sort = si.generate_ground_truth_recording(seed=1205)
sa = si.create_sorting_analyzer(sort, rec)
periods = np.array((0,1000,4000,0), dtype=unit_period_dtype)
quality_metrics = si.compute_quality_metrics(sa, periods=periods, metric_names = ['num_spikes'])
print(quality_metrics['num_spikes'].values)
>>> [162 173 135 154 158 144 151 141 153 162]
But this sorting only has spikes from unit index 0, right? Is count_num_spikes_per_unit below using an old cache or something?
Good point! Thanks for testing! Let me check ;)
Fixed now! I was missing the supports_periods class attribute on the spike train metrics.
I also added similar tests there :)
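For reference, the behavior expected in the report above can be sketched with plain NumPy: only spikes that fall inside a period for their own unit are counted, so units without periods end up with zero spikes. The dtypes below are hypothetical stand-ins (the field names follow the snippets in this thread, and `segment_index` is an assumption), and the helper assumes half-open, non-overlapping periods.

```python
import numpy as np

# Hypothetical stand-ins for the spike vector and unit_period_dtype.
spike_dtype = np.dtype(
    [("sample_index", "int64"), ("unit_index", "int64"), ("segment_index", "int64")]
)
period_dtype = np.dtype(
    [
        ("segment_index", "int64"),
        ("start_sample_index", "int64"),
        ("end_sample_index", "int64"),
        ("unit_index", "int64"),
    ]
)


def count_spikes_in_periods(spikes, periods, num_units):
    """Count spikes per unit that fall inside that unit's periods."""
    counts = np.zeros(num_units, dtype="int64")
    for period in periods:
        in_period = (
            (spikes["unit_index"] == period["unit_index"])
            & (spikes["segment_index"] == period["segment_index"])
            & (spikes["sample_index"] >= period["start_sample_index"])
            & (spikes["sample_index"] < period["end_sample_index"])
        )
        counts[period["unit_index"]] += np.count_nonzero(in_period)
    return counts


spikes = np.array(
    [(500, 0, 0), (1500, 0, 0), (3900, 0, 0), (4500, 0, 0), (2000, 1, 0)],
    dtype=spike_dtype,
)
# A single period for unit 0 only, covering samples 1000 to 4000 of segment 0.
periods = np.array([(0, 1000, 4000, 0)], dtype=period_dtype)
counts = count_spikes_in_periods(spikes, periods, num_units=2)
```

Here unit 0 keeps the two spikes inside its period and unit 1, which has no period, ends up with none.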
| cumulative_segment_samples = np.cumsum([0] + segment_samples[:-1]) | ||
| for unit_id in unit_ids: | ||
| firing_histograms = [] | ||
| if num_spikes[unit_id] == 0: |
Hello - this num_spikes seems to be the number of spikes of the original sorting before unit periods are selected, which causes problems down the line.
oups!
| if cls.needs_job_kwargs: | ||
| args += (job_kwargs,) | ||
| if cls.supports_periods: | ||
| args += (periods,) |
| args += (periods,) | |
| args += (periods,) | |
| else: | |
| raise ValueError("This metric does not support periods") |
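A variant of the suggestion above, as a sketch only: raise when periods are actually passed to a metric that lacks support, so metrics without `supports_periods` keep working when `periods` is None. The function `build_metric_args` and the two dummy classes are hypothetical; only the `needs_job_kwargs` and `supports_periods` attributes come from the diff.

```python
def build_metric_args(metric_cls, base_args, job_kwargs=None, periods=None):
    """Assemble positional arguments for a metric from its class attributes.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    args = tuple(base_args)
    if getattr(metric_cls, "needs_job_kwargs", False):
        args += (job_kwargs,)
    if getattr(metric_cls, "supports_periods", False):
        args += (periods,)
    elif periods is not None:
        # Only complain when the caller really asked for periods.
        raise ValueError(f"{metric_cls.__name__} does not support periods")
    return args


class WithPeriods:
    needs_job_kwargs = False
    supports_periods = True


class WithoutPeriods:
    needs_job_kwargs = False
    supports_periods = False


args_with = build_metric_args(WithPeriods, ("analyzer",), periods="p")
args_without = build_metric_args(WithoutPeriods, ("analyzer",))
```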
| if metric_names is None: | ||
| metric_names = self.params["metric_names"] | ||
|
|
||
| periods = self.params.get("periods", None) |
I guess this is for backward compatibility.
I think it would be better to handle the compatibility in _handle_backward_compatibility_on_load, no?
|
|
||
| # can't use _misc_metric_name_to_func as some functions compute several qms | ||
| # e.g. isi_violation and synchrony | ||
| quality_metrics = [ |
This could be inferred from the class, no?
| spike_locations_in_bin = spike_locations_array[i0:i1][direction] | ||
|
|
||
| unit_index = sorting_analyzer.sorting.id_to_index(unit_id) | ||
| mask = spikes_in_bin["unit_index"] == unit_index |
Is this fast?
| import numpy as np | ||
|
|
||
|
|
||
| def create_ground_truth_pc_distributions(center_locations, total_points): |
Not needed anymore?
| sorting = sorting_analyzer.sorting | ||
| for unit_id in sorting.unit_ids: | ||
| unit_index = sorting.id_to_index(unit_id) | ||
| periods_unit = periods[periods["unit_index"] == unit_index] |
My intuition is that looping only once over periods and summing directly into a preallocated vector will be way faster than repeating the masking inside the loop.
No?
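The single-pass idea can be sketched without any Python loop by accumulating period lengths into a preallocated vector with `np.add.at`. The dtype is a hypothetical stand-in for `unit_period_dtype` (field names taken from the diff, `segment_index` assumed), and `total_samples_per_unit` is an illustrative name, not a function from this PR.

```python
import numpy as np

# Hypothetical stand-in for unit_period_dtype.
period_dtype = np.dtype(
    [
        ("segment_index", "int64"),
        ("start_sample_index", "int64"),
        ("end_sample_index", "int64"),
        ("unit_index", "int64"),
    ]
)


def total_samples_per_unit(periods, num_units):
    """Sum period lengths per unit in one pass over the periods array."""
    totals = np.zeros(num_units, dtype="int64")
    lengths = periods["end_sample_index"] - periods["start_sample_index"]
    # np.add.at accumulates correctly even when unit_index repeats.
    np.add.at(totals, periods["unit_index"], lengths)
    return totals


periods = np.array(
    [(0, 0, 1000, 0), (0, 2000, 2500, 0), (0, 100, 400, 2)],
    dtype=period_dtype,
)
totals = total_samples_per_unit(periods, num_units=3)
```

Units with no period (unit 1 here) simply keep a total of zero, which also makes the empty case explicit.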
| num_samples_in_period += period["end_sample_index"] - period["start_sample_index"] | ||
| total_samples[unit_id] = num_samples_in_period | ||
| else: | ||
| total_samples = {unit_id: sorting_analyzer.get_total_samples() for unit_id in sorting_analyzer.unit_ids} |
| total_samples = {unit_id: sorting_analyzer.get_total_samples() for unit_id in sorting_analyzer.unit_ids} | |
| total = sorting_analyzer.get_total_samples() | |
| total_samples = {unit_id: total for unit_id in sorting_analyzer.unit_ids} |
| samples_per_period = sorting_analyzer.get_num_samples(segment_index) // num_periods | ||
| if bin_size_s is not None: | ||
| bin_size_samples = int(bin_size_s * sorting_analyzer.sampling_frequency) | ||
| print(samples_per_period / bin_size_samples) |
oups
| return total_durations | ||
|
|
||
|
|
||
| def compute_periods(sorting_analyzer, num_periods, bin_size_s=None): |
I think we should find a better name for this,
like create_regular_periods or something.
The "compute" prefix suggests an algorithmic method.
No?
| return np.concatenate(all_periods) | ||
|
|
||
|
|
||
| def create_ground_truth_pc_distributions(center_locations, total_points): |
OK, moved here.
Globally OK for me.
Depends on #4316
Slices a sorting object based on an array of valid periods. Periods are defined as a structured dtype as:
EDIT:
Refactored the computation of spike train metrics to make sure that periods are consistently taken into account. Added two utility functions to compute durations per unit and bin edges per unit; both optionally use the provided periods.