Enhance ResamplerConfig with closed and label options and add comprehensive tests #1344
Changes from all commits
c1003e4
add9529
37e7409
7e1cb16
7e9e406
c1ba871
81a8840
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -9,7 +9,7 @@ | |
| import itertools | ||
| import logging | ||
| import math | ||
| from bisect import bisect | ||
| from bisect import bisect, bisect_left | ||
| from collections import deque | ||
| from datetime import datetime, timedelta, timezone | ||
| from typing import assert_never | ||
|
|
@@ -20,7 +20,7 @@ | |
| from ..._internal._asyncio import cancel_and_await | ||
| from .._base_types import Sample | ||
| from ._base_types import Sink, Source, SourceProperties | ||
| from ._config import ResamplerConfig, ResamplerConfig2 | ||
| from ._config import ResamplerConfig, ResamplerConfig2, WindowSide | ||
| from ._exceptions import ResamplingError, SourceStoppedError | ||
| from ._wall_clock_timer import TickInfo, WallClockTimer | ||
|
|
||
|
|
@@ -411,7 +411,8 @@ def resample(self, timestamp: datetime) -> Sample[Quantity]: | |
| """Generate a new sample based on all the current *relevant* samples. | ||
|
|
||
| Args: | ||
| timestamp: The timestamp to be used to calculate the new sample. | ||
| timestamp: The reference timestamp for the resampling process. This | ||
| timestamp indicates the end of the resampling period. | ||
|
|
||
| Returns: | ||
| A new sample generated by calling the resampling function with all | ||
|
|
@@ -437,12 +438,22 @@ def resample(self, timestamp: datetime) -> Sample[Quantity]: | |
| ) | ||
| minimum_relevant_timestamp = timestamp - period * conf.max_data_age_in_periods | ||
|
|
||
| min_index = bisect( | ||
|
Collaborator
I think the resampling behavior, i.e. whether the window is left- or right-open, and the labeling should be config parameters.
Contributor
Author
I think we should make left- or right-open configurable, with the corresponding label, such as:
Collaborator
I think the latter is also a reasonable option (see e.g. https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.resample.html), but I don't see a strong reason to implement this now if it's not needed. If it's well documented, users can also adjust the timestamps trivially. So your proposal sounds good to me.
Contributor
Yeah, whatever we do, we should probably be much more explicit about how output samples are calculated and structured. |
||
| match conf.closed: | ||
| case WindowSide.LEFT: | ||
| bisect_func = bisect_left | ||
| case WindowSide.RIGHT: | ||
| bisect_func = bisect | ||
| case unexpected: | ||
| assert_never(unexpected) | ||
|
|
||
| min_index = bisect_func( | ||
| self._buffer, | ||
| minimum_relevant_timestamp, | ||
| key=lambda s: s[0], | ||
| ) | ||
| max_index = bisect(self._buffer, timestamp, key=lambda s: s[0]) | ||
|
|
||
| max_index = bisect_func(self._buffer, timestamp, key=lambda s: s[0]) | ||
|
|
||
| # Using itertools for slicing doesn't look very efficient, but | ||
| # experiments with a custom (ring) buffer that can slice showed that | ||
| # it is not that bad. See: | ||
|
|
@@ -458,6 +469,15 @@ def resample(self, timestamp: datetime) -> Sample[Quantity]: | |
| if relevant_samples | ||
| else None | ||
| ) | ||
|
|
||
| match conf.label: | ||
| case WindowSide.LEFT: | ||
| timestamp -= conf.resampling_period | ||
| case WindowSide.RIGHT: | ||
| pass | ||
| case unexpected: | ||
| assert_never(unexpected) | ||
|
|
||
| return Sample(timestamp, None if value is None else Quantity(value)) | ||
|
|
||
| def _log_no_relevant_samples( | ||
|
|
@@ -538,13 +558,14 @@ async def _receive_samples(self) -> None: | |
| # We need the noqa because pydoclint can't figure out that `recv_exception` is an | ||
| # `Exception` instance. | ||
| async def resample(self, timestamp: datetime) -> None: # noqa: DOC503 | ||
| """Calculate a new sample for the passed `timestamp` and send it. | ||
| """Calculate a new sample using buffered samples up to the given `timestamp` and send it. | ||
|
|
||
| The helper is used to calculate the new sample and the sender is used | ||
| to send it. | ||
|
|
||
| Args: | ||
| timestamp: The timestamp to be used to calculate the new sample. | ||
| timestamp: The timestamp up to which all buffered samples are | ||
| considered for calculating the new sample. | ||
|
|
||
| Raises: | ||
| SourceStoppedError: If the source stopped sending samples. | ||
|
|
||