Skip to content

Enhance ResamplerConfig with closed and label options and add comprehensive tests#1344

Open
malteschaaf wants to merge 7 commits intofrequenz-floss:v1.x.xfrom
malteschaaf:resampler
Open

Enhance ResamplerConfig with closed and label options and add comprehensive tests#1344
malteschaaf wants to merge 7 commits intofrequenz-floss:v1.x.xfrom
malteschaaf:resampler

Conversation

@malteschaaf
Copy link
Contributor

@malteschaaf malteschaaf commented Jan 14, 2026

This PR introduces the WindowSide enum and two new configuration options—closed and label—to the ResamplerConfig class, providing a clearer and more flexible way to define the behavior of the resampling process. Comprehensive tests have been added to ensure correctness.

Changes:

  1. Introduce WindowSide enum

    • Added the WindowSide enum with values LEFT and RIGHT:
      • LEFT: Represents the start of a resampling window.
      • RIGHT: Represents the end of a resampling window.
    • Exposed WindowSide in the __init__.py of the timeseries module for public use.
  2. Add closed option to ResamplerConfig

    • Introduced the closed parameter with options WindowSide.RIGHT (default) and WindowSide.LEFT.
    • Updated resampling logic to respect the closed configuration:
      • RIGHT: Includes samples at the end of the window, excludes those at the start.
      • LEFT: Includes samples at the start of the window, excludes those at the end.
    • Updated documentation to explain the behavior of the closed parameter.
  3. Add label option to ResamplerConfig

    • Introduced the label parameter with options WindowSide.RIGHT (default) and WindowSide.LEFT.
    • Updated resampling logic to respect the label configuration:
      • RIGHT: Timestamp of the resampled data corresponds to the end of the resampling window.
      • LEFT: Timestamp of the resampled data corresponds to the start of the resampling window.
    • Adjusted logic for setting sample_time based on the label configuration.
    • Updated documentation accordingly.
  4. Add tests for closed and label

    • Verified closed behavior with both WindowSide.RIGHT and WindowSide.LEFT, including additional sample points.
    • Verified label behavior to ensure timestamps of resampled data correspond correctly to the start or end of the window.
    • Added assertions to confirm correct resampling outputs and updates to source properties and buffer lengths.

)
minimum_relevant_timestamp = timestamp - period * conf.max_data_age_in_periods

min_index = bisect(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the behavior how to resample, i.e. left or right open and the labeling should be config parameters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should make left or right opened configurable with the corresponding label, such as:
right open: [t,t+1) -> labeled as t
left open: (t-1,t] -> labeled as t (the current behavior)
Or do you want to allow the user to additionally do something like:
right open, label in the end: [t,t+1) -> labeled as t+1
left open, label in the beginning:(t-1,t] -> labeled as t-1
Because I think the later could lead to a lot of confusion (not sure if its ever needed)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the latter is also reasonable options (see e.g. https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.resample.html), but don't see a strong reason to implement this now if not needed. If it's well-documented, the users can also adjust the timestamps trivially. So your proposal sounds good to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, whatever we do, we should probably be much more explicit of how output samples are calculated and structured.

Copy link
Contributor

@llucax llucax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general LGTM, I think this would need quite some testing, and we probably want to introduce this using a config option to change the behavior, specially if configurability is something we want to allow in the future (if we don't see having this level of configurability useful, we can hide it in a feature flag using a env var like we did with the new wall clock timer). As we experimented recently with the changes in the component graph, subtle changes in behavior can go easily unnoticed and cause issues.

)
minimum_relevant_timestamp = timestamp - period * conf.max_data_age_in_periods

min_index = bisect(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, whatever we do, we should probably be much more explicit of how output samples are calculated and structured.

@github-actions github-actions bot added the part:tests Affects the unit, integration and performance (benchmarks) tests label Feb 3, 2026
@malteschaaf malteschaaf changed the title WIP: Adjust Resampling Logic and Defaults Enhance ResamplerConfig with closed and label options and add comprehensive tests Feb 3, 2026
@malteschaaf malteschaaf changed the title Enhance ResamplerConfig with closed and label options and add comprehensive tests Enhance ResamplerConfig with closed and label options and add comprehensive tests Feb 3, 2026
@malteschaaf
Copy link
Contributor Author

Ready for Review 🧐

Changes implemented:

  • Removed modifications to the max_data_age_in_periods default value
  • Made the openness of the resampling window configurable via the closed parameter:
    • "right": Includes the end of the window, excludes the start. (Default: same behavior as before)
    • "left": Includes the start of the window, excludes the end.
  • Added configurability for timestamp labeling with the label parameter:
    • "start": Resampled timestamp represents the start of the resampling window.
    • "end": Resampled timestamp represents the end of the resampling window. (Default: same behavior as before)
  • Added comprehensive tests to verify the behavior of the closed and label parameters.

@github-actions github-actions bot added the part:docs Affects the documentation label Feb 3, 2026
@malteschaaf malteschaaf marked this pull request as ready for review February 3, 2026 16:14
@malteschaaf malteschaaf requested a review from a team as a code owner February 3, 2026 16:14
@malteschaaf malteschaaf force-pushed the resampler branch 2 times, most recently from b39f3de to 1231852 Compare February 12, 2026 12:29
@malteschaaf
Copy link
Contributor Author

Ready for Review 🧐

Changes implemented:

  • Changed label from Literal["start","end"] to Literal["left","right"]

@malteschaaf malteschaaf requested a review from cwasicki February 12, 2026 12:31
Copy link
Collaborator

@cwasicki cwasicki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@malteschaaf
Copy link
Contributor Author

Ready for Review again

Changes:

  • Intorduced WindowSide enum to represent the sides of a resampling window and used that for the configuration of the closed and label parameters.
  • Updated the documentation for the closed and label config parameters, removing mapping left and right to "start" and "end"

Copy link
Contributor

@daniel-zullo-frequenz daniel-zullo-frequenz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I'll leave to Luca to review it and approve it

@llucax
Copy link
Contributor

llucax commented Mar 3, 2026

Sorry for the delay, will have a look now.

Copy link
Contributor

@llucax llucax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, a few very nitpick comments, and one that is moderately important to me (using match instead of if). I know it might look irrelevant, but it looks to me like one of these "64k is enough for everyone" situation, where it seems impossible that anyone could add a new enum member... until someone does.

- Introduced the `WindowSide` enum with values `LEFT` and `RIGHT`:
  - `LEFT`: Represents the start of the resampling window.
  - `RIGHT`: Represents the end of the resampling window.
- Exposed the `WindowSide` enum in the `__init__.py` of the timeseries module for public use.

Signed-off-by: Malte Schaaf <malte.schaaf@frequenz.com>
…havior

- Introduced the `closed` parameter in ResamplerConfig with options `WindowSide.RIGHT` (default) and `WindowSide.LEFT`.
- Updated the resampling logic to respect the `closed` configuration:
  - `RIGHT`: Includes samples at the end of the window, excludes those at the start.
  - `LEFT`: Includes samples at the start of the window, excludes those at the end.
- Adjusted documentation to reflect the new `closed` parameter and its behavior.

Signed-off-by: Malte Schaaf <malte.schaaf@frequenz.com>
… resampled data

- Introduced the `label` parameter in ResamplerConfig with options `WindowSide.RIGHT` (default) and `WindowSide.LEFT`.
- Updated the resampling logic to respect the `label` configuration:
  - `RIGHT`: The timestamp of the resampled data corresponds to the end of the resampling window.
  - `LEFT`: The timestamp of the resampled data corresponds to the start of the resampling window.
- Adjusted the logic for setting `sample_time` to use the `label` configuration.
- Updated documentation to reflect the new `label` parameter and its behavior.

Signed-off-by: Malte Schaaf <malte.schaaf@frequenz.com>
- Enhanced the `test_resampler_closed_option` to include additional samples at 2.5, 3, and 4 seconds.
- Verified the behavior of the `closed` option (`WindowSide.RIGHT` and `WindowSide.LEFT`) with the extended timeline.
- Added assertions to ensure correct resampling function calls and sink outputs for the new samples.
- Confirmed that source properties and buffer length are updated correctly after processing the additional samples.

Signed-off-by: Malte Schaaf <malte.schaaf@frequenz.com>
…havior

- Introduced `test_resampler_label_option` to validate the `label` configuration in ResamplerConfig.
- Tested both `WindowSide.LEFT` and `WindowSide.RIGHT` options to ensure the resampled datas timestamp corresponds to the start or end of the resampling window, respectively.
- Verified sink outputs with the expected timestamp and resampled value.

Signed-off-by: Malte Schaaf <malte.schaaf@frequenz.com>
Improve the docstring of the `_StreamingHelper`s `resample` method to better describe how the timestamp parameter is used in resampling calculations. The timestamp now clearly represents the upper bound for buffered samples considered in the resampling operation.

Signed-off-by: Malte Schaaf <malte.schaaf@frequenz.com>
Signed-off-by: Malte Schaaf <malte.schaaf@frequenz.com>
@malteschaaf
Copy link
Contributor Author

READY FOR REVIEW

  • Rebased
  • Changed Release Notes according to comment
  • match enum instead of if
  • Updated the docstring of the resample method of the StreamingHelper class to better describe how the timestamp parameter is used in resampling calculations

@malteschaaf malteschaaf requested a review from llucax March 4, 2026 08:35
Copy link
Contributor

@llucax llucax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙏

@github-project-automation github-project-automation bot moved this from To do to Review approved in Python SDK Roadmap Mar 4, 2026
@malteschaaf
Copy link
Contributor Author

Unfortunately I have no rights to merge myself @llucax

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

part:data-pipeline Affects the data pipeline part:docs Affects the documentation part:tests Affects the unit, integration and performance (benchmarks) tests

Projects

Status: Review approved

Development

Successfully merging this pull request may close these issues.

4 participants