Skip to content

Feat/config engine#21

Closed
viktorbeck98 wants to merge 30 commits intodevelopmentfrom
feat/config_engine
Closed

Feat/config engine#21
viktorbeck98 wants to merge 30 commits intodevelopmentfrom
feat/config_engine

Conversation

@viktorbeck98
Copy link
Copy Markdown
Collaborator

First version for the persistency and the configuration engine. The core of the configuration engine is integrated in the persistency. So far I implemented:

  • A generic persistency class for event-based purposes that incorporates interchangable data structures that are defined at definition of the persistency object.
  • Multiple data structures that can be injected into the persistency class based on the use case, such as:
    • EventDataFrame: best for developers to get an event-type-separated view of the data (based on pandas)
    • ChunkedEventDataFrame: similar as above but is fit for online processing (based on polars). Technically production-ready but since it stores all incoming data it is still RAM heavy. Howerver, in the future, for a storage component we could connect this to a database handling the storage.
    • EventVariableTracker: is also a data structure just as the two above but only stores series of persitency changes in the most efficient way (run-length encoding - RLE) and the set of unique values per variable. Persistency changes can thereby be seen as a feature extracted from the data. Detectors like the NewValueDetector, ComboDetector, EventSequenceDetector and possibly more can use this as the underlying structure.
  • A RLU set implementation for handling outdated values (not yet integrated into persitency class)

Some todos:

  • Integrate persitency class into the existing detectors (I am on it already)
  • implement the functionality to remove outdated values from persistency

@viktorbeck98 viktorbeck98 requested a review from ipmach January 7, 2026 14:58
@viktorbeck98 viktorbeck98 self-assigned this Jan 7, 2026
@viktorbeck98 viktorbeck98 added enhancement New feature or request New feature labels Jan 7, 2026
@ipmach
Copy link
Copy Markdown
Contributor

ipmach commented Jan 22, 2026

looking good so far!

But we need to discuss about the notebooks, because I dont know if we want them in the main branch. They present multiple issues:

  • Code traceability is pretty bad. As the files do not have human readable code. (it is written in Jupyter language).
  • They are not tested in the main pipeline, so they have a higher risk of not working in the next versions.

@viktorbeck98
Copy link
Copy Markdown
Collaborator Author

@ipmach regarding the notebooks, that would probably be the cleanest solution, but I personally think they are the easiest way to show how smth works. I will remove them for now. Maybe we can think of it at a later point.

viktorbeck98 and others added 4 commits January 26, 2026 14:40
Reorganize trackers.py into a proper package with separate modules for
event tracking, multi-tracking, stability classification and conversion.
Also removes unused demo notebooks and fixes method signatures for
consistency.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

def test_initialization(self):
"""Test EventVariableTrackerData initialization."""
evt = EventTracker(tracker_type=SingleStabilityTracker)

def test_add_data(self):
"""Test adding data."""
evt = EventTracker(tracker_type=SingleStabilityTracker)

def test_get_variables(self):
"""Test retrieving variable names."""
evt = EventTracker(tracker_type=SingleStabilityTracker)

def test_get_stable_variables(self):
"""Test retrieving stable variables."""
evt = EventTracker(tracker_type=SingleStabilityTracker)

def test_integration_with_stability_tracker(self):
"""Test full integration with SingleVariableTracker."""
evt = EventTracker(tracker_type=SingleStabilityTracker)

def test_to_data_with_variable_level(self):
"""Test to_data method with variable level (default)."""
evt = EventTracker(tracker_type=SingleStabilityTracker, feature_type="variable")

def test_to_data_with_variable_combo_level(self):
"""Test to_data method with variable_combo level."""
evt = EventTracker(tracker_type=SingleStabilityTracker, feature_type="variable_combo")

def test_to_data_integration_with_add_data(self):
"""Test that to_data and add_data work together correctly."""
evt = EventTracker(tracker_type=SingleStabilityTracker, feature_type="variable")

def test_conversion_function_assignment(self):
"""Test that converter is correctly assigned based on feature_type."""
evt_var = EventTracker(tracker_type=SingleStabilityTracker, feature_type="variable")
def test_conversion_function_assignment(self):
"""Test that converter is correctly assigned based on feature_type."""
evt_var = EventTracker(tracker_type=SingleStabilityTracker, feature_type="variable")
evt_combo = EventTracker(tracker_type=SingleStabilityTracker, feature_type="variable_combo")
@viktorbeck98 viktorbeck98 changed the base branch from main to development February 2, 2026 13:53
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be add it in the docs branch once that one is merge. I suggest remove it for now and add it later.

Copy link
Copy Markdown
Contributor

@ipmach ipmach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR is too big, it has many changes and many are not related with the topic. In the future please focus in smaller PR because it breaks the Agile principle and it makes code harder to maintain because it its harder to review and follow.

I added different comments in the files, the main highlights are:

  • The folder common is not a good place for persistency. It should be part of utils because it is an ad on, not a common component.
  • I did not review persistency files as is a big review and not a critical component. Is the first version so probably will be future changes so I am not worry. I am ok leaving it like this for now.
  • Critical the datasets that are not from AIT should not be part of AIT software. They should be remove from the repository. It is ok adding a script to download them, but should not be part of the repository.
  • Relevant classes like CoreDetector, ComboDetector, NewValueDetector need minor improvements to increase readability.
  • Parsers modifications are weird that were added. Some functions inside the matcher are terrible for performance and redability. I suggest removing parsers changes from the PR as well as the changes in the reader.

I did not yet test it locally but I think there are many changes that need to be done before approval.

It is clear that you put a lot of work in this and I do not want to discourage you. The configuration engine will be definitely a great feature in DetectMate. But it needs a little of fine-tunning.

Best,
Andre

Comment thread src/detectmatelibrary/common/detector.py
Comment thread src/detectmatelibrary/detectors/new_value_combo_detector.py
Comment thread src/detectmatelibrary/detectors/new_value_detector.py
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct_single_template is a terrible function in readability and efficiency. The overiteration of templates is really inefficient.

Also it does nothing to do with the PR topic

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function was taken from LogBatcher which improves template quality significantly which is also proven in literatue. This was solely for the demo and will come in a separate PR. I had to change TemplateMatcher since it didn't work anymore after your changes ...

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not relevant for the PR

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this class?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an abstraction that makes sure that the repr functions of classes have a unified representation

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not own this dataset. It should be remove from the repository.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not own this dataset. It should be remove from the repository.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not own this dataset. It should be remove from the repository.

Comment thread pyproject.toml
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we use polaris? Because if not is a strict dependency because it does not have big support in many hardwares-

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed previously, no we dont need it and it should be removed as a strict dependency. We can add it as an optional dependency in pyproject.toml + lazy import in the class

@viktorbeck98
Copy link
Copy Markdown
Collaborator Author

@ipmach see this PR more as a draft. It took simply too long with all the changes that were necessary and the demos which made it messy but I just went with it. I wanted to have inital feedback from you. I'll closee the PR and create new smaller ones, improved with your feedback

@ipmach
Copy link
Copy Markdown
Contributor

ipmach commented Feb 11, 2026

@viktorbeck98 Yes, I get it. It is hard to separate things when we have to develop the tool and do the demos at the same time. Let me know if you any questions from my feedback! :)

@ipmach ipmach deleted the feat/config_engine branch March 11, 2026 08:39
@ipmach ipmach restored the feat/config_engine branch March 11, 2026 08:39
@ipmach ipmach deleted the feat/config_engine branch March 11, 2026 08:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

2 participants