
Temporal Recurrence Matrix Representation for Anomaly Detection in Industrial Network Traffic

1. Abstract/Overview

Detecting anomalies in industrial communication networks is particularly challenging when protocol-specific information is unavailable or when monitoring must operate on resource-constrained devices. This study introduces a lightweight, protocol-agnostic method for representing industrial network traffic using observable packet characteristics. Packet length and idle time between consecutive transmissions are utilized to construct a Temporal Recurrence Matrix (TRM) over sliding windows, which captures recurrence patterns in communication behavior. Statistical features are extracted from transitions between consecutive matrices and employed for anomaly detection. The proposed approach was evaluated on a PROFIBUS dataset with controlled anomaly injection and multiple machine learning models. Results indicate that TRM-derived features support effective anomaly detection across the evaluated representations, with the multilayer perceptron achieving the highest F1-scores. Furthermore, the entire pipeline was implemented on a microcontroller platform, achieving millisecond-level processing times and demonstrating feasibility for real-time embedded monitoring in industrial networks.

2. Repository Structure

The repository contains the minimum material required to reproduce the experiments reported in the article, focusing on the MLP and TEDA pipelines.

METROIND2026-Temporal-Recurrence-Matrix
├── data/                     # Raw datasets and benchmark logs
│   ├── raw/                  # Raw CSV traces used in the notebooks
│   └── benchmark/            # Arduino logs and label files used in comparisons
│
├── figures/                  # Images used in the README and documentation
│
├── notebooks/                # Jupyter notebooks reproducing the experiments
│   ├── mlp_execution.ipynb
│   ├── teda_execution.ipynb
│   └── train_all_models.ipynb
│
├── src/                      # Python implementation of TEDA
│
├── scripts/                  # Experiment automation and serial streaming scripts
│
├── arduino/
│   └── nano_ble_full_pipeline
│       ├── nano_ble_full_pipeline.ino
│       ├── mlp_models_len_idle_notebook.h
│       ├── teda.h
│       └── len_idle_parity_constants.h
│
├── requirements.txt          # Python dependencies
└── README.md                 # Project documentation

3. Environment Setup

To reproduce the experiments locally, first install Python 3.11.11 and create a virtual environment with your preferred package manager. The example below uses Anaconda.

conda create --name trm python==3.11.11
conda activate trm

Clone this repository:

git clone https://github.com/conect2ai/METROIND2026-Temporal-Recurrence-Matrix.git
cd METROIND2026-Temporal-Recurrence-Matrix

Next, install the project dependencies:

pip install -r requirements.txt

To execute the notebooks interactively, launch Visual Studio Code or your preferred code editor:

code .

All notebooks should be executed from the notebooks directory so that relative paths to the datasets are resolved correctly.


4. Datasets

The raw datasets used in the experiments are located in:

data/raw/

The main file used in the experiments is:

  • data_1.csv – This dataset contains 208,125 lines and 6 columns.

Benchmark data used for the comparison between offline execution and embedded execution is located in:

data/benchmark/

This directory contains Arduino logs and label files used during the evaluation.


5. Reproducing the Experiments

The experiments reported in the paper can be reproduced using the Jupyter notebooks available in the notebooks directory.

Train and evaluate all models

Open the notebook:

notebooks/train_all_models.ipynb

This notebook reproduces the complete offline pipeline, including:

  • raw data loading
  • TRM feature construction
  • training and evaluation of all models
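The TRM feature construction step can be pictured with a minimal sketch. It assumes the TRM is a normalized two-dimensional histogram over quantized packet length and idle time, computed per sliding window; the exact binning used in the notebooks is not described in this README, so the bin edges, window size, step, and function names below are hypothetical.

```python
from bisect import bisect_right

def bin_index(value, edges):
    """Return the bin index of `value` given ascending inner bin edges."""
    return bisect_right(edges, value)

def build_trm(lengths, idles, len_edges, idle_edges):
    """Build a normalized 2D histogram (rows: length bins, columns: idle bins)."""
    n_rows, n_cols = len(len_edges) + 1, len(idle_edges) + 1
    counts = [[0] * n_cols for _ in range(n_rows)]
    for length, idle in zip(lengths, idles):
        counts[bin_index(length, len_edges)][bin_index(idle, idle_edges)] += 1
    total = sum(map(sum, counts))
    if total == 0:
        return [[0.0] * n_cols for _ in range(n_rows)]
    return [[c / total for c in row] for row in counts]

def sliding_trms(lengths, idles, len_edges, idle_edges, window, step):
    """Yield one TRM per sliding window over the packet sequence."""
    for start in range(0, len(lengths) - window + 1, step):
        yield build_trm(lengths[start:start + window],
                        idles[start:start + window],
                        len_edges, idle_edges)

# Hypothetical toy trace: packet lengths and idle times in arbitrary units.
lengths = [64, 64, 120, 64, 64, 120, 64, 64]
idles = [120, 118, 300, 121, 119, 310, 120, 900]

# Assumed bin edges: one length split and two idle-time splits (2 x 3 matrix).
trms = list(sliding_trms(lengths, idles, len_edges=[100],
                         idle_edges=[200, 500], window=4, step=2))
```

Each matrix sums to 1, so consecutive windows can be compared as probability distributions, which is what divergence-based features require.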

Features used by the models

| Feature | Meaning |
| --- | --- |
| js | Jensen-Shannon divergence to baseline TRM |
| fro | Frobenius distance to baseline TRM |
| rare | mass in rare / high-delay bins of the TRM |
| d_js | temporal derivative of js |
| d_fro | temporal derivative of fro |
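The five features above can be sketched as follows. This is an illustration under stated assumptions, not the notebook code: the divergence uses base-2 logarithms, the temporal derivative is taken as the first difference between consecutive windows, and the indices of the "rare" bins are hypothetical.

```python
import math

def flatten(matrix):
    """Flatten a 2D list row by row."""
    return [v for row in matrix for v in row]

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def frobenius_distance(a, b):
    """Frobenius norm of the element-wise difference of two flattened matrices."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def trm_features(trms, baseline, rare_bins):
    """Compute js, fro, rare, d_js, d_fro for each TRM in a sequence."""
    base = flatten(baseline)
    features, prev_js, prev_fro = [], None, None
    for trm in trms:
        cur = flatten(trm)
        js = js_divergence(cur, base)
        fro = frobenius_distance(cur, base)
        features.append({
            "js": js,
            "fro": fro,
            "rare": sum(cur[i] for i in rare_bins),
            # Assumption: derivative = difference to the previous window.
            "d_js": 0.0 if prev_js is None else js - prev_js,
            "d_fro": 0.0 if prev_fro is None else fro - prev_fro,
        })
        prev_js, prev_fro = js, fro
    return features

# Hypothetical 2 x 3 TRMs; flattened indices 2 and 5 are the high-delay column.
baseline = [[0.75, 0.0, 0.0], [0.0, 0.25, 0.0]]
shifted = [[0.5, 0.0, 0.25], [0.0, 0.25, 0.0]]
feats = trm_features([baseline, shifted], baseline, rare_bins=[2, 5])
```

A window identical to the baseline yields zero for every feature, while the shifted window yields positive js, fro, and rare values.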

Models and hyperparameters

| Model | Type | Main architecture / hyperparameters |
| --- | --- | --- |
| TEDA | unsupervised, online/incremental | threshold search: np.arange(1.8, 3.4 + 0.2, 0.2) |
| Isolation Forest | unsupervised | IsolationForest(n_estimators=400, contamination="auto", random_state=42) |
| One-Class SVM | unsupervised | OneClassSVM(kernel="rbf", nu=0.02, gamma="scale") |
| Random Forest | supervised | RandomForestClassifier(n_estimators=500, class_weight="balanced_subsample", random_state=42, n_jobs=-1) |
| XGBoost | supervised | XGBClassifier(n_estimators=400, max_depth=6, learning_rate=0.05, subsample=0.9, colsample_bytree=0.8, objective="binary:logistic", eval_metric="logloss", scale_pos_weight=n_neg/n_pos, random_state=42) |
| XGBoost fallback | supervised | HistGradientBoostingClassifier(learning_rate=0.05, max_depth=8, random_state=42) |
| ANN (MLP) | supervised | MLPClassifier(hidden_layer_sizes=(64, 32), activation="relu", alpha=1e-4, learning_rate_init=1e-3, max_iter=300, early_stopping=True, random_state=42) |
| Autoencoder (PyTorch) | unsupervised | encoder: Linear(5,64) → ReLU → Linear(64,16); decoder: Linear(16,64) → ReLU → Linear(64,5); optimizer: Adam(lr=1e-3); loss: MSELoss(); epochs: 25 |
| KAN (pykan) | supervised | KAN(width=[5, 8, 2], grid=3, k=3, seed=42); optimizer: LBFGS; steps: 25; loss: CrossEntropyLoss(); max_samples: 5000 |
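To give a sense of how small the MLP configuration above is, the sketch below counts its weights and biases. It assumes five input features (as in the feature list) and a single output unit, which is what scikit-learn's MLPClassifier uses for binary problems; the float32 storage size is an assumption about how the weights might be exported to firmware, not a statement about the repository's header files.

```python
def mlp_parameter_count(n_inputs, hidden_layer_sizes, n_outputs=1):
    """Count weights plus biases of a fully connected network."""
    sizes = [n_inputs, *hidden_layer_sizes, n_outputs]
    return sum(fan_in * fan_out + fan_out
               for fan_in, fan_out in zip(sizes, sizes[1:]))

# hidden_layer_sizes=(64, 32) comes from the table; 5 inputs is an assumption.
n_params = mlp_parameter_count(5, (64, 32))

# Assumed export format: one 4-byte float per parameter.
approx_bytes = n_params * 4
```

Under these assumptions the network has 2,497 parameters, or roughly 10 kB as 32-bit floats.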

Results of the offline evaluation

TEDA Pipeline

Open the notebook:

notebooks/teda_execution_en.ipynb

This notebook reproduces the anomaly detection workflow using the TEDA algorithm, including scoring and decision rules.
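The scoring and decision rule can be illustrated with a minimal sketch of a recursive eccentricity test in the style of TEDA. It follows the commonly published recursive mean and variance updates with an m-sigma style threshold; it is not taken from src/, and the repository's implementation may differ. The class name Teda, the default threshold, and the toy stream are hypothetical. Under this assumption, the values searched with np.arange(1.8, 3.4 + 0.2, 0.2) would play the role of m.

```python
class Teda:
    """Recursive eccentricity test for a stream of feature vectors (sketch)."""

    def __init__(self, threshold=3.0):
        self.m = threshold   # assumed meaning: m in the m-sigma style rule
        self.k = 0           # number of samples seen so far
        self.mean = None     # running mean vector
        self.var = 0.0       # running scalar variance

    def update(self, x):
        """Consume one sample; return True if it is flagged as an outlier."""
        self.k += 1
        k = self.k
        if k == 1:
            self.mean = list(x)
            return False
        # Recursive mean update, then squared distance to the updated mean.
        self.mean = [((k - 1) / k) * mu + xi / k
                     for mu, xi in zip(self.mean, x)]
        dist2 = sum((xi - mu) ** 2 for xi, mu in zip(x, self.mean))
        self.var = ((k - 1) / k) * self.var + dist2 / (k - 1)
        if self.var == 0.0:
            return False
        eccentricity = 1.0 / k + dist2 / (k * self.var)
        # Normalized eccentricity compared against (m^2 + 1) / (2k).
        return eccentricity / 2.0 > (self.m ** 2 + 1.0) / (2.0 * k)

# Hypothetical one-dimensional stream: 30 regular values, then a spike.
stream = [0.10 if i % 2 == 0 else 0.12 for i in range(30)] + [0.90]
detector = Teda(threshold=3.0)
flags = [detector.update([value]) for value in stream]
```

Only the state (k, mean, var) is kept between samples, which is why this kind of detector suits online, memory-constrained operation.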

MLP Pipeline

Open the notebook:

notebooks/mlp_execution.ipynb

This notebook reproduces the complete offline pipeline, including:

  • raw data loading
  • TRM feature construction
  • model training and evaluation
  • comparison with Arduino benchmark logs

6. Embedded Deployment

Once the best models have been identified during the offline experiments, they can be deployed on a microcontroller in order to evaluate real-time performance in an embedded environment.

The embedded implementation of the anomaly detection pipeline was tested using the following hardware:

  • Arduino Nano 33 BLE Sense

First, download and install the Arduino IDE for your operating system, then follow the steps below.

  1. Connect the microcontroller to your computer.
  2. Open the firmware file in the Arduino IDE:
arduino/nano_ble_full_pipeline/nano_ble_full_pipeline.ino
  3. Select the board Arduino Nano 33 BLE in the Arduino IDE.

  4. Identify the serial port where the board is connected (this information will be required later when running the Python scripts).

In the example shown in the figure above, the device is connected to the port:

/dev/cu.usbmodem1401
  5. Compile and upload the firmware to the microcontroller.

After uploading the firmware, the Serial Monitor can be used to observe telemetry messages generated by the pipeline. The serial monitor must be configured with the following parameters:

  • Baud rate: 115200

It is important to note that only one process can access the serial port at a time. Therefore, if the Python scripts are being used to capture telemetry data from the board, the Arduino Serial Monitor must be closed.


7. Running the Python Scripts

Once the firmware has been uploaded to the board, you can simulate the data stream by running the Python scripts.

⚠️ Close the Serial Monitor in the Arduino IDE while the scripts are running.

Replace the port /dev/cu.usbmodem1401 with the value that appears on your computer.

Example: stream a dataset to the board and capture telemetry.

python scripts/stream_raw_and_capture.py \
  --port /dev/cu.usbmodem1401 \
  --raw-csv data/raw/data_1.csv \
  --out outputs/log_nano_full_pipeline.csv \
  --labels-out outputs/stream_labels.csv \
  --rep composite \
  --model teda \
  --period-ms 1

Available options:

  • --rep: len, idle, composite
  • --model: mlp, teda

Example output:

timestamp,len,idle,score,label,infer_us,total_us
0,64,120,0.11,0,350,910
1,64,118,0.09,0,341,900
2,120,300,0.78,1,350,915
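A captured log in this format can be summarized with a few lines of standard-library Python. The sketch below assumes the column names shown in the example output and that infer_us and total_us are timings in microseconds; the function name summarize_telemetry is hypothetical and real logs may contain additional columns.

```python
import csv
import io
from statistics import mean

def summarize_telemetry(csv_text):
    """Summarize a telemetry log given as CSV text with a header row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return {
        "n_samples": len(rows),
        # Number of rows whose label column equals 1.
        "n_label_one": sum(int(r["label"]) for r in rows),
        "mean_infer_us": mean(float(r["infer_us"]) for r in rows),
        "mean_total_us": mean(float(r["total_us"]) for r in rows),
    }

# The three rows from the example output above.
sample = """timestamp,len,idle,score,label,infer_us,total_us
0,64,120,0.11,0,350,910
1,64,118,0.09,0,341,900
2,120,300,0.78,1,350,915
"""
summary = summarize_telemetry(sample)
```

For a log written by the streaming script, the file contents can be passed in the same way, for example with pathlib.Path("outputs/log_nano_full_pipeline.csv").read_text().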

📄 License

This project is licensed under the MIT License; see the LICENSE file for details.

About us

The Conect2AI research group is composed of undergraduate and graduate students from the Federal University of Rio Grande do Norte (UFRN). The group focuses on applying Artificial Intelligence and Machine Learning to emerging areas such as Embedded Intelligence, Internet of Things, and Intelligent Transportation Systems, contributing to energy efficiency and sustainable mobility solutions.

Website: http://conect2ai.dca.ufrn.br

About

This repository contains the implementation of the proposed approach and the experiments performed for the IEEE Metroind 2026 article on the Temporal Recurrence Matrix.
