Temporal Recurrence Matrix Representation for Anomaly Detection in Industrial Network Traffic

✍🏾Authors: Morsinaldo Medeiros, Dennins Brandão, Marianne Silva, Paolo Ferrari, and Ivanovitch Silva

1. Abstract/Overview

Detecting anomalies in industrial communication networks is particularly challenging when protocol-specific information is unavailable or when monitoring must operate on resource-constrained devices. This study introduces a lightweight, protocol-agnostic method for representing industrial network traffic using observable packet characteristics. Packet length and idle time between consecutive transmissions are utilized to construct a Temporal Recurrence Matrix (TRM) over sliding windows, which captures recurrence patterns in communication behavior. Statistical features are extracted from transitions between consecutive matrices and employed for anomaly detection. The proposed approach was evaluated on a PROFIBUS dataset with controlled anomaly injection and multiple machine learning models. Results indicate that TRM-derived features support effective anomaly detection across the evaluated representations, with the multilayer perceptron achieving the highest F1-scores. Furthermore, the entire pipeline was implemented on a microcontroller platform, achieving millisecond-level processing times and demonstrating feasibility for real-time embedded monitoring in industrial networks.
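The windowed construction described above can be illustrated with a minimal sketch. This README does not specify how the TRM is binned or normalized, so the sketch assumes one plausible reading: a normalized joint histogram of (packet length bin, idle time bin) pairs per sliding window. The function names, bin edges, window size, and step are all hypothetical and are not taken from the repository code.

```python
from collections import deque


def bin_index(value, edges):
    """Return the index of the first bin whose upper edge exceeds value."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)  # overflow bin for values beyond the last edge


def build_trm(window, len_edges, idle_edges):
    """Count (length bin, idle bin) pairs in one window and normalize to sum 1.

    Assumption: the matrix is a joint histogram; the paper's exact
    definition may differ.
    """
    if not window:
        raise ValueError("window must contain at least one packet")
    rows, cols = len(len_edges) + 1, len(idle_edges) + 1
    counts = [[0] * cols for _ in range(rows)]
    for length, idle in window:
        counts[bin_index(length, len_edges)][bin_index(idle, idle_edges)] += 1
    total = len(window)
    return [[c / total for c in row] for row in counts]


def sliding_trms(packets, window_size, step, len_edges, idle_edges):
    """Yield one matrix each time the window has advanced by `step` packets."""
    buf = deque(maxlen=window_size)
    for i, pkt in enumerate(packets, start=1):
        buf.append(pkt)
        if len(buf) == window_size and (i - window_size) % step == 0:
            yield build_trm(list(buf), len_edges, idle_edges)


# Toy (length, idle) pairs; the values are illustrative only.
packets = [(64, 120), (64, 118), (120, 300), (64, 121), (64, 119), (120, 310)]
trms = list(sliding_trms(packets, window_size=4, step=2,
                         len_edges=[100], idle_edges=[200]))
for trm in trms:
    print(trm)
```

Keeping only a fixed-size buffer of recent packets keeps memory bounded, which matters on a microcontroller; the real firmware may organize this differently.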

2. Repository Structure

The repository contains the minimum material required to reproduce the experiments reported in the article, focusing on the MLP and TEDA pipelines.

METROIND2026-Temporal-Recurrence-Matrix
├── data/                     # Raw datasets and benchmark logs
│   ├── raw/                  # Raw CSV traces used in the notebooks
│   └── benchmark/            # Arduino logs and label files used in comparisons
│
├── figures/                  # Images used in the README and documentation
│
├── notebooks/                # Jupyter notebooks reproducing the experiments
│   ├── mlp_execution.ipynb
│   ├── teda_execution.ipynb
│   └── train_all_models.ipynb
│
├── src/                      # Python implementation of TEDA
│
├── scripts/                  # Experiment automation and serial streaming scripts
│
├── arduino/
│   └── nano_ble_full_pipeline
│       ├── nano_ble_full_pipeline.ino
│       ├── mlp_models_len_idle_notebook.h
│       ├── teda.h
│       └── len_idle_parity_constants.h
│
├── requirements.txt          # Python dependencies
└── README.md                 # Project documentation

3. Environment Setup

To reproduce the experiments locally, first install Python 3.11.11 and create a virtual environment with your preferred package manager. The example below uses Anaconda.

conda create --name trm python==3.11.11
conda activate trm

Clone this repository:

git clone https://github.com/conect2ai/METROIND2026-Temporal-Recurrence-Matrix.git
cd METROIND2026-Temporal-Recurrence-Matrix

Next, install the project dependencies:

pip install -r requirements.txt

To execute the notebooks interactively, launch Visual Studio Code or your preferred code editor:

code .

All notebooks should be executed from the notebooks directory so that relative paths to the datasets are resolved correctly.

4. Datasets

The raw datasets used in the experiments are located in:

data/raw/

The main file used in the experiments is:

data_1.csv – This dataset contains 208,125 rows and 6 columns.

Benchmark data used for the comparison between offline execution and embedded execution is located in:

data/benchmark/

This directory contains Arduino logs and label files used during the evaluation.

5. Reproducing the Experiments

The experiments reported in the paper can be reproduced using the Jupyter notebooks available in the notebooks directory.

Test all models

Open the notebook:

notebooks/train_all_models.ipynb

This notebook reproduces the complete offline pipeline, including:

raw data loading
TRM feature construction
training and evaluation of all models

Features used by the models

Feature	Meaning
`js`	Jensen-Shannon divergence to baseline TRM
`fro`	Frobenius distance to baseline TRM
`rare`	mass in rare / high-delay bins of the TRM
`d_js`	temporal derivative of `js`
`d_fro`	temporal derivative of `fro`
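As an illustration of the features in the table, the sketch below computes them for two small matrices using only the standard library. It rests on several assumptions: the matrices are normalized to sum to 1, the Jensen-Shannon divergence uses base-2 logarithms, the set of "rare" cells is supplied by the caller, and the temporal derivative is a simple first difference between consecutive windows. The notebooks may define any of these differently, and all function names here are hypothetical.

```python
import math


def flatten(matrix):
    """Flatten a list-of-lists matrix into a single list."""
    return [v for row in matrix for v in row]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(a, b):
    """Jensen-Shannon divergence between two normalized matrices (0 to 1 in base 2)."""
    p, q = flatten(a), flatten(b)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def frobenius_distance(a, b):
    """Frobenius norm of the element-wise difference."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(flatten(a), flatten(b))))


def rare_mass(matrix, rare_cells):
    """Total mass in caller-chosen cells (assumed to be the rare / high-delay bins)."""
    return sum(matrix[r][c] for r, c in rare_cells)


def first_difference(series):
    """Assumed temporal derivative: value[t] - value[t-1], with 0.0 for the first window."""
    return [0.0] + [cur - prev for prev, cur in zip(series, series[1:])]


baseline = [[0.75, 0.0], [0.0, 0.25]]
current = [[0.5, 0.0], [0.0, 0.5]]

js = js_divergence(current, baseline)
fro = frobenius_distance(current, baseline)
rare = rare_mass(current, rare_cells=[(1, 1)])
print(js, fro, rare)
print(first_difference([0.0, js]))
```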

Models and hyperparameters

Model	Type	Main architecture / hyperparameters
TEDA	unsupervised, online/incremental	threshold search: `np.arange(1.8, 3.4 + 0.2, 0.2)`
Isolation Forest	unsupervised	`IsolationForest(n_estimators=400, contamination="auto", random_state=42)`
One-Class SVM	unsupervised	`OneClassSVM(kernel="rbf", nu=0.02, gamma="scale")`
Random Forest	supervised	`RandomForestClassifier(n_estimators=500, class_weight="balanced_subsample", random_state=42, n_jobs=-1)`
XGBoost	supervised	`XGBClassifier(n_estimators=400, max_depth=6, learning_rate=0.05, subsample=0.9, colsample_bytree=0.8, objective="binary:logistic", eval_metric="logloss", scale_pos_weight=n_neg/n_pos, random_state=42)`
XGBoost fallback	supervised	`HistGradientBoostingClassifier(learning_rate=0.05, max_depth=8, random_state=42)`
ANN (MLP)	supervised	`MLPClassifier(hidden_layer_sizes=(64, 32), activation="relu", alpha=1e-4, learning_rate_init=1e-3, max_iter=300, early_stopping=True, random_state=42)`
Autoencoder (PyTorch)	unsupervised	encoder: `Linear(5,64) → ReLU → Linear(64,16)`; decoder: `Linear(16,64) → ReLU → Linear(64,5)`; optimizer: `Adam(lr=1e-3)`; loss: `MSELoss()`; epochs: `25`
KAN (pykan)	supervised	`KAN(width=[5, 8, 2], grid=3, k=3, seed=42)`; optimizer: `LBFGS`; steps: `25`; loss: `CrossEntropyLoss()`; max_samples: `5000`

Results of the offline evaluation

TEDA Pipeline

Open the notebook:

notebooks/teda_execution_en.ipynb

This notebook reproduces the anomaly detection workflow using the TEDA algorithm, including scoring and decision rules.
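For readers unfamiliar with TEDA, the following sketch shows the commonly published recursive form of the algorithm: a running mean and variance, the normalized eccentricity of each new sample, and a Chebyshev-style decision rule controlled by a threshold m. It is not the code in src/ or teda.h; the class name is hypothetical, and details such as whether flagged samples still update the statistics are assumptions (here they always do).

```python
class TedaSketch:
    """Recursive TEDA outlier test (illustrative, not the repository implementation)."""

    def __init__(self, threshold):
        self.m = threshold   # sensitivity threshold, e.g. a value from the search range
        self.k = 0           # number of samples seen so far
        self.mean = None     # running mean vector
        self.var = 0.0       # running scalar variance

    def update(self, x):
        """Consume one feature vector and return True if it is flagged as an outlier."""
        self.k += 1
        k = self.k
        if k == 1:
            self.mean = list(x)
            return False  # a single sample cannot be judged
        # Recursive mean update.
        self.mean = [((k - 1) / k) * mu + xi / k for mu, xi in zip(self.mean, x)]
        # Squared distance of the sample to the updated mean.
        dist2 = sum((xi - mu) ** 2 for xi, mu in zip(x, self.mean))
        # Recursive variance update.
        self.var = ((k - 1) / k) * self.var + dist2 / (k - 1)
        if self.var == 0.0:
            return False  # all samples identical so far
        eccentricity = 1.0 / k + dist2 / (k * self.var)
        normalized = eccentricity / 2.0
        # Flag when the normalized eccentricity exceeds (m^2 + 1) / (2k).
        return normalized > (self.m ** 2 + 1.0) / (2.0 * k)


# Toy one-dimensional stream with a single large deviation at the end.
stream = [[1.0], [1.1], [0.9], [1.0], [1.05], [0.95], [1.0], [1.02], [0.98], [1.0], [10.0]]
detector = TedaSketch(threshold=3.0)
flags = [detector.update(x) for x in stream]
print(flags)
```

Because the state is only a counter, a mean vector, and one variance value, this form suits online use on memory-constrained devices. Note that with this rule a sample can only be flagged once k - 1 exceeds m squared, so larger thresholds need a longer warm-up.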

MLP Pipeline

Open the notebook:

notebooks/mlp_execution.ipynb

This notebook reproduces the complete offline pipeline, including:

raw data loading
TRM feature construction
model training and evaluation
comparison with Arduino benchmark logs
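To make the inference step concrete, here is a dependency-free sketch of a forward pass through a ReLU network with a single logistic output, which is the structure scikit-learn's MLPClassifier uses for binary classification. The weight layout follows scikit-learn's coefs_ and intercepts_ attributes (each weight matrix has shape inputs by outputs). The function names and the toy weights are hypothetical; the exported header in the arduino folder may store and evaluate the model differently.

```python
import math


def relu(values):
    """Element-wise ReLU."""
    return [max(0.0, v) for v in values]


def dense(x, weights, bias):
    """Affine layer; weights[i][j] connects input i to output j."""
    return [
        sum(xi * row[j] for xi, row in zip(x, weights)) + bias[j]
        for j in range(len(bias))
    ]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def mlp_predict_proba(x, coefs, intercepts):
    """Probability of the positive class for one feature vector."""
    h = list(x)
    for w, b in zip(coefs[:-1], intercepts[:-1]):
        h = relu(dense(h, w, b))          # hidden layers use ReLU
    z = dense(h, coefs[-1], intercepts[-1])[0]  # single output unit
    return sigmoid(z)


# Toy 2-2-1 network; a trained model would have five inputs and (64, 32) hidden units.
coefs = [[[1.0, -1.0], [0.5, 1.0]], [[2.0], [-1.0]]]
intercepts = [[0.0, 0.0], [0.0]]
p = mlp_predict_proba([1.0, 2.0], coefs, intercepts)
print(p, p >= 0.5)
```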

6. Embedded Deployment

Once the best models have been identified during the offline experiments, they can be deployed on a microcontroller in order to evaluate real-time performance in an embedded environment.

The embedded implementation of the anomaly detection pipeline was tested using the following hardware:

Arduino Nano 33 BLE Sense

First, download and install the Arduino IDE for your operating system.

Then follow the steps below.

Connect the microcontroller to your computer.
Open the firmware file in the Arduino IDE:

arduino/nano_ble_full_pipeline/nano_ble_full_pipeline.ino

Select the board Arduino Nano 33 BLE in the Arduino IDE.
Identify the serial port where the board is connected (this information will be required later when running the Python scripts).

In the example shown in the figure above, the device is connected to the port:

/dev/cu.usbmodem1401

Compile and upload the firmware to the microcontroller.

After uploading the firmware, the Serial Monitor can be used to observe telemetry messages generated by the pipeline. The serial monitor must be configured with the following parameters:

Baud rate: 115200

It is important to note that only one process can access the serial port at a time. Therefore, if the Python scripts are being used to capture telemetry data from the board, the Arduino Serial Monitor must be closed.

7. Running the Python Scripts

Once the firmware has been uploaded to the board, you can simulate the data stream by running the Python scripts.

⚠️ During this process, close the Serial Monitor in the Arduino IDE.

Replace the port /dev/cu.usbmodem1401 with the value that appears on your computer.

Example: stream a dataset to the board and capture telemetry.

python scripts/stream_raw_and_capture.py \
  --port /dev/cu.usbmodem1401 \
  --raw-csv data/raw/data_1.csv \
  --out outputs/log_nano_full_pipeline.csv \
  --labels-out outputs/stream_labels.csv \
  --rep composite \
  --model teda \
  --period-ms 1

Available options:

--rep: len, idle, composite
--model: mlp, teda

Example output:

timestamp,len,idle,score,label,infer_us,total_us
0,64,120,0.11,0,350,910
1,64,118,0.09,0,341,900
2,120,300,0.78,1,350,915
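A captured log in this format can be summarized with the standard library alone. The sketch below assumes the column names shown in the example output, that label is 1 for a flagged sample, and that infer_us and total_us are durations in microseconds; the function name is hypothetical and the actual log files may contain additional columns.

```python
import csv
import io
from statistics import mean


def summarize_telemetry(lines):
    """Summarize a telemetry CSV given as a file object or iterable of lines."""
    rows = list(csv.DictReader(lines))
    return {
        "samples": len(rows),
        # Assumption: label == 1 marks a sample flagged as anomalous.
        "positives": sum(int(r["label"]) for r in rows),
        "mean_infer_us": mean(float(r["infer_us"]) for r in rows),
        "mean_total_us": mean(float(r["total_us"]) for r in rows),
    }


# The three rows from the example output above.
sample = io.StringIO(
    "timestamp,len,idle,score,label,infer_us,total_us\n"
    "0,64,120,0.11,0,350,910\n"
    "1,64,118,0.09,0,341,900\n"
    "2,120,300,0.78,1,350,915\n"
)
summary = summarize_telemetry(sample)
print(summary)
```

To process a real capture, pass an open file instead of the in-memory sample, for example `open("outputs/log_nano_full_pipeline.csv", newline="")`.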

📄 License

This project is licensed under the MIT License — see the LICENSE file for details.

About us

The Conect2AI research group is composed of undergraduate and graduate students from the Federal University of Rio Grande do Norte (UFRN). The group focuses on applying Artificial Intelligence and Machine Learning to emerging areas such as Embedded Intelligence, Internet of Things, and Intelligent Transportation Systems, contributing to energy efficiency and sustainable mobility solutions.

Website: http://conect2ai.dca.ufrn.br
