diff --git a/article-deep_belief_network/DBN_article.md b/article-deep_belief_network/DBN_article.md
new file mode 100644
index 0000000..20b7dbf
--- /dev/null
+++ b/article-deep_belief_network/DBN_article.md
@@ -0,0 +1,437 @@
+# Deep Belief Network
+
+&nbsp;&nbsp;&nbsp;In the modern era of innovation and technology, Artificial Intelligence (AI) has become one of the most 
+transformative and captivating advancements in our lives. From making predictions on data 
+and classifying things into categories to generating pictures and music, AI is 
+everywhere. One of the most outstanding applications of AI is creation (a.k.a. generation). 
+Today we will dive in to discover one of the generative networks: the Deep Belief Network (DBN).
+
+## Content:
+1. [Introduction](#introduction)
+2. [What are Deep Belief Networks (DBNs)?](#what-are-deep-belief-networks-dbns)
+3. [DBNs Overview](#dbns-overview)
+4. [Math Base](#math-base)
+5. [DBNs Training](#dbns-training)
+6. [Use Cases](#use-cases)
+7. [Drawbacks and limits](#drawbacks-and-limits)
+8. [Conclusion](#conclusion)
+9. [References](#references)
+
+
+## Introduction
+
+&nbsp;&nbsp;&nbsp;Before we get into the specifics of DBNs, let’s briefly go over 
+some general concepts for a better understanding. It is common 
+practice to divide Machine Learning models into discriminative 
+and generative ones [1]. As the names suggest, 
+discriminative models aim to separate data points into different 
+classes, while generative ones aim to generate data points. 
+Generative models are usually used in unsupervised learning 
+problems, as they are trained on inputs without their labels. 
+DBNs, which will be discussed today, are a type of deep learning 
+architecture that combines neural networks and unsupervised 
+learning.
+
+<p align="center">
+    <img src="images/dbn_suggestive_meme.jpg" alt="Fast_Learning" style="width:300px;"/>
+<br><em>Source: https://medium.com/swlh/what-are-rbms-deep-belief-networks-and-why-are-they-important-to-deep-learning-491c7de8937a</em>
+</p>
+
+## What are Deep Belief Networks (DBNs)?
+
+&nbsp;&nbsp;&nbsp;A Deep Belief Network is a stochastic deep learning architecture 
+composed of layers of Restricted Boltzmann Machines (RBMs), which are trained 
+in an unsupervised manner [2]. \
+&nbsp; &nbsp; &nbsp;A Restricted Boltzmann Machine is a generative unsupervised 
+model used for feature selection and feature reduction, 
+dimensionality reduction, classification, regression, and 
+other tasks in Machine Learning and Deep Learning [3]. It 
+learns a probability distribution over a given dataset 
+and uses the learnt distribution to draw conclusions about 
+unseen data. A typical RBM architecture is shown below 
+(where h represents hidden nodes and v represents visible nodes).
+
+<p align="center">
+    <img src="images/dbn_structure.png" alt="RBM architecture" style="width:400px;"/>
+<br><em>Source: https://www.javatpoint.com/keras-restricted-boltzmann-machine</em>
+</p>
+
+&nbsp; &nbsp; &nbsp;All RBMs that are part of a Deep Belief Network are trained in an 
+unsupervised manner, one at a time [4]. The output of each RBM 
+becomes the input for the next one. The output of the final RBM 
+is used in either classification or regression tasks, making the 
+general DBN architecture look like the one shown below. 
+The primary motivation behind DBNs was to obtain 
+unbiased values stored in leaf nodes and to avoid getting stuck in 
+local minima [5]. 
+
+<p align="center">
+    <img src="images/dbn_network.jpg" alt="DBN architecture" style="width:400px;"/>
+<br><em>Source: https://www.sciencedirect.com/topics/engineering/deep-belief-network</em>
+</p>
+
+## DBNs Overview
+
+&nbsp;&nbsp;&nbsp;Besides the Standard DBNs described earlier, several 
+extensions can be distinguished that incorporate different 
+structures and components. The following are the most 
+notable variations of DBNs:
+* Convolutional Deep Belief Networks (ConvDBNs) – in addition to RBMs, 
+incorporate convolutional layers [6];
+* Temporal Deep Belief Networks (Temporal DBNs) – extend the Standard DBN 
+to model sequential and time-series data, incorporating 
+recurrent connections or temporal dependencies between layers [7];
+* Variational Deep Belief Networks (VDBNs) – use a probabilistic 
+modeling technique, variational inference, in DBMs, with 
+multiple layers of hidden units fully connected between 
+consecutive layers;
+* Stacked Autoencoders – although not a DBN in the traditional 
+sense, when multiple autoencoders are stacked, they form a deep 
+architecture that shares a considerable number of similarities 
+with Deep Belief Networks.
+
+## Math Base
+&nbsp;&nbsp;&nbsp;Despite the diversity of the existing DBNs, it shouldn’t be 
+forgotten that all of them share similar “roots”. 
+Recall that a DBN is a network assembled out of many single 
+networks. Except for the first and last layers, each layer plays a dual 
+role, serving both as the hidden layer of the RBM that comes before 
+and as the input for the following one [8]. The joint distribution 
+between the observed vector $x$ and the $l$ hidden layers $h^k$ may be 
+expressed using the formula: 
+
+<p align="center">
+$P(x, h^1, ..., h^l) = \left(\displaystyle\prod^{l-2}_{k=0} P(h^k|h^{k + 1})\right)P(h^{l - 1}, h^l)$
+</p>
+
+, where:
+* $x = h^0$,
+* $P(h^k|h^{k + 1})$ – a conditional distribution for the visible units 
+conditioned on the hidden units of the RBM at level k,
+* $P(h^{l - 1}, h^l)$ – visible-hidden joint distribution in the top-level RBM.
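+
+To connect the formula to a single RBM, the sketch below states the standard energy-based 
+definition of a binary RBM. The symbols $W$ (weight matrix), $a$ (visible bias), $b$ (hidden bias), 
+$Z$ (normalizing constant) and $\sigma$ (logistic sigmoid) are notation assumed here for illustration, 
+not taken from the cited sources:
+
+<p align="center">
+$E(v, h) = -a^\top v - b^\top h - h^\top W v, \qquad P(v, h) = \dfrac{e^{-E(v, h)}}{Z}$
+</p>
+
+<p align="center">
+$P(h_j = 1 | v) = \sigma\Big(b_j + \displaystyle\sum_i W_{ji} v_i\Big), \qquad P(v_i = 1 | h) = \sigma\Big(a_i + \displaystyle\sum_j W_{ji} h_j\Big)$
+</p>
+
+Because an RBM has no connections inside a layer, the units of one layer are conditionally 
+independent given the other layer, which is what makes these layer-wise conditionals cheap to compute.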
+
+## DBNs Training
+
+&nbsp;&nbsp;&nbsp;Before training a DBN, it is necessary to remember that it is composed 
+of multiple Restricted Boltzmann Machines that should be trained individually.
+We will walk through a typical DBN training procedure using the example of a classification 
+task on handwritten digits. The data will be obtained from the MNIST dataset.
+
+**Step 1. Initialize the units and parameters of the Restricted Boltzmann Machine.**
+
+To implement an RBM, we create a corresponding class, with the `__init__` function 
+storing the number of visible and hidden units and initializing the visible and hidden biases 
+and the weight matrix. Xavier initialization is used to keep the initial weights 
+at a suitable scale, which helps the training process. 
+
+```python
+class RBM(nn.Module):
+    def __init__(self, visible_units, hidden_units):
+        super(RBM, self).__init__()
+        self.visible_units = visible_units  # number of visible units
+        self.hidden_units = hidden_units  # number of hidden units
+        self.weights = nn.Parameter(torch.empty(hidden_units, visible_units))  # weight matrix
+        nn.init.xavier_uniform_(self.weights)  # weights initialization
+        self.visible_bias = nn.Parameter(torch.zeros(visible_units))  # bias for the visible units
+        self.hidden_bias = nn.Parameter(torch.zeros(hidden_units))  # bias for the hidden units
+```
+
+**Step 2. Initialize the number of RBM layers and the size of each of 
+them, specifying the parameters (weights and biases) and 
+initialization strategy.**
+
+The second step focuses on deciding upon the number of RBM layers in the final model, 
+specifying the size of each visible and hidden layer, and appropriately initializing 
+the RBM parameters. 
+
+For the digit classification, we will use a simple model with 784 visible units (as 
+the images in the MNIST dataset are 28x28 pixels) and one hidden layer of 128 units.
+
+```python
+input_dim = 784
+hidden_layers = [128]
+
+rbm_layers = []
+current_input = train_inputs
+current_val_input = val_inputs
+for h_units in hidden_layers:
+    rbm = RBM(input_dim, h_units).to(device)
+    rbm_layers.append(rbm)  # store the RBM
+    input_dim = h_units  # update the input size for the next RBM
+```
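+
+As a quick sanity check on the sizes chosen above, the number of trainable parameters of this 
+single RBM can be worked out by hand (the weight matrix plus the visible and hidden biases):
+
+<p align="center">
+$784 \times 128 + 784 + 128 = 100352 + 912 = 101264$
+</p>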
+
+<p align="center">
+    <img src="images/dbn_rbm_layers.png" alt="RBM training" style="width:500px;"/>
+<br><em>Source: https://icecreamlabs.medium.com/deep-belief-networks-all-you-need-to-know-68aa9a71cc53</em>
+</p>
+
+**Step 3. Pre-train the RBM layers – using a greedy learning algorithm, which 
+implies a layer-by-layer approach, determine the relationship 
+between the variables in one layer and the variables in the layer above.**
+
+During this step, the RBMs are pre-trained one by one. Each RBM, once trained, 
+is stored in a list for stacking later.
+
+```python
+input_dim = 784
+hidden_layers = [128]
+
+
+rbm_layers = []
+current_input = train_inputs
+current_val_input = val_inputs
+for h_units in hidden_layers:
+    rbm = RBM(input_dim, h_units).to(device)
+    rbm_train_loss, rbm_val_loss = rbm.pretrain(current_input, current_val_input, rbm_batch_size=batch_size)  # train RBM
+    rbm_layers.append(rbm)  # store the RBM
+    input_dim = h_units  # update the input size for the next RBM
+```
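+
+The `pretrain` method itself is not listed in this article. In the accompanying source file it is 
+described as Contrastive Divergence; a sketch of one such update is given below. The parenthesized 
+superscripts (Contrastive Divergence steps, not layer indices) and the symbols $\eta$ (learning rate), 
+$W$, $a$ and $b$ (weights, visible bias, hidden bias) are notation assumed here for illustration:
+
+<p align="center">
+$h^{(0)} = P(h | v^{(0)}), \qquad v^{(1)} = P(v | h^{(0)}), \qquad h^{(1)} = P(h | v^{(1)})$
+</p>
+
+<p align="center">
+$\Delta W = \eta \big(h^{(0)} v^{(0)\top} - h^{(1)} v^{(1)\top}\big), \qquad \Delta a = \eta \big(v^{(0)} - v^{(1)}\big), \qquad \Delta b = \eta \big(h^{(0)} - h^{(1)}\big)$
+</p>
+
+Here $v^{(0)}$ is a training example, $v^{(1)}$ is its reconstruction, and the updates are summed over the examples in a batch.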
+
+The images were reconstructed every five epochs to monitor the process of training 
+the RBM and how well the model learns new features. The results after five epochs 
+of training are shown below.
+
+<p align="center">
+    <img src="images/dbn_actual_reconstructed_images_5_epochs.png" alt="DBN architecture" style="width:400px;"/>
+<br><em>The Original and Recreated images after 5 training epochs</em>
+</p>
+
+While they keep the main characteristics of the original images, 
+the recreated ones are still blurred, especially compared to the results obtained after 
+10 epochs of training. More training epochs lead to more 
+defined boundaries of digits and overall better quality of reconstructions.
+
+<p align="center">
+    <img src="images/dbn_actual_reconstructed_images_10_epochs.png" alt="DBN architecture" style="width:400px;"/>
+<br><em>The Original and Recreated images after 10 training epochs</em>
+</p>
+
+The improvement in RBM performance is also reflected in the training and validation loss 
+curves, which show a decreasing trend and approach 0 towards the end.
+
+<p align="center">
+    <img src="images/dbn_RBM_train_validation_loss.png" alt="DBN architecture" style="width:400px;"/>
+</p>
+
+**Step 4. Feature extraction – use the hidden activations of the final RBM 
+layer as features that can be fine-tuned for specific 
+tasks.**
+
+This step focuses on extracting the features (hidden activations) from the provided input. 
+Once extracted, the features are used as input for the next RBM. 
+
+
+```python
+input_dim = 784
+hidden_layers = [128]
+
+rbm_layers = []
+current_input = train_inputs
+current_val_input = val_inputs
+for h_units in hidden_layers:
+    rbm = RBM(input_dim, h_units).to(device)
+    rbm_train_loss, rbm_val_loss = rbm.pretrain(current_input, current_val_input, rbm_batch_size=batch_size)  # train RBM
+    current_input = rbm.extract_features(current_input)  # extracting features
+    current_val_input = rbm.extract_features(current_val_input)
+    rbm_layers.append(rbm)  # store the RBM
+    input_dim = h_units  # update the input size for the next RBM
+```
+
+**Step 5. Initialize the Supervised Layer – add a supervised layer (usually 
+softmax) on top of the last RBM layer.**
+
+For the next step, we define the DBN class, which stacks the pre-trained RBMs and adds a 
+classifier on top. As we are dealing with a classification problem, the classifier outputs one 
+score (logit) per class; the softmax itself is applied inside `nn.CrossEntropyLoss` during training.
+
+```python
+class DBN(nn.Module):
+    def __init__(self, rbm_layers, output_classes):
+        super(DBN, self).__init__()
+        self.rbms = nn.ModuleList(rbm_layers)
+        self.classifier = nn.Sequential(
+            nn.Linear(rbm_layers[-1].hidden_units, output_classes),  # fully connected layer
+            # no explicit nn.Softmax: nn.CrossEntropyLoss expects raw logits and applies log-softmax itself
+        )
+
+    def forward(self, x):
+        for rbm in self.rbms:
+            x = torch.sigmoid(torch.matmul(x, rbm.weights.t()) + rbm.hidden_bias)  # hidden activations of each RBM
+        return self.classifier(x)  # class logits
+```
+
+**Step 6. ~~Word-by-word, say loudly and clearly the spell.~~ Train the model.**
+
+During this step, the RBMs are used only for feature extraction, while the supervised 
+layer is trained on labelled data to classify the digits accordingly.
+
+```python
+def train_dbn(dbn, dbn_train_data, dbn_train_labels, dbn_val_data, dbn_val_labels, epochs=25, lr=0.05):
+    criterion = nn.CrossEntropyLoss()
+    # only the classifier is updated here; the pre-trained RBM weights stay fixed in this step
+    optimizer = torch.optim.Adam(dbn.classifier.parameters(), lr=lr)
+    train_losses, val_losses = [], []
+
+    for epoch in range(epochs):
+        dbn.train()
+        optimizer.zero_grad()  # reset the classifier gradients from the previous epoch
+        outputs = dbn(dbn_train_data)
+        train_loss = criterion(outputs, dbn_train_labels)
+        train_loss.backward()
+        optimizer.step()  # apply the update to the classifier parameters
+        train_losses.append(train_loss.item())
+
+        dbn.eval()
+        with torch.no_grad():
+            val_outputs = dbn(dbn_val_data)
+            val_loss = criterion(val_outputs, dbn_val_labels)
+            val_losses.append(val_loss.item())
+
+        print(f"Epoch {epoch + 1}/{epochs}, Train Loss: {train_loss.item():.4f}, Val Loss: {val_loss.item():.4f}")
+
+    return train_losses, val_losses
+```
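+
+For reference, the loss computed by `criterion` can be written out explicitly. Assuming $z$ denotes the 
+vector the model outputs for one image, $C$ the number of classes and $y$ the true class (symbols 
+introduced here for illustration), PyTorch's `nn.CrossEntropyLoss` treats $z$ as unnormalized scores and computes, 
+averaged over the batch by default:
+
+<p align="center">
+$\mathcal{L}(z, y) = -\log \dfrac{e^{z_y}}{\displaystyle\sum_{c=1}^{C} e^{z_c}}$
+</p>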
+
+The model was trained for 25 epochs, during which the loss was reduced considerably, 
+as it can be observed from the diagram provided below.
+
+<p align="center">
+    <img src="images/dbn_train_validation_loss.png" alt="DBN architecture" style="width:400px;"/>
+</p>
+
+**Step 7. Fine-tune the model.**
+
+Use labelled data and backpropagation to update the parameters of the entire network: 
+an optimizer adjusts the weights and biases based on the gradients of the loss with 
+respect to the parameters so that the loss is minimized. You may also tune
+hyperparameters such as the number of epochs, the hidden layers, or the batch size 
+(you may perform this step in an iterative manner).
+
+```python
+def train_dbn(dbn, dbn_train_data, dbn_train_labels, dbn_val_data, dbn_val_labels, epochs=25, lr=0.05):
+    criterion = nn.CrossEntropyLoss()
+    optimizer = optim.Adam(dbn.parameters(), lr=lr)
+    train_losses, val_losses = [], []
+
+    for epoch in range(epochs):
+        dbn.train()
+        optimizer.zero_grad()
+        outputs = dbn(dbn_train_data)
+        train_loss = criterion(outputs, dbn_train_labels)
+        train_loss.backward()
+        optimizer.step()
+        train_losses.append(train_loss.item())
+
+        dbn.eval()
+        with torch.no_grad():
+            val_outputs = dbn(dbn_val_data)
+            val_loss = criterion(val_outputs, dbn_val_labels)
+            val_losses.append(val_loss.item())
+
+        print(f"Epoch {epoch + 1}/{epochs}, Train Loss: {train_loss.item():.4f}, Val Loss: {val_loss:.4f}")
+
+    return train_losses, val_losses
+```
+
+**Step 8. Evaluate the trained DBN on a test set.**
+
+This step is intended to show the performance of the model, evaluating it against a test 
+set or, in other words, unseen data. Using the provided source code, it was possible to 
+achieve an accuracy of 0.94, which is a pretty solid result.
+
+```python
+def evaluate(dbn_model, data, labels):
+    with torch.no_grad():
+        outputs = dbn_model(data)
+        _, predicted = torch.max(outputs, 1)
+        accuracy = (predicted == labels).sum().item() / len(labels)
+
+    print(f"Accuracy: {accuracy:.2f}")
+```
+
+**Step 9. Postprocessing.** 
+
+After performing everything mentioned above, you might want to improve the performance
+of the model by adding some additional steps, such as thresholding or normalization.
+
+```python
+# normalization
+transform = transforms.Compose([
+    transforms.ToTensor(),
+    transforms.Normalize((0.5,), (0.5,)),
+    transforms.Lambda(lambda x: (x + 1) / 2)
+])
+```
+
+## Use Cases
+
+&nbsp;&nbsp;&nbsp;In broad terms, Deep Belief Networks can be described as a more 
+efficient version of a feedforward neural network. This gives 
+this type of network broad applicability: recognizing, 
+clustering and generating images, video sequences 
+and motion-capture data. For example, ConvDBNs are well suited 
+to tasks that involve grid-like data, such as images, while 
+Temporal DBNs are better for speech recognition or natural 
+language processing tasks. To sum up, the fields in which they are widely 
+used are:
+
+* Computer Vision – object recognition and classification [6];
+* NLP – sentiment analysis and text classification;
+* Speech recognition – transcribing speech into text;
+* Recommendation Systems – giving suggestions based on previous inputs;
+* Financial analysis – predicting stock market and how risky some actions are;
+* Bioinformatics – predict the interactions between components like proteins, 
+find new drugs and predict how genes may express.
+
+## Drawbacks and limits
+
+&nbsp;&nbsp;&nbsp;In spite of the many advantages DBNs bring, they are still quite 
+an early version of a deep neural network and might not be as 
+effective as their newer counterparts due to a considerable set of 
+drawbacks, such as:
+
+* high hardware requirements;
+* complex data model that is difficult to train;
+* difficult to use for inexperienced people;
+* require a huge amount of data for good performance;
+* require classifiers to interpret the output.
+
+## Conclusion
+
+&nbsp;&nbsp;&nbsp;To sum everything up, the Deep Belief Network is an 
+architecture from the early days of deep learning, composed of multiple 
+Restricted Boltzmann Machines and aimed at classification, 
+clustering and generation tasks. Because its RBMs are 
+trained one at a time, it uses a greedy algorithm to train each 
+layer until all RBMs are trained and the output can be passed to 
+a supervised learning model. Although it is quite old, it still 
+has its fields of applicability where it performs considerably 
+better than other known algorithms, which is one of the reasons 
+to know about its existence. Last but not least, it is just an awesome 
+algorithm with a unique architecture that might bring you fun 
+while diving in!
+
+<p align="center">
+    <img src="images/dbn_end_picture.jpg" alt="Closing image" style="width:500px;"/>
+<br><em>Source: https://www.123rf.com/photo_166564781_that-s-all-folks-vintage-movie-ending-screen-background-the-end-vector-illustration-.html</em>
+</p>
+
+
+## References
+[1] - E. ARGOUARC'H, F. DESBOUVRIES, E. BARAT, E. KAWASAKI, 
+_Generative vs. Discriminative modeling under the lens of uncertainty quantification_,
+doi: 10.48550/arXiv.2406.09172. Access link: https://arxiv.org/pdf/2406.09172 \
+[2] - M. ZAMBRA, A. TESTOLIN, M. ZORZI, _A developmental approach for training deep 
+belief networks_, doi: 10.48550/arXiv.2207.05473. Access link: https://arxiv.org/pdf/2207.05473 \
+[3] - MEDIUM, _What Are RBMs, Deep Belief Networks and Why Are They Important to Deep Learning?_.
+Article. [accessed 29.07.2023]. Access link:  https://medium.com/swlh/what-are-rbms-deep-belief-networks-and-why-are-they-important-to-deep-learning-491c7de8937a \
+[4] - JAVATPOINT, _Restricted Boltzmann Machine_. Article. [accessed 30.07.2023]. Access link: 
+https://www.javatpoint.com/keras-restricted-boltzmann-machine \
+[5] - AL-JABERY K. K., WUNSCH D. C., _Selected approaches to supervised learning_, 
+Computational Learning Approaches to Data Analytics in Biomedical Applications, 2020 \
+[6] - H. LEE, R. GROSSE, R. RANGANATH, A. Y. NG, 
+_Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations_,
+doi: 10.1145/1553374.1553453, Access link: https://ai.stanford.edu/~ang/papers/icml09-ConvolutionalDeepBeliefNetworks.pdf? \
+[7] - F. Y. ZHOU, J. Q. YIN, Y. YANG, H. T. ZHANG, 
+_Online recognition of human actions based on temporal deep belief neural network_,
+doi: 10.16383/j.aas.2016.c150629. 
+Access link: https://www.researchgate.net/publication/306168057_Online_recognition_of_human_actions_based_on_temporal_deep_belief_neural_network \
+[8] - ABIRAMI S., CHITRA P., _The Digital Twin Paradigm for 
+Smarter Systems and Environments: The Industry Use Cases_, 
+Advances in Computers, 2020
\ No newline at end of file
diff --git a/article-deep_belief_network/README.md b/article-deep_belief_network/README.md
new file mode 100644
index 0000000..3ccf044
--- /dev/null
+++ b/article-deep_belief_network/README.md
@@ -0,0 +1,59 @@
+# Deep Belief Network: The hidden hero behind the Deep Learning (r)evolution 
+
+Welcome to an article dedicated to the Deep Belief Network discussion! Be ready to 
+discover some dark magic and impressive algorithms behind modern generative AI.
+## Article Summary
+
+The proposed article focuses on the discussion of Deep Belief Networks (DBN), a type of 
+stochastic neural network, one of the first algorithms to prove the feasibility of 
+unsupervised learning and hidden layers. It highlights the extensive usage of DBNs in 
+different tasks, such as generation, recognition and feature extraction, focusing the 
+implementation on digit classification based on the MNIST dataset.
+
+## Getting Started
+
+Follow the steps below to see how Deep Belief Networks work for the classification
+of handwritten digits!
+
+
+1. **Create a New Project**: Create a new empty Python Project on your device and navigate 
+to the project directory.
+
+    ```sh
+    cd my-dbn-project
+    ```
+
+2. **Prepare Your Environment**: Before you begin, make sure you have a virtual environment set up for your project. If not, create and activate a virtual environment:
+
+    ```sh
+    python -m venv dbn_env
+    source dbn_env/bin/activate      # On Windows: .\dbn_env\Scripts\activate
+    ```
+
+3. **Copy the source code**: Inside the empty directory, add the file with the provided
+source code (`dbn_implementation.py`) and the requirements (`requirements.txt`).
+
+4. **Install Requirements**: Inside your virtual environment, install the required packages from the `requirements.txt` file:
+
+    ```sh
+    pip install -r requirements.txt
+    ```
+
+5. **Run the source code**: To run the source code provided for the DBN, use the command
+below. It can take some time, as the MNIST dataset has to be downloaded first.
+
+    ```sh
+    python dbn_implementation.py
+    ```
+
+## Next Steps
+
+Now that you have successfully run the provided source code, consider optimizing some 
+hyperparameters or making your own DBN implementation for a different task (you may 
+consider fraud detection or text classification based on topics).
+
+Make sure to share your achievements with us!😉
+
+For any questions or further assistance, reach out to our community.
+
+Happy coding!✨💙
diff --git a/article-deep_belief_network/images/dbn_RBM_train_validation_loss.png b/article-deep_belief_network/images/dbn_RBM_train_validation_loss.png
new file mode 100644
index 0000000..b81edd2
Binary files /dev/null and b/article-deep_belief_network/images/dbn_RBM_train_validation_loss.png differ
diff --git a/article-deep_belief_network/images/dbn_actual_reconstructed_images_10_epochs.png b/article-deep_belief_network/images/dbn_actual_reconstructed_images_10_epochs.png
new file mode 100644
index 0000000..2559653
Binary files /dev/null and b/article-deep_belief_network/images/dbn_actual_reconstructed_images_10_epochs.png differ
diff --git a/article-deep_belief_network/images/dbn_actual_reconstructed_images_5_epochs.png b/article-deep_belief_network/images/dbn_actual_reconstructed_images_5_epochs.png
new file mode 100644
index 0000000..6ddd1fc
Binary files /dev/null and b/article-deep_belief_network/images/dbn_actual_reconstructed_images_5_epochs.png differ
diff --git a/article-deep_belief_network/images/dbn_end_picture.jpg b/article-deep_belief_network/images/dbn_end_picture.jpg
new file mode 100644
index 0000000..7350271
Binary files /dev/null and b/article-deep_belief_network/images/dbn_end_picture.jpg differ
diff --git a/article-deep_belief_network/images/dbn_network.jpg b/article-deep_belief_network/images/dbn_network.jpg
new file mode 100644
index 0000000..3202ba0
Binary files /dev/null and b/article-deep_belief_network/images/dbn_network.jpg differ
diff --git a/article-deep_belief_network/images/dbn_rbm_layers.png b/article-deep_belief_network/images/dbn_rbm_layers.png
new file mode 100644
index 0000000..81e45cf
Binary files /dev/null and b/article-deep_belief_network/images/dbn_rbm_layers.png differ
diff --git a/article-deep_belief_network/images/dbn_structure.png b/article-deep_belief_network/images/dbn_structure.png
new file mode 100644
index 0000000..ba3696b
Binary files /dev/null and b/article-deep_belief_network/images/dbn_structure.png differ
diff --git a/article-deep_belief_network/images/dbn_suggestive_meme.jpg b/article-deep_belief_network/images/dbn_suggestive_meme.jpg
new file mode 100644
index 0000000..697abe1
Binary files /dev/null and b/article-deep_belief_network/images/dbn_suggestive_meme.jpg differ
diff --git a/article-deep_belief_network/images/dbn_train_validation_loss.png b/article-deep_belief_network/images/dbn_train_validation_loss.png
new file mode 100644
index 0000000..ae37a18
Binary files /dev/null and b/article-deep_belief_network/images/dbn_train_validation_loss.png differ
diff --git a/article-deep_belief_network/src/dbn_implementation.py b/article-deep_belief_network/src/dbn_implementation.py
new file mode 100644
index 0000000..292e48d
--- /dev/null
+++ b/article-deep_belief_network/src/dbn_implementation.py
@@ -0,0 +1,359 @@
+import matplotlib.pyplot as plt
+import torch.optim as optim
+from torchvision import datasets, transforms
+from torch.utils.data import DataLoader, random_split
+
+import torch
+import torch.nn as nn
+
+
+class RBM(nn.Module):
+    """
+    Restricted Boltzmann Machine implementation to extract the features from the input data in an unsupervised manner
+    """
+    def __init__(self, visible_units, hidden_units):
+        """
+        Initialize the Restricted Boltzmann Machine
+
+        Args:
+            visible_units: the number of visible units
+            hidden_units: the number of hidden units
+        """
+        super(RBM, self).__init__()
+        self.visible_units = visible_units
+        self.hidden_units = hidden_units
+        self.weights = nn.Parameter(torch.empty(hidden_units, visible_units))  # weight matrix
+        nn.init.xavier_uniform_(self.weights)  # weights initialization
+        self.visible_bias = nn.Parameter(torch.zeros(visible_units))  # bias for the visible units
+        self.hidden_bias = nn.Parameter(torch.zeros(hidden_units))  # bias for the hidden units
+
+    def sample_h(self, v):
+        """
+        Computes the probability of hidden units given the visible units
+        Args:
+            v: activations for visible layers
+        Returns:
+            The probability of hidden layer activations
+        """
+        h_prob = torch.sigmoid(torch.matmul(v, self.weights.t()) + self.hidden_bias)
+        return h_prob
+
+    def sample_v(self, h):
+        """
+        Computes the probability of visible units given the hidden units
+        Args:
+            h: activations for hidden layers
+        Returns:
+            The probability of visible layer activations
+        """
+        v_prob = torch.sigmoid(torch.matmul(h, self.weights) + self.visible_bias)
+        return v_prob
+
+    def pretrain(self, rbm_train_data, rbm_val_data, rbm_batch_size, epochs=10, lr=0.001):
+        """
+        Pretrain the Restricted Boltzmann Machine using unsupervised learning, minimizing the reconstruction
+        errors using Contrastive Divergence
+
+        Args:
+            rbm_train_data: training data
+            rbm_val_data: validation data
+            rbm_batch_size: the batch size for training
+            epochs: number of training epochs
+            lr: learning rate
+
+        Returns:
+            The lists of training and validation losses for each epoch
+        """
+        num_samples = rbm_train_data.size(0)
+        train_losses, val_losses = [], []
+        for epoch in range(epochs):
+            epoch_loss = 0
+            indices = torch.randperm(num_samples)
+            data = rbm_train_data[indices]
+
+            for i in range(0, num_samples, rbm_batch_size):
+                batch = data[i:i + rbm_batch_size].to(next(self.parameters()).device)
+                loss = self.train_rbm(batch, lr)
+                epoch_loss += loss.item() * batch.size(0)
+            epoch_loss /= num_samples
+            train_losses.append(epoch_loss)
+
+            # visualize the reconstructed images for each 5 epochs
+            if (epoch + 1) % 5 == 0:
+                print(f"Reconstruction on epoch {epoch + 1}:")
+                self.visualize_reconstructions(data)
+            with torch.no_grad():
+                val_loss = self.evaluate(rbm_val_data)
+                val_losses.append(val_loss)
+
+            print(f"Epoch {epoch + 1}/{epochs}, Loss: {epoch_loss:.4f}")
+
+        return train_losses, val_losses
+
+    def train_rbm(self, v, lr=0.01):
+        """
+        Perform one training step of Contrastive Divergence
+
+        Args:
+            v: the input batch of visible layer data, shape (batch_size, visible_units)
+            lr: learning rate for the parameter updates
+
+        Returns:
+            Reconstruction loss for the input batch
+        """
+        h_prob = self.sample_h(v)
+        v_reconstructed = self.sample_v(h_prob)
+        h_prob_reconstructed = self.sample_h(v_reconstructed)
+
+        positive_grad = torch.matmul(v.t(), h_prob)
+        negative_grad = torch.matmul(v_reconstructed.t(), h_prob_reconstructed)
+
+        with torch.no_grad():
+            self.weights += lr * (positive_grad - negative_grad).t()
+            self.visible_bias += lr * torch.sum(v - v_reconstructed, dim=0)
+            self.hidden_bias += lr * torch.sum(h_prob - h_prob_reconstructed, dim=0)
+
+        return torch.mean((v - v_reconstructed) ** 2)
+
+    def evaluate(self, data):
+        """
+        Compute the reconstruction error on the given dataset to evaluate the RBM performance
+
+        Args:
+            data: the dataset to evaluate, shape (num_samples, visible_units)
+
+        Returns:
+            Mean squared reconstruction error
+        """
+        with torch.no_grad():
+            v_reconstructed = self.sample_v(self.sample_h(data))
+            return torch.mean((data - v_reconstructed) ** 2).item()
+
+    @torch.no_grad()
+    def extract_features(self, data):
+        """
+        Extract features from the hidden layer of the RBM
+
+        Args:
+            data: input data to extract features from, shape (num_samples, visible_units)
+
+        Returns:
+            The activations for the hidden layers, shape (num_samples, hidden_units)
+        """
+        h_activations = self.sample_h(data)
+        return h_activations
+
+    @torch.no_grad()
+    def visualize_reconstructions(self, data, num_samples=16):
+        """
+        Visualize original and reconstructed images for the evaluated samples
+
+        Args:
+            data: the input dataset to reconstruct.
+            num_samples (default: 16): the number of samples to visualize
+        """
+        v = data[:num_samples]
+        v_reconstructed = self.sample_v(self.sample_h(v))
+        fig, axes = plt.subplots(2, num_samples // 2, figsize=(12, 4))
+        for i, ax in enumerate(axes.flatten()):
+            ax.imshow(v[i].view(28, 28).cpu(), cmap="gray")
+            ax.axis("off")
+        plt.suptitle("Original images:")
+        plt.show()
+
+        fig, axes = plt.subplots(2, num_samples // 2, figsize=(12, 4))
+        for i, ax in enumerate(axes.flatten()):
+            ax.imshow(v_reconstructed[i].view(28, 28).cpu(), cmap="gray")
+            ax.axis("off")
+        plt.suptitle("Reconstructed images:")
+        plt.show()
+
+
+class DBN(nn.Module):
+    """
+    Deep Belief Network implementation to combine multiple Restricted Boltzmann Machines and adding supervised training
+    for classification task
+    """
+    def __init__(self, rbm_layers, output_classes):
+        """
+        Initialize the Deep Belief Network
+
+        Args:
+            rbm_layers: a list of pretrained RBMs stacked together
+            output_classes: the number of output classes for the classification task
+        """
+        super(DBN, self).__init__()
+        self.rbms = nn.ModuleList(rbm_layers)
+        # the classifier returns raw logits: nn.CrossEntropyLoss applies log-softmax internally
+        self.classifier = nn.Sequential(
+            nn.Linear(rbm_layers[-1].hidden_units, output_classes),  # fully connected layer
+        )
+
+    def forward(self, x):
+        """
+        Perform a forward pass through the DBN
+
+        Args:
+            x: the input tensor, shape (batch_size, visible_units)
+
+        Returns:
+            The logits for each class, shape (batch_size, output_classes)
+        """
+        for rbm in self.rbms:
+            x = torch.sigmoid(torch.matmul(x, rbm.weights.t()) + rbm.hidden_bias)
+        return self.classifier(x)
+
+
+def train_dbn(dbn, dbn_train_data, dbn_train_labels, dbn_val_data, dbn_val_labels, epochs=25, lr=0.05):
+    """
+    Train the Deep Belief Network (DBN) using supervised learning
+
+    Args:
+        dbn: the Deep Belief Network model to train
+        dbn_train_data: the training data
+        dbn_train_labels: the training data labels
+        dbn_val_data: the validation data
+        dbn_val_labels: the validation labels
+        epochs (default: 25): the number of training epochs
+        lr (default: 0.05): learning rate for the Adam optimizer
+
+    Returns:
+        The lists of training and validation losses per epoch
+    """
+    criterion = nn.CrossEntropyLoss()
+    optimizer = optim.Adam(dbn.parameters(), lr=lr)
+    train_losses, val_losses = [], []
+
+    for epoch in range(epochs):
+        dbn.train()
+        optimizer.zero_grad()
+        outputs = dbn(dbn_train_data)
+        train_loss = criterion(outputs, dbn_train_labels)
+        train_loss.backward()
+        optimizer.step()
+        train_losses.append(train_loss.item())
+
+        dbn.eval()
+        with torch.no_grad():
+            val_outputs = dbn(dbn_val_data)
+            val_loss = criterion(val_outputs, dbn_val_labels)
+            val_losses.append(val_loss.item())
+
+        print(f"Epoch {epoch + 1}/{epochs}, Train Loss: {train_loss.item():.4f}, Val Loss: {val_loss.item():.4f}")
+
+    return train_losses, val_losses
+
+
+def evaluate(dbn_model, data, labels):
+    """
+    Evaluate the trained DBN on a test dataset
+
+    Args:
+        dbn_model: the trained Deep Belief Network model
+        data: the test data, shape (num_samples, visible_units)
+        labels: the test data labels, shape (num_samples,)
+    """
+    with torch.no_grad():
+        outputs = dbn_model(data)
+        _, predicted = torch.max(outputs, 1)
+        accuracy = (predicted == labels).sum().item() / len(labels)
+
+    print(f"Accuracy: {accuracy:.2f}")
+
+
+def plot_losses(train_losses, val_losses, title):
+    """
+    Plot the training and validation loss curves
+
+    Args:
+        train_losses: the list of training losses per epoch
+        val_losses: the list of validation losses per epoch
+        title: the title of the plot
+    """
+    plt.figure(figsize=(10, 6))
+    plt.plot(train_losses, label="Train Loss")
+    plt.plot(val_losses, label="Validation Loss")
+    plt.xlabel("Epochs")
+    plt.ylabel("Loss")
+    plt.title(title)
+    plt.legend()
+    plt.grid(True)
+    plt.show()
+
+
+def prepare_data(data_loader):
+    """
+    Prepare data by flattening each image and concatenating all batches into single tensors
+
+    Args:
+        data_loader: a DataLoader object yielding (images, targets) batches
+
+    Returns:
+        the flattened input data and corresponding labels
+    """
+    inputs, labels = [], []
+    for batch in data_loader:
+        images, targets = batch
+        images = images.view(images.size(0), -1)
+        inputs.append(images)
+        labels.append(targets)
+    inputs = torch.cat(inputs)
+    labels = torch.cat(labels)
+    return inputs, labels
+
+
+# normalization: ToTensor scales pixel values to [0, 1],
+# so no further rescaling is needed
+transform = transforms.Compose([
+    transforms.ToTensor(),
+])
+
+mnist_data = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
+
+train_size = int(0.6 * len(mnist_data))
+val_size = int(0.25 * len(mnist_data))
+test_size = len(mnist_data) - train_size - val_size
+
+train_data, val_data, test_data = random_split(mnist_data, [train_size, val_size, test_size])
+
+batch_size = 32
+train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True)
+val_loader = DataLoader(val_data, batch_size=batch_size, shuffle=False)
+test_loader = DataLoader(test_data, batch_size=batch_size, shuffle=False)
+
+device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+
+train_inputs, train_labels = prepare_data(train_loader)
+val_inputs, val_labels = prepare_data(val_loader)
+test_inputs, test_labels = prepare_data(test_loader)
+
+train_inputs, train_labels = train_inputs.to(device), train_labels.to(device)
+val_inputs, val_labels = val_inputs.to(device), val_labels.to(device)
+test_inputs, test_labels = test_inputs.to(device), test_labels.to(device)
+
+input_dim = 784
+hidden_layers = [128]
+output_classes = 10
+
+rbm_layers = []
+current_input = train_inputs
+current_val_input = val_inputs
+for h_units in hidden_layers:
+    rbm = RBM(input_dim, h_units).to(device)
+    rbm_train_loss, rbm_val_loss = rbm.pretrain(current_input, current_val_input, rbm_batch_size=batch_size)
+    current_input = rbm.extract_features(current_input)
+    current_val_input = rbm.extract_features(current_val_input)
+    rbm_layers.append(rbm)
+    input_dim = h_units
+
+    plot_losses(rbm_train_loss, rbm_val_loss, title="RBM training and validation losses")
+
+dbn = DBN(rbm_layers=rbm_layers, output_classes=output_classes)
+dbn.to(device)
+
+dbn_train_loss, dbn_val_loss = train_dbn(dbn, train_inputs, train_labels, val_inputs, val_labels)
+
+plot_losses(dbn_train_loss, dbn_val_loss, title="DBN Training and Validation Loss")
+
+evaluate(dbn, test_inputs, test_labels)
diff --git a/article-deep_belief_network/src/requirements.txt b/article-deep_belief_network/src/requirements.txt
new file mode 100644
index 0000000..692974b
--- /dev/null
+++ b/article-deep_belief_network/src/requirements.txt
@@ -0,0 +1,3 @@
+matplotlib==3.10.0
+torch==2.5.1
+torchvision==0.20.1