Project for the "Information Theory and Inference" class, held by Prof. Michele Allegra at the University of Padova during the Academic Year 2023-2024,
by Erica Brisigotti & Ekaterina Chueva
The goals of the project are:
- implement a Bayesian Optimization (BO) algorithm from scratch (a generic sketch of the loop is given after the reference below)
- test the algorithm on benchmark functions
- apply it to a machine learning task
For the theory, the main reference was the article "Taking the Human Out of the Loop: A Review of Bayesian Optimization" by B. Shahriari, K. Swersky, Z. Wang, R. P. Adams, and N. de Freitas (2016).
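The full implementation is in the Jupyter notebook; as a high-level reference, a minimal sketch of the generic BO loop follows. The `surrogate` object and its `fit`/`predict` interface are hypothetical names used for illustration, and maximizing the acquisition over random candidates is just one of several possible choices.

```python
import numpy as np

def bayesian_optimization(objective, bounds, surrogate, acquisition,
                          n_init=5, n_iter=30, n_candidates=2048, seed=0):
    """Generic BO loop: fit the surrogate, maximize the acquisition,
    evaluate the objective at the winner, repeat."""
    rng = np.random.default_rng(seed)
    low, high = bounds[:, 0], bounds[:, 1]
    X = rng.uniform(low, high, size=(n_init, len(low)))  # random initial design
    y = np.array([objective(x) for x in X])

    for _ in range(n_iter):
        surrogate.fit(X, y)  # hypothetical fit/predict interface
        cand = rng.uniform(low, high, size=(n_candidates, len(low)))
        mu, sigma = surrogate.predict(cand)
        x_next = cand[np.argmax(acquisition(mu, sigma, y.max()))]
        X = np.vstack([X, x_next])
        y = np.append(y, objective(x_next))

    best = np.argmax(y)
    return X[best], y[best]
```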
The implemented kernels are squared-exponential, Matérn-1, Matérn-3, and Matérn-5. On top of that, there are two types of acquisition functions: expected improvement and upper confidence bound ("optimistic"). For details and context, see the Jupyter notebook; hedged sketches of two of these ingredients follow.
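Below is a sketch of one kernel and the two acquisition functions (assuming "Matern-3" denotes the Matérn kernel with ν = 3/2); hyperparameter names such as `length_scale`, `xi`, and `beta` are illustrative and may differ from the notebook.

```python
import numpy as np
from scipy.stats import norm

def squared_exponential(x1, x2, length_scale=1.0):
    """Squared-exponential (RBF) kernel."""
    d2 = np.sum((np.asarray(x1) - np.asarray(x2)) ** 2)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def matern_3(x1, x2, length_scale=1.0):
    """Matérn kernel with nu = 3/2."""
    d = np.linalg.norm(np.asarray(x1) - np.asarray(x2))
    a = np.sqrt(3.0) * d / length_scale
    return (1.0 + a) * np.exp(-a)

def expected_improvement(mu, sigma, incumbent, xi=0.01):
    """Expected improvement (maximization), from the GP posterior mean/std."""
    sigma = np.maximum(sigma, 1e-12)  # guard against zero predictive std
    z = (mu - incumbent - xi) / sigma
    return (mu - incumbent - xi) * norm.cdf(z) + sigma * norm.pdf(z)

def upper_confidence_bound(mu, sigma, incumbent=None, beta=2.0):
    """'Optimistic' UCB acquisition (maximization); incumbent is unused."""
    return mu + beta * sigma
```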
Three benchmark functions of different complexity were taken from the Olympus package: HyperEllipsoid, Levy, and AckleyPath. They were taken with the opposite sign, since our algorithm finds the maximum of a function.
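The sign flip is the only adaptation needed, since the Olympus surfaces are minimization benchmarks while our BO maximizes. A minimal illustration, using a textbook hyper-ellipsoid in place of the actual Olympus call:

```python
import numpy as np

def hyper_ellipsoid(x):
    """Textbook hyper-ellipsoid: sum_i (i + 1) * x_i^2, minimum 0 at the origin."""
    x = np.asarray(x)
    return np.sum((np.arange(len(x)) + 1) * x ** 2)

def negated(f):
    """Flip the sign so a minimization benchmark becomes a maximization target."""
    return lambda x: -f(x)

target = negated(hyper_ellipsoid)  # what our BO actually maximizes
```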
The results of the testing are shown in the following figures:
As we can see from the three plots above, there is no single perfect recipe for all functions.
For Levy and HyperEllipsoid, we notice that:
- all combinations of policies and kernels reach the final result, but after a varying number of BO iterations
- the optimistic policy starts exploring from points farther from the optimum, but is also more likely to move when a better value appears
- the expected improvement policy starts closer to the final output, but changes less
These tendencies are likely related to the smoother shape of these two functions.
For AckleyPath, the plots clearly show that both policies get stuck in different local maxima, because of the very complicated shape of the function.
We consider a simple CNN trained on the MNIST dataset of handwritten digits as our machine learning task. The target function is the accuracy of the CNN, while the parameters to be optimized are the learning rate and the momentum of stochastic gradient descent (SGD).
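As a rough illustration of such an objective, here is a hedged sketch in PyTorch; the actual framework, architecture, and training budget used in the notebook may differ, and the one-epoch default here is purely for speed.

```python
import torch
import torch.nn as nn
from torchvision import datasets, transforms

def cnn_accuracy(lr, momentum, epochs=1):
    """BO objective: train a small CNN on MNIST with SGD(lr, momentum)
    and return test accuracy. Architecture and budget are illustrative."""
    to_tensor = transforms.ToTensor()
    train = datasets.MNIST(".", train=True, download=True, transform=to_tensor)
    test = datasets.MNIST(".", train=False, download=True, transform=to_tensor)
    train_loader = torch.utils.data.DataLoader(train, batch_size=64, shuffle=True)
    test_loader = torch.utils.data.DataLoader(test, batch_size=256)

    model = nn.Sequential(
        nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Flatten(), nn.Linear(8 * 14 * 14, 10),
    )
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=momentum)
    loss_fn = nn.CrossEntropyLoss()

    model.train()
    for _ in range(epochs):
        for xb, yb in train_loader:
            opt.zero_grad()
            loss_fn(model(xb), yb).backward()
            opt.step()

    model.eval()
    correct = 0
    with torch.no_grad():
        for xb, yb in test_loader:
            correct += (model(xb).argmax(dim=1) == yb).sum().item()
    return correct / len(test)
```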
The results of applying our algorithm to this task are presented in the following plot:
We notice from the plot that, for our CNN application, the Matérn-5 kernel seems to be optimal for both policies.
Besides that, we also compare the results obtained with our Bayesian optimization implementation to similar tools from existing packages:
- Keras Tuner, which provides multiple algorithm choices, including a Bayesian Optimization one.
As we found, our implementation is comparable to these tools in terms of speed and results.
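For reference, a minimal sketch of how such a Keras Tuner baseline can be set up; the search-space bounds and trial budget here are illustrative, not necessarily those used in our comparison.

```python
import keras_tuner as kt
from tensorflow import keras

def build_model(hp):
    model = keras.Sequential([
        keras.Input(shape=(28, 28, 1)),
        keras.layers.Conv2D(8, 3, activation="relu"),
        keras.layers.MaxPooling2D(),
        keras.layers.Flatten(),
        keras.layers.Dense(10, activation="softmax"),
    ])
    # Same two hyperparameters our own BO tunes: learning rate and momentum.
    lr = hp.Float("learning_rate", min_value=1e-4, max_value=1e-1, sampling="log")
    momentum = hp.Float("momentum", min_value=0.0, max_value=0.99)
    model.compile(
        optimizer=keras.optimizers.SGD(learning_rate=lr, momentum=momentum),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

tuner = kt.BayesianOptimization(build_model, objective="val_accuracy", max_trials=10)
# tuner.search(x_train, y_train, validation_split=0.2, epochs=1)
```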



