This repository collects the coursework for CS7CS4 Machine Learning. The final assignment is the main focus of the repo and contains the most substantial modelling work.
The final_assignment/ folder contains the submission work for the coursework's final project. It is split into two main model families:
- final_assignment/MathGPT/ for arithmetic language models and dataset generation.
- final_assignment/BoolGPT/ for boolean-expression language models, dataset generation, and finetuning experiments.
The final assignment also includes supporting analysis scripts, experiment outputs, and archived results from model variants. The arithmetic and boolean projects are the parts most worth reading first.
The Week_* folders contain the earlier weekly assignments and supporting files for the course.
The code is written in Python; the larger experiments use PyTorch, NumPy, pandas, matplotlib, and scikit-learn.
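
Before running the larger experiments, it may help to confirm that the libraries listed above are installed. The snippet below is a minimal sketch, not part of the repository: the `REQUIRED_MODULES` mapping and the `missing_packages` helper are hypothetical names introduced for illustration, and the only thing taken as given is each library's standard import name (PyTorch imports as `torch`, scikit-learn as `sklearn`).

```python
from importlib.util import find_spec

# Display name -> import name for the libraries named in this README.
# This mapping is an illustrative assumption, not a file from the repo.
REQUIRED_MODULES = {
    "PyTorch": "torch",
    "NumPy": "numpy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
}


def missing_packages(required=REQUIRED_MODULES):
    """Return display names of libraries whose top-level module is not found."""
    return [name for name, module in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All listed libraries are importable.")
```

The check uses `find_spec` rather than importing each package, so it only looks the modules up without loading them and stays fast even when PyTorch is installed.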