Adit31/Large-Language-Models-Independent-Study

Large Language Models Independent Study

The reports for this quarter and the previous quarter are uploaded as a single document named "Independent_Study_Compilation.pdf".

All the code files related to the model are in the "SumGenToBT" folder. Because of the large model size, it was not possible to upload the checkpoints and the final model weights to GitHub. However, those files are present on the VM and can be shared when required.

"SumGenToBT/sumgen/base_finetuned" contains all the output files generated by the model after training. The model was trained using the "run.sh" file in the "sumgen" folder, and that is the point at which I got stuck. Ideally, to get it working we first need to perform Java-to-English translation, send the results for processing (binarization), and feed those outputs into the model for English-to-Python translation. The code does this implicitly somewhere, but I was not able to find where, which is why this part had me stymied. Another thought was to just use the evaluation code, but the evaluation code uses back-translation and its input is prepared accordingly. A deeper understanding of what happens behind the scenes is required to pinpoint exactly what we should add to our explicit call.
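
The explicit three-stage call described above could be laid out roughly as follows. This is only a minimal sketch, assuming each stage can be wrapped as a plain Python function. The names `translate_java_to_python`, `fake_java_to_english`, `fake_binarize`, and `fake_english_to_python` are hypothetical placeholders and are not taken from the SumGenToBT code. The real stages would have to be located in the repository and substituted in.

```python
from typing import Any, Callable, Dict, List


def translate_java_to_python(
    java_sources: List[str],
    java_to_english: Callable[[List[str]], List[str]],
    binarize: Callable[[List[str]], Any],
    english_to_python: Callable[[Any], List[str]],
) -> Dict[str, Any]:
    """Run the three stages explicitly and keep every intermediate result."""
    summaries = java_to_english(java_sources)    # stage 1: Java -> English
    binarized = binarize(summaries)              # stage 2: processing (binarization)
    python_code = english_to_python(binarized)   # stage 3: English -> Python
    return {
        "summaries": summaries,
        "binarized": binarized,
        "python": python_code,
    }


# Hypothetical stand-in stages so the sketch runs end to end.
# They do NOT reflect what the real model or preprocessing does.
def fake_java_to_english(sources: List[str]) -> List[str]:
    return [f"summary of: {s}" for s in sources]


def fake_binarize(summaries: List[str]) -> List[bytes]:
    return [s.encode("utf-8") for s in summaries]


def fake_english_to_python(binarized: List[bytes]) -> List[str]:
    return [f"# {b.decode('utf-8')}" for b in binarized]


result = translate_java_to_python(
    ["int add(int a, int b) { return a + b; }"],
    fake_java_to_english,
    fake_binarize,
    fake_english_to_python,
)
print(result["python"][0])
```

Returning the intermediate results alongside the final output makes it possible to inspect what each stage hands to the next, which is the part that is currently hidden inside the implicit call.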

About

Theoretical and Practical results of research on Large Language Models (focused towards GPT)
