This repo contains post-training quantization methods applied to two models: GPT-2 and Llama-3.2-1B. For GPT-2 we use the standard method of calculating perplexity over a held-out corpus. This method requires more resources for a larger model like Llama-3.2-1B, so for that model we instead generate text and measure perplexity on the generated text.
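Both evaluation paths reduce to the same metric: perplexity is the exponential of the average negative log-likelihood per token. A minimal sketch of that computation (the function name and example probabilities are illustrative, not taken from this repo):

```python
import math

def perplexity(token_log_probs):
    """Compute perplexity from a list of per-token log-probabilities.

    Perplexity = exp(mean negative log-likelihood), so lower is better.
    """
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# Example: four tokens, each assigned probability 0.25 by the model.
# A uniform 4-way choice yields perplexity exactly 4.0.
logps = [math.log(0.25)] * 4
print(perplexity(logps))  # -> 4.0
```

In practice the per-token log-probabilities come from the model's cross-entropy loss over the evaluation text; for long corpora a sliding window over the model's context length is typically used, which is what makes the standard approach memory-hungry for larger models.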