This repo contains post-training quantization methods applied to two models: GPT-2 and Llama-3.2-1B. For GPT-2 we use the standard method of calculating perplexity over a held-out corpus. This method requires more resources for a larger model like Llama-3.2-1B, so for that model we instead generate text and measure perplexity on the generated text.
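Both evaluation paths reduce to the same metric: perplexity is the exponential of the average negative log-likelihood per token. A minimal sketch of that computation (the function name and example probabilities are illustrative, not taken from this repo):

```python
import math

def perplexity(token_log_probs):
    """Compute perplexity from a list of per-token log-probabilities.

    Perplexity = exp(mean negative log-likelihood), so lower is better.
    """
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# Example: four tokens, each assigned probability 0.25 by the model.
# A uniform 4-way choice yields perplexity exactly 4.0.
logps = [math.log(0.25)] * 4
print(perplexity(logps))  # -> 4.0
```

In practice the per-token log-probabilities come from the model's cross-entropy loss over the evaluation text; for long corpora a sliding window over the model's context length is typically used, which is what makes the standard approach memory-hungry for larger models.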