GitHub - mohsen-goodarzi/Gaokerena-R: Enhancing Reasoning Skills in Small Persian Medical Language Models Can Outperform Large-Scale Data Training

📒 Table of Contents

📍 Overview
🏃 Training process
📊 Results
⚠️ Risks and Limitations
⛔️ License
🤝 Collaborators

📍 Overview

We present gaokerena-R, a model trained with a limited-data approach to enhance the Persian medical reasoning capabilities of the aya-expanse-8b model. Despite using less data, gaokerena-R (paired with an aya-expanse-8b verifier) outperforms our previous model, gaokerena-V, which was trained on a much larger dataset, on most of the benchmarks reported under Results, including the MMLU average and IBMSEE Sept 2023. This demonstrates the effectiveness of our reasoning-focused training strategy under data-constrained conditions.

🏃 Training process

📊 Results

Benchmark	gaokerena-R + aya-expanse-8b (verifier)	gaokerena-V	aya-expanse-8b
MMLU-anatomy(fa)	47.40	48.14	40.74
MMLU-medicalgenetics(fa)	56.0	53.0	49.0
MMLU-collegemedicine(fa)	50.28	43.93	44.51
MMLU-clinicalknowledge(fa)	58.86	55.47	52.07
MMLU-professionalmedicine(fa)	48.89	47.05	45.58
MMLU-collegebiology(fa)	54.86	47.22	45.14
MMLU(avg)	52.98	49.31	46.64
IBMSEE Sept 2023	46.42	38.69	34.52
Prompt	CoT for the main model, straight for the verifier	Straight	Straight
Inference time	$\approx 5 \times 35 + 8 = 183$ s	$\approx 10$ s	$\approx 10$ s
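
The README does not say how the MMLU(avg) row is computed. The sketch below is one possible reading, not the authors' evaluation code: it compares a simple mean of the six subset scores with a mean weighted by question count. The subset sizes are an assumption (the test-split sizes of the standard English MMLU subjects), and `SCORES`, `SUBSET_SIZES`, `simple_mean`, and `weighted_mean` are names introduced here for illustration only.

```python
# Per-subset accuracies (percent) copied from the Results table, in the order:
# anatomy, medical genetics, college medicine, clinical knowledge,
# professional medicine, college biology.
SCORES = {
    "gaokerena-R + verifier": [47.40, 56.0, 50.28, 58.86, 48.89, 54.86],
    "gaokerena-V": [48.14, 53.0, 43.93, 55.47, 47.05, 47.22],
    "aya-expanse-8b": [40.74, 49.0, 44.51, 52.07, 45.58, 45.14],
}

# ASSUMPTION: question counts of the standard English MMLU test split,
# in the same subset order as above. The Persian version may differ.
SUBSET_SIZES = [135, 100, 173, 265, 272, 144]


def simple_mean(scores):
    """Unweighted average: every subset counts equally."""
    return sum(scores) / len(scores)


def weighted_mean(scores, sizes):
    """Average weighted by question count: every question counts equally."""
    if len(scores) != len(sizes):
        raise ValueError("scores and sizes must have the same length")
    return sum(s * n for s, n in zip(scores, sizes)) / sum(sizes)


if __name__ == "__main__":
    for name, scores in SCORES.items():
        print(
            f"{name}: simple={simple_mean(scores):.2f}, "
            f"weighted={weighted_mean(scores, SUBSET_SIZES):.2f}"
        )
```

Under these assumed sizes, the weighted mean reproduces the reported 49.31 for gaokerena-V and comes within rounding of the reported 46.64 for aya-expanse-8b. For the gaokerena-R column it gives roughly 52.79 rather than 52.98, so the authors' exact averaging procedure may differ from this sketch.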

⚠️ Risks and Limitations

While Gaokerena aims to provide relatively accurate information, it is not a substitute for professional medical advice. The model may have limitations in:

Handling medical emergencies.
Addressing highly specialized or rare medical conditions.
Offering region-specific guidance, as the training data does not include localized Persian medical practices.

⛔️ License

CC BY-NC-SA 4.0 (non-commercial use only)

🤝 Collaborators

Mehrdad Ghassabi
Sadra Hakim
Dr. Hamid Reza Baradaran Kashani
Pedram Rostami
