sarahlmk


alpaca_eval Public

Forked from tatsu-lab/alpaca_eval

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

Jupyter Notebook

FastChat Public

Forked from lm-sys/FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python

lm-evaluation-harness Public

Forked from EleutherAI/lm-evaluation-harness

A framework for few-shot evaluation of language models.

Python

portuguese-llm-bench Public

Unified evaluation suite for Portuguese LLMs — covering language understanding, reasoning, safety, and toxicity benchmarks.

Python