implement the med xpert qa text scenario in the medhelm by chakravarthik27 · Pull Request #19 · PacificAI/medhelm

chakravarthik27 · 2026-05-19T12:56:11Z

This pull request introduces the MedXpertQA Text benchmark to the codebase, enabling the evaluation of medical question answering capabilities in large language models. It includes the implementation of the scenario, integration into run specifications, and updates to the configuration and dependencies to support the new benchmark. The most important changes are summarized below:

New Scenario Implementation:

Added the MedXpertQATextScenario class in medxpert_qa_text_scenario.py, which loads and processes the MedXpertQA Text dataset from HuggingFace, structures instances for evaluation, and provides scenario metadata.
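As a rough illustration of what "structures instances for evaluation" can mean for a multiple-choice benchmark, here is a minimal, self-contained sketch. It does not use the framework's own classes or the real loader: `Reference`, `Instance`, `row_to_instance`, and the row layout (`question`, `options`, `label`) are stand-ins assumed for this example and are not taken from the PR's code.

```python
from dataclasses import dataclass, field
from typing import Dict, List

CORRECT_TAG = "correct"  # stand-in for a framework-level "correct answer" tag


@dataclass
class Reference:
    """One answer option; tagged when it is the gold answer."""
    text: str
    tags: List[str] = field(default_factory=list)


@dataclass
class Instance:
    """One evaluation item: a question plus its candidate answers."""
    question: str
    references: List[Reference]
    split: str


def row_to_instance(row: Dict, split: str = "test") -> Instance:
    """Turn one dataset row into a multiple-choice instance.

    Assumed (hypothetical) row layout:
    {"question": str, "options": {"A": str, ...}, "label": "A"}
    """
    references = [
        Reference(text=text, tags=[CORRECT_TAG] if letter == row["label"] else [])
        for letter, text in sorted(row["options"].items())  # keep A, B, C... order
    ]
    return Instance(question=row["question"], references=references, split=split)


# Toy row, invented for illustration only.
example_row = {
    "question": "Which option is marked correct?",
    "options": {"B": "second option", "A": "first option"},
    "label": "B",
}
example_instance = row_to_instance(example_row)
```

In the actual scenario the rows would come from the HuggingFace dataset rather than a literal dictionary; the sketch only shows the per-row mapping.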

Integration with Benchmarking Framework:

Registered a new run specification function get_medxpert_qa_text_spec() in medhelm_run_specs.py to define how the scenario should be run, including adapter and metric specs.
Updated schema_medhelm.yaml to add medxpert_qa_text to the list of run groups and provided its display name, description, metric groups, environment, and taxonomy information for the benchmark schema.
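The shape of such a run specification can be sketched as follows. This is a framework-free stand-in, not the code in medhelm_run_specs.py: `RunSpecSketch`, `get_medxpert_qa_text_spec_sketch`, and every adapter and metric value shown are assumptions chosen for illustration. Only the run group name `medxpert_qa_text` and the class name `MedXpertQATextScenario` come from the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RunSpecSketch:
    """Hypothetical stand-in bundling what a run spec ties together."""
    name: str
    scenario_class: str
    adapter: Dict[str, object]
    metrics: List[str]
    groups: List[str] = field(default_factory=list)


def get_medxpert_qa_text_spec_sketch() -> RunSpecSketch:
    """Describe how the scenario might be run (all settings are assumed)."""
    return RunSpecSketch(
        name="medxpert_qa_text",
        scenario_class="MedXpertQATextScenario",  # class named in the PR; module path omitted
        adapter={
            "method": "multiple_choice_joint",  # assumed adaptation method
            "max_train_instances": 0,           # assumed zero-shot setting
            "max_tokens": 1,                    # assumed: answer is a single option letter
        },
        metrics=["exact_match"],                # assumed metric
        groups=["medxpert_qa_text"],            # run group added to schema_medhelm.yaml
    )


spec = get_medxpert_qa_text_spec_sketch()
```

The point of the sketch is the linkage: the `groups` entry is what has to match the run group registered in the schema file for results to appear under the benchmark.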

Dependency and Build Updates:

Relaxed and aligned version constraints for several dependencies in pyproject.toml, such as datasets, numba, and together, and added tiktoken as a new dependency to support the new scenario.
Pinned the setuptools version below 82 for openai-whisper extra build dependencies to avoid build issues.

blidiselalin

Re-pin numba and together in requirements.txt (don't leave them fully unpinned).
Clarify whether tiktoken should be core vs. optional.
Fix the citation and description mismatch in schema_medhelm.yaml.

blidiselalin

Looks good

chakravarthik27 self-assigned this May 19, 2026

chakravarthik27 requested review from MiguelAFH, blidiselalin and iulianigas May 19, 2026 13:10

iulianigas approved these changes May 20, 2026

blidiselalin reviewed May 21, 2026

chakravarthik27 added 7 commits May 21, 2026 10:57

feat: implement MedXpert-QA-Text benchmark scenario and update dependencies

b7b2213

fix: update package specifications for numba and together; add openpyxl to dependencies

530ad78

feat: implement MedXpertQA Text scenario with detailed documentation and metadata

63cec53

fix: remove unused import for ensure_file_downloaded in MedXpertQATextScenario

e6df289

feat: enhance MedXpertQATextScenario metadata with display name and taxonomy details

781d2a6

feat: update MedXpertQATextScenario name and add tiktoken dependency

36811ac

feat: update MedXpertQA descriptions and requirements for improved clarity and accuracy

5477d69

chakravarthik27 force-pushed the PAL-1263-implement-the-med-xpert-qa-text-scenario-in-the-medhelm branch from e22d464 to 5477d69 Compare May 21, 2026 11:07

chakravarthik27 requested a review from blidiselalin May 21, 2026 13:27

feat: update display name for MedXpertQA Text scenario for consistency

9c755d0

blidiselalin approved these changes May 21, 2026

chakravarthik27 wants to merge 8 commits into main from PAL-1263-implement-the-med-xpert-qa-text-scenario-in-the-medhelm

3 participants
