connerlambden/refute-inspect

refute-inspect

An Inspect AI adapter for REFUTE: judge-free tasks for scientific critique and epistemic calibration on summaries of recent science papers.

Tasks

| Task | Description |
| --- | --- |
| refute_forced_choice | Pick the more flawed of twin summaries (contamination-proof, chance 50%) |
| refute_soundness | Binary sound/flawed classification |
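
Because both tasks have a fixed gold label, scoring needs no judge model: a prediction either matches the label or it does not. The sketch below is not the adapter's own scorer. It is a minimal illustration, assuming hypothetical "A"/"B" labels for the forced-choice task, of how an accuracy could be compared against the 50% chance baseline with a one-sided binomial tail. The function names and the sample labels are made up for the example.

```python
from math import comb


def accuracy(predictions, answers):
    """Fraction of items where the predicted label equals the gold label."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("need at least one item")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


def p_value_vs_chance(n_correct, n_items, chance=0.5):
    """One-sided binomial tail: P(X >= n_correct) if the model were guessing."""
    return sum(
        comb(n_items, k) * chance**k * (1 - chance) ** (n_items - k)
        for k in range(n_correct, n_items + 1)
    )


# Hypothetical labels: "A" or "B" marks which twin summary is more flawed.
gold = ["A", "B", "B", "A", "A", "B", "A", "B"]
pred = ["A", "B", "A", "A", "A", "B", "A", "B"]

acc = accuracy(pred, gold)
n_correct = sum(x == y for x, y in zip(pred, gold))
p = p_value_vs_chance(n_correct, len(gold))
print(f"accuracy={acc:.3f}, p-value vs. 50% chance={p:.4f}")
```

With only a handful of items, a high accuracy can still be consistent with guessing, so the tail probability is worth reporting next to the raw score.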

Install

pip install inspect-ai datasets
git clone https://github.com/connerlambden/refute-inspect.git
cd refute-inspect && pip install -e .

Run

inspect eval src/refute_inspect/refute_inspect.py@refute_forced_choice --model openai/gpt-4o
inspect eval src/refute_inspect/refute_inspect.py@refute_soundness --model openai/gpt-4o
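
To run both tasks in one go, the two commands above can be wrapped in a small script. This is a convenience sketch rather than part of the package: `build_eval_command` and `run_all` are hypothetical helper names, and only the `inspect eval <file>@<task> --model <model>` form shown above is used. It defaults to a dry run that prints the commands without executing them.

```python
import subprocess

# Task file and task names as given in the Run section.
TASK_FILE = "src/refute_inspect/refute_inspect.py"
TASKS = ["refute_forced_choice", "refute_soundness"]


def build_eval_command(task, model):
    """Return the argv list for one `inspect eval` invocation."""
    return ["inspect", "eval", f"{TASK_FILE}@{task}", "--model", model]


def run_all(model, dry_run=True):
    """Print (dry run) or execute the eval command for every task."""
    commands = [build_eval_command(task, model) for task in TASKS]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if the eval exits non-zero.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_all("openai/gpt-4o")
```

Call `run_all("openai/gpt-4o", dry_run=False)` from the repository root to actually launch the evaluations.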

The dataset is loaded from the Hugging Face repository BGPT-OFFICIAL/refute (config refute_soundness); the revision is pinned at runtime.
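
To inspect the data outside of Inspect, the same repository and config can be loaded directly with the `datasets` library. The sketch below is an assumption about how one might do this, not the adapter's own loading code: `load_kwargs` and `load_refute` are hypothetical helpers, and the revision the adapter actually pins is not stated here, so the caller must supply one.

```python
DATASET_PATH = "BGPT-OFFICIAL/refute"
DATASET_CONFIG = "refute_soundness"


def load_kwargs(revision):
    """Build keyword arguments for datasets.load_dataset with an explicit pin."""
    if not revision:
        raise ValueError("pass an explicit revision (commit hash, tag, or branch)")
    return {"path": DATASET_PATH, "name": DATASET_CONFIG, "revision": revision}


def load_refute(revision):
    """Download the dataset at the given revision (requires network access)."""
    # Imported lazily so the helper above works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(**load_kwargs(revision))


# Example (placeholder revision, not the one the adapter pins):
# ds = load_refute("main")
# print(ds)
```

Pinning a commit hash rather than a branch name keeps results comparable if the dataset is updated later.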

Links

Hub integrator index

See also the dataset INTEGRATORS.md for all registration links.
