meaoww/gap-k

Gap-K%: Measuring Top-1 Prediction Gap for Detecting Pretraining Data

arXiv Website | Project Page

🔍 Overview

This repository provides the official implementation of Gap-K%: Measuring Top-1 Prediction Gap for Detecting Pretraining Data (ACL'26). It includes experimental code for Gap-K% along with several baseline methods on the WikiMIA and MIMIR datasets. For the Neighbor baseline experiments on the MIMIR benchmark, we use the implementation provided here: https://github.com/zjysteven/mimir

⚙️ Environment

Our experiments are conducted under the following environment:

  • Python 3.10
  • PyTorch 2.7.1
  • CUDA 12.6

After setting up PyTorch, install the remaining dependencies:

pip install -r requirements.txt
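Before launching long jobs, it can help to confirm that the active environment matches the versions listed above. The following is a minimal sketch, not part of the repository: the helper name `describe_environment` is hypothetical, and it only reports what it finds rather than enforcing anything.

```python
import sys


def describe_environment():
    """Report Python, PyTorch, and CUDA versions (hypothetical helper)."""
    info = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "torch": None,
        "cuda": None,
    }
    try:
        import torch  # imported lazily so the check still runs without PyTorch

        info["torch"] = torch.__version__
        info["cuda"] = torch.version.cuda  # None on CPU-only builds
    except ImportError:
        pass
    return info


if __name__ == "__main__":
    # Versions reported in this README: Python 3.10, PyTorch 2.7.1, CUDA 12.6
    for name, version in describe_environment().items():
        print(f"{name}: {version}")
```

If the reported versions differ from the ones listed, results may still reproduce, but that is untested here.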

🔐 Hugging Face Access

Please log in to Hugging Face before running the scripts:

huggingface-cli login

📁 Dataset

We conduct experiments on the WikiMIA and MIMIR datasets.

🤖 Models

We conduct WikiMIA experiments on a diverse set of large language models.

Note: LLaMA-65B is evaluated using INT8 inference.

For MIMIR experiments, we use Pythia 160M, 1.4B, 2.8B, 6.9B, and 12B.

🚀 Running

We provide SLURM job scripts to run all experiments:

  • wikimia.sh
    Evaluates Loss, Zlib, Min-K%, Min-K%++, and Gap-K% on WikiMIA.

  • wikimia_neighbor.sh
    Evaluates Neighbor on WikiMIA.

  • mimir.sh
    Evaluates Loss, Zlib, Min-K%, Min-K%++, and Gap-K% on MIMIR.

After running the scripts, results will be saved to:

results/
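For readers unfamiliar with the baselines named in the script descriptions above, here is a rough sketch of how three of them (Loss, Zlib, and Min-K%) are commonly computed from per-token log-probabilities. It is an illustration under stated assumptions, not the code in this repository: the function names and the default `ratio` are hypothetical, the log-probabilities are assumed to come from a separate forward pass of the target model, and the sign convention (higher score suggests the text was seen in pretraining) is an assumption. Min-K%++ and Gap-K% are left out because this README does not define them.

```python
import zlib


def loss_score(token_log_probs):
    """Mean token log-likelihood, i.e. the negative of the language-model loss."""
    return sum(token_log_probs) / len(token_log_probs)


def zlib_score(text, token_log_probs):
    """Mean log-likelihood normalized by the zlib-compressed size of the text."""
    compressed_size = len(zlib.compress(text.encode("utf-8")))
    return loss_score(token_log_probs) / compressed_size


def min_k_score(token_log_probs, ratio=0.2):
    """Average of the lowest `ratio` fraction of token log-probabilities."""
    k = max(1, int(len(token_log_probs) * ratio))  # keep at least one token
    lowest = sorted(token_log_probs)[:k]
    return sum(lowest) / k


if __name__ == "__main__":
    # Made-up log-probabilities for a five-token text, for illustration only.
    text = "the quick brown fox jumps"
    log_probs = [-0.1, -2.0, -0.5, -4.0, -0.2]
    print("loss :", loss_score(log_probs))
    print("zlib :", zlib_score(text, log_probs))
    print("min-k:", min_k_score(log_probs, ratio=0.4))
```

With `ratio=0.4` on five tokens, Min-K% averages the two least likely tokens, so a single surprising token affects the score far more than it affects the plain mean.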

🙏 Acknowledgement

This implementation is based on the official codebase of Min-K%++.

About

[ACL'26] Gap-K%: Measuring Top-1 Prediction Gap for Detecting Pretraining Data
