theislab/ReconEval

ReconEval

Benchmark for gene expression reconstruction from single-cell latent representations, covering observational and perturbational tasks.

[Figure: ReconEval benchmark overview]

Fig 1. (a) Reconstructing latent cell representations. (b) Latent space modeling under various conditions. (c) Two reconstruction schemes: stand-alone reconstruction (end-to-end & foundation-model) and latent-shift reconstruction (perturbation prediction). (d) Experiment space spans three datasets, three out-of-distribution levels and four hyperparameter axes. (e) Three metric families: statistical, biological, perturbational.

Documentation

Full documentation, the API reference and rendered tutorials are available at reconeval.readthedocs.io.

What ReconEval evaluates

Latent representations

  • End-to-end: PCA, AE, scVI, nlscVI, mlscVI across latent dims {10, 32, 128, 512, 2048} and library size handling (None, Modeled, Observed).
  • Foundation model embeddings: SE from STATE (2058-d), scGPT (512-d), scConcept (512-d), SCimilarity (128-d)

Decoders

  • MLP, Transformer, KNN

Datasets

Out-of-distribution levels: three levels of splitting, by cell type / cell line, by perturbation, and by condition.
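As a rough illustration of what a group-wise out-of-distribution split looks like, the sketch below holds out every cell whose label falls in a chosen set, so the held-out groups never appear in training. The function name `holdout_split` and the toy labels are hypothetical and are not taken from the ReconEval code; the benchmark's actual split definitions live in its experiment configs.

```python
import numpy as np


def holdout_split(labels, held_out):
    """Return (train_idx, test_idx) with all cells of the held-out groups in test.

    `labels` is one label per cell (for example cell type, perturbation or
    condition); `held_out` is the set of label values to exclude from training.
    Hypothetical helper for illustration only.
    """
    labels = np.asarray(labels)
    test_mask = np.isin(labels, list(held_out))
    return np.flatnonzero(~test_mask), np.flatnonzero(test_mask)


# Toy example: hold out all B cells.
cell_type = np.array(["B", "T", "NK", "B", "T", "Mono"])
train_idx, test_idx = holdout_split(cell_type, held_out={"B"})
```

The same helper applies unchanged to a perturbation or condition column; only the label array differs.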

Metric families — see Computing metrics on your own data below for the API.

  • Statistical — R², MMD-RBF, energy distance
  • Biological — DEG recovery, coexpression structure, cell-cycle composition, cytokine response, pathway activity
  • Perturbational — KNN purity
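To make the statistical family concrete, here is a minimal NumPy sketch of the three quantities on a pair of dense cells × genes matrices. It is not the ReconEval API: the function names, the choice to compute R² on per-gene mean expression, the default kernel bandwidth and the use of biased (V-statistic) estimators are all assumptions made for illustration.

```python
import numpy as np


def _sq_dists(a, b):
    """Pairwise squared Euclidean distances between rows of a and rows of b."""
    d = (a**2).sum(axis=1)[:, None] + (b**2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.maximum(d, 0.0)  # clip small negatives from floating-point error


def r2_mean_expression(x_true, x_recon):
    """R² between per-gene means of true and reconstructed cells (assumed definition)."""
    mu_true, mu_recon = x_true.mean(axis=0), x_recon.mean(axis=0)
    ss_res = np.sum((mu_true - mu_recon) ** 2)
    ss_tot = np.sum((mu_true - mu_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def mmd_rbf(x, y, gamma=None):
    """Biased estimate of squared MMD with an RBF kernel exp(-gamma * ||a - b||²)."""
    if gamma is None:
        gamma = 1.0 / x.shape[1]  # assumed default bandwidth: 1 / n_features
    k_xx = np.exp(-gamma * _sq_dists(x, x)).mean()
    k_yy = np.exp(-gamma * _sq_dists(y, y)).mean()
    k_xy = np.exp(-gamma * _sq_dists(x, y)).mean()
    return k_xx + k_yy - 2.0 * k_xy


def energy_distance(x, y):
    """Biased estimate of the energy distance 2 E|X-Y| - E|X-X'| - E|Y-Y'|."""
    d_xy = np.sqrt(_sq_dists(x, y)).mean()
    d_xx = np.sqrt(_sq_dists(x, x)).mean()
    d_yy = np.sqrt(_sq_dists(y, y)).mean()
    return 2.0 * d_xy - d_xx - d_yy
```

With an AnnData pair, the inputs would be the dense expression matrices of the true and reconstructed objects. Both distribution distances are zero for identical inputs and grow as the reconstructed cells drift away from the true ones.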

Tutorials

The metrics notebook walks through each metric on a single (true, reconstructed) AnnData pair, then shows the rank-percentile aggregation used to compare methods. The same API applies to all three benchmark settings in Fig 1c.
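One plausible form of a rank-percentile aggregation is sketched below with pandas: each metric is converted to a percentile rank across methods, oriented so that 1.0 is always best, and the percentiles are then averaged per method. The column names, the toy scores, the `higher_is_better` mapping and the unweighted mean are assumptions; the notebook is the reference for the exact recipe.

```python
import pandas as pd

# Toy metric table: one row per method, one column per metric (made-up values).
scores = pd.DataFrame(
    {"r2": [0.90, 0.80, 0.70], "mmd_rbf": [0.10, 0.30, 0.20]},
    index=["method_a", "method_b", "method_c"],
)

# Direction of each metric: True if larger values mean better reconstruction.
higher_is_better = {"r2": True, "mmd_rbf": False}


def rank_percentile(scores, higher_is_better):
    """Average per-metric percentile ranks so that 1.0 is the best possible score."""
    pct = pd.DataFrame(index=scores.index)
    for metric, hib in higher_is_better.items():
        # ascending=True gives the largest value the top percentile;
        # ascending=False gives the smallest value the top percentile.
        pct[metric] = scores[metric].rank(pct=True, ascending=hib)
    return pct.mean(axis=1).sort_values(ascending=False)


summary = rank_percentile(scores, higher_is_better)
```

Working on ranks rather than raw values keeps metrics with very different scales (for example R² and MMD) from dominating the aggregate.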

The analysis notebooks under Reproducibility run the same recipe against the cached paper artefacts.

Experiments

YAML configs and SLURM submission scripts for each benchmark setting are in experiments/, organised by task:

| Folder | What it contains |
| --- | --- |
| experiments/preprocessing/ | PBMC / LuCA / Tahoe data-preparation scripts. |
| experiments/01_end_to_end/ | PCA / AE / scVI / nlscVI / mlscVI end-to-end reconstruction. |
| experiments/02_foundation_model/ | FM (SE, scGPT, scConcept, SCimilarity) embed + decoder train. |
| experiments/03_latent_shift/ | CellFlow / STATE latent-shift reconstruction. |

Each task has its own configs/, codes/ and submit/ tree (Hydra configs, Python drivers, sbatch wrappers, eval scripts). See each task's README.md for env, data and CLI override notes.

Reproducibility

Three notebooks under analysis/data/plots/ reproduce the paper's figures from cached metric CSVs and lookup tables hosted on huggingface.co/datasets/theislab/ReconEval. Download those into analysis/frozen/; the notebooks write SVGs to analysis/figs/figN/. No model is retrained.

Run them from analysis/data/plots/ so the relative paths ../frozen/ and ../figs/ resolve.
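The download step can be scripted with the `huggingface_hub` client, as in the sketch below. `snapshot_download` is part of that library; the wrapper name `download_frozen_artifacts` is hypothetical, and the sketch assumes the whole dataset repository should be mirrored into analysis/frozen/ when run from the repository root.

```python
from pathlib import Path


def download_frozen_artifacts(local_dir="analysis/frozen"):
    """Mirror the ReconEval dataset repo from the Hugging Face Hub into local_dir.

    Hypothetical helper; assumes every file in the dataset repo is wanted.
    """
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target = Path(local_dir)
    target.mkdir(parents=True, exist_ok=True)
    return snapshot_download(
        repo_id="theislab/ReconEval",
        repo_type="dataset",  # the artefacts are hosted as a dataset repo
        local_dir=str(target),
    )


# Example call (requires network access):
# download_frozen_artifacts()
```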

| Setting (Fig 1c) | Notebook | Figures produced |
| --- | --- | --- |
| End-to-end reconstruction (PCA / AE / VAE) | analysis/data/plots/fig2_clean.ipynb | Fig 2 (qualitative + summary + scaling) |
| Foundation-model reconstruction (frozen FM + decoder) | analysis/data/plots/fig3_clean.ipynb | Fig 3 (FM × decoder × metrics panels) |
| Latent-shift reconstruction (CellFlow + STATE) | analysis/data/plots/fig4_clean.ipynb | Fig 4 (ST/CF scaling + B-cell spotlight) |

Data availability

Paper

Preprint: TBD

Citation

TBD — will be added when the preprint is available.

License

MIT — see LICENSE.
