SpaceBio-Bench

Mission-held-out transcriptomics benchmarks for evaluating whether AI/ML and foundation models generalize spaceflight biological signatures across missions, tissues, and model systems.

Former public name: GeneLab Benchmark. The v1-v7 historical benchmark surface keeps that name; SpaceBio-Bench is the forward-looking platform name.

Maintainer / citation author: JangKeun Kim, Weill Cornell Medicine.

Current public release note: v7.1.2 public-card/metadata patch over canonical v7.1 results. The patch updates documentation, public metadata, and access guidance; it does not introduce new benchmark result generation. Dataset freeze: 2026-03-01.

What This Is

SpaceBio-Bench evaluates a practical space-biology question:

If a model learns a transcriptomic spaceflight signature from one mission, can it recognize that signature in a different mission it has never seen?

The current public benchmark uses NASA Open Science Data Repository (OSDR) spaceflight transcriptomics, with emphasis on mouse multi-tissue bulk RNA-seq, mission-held-out validation, and transparent release boundaries.

Current Public Surfaces

Surface	Public status	Use it for	Entry point
v7.1 GeneLab Benchmark	Canonical historical result surface	v1-v7 results, public fold package, citation	Canonical results
Hugging Face dataset	Public processed fold package	Download selected LOMO feature matrices and result artifacts	HF dataset card
v9 public bulk	Metadata catalog	Task catalog, source inventory, and baseline summaries	v9 HF-style card

For linked methods, evaluation, and release-status notes, start with the SpaceBio-Bench public documentation map (docs/SPACEBIOBENCH_TRANSPARENCY_CARD_PACK.md). For machine-readable release status, see release/release_manifest.json.

Benchmark Design

Layer	Design choice
Core split	Leave-One-Mission-Out (LOMO); mission is the independence unit
Primary label	Flight vs. ground control in public spaceflight transcriptomics
Main data source	NASA OSDR mouse spaceflight RNA-seq
Feature surfaces	Gene expression, Hallmark pathways, KEGG pathways, combined pathway features
Model tracks	Classical ML, gene-expression foundation models, text LLMs, graph/network baselines
Evaluation	AUROC, bootstrap confidence interval, permutation p-value, task-specific diagnostics
Leakage guard	Fold-specific variance filtering is computed on training missions only
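
The core split and the leakage guard in the table above can be sketched as follows. This is a minimal illustration on toy data, not the repository's implementation: the helper names `lomo_folds` and `train_only_variance_filter`, the mission labels `M1`..`M3`, and the `top_k` cutoff are all assumptions made for the example.

```python
import numpy as np


def lomo_folds(missions):
    """Yield (held_out_mission, train_idx, test_idx), one fold per mission."""
    missions = np.asarray(missions)
    for held_out in sorted(set(missions.tolist())):
        test_mask = missions == held_out
        yield held_out, np.flatnonzero(~test_mask), np.flatnonzero(test_mask)


def train_only_variance_filter(X_train, X_test, top_k):
    """Keep the top_k highest-variance features, ranked on training rows only."""
    variances = X_train.var(axis=0)  # the held-out mission never enters this statistic
    keep = np.sort(np.argsort(variances)[::-1][:top_k])
    return X_train[:, keep], X_test[:, keep], keep


# Toy data: 12 samples, 6 features, 3 hypothetical missions.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 6))
missions = np.array(["M1"] * 4 + ["M2"] * 4 + ["M3"] * 4)

for held_out, train_idx, test_idx in lomo_folds(missions):
    X_tr, X_te, kept = train_only_variance_filter(X[train_idx], X[test_idx], top_k=3)
    print(held_out, X_tr.shape, X_te.shape, kept.tolist())
```

Because the filter is recomputed inside each fold, the selected feature set can differ between folds, which is the expected consequence of keeping the held-out mission out of feature selection.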

Scope At A Glance

Dimension	v1-v7 public benchmark surface
Tissues	8: liver, gastrocnemius, kidney, thymus, skin, eye, lung, colon
Public OSDR source catalog	24+ OSD accessions
Processed sample scope	600+ binary/control samples across release layers
v4 multi-method grid	8 tissues x 8 classifiers x 4 feature types = 256 evaluations
Model families	Classical ML, 4 gene-expression foundation models, 3 text LLMs
Public HF fold package	Selected reviewer-facing LOMO tasks with train/test matrices and metadata
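
The size of the v4 multi-method grid can be checked by enumerating it. The tissue names come from the table above. The classifier names are placeholders, and the feature-type labels assume that the four v4 feature types match the feature surfaces listed under Benchmark Design.

```python
from itertools import product

TISSUES = ["liver", "gastrocnemius", "kidney", "thymus", "skin", "eye", "lung", "colon"]
# Assumption: labels chosen for illustration, mirroring the listed feature surfaces.
FEATURE_TYPES = ["gene_expression", "hallmark", "kegg", "combined_pathway"]
# Placeholders: the eight classifiers are not named in this README.
CLASSIFIERS = [f"classifier_{i}" for i in range(1, 9)]

# One evaluation per (tissue, classifier, feature type) combination.
grid = list(product(TISSUES, CLASSIFIERS, FEATURE_TYPES))
print(len(grid))  # 8 * 8 * 4 = 256
```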

Headline Results

The v7.1 canonical result source is docs/CANONICAL_RESULTS_V7_1.md.

Result surface	Takeaway
Multi-method benchmark	PCA-LR is the strongest 8-tissue gene-level baseline in v4, with mean AUROC 0.776.
Best tissue rows	Thymus 0.948, colon 0.921, lung 0.901, kidney 0.829 across best method-feature combinations.
Cross-mission transfer	Thymus and gastrocnemius show the strongest mission-transfer signal; liver and kidney are harder.
Pathway features	Pathway representations rescue some weaker gene-level tissues, especially kidney and eye.
Foundation models	Tested gene-expression foundation models underperform tuned classical baselines on small-n bulk RNA-seq mission shift.
Held-out validation	Thymus RR-23 AUROC 0.905; skin RR-7 AUROC 0.885.
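
The Benchmark Design lists AUROC, a bootstrap confidence interval, and a permutation p-value as the evaluation statistics. The sketch below shows one common way to compute them with NumPy only. It is an illustration under stated assumptions (0/1 labels, a percentile bootstrap, a one-sided label-permutation test, and hypothetical function names), not the benchmark's evaluation code; the resampling counts are arbitrary defaults.

```python
import numpy as np


def auroc(y_true, scores):
    """AUROC from the Mann-Whitney U statistic; tied scores get average ranks."""
    y_true = np.asarray(y_true).astype(bool)  # assumes labels coded 0/1
    scores = np.asarray(scores, dtype=float)
    n_pos, n_neg = int(y_true.sum()), int((~y_true).sum())
    if n_pos == 0 or n_neg == 0:
        return float("nan")  # undefined when only one class is present
    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(len(scores), dtype=float)
    i = 0
    while i < len(scores):
        j = i
        while j + 1 < len(scores) and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0  # average 1-based rank of the tie group
        i = j + 1
    u = ranks[y_true].sum() - n_pos * (n_pos + 1) / 2.0
    return float(u / (n_pos * n_neg))


def bootstrap_ci(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval over resampled test samples."""
    rng = np.random.default_rng(seed)
    y_true, scores = np.asarray(y_true), np.asarray(scores)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), size=len(y_true))
        value = auroc(y_true[idx], scores[idx])
        if not np.isnan(value):  # skip resamples that contain a single class
            stats.append(value)
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


def permutation_p_value(y_true, scores, n_perm=1000, seed=0):
    """One-sided p-value: how often shuffled labels reach the observed AUROC."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    observed = auroc(y_true, scores)
    null = np.array([auroc(rng.permutation(y_true), scores) for _ in range(n_perm)])
    return float((1 + np.sum(null >= observed)) / (1 + n_perm))


# Toy predictions for ten test samples (illustrative values only).
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
s = np.array([0.1, 0.3, 0.2, 0.6, 0.4, 0.5, 0.7, 0.9, 0.8, 0.65])
print(auroc(y, s), bootstrap_ci(y, s), permutation_p_value(y, s))
```

With small held-out missions the bootstrap interval can be wide, so a point AUROC is best read together with its interval and p-value.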

The original long-form README content is preserved at docs/README_LONGFORM_V7_1_ARCHIVE_2026_06_15.md.

Quick Start

Load A Public Fold From Hugging Face

pip install -r requirements.txt huggingface_hub

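from huggingface_hub import hf_hub_download
import pandas as pd

# Public dataset repository and one LOMO fold.
# The fold name indicates RR-7 as the held-out test mission.
repo_id = "jang1563/genelab-benchmark"
fold = "A5_skin_lomo/fold_RR-7_test"

def hf_csv(name):
    """Download one CSV from the fold directory and load it, using the first column as the row index."""
    return pd.read_csv(
        hf_hub_download(
            repo_id=repo_id,
            filename=f"{fold}/{name}",
            repo_type="dataset",
        ),
        index_col=0,
    )

# Feature matrices load as DataFrames; labels are the first remaining column, as a Series.
train_X = hf_csv("train_X.csv")
train_y = hf_csv("train_y.csv").iloc[:, 0]
test_X = hf_csv("test_X.csv")
test_y = hf_csv("test_y.csv").iloc[:, 0]

print(f"Train: {train_X.shape}, Test: {test_X.shape}")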

Each public fold includes feature matrices, labels, sample metadata, fold_info.json, and selected_genes.txt.
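
Once a fold is loaded, a simple baseline can be scored on it. The sketch below fits a PCA plus logistic regression pipeline with scikit-learn and reports test AUROC. It is loosely modeled on the PCA-LR baseline named in Headline Results, but the function name `pca_lr_auroc`, the scaling step, the component count, and the solver settings are assumptions, not the benchmark's configuration. It also assumes binary labels and that the test matrix contains every training column. Synthetic data stands in for the downloaded fold so the example runs offline.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def pca_lr_auroc(train_X, train_y, test_X, test_y, n_components=10):
    """Fit scaler -> PCA -> logistic regression on the training fold; return test AUROC."""
    # PCA cannot use more components than min(n_samples, n_features).
    k = max(1, min(n_components, min(train_X.shape) - 1))
    model = make_pipeline(
        StandardScaler(),
        PCA(n_components=k),
        LogisticRegression(max_iter=1000),
    )
    model.fit(train_X, train_y)  # every fitted step sees training samples only
    test_X = test_X[train_X.columns]  # align column order with the training matrix
    scores = model.predict_proba(test_X)[:, 1]
    return float(roc_auc_score(test_y, scores))


def toy_fold(n_samples, rng, n_genes=20):
    """Synthetic stand-in for one fold: the first 5 'genes' carry the class signal."""
    y = np.array([0, 1] * (n_samples // 2))
    X = rng.normal(size=(n_samples, n_genes))
    X[y == 1, :5] += 2.0
    columns = [f"g{i}" for i in range(n_genes)]
    return pd.DataFrame(X, columns=columns), pd.Series(y)


rng = np.random.default_rng(0)
toy_train_X, toy_train_y = toy_fold(40, rng)
toy_test_X, toy_test_y = toy_fold(20, rng)
print(f"Toy AUROC: {pca_lr_auroc(toy_train_X, toy_train_y, toy_test_X, toy_test_y):.3f}")
```

To score the real fold, pass the `train_X`, `train_y`, `test_X`, and `test_y` objects from the loading snippet above in place of the toy data.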

Validate The Public Release Manifest

make release-qa
make hpc-public-qa
python3 scripts/validate_release_manifest.py
python3 -m unittest tests/test_release_manifest.py
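
For a quick look at the manifest outside the validation scripts, it can be loaded as ordinary JSON. The sketch below assumes nothing about the manifest schema: it only reports each top-level key and the JSON type of its value. The helper name `summarize_manifest` is hypothetical and is not part of the repository's scripts.

```python
import json
from pathlib import Path


def summarize_manifest(path):
    """Map each top-level key of a JSON manifest to the type name of its value."""
    manifest = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(manifest, dict):
        # Schema is not assumed: report the root type if it is not an object.
        return {"<root>": type(manifest).__name__}
    return {key: type(value).__name__ for key, value in manifest.items()}


if __name__ == "__main__":
    manifest_path = Path("release/release_manifest.json")
    if manifest_path.exists():
        for key, type_name in summarize_manifest(manifest_path).items():
            print(f"{key}: {type_name}")
    else:
        print(f"{manifest_path} not found; run this from the repository root.")
```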

Create a dry-run Hugging Face upload plan:

make hf-upload-plan HF_TASK=A5 HF_UPLOAD_PLAN=/tmp/spacebiobench_hf_upload_plan_A5.json

Reproduce From OSDR Inputs

Raw-data reproduction requires R/Bioconductor and task-specific preprocessing. See docs/r_dependencies.md and the scripts under scripts/.

Repository Map

tasks/                 Public v1 LOMO task inputs and selected fold packages
evaluation/            Historical v1 result JSON and summaries
v2/ ... v7/            Completed historical benchmark layers
v9/                    Public bulk metadata catalog and extension workspaces
docs/                  Cards, canonical results, methods, plans, and release notes
release/               Machine-readable public release manifest
scripts/               Data, evaluation, upload, validation, and figure scripts

Key Documents

Need	Document
Public result source	docs/CANONICAL_RESULTS_V7_1.md
Public documentation map	docs/SPACEBIOBENCH_TRANSPARENCY_CARD_PACK.md
System scope	docs/SPACEBIOBENCH_SYSTEM_CARD.md
Evaluation interpretation	docs/SPACEBIOBENCH_EVALUATION_CARD.md
Release status	docs/SPACEBIOBENCH_RELEASE_READINESS_CARD.md
Public statement guide	docs/SPACEBIOBENCH_CLAIM_REGISTER.md
Hugging Face dataset card source	docs/hf_dataset_card.md
v9 metadata catalog card source	docs/v9_hf_dataset_card.md
Contributing and submissions	CONTRIBUTING.md and docs/submission_format.md
Machine-readable release state	release/release_manifest.json

Data And Release Notes

All source data is derived from publicly available NASA OSDR resources. Code is MIT licensed. The processed public dataset package follows the license declared in the Hugging Face dataset card; upstream OSDR datasets should be cited and used under their individual terms.

Release labels are intentionally separated:

v7.1: canonical historical result surface and citation target.
v9 public bulk: metadata catalog and baseline-summary surface.

Contributing

Use CONTRIBUTING.md for documentation fixes, data-access reports, reproducibility issues, and public benchmark submissions. Prediction submissions should follow docs/submission_format.md.

Citation

Please cite the software using CITATION.cff. GitHub renders the same citation metadata in the repository citation panel.

@dataset{kim2026genelab,
  title = {SpaceBio-Bench / GeneLab Benchmark: Mission-Held-Out Spaceflight Transcriptomics Benchmark},
  author = {Kim, JangKeun},
  year = {2026},
  url = {https://huggingface.co/datasets/jang1563/genelab-benchmark},
  note = {v7.1.2 documentation, public-card, and metadata patch over canonical v7.1 results; data freeze 2026-03-01}
}
