"Multi-cohort TBI research fails at harmonization, not at modeling. Without standardized variable definitions, even large datasets produce incomparable results."
| Attribute | Value |
|---|---|
| Status | Incubating |
| Maturity | Design Phase |
| License | Apache-2.0 |
| Part of | Evidence Commons |
| Mission Pillar | Pillar 2 (UQ & Conformal) + Pillar 1 (Evaluation) -- cross-cutting |
NINDS-CDE-Tools is intended to provide developer-facing utilities for validating, transforming, and mapping clinical datasets to schemas that comply with the NINDS Common Data Elements (CDEs). The NINDS CDE program standardizes data collection across TBI clinical studies; version 3.0 reclassifies biomarkers under "Neurodiagnostic Technologies: Biomarkers." Consistent CDE compliance is a prerequisite for reproducible multi-cohort TBI research, yet no open-source tooling exists to validate datasets against CDE definitions or to harmonize variables across studies programmatically.
The core ontology that will power these tools already exists in the private evidenceos-research/evidenceos-ontology package as 13 YAML schema files totaling 7,780 lines. The parent codebase also contains a loader registry (991 LOC) with 247+ CDE mappings across 8 datasets. This repository is intended to hold the extracted, standalone tooling layer: a CLI and Python library that wrap these schemas for community use. The schemas exist; none of the planned tooling does yet.
| Component | Description | Parent Code Exists | This Repo |
|---|---|---|---|
| CDE Validation CLI | Validate dataset columns, value ranges, and coding against NINDS CDE definitions | Schemas: 13 YAMLs, 7,780 lines | Not started |
| Mapping Utilities | Transform non-CDE datasets to CDE-compliant format | Loader registry: 991 LOC, 247+ mappings | Not started |
| Dataset Harmonization | Align variables across multi-cohort studies (IMPACT, CRASH, CENTER-TBI) to shared schema | Partial (loader handles per-dataset) | Not started |
| CDE Browser | Searchable catalog of CDEs with definitions and value sets | Not built | Not started |
| FHIR Mapping | Bridge CDE definitions to FHIR resources for health system interoperability | Not built | Not started |
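The Mapping Utilities and Dataset Harmonization rows both come down to renaming source columns and recoding values into one shared schema. No such code exists in this repository yet, so the following is only a minimal sketch of that idea. The cohort names (`cohort_a`, `cohort_b`), column names, value codes, and the `harmonize` function are all hypothetical and are not taken from the loader registry or from any real dataset.

```python
from typing import Dict, Optional, Tuple

# Hypothetical mappings: source column -> (shared column, optional value recode).
# Real mappings would come from the loader registry, not from this table.
Mapping = Dict[str, Tuple[str, Optional[Dict[str, str]]]]

MAPPINGS: Dict[str, Mapping] = {
    "cohort_a": {
        "GCS": ("gcs_total", None),
        "Sex": ("sex", {"1": "male", "2": "female"}),
    },
    "cohort_b": {
        "glasgow_total": ("gcs_total", None),
        "gender": ("sex", {"M": "male", "F": "female"}),
    },
}


def harmonize(row: Dict[str, str], dataset: str) -> Dict[str, str]:
    """Rename and recode one row from `dataset` into the shared schema."""
    mapping = MAPPINGS[dataset]
    out: Dict[str, str] = {}
    for source, value in row.items():
        if source not in mapping:
            continue  # unmapped columns are dropped in this sketch
        target, recode = mapping[source]
        # Unknown codes pass through unchanged so a later validation
        # step can flag them instead of silently losing data.
        out[target] = recode.get(value, value) if recode else value
    return out


if __name__ == "__main__":
    print(harmonize({"GCS": "13", "Sex": "2", "site": "x"}, "cohort_a"))
    print(harmonize({"glasgow_total": "13", "gender": "F"}, "cohort_b"))
```

Both example rows end up with the same keys and the same coding, which is the property a cross-cohort analysis depends on. Keeping the mappings as data rather than code is one possible design choice; it would let each dataset's mapping be reviewed independently of the transformation logic.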
What exists in the parent codebase (evidenceos-research/evidenceos-ontology):
- 13 YAML schema files (7,780 lines total):
  `cbim_framework`, `clinical_entities`, `endophenotype_atn`, `imaging_cdes`, `implementation_science`, `knowledge_translation`, `methodology_extraction`, `provenance`, `temporal_phases`, `universal_schema`, `lsr_to_modeling_pillar`, `tbi_modeling_taxonomy`, `living_systematic_review`
- Loader registry: 991 LOC, 8 datasets, 247+ CDE mappings
What does not exist yet:
- CDE validation CLI
- Mapping utilities as standalone tools
- Cross-cohort harmonization helpers
- Searchable CDE browser
- CDE-to-FHIR mapping layer
- Any standalone code in this repository
- Extract CDE schema definitions from ontology YAML files into a standalone, documented format
- Build validation CLI that checks datasets against extracted schemas (column presence, value ranges, coding)
- Implement mapping layer for common input formats (REDCap exports, CSV, FHIR Questionnaire)
- Add harmonization utilities tested against known multi-cohort alignment challenges
- Build searchable CDE catalog with definitions, value sets, and cross-references
- Publish as standalone Python package (PyPI) with no dependency on the private research codebase
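
The three checks named for the validation CLI (column presence, value ranges, coding) can be sketched as follows. This is an illustration under stated assumptions, not the planned implementation: the `CdeRule` class, the `validate_rows` function, the rule set, and the pupil reactivity codes are hypothetical. Real rules would be extracted from the ontology YAML files. Rows are plain string dictionaries, as `csv.DictReader` would produce.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class CdeRule:
    """Hypothetical, simplified stand-in for one extracted CDE definition."""
    name: str
    required: bool = True
    min_value: Optional[float] = None
    max_value: Optional[float] = None
    allowed_codes: Optional[FrozenSet[str]] = None


# Illustrative rules only; the codes for pupil_reactivity are made up.
EXAMPLE_RULES = [
    CdeRule("gcs_total", min_value=3, max_value=15),
    CdeRule("age_years", min_value=0, max_value=120),
    CdeRule("pupil_reactivity", allowed_codes=frozenset({"both", "one", "none"})),
]


def validate_rows(rows: List[Dict[str, str]], rules: List[CdeRule]) -> List[str]:
    """Return human-readable problems; an empty list means all checks passed."""
    problems: List[str] = []
    columns = {key for row in rows for key in row}
    for rule in rules:
        # Check 1: column presence.
        if rule.name not in columns:
            if rule.required:
                problems.append(f"missing column: {rule.name}")
            continue
        for index, row in enumerate(rows):
            value = row.get(rule.name, "")
            if value == "":
                continue  # empty cells are not checked in this sketch
            # Check 2: coding against a permitted value set.
            if rule.allowed_codes is not None:
                if value not in rule.allowed_codes:
                    problems.append(f"row {index}: {rule.name}={value!r} is not a permitted code")
                continue
            if rule.min_value is None and rule.max_value is None:
                continue  # presence-only rule
            # Check 3: numeric value range.
            try:
                number = float(value)
            except ValueError:
                problems.append(f"row {index}: {rule.name}={value!r} is not numeric")
                continue
            if rule.min_value is not None and number < rule.min_value:
                problems.append(f"row {index}: {rule.name}={value} is below {rule.min_value}")
            if rule.max_value is not None and number > rule.max_value:
                problems.append(f"row {index}: {rule.name}={value} is above {rule.max_value}")
    return problems


if __name__ == "__main__":
    sample = [
        {"gcs_total": "14", "age_years": "34", "pupil_reactivity": "both"},
        {"gcs_total": "2", "age_years": "51", "pupil_reactivity": "sluggish"},
    ]
    for problem in validate_rows(sample, EXAMPLE_RULES):
        print(problem)
```

Returning a list of problems rather than raising on the first failure would let a CLI report every issue in a dataset in a single pass.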
graph LR
A[NINDS CDE v3.0<br/>13 YAMLs] --> B[NINDS-CDE-Tools]
B --> C[All Research Packages<br/>CDE compliance]
B --> D[Lab-in-a-Box<br/>data quality gate]
B --> E[AIDA<br/>CDE-to-FHIR mapping]
style B fill:#2A9D8F,stroke:#1E3A8A,color:#fff
NINDS-CDE-Tools is designed to serve all research packages (CDE compliance), Lab-in-a-Box (data quality enforcement at ingestion), and AIDA Infrastructure (interoperability via CDE-to-FHIR mapping). As standards tooling for global TBI research, this is a strong Digital Public Good candidate.

Canonical source: evidenceos-research/evidenceos-ontology.

See Evidence Commons for related projects.
This project is in the design phase. The underlying schemas exist in the parent codebase, but no standalone tooling has been extracted. Contributions are welcome once the initial CLI and validation layer are in place.
Apache-2.0 -- see LICENSE for details.