+
+
+
\ No newline at end of file
diff --git a/docs/_layouts/page-triary-newreleases.html b/docs/_layouts/page-triary-newreleases.html
new file mode 100644
index 0000000..6a0b135
--- /dev/null
+++ b/docs/_layouts/page-triary-newreleases.html
@@ -0,0 +1 @@
+{% include pageAlt.html page_header="headers/standard.html" page_sidebar="empty.html" page_extra_js="/js/new-releases.js" %}
diff --git a/docs/_layouts/page-triary.html b/docs/_layouts/page-triary.html
new file mode 100644
index 0000000..a76f021
--- /dev/null
+++ b/docs/_layouts/page-triary.html
@@ -0,0 +1 @@
+{% include pageAlt.html page_header="headers/standard.html" page_sidebar="empty.html" %}
\ No newline at end of file
diff --git a/docs/assays/metadata/10XMultiome.md b/docs/assays/metadata/10XMultiome.md
index a1ab8b9..1027650 100644
--- a/docs/assays/metadata/10XMultiome.md
+++ b/docs/assays/metadata/10XMultiome.md
@@ -1,24 +1,28 @@
----
-layout: page
----
-# 10X-Multiome
-
-Version 2 (current)
-
-## Version 2 (current)
-
-| Attribute | Type | Description | Allowable Values | Required |
-|----------------------------------------|-----------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------|------------|
-| dataset_type | Allowable Value | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium```| True |
-| contributors_path | Textfield | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | True |
-| data_path | Textfield | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." where as for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | True |
-| capture_batch_id | Textfield | A lab-generated ID to identify which cells were captured at the same time. This would, for example, be an ID to denote which datasets were derived from a single 10X Genomics Chromium Controller run. In the case of the 10X Controller this could be the chip ID and would allow users the ability to determine which samples were processed together in a Chromium controller. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | True |
-| preparation_instrument_vendor | Allowable Value | The manufacturer of the instrument used to prepare (staining/processing) the sample for the assay. If an automatic slide staining method was indicated this field should list the manufacturer of the instrument. | ```10x Genomics``` ```Hamamatsu``` ```HTX Technologies``` ```In-House``` ```Leica Biosystems``` ```Not applicable``` ```Roche Diagnostics``` ```SunChrom``` ```Thermo Fisher Scientific``` | True |
-| preparation_instrument_model | Allowable Value | Manufacturers of a staining system instrument may offer various versions (models) of that instrument with different features. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```AutoStainer XL``` ```Chromium Connect``` ```Chromium Controller``` ```Chromium iX``` ```Chromium X``` ```Discovery Ultra``` ```EVOS M7000``` ```M3+ Sprayer``` ```M5 Sprayer``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```Not applicable``` ```ST5020 Multistainer``` ```Sublimator``` ```SunCollect Sprayer``` ```TM-Sprayer``` ```Visium CytAssist ```| True |
-| number_of_pre-amplification_pcr_cycles | Numeric | The number of PCR cycles run after the Chromium Controller step and prior to separating the suspension and initiating library construction | | True |
-| metadata_schema_id | Textfield | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | True |
-| preparation_instrument_kit | Allowable Value | The reagent kit used with the preparation instrument. | ```10X Genomics; Chromium Next GEM Chip G Single Cell Kit``` ```16 rxns; PN 1000127``` ```10X Genomics; Chromium Next GEM Chip G Single Cell Kit``` ```48 rxns; PN 1000120``` ```10X Genomics; Chromium Next GEM Chip K Automated Single Cell Kit``` ```48 rxns; PN 1000289``` ```10X Genomics; Chromium Next GEM Chip K Single Cell Kit``` ```16 rxns; PN 1000287``` ```10X Genomics; Chromium Next GEM Chip K Single Cell Kit``` ```48 rxns; PN 1000286``` ```10X Genomics; Chromium Next GEM Chip Q Single Cell Kit``` ```16 rxns; PN 1000422``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```16 rxn; PN 1000283``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```4 rxn; PN 1000285``` ```10X Genomics; Visium FFPE Reagent Kit v2-Small``` ```PN 1000436``` ```Custom``` | True |
-| preparation_protocol_doi | Textfield | DOI for the protocols.io page that describes the assay or sample procurment and preparation. For example for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: | | True |
-| parent_sample_id | Textfield | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | True |
-
-
\ No newline at end of file
+---
+layout: page-triary
+---
+
+# 10X-Multiome Metadata Attributes
+
+Fields that are collected for 10X-Multiome data, available at ```Dataset.metadata.```
+
+
+\* indicates a required field
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples, then this should be a comma-separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | |
+| lab_id | | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | |
+| preparation_protocol_doi * | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | ```https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1``` |
+| dataset_type * | | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium``` |
+| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | |
+| data_path | | The top-level directory containing the raw and/or processed data. For a single dataset upload this might be ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK", use the syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2", in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | |
+| capture_batch_id | | A lab-generated ID to identify which cells were captured at the same time. This would, for example, be an ID to denote which datasets were derived from a single 10X Genomics Chromium Controller run. In the case of the 10X Controller this could be the chip ID, which would allow users to determine which samples were processed together in a Chromium Controller. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | |
+| preparation_instrument_vendor * | | The manufacturer of the instrument used to prepare (staining/processing) the sample for the assay. If an automatic slide staining method was indicated this field should list the manufacturer of the instrument. | ```10x Genomics``` ```Hamamatsu``` ```HTX Technologies``` ```In-House``` ```Leica Biosystems``` ```Not applicable``` ```Roche Diagnostics``` ```SunChrom``` ```Thermo Fisher Scientific``` |
+| preparation_instrument_model * | | Manufacturers of a staining system instrument may offer various versions (models) of that instrument with different features. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```AutoStainer XL``` ```Chromium Connect``` ```Chromium Controller``` ```Chromium iX``` ```Chromium X``` ```Discovery Ultra``` ```EVOS M7000``` ```M3+ Sprayer``` ```M5 Sprayer``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```Not applicable``` ```ST5020 Multistainer``` ```Sublimator``` ```SunCollect Sprayer``` ```TM-Sprayer``` ```Visium CytAssist``` |
+| preparation_instrument_kit * | | The reagent kit used with the preparation instrument. | ```10X Genomics; Chromium Next GEM Chip G Single Cell Kit``` ```16 rxns; PN 1000127``` ```48 rxns; PN 1000120``` ```10X Genomics; Chromium Next GEM Chip K Automated Single Cell Kit``` ```48 rxns; PN 1000289``` ```10X Genomics; Chromium Next GEM Chip K Single Cell Kit``` ```16 rxns; PN 1000287``` ```48 rxns; PN 1000286``` ```10X Genomics; Chromium Next GEM Chip Q Single Cell Kit``` ```16 rxns; PN 1000422``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```16 rxn; PN 1000283``` ```4 rxn; PN 1000285``` ```10X Genomics; Visium FFPE Reagent Kit v2-Small``` ```PN 1000436``` ```Custom``` |
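+| number_of_pre_amplification_pcr_cycles | | The number of PCR cycles run after the Chromium Controller step and prior to separating the suspension and initiating library construction. | |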
+| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | |
+
+
+
diff --git a/docs/assays/metadata/ATACseq.md b/docs/assays/metadata/ATACseq.md
index 3e030c1..cce1089 100644
--- a/docs/assays/metadata/ATACseq.md
+++ b/docs/assays/metadata/ATACseq.md
@@ -1,257 +1,89 @@
----
-layout: page
----
-# ATACseq
-
-NOTE: Several versions of this metadata schema have been created over time. The (Latest) version contains most attributes, but there may be some deprecated attributes in the older versions for which data has been collected. HuBMAP is in the process of creating a reference which combines all of these versions into a single view. That reference will be available here once completed.
-
-
-Version 3 (Latest)
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-----------------------------------------------------|-----------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------|------------|
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA ```| True |
-| acquisition_instrument_vendor | Allowable Value | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` | True |
-| acquisition_instrument_model | Allowable Value | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` | True |
-| source_storage_duration_value | Numeric | How long was the source material (parent) stored, prior to this sample being processed. | | True |
-| source_storage_duration_unit | Allowable Value | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | True |
-| time_since_acquisition_instrument_calibration_value | Numeric | The amount of time since the acqusition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | | False |
-| time_since_acquisition_instrument_calibration_unit | Allowable Value | The time unit of measurement |```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | False |
-| contributors_path | Textfield | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | True |
-| data_path | Textfield | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." where as for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | True |
-| barcode_read | Allowable Value | Which read file contains the cell or capture spot barcode. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if the source material is barcoded. This field is used to determine which analysis pipeline to run. | ```Read 2 (R2)``` ```Read 1 (R1)``` ```Not applicable``` | True |
-| barcode_size | Allowable Value | Length of the cell or capture spot barcode in base pairs. Cell and capture spot barcodes are, for example, 3 x 8 bp sequences that are spaced by constant sequences, the offsets. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if the source material is barcoded. This field is used to determine which analysis pipeline to run. | ```14``` ```16``` ```40``` ```8,8,8``` ```8,6``` ```Not applicable``` | True |
-| umi_read | Allowable Value | Which read file contains the UMI barcode. This should be included when constructing sequencing libraries with a non-commercial kit. | ```Read 2 (R2)``` ```Read 1 (R1)``` ```Not applicable``` | True |
-| umi_size | Allowable Value | Length of the umi barcode in base pairs. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if UMI are present. This field is used to determine which analysis pipeline to run. | ```8``` ```9``` ```10``` ```12``` ```Not applicable``` | True |
-| assay_input_entity | Allowable Value | This is the entity from which the analyte is being captured. For example, for bulk sequencing this would be "tissue", while it would be "single cell" for single cell sequencing. This field is used to determine which analysis pipeline to run. | ```area of interest``` ```single cell``` ```single nucleus``` ```spot``` ```tissue (bulk)``` | True |
-| number_of_input_cells_or_nuclei | Numeric | How many cells or nuclei were input to the assay? This is typically not available for preparations working with bulk tissue. | | False |
-| library_adapter_sequence | Textfield | 5’ and/or 3’ read adapter sequences used as part of the library preparation protocol to render the library compatible with the sequencing protocol and instrumentation. This should be provided as comma-separated list of key:value pairs (adapter name:sequence). | | True |
-| library_average_fragment_size | Numeric | Average size of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation. Numeric value in base pairs (bp). | | True |
-| library_input_amount_value | Numeric | The amount of cDNA, after amplification, that was used for library construction. | | False |
-| library_input_amount_unit | Allowable Value | unit of library input amount value | ```ng``` ```ul``` | False |
-| library_output_amount_value | Numeric | Total amount (eg. nanograms) of library after the clean-up step of final pcr amplification step. Answer the question: What is the Qubit measured concentration (ng/ul) times the elution volume (ul) after the final clean-up step? | | False |
-| library_output_amount_unit | Allowable Value | Units of library final yield. | ```ng``` ```ul``` | False |
-| library_concentration_value | Numeric | The concentration value of the pooled library samples submitted for sequencing. | | True |
-| library_concentration_unit | Allowable Value | Unit of library concentration value. | ```ng/ul``` ```nM``` | True |
-| library_layout | Allowable Value | Whether the library was generated for single-end or paired end sequencing | ```paired-end``` ```single-end``` | True |
-| number_of_pcr_cycles_for_indexing | Numeric | Number of PCR cycles performed in order to add adapters and amplify the library. This does not include the cDNA amplification which is captured in the "number of iterations of cDNA amplification" field. | | True |
-| library_preparation_kit | Allowable Value | Reagent kit used for library preparation | ```10X Genomics; Automated Library Construction Kit``` ```24 rxns; PN 1000428``` ```10X Genomics; Chromium Next GEM Automated Single Cell 5' Kit v2``` ```24 rxns; PN 1000290``` ```10X Genomics; Chromium Next GEM Automated Single Cell 5' Kit v2``` ```4 rxns; PN 1000298``` ```10X Genomics; Chromium Next GEM Single Cell 3' GEM``` ```Library & Gel Bead Kit v3.1``` ```16 rxns; PN 1000121``` ```10X Genomics; Chromium Next GEM Single Cell 3' HT Kit v3.1``` ```48 rxns; PN 1000348``` ```10X Genomics; Chromium Next GEM Single Cell 3' HT Kit v3.1``` ```8 rxns; PN 1000370``` ```10X Genomics; Chromium Next GEM Single Cell 3' Kit v3.1``` ```16 rxns; PN 1000268``` ```10X Genomics; Chromium Next GEM Single Cell 3' Kit v3.1``` ```4 rxns; PN 1000269``` ```10X Genomics; Chromium Next GEM Single Cell 5' Kit v2``` ```16 rxns; PN 1000263``` ```10X Genomics; Chromium Next GEM Single Cell 5' Kit v2``` ```4 rxns; PN 1000265``` ```10X Genomics; Chromium Next GEM Single Cell Fixed RNA Hybridization & Library Kit``` ```4 rxns; PN 1000415``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```16 rxn; PN 1000283``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```4 rxn; PN 1000285``` ```10X Genomics; Chromium Single Cell 3' GEM``` ```Library & Gel Bead Kit v3``` ```4 rxns PN 1000092``` ```10X Genomics; Chromium Single Cell 3' Library & Gel Bead Kit``` ```4 rxns; PN 120267``` ```10X Genomics; Visium CytAssist Spatial Gene Expression for FFPE``` ```Human Transcriptome``` ```11 mm``` ```2 reactions; PN 1000522``` ```10X Genomics; Visium CytAssist Spatial Gene Expression for FFPE``` ```Human Transcriptome``` ```6.5mm``` ```4 reactions; PN 1000520``` ```10X Genomics; Visium Spatial for FFPE Gene Expression Kit``` ```Human Transcriptome``` ```1 slides``` ```4 reactions; PN 1000338``` ```10X Genomics; Visium Spatial 
for FFPE Gene Expression Kit``` ```Mouse Transcriptome``` ```4 rxns; PN 1000339``` ```10X Genomics; Visium Spatial Gene Expression Slide and Reagent Kit``` ```1 slides``` ```4 reactions; PN 1000187``` ```10X Genomics; Visium Spatial Gene Expression Slide and Reagent Kit``` ```4 slides``` ```16 reactions; PN 1000184``` ```Custom``` ```Illumina; TruSeq Stranded mRNA Library Prep (48 samples); PN 20020594``` ```Illumina; TruSeq Stranded mRNA Library Prep (96 samples); PN 20020595``` ```New England BioLabs; NEBNext Ultra II RNA Library Prep Kit for Illumina; PN E7770``` ```Parse Biosciences; Evercode WT Mini v2 Kit``` ```12 rxns; PN ECW02010``` ```Parse Biosciences; Evercode WT v2 Kit``` ```48 rxns; PN ECW02030)``` | True |
-| sample_indexing_kit | Allowable Value | Indexes are needed for multiplexing sequencing libraries for simultaneous sequencing (pooling) and proper attachment to the Illumina flowcell. Each indexing kit would have a number of compatible sequences ("sample indexing sets") that are used to label some number of samples (the number of sets depend on the kit). | ```10X Genomics; Chromium i7 Sample Index Plate (96 rxn); PN 220103``` ```10X Genomics; Dual Index Kit TS``` ```Set A; PN 1000251``` ```10X Genomics; Dual Index Kit TT``` ```Set A (96 rxn); PN 1000215``` ```10X Genomics; Single Index Kit N``` ```Set A (96 rxn); PN 1000212``` ```Custom``` ```Illumina; IDT for Illumina - TruSeq RNA UD Indexes v2 (96 Indexes``` ```96 Samples); PN 20040871``` ```Illumina; TruSeq RNA CD Index Plate (96 Indexes``` ```96 Samples); PN 20019792``` ```Illumina; TruSeq RNA Single Indexes Set A (12 Indexes``` ```48 Samples); PN 20020492``` ```Illumina; TruSeq RNA Single Indexes Set B (12 Indexes``` ```48 Samples); PN 20020493``` ```Integrated DNA Technologies: Custom DNA Oligos``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-AB``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-CD``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-EF``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-GH``` ```Not applicable``` ```Parse Biosciences; Fragmentation Reagents; PN WX100``` ```Parse Biosciences; UDI Plate - WT; PN UDI1001 ```| True |
-| sample_indexing_set | Textfield | The specific sequencing barcode index set used, selected from the sample indexing kit. Example: For 10X this might be "SI-GA-A1", for Nextera "N505 - CTCCTTAC" | | True |
-| is_technical_replicate | Allowable Value | Is the sequencing reaction run in replicate, "Yes" or "No". If "Yes", FASTQ files in dataset need to be merged. | ```Yes``` ```No``` | True |
-| expected_entity_capture_count | Numeric | Number of cells, nuclei or capture spots expected to be captured by the assay. For Visium this is the total number of spots covered by tissue, within the capture area. | | False |
-| sequencing_reagent_kit | Allowable Value | Reagent kit used for sequencing |```Custom``` ```Illumina``` ```HiSeq 3000/4000 PE Cluster Kit PE-410-1001``` ```PN 1000283, Illumina``` ```NextSeq 1000/2000 P2 Reagent v3 Kit (100 Cycles)``` ```PN 20046811, Illumina``` ```NextSeq 1000/2000 P2 Reagent v3 Kit (200 Cycles)``` ```PN 20046812, Illumina``` ```NextSeq 1000/2000 P2 Reagent v3 Kit (300 Cycles)``` ```PN 20046813, Illumina``` ```NextSeq 2000 P3 Reagent Kit (300 Cycles)``` ```PN 20040561, Illumina``` ```NextSeq 2000 P3 Reagents Kit (100 Cycles)``` ```PN 20040559, Illumina``` ```NextSeq 500/550 Hi Output Kit 150 Cycles``` ```v2.5``` ```PN 20024907, Illumina``` ```NextSeq 500/550 Hi Output Kit 75 Cycles v2.5``` ```PN 20024906, Illumina``` ```NextSeq 500/550 Mid Output Kit 150 Cycles v2.5``` ```PN 20024904, Illumina``` ```NovaSeq 6000 S1 Reagent Kit (200 Cycles)``` ```PN 20012864, Illumina``` ```NovaSeq 6000 S1 Reagent v1.5 Kit (100 Cycles)``` ```PN 20028319, Illumina``` ```NovaSeq 6000 S1 Reagent v1.5 Kit (200 Cycles)``` ```PN 20028318, Illumina``` ```NovaSeq 6000 S1 Reagent v1.5 Kit (300 Cycles)``` ```PN 20028317, Illumina``` ```NovaSeq 6000 S2 Reagent v1.5 Kit (100 Cycles)``` ```PN 20028316, Illumina``` ```NovaSeq 6000 S4 Reagent Kit v1.5 (300 cycles)``` ```PN 20028312, Illumina``` ```NovaSeq 6000 S4 Reagent v1.5 Kit (200 Cycles)``` ```PN 20028313, Illumina``` ```NovaSeq 6000 SP Reagent v1.5 Kit (100 Cycles)``` ```PN 20028401, Illumina``` ```NovaSeq X Series 1.5B Reagent Kit (100 Cycle)``` ```PN 20104703, Illumina``` ```NovaSeq X Series 1.5B Reagent Kit (200 Cycle)``` ```PN 20104704, Illumina``` ```NovaSeq X Series 1.5B Reagent Kit (300 Cycle)``` ```PN 20104705, Illumina``` ```NovaSeq X Series 10B Reagent Kit (100 Cycle)``` ```PN 20085596, Illumina``` ```NovaSeq X Series 10B Reagent Kit (200 Cycle)``` ```PN 20085595, Illumina``` ```NovaSeq X Series 10B Reagent Kit (300 Cycle) ``` ```PN 20085594``` | True |
-| sequencing_read_format | Textfield | Number of sequencing cycles in each round of sequencing (i.e., Read1, i7 index, i5 index, and Read2). This is reported as a comma-delimited list. Example: For 10X snATAC-seq (R1,Index,R2,R3) this might be: 50,8,16,50. For SNARE-seq2 this might be: 75,94,8,75 | | True |
-| transposition_method | Allowable Value | Modality of capturing accessible chromatin molecules. For example, this would be the type of kit that was used. | ```bulkATACseq``` ```sciATACseq``` ```Custom``` ```scATACseq``` | True |
-| sequencing_batch_id | Textfield | The ID for the sequencing run. This could, for example, be the chip ID and should allow users the ability to determine which samples were processed together in a sequencing run. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | False |
-| capture_batch_id | Textfield | A lab-generated ID to identify which cells were captured at the same time. This would, for example, be an ID to denote which datasets were derived from a single 10X Genomics Chromium Controller run. In the case of the 10X Controller this could be the chip ID and would allow users the ability to determine which samples were processed together in a Chromium controller. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | False |
-| preparation_instrument_vendor | Allowable Value | The manufacturer of the instrument used to prepare (staining/processing) the sample for the assay. If an automatic slide staining method was indicated this field should list the manufacturer of the instrument. | ```10x Genomics``` ```Hamamatsu``` ```HTX Technologies``` ```In-House``` ```Leica Biosystems``` ```Not applicable``` ```Roche Diagnostics``` ```SunChrom``` ```Thermo Fisher Scientific``` | False |
-| preparation_instrument_model | Allowable Value | Manufacturers of a staining system instrument may offer various versions (models) of that instrument with different features. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```AutoStainer XL``` ```Chromium Connect``` ```Chromium Controller``` ```Chromium iX``` ```Chromium X``` ```Discovery Ultra``` ```EVOS M7000``` ```M3+ Sprayer``` ```M5 Sprayer``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```Not applicable``` ```ST5020 Multistainer``` ```Sublimator``` ```SunCollect Sprayer``` ```TM-Sprayer``` ```Visium CytAssist ```| False |
-| metadata_schema_id | Textfield | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | True |
-| preparation_instrument_kit | Allowable Value | The reagent kit used with the preparation instrument. | ```10X Genomics; Chromium Next GEM Chip G Single Cell Kit``` ```16 rxns; PN 1000127``` ```10X Genomics; Chromium Next GEM Chip G Single Cell Kit``` ```48 rxns; PN 1000120``` ```10X Genomics; Chromium Next GEM Chip K Automated Single Cell Kit``` ```48 rxns; PN 1000289``` ```10X Genomics; Chromium Next GEM Chip K Single Cell Kit``` ```16 rxns; PN 1000287``` ```10X Genomics; Chromium Next GEM Chip K Single Cell Kit``` ```48 rxns; PN 1000286``` ```10X Genomics; Chromium Next GEM Chip Q Single Cell Kit``` ```16 rxns; PN 1000422``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```16 rxn; PN 1000283``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```4 rxn; PN 1000285``` ```10X Genomics; Visium FFPE Reagent Kit v2-Small``` ```PN 1000436``` ```Custom``` | False |
-| preparation_protocol_doi | Link | DOI for the protocols.io page that describes the assay or sample procurment and preparation. For example for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` | True |
-| parent_sample_id | Textfield | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | True |
-| barcode_offset | Allowable Value | Positions in the read at which the cell or capture spot barcodes start. Cell and capture spot barcodes are, for example, 3 x 8 bp sequences that are spaced by constant sequences (the offsets). First barcode at position 0, then 38, then 76. This should be included when constructing sequencing libraries with a non-commercial kit. | ```0``` ```8``` ```20``` ```1,27``` ```0,38,76``` ```10,48,86``` ```Not applicable``` | True |
-| umi_offset | Allowable Value | Position in the read at which the UMI barcode starts. This should be included when constructing sequencing libraries with a non-commercial kit. | ```0``` ```16``` ```36``` ```Not applicable``` | True |
-| dataset_type | Allowable Value | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium```| True |
-
-
-
-SNARE-seq2 / sciATACseq / snATACseq Version 1
-
-| Attribute | Type | Description | Allowable Values | Required |
-|---------------------------------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------|------------|
-| version | Allowable Value | Version of the schema to use when validating this metadata. | ['1'] | True |
-| description | Textfield | Free-text description of this assay. | | True |
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['sequence'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['SNARE-seq2', 'sciATACseq', 'snATACseq'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['DNA'] | True |
-| is_targeted | boolean | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
-| is_technical_replicate | boolean | If TRUE, fastq files in dataset need to be merged. | | True |
-| library_id | Textfield | A library ID, unique within a TMC, which allows corresponding RNA and chromatin accessibility datasets to be linked. | | True |
-| sc_isolation_protocols_io_doi | Textfield | Link to a protocols document answering the question: How were single cells separated into a single-cell suspension? | | True |
-| sc_isolation_entity | Allowable Value | The type of single cell entity derived from isolation protocol. | ['whole cell', 'nucleus', 'cell-cell multimer', 'spatially encoded cell barcoding'] | True |
-| sc_isolation_tissue_dissociation | Textfield | The method by which tissues are dissociated into single cells in suspension. | | True |
-| sc_isolation_enrichment | Allowable Value | The method by which specific cell populations are sorted or enriched. | ['none', 'FACS'] | True |
-| sc_isolation_quality_metric | Textfield | A quality metric by visual inspection prior to cell lysis or defined by known parameters such as wells with several cells or no cells. This can be captured at a high level. "OK" or "not OK", or with more specificity such as "debris", "clump", "low clump". | | True |
-| sc_isolation_cell_number | Numeric | Total number of cell/nuclei yielded post dissociation and enrichment. | | True |
-| transposition_input | Numeric | Number of cell/nuclei input to the assay. | | True |
-| transposition_method | Allowable Value | Modality of capturing accessible chromatin molecules. | ['SNARE-Seq2-AC', 'bulkATACseq', 'snATACseq', 'sciATACseq'] | True |
-| transposition_transposase_source | Allowable Value | The source of the Tn5 transposase and transposon used for capturing accessible chromatin. | ['10X snATAC', 'In-house', 'Nextera', '10X multiome'] | True |
-| transposition_kit_number | Textfield | If Tn5 came from a kit, provide the catalog number. | | False |
-| library_construction_protocols_io_doi | Textfield | A link to the protocol document containing the library construction method (including version) that was used, e.g. "Smart-Seq2", "Drop-Seq", "10X v3". DOI for protocols.io referring to the protocol for this assay. | | True |
-| library_layout | Allowable Value | State whether the library was generated for single-end or paired end sequencing. | ['single-end', 'paired-end'] | True |
-| library_adapter_sequence | Textfield | Adapter sequence to be used for adapter trimming. | | True |
-| cell_barcode_read | Textfield | Which read file(s) contains the cell barcode. Multiple cell_barcode_read files must be provided as a comma-delimited list (e.g. file1,file2,file3). This field is not required for barcoding by single-cell combinatorial indexing. | | False |
-| cell_barcode_offset | Textfield | Positions in the read at which the cell barcodes start. Cell barcodes are, for example, 3 x 8 bp sequences that are spaced by constant sequences (the offsets). First barcode at position 0, then 38, then 76. (Does not apply to sciATACseq, SNARE-seq and BulkATAC.) | | False |
-| cell_barcode_size | Textfield | Length of the cell barcode in base pairs. Cell barcodes are, for example, 3 x 8 bp sequences that are spaced by constant sequences, the offsets. (Does not apply to sciATACseq, SNARE-seq and BulkATAC.) | | False |
-| library_pcr_cycles | Numeric | Number of PCR cycles to enrich for accessible chromatin fragments. | | True |
-| library_pcr_cycles_for_sample_index | Numeric | Number of PCR cycles performed for library generation (figure in Descriptions section) | | True |
-| library_final_yield | Numeric | Total ng of library after final pcr amplification step. | | True |
-| library_final_yield_unit | Allowable Value | Units for library_final_yield | ['ng'] | False |
-| library_average_fragment_size | Numeric | Average size in basepairs (bp) of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation. | | True |
-| sequencing_reagent_kit | Textfield | Reagent kit used for sequencing | | True |
-| sequencing_read_format | Textfield | Slash-delimited list of the number of sequencing cycles for, for example, Read1, i7 index, i5 index, and Read2. | | True |
-| sequencing_read_percent_q30 | Numeric | Q30 is the weighted average of all the reads (e.g. # bases UMI * q30 UMI + # bases R2 * q30 R2 + ...) | | True |
-| sequencing_phix_percent | Numeric | Percent PhiX loaded to the run | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
-
-
-
-SNARE-seq2 / sciATACseq / snATACseq Version 0
-
-| Attribute | Type | Description | Allowable Values | Required |
-|---------------------------------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------|------------|
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['sequence'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['scRNAseq-10xGenomics', 'snRNAseq-10xGenomics-v2', 'snRNAseq-10xGenomics-v3', 'scRNAseq', 'sciRNAseq', 'snRNAseq', 'SNARE2-RNAseq'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['RNA'] | True |
-| is_targeted | boolean | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
-| sc_isolation_protocols_io_doi | Textfield | Link to a protocols document answering the question: How were single cells separated into a single-cell suspension? | | True |
-| sc_isolation_entity | Textfield | The type of single cell entity derived from isolation protocol | | True |
-| sc_isolation_tissue_dissociation | Textfield | The method by which tissues are dissociated into single cells in suspension. | | True |
-| sc_isolation_enrichment | Textfield | The method by which specific cell populations are sorted or enriched. | | False |
-| sc_isolation_quality_metric | Textfield | A quality metric by visual inspection prior to cell lysis or defined by known parameters such as wells with several cells or no cells. This can be captured at a high level. | | True |
-| sc_isolation_cell_number | Numeric | Total number of cell/nuclei yielded post dissociation and enrichment | | True |
-| rnaseq_assay_input | Numeric | Number of cell/nuclei input to the assay | | True |
-| rnaseq_assay_method | Textfield | The kit used for the RNA sequencing assay | | True |
-| library_construction_protocols_io_doi | Textfield | A link to the protocol document containing the library construction method (including version) that was used, e.g. "Smart-Seq2", "Drop-Seq", "10X v3". | | True |
-| library_layout | Allowable Value | State whether the library was generated for single-end or paired end sequencing. | ['single-end', 'paired-end'] | True |
-| library_adapter_sequence | Textfield | Adapter sequence to be used for adapter trimming | | True |
-| library_id | Textfield | A library ID, unique within a TMC, which allows corresponding RNA and chromatin accessibility datasets to be linked. | | True |
-| is_technical_replicate | boolean | Is the sequencing reaction run in repliucate, TRUE or FALSE | | True |
-| cell_barcode_read | Textfield | Which read file contains the cell barcode | | True |
-| cell_barcode_offset | Textfield | Position(s) in the read at which the cell barcode starts. | | True |
-| cell_barcode_size | Textfield | Length of the cell barcode in base pairs | | True |
-| library_pcr_cycles | Numeric | Number of PCR cycles to amplify cDNA | | True |
-| library_pcr_cycles_for_sample_index | Numeric | Number of PCR cycles performed for library indexing | | True |
-| library_final_yield_value | Numeric | Total number of ng of library after final pcr amplification step. This is the concentration (ng/ul) * volume (ul) | | True |
-| library_final_yield_unit | Allowable Value | Units of final library yield | ['ng'] | False |
-| library_average_fragment_size | Numeric | Average size in basepairs (bp) of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation. | | True |
-| sequencing_reagent_kit | Textfield | Reagent kit used for sequencing | | True |
-| sequencing_read_format | Textfield | Slash-delimited list of the number of sequencing cycles for, for example, Read1, i7 index, i5 index, and Read2. | | True |
-| sequencing_read_percent_q30 | Numeric | Q30 is the weighted average of all the reads (e.g. # bases UMI * q30 UMI + # bases R2 * q30 R2 + ...) | | True |
-| sequencing_phix_percent | Numeric | Percent PhiX loaded to the run | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
-
-
-
-bulkATACseq Version 1
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-------------------------------------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------|------------|
-| version | Allowable Value | Version of the schema to use when validating this metadata. | ['1'] | True |
-| description | Textfield | Free-text description of this assay. | | True |
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['sequence'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['bulkATACseq'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['DNA'] | True |
-| is_targeted | boolean | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
-| bulk_transposition_input_number_nuclei | Textfield | A number (no comma separators) | | True |
-| bulk_atac_cell_isolation_protocols_io_doi | Textfield | Link to a protocols document answering the question: How was tissue stored and processed for cell/nuclei isolation | | True |
-| is_technical_replicate | boolean | Is this a sequencing replicate? | | True |
-| library_adapter_sequence | Textfield | Adapter sequence to be used for adapter trimming | | True |
-| library_average_fragment_size | Numeric | Average size in basepairs (bp) of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation. | | True |
-| library_concentration_value | Numeric | The concentration value of the pooled library samples submitted for sequencing. | | True |
-| library_concentration_unit | Allowable Value | Unit of library_concentration_value | ['nM'] | False |
-| library_construction_protocols_io_doi | Textfield | A link to the protocol document containing the library construction method (including version) that was used, e.g. "Smart-Seq2", "Drop-Seq", "10X v3". | | True |
-| library_creation_date | Datetime | date and time of library creation. YYYY-MM-DD, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s. | | False |
-| library_final_yield_value | Numeric | Total number of ng of library after final pcr amplification step. This is the concentration (ng/ul) * volume (ul) | | True |
-| library_final_yield_unit | Allowable Value | Units of final library yield | ['ng'] | False |
-| library_id | Textfield | A library ID, unique within a TMC, which allows corresponding RNA and chromatin accessibility datasets to be linked. | | True |
-| library_layout | Allowable Value | State whether the library was generated for single-end or paired end sequencing. | ['single-end', 'paired-end'] | True |
-| library_pcr_cycles | Numeric | Number of PCR cycles performed in order to add adapters and amplify the library. Usually, this includes 5 pre-amplificationn cycles followed by 0-5 additional cycles determined by qPCR. | | True |
-| library_preparation_kit | Textfield | Reagent kit used for library preparation | | True |
-| sample_quality_metric | Textfield | This is a quality metric by visual inspection. This should answerthe question: Are the nuclei intact and are the nuclei free of significant amountsof debris? This can be captured at a high level, âOKâ or ânotOKâ. | | True |
-| sequencing_phix_percent | Numeric | Percent PhiX loaded to the run | | True |
-| sequencing_read_format | Textfield | Slash-delimited list of the number of sequencing cycles for, for example, Read1, i7 index, i5 index, and Read2. | | True |
-| sequencing_read_percent_q30 | Numeric | Q30 is the weighted average of all the reads (e.g. # bases UMI * q30 UMI + # bases R2 * q30 R2 + ...) | | True |
-| sequencing_reagent_kit | Textfield | Reagent kit used for sequencing | | True |
-| transposition_kit_number | Textfield | If Tn5 came from a kit, provide the catalog number. | | False |
-| transposition_method | Textfield | Modality of capturing accessible chromatin molecules. The kit used, for example. | | True |
-| transposition_transposase_source | Textfield | The source of the Tn5 transposase and transposon used for capturing accessible chromatin. | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
-
-
-
-
-
-bulkATACseq 0
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-------------------------------------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------|------------|
-| version | Allowable Value | Version of the schema to use when validating this metadata. | ['1'] | True |
-| description | Textfield | Free-text description of this assay. | | True |
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['sequence'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['bulkATACseq'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['DNA'] | True |
-| is_targeted | boolean | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
-| bulk_transposition_input_number_nuclei | Textfield | A number (no comma separators) | | True |
-| bulk_atac_cell_isolation_protocols_io_doi | Textfield | Link to a protocols document answering the question: How was tissue stored and processed for cell/nuclei isolation | | True |
-| is_technical_replicate | boolean | Is this a sequencing replicate? | | True |
-| library_adapter_sequence | Textfield | Adapter sequence to be used for adapter trimming | | True |
-| library_average_fragment_size | Numeric | Average size in basepairs (bp) of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation. | | True |
-| library_concentration_value | Numeric | The concentration value of the pooled library samples submitted for sequencing. | | True |
-| library_concentration_unit | Allowable Value | Unit of library_concentration_value | ['nM'] | False |
-| library_construction_protocols_io_doi | Textfield | A link to the protocol document containing the library construction method (including version) that was used, e.g. "Smart-Seq2", "Drop-Seq", "10X v3". | | True |
-| library_creation_date | Datetime | date and time of library creation. YYYY-MM-DD, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s. | | False |
-| library_final_yield_value | Numeric | Total number of ng of library after final pcr amplification step. This is the concentration (ng/ul) * volume (ul) | | True |
-| library_final_yield_unit | Allowable Value | Units of final library yield | ['ng'] | False |
-| library_id | Textfield | A library ID, unique within a TMC, which allows corresponding RNA and chromatin accessibility datasets to be linked. | | True |
-| library_layout | Allowable Value | State whether the library was generated for single-end or paired end sequencing. | ['single-end', 'paired-end'] | True |
-| library_pcr_cycles | Numeric | Number of PCR cycles performed in order to add adapters and amplify the library. Usually, this includes 5 pre-amplificationn cycles followed by 0-5 additional cycles determined by qPCR. | | True |
-| library_preparation_kit | Textfield | Reagent kit used for library preparation | | True |
-| sample_quality_metric | Textfield | This is a quality metric by visual inspection. This should answerthe question: Are the nuclei intact and are the nuclei free of significant amountsof debris? This can be captured at a high level, âOKâ or ânotOKâ. | | True |
-| sequencing_phix_percent | Numeric | Percent PhiX loaded to the run | | True |
-| sequencing_read_format | Textfield | Slash-delimited list of the number of sequencing cycles for, for example, Read1, i7 index, i5 index, and Read2. | | True |
-| sequencing_read_percent_q30 | Numeric | Q30 is the weighted average of all the reads (e.g. # bases UMI * q30 UMI + # bases R2 * q30 R2 + ...) | | True |
-| sequencing_reagent_kit | Textfield | Reagent kit used for sequencing | | True |
-| transposition_kit_number | Textfield | If Tn5 came from a kit, provide the catalog number. | | False |
-| transposition_method | Textfield | Modality of capturing accessible chromatin molecules. The kit used, for example. | | True |
-| transposition_transposase_source | Textfield | The source of the Tn5 transposase and transposon used for capturing accessible chromatin. | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
-
-
+---
+layout: page-triary
+---
+
+# ATACseq Metadata Attributes
+
+Fields that are collected for ATACseq data, available at ```Dataset.metadata.```
+
+
+\* indicates a required field
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| preparation_protocol_doi * | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case, the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | ```https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1``` |
+| dataset_type * | | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium``` |
+| analyte_class * | | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA``` |
+| is_targeted * | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ```Yes``` ```No``` |
+| acquisition_instrument_vendor * | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` |
+| acquisition_instrument_model * | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` |
+| source_storage_duration_value | | How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. | |
+| source_storage_duration_unit * | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` |
+| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | |
+| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```month``` ```day``` ```year``` |
+| contributors_path | | Relative path to file with ORCID IDs for contributors for this dataset. | |
+| data_path | | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | |
+| barcode_offset * | | Positions in the read at which the cell or capture spot barcodes start. Cell and capture spot barcodes are, for example, 3 x 8 bp sequences that are spaced by constant sequences (the offsets). First barcode at position 0, then 38, then 76. This should be included when constructing sequencing libraries with a non-commercial kit. | ```0``` ```8``` ```20``` ```1,27``` ```0,38,76``` ```10,48,86``` ```Not applicable``` |
+| barcode_read * | | Which read file contains the cell or capture spot barcode. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if the source material is barcoded. This field is used to determine which analysis pipeline to run. | ```Read 2 (R2)``` ```Read 1 (R1)``` ```Not applicable``` |
+| barcode_size * | | Length of the cell or capture spot barcode in base pairs. Cell and capture spot barcodes are, for example, 3 x 8 bp sequences that are spaced by constant sequences, the offsets. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if the source material is barcoded. This field is used to determine which analysis pipeline to run. | ```14``` ```16``` ```40``` ```8,8,8``` ```8,6``` ```Not applicable``` |
+| umi_offset * | | Position in the read at which the UMI barcode starts. This should be included when constructing sequencing libraries with a non-commercial kit. | ```0``` ```16``` ```36``` ```Not applicable``` |
+| umi_read * | | Which read file contains the UMI barcode. This should be included when constructing sequencing libraries with a non-commercial kit. | ```Read 2 (R2)``` ```Read 1 (R1)``` ```Not applicable``` |
+| umi_size * | | Length of the UMI barcode in base pairs. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if UMIs are present. This field is used to determine which analysis pipeline to run. | ```8``` ```9``` ```10``` ```12``` ```Not applicable``` |
+| assay_input_entity * | | This is the entity from which the analyte is being captured. For example, for bulk sequencing this would be "tissue", while it would be "single cell" for single cell sequencing. This field is used to determine which analysis pipeline to run. | ```area of interest``` ```single cell``` ```single nucleus``` ```spot``` ```tissue (bulk)``` |
+| number_of_input_cells_or_nuclei | | How many cells or nuclei were input to the assay? This is typically not available for preparations working with bulk tissue. | |
+| library_adapter_sequence | | Adapter sequence to be used for adapter trimming | |
+| library_average_fragment_size | | Average size in basepairs (bp) of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation. | |
+| library_input_amount_value | | The amount of cDNA, after amplification, that was used for library construction. | |
+| library_input_amount_unit * | | unit of library input amount value | ```ng``` ```ul``` |
+| library_output_amount_value | | Total amount (e.g., nanograms) of library after the clean-up step of the final PCR amplification step. Answer the question: What is the Qubit measured concentration (ng/ul) times the elution volume (ul) after the final clean-up step? | |
+| library_output_amount_unit | | Units of library final yield. | ```ng``` ```ul``` |
+| library_concentration_value | | The concentration value of the pooled library samples submitted for sequencing. | |
+| library_concentration_unit * | | Unit of library_concentration_value | ```ng/ul``` ```nM``` |
+| library_layout * | | State whether the library was generated for single-end or paired end sequencing. | ```paired-end``` ```single-end``` |
+| number_of_pcr_cycles_for_indexing | | Number of PCR cycles performed in order to add adapters and amplify the library. This does not include the cDNA amplification which is captured in the "number of iterations of cDNA amplification" field. | |
+| library_preparation_kit * | | Reagent kit used for library preparation | ```10X Genomics; Automated Library Construction Kit``` ```24 rxns; PN 1000428``` ```10X Genomics; Chromium Next GEM Automated Single Cell 5' Kit v2``` ```24 rxns; PN 1000290``` ```4 rxns; PN 1000298``` ```10X Genomics; Chromium Next GEM Single Cell 3' GEM``` ```Library & Gel Bead Kit v3.1``` ```16 rxns; PN 1000121``` ```10X Genomics; Chromium Next GEM Single Cell 3' HT Kit v3.1``` ```48 rxns; PN 1000348``` ```8 rxns; PN 1000370``` ```10X Genomics; Chromium Next GEM Single Cell 3' Kit v3.1``` ```16 rxns; PN 1000268``` ```4 rxns; PN 1000269``` ```10X Genomics; Chromium Next GEM Single Cell 5' Kit v2``` ```16 rxns; PN 1000263``` ```4 rxns; PN 1000265``` ```10X Genomics; Chromium Next GEM Single Cell Fixed RNA Hybridization & Library Kit``` ```4 rxns; PN 1000415``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```16 rxn; PN 1000283``` ```4 rxn; PN 1000285``` ```10X Genomics; Chromium Single Cell 3' GEM``` ```Library & Gel Bead Kit v3``` ```4 rxns PN 1000092``` ```10X Genomics; Chromium Single Cell 3' Library & Gel Bead Kit``` ```4 rxns; PN 120267``` ```10X Genomics; Visium CytAssist Spatial Gene Expression for FFPE``` ```Human Transcriptome``` ```11 mm``` ```2 reactions; PN 1000522``` ```6.5mm``` ```4 reactions; PN 1000520``` ```10X Genomics; Visium Spatial for FFPE Gene Expression Kit``` ```1 slides``` ```4 reactions; PN 1000338``` ```Mouse Transcriptome``` ```4 rxns; PN 1000339``` ```10X Genomics; Visium Spatial Gene Expression Slide and Reagent Kit``` ```4 reactions; PN 1000187``` ```4 slides``` ```16 reactions; PN 1000184``` ```Custom``` ```Illumina; TruSeq Stranded mRNA Library Prep (48 samples); PN 20020594``` ```Illumina; TruSeq Stranded mRNA Library Prep (96 samples); PN 20020595``` ```New England BioLabs; NEBNext Ultra II RNA Library Prep Kit for Illumina; PN E7770``` ```Parse Biosciences; Evercode WT Mini v2 Kit``` ```12 rxns; PN ECW02010``` ```Parse Biosciences; Evercode WT v2 Kit``` ```48 rxns; PN ECW02030)``` |
+| sample_indexing_kit * | | Indexes are needed for multiplexing sequencing libraries for simultaneous sequencing (pooling) and proper attachment to the Illumina flowcell. Each indexing kit would have a number of compatible sequences ("sample indexing sets") that are used to label some number of samples (the number of sets depend on the kit). | ```10X Genomics; Chromium i7 Sample Index Plate (96 rxn); PN 220103``` ```10X Genomics; Dual Index Kit TS``` ```Set A; PN 1000251``` ```10X Genomics; Dual Index Kit TT``` ```Set A (96 rxn); PN 1000215``` ```10X Genomics; Single Index Kit N``` ```Set A (96 rxn); PN 1000212``` ```Custom``` ```Illumina; IDT for Illumina - TruSeq RNA UD Indexes v2 (96 Indexes``` ```96 Samples); PN 20040871``` ```Illumina; TruSeq RNA CD Index Plate (96 Indexes``` ```96 Samples); PN 20019792``` ```Illumina; TruSeq RNA Single Indexes Set A (12 Indexes``` ```48 Samples); PN 20020492``` ```Illumina; TruSeq RNA Single Indexes Set B (12 Indexes``` ```48 Samples); PN 20020493``` ```Integrated DNA Technologies: Custom DNA Oligos``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-AB``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-CD``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-EF``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-GH``` ```Not applicable``` ```Parse Biosciences; Fragmentation Reagents; PN WX100``` ```Parse Biosciences; UDI Plate - WT; PN UDI1001``` |
+| sample_indexing_set | | The specific sequencing barcode index set used, selected from the sample indexing kit. Example: For 10X this might be "SI-GA-A1", for Nextera "N505 - CTCCTTAC" | |
+| is_technical_replicate * | | Is this a sequencing replicate? | ```Yes``` ```No``` |
+| expected_entity_capture_count | | Number of cells, nuclei or capture spots expected to be captured by the assay. For Visium this is the total number of spots covered by tissue, within the capture area. | |
+| sequencing_reagent_kit * | | Reagent kit used for sequencing | ```Custom``` ```Illumina``` ```HiSeq 3000/4000 PE Cluster Kit PE-410-1001``` ```PN 1000283, Illumina``` ```NextSeq 1000/2000 P2 Reagent v3 Kit (100 Cycles)``` ```PN 20046811, Illumina``` ```NextSeq 1000/2000 P2 Reagent v3 Kit (200 Cycles)``` ```PN 20046812, Illumina``` ```NextSeq 1000/2000 P2 Reagent v3 Kit (300 Cycles)``` ```PN 20046813, Illumina``` ```NextSeq 2000 P3 Reagent Kit (300 Cycles)``` ```PN 20040561, Illumina``` ```NextSeq 2000 P3 Reagents Kit (100 Cycles)``` ```PN 20040559, Illumina``` ```NextSeq 500/550 Hi Output Kit 150 Cycles``` ```v2.5``` ```PN 20024907, Illumina``` ```NextSeq 500/550 Hi Output Kit 75 Cycles v2.5``` ```PN 20024906, Illumina``` ```NextSeq 500/550 Mid Output Kit 150 Cycles v2.5``` ```PN 20024904, Illumina``` ```NovaSeq 6000 S1 Reagent Kit (200 Cycles)``` ```PN 20012864, Illumina``` ```NovaSeq 6000 S1 Reagent v1.5 Kit (100 Cycles)``` ```PN 20028319, Illumina``` ```NovaSeq 6000 S1 Reagent v1.5 Kit (200 Cycles)``` ```PN 20028318, Illumina``` ```NovaSeq 6000 S1 Reagent v1.5 Kit (300 Cycles)``` ```PN 20028317, Illumina``` ```NovaSeq 6000 S2 Reagent v1.5 Kit (100 Cycles)``` ```PN 20028316, Illumina``` ```NovaSeq 6000 S4 Reagent Kit v1.5 (300 cycles)``` ```PN 20028312, Illumina``` ```NovaSeq 6000 S4 Reagent v1.5 Kit (200 Cycles)``` ```PN 20028313, Illumina``` ```NovaSeq 6000 SP Reagent v1.5 Kit (100 Cycles)``` ```PN 20028401, Illumina``` ```NovaSeq X Series 1.5B Reagent Kit (100 Cycle)``` ```PN 20104703, Illumina``` ```NovaSeq X Series 1.5B Reagent Kit (200 Cycle)``` ```PN 20104704, Illumina``` ```NovaSeq X Series 1.5B Reagent Kit (300 Cycle)``` ```PN 20104705, Illumina``` ```NovaSeq X Series 10B Reagent Kit (100 Cycle)``` ```PN 20085596, Illumina``` ```NovaSeq X Series 10B Reagent Kit (200 Cycle)``` ```PN 20085595, Illumina``` ```NovaSeq X Series 10B Reagent Kit (300 Cycle)``` ```PN 20085594``` |
+| sequencing_read_format | | Slash-delimited list of the number of sequencing cycles for, for example, Read1, i7 index, i5 index, and Read2. | |
+| transposition_reagent_kit | | | |
+| transposition_method * | | Modality of capturing accessible chromatin molecules. The kit used, for example. | ```bulkATACseq``` ```sciATACseq``` ```Custom``` ```scATACseq``` |
+| sequencing_batch_id | | The ID for the sequencing run. This could, for example, be the chip ID and should allow users the ability to determine which samples were processed together in a sequencing run. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | |
+| capture_batch_id | | A lab-generated ID to identify which cells were captured at the same time. This would, for example, be an ID to denote which datasets were derived from a single 10X Genomics Chromium Controller run. In the case of the 10X Controller this could be the chip ID and would allow users the ability to determine which samples were processed together in a Chromium controller. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | |
+| preparation_instrument_vendor * | | The manufacturer of the instrument used to prepare (staining/processing) the sample for the assay. If an automatic slide staining method was indicated this field should list the manufacturer of the instrument. | ```10x Genomics``` ```Hamamatsu``` ```HTX Technologies``` ```In-House``` ```Leica Biosystems``` ```Not applicable``` ```Roche Diagnostics``` ```SunChrom``` ```Thermo Fisher Scientific``` |
+| preparation_instrument_model * | | Manufacturers of a staining system instrument may offer various versions (models) of that instrument with different features. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```AutoStainer XL``` ```Chromium Connect``` ```Chromium Controller``` ```Chromium iX``` ```Chromium X``` ```Discovery Ultra``` ```EVOS M7000``` ```M3+ Sprayer``` ```M5 Sprayer``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```Not applicable``` ```ST5020 Multistainer``` ```Sublimator``` ```SunCollect Sprayer``` ```TM-Sprayer``` ```Visium CytAssist``` |
+| preparation_instrument_kit * | | The reagent kit used with the preparation instrument. | ```10X Genomics; Chromium Next GEM Chip G Single Cell Kit``` ```16 rxns; PN 1000127``` ```48 rxns; PN 1000120``` ```10X Genomics; Chromium Next GEM Chip K Automated Single Cell Kit``` ```48 rxns; PN 1000289``` ```10X Genomics; Chromium Next GEM Chip K Single Cell Kit``` ```16 rxns; PN 1000287``` ```48 rxns; PN 1000286``` ```10X Genomics; Chromium Next GEM Chip Q Single Cell Kit``` ```16 rxns; PN 1000422``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```16 rxn; PN 1000283``` ```4 rxn; PN 1000285``` ```10X Genomics; Visium FFPE Reagent Kit v2-Small``` ```PN 1000436``` ```Custom``` |
+| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | |
+
+
+
+
+## Deprecated Attributes
+
+
+* indicates a field that was previously required
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| assay_category * | | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ```sequence``` |
+| donor_id | | HuBMAP Display ID of the donor of the assayed tissue. | |
+| execution_datetime | | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | |
+| library_construction_protocols_io_doi | | A link to the protocol document containing the library construction method (including version) that was used, e.g. "Smart-Seq2", "Drop-Seq", "10X v3". | |
+| library_id | | A library ID, unique within a TMC, which allows corresponding RNA and chromatin accessibility datasets to be linked. | |
+| operator | | Name of the person responsible for executing the assay. | |
+| operator_email | | Email address for the operator. | |
+| pi | | Name of the principal investigator responsible for the data. | |
+| pi_email | | Email address for the principal investigator. | |
+| library_pcr_cycles | | Number of PCR cycles performed in order to add adapters and amplify the library. Usually, this includes 5 pre-amplification cycles followed by 0-5 additional cycles determined by qPCR. | |
+| sc_isolation_enrichment * | | The method by which specific cell populations are sorted or enriched. | ```none``` ```FACS``` |
+| sc_isolation_protocols_io_doi | | Link to a protocols document answering the question: How were single cells separated into a single-cell suspension? | |
+| sc_isolation_quality_metric | | A quality metric by visual inspection prior to cell lysis or defined by known parameters such as wells with several cells or no cells. This can be captured at a high level. "OK" or "not OK", or with more specificity such as "debris", "clump", "low clump". | |
+| sc_isolation_tissue_dissociation | | The method by which tissues are dissociated into single cells in suspension. | |
+| sc_isolation_cell_number | | Total number of cell/nuclei yielded post dissociation and enrichment. | |
+| sequencing_phix_percent | | Percent PhiX loaded to the run | |
+| sequencing_read_percent_q30 | | Q30 is the weighted average of all the reads (e.g. # bases UMI * q30 UMI + # bases R2 * q30 R2 + ...) | |
+| transposition_transposase_source * | | The source of the Tn5 transposase and transposon used for capturing accessible chromatin. | ```10X snATAC``` ```In-house``` ```Nextera``` ```10X multiome``` |
+| version * | | Version of the schema to use when validating this metadata. | ```1``` |
+| description | | Free-text description of this assay. | |
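
Reviewer note on `barcode_offset` and `barcode_size` in the ATACseq table above: the two attributes describe a read layout as paired comma-separated lists (for example offsets `0,38,76` with sizes `8,8,8`). The sketch below shows one way those values could be interpreted when slicing barcode segments out of a read. It is not part of the HuBMAP tooling; the function name `split_barcodes` and the 0-based reading of the offsets (suggested by "First barcode at position 0") are assumptions made for illustration.

```python
def split_barcodes(read, barcode_offset, barcode_size):
    """Slice barcode segments out of a read.

    Hypothetical helper: barcode_offset and barcode_size are the
    comma-separated strings from the metadata table, and offsets are
    assumed to be 0-based positions in the read.
    """
    if "Not applicable" in (barcode_offset, barcode_size):
        return []  # the dataset is not barcoded

    starts = [int(value) for value in barcode_offset.split(",")]
    lengths = [int(value) for value in barcode_size.split(",")]
    if len(starts) != len(lengths):
        raise ValueError("barcode_offset and barcode_size list different numbers of segments")

    segments = []
    for start, length in zip(starts, lengths):
        end = start + length
        if end > len(read):
            raise ValueError("read is too short for a barcode at offset %d" % start)
        segments.append(read[start:end])
    return segments


# Synthetic read: three 8 bp barcodes separated by 30 bp constant spacers,
# which places the barcodes at positions 0, 38 and 76.
spacer = "N" * 30
example_read = "AAAAAAAA" + spacer + "CCCCCCCC" + spacer + "GGGGGGGG" + "TTTT"
barcodes = split_barcodes(example_read, "0,38,76", "8,8,8")
```

Raising on a mismatched pair (say, three offsets with two sizes) rather than silently truncating keeps an inconsistent metadata row from producing partial barcodes.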
diff --git a/docs/assays/metadata/AutoFluorescence.md b/docs/assays/metadata/AutoFluorescence.md
index 6abca68..53780c7 100644
--- a/docs/assays/metadata/AutoFluorescence.md
+++ b/docs/assays/metadata/AutoFluorescence.md
@@ -1,105 +1,63 @@
----
-layout: page
----
-# Auto-fluorescence
-
-NOTE: Several versions of this metadata schema have been created over time. The (Latest) version contains most attributes, but there may be some deprecated attributes in the older versions for which data has been collected. HuBMAP is in the process of creating a reference which combines all of these versions into a single view. That reference will be available here once completed.
-
-Version 2 (Latest)
-
-## Version 2 (Latest)
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-----------------------------------------------------|-----------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------|------------|
-| source_storage_duration_value | Numeric | How long was the source material (parent) stored, prior to this sample being processed. | | True |
-| time_since_acquisition_instrument_calibration_value | Numeric | The amount of time since the acqusition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | | False |
-| contributors_path | Textfield | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | True |
-| data_path | Textfield | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." where as for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "/TEST001-RK/" for this field. If there are multiple directory levels, use the format "/TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | True |
-| is_image_preprocessing_required | Allowable Value | Depending on if the acquisition instrument was a microscope, slide scanner, etc. will indicate whether or not any level of preprocessing was required to assemble the image (e.g., fusing image tiles) . | ```Yes``` ```No``` | False |
-| slide_id | Textfield | A unique ID denoting the slide used. This allows users the ability to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | False |
-| tiled_image_columns | Numeric | This is how many columns used in stitching. This is sometimes referred to as the grid size x. | | False |
-| tiled_image_count | Numeric | This is the total number of raw (tiled) images captured, that are to be stitched together. | | False |
-| intended_tile_overlap_percentage | Numeric | The amount of overlap between tiled images. This is the set point, where as during image acquisition there will be slight variations due to stage registration. | | False |
-| dataset_type | Allowable Value | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium```| True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA ```| True |
-| acquisition_instrument_vendor | Allowable Value | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` | True |
-| acquisition_instrument_model | Allowable Value | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` | True |
-| source_storage_duration_unit | Allowable Value | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | True |
-| time_since_acquisition_instrument_calibration_unit | Allowable Value | The time unit of measurement | ```month``` ```day``` ```year``` | False |
-| tile_configuration | Allowable Value | This is how the tiles are configured for stitching. | ```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | False |
-| scan_direction | Allowable Value | This is the direction of imaging, which is required for stitching. | ```Left-and-down``` ```Left-and-up``` ```Not applicable``` ```Right-and-down``` ```Right-and-up``` | False |
-| metadata_schema_id | Textfield | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | True |
-| preparation_protocol_doi | Textfield | DOI for the protocols.io page that describes the assay or sample procurment and preparation. For example for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` | True |
-| antibodies_path | Textfield | This is the location of the antibodies.tsv file relative to the root of the top level of the upload directory structure. This path should begin with "." and would likely be something like "./extras/antibodies.tsv". | | True |
-| parent_sample_id | Textfield | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | True |
-
-
-
-Version 1
-
-## Version 1
-
-| Attribute | Type | Description | AllowableValues | Required |
-|-------------------------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------|------------|
-| version | Allowable Value | Version of the schema to use when validating this metadata. | ['1'] | True |
-| description | Textfield | Free-text description of this assay. | | True |
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['imaging'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['AF'] | True |
-| analyte_class | Textfield | Analytes are the target molecules being measured with the assay. | | False |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ['Yes','No'] | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
-| resolution_x_value | Numeric | The width of a pixel. | | True |
-| resolution_x_unit | Allowable Value | The unit of measurement of the width of a pixel. | ['nm', 'um'] | False |
-| resolution_y_value | Numeric | The height of a pixel | | True |
-| resolution_y_unit | Allowable Value | The unit of measurement of the height of a pixel. | ['nm', 'um'] | False |
-| resolution_z_value | Numeric | Optional if assay does not have multiple z-levels. Note that this is resolution within a given sample: z-pitch (resolution_z_value) is the increment distance between image slices, ie. the microscope stage is moved up or down in increments to capture images of several focal planes. | | True |
-| resolution_z_unit | Allowable Value | The unit of incremental distance between image slices. | ['mm', 'um', 'nm'] | False |
-| number_of_channels | Numeric | Number of channels capturing the emission spectrum from natural fluorophores in the sample. | | True |
-| overall_protocols_io_doi | Textfield | DOI for protocols.io referring to the overall protocol for the assay. | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
-
-
-
-Version 0
-
-## Version 0
-
-| Attribute | Type | Description | AllowableValues | Required |
-|-------------------------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------|------------|
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['imaging'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['AF'] | True |
-| analyte_class | Textfield | Analytes are the target molecules being measured with the assay. | | False |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ['Yes','No'] | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
-| resolution_x_value | Numeric | The width of a pixel. | | True |
-| resolution_x_unit | Allowable Value | The unit of measurement of the width of a pixel. | ['nm', 'um'] | False |
-| resolution_y_value | Numeric | The height of a pixel | | True |
-| resolution_y_unit | Allowable Value | The unit of measurement of the height of a pixel. | ['nm', 'um'] | False |
-| resolution_z_value | Numeric | Optional if assay does not have multiple z-levels. Note that this is resolution within a given sample: z-pitch (resolution_z_value) is the increment distance between image slices, ie. the microscope stage is moved up or down in increments to capture images of several focal planes. | | True |
-| resolution_z_unit | Allowable Value | The unit of incremental distance between image slices. | ['mm', 'um', 'nm'] | False |
-| number_of_channels | Numeric | Number of channels capturing the emission spectrum from natural fluorophores in the sample. | | True |
-| overall_protocols_io_doi | Textfield | DOI for protocols.io referring to the overall protocol for the assay. | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
-
-
+---
+layout: page-triary
+---
+
+# Auto-fluorescence Metadata Attributes
+
+Fields that are collected for Auto-fluorescence data, available at ```Dataset.metadata.```
+
+
+\* indicates a required field
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | |
+| lab_id | | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | |
+| preparation_protocol_doi * | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | ```https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1``` |
+| dataset_type * | | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium``` |
+| analyte_class * | | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA``` |
+| is_targeted * | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ```Yes``` ```No``` |
+| acquisition_instrument_vendor * | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` |
+| acquisition_instrument_model * | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` |
+| source_storage_duration_value | | How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. | |
+| source_storage_duration_unit * | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` |
+| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | |
+| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` |
+| contributors_path | | Relative path to file with ORCID IDs for contributors for this dataset. | |
+| data_path | | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | |
+| antibodies_path | | This is the location of the antibodies.tsv file relative to the root of the top level of the upload directory structure. This path should begin with "." and would likely be something like "./extras/antibodies.tsv". | |
+| is_image_preprocessing_required * | | Indicates whether image preprocessing is necessary based on the type of acquisition instrument used, such as a microscope or slide scanner. This may involve steps like fusing image tiles to assemble the complete image. Example: Yes | ```Yes``` ```No``` |
+| slide_id | | A unique ID denoting the slide used. This lets users determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | |
+| tile_configuration | | The configuration of tiles used for stitching in the assay process. If no tile configuration is applicable, enter "Not applicable". Example: Row-by-row | ```Column-by-column``` ```Not applicable``` ```Snake-by-columns``` ```Row-by-row``` ```Snake-by-rows``` |
+| scan_direction | | The direction of imaging, which is necessary for the stitching process. Example: Left-and-down | ```Left-and-down``` ```Right-and-down``` ```Not applicable``` ```Right-and-up``` ```Left-and-up``` |
+| tiled_image_columns | | The number of columns used in the stitching process of a tiled image, often referred to as the grid size in the x-dimension. Example: 5 | |
+| tiled_image_count | | The total number of raw tiled images captured, which are intended to be stitched together. Example: 75 | |
+| intended_tile_overlap_percentage | | The intended percentage of overlap between tiled images. This value serves as the set point, although slight variations may occur during image acquisition due to stage registration. Example: 5 | |
+| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | |
+
+
+
+
+## Deprecated Attributes
+
+\* indicates a field that was previously required
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| resolution_x_unit | | The unit of measurement of the width of a pixel. | ```mm``` ```um``` ```nm``` |
+| resolution_x_value | | The width of a pixel. | |
+| resolution_y_unit | | The unit of measurement of the height of a pixel. | ```mm``` ```um``` ```nm``` |
+| resolution_y_value | | The height of a pixel. | |
+| resolution_z_unit | | The unit of incremental distance between image slices. | ```mm``` ```um``` ```nm``` |
+| resolution_z_value | | Optional if assay does not have multiple z-levels. Note that this is resolution within a given sample: z-pitch (resolution_z_value) is the increment distance between image slices, i.e., the microscope stage is moved up or down in increments to capture images of several focal planes. | |
+| assay_category * | | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ```sequence``` |
+| description | | Free-text description of this assay. | |
+| donor_id | | HuBMAP Display ID of the donor of the assayed tissue. | |
+| execution_datetime | | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | |
+| number_of_channels | | Number of channels capturing the emission spectrum from natural fluorophores in the sample. | |
+| operator | | Name of the person responsible for executing the assay. | |
+| operator_email | | Email address for the operator. | |
+| overall_protocols_io_doi | | DOI for protocols.io referring to the overall protocol for the assay. | |
+| principal_investigator | | The full name of the principal investigator responsible for the data. | |
+| pi_email | | Email address for the principal investigator. | |
+| version * | | Version of the schema to use when validating this metadata. | ```1``` |
diff --git a/docs/assays/metadata/CODEX.md b/docs/assays/metadata/CODEX.md
index f2efa40..ef54ebf 100644
--- a/docs/assays/metadata/CODEX.md
+++ b/docs/assays/metadata/CODEX.md
@@ -1,120 +1,62 @@
----
-layout: page
----
-# CODEX
-
-NOTE: Several versions of this metadata schema have been created over time. The (Latest) version contains most attributes, but there may be some deprecated attributes in the older versions for which data has been collected. HuBMAP is in the process of creating a reference which combines all of these versions into a single view. That reference will be available here once completed.
-
-Version 2 (Latest)
-
-## Version 2
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-----------------------------------------------------|-----------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------|------------|
-| source_storage_duration_value | Numeric | How long was the source material (parent) stored, prior to this sample being processed. | | True |
-| time_since_acquisition_instrument_calibration_value | Numeric | The amount of time since the acqusition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | | False |
-| contributors_path | Textfield | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | True |
-| data_path | Textfield | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." where as for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "/TEST001-RK/" for this field. If there are multiple directory levels, use the format "/TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | True |
-| number_of_antibodies | Numeric | Number of antibodies | | True |
-| number_of_channels | Numeric | Number of fluorescent channels imaged during each cycle. | | True |
-| number_of_biomarker_imaging_rounds | Numeric | Number of imaging rounds to capture the tagged biomarkers. For CODEX a biomarker imaging round consists of 1. oligo application, 2. fluor application, 3. washes. For Cell DIVE a biomarker imaging round consists of 1. staining of a biomarker via secondary detection or direct conjugate and 2. dye inactivation. | | True |
-| number_of_total_imaging_rounds | Numeric | The total number of acquisitions performed on microscope to collect autofluorescence/background or stained signal (e.g., histology). | | True |
-| slide_id | Textfield | A unique ID denoting the slide used. This allows users the ability to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | False |
-| total_run_time_value | Numeric | How long the tissue was on the acquisition instrument. | | False |
-| dataset_type | Allowable Value | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium```| True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA ```| True |
-| acquisition_instrument_vendor | Allowable Value | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` | True |
-| acquisition_instrument_model | Allowable Value | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` | True |
-| source_storage_duration_unit | Allowable Value | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | True |
-| time_since_acquisition_instrument_calibration_unit | Allowable Value | The time unit of measurement |```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | False |
-| total_run_time_unit | Allowable Value | The units for the total run time unit field. | ```Hour``` ```Minute``` | False |
-| metadata_schema_id | Textfield | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | True |
-| preparation_protocol_doi | Textfield | DOI for the protocols.io page that describes the assay or sample procurment and preparation. For example for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` | True |
-| antibodies_path | Textfield | This is the location of the antibodies.tsv file relative to the root of the top level of the upload directory structure. This path should begin with "." and would likely be something like "./extras/antibodies.tsv". | | True |
-| preparation_instrument_vendor | Allowable Value | The manufacturer of the instrument used to prepare (staining/processing) the sample for the assay. If an automatic slide staining method was indicated this field should list the manufacturer of the instrument. | ```10x Genomics``` ```Hamamatsu``` ```HTX Technologies``` ```In-House``` ```Leica Biosystems``` ```Not applicable``` ```Roche Diagnostics``` ```SunChrom``` ```Thermo Fisher Scientific``` | True |
-| preparation_instrument_model | Allowable Value | Manufacturers of a staining system instrument may offer various versions (models) of that instrument with different features. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```AutoStainer XL``` ```Chromium Connect``` ```Chromium Controller``` ```Chromium iX``` ```Chromium X``` ```Discovery Ultra``` ```EVOS M7000``` ```M3+ Sprayer``` ```M5 Sprayer``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```Not applicable``` ```ST5020 Multistainer``` ```Sublimator``` ```SunCollect Sprayer``` ```TM-Sprayer``` ```Visium CytAssist ```| True |
-| parent_sample_id | Textfield | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | True |
-
-
-
-Version 1
-
-## Version 1
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-------------------------------|-----------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------|------------|
-| version | Allowable Value | Version of the schema to use when validating this metadata. | ['1'] | True |
-| description | Textfield | Free-text description of this assay. | | True |
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped foldergenerated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year,MM is the month with leading 0s, and DD is the day with leading 0s, hh is thehour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories:generation of images of microscopic entities, identification & quantitation ofmolecules by mass spectrometry, imaging mass spectrometry, and determination ofnucleotide sequence. | ['imaging'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['CODEX', 'CODEX2'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['protein'] | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted fordetection/measurement by the assay. | ['Yes', 'No'] | True |
-| acquisition_instrument_vendor | Allowable Value | An acquisition_instrument is the device that contains the signal detectionhardware and signal processing software. Assays generate signals such as lightof various intensities or color or signals representing molecular mass. | ['Keyence', 'Zeiss'] | True |
-| acquisition_instrument_model | Allowable Value | Manufacturers of an acquisition instrument may offer various versions(models) of that instrument with different features or sensitivities. Differencesin features or sensitivities may be relevant to processing or interpretation ofthe data. | ['BZ-X800', 'BZ-X710', 'Axio Observer Z1'] | True |
-| resolution_x_value | Numeric | The width of a pixel. (Akoya pixel is 377nm square) | | True |
-| resolution_x_unit | Allowable Value | The unit of measurement of width of a pixel.(nm) | ['mm', 'um', 'nm'] | False |
-| resolution_y_value | Numeric | The height of a pixel. (Akoya pixel is 377nm square) | | True |
-| resolution_y_unit | Allowable Value | The unit of measurement of height of a pixel. (nm) | ['mm', 'um', 'nm'] | False |
-| resolution_z_value | Numeric | Optional if assay does not have multiple z-levels. Note that thisis resolution within a given sample: z-pitch (resolution_z_value) is the incrementdistance between image slices (for Akoya, z-pitch=1.5um) ie. the microscope stageis moved up or down in increments of 1.5um to capture images of several focalplanes. The best one will be used & the rest discarded. The thickness of the sampleitself is sample metadata. | | False |
-| resolution_z_unit | Allowable Value | The unit of incremental distance between image slices. | ['mm', 'um', 'nm'] | False |
-| preparation_instrument_vendor | Allowable Value | The manufacturer of the instrument used to prepare the sample for theassay. | ['CODEX'] | True |
-| preparation_instrument_model | Allowable Value | The model number/name of the instrument used to prepare the samplefor the assay | ['version 1 robot', 'prototype robot - Stanford/Nolan Lab'] | True |
-| number_of_antibodies | Numeric | Number of antibodies | | True |
-| number_of_channels | Numeric | Number of fluorescent channels imaged during each cycle. | | True |
-| number_of_cycles | Numeric | Number of cycles of 1. oligo application, 2. fluor application, 3.washes | | True |
-| section_prep_protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for preparing tissuesections for the assay. | | True |
-| reagent_prep_protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for preparing reagentsfor the assay. | | True |
-| antibodies_path | Textfield | Relative path to file with antibody information for this dataset. | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstreamprocessing will depend on filename extension conventions. | | True |
-
-
-
-Version 0
-
-## Version 0
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-------------------------------|-----------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------|------------|
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped foldergenerated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year,MM is the month with leading 0s, and DD is the day with leading 0s, hh is thehour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories:generation of images of microscopic entities, identification & quantitation ofmolecules by mass spectrometry, imaging mass spectrometry, and determination ofnucleotide sequence. | ['imaging'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['CODEX'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['protein'] | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted fordetection/measurement by the assay. | ['Yes', 'No'] | True |
-| acquisition_instrument_vendor | Allowable Value | An acquisition_instrument is the device that contains the signal detectionhardware and signal processing software. Assays generate signals such as lightof various intensities or color or signals representing molecular mass. | ['Keyence', 'Zeiss'] | True |
-| acquisition_instrument_model | Allowable Value | Manufacturers of an acquisition instrument may offer various versions(models) of that instrument with different features or sensitivities. Differencesin features or sensitivities may be relevant to processing or interpretation ofthe data. | ['BZ-X800', 'BZ-X710', 'Axio Observer Z1'] | True |
-| resolution_x_value | Numeric | The width of a pixel. (Akoya pixel is 377nm square) | | True |
-| resolution_x_unit | Allowable Value | The unit of measurement of width of a pixel.(nm) | ['mm', 'um', 'nm'] | False |
-| resolution_y_value | Numeric | The height of a pixel. (Akoya pixel is 377nm square) | | True |
-| resolution_y_unit | Allowable Value | The unit of measurement of height of a pixel. (nm) | ['mm', 'um', 'nm'] | False |
-| resolution_z_value | Numeric | Optional if assay does not have multiple z-levels. Note that thisis resolution within a given sample: z-pitch (resolution_z_value) is the incrementdistance between image slices (for Akoya, z-pitch=1.5um) ie. the microscope stageis moved up or down in increments of 1.5um to capture images of several focalplanes. The best one will be used & the rest discarded. The thickness of the sampleitself is sample metadata. | | False |
-| resolution_z_unit | Allowable Value | The unit of incremental distance between image slices. | ['mm', 'um', 'nm'] | False |
-| | Textfield | | | |
-| preparation_instrument_vendor | Allowable Value | The manufacturer of the instrument used to prepare the sample for theassay. | ['CODEX'] | True |
-| preparation_instrument_model | Allowable Value | The model number/name of the instrument used to prepare the samplefor the assay | ['version 1 robot', 'prototype robot - Stanford/Nolan Lab'] | True |
-| number_of_antibodies | Numeric | Number of antibodies | | True |
-| number_of_channels | Numeric | Number of fluorescent channels imaged during each cycle. | | True |
-| number_of_cycles | Numeric | Number of cycles of 1. oligo application, 2. fluor application, 3.washes | | True |
-| section_prep_protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for preparing tissuesections for the assay. | | True |
-| reagent_prep_protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for preparing reagentsfor the assay. | | True |
-| antibodies_path | Textfield | Relative path to file with antibody information for this dataset. | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstreamprocessing will depend on filename extension conventions. | | True |
-
-
+---
+layout: page-triary
+---
+
+# CODEX Metadata Attributes
+
+Fields that are collected for CODEX data, available at ```Dataset.metadata.```
+
+
+* indicates a required field
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma-separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | |
+| lab_id | | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | |
+| preparation_protocol_doi * | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | ```https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1``` |
+| dataset_type * | | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium``` |
+| analyte_class * | | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA``` |
+| is_targeted * | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ```Yes``` ```No``` |
+| acquisition_instrument_vendor * | | An acquisition_instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` |
+| acquisition_instrument_model * | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` |
+| source_storage_duration_value | | How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. | |
+| source_storage_duration_unit * | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` |
+| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | |
+| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` |
+| contributors_path | | Relative path to file with ORCID IDs for contributors for this dataset. | |
+| data_path | | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | |
+| antibodies_path | | Relative path to file with antibody information for this dataset. | |
+| preparation_instrument_vendor * | | The manufacturer of the instrument used to prepare the sample for the assay. | ```10x Genomics``` ```Hamamatsu``` ```HTX Technologies``` ```In-House``` ```Leica Biosystems``` ```Not applicable``` ```Roche Diagnostics``` ```SunChrom``` ```Thermo Fisher Scientific``` |
+| preparation_instrument_model * | | The model number/name of the instrument used to prepare the sample for the assay | ```AutoStainer XL``` ```Chromium Connect``` ```Chromium Controller``` ```Chromium iX``` ```Chromium X``` ```Discovery Ultra``` ```EVOS M7000``` ```M3+ Sprayer``` ```M5 Sprayer``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```Not applicable``` ```ST5020 Multistainer``` ```Sublimator``` ```SunCollect Sprayer``` ```TM-Sprayer``` ```Visium CytAssist``` |
+| total_run_time_value | | How long the tissue was on the acquisition instrument. | |
+| total_run_time_unit * | | The unit of measurement for the total run time value field. | ```Hour``` ```Minute``` |
+| number_of_antibodies | | Number of antibodies | |
+| number_of_channels | | Number of fluorescent channels imaged during each cycle. | |
+| number_of_biomarker_imaging_rounds | | Number of imaging rounds to capture the tagged biomarkers. For CODEX a biomarker imaging round consists of 1. oligo application, 2. fluor application, 3. washes. For Cell DIVE a biomarker imaging round consists of 1. staining of a biomarker via secondary detection or direct conjugate and 2. dye inactivation. | |
+| number_of_total_imaging_rounds | | The total number of acquisitions performed on microscope to collect autofluorescence/background or stained signal (e.g., histology). | |
+| slide_id | | A unique ID denoting the slide used. This allows users the ability to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | |
+| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | |
+
+
+
+
+## Deprecated Attributes
+
+
+* indicates a field that was previously required
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| assay_category * | | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ```sequence``` |
+| donor_id | | HuBMAP Display ID of the donor of the assayed tissue. | |
+| execution_datetime | | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | |
+| operator | | Name of the person responsible for executing the assay. | |
+| operator_email | | Email address for the operator. | |
+| pi | | Name of the principal investigator responsible for the data. | |
+| pi_email | | Email address for the principal investigator. | |
+| resolution_x_unit | | The unit of measurement of width of a pixel. (nm) | ```mm``` ```um``` ```nm``` |
+| resolution_x_value | | The width of a pixel. (Akoya pixel is 377nm square) | |
+| resolution_y_unit | | The unit of measurement of height of a pixel. (nm) | ```mm``` ```um``` ```nm``` |
+| resolution_y_value | | The height of a pixel. (Akoya pixel is 377nm square) | |
+| resolution_z_unit | | The unit of incremental distance between image slices. | ```mm``` ```um``` ```nm``` |
+| resolution_z_value | | Optional if assay does not have multiple z-levels. Note that this is resolution within a given sample: z-pitch (resolution_z_value) is the increment distance between image slices (for Akoya, z-pitch=1.5um), i.e., the microscope stage is moved up or down in increments of 1.5um to capture images of several focal planes. The best one will be used & the rest discarded. The thickness of the sample itself is sample metadata. | |
diff --git a/docs/assays/metadata/Cell-DIVE.md b/docs/assays/metadata/Cell-DIVE.md
new file mode 100644
index 0000000..6afdcf1
--- /dev/null
+++ b/docs/assays/metadata/Cell-DIVE.md
@@ -0,0 +1,62 @@
+---
+layout: page-triary
+---
+
+# Cell DIVE Metadata Attributes
+
+Fields that are collected for Cell DIVE data, available at ```Dataset.metadata.```
+
+
+* indicates a required field
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma-separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | |
+| lab_id | | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | |
+| preparation_protocol_doi * | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | ```https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1``` |
+| dataset_type * | | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium``` |
+| analyte_class * | | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA``` |
+| is_targeted * | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` |
+| acquisition_instrument_vendor * | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` |
+| acquisition_instrument_model * | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` |
+| source_storage_duration_value | | How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. | |
+| source_storage_duration_unit * | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` |
+| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | |
+| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` |
+| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | |
+| data_path | | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | |
+| antibodies_path | | This is the location of the antibodies.tsv file relative to the root of the top level of the upload directory structure. This path should begin with "." and would likely be something like "./extras/antibodies.tsv". | |
+| number_of_antibodies | | Number of antibodies | |
+| number_of_channels | | Number of fluorescent channels imaged during each cycle. | |
+| number_of_biomarker_imaging_rounds | | Number of imaging rounds to capture the tagged biomarkers. For CODEX a biomarker imaging round consists of 1. oligo application, 2. fluor application, 3. washes. For Cell DIVE a biomarker imaging round consists of 1. staining of a biomarker via secondary detection or direct conjugate and 2. dye inactivation. | |
+| number_of_total_imaging_rounds | | The total number of acquisitions performed on microscope to collect autofluorescence/background or stained signal (e.g., histology). | |
+| slide_id | | A unique ID denoting the slide used. This allows users the ability to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | |
+| cell_boundary_marker_or_stain * | | If a marker or stain was used to identify all cell boundaries in the tissue, then the name of the marker or stain should be included here. The name of the antibody-targeted molecule marker or non-antibody targeted molecule stain included here must be identical to what is found in the imaging data. For example, with the PhenoCycler, this name must match the value found in the XPD output file. If multiple markers or stains are used to identify all cell boundaries, then a comma-separated list should be used here. | ```NAKATPASE``` ```CD298``` ```Not applicable``` |
+| nuclear_marker_or_stain * | | For markers, an antibody-targeted molecule present in or around the cell nucleus, the protein or gene symbol that identifies the antibody target that is used as the nuclear marker. This symbol must match the antibody target that is either generated from the panel used or entered with custom panels. Preferably, if using a custom antibody marker, this symbol should be the HGNC symbol (https://www.genenames.org/). For non-protein targets this is the stain name (e.g., DAPI) and, when appropriate, associated staining kit and vendor. For the PhenoCycler, this symbol must match the value found in the XPD output file. | ```DAPI``` ```Not applicable``` |
+| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | |
+
+
+
+
+## Deprecated Attributes
+
+
+* indicates a field that was previously required
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| assay_category * | | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ```sequence``` |
+| description | | Free-text description of this assay. | |
+| donor_id | | HuBMAP Display ID of the donor of the assayed tissue. | |
+| execution_datetime | | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | |
+| operator | | Name of the person responsible for executing the assay. | |
+| operator_email | | Email address for the operator. | |
+| overall_protocols_io_doi | | DOI for protocols.io referring to the overall protocol for the assay. | |
+| pi | | Name of the principal investigator responsible for the data. | |
+| pi_email | | Email address for the principal investigator. | |
+| processing_protocols_io_doi | | DOI for analysis protocols.io for this assay. | |
+| resolution_x_unit | | The unit of measurement of width of a pixel. (nm) | ```mm``` ```um``` ```nm``` |
+| resolution_x_value | | The width of a pixel. (Akoya pixel is 377nm square) | |
+| resolution_y_unit | | The unit of measurement of height of a pixel. (nm) | ```mm``` ```um``` ```nm``` |
+| resolution_y_value | | The height of a pixel. (Akoya pixel is 377nm square) | |
+| version * | | Version of the schema to use when validating this metadata. | ```1``` |
diff --git a/docs/assays/metadata/DESI.md b/docs/assays/metadata/DESI.md
index 7ead4ea..ec5a9b3 100644
--- a/docs/assays/metadata/DESI.md
+++ b/docs/assays/metadata/DESI.md
@@ -1,43 +1,69 @@
----
-layout: page
----
-# DESI
-
-Version 2 (Latest)
-
-## Version 2 (Latest)
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-----------------------------------------------------|-----------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------|------------|
-| dataset_type | Allowable Value | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium```| True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA ```| True |
-| acquisition_instrument_vendor | Allowable Value | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` | True |
-| acquisition_instrument_model | Allowable Value | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` | True |
-| source_storage_duration_value | Numeric | How long was the source material (parent) stored, prior to this sample being processed. | | True |
-| source_storage_duration_unit | Allowable Value | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | True |
-| time_since_acquisition_instrument_calibration_value | Numeric | The amount of time since the acqusition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | | False |
-| time_since_acquisition_instrument_calibration_unit | Allowable Value | The time unit of measurement |```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | False |
-| preparation_protocol_doi | Textfield | DOI for the protocols.io page that describes the assay or sample procurment and preparation. For example for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` | True |
-| contributors_path | Textfield | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | True |
-| data_path | Textfield | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." where as for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | True |
-| mass_analysis_polarity | Allowable Value | The polarity of the mass analysis (positive or negative ion modes). | ```Negative and positive ion mode``` ```Negative ion mode``` ```Positive ion mode``` | True |
-| mass-to-charge_range_low_value | Numeric | The low value of the scanned mass-to-charge range, for MS1. (unitless) | | True |
-| mass-to-charge_range_high_value | Numeric | The high value of the scanned mass-to-charge range, for MS1. (unitless) | | True |
-| mass_resolving_power | Numeric | The mass resolving power m/∆m, where ∆m is defined as the full width at half-maximum (FWHM) for a given peak with a specified mass-to-charge (m/z). (unitless) | | True |
-| mass-to-charge_resolving_power | Numeric | The peak (m/z) used to calculate the resolving power. | | True |
-| matrix_deposition_method | Allowable Value | Common methods of depositing matrix for assisting in desorption and ionization in imaging mass spectrometry include robotic spotting, electrospray deposition, and sublimation. | ```Electrospray deposition``` ```Not applicable``` ```Robotic spotting``` ```Robotic spraying``` ```Sublimation``` | False |
-| preparation_instrument_vendor | Allowable Value | The manufacturer of the instrument used to prepare (staining/processing) the sample for the assay. If an automatic slide staining method was indicated this field should list the manufacturer of the instrument. | ```10x Genomics``` ```Hamamatsu``` ```HTX Technologies``` ```In-House``` ```Leica Biosystems``` ```Not applicable``` ```Roche Diagnostics``` ```SunChrom``` ```Thermo Fisher Scientific``` | False |
-| preparation_instrument_model | Allowable Value | Manufacturers of a staining system instrument may offer various versions (models) of that instrument with different features. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```AutoStainer XL``` ```Chromium Connect``` ```Chromium Controller``` ```Chromium iX``` ```Chromium X``` ```Discovery Ultra``` ```EVOS M7000``` ```M3+ Sprayer``` ```M5 Sprayer``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```Not applicable``` ```ST5020 Multistainer``` ```Sublimator``` ```SunCollect Sprayer``` ```TM-Sprayer``` ```Visium CytAssist ```| False |
-| preparation_matrix | Allowable Value | The matrix is a compound of crystallized molecules that acts like a buffer between the sample and the ionizing probe. It also helps ionize the sample, carrying it along the flight tube so it can be detected. | ```2,5-DHA (2,5-dihydroxyacetophenone)``` ```2,5-DHB (2,5-Dihydroxybenzoic acid)``` ```9-AA (9-aminoacridine)``` ```CHCA (alpha-cyano-4-hydroxy-cinnamic acid)``` ```DAN (1,5-diaminonapthalene)``` ```DMACA (4-(dimethylamino)cinnamic acid)``` ```NEDC (N-(1-naphthyl) ethylenediamine dihydrochloride)``` ```SA (sinapic acid)``` | False |
-| metadata_schema_id | Textfield | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | True |
-| ion_mobility | Allowable Value | Specifies which technology was used for ion mobility spectrometry. Technologies for measuring ion mobility: Traveling Wave Ion Mobility Spectrometry (TWIMS), Trapped Ion Mobility Spectrometry (TIMS), High Field Asymmetric waveform ion Mobility Spectrometry (FAIMS), Drift Tube Ion Mobility Spectrometry (DTIMS), Structures for Lossless Ion Manipulations (SLIM), and cyclic Ion Mobility Spectrometry (cIMS). | ```cIMS``` ```DTIMS``` ```FAIMS``` ```SLIM``` ```TIMS``` ```TWIMS``` | False |
-| desorption_solvent | Allowable Value | Solvent composition for conducting nanospray desorption electrospray ionization (nanoDESI) or desorption electrospray ionization (DESI). | ```Acetonitrile:Dimethylformamide (ACN:DMF)``` ```Acetonitrile:Water (ACN:H2O)``` ```Ethanol:Dimethylformamide (EtOH:DMF)``` ```Ethanol:Water (EtOH:H2O)``` ```Methanol:Ethanol (MeOH:EtOH)``` ```Methanol:Water (MeOH:H2O)``` | True |
-| desorption_solvent_flow_rate_value | Numeric | The rate of flow of the solvent into a spray. | | True |
-| desorption_solvent_flow_rate_unit | Allowable Value | Units of the rate of solvent flow. | ```nL/min``` ```uL/min``` | True |
-| analysis_protocol_doi | Textfield | A DOI to a protocols.io protocol describing the software and database(s) used to process the raw data. Example: https://dx.doi.org/10.17504/protocols.io.bsu5ney6 | | True |
-| ms_ionization_technique | Allowable Value | The ionization approach (i.e., sample probing method) for performing imaging mass spectrometry. | ```DESI``` ```ESI``` ```HESI``` ```LA``` ```LDI``` ```MALDI``` ```MALDI-2``` ```nanoDESI``` ```SIMS-C60``` ```SIMS-H20 ```| True |
-| ms_scan_mode | Allowable Value | MS (mass spectrometry) scan mode refers to the number of steps in the separation of fragments. | ```MS1``` ```MS2``` ```MS3``` | True |
-| parent_sample_id | Textfield | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | True |
-
-
+---
+layout: page-triary
+---
+
+# DESI Metadata Attributes
+
+Fields that are collected for DESI data, available at ```Dataset.metadata.```
+
+
+\* indicates a required field
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | |
+| lab_id | | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | |
+| preparation_protocol_doi * | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case, the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | ```https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1``` |
+| dataset_type * | | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium``` |
+| analyte_class * | | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA``` |
+| is_targeted * | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` |
+| acquisition_instrument_vendor * | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` |
+| acquisition_instrument_model * | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` |
+| source_storage_duration_value | | How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. | |
+| source_storage_duration_unit * | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` |
+| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | |
+| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` |
+| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | |
+| data_path | | The top level directory containing the raw and/or processed data. For a single dataset upload this might be ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK", use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | |
+| ms_ionization_technique * | | The ionization approach (i.e., sample probing method) for performing imaging mass spectrometry. | ```DESI``` ```ESI``` ```HESI``` ```LA``` ```LDI``` ```MALDI``` ```MALDI-2``` ```nanoDESI``` ```SIMS-C60``` ```SIMS-H20``` |
+| ms_scan_mode * | | MS (mass spectrometry) scan mode refers to the number of steps in the separation of fragments. | ```MS1``` ```MS2``` ```MS3``` |
+| mass_analysis_polarity * | | The polarity of the mass analysis (positive or negative ion modes). | ```Negative and positive ion mode``` ```Negative ion mode``` ```Positive ion mode``` |
+| mass_to_charge_range_low_value | | The low value of the scanned mass-to-charge range, for MS1. (unitless) | |
+| mass_to_charge_range_high_value | | The high value of the scanned mass-to-charge range, for MS1. (unitless) | |
+| mass_resolving_power | | The mass resolving power m/∆m, where ∆m is defined as the full width at half-maximum (FWHM) for a given peak with a specified mass-to-charge (m/z). (unitless) | |
+| mass_to_charge_resolving_power | | The peak (m/z) used to calculate the resolving power. | |
+| ion_mobility | | Specifies which technology was used for ion mobility spectrometry. Technologies for measuring ion mobility: Traveling Wave Ion Mobility Spectrometry (TWIMS), Trapped Ion Mobility Spectrometry (TIMS), High Field Asymmetric waveform ion Mobility Spectrometry (FAIMS), Drift Tube Ion Mobility Spectrometry (DTIMS), Structures for Lossless Ion Manipulations (SLIM), and cyclic Ion Mobility Spectrometry (cIMS). | ```TIMS``` ```SLIM``` ```FAIMS``` ```DTIMS``` ```cIMS``` ```TWIMS``` |
+| matrix_deposition_method * | | Common methods of depositing matrix for assisting in desorption and ionization in imaging mass spectrometry include robotic spotting, electrospray deposition, and sublimation. | ```Electrospray deposition``` ```Not applicable``` ```Robotic spotting``` ```Robotic spraying``` ```Sublimation``` |
+| preparation_instrument_vendor * | | The manufacturer of the instrument used to prepare (staining/processing) the sample for the assay. If an automatic slide staining method was indicated this field should list the manufacturer of the instrument. | ```10x Genomics``` ```Hamamatsu``` ```HTX Technologies``` ```In-House``` ```Leica Biosystems``` ```Not applicable``` ```Roche Diagnostics``` ```SunChrom``` ```Thermo Fisher Scientific``` |
+| preparation_instrument_model * | | Manufacturers of a staining system instrument may offer various versions (models) of that instrument with different features. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```AutoStainer XL``` ```Chromium Connect``` ```Chromium Controller``` ```Chromium iX``` ```Chromium X``` ```Discovery Ultra``` ```EVOS M7000``` ```M3+ Sprayer``` ```M5 Sprayer``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```Not applicable``` ```ST5020 Multistainer``` ```Sublimator``` ```SunCollect Sprayer``` ```TM-Sprayer``` ```Visium CytAssist``` |
+| preparation_matrix * | | The matrix is a compound of crystallized molecules that acts like a buffer between the sample and the ionizing probe. It also helps ionize the sample, carrying it along the flight tube so it can be detected. | ```2,5-DHA (2,5-dihydroxyacetophenone)``` ```2,5-DHB (2,5-Dihydroxybenzoic acid)``` ```9-AA (9-aminoacridine)``` ```CHCA (alpha-cyano-4-hydroxy-cinnamic acid)``` ```DAN (1,5-diaminonapthalene)``` ```DMACA (4-(dimethylamino)cinnamic acid)``` ```NEDC (N-(1-naphthyl) ethylenediamine dihydrochloride)``` ```SA (sinapic acid)``` |
+| desorption_solvent * | | Solvent composition for conducting nanospray desorption electrospray ionization (nanoDESI) or desorption electrospray ionization (DESI). | ```Acetonitrile:Dimethylformamide (ACN:DMF)``` ```Acetonitrile:Water (ACN:H2O)``` ```Ethanol:Dimethylformamide (EtOH:DMF)``` ```Ethanol:Water (EtOH:H2O)``` ```Methanol:Ethanol (MeOH:EtOH)``` ```Methanol:Water (MeOH:H2O)``` |
+| desorption_solvent_flow_rate_value | | The rate of flow of the solvent into a spray. | |
+| desorption_solvent_flow_rate_unit * | | Units of the rate of solvent flow. | ```nL/min``` ```uL/min``` |
+| analysis_protocol_doi | | A DOI to a protocols.io protocol describing the software and database(s) used to process the raw data. Example: https://dx.doi.org/10.17504/protocols.io.bsu5ney6 | |
+| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | |
+
+
+
+
+## Deprecated Attributes
+
+
+ indicates a field that was previously required
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| assay_category * | | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ```sequence``` |
+| description | | Free-text description of this assay. | |
+| overall_protocols_io_doi | | DOI for protocols.io referring to the overall protocol for the assay. | |
+| donor_id | | HuBMAP Display ID of the donor of the assayed tissue. | |
+| execution_datetime | | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | |
+| operator | | Name of the person responsible for executing the assay. | |
+| operator_email | | Email address for the operator. | |
+| pi | | Name of the principal investigator responsible for the data. | |
+| pi_email | | Email address for the principal investigator. | |
+| resolution_x_unit | | The unit of measurement of width of a pixel. (nm) | ```mm``` ```um``` ```nm``` |
+| resolution_x_value | | The width of a pixel. (Akoya pixel is 377nm square) | |
+| resolution_y_unit | | The unit of measurement of height of a pixel. (nm) | ```mm``` ```um``` ```nm``` |
+| resolution_y_value | | The height of a pixel. (Akoya pixel is 377nm square) | |
+| version * | | Version of the schema to use when validating this metadata. | ```1``` |
diff --git a/docs/assays/metadata/Histology.md b/docs/assays/metadata/Histology.md
index 7170d0b..52b1caa 100644
--- a/docs/assays/metadata/Histology.md
+++ b/docs/assays/metadata/Histology.md
@@ -1,41 +1,65 @@
----
-layout: page
----
-# Histology
-
-Version 2 (Latest)
-
-## Version 2 (Latest)
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-----------------------------------------------------|-----------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------|------------|
-| source_storage_duration_value | Numeric | How long was the source material (parent) stored, prior to this sample being processed. | | True |
-| time_since_acquisition_instrument_calibration_value | Numeric | The amount of time since the acqusition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | | False |
-| contributors_path | Textfield | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | True |
-| data_path | Textfield | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." where as for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | True |
-| is_image_preprocessing_required | Allowable Value | Depending on if the acquisition instrument was a microscope, slide scanner, etc. will indicate whether or not any level of preprocessing was required to assemble the image (e.g., fusing image tiles) . | ```Yes``` ```No``` | False |
-| is_batch_staining_done | Allowable Value | Are the slides stained using a linear batch method or individually? | ```Yes``` ```No``` | True |
-| is_staining_automated | Allowable Value | Is the slide staining automated with an instrument? | ```Yes``` ```No``` | True |
-| slide_id | Textfield | A unique ID denoting the slide used. This allows users the ability to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | False |
-| tiled_image_columns | Numeric | This is how many columns used in stitching. This is sometimes referred to as the grid size x. | | False |
-| tiled_image_count | Numeric | This is the total number of raw (tiled) images captured, that are to be stitched together. | | False |
-| intended_tile_overlap_percentage | Numeric | The amount of overlap between tiled images. This is the set point, where as during image acquisition there will be slight variations due to stage registration. | | False |
-| metadata_schema_id | Textfield | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | True |
-| dataset_type | Allowable Value | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium```| True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA ```| True |
-| acquisition_instrument_vendor | Allowable Value | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` | True |
-| acquisition_instrument_model | Allowable Value | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` | True |
-| source_storage_duration_unit | Allowable Value | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | True |
-| time_since_acquisition_instrument_calibration_unit | Allowable Value | The time unit of measurement |```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | False |
-| stain_name | Allowable Value | The name of the chemical stains (dyes) applied to histology samples to highlight important features of the tissue as well as to enhance the tissue contrast. | ```AB-PAS``` ```H&E``` ```H-DAB``` ```LFB``` ```PAS``` ```Trichrome ```| True |
-| stain_technique | Allowable Value | There are typically three types of stains: progressive, modified progressive, and regressive. Progressive staining occurs when the hematoxylin is added to the tissue without being followed by a differentiator to remove excess dye. With regressive and modified progressive staining, a differentiator is used. | ```Modified progressive staining``` ```Not applicable``` ```Progressive staining``` ```Regressive staining``` | False |
-| preparation_instrument_vendor | Allowable Value | The manufacturer of the instrument used to prepare (staining/processing) the sample for the assay. If an automatic slide staining method was indicated this field should list the manufacturer of the instrument. | ```10x Genomics``` ```Hamamatsu``` ```HTX Technologies``` ```In-House``` ```Leica Biosystems``` ```Not applicable``` ```Roche Diagnostics``` ```SunChrom``` ```Thermo Fisher Scientific``` | False |
-| preparation_instrument_model | Allowable Value | Manufacturers of a staining system instrument may offer various versions (models) of that instrument with different features. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```AutoStainer XL``` ```Chromium Connect``` ```Chromium Controller``` ```Chromium iX``` ```Chromium X``` ```Discovery Ultra``` ```EVOS M7000``` ```M3+ Sprayer``` ```M5 Sprayer``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```Not applicable``` ```ST5020 Multistainer``` ```Sublimator``` ```SunCollect Sprayer``` ```TM-Sprayer``` ```Visium CytAssist ```| False |
-| tile_configuration | Allowable Value | This is how the tiles are configured for stitching. | ```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | False |
-| scan_direction | Allowable Value | This is the direction of imaging, which is required for stitching. | ```Left-and-down``` ```Left-and-up``` ```Not applicable``` ```Right-and-down``` ```Right-and-up``` | False |
-| preparation_protocol_doi | Textfield | DOI for the protocols.io page that describes the assay or sample procurment and preparation. For example for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` | True |
-| parent_sample_id | Textfield | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | True |
-| non_global_files | Textfield | A semicolon separated list of non-shared files to be included in the dataset. The path assumes the files are located in the "TOP/non-global/" directory. For example, for the file is TOP/non-global/lab_processed/images/1-tissue-boundary.geojson the value of this field would be "./lab_processed/images/1-tissue-boundary.geojson". After ingest, these files will be copied to the appropriate locations within the respective dataset directory tree. This field is used for internal HuBMAP processing. Examples for GeoMx and PhenoCycler are provided in the File Locations documentation: https://docs.google.com/document/d/1n2McSs9geA9Eli4QWQaB3c9R3wo5d5U1Xd57DWQfN5Q/edit#heading=h.1u82i4axggee | | False |
-
-
+---
+layout: page-triary
+---
+
+# Histology Metadata Attributes
+
+Fields that are collected for Histology data, available at ```Dataset.metadata.```
+
+
+* indicates a required field
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| preparation_protocol_doi * | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | ```https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1``` |
+| dataset_type * | | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium``` |
+| analyte_class * | | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA``` |
+| is_targeted * | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` |
+| acquisition_instrument_vendor * | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` |
+| acquisition_instrument_model * | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` |
+| source_storage_duration_value | | How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. | |
+| source_storage_duration_unit * | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` |
+| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | |
+| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` |
+| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | |
+| data_path | | The top level directory containing the raw and/or processed data. For a single dataset upload this might be ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK", use the syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2", in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | |
+| is_image_preprocessing_required * | | Indicates whether image preprocessing is necessary based on the type of acquisition instrument used, such as a microscope or slide scanner. This may involve steps like fusing image tiles to assemble the complete image. Example: Yes | ```Yes``` ```No``` |
+| stain_name * | | The name of the chemical stains (dyes) applied to histology samples to highlight important features of the tissue as well as to enhance the tissue contrast. | ```AB-PAS``` ```H&E``` ```H-DAB``` ```LFB``` ```PAS``` ```Trichrome``` |
+| stain_technique | | There are typically three types of stains: progressive, modified progressive, and regressive. Progressive staining occurs when the hematoxylin is added to the tissue without being followed by a differentiator to remove excess dye. With regressive and modified progressive staining, a differentiator is used. | ```Modified progressive staining``` ```Not applicable``` ```Progressive staining``` ```Regressive staining``` |
+| is_batch_staining_done * | | Are the slides stained using a linear batch method or individually? | ```Yes``` ```No``` |
+| is_staining_automated * | | Is the slide staining automated with an instrument? | ```Yes``` ```No``` |
+| preparation_instrument_vendor * | | The manufacturer of the instrument used to prepare (staining/processing) the sample for the assay. If an automatic slide staining method was indicated this field should list the manufacturer of the instrument. | ```10x Genomics``` ```Hamamatsu``` ```HTX Technologies``` ```In-House``` ```Leica Biosystems``` ```Not applicable``` ```Roche Diagnostics``` ```SunChrom``` ```Thermo Fisher Scientific``` |
+| preparation_instrument_model * | | Manufacturers of a staining system instrument may offer various versions (models) of that instrument with different features. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```AutoStainer XL``` ```Chromium Connect``` ```Chromium Controller``` ```Chromium iX``` ```Chromium X``` ```Discovery Ultra``` ```EVOS M7000``` ```M3+ Sprayer``` ```M5 Sprayer``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```Not applicable``` ```ST5020 Multistainer``` ```Sublimator``` ```SunCollect Sprayer``` ```TM-Sprayer``` ```Visium CytAssist``` |
+| slide_id | | A unique ID denoting the slide used. This allows users the ability to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | |
+| tile_configuration | | The configuration of tiles used for stitching in the assay process. If no tile configuration is applicable, enter "Not applicable". Example: Row-by-row | ```Column-by-column``` ```Not applicable``` ```Snake-by-columns``` ```Row-by-row``` ```Snake-by-rows``` |
+| scan_direction | | The direction of imaging, which is necessary for the stitching process. Example: Left-and-down | ```Left-and-down``` ```Right-and-down``` ```Not applicable``` ```Right-and-up``` ```Left-and-up``` |
+| tiled_image_columns | | The number of columns used in the stitching process of a tiled image, often referred to as the grid size in the x-dimension. Example: 5 | |
+| tiled_image_count | | The total number of raw tiled images captured, which are intended to be stitched together. Example: 75 | |
+| intended_tile_overlap_percentage | | The intended percentage of overlap between tiled images. This value serves as the set point, although slight variations may occur during image acquisition due to stage registration. Example: 5 | |
+| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | |
+
+
+
+
+## Deprecated Attributes
+
+
+* indicates a field that was previously required
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| assay_category * | | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ```sequence``` |
+| donor_id | | HuBMAP Display ID of the donor of the assayed tissue. | |
+| execution_datetime | | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | |
+| operator | | Name of the person responsible for executing the assay. | |
+| operator_email | | Email address for the operator. | |
+| overall_protocols_io_doi | | DOI for protocols.io referring to the overall protocol for the assay. | |
+| pi | | Name of the principal investigator responsible for the data. | |
+| pi_email | | Email address for the principal investigator. | |
+| resolution_x_unit | | The unit of measurement of the width of a pixel (nm). | ```mm``` ```um``` ```nm``` |
+| resolution_x_value | | The width of a pixel. (Akoya pixel is 377nm square) | |
+| resolution_y_unit | | The unit of measurement of the height of a pixel (nm). | ```mm``` ```um``` ```nm``` |
+| resolution_y_value | | The height of a pixel. (Akoya pixel is 377nm square) | |
+| resolution_z_unit | | The unit of incremental distance between image slices. | ```mm``` ```um``` ```nm``` |
+| resolution_z_value | | Optional if assay does not have multiple z-levels. Note that this is resolution within a given sample: z-pitch (resolution_z_value) is the increment distance between image slices (for Akoya, z-pitch=1.5um), i.e., the microscope stage is moved up or down in increments of 1.5um to capture images of several focal planes. The best one will be used & the rest discarded. The thickness of the sample itself is sample metadata. | |
+| version * | | Version of the schema to use when validating this metadata. | ```1``` |
diff --git a/docs/assays/metadata/IMC.md b/docs/assays/metadata/IMC.md
index d438320..00397c9 100644
--- a/docs/assays/metadata/IMC.md
+++ b/docs/assays/metadata/IMC.md
@@ -1,234 +1,76 @@
----
-layout: page
----
-# IMC-2D
-
-NOTE: Several versions of this metadata schema have been created over time. The (Latest) version contains most attributes, but there may be some deprecated attributes in the older versions for which data has been collected. HuBMAP is in the process of creating a reference which combines all of these versions into a single view. That reference will be available here once completed.
-
-2D IMC Version 2 (Latest)
-
-## 2D IMC Version 2 (Latest)
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-----------------------------------------------------|-----------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------|------------|
-| dataset_type | Allowable Value | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium```| True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA ```| True |
-| acquisition_instrument_vendor | Allowable Value | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` | True |
-| acquisition_instrument_model | Allowable Value | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` | True |
-| source_storage_duration_value | Numeric | How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. | | True |
-| source_storage_duration_unit | Allowable Value | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | True |
-| time_since_acquisition_instrument_calibration_value | Numeric | The amount of time since the acqusition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | | False |
-| time_since_acquisition_instrument_calibration_unit | Allowable Value | The time unit of measurement |```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | False |
-| preparation_protocol_doi | Textfield | DOI for the protocols.io page that describes the assay or sample procurment and preparation. For example for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` | True |
-| contributors_path | Textfield | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | True |
-| data_path | Textfield | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." where as for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | True |
-| parent_sample_id | Textfield | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | True |
-| total_run_time_value | Numeric | How long the tissue was on the acquisition instrument. | | True |
-| total_run_time_unit | Allowable Value | The units for the total run time unit field. | ```Hour``` ```Minute``` | True |
-| number_of_antibodies | Numeric | Number of antibodies | | True |
-| number_of_channels | Numeric | The number of distinct color channels in the image. | | True |
-| slide_id | Textfield | A unique ID denoting the slide used. This allows users the ability to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | True |
-| data_precision_bytes | Numeric | Numerical data precision in bytes. | | True |
-| ablation_frequency_value | Numeric | Frequency value of laser ablation | | True |
-| ablation_frequency_unit | Allowable Value | Frequency unit of laser ablation | ```Hz``` | True |
-| antibodies_path | Textfield | This is the location of the antibodies.tsv file relative to the root of the top level of the upload directory structure. This path should begin with "." and would likely be something like "./extras/antibodies.tsv". | | True |
-| metadata_schema_id | Textfield | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | True |
-
-
-
-
-2D IMC Version 1
-
-## 2D IMC Version 1
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-----------------------------------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------|------------|
-| version | Allowable Value | Version of the schema to use when validating this metadata. | ['1'] | True |
-| description | Textfield | Free-text description of this assay. | | True |
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['mass_spectrometry_imaging'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['Imaging Mass Cytometry'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['protein'] | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ['Yes','No'] | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
-| preparation_instrument_vendor | Textfield | The manufacturer of the instrument used to prepare the sample for the assay. | | True |
-| preparation_instrument_model | Textfield | The model number/name of the instrument used to prepare the sample for the assay | | True |
-| section_prep_protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for preparing tissue sections for the assay. | | True |
-| reagent_prep_protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for preparing reagents for the assay. | | True |
-| number_of_channels | Numeric | Number of mass channels measured | | True |
-| ablation_distance_between_shots_x_value | Numeric | x resolution. Distance between laser ablation shots in the X-dimension. | | True |
-| ablation_distance_between_shots_x_units | Allowable Value | Units of x resolution distance between laser ablation shots. | ['um', 'nm'] | True |
-| ablation_distance_between_shots_y_value | Numeric | y resolution. Distance between laser ablation shots in the Y-dimension. | | True |
-| ablation_distance_between_shots_y_units | Allowable Value | Units of y resolution distance between laser ablation shots. | ['um', 'nm'] | True |
-| ablation_frequency_value | Numeric | Frequency value of laser ablation (in Hz) | | True |
-| ablation_frequency_unit | Allowable Value | Frequency unit of laser ablation | ['Hz'] | False |
-| roi_description | Textfield | A description of the region of interest (ROI) captured in the image. | | True |
-| roi_id | Numeric | Multiple images (1-n) are acquired from regions of interest (ROI1, ROI2, ROI3, etc) on a slide. The roi_id is a number from 1-n representing the ROI captured on a slide. | | True |
-| acquisition_id | Textfield | The acquisition_id refers to the directory containing the ROI images for a slide. Together, the acquisition_id and the roi_id indicate the slide-ROI represented in the image. | | True |
-| dual_count_start | Numeric | Threshold for dual counting. | | True |
-| max_x_width_value | Numeric | Image width value of the ROI acquisition | | True |
-| max_x_width_unit | Allowable Value | Units of image width of the ROI acquisition | ['um'] | False |
-| max_y_height_value | Numeric | Image height value of the ROI acquisition | | True |
-| max_y_height_unit | Allowable Value | Units of image height of the ROI acquisition | ['um'] | False |
-| segment_data_format | Allowable Value | This refers to the data type, which is a "float" for the IMC counts. | ['float', 'integer', 'string'] | True |
-| signal_type | Allowable Value | Type of signal measured per channel (usually dual counts) | ['dual count', 'pulse count', 'intensity value'] | True |
-| data_precision_bytes | Numeric | Numerical data precision in bytes | | True |
-| antibodies_path | Textfield | Relative path to file with antibody information for this dataset. | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
-
-
-
-2D IMC Version 0
-
-## 2D IMC Version 0
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-----------------------------------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------|------------|
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['mass_spectrometry_imaging'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['Imaging Mass Cytometry'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['protein'] | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ['Yes','No'] | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
-| preparation_instrument_vendor | Textfield | The manufacturer of the instrument used to prepare the sample for the assay. | | True |
-| preparation_instrument_model | Textfield | The model number/name of the instrument used to prepare the sample for the assay | | True |
-| section_prep_protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for preparing tissue sections for the assay. | | True |
-| reagent_prep_protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for preparing reagents for the assay. | | True |
-| number_of_channels | Numeric | Number of mass channels measured | | True |
-| ablation_distance_between_shots_x_value | Numeric | x resolution. Distance between laser ablation shots in the X-dimension. | | True |
-| ablation_distance_between_shots_x_units | Allowable Value | Units of x resolution distance between laser ablation shots. | ['um', 'nm'] | True |
-| ablation_distance_between_shots_y_value | Numeric | y resolution. Distance between laser ablation shots in the Y-dimension. | | True |
-| ablation_distance_between_shots_y_units | Allowable Value | Units of y resolution distance between laser ablation shots. | ['um', 'nm'] | True |
-| ablation_frequency_value | Numeric | Frequency value of laser ablation (in Hz) | | True |
-| ablation_frequency_unit | Allowable Value | Frequency unit of laser ablation | ['Hz'] | False |
-| roi_description | Textfield | A description of the region of interest (ROI) captured in the image. | | True |
-| roi_id | Numeric | Multiple images (1-n) are acquired from regions of interest (ROI1, ROI2, ROI3, etc) on a slide. The roi_id is a number from 1-n representing the ROI captured on a slide. | | True |
-| acquisition_id | Textfield | The acquisition_id refers to the directory containing the ROI images for a slide. Together, the acquisition_id and the roi_id indicate the slide-ROI represented in the image. | | True |
-| dual_count_start | Numeric | Threshold for dual counting. | | True |
-| end_datetime | Datetime | Time stamp indicating end of ablation for ROI | | True |
-| max_x_width_value | Numeric | Image width value of the ROI acquisition | | True |
-| max_x_width_unit | Allowable Value | Units of image width of the ROI acquisition | ['um'] | False |
-| max_y_height_value | Numeric | Image height value of the ROI acquisition | | True |
-| max_y_height_unit | Allowable Value | Units of image height of the ROI acquisition | ['um'] | False |
-| segment_data_format | Allowable Value | This refers to the data type, which is a "float" for the IMC counts. | ['float', 'integer', 'string'] | True |
-| signal_type | Allowable Value | Type of signal measured per channel (usually dual counts) | ['dual count', 'pulse count', 'intensity value'] | True |
-| start_datetime | Datetime | Time stamp indicating start of ablation for ROI | | True |
-| data_precision_bytes | Numeric | Numerical data precision in bytes | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
-
-
-
-3D IMC Version 1 (no longer accepting data)
-
-## 3D IMC Version 1 (no longer accepting data)
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-----------------------------------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------|------------|
-| version | Allowable Value | Version of the schema to use when validating this metadata. | ['1'] | True |
-| description | Textfield | Free-text description of this assay. | | True |
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['mass_spectrometry_imaging'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['3D Imaging Mass Cytometry'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['protein'] | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ['Yes','No'] | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
-| preparation_instrument_vendor | Textfield | The manufacturer of the instrument used to prepare the sample for the assay. | | True |
-| preparation_instrument_model | Textfield | The model number/name of the instrument used to prepare the sample for the assay | | True |
-| section_prep_protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for preparing tissue sections for the assay. | | True |
-| reagent_prep_protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for preparing reagents for the assay. | | True |
-| number_of_channels | Numeric | Number of mass channels measured | | True |
-| number_of_sections | Numeric | Number of sections | | True |
-| ablation_distance_between_shots_x_value | Numeric | x resolution. Distance between laser ablation shots in the X-dimension. | | True |
-| ablation_distance_between_shots_x_units | Allowable Value | Units of x resolution distance between laser ablation shots. | ['um', 'nm'] | True |
-| ablation_distance_between_shots_y_value | Numeric | y resolution. Distance between laser ablation shots in the Y-dimension. | | True |
-| ablation_distance_between_shots_y_units | Allowable Value | Units of y resolution distance between laser ablation shots. | ['um', 'nm'] | True |
-| ablation_frequency_value | Numeric | Frequency value of laser ablation (in Hz) | | True |
-| ablation_frequency_unit | Allowable Value | Frequency unit of laser ablation | ['Hz'] | False |
-| roi_description | Textfield | A description of the region of interest (ROI) captured in the image. | | True |
-| roi_id | Numeric | Multiple images (1-n) are acquired from regions of interest (ROI1, ROI2, ROI3, etc) on a slide. The roi_id is a number from 1-n representing the ROI captured on a slide. | | True |
-| acquisition_id | Textfield | The acquisition_id refers to the directory containing the ROI images for a slide. Together, the acquisition_id and the roi_id indicate the slide-ROI represented in the image. | | True |
-| max_x_width_value | Numeric | Image width value of the ROI acquisition | | True |
-| max_x_width_unit | Allowable Value | Units of image width of the ROI acquisition | ['um'] | False |
-| max_y_height_value | Numeric | Image height value of the ROI acquisition | | True |
-| max_y_height_unit | Allowable Value | Units of image height of the ROI acquisition | ['um'] | False |
-| segment_data_format | Allowable Value | This refers to the data type, which is a "float" for the IMC counts. | ['float', 'integer', 'string'] | True |
-| signal_type | Allowable Value | Type of signal measured per channel (usually dual counts) | ['dual count', 'pulse count', 'intensity value'] | True |
-| antibodies_path | Textfield | Relative path to file with antibody information for this dataset. | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
-
-
-
-3D IMC Version 0
-
-## 3D IMC Version 0
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-----------------------------------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------|------------|
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['mass_spectrometry_imaging'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['3D Imaging Mass Cytometry'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['protein'] | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ['Yes','No'] | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
-| preparation_instrument_vendor | Textfield | The manufacturer of the instrument used to prepare the sample for the assay. | | True |
-| preparation_instrument_model | Textfield | The model number/name of the instrument used to prepare the sample for the assay | | True |
-| section_prep_protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for preparing tissue sections for the assay. | | True |
-| reagent_prep_protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for preparing reagents for the assay. | | True |
-| number_of_channels | Numeric | Number of mass channels measured | | True |
-| number_of_sections | Numeric | Number of sections | | True |
-| ablation_distance_between_shots_x_value | Numeric | x resolution. Distance between laser ablation shots in the X-dimension. | | True |
-| ablation_distance_between_shots_x_units | Allowable Value | Units of x resolution distance between laser ablation shots. | ['um', 'nm'] | True |
-| ablation_distance_between_shots_y_value | Numeric | y resolution. Distance between laser ablation shots in the Y-dimension. | | True |
-| ablation_distance_between_shots_y_units | Allowable Value | Units of y resolution distance between laser ablation shots. | ['um', 'nm'] | True |
-| ablation_frequency_value | Numeric | Frequency value of laser ablation (in Hz) | | True |
-| ablation_frequency_unit | Allowable Value | Frequency unit of laser ablation | ['Hz'] | False |
-| roi_description | Textfield | A description of the region of interest (ROI) captured in the image. | | True |
-| roi_id | Numeric | Multiple images (1-n) are acquired from regions of interest (ROI1, ROI2, ROI3, etc) on a slide. The roi_id is a number from 1-n representing the ROI captured on a slide. | | True |
-| acquisition_id | Textfield | The acquisition_id refers to the directory containing the ROI images for a slide. Together, the acquisition_id and the roi_id indicate the slide-ROI represented in the image. | | True |
-| max_x_width_value | Numeric | Image width value of the ROI acquisition | | True |
-| max_x_width_unit | Allowable Value | Units of image width of the ROI acquisition | ['um'] | False |
-| max_y_height_value | Numeric | Image height value of the ROI acquisition | | True |
-| max_y_height_unit | Allowable Value | Units of image height of the ROI acquisition | ['um'] | False |
-| segment_data_format | Allowable Value | This refers to the data type, which is a "float" for the IMC counts. | ['float', 'integer', 'string'] | True |
-| signal_type | Allowable Value | Type of signal measured per channel (usually dual counts) | ['dual count', 'pulse count', 'intensity value'] | True |
-| antibodies_path | Textfield | Relative path to file with antibody information for this dataset. | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
-
-
+---
+layout: page-triary
+---
+
+# IMC-2D Metadata Attributes
+
+Fields that are collected for IMC-2D data, available at ```Dataset.metadata.```
+
+
+* indicates a required field
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples, this should be a comma-separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | |
+| lab_id | | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | |
+| preparation_protocol_doi * | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case, the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | ```https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1``` |
+| dataset_type * | | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium``` |
+| analyte_class * | | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA``` |
+| is_targeted * | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` |
+| acquisition_instrument_vendor * | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` |
+| acquisition_instrument_model * | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` |
+| source_storage_duration_value | | How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. | |
+| source_storage_duration_unit * | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` |
+| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | |
+| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` |
+| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | |
+| data_path | | The top level directory containing the raw and/or processed data. For a single dataset upload this might be ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK", use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2", in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | |
+| total_run_time_value | | How long the tissue was on the acquisition instrument. | |
+| total_run_time_unit * | | The unit of measurement for the total run time value. | ```Hour``` ```Minute``` |
+| number_of_antibodies | | Number of antibodies | |
+| number_of_channels | | Number of fluorescent channels imaged during each cycle. | |
+| slide_id | | A unique ID denoting the slide used. This allows users to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | |
+| data_precision_bytes | | Numerical data precision in bytes. | |
+| ablation_frequency_value | | Frequency value of laser ablation | |
+| ablation_frequency_unit * | | Frequency unit of laser ablation | ```Hz``` |
+| antibodies_path | | This is the location of the antibodies.tsv file relative to the root of the top level of the upload directory structure. This path should begin with "." and would likely be something like "./extras/antibodies.tsv". | |
+| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | |
+
+
+
+
+## Deprecated Attributes
+
+
+* indicates a field that was previously required
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| ablation_distance_between_shots_x_units * | | Units of x resolution distance between laser ablation shots. | ```um``` ```nm``` |
+| ablation_distance_between_shots_x_value | | x resolution. Distance between laser ablation shots in the X-dimension. | |
+| ablation_distance_between_shots_y_units * | | Units of y resolution distance between laser ablation shots. | ```um``` ```nm``` |
+| ablation_distance_between_shots_y_value | | y resolution. Distance between laser ablation shots in the Y-dimension. | |
+| acquisition_id | | The acquisition_id refers to the directory containing the ROI images for a slide. Together, the acquisition_id and the roi_id indicate the slide-ROI represented in the image. | |
+| assay_category * | | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ```sequence``` |
+| description | | Free-text description of this assay. | |
+| donor_id | | HuBMAP Display ID of the donor of the assayed tissue. | |
+| dual_count_start | | Threshold for dual counting. | |
+| end_datetime | | Time stamp indicating end of ablation for ROI | |
+| execution_datetime | | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | |
+| max_x_width_unit | | Units of image width of the ROI acquisition | ```um``` |
+| max_x_width_value | | Image width value of the ROI acquisition | |
+| max_y_height_unit | | Units of image height of the ROI acquisition | ```um``` |
+| max_y_height_value | | Image height value of the ROI acquisition | |
+| operator | | Name of the person responsible for executing the assay. | |
+| operator_email | | Email address for the operator. | |
+| pi | | Name of the principal investigator responsible for the data. | |
+| pi_email | | Email address for the principal investigator. | |
+| preparation_instrument_model * | | Manufacturers of a staining system instrument may offer various versions (models) of that instrument with different features. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```AutoStainer XL``` ```Chromium Connect``` ```Chromium Controller``` ```Chromium iX``` ```Chromium X``` ```Discovery Ultra``` ```EVOS M7000``` ```M3+ Sprayer``` ```M5 Sprayer``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```Not applicable``` ```ST5020 Multistainer``` ```Sublimator``` ```SunCollect Sprayer``` ```TM-Sprayer``` ```Visium CytAssist``` |
+| preparation_instrument_vendor * | | The manufacturer of the instrument used to prepare (staining/processing) the sample for the assay. If an automatic slide staining method was indicated this field should list the manufacturer of the instrument. | ```10x Genomics``` ```Hamamatsu``` ```HTX Technologies``` ```In-House``` ```Leica Biosystems``` ```Not applicable``` ```Roche Diagnostics``` ```SunChrom``` ```Thermo Fisher Scientific``` |
+| reagent_prep_protocols_io_doi | | DOI for protocols.io referring to the protocol for preparing reagents for the assay. | |
+| roi_description | | A description of the region of interest (ROI) captured in the image. | |
+| roi_id | | Multiple images (1-n) are acquired from regions of interest (ROI1, ROI2, ROI3, etc) on a slide. The roi_id is a number from 1-n representing the ROI captured on a slide. | |
+| segment_data_format * | | This refers to the data type, which is a "float" for the IMC counts. | ```float``` ```integer``` ```string``` |
+| signal_type * | | Type of signal measured per channel (usually dual counts) | ```dual count``` ```pulse count``` ```intensity value``` |
+| start_datetime | | Time stamp indicating start of ablation for ROI | |
+| version * | | Version of the schema to use when validating this metadata. | ```1``` |
diff --git a/docs/assays/metadata/LC-MS.md b/docs/assays/metadata/LC-MS.md
index 75cdce0..c7ddb67 100644
--- a/docs/assays/metadata/LC-MS.md
+++ b/docs/assays/metadata/LC-MS.md
@@ -1,296 +1,89 @@
----
-layout: page
----
-# LC-MS
-
-NOTE: Several versions of this metadata schema have been created over time. The (Latest) version contains most attributes, but there may be some deprecated attributes in the older versions for which data has been collected. HuBMAP is in the process of creating a reference which combines all of these versions into a single view. That reference will be available here once completed.
-
-Version 4 (Latest)
-
-## Version 4 (Latest)
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-----------------------------------------------------|-----------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------|------------|
-| dataset_type | Allowable Value | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium```| True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA ```| True |
-| acquisition_instrument_vendor | Allowable Value | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` | True |
-| acquisition_instrument_model | Allowable Value | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` | True |
-| source_storage_duration_value | Numeric | How long was the source material (parent) stored, prior to this sample being processed. | | True |
-| source_storage_duration_unit | Allowable Value | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | True |
-| time_since_acquisition_instrument_calibration_value | Numeric | The amount of time since the acqusition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | | False |
-| time_since_acquisition_instrument_calibration_unit | Allowable Value | The time unit of measurement |```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | False |
-| preparation_protocol_doi | Textfield | DOI for the protocols.io page that describes the assay or sample procurment and preparation. For example for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` | True |
-| contributors_path | Textfield | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | True |
-| data_path | Textfield | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | True |
-| mass_analysis_polarity | Allowable Value | The polarity of the mass analysis (positive or negative ion modes). | ```Negative and positive ion mode``` ```Negative ion mode``` ```Positive ion mode``` | True |
-| mass-to-charge_range_low_value | Numeric | The low value of the scanned mass-to-charge range, for MS1. (unitless) | | False |
-| mass-to-charge_range_high_value | Numeric | The high value of the scanned mass-to-charge range, for MS1. (unitless) | | False |
-| mass_resolving_power | Numeric | The mass resolving power m/∆m, where ∆m is defined as the full width at half-maximum (FWHM) for a given peak with a specified mass-to-charge (m/z). (unitless) | | False |
-| mass-to-charge_resolving_power | Numeric | The peak (m/z) used to calculate the resolving power. | | False |
-| ion_mobility | Allowable Value | Specifies which technology was used for ion mobility spectrometry. Technologies for measuring ion mobility: Traveling Wave Ion Mobility Spectrometry (TWIMS), Trapped Ion Mobility Spectrometry (TIMS), High Field Asymmetric waveform ion Mobility Spectrometry (FAIMS), Drift Tube Ion Mobility Spectrometry (DTIMS), Structures for Lossless Ion Manipulations (SLIM), and cyclic Ion Mobility Spectrometry (cIMS). | ```cIMS``` ```DTIMS``` ```FAIMS``` ```SLIM``` ```TIMS``` ```TWIMS``` | False |
-| ms_ionization_technique | Allowable Value | The ionization approach (i.e., sample probing method) for performing imaging mass spectrometry. | ```DESI``` ```ESI``` ```HESI``` ```LA``` ```LDI``` ```MALDI``` ```MALDI-2``` ```nanoDESI``` ```SIMS-C60``` ```SIMS-H20 ```| True |
-| ms_scan_mode | Allowable Value | MS (mass spectrometry) scan mode refers to the number of steps in the separation of fragments. | ```MS1``` ```MS2``` ```MS3``` | True |
-| label_name | Textfield | If the samples were labeled (e.g. TMT), provide the name/ID of the label on this sample. Leave blank if not applicable. | | False |
-| lc_instrument_vendor | Allowable Value | The manufacturer of the instrument used for liquid chromatography. | ```Agilent Technologies``` ```Bruker``` ```Evosep``` ```In-House``` ```Sciex``` ```Thermo Fisher Scientific``` ```Waters``` | False |
-| lc_instrument_model | Textfield | The model number/name of the instrument used for liquid chromatography. | | False |
-| lc_column_model | Textfield | The model number/name of the liquid chromatography column. If a custom self-packed, pulled tip capillary is used, enter “Pulled tip capilary”. | | False |
-| lc_resin | Textfield | Details of the resin used for liquid chromatography, including vendor, particle size, pore size. | | False |
-| lc_column_length_value | Numeric | Liquid chromatography column length. | | False |
-| lc_column_length_unit | Allowable Value | Units for liquid chromatography column length (typically cm). | ```um``` ```mm``` ```cm``` | False |
-| lc_temperature_value | Numeric | Liquid chromatography temperature. | | False |
-| lc_inner_diameter_value | Numeric | Liquid chromatography column inner diameter. | | False |
-| lc_flow_rate_value | Numeric | Value of flow rate. | | False |
-| lc_gradient_value | Numeric | Liquid chromatography gradient. | | False |
-| lc_gradient_unit | Allowable Value | Unit for liquid chromatography gradient | ```Minute``` | False |
-| lc_mobile_phase_a | Textfield | Composition of mobile phase A. | | False |
-| lc_mobile_phase_b | Textfield | | | False |
-| spatial_sampling_technique | Allowable Value | | ```LCM``` ```LESA``` ```microLESA``` ```microPOTS``` ```nanoPOTS``` ```nanoSPLITS``` | False |
-| spatial_sampling_target | Textfield | Specifies the cell-type or functional tissue unit (FTU) that is targeted in the spatial profiling experiment. Leave blank if data are generated in imaging mode without a specific target structure. | | False |
-| analysis_protocol_doi | Textfield | A DOI to a protocols.io protocol describing the software and database(s) used to process the raw data. Example: https://dx.doi.org/10.17504/protocols.io.bsu5ney6 | | True |
-| metadata_schema_id | Textfield | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | True |
-| data_collection_mode | Allowable Value | Mode of data collection in tandem MS assays. Either DDA (Data-dependent acquisition), DIA (Data-independent acquisition), SRM (selected reaction monitoring), or PRM (parallel reaction monitoring). | ```DDA``` ```PRM``` ```DIA``` ```SRM``` | False |
-| lc_column_vendor | Allowable Value | The manufacturer of the liquid chromatography column unless self-packed, pulled tip capillary is used. | ```Bruker``` ```Evosep``` ```In-House``` ```IonOpticks``` ```Thermo Fisher Scientific``` ```Waters``` | False |
-| lc_temperature_unit | Allowable Value | | ```Celsius``` | False |
-| lc_inner_diameter_unit | Allowable Value | | ```um``` ```mm``` ```cm``` | False |
-| lc_flow_rate_unit | Allowable Value | Units of flow rate. | ```mL/min``` ```nL/min``` | False |
-| spatial_sampling_type | Allowable Value | Specifies whether or not the analysis was performed in a spatially targeted manner. Spatial profiling experiments target specific tissue foci but do not necessarily generate images. Spatial imaging experiments collect data from a regular array (pixels) that can be visualized as heat maps of ion intensity at each location (molecular images). Leave blank if data are derived from bulk analysis. | ```Imaging``` ```Profiling``` | False |
-| parent_sample_id | Textfield | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | True |
-
-
-
-Version 3
-
-## Version 3
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-------------------------------|-----------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------|------------|
-| version | Allowable Value | Version of the schema to use when validating this metadata. | ['3'] | True |
-| description | Textfield | Free-text description of this assay. | | True |
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['mass_spectrometry'] | True |
-| assay_type | Allowable Value | Bottom-up refers to analyzing proteins in a sample by digesting them to peptides. Top-down refers to analyzing whole proteins without digestion. LC-MS and MS are for lipids/metabolites. LC-MS Bottom-Up and MS Bottom-Up are for peptides. LC-MS Top-Down and MS Top-Down are for proteins. | ['LC-MS', 'MS', 'LC-MS Bottom-Up', 'MS Bottom-Up', 'LC-MS Top-Down', 'MS Top-Down'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['protein', 'metabolites', 'lipids', 'peptides', 'phosphopeptides', 'glycans'] | False |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ['Yes','No'] | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
-| dms | Allowable Value | Was differential mobility spectrometry used in this assay? | ['Yes','No'] | True |
-| ms_source | Allowable Value | The ion source type used for surface sampling. | ['ESI'] | True |
-| polarity | Allowable Value | The polarity of the mass analysis (positive or negative ion modes) | ['negative ion mode', 'positive ion mode', 'negative and positive ion mode'] | True |
-| mz_range_low_value | Numeric | The low value of the scanned mass range for MS1. (unitless) | | True |
-| mz_range_high_value | Numeric | The high value of the scanned mass range for MS1. (unitless) | | True |
-| mass_resolving_power | Numeric | The MS1 resolving power defined as m/∆m where ∆m is the FWHM for a given peak with a specified m/z (m). (unitless) | | False |
-| mz_resolving_power | Numeric | The peak (m/z) used to calculate the resolving power. | | False |
-| ion_mobility | Allowable Value | Specifies whether or not ion mobility spectrometry was performed and which technology was used. Technologies for measuring ion mobility: Traveling Wave Ion Mobility Spectrometry (TWIMS), Trapped Ion Mobility Spectrometry (TIMS), High Field Asymmetric waveform ion Mobility Spectrometry (FAIMS), Drift Tube Ion Mobility Spectrometry (DTIMS), Structures for Lossless Ion Manipulations (SLIM). | ['TIMS', 'TWIMS', 'FAIMS', 'DTIMS', 'SLIMS'] | False |
-| data_collection_mode | Allowable Value | Mode of data collection in tandem MS assays. Either DDA (Data-dependent acquisition), DIA (Data-independent acquisition), MRM (multiple reaction monitoring), or PRM (parallel reaction monitoring). | ['DDA', 'DIA', 'MRM', 'PRM'] | True |
-| ms_scan_mode | Textfield | Indicates whether experiment is MS, MS/MS, or other (possibly MS3 for TMT) | | True |
-| labeling | Textfield | Indicates whether samples were labeled prior to MS analysis (e.g., TMT) | | True |
-| label_name | Textfield | If the samples were labeled (e.g. TMT), provide the name/ID of the label on this sample. | | False |
-| section_prep_protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for preparing tissue sections for the assay. | | True |
-| lc_instrument_vendor | Textfield | The manufacturer of the instrument used for LC | | False |
-| lc_instrument_model | Textfield | The model number/name of the instrument used for LC | | False |
-| lc_column_vendor | Textfield | OPTIONAL: The manufacturer of the LC Column unless self-packed, pulled tip capillary is used | | False |
-| lc_column_model | Textfield | The model number/name of the LC Column - IF custom self-packed, pulled tip capillary is used enter "Pulled tip capilary" | | False |
-| lc_resin | Textfield | Details of the resin used for lc, including vendor, particle size, pore size | | False |
-| lc_length_value | Numeric | LC column length | | False |
-| lc_length_unit | Allowable Value | units for LC column length (typically cm) | ['um', 'mm', 'cm'] | False |
-| lc_temp_value | Numeric | LC temperature | | False |
-| lc_temp_unit | Allowable Value | units for LC temperature | ['C'] | False |
-| lc_id_value | Numeric | LC column inner diameter (microns) | | False |
-| lc_id_unit | Allowable Value | units of LC column inner diameter (typically microns) | ['um', 'mm', 'cm'] | False |
-| lc_flow_rate_value | Numeric | Value of flow rate. | | False |
-| lc_flow_rate_unit | Allowable Value | Units of flow rate. | ['nL/min', 'mL/min'] | False |
-| lc_gradient | Textfield | LC gradient | | False |
-| lc_mobile_phase_a | Textfield | Composition of mobile phase A | | False |
-| lc_mobile_phase_b | Textfield | Composition of mobile phase B | | False |
-| spatial_type | Allowable Value | Specifies whether or not the analysis was performed in a spatially targeted manner and the technique used for spatial sampling. For example, Laser-capture microdissection (LCM), Liquid Extraction Surface Analysis (LESA), Nanodroplet Processing in One pot for Trace Samples (nanoPOTS). | ['LCM', 'LESA', 'nanoPOTS', 'microLESA'] | False |
-| spatial_sampling_type | Allowable Value | Specifies whether or not the analysis was performed in a spatially targeted manner. Spatial profiling experiments target specific tissue foci but do not necessarily generate images. Spatial imaging experiments collect data from a regular array (pixels) that can be visualized as heat maps of ion intensity at each location (molecular images). Leave blank if data are derived from bulk analysis. | ['profiling', 'imaging'] | False |
-| spatial_target | Textfield | Specifies the cell-type or functional tissue unit (FTU) that is targeted in the spatial profiling experiment. Leave blank if data are generated in imaging mode without a specific target structure. | | False |
-| resolution_x_value | Numeric | The width of a pixel. | | False |
-| resolution_x_unit | Allowable Value | The unit of measurement of the width of a pixel. | ['nm', 'um'] | False |
-| resolution_y_value | Numeric | The height of a pixel | | False |
-| resolution_y_unit | Allowable Value | The unit of measurement of the height of a pixel. | ['nm', 'um'] | False |
-| processing_search | Textfield | Software for analyzing and searching LC-MS/MS omics data | | True |
-| processing_protocols_io_doi | Textfield | DOI for analysis protocols.io for this assay. | | False |
-| overall_protocols_io_doi | Textfield | DOI for protocols.io for the overall process for this assay. | | False |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
-
-
-
-Version 2
-
-## Version 2
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-------------------------------|-----------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------|------------|
-| version | Allowable Value | Version of the schema to use when validating this metadata. | ['2'] | True |
-| description | Textfield | Free-text description of this assay. | | True |
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['mass_spectrometry'] | True |
-| assay_type | Allowable Value | Bottom-up refers to analyzing proteins in a sample by digesting them to peptides. Top-down refers to analyzing whole proteins without digestion. LC-MS and MS are for lipids/metabolites. LC-MS Bottom-Up and MS Bottom-Up are for peptides. LC-MS Top-Down and MS Top-Down are for proteins. | ['LC-MS', 'MS', 'LC-MS Bottom-Up', 'MS Bottom-Up', 'LC-MS Top-Down', 'MS Top-Down'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['protein', 'metabolites', 'lipids', 'peptides', 'phosphopeptides', 'glycans'] | False |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ['Yes','No'] | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
-| ms_source | Allowable Value | The ion source type used for surface sampling. | ['ESI'] | True |
-| polarity | Allowable Value | The polarity of the mass analysis (positive or negative ion modes) | ['negative ion mode', 'positive ion mode', 'negative and positive ion mode'] | True |
-| mz_range_low_value | Numeric | The low value of the scanned mass range for MS1. (unitless) | | True |
-| mz_range_high_value | Numeric | The high value of the scanned mass range for MS1. (unitless) | | True |
-| mass_resolving_power | Numeric | The MS1 resolving power defined as m/∆m where ∆m is the FWHM for a given peak with a specified m/z (m). (unitless) | | False |
-| mz_resolving_power | Numeric | The peak (m/z) used to calculate the resolving power. | | False |
-| ion_mobility | Allowable Value | Specifies whether or not ion mobility spectrometry was performed and which technology was used. Technologies for measuring ion mobility: Traveling Wave Ion Mobility Spectrometry (TWIMS), Trapped Ion Mobility Spectrometry (TIMS), High Field Asymmetric waveform ion Mobility Spectrometry (FAIMS), Drift Tube Ion Mobility Spectrometry (DTIMS), Structures for Lossless Ion Manipulations (SLIM). | ['TIMS', 'TWIMS', 'FAIMS', 'DTIMS', 'SLIMS'] | False |
-| data_collection_mode | Allowable Value | Mode of data collection in tandem MS assays. Either DDA (Data-dependent acquisition), DIA (Data-independent acquisition), MRM (multiple reaction monitoring), or PRM (parallel reaction monitoring). | ['DDA', 'DIA', 'MRM', 'PRM'] | True |
-| ms_scan_mode | Textfield | Indicates whether experiment is MS, MS/MS, or other (possibly MS3 for TMT) | | True |
-| labeling | Textfield | Indicates whether samples were labeled prior to MS analysis (e.g., TMT) | | True |
-| section_prep_protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for preparing tissue sections for the assay. | | True |
-| lc_instrument_vendor | Textfield | The manufacturer of the instrument used for LC | | False |
-| lc_instrument_model | Textfield | The model number/name of the instrument used for LC | | False |
-| lc_column_vendor | Textfield | OPTIONAL: The manufacturer of the LC Column unless self-packed, pulled tip capillary is used | | False |
-| lc_column_model | Textfield | The model number/name of the LC Column - IF custom self-packed, pulled tip capillary is used enter "Pulled tip capilary" | | False |
-| lc_resin | Textfield | Details of the resin used for lc, including vendor, particle size, pore size | | False |
-| lc_length_value | Numeric | LC column length | | False |
-| lc_length_unit | Allowable Value | units for LC column length (typically cm) | ['um', 'mm', 'cm'] | False |
-| lc_temp_value | Numeric | LC temperature | | False |
-| lc_temp_unit | Allowable Value | units for LC temperature | ['C'] | False |
-| lc_id_value | Numeric | LC column inner diameter (microns) | | False |
-| lc_id_unit | Allowable Value | units of LC column inner diameter (typically microns) | ['um', 'mm', 'cm'] | False |
-| lc_flow_rate_value | Numeric | Value of flow rate. | | False |
-| lc_flow_rate_unit | Allowable Value | Units of flow rate. | ['nL/min', 'mL/min'] | False |
-| lc_gradient | Textfield | LC gradient | | False |
-| lc_mobile_phase_a | Textfield | Composition of mobile phase A | | False |
-| lc_mobile_phase_b | Textfield | Composition of mobile phase B | | False |
-| spatial_type | Allowable Value | Specifies whether or not the analysis was performed in a spatially targeted manner and the technique used for spatial sampling. For example, Laser-capture microdissection (LCM), Liquid Extraction Surface Analysis (LESA), Nanodroplet Processing in One pot for Trace Samples (nanoPOTS). | ['LCM', 'LESA', 'nanoPOTS', 'microLESA'] | False |
-| spatial_sampling_type | Allowable Value | Specifies whether or not the analysis was performed in a spatially targeted manner. Spatial profiling experiments target specific tissue foci but do not necessarily generate images. Spatial imaging experiments collect data from a regular array (pixels) that can be visualized as heat maps of ion intensity at each location (molecular images). Leave blank if data are derived from bulk analysis. | ['profiling', 'imaging'] | False |
-| spatial_target | Textfield | Specifies the cell-type or functional tissue unit (FTU) that is targeted in the spatial profiling experiment. Leave blank if data are generated in imaging mode without a specific target structure. | | False |
-| resolution_x_value | Numeric | The width of a pixel. | | False |
-| resolution_x_unit | Allowable Value | The unit of measurement of the width of a pixel. | ['nm', 'um'] | False |
-| resolution_y_value | Numeric | The height of a pixel | | False |
-| resolution_y_unit | Allowable Value | The unit of measurement of the height of a pixel. | ['nm', 'um'] | False |
-| processing_search | Textfield | Software for analyzing and searching LC-MS/MS omics data | | True |
-| processing_protocols_io_doi | Textfield | DOI for analysis protocols.io for this assay. | | False |
-| overall_protocols_io_doi | Textfield | DOI for protocols.io for the overall process for this assay. | | False |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
-
-
-
-Version 1
-
-## Version 1
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-------------------------------|-----------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------|------------|
-| version | Allowable Value | Version of the schema to use when validating this metadata. | ['1'] | True |
-| description | Textfield | Free-text description of this assay. | | True |
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['mass_spectrometry'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['LC-MS (metabolomics)', 'LC-MS/MS (label-free proteomics)', 'MS (shotgun lipidomics)'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['protein', 'metabolites', 'lipids'] | False |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ['Yes','No'] | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
-| ms_source | Textfield | The ion source type used for surface sampling (MALDI, MALDI-2, DESI, or SIMS) or LC-MS/MS data acquisition (nESI) | | True |
-| polarity | Allowable Value | The polarity of the mass analysis (positive or negative ion modes) | ['negative ion mode', 'positive ion mode', 'negative and positive ion mode'] | True |
-| mz_range_low_value | Numeric | The low value of the scanned mass range for MS1. (unitless) | | True |
-| mz_range_high_value | Numeric | The high value of the scanned mass range for MS1. (unitless) | | True |
-| data_collection_mode | Allowable Value | Mode of data collection in tandem MS assays. Either DDA (Data-dependent acquisition), DIA (Data-independent acquisition), MRM (multiple reaction monitoring), or PRM (parallel reaction monitoring). | ['DDA', 'DIA', 'MRM', 'PRM'] | True |
-| ms_scan_mode | Textfield | Indicates whether experiment is MS, MS/MS, or other (possibly MS3 for TMT) | | True |
-| labeling | Textfield | Indicates whether samples were labeled prior to MS analysis (e.g., TMT) | | True |
-| section_prep_protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for preparing tissue sections for the assay. | | True |
-| lc_instrument_vendor | Textfield | The manufacturer of the instrument used for LC | | False |
-| lc_instrument_model | Textfield | The model number/name of the instrument used for LC | | False |
-| lc_column_vendor | Textfield | OPTIONAL: The manufacturer of the LC Column unless self-packed, pulled tip capillary is used | | False |
-| lc_column_model | Textfield | The model number/name of the LC Column - IF custom self-packed, pulled tip capillary is used enter "Pulled tip capilary" | | False |
-| lc_resin | Textfield | Details of the resin used for lc, including vendor, particle size, pore size | | False |
-| lc_length_value | Numeric | LC column length | | False |
-| lc_length_unit | Allowable Value | units for LC column length (typically cm) | ['um', 'mm', 'cm'] | False |
-| lc_temp_value | Numeric | LC temperature | | False |
-| lc_temp_unit | Allowable Value | units for LC temperature | ['C'] | False |
-| lc_id_value | Numeric | LC column inner diameter (microns) | | False |
-| lc_id_unit | Allowable Value | units of LC column inner diameter (typically microns) | ['um', 'mm', 'cm'] | False |
-| lc_flow_rate_value | Numeric | Value of flow rate. | | False |
-| lc_flow_rate_unit | Allowable Value | Units of flow rate. | ['nL/min', 'mL/min'] | False |
-| lc_gradient | Textfield | LC gradient | | False |
-| lc_mobile_phase_a | Textfield | Composition of mobile phase A | | False |
-| lc_mobile_phase_b | Textfield | Composition of mobile phase B | | False |
-| processing_search | Textfield | Software for analyzing and searching LC-MS/MS omics data | | True |
-| processing_protocols_io_doi | Textfield | DOI for analysis protocols.io for this assay. | | False |
-| overall_protocols_io_doi | Textfield | DOI for protocols.io for the overall process for this assay. | | False |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
-
-
-
-Version 0
-
-## Version 0
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-------------------------------|-----------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------|------------|
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories:generation of images of microscopic entities, identification & quantitation ofmolecules by mass spectrometry, imaging mass spectrometry, and determination ofnucleotide sequence. | ['mass_spectrometry'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['LC-MS (metabolomics)', 'LC-MS/MS (label-free proteomics)', 'MS (shotgun lipidomics)'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['protein', 'metabolites', 'lipids'] | False |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted fordetection/measurement by the assay. | ['Yes','No'] | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detectionhardware and signal processing software. Assays generate signals such as lightof various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions(models) of that instrument with different features or sensitivities. Differencesin features or sensitivities may be relevant to processing or interpretation ofthe data. | | True |
-| ms_source | Textfield | The ion source type used for surface sampling (MALDI, MALDI-2, DESI,or SIMS) or LC-MS/MS data acquisition (nESI) | | True |
-| polarity | Allowable Value | The polarity of the mass analysis (positive or negative ion modes) | ['negative ion mode', 'positive ion mode', 'negative and positive ion mode'] | True |
-| mz_range_low_value | Numeric | The low value of the scanned mass range for MS1. (unitless) | | True |
-| mz_range_high_value | Numeric | The high value of the scanned mass range for MS1. (unitless) | | True |
-| data_collection_mode | Allowable Value | Mode of data collection in tandem MS assays. Either DDA (Data-dependentacquisition), DIA (Data-independent acquisition), MRM (multiple reaction monitoring),or PRM (parallel reaction monitoring). | ['DDA', 'DIA', 'MRM', 'PRM'] | True |
-| ms_scan_mode | Textfield | Indicates whether experiment is MS, MS/MS, or other (possibly MS3 forTMT) | | True |
-| labeling | Textfield | Indicates whether samples were labeled prior to MS analysis (e.g.,TMT) | | True |
-| section_prep_protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for preparing tissuesections for the assay. | | True |
-| lc_instrument_vendor | Textfield | The manufacturer of the instrument used for LC | | False |
-| lc_instrument_model | Textfield | The model number/name of the instrument used for LC | | False |
-| lc_column_vendor | Textfield | OPTIONAL: The manufacturer of the LC Column unless self-packed, pulledtip capilary is used | | False |
-| lc_column_model | Textfield | The model number/name of the LC Column - IF custom self-packed, pulledtip calillary is used enter "Pulled tip capilary" | | False |
-| lc_resin | Textfield | Details of the resin used for lc, including vendor, particle size,pore size | | False |
-| lc_length_value | Numeric | LC column length | | False |
-| lc_length_unit | Allowable Value | units for LC column length (typically cm) | ['um', 'mm', 'cm'] | False |
-| lc_temp_value | Numeric | LC temperature | | False |
-| lc_temp_unit | Allowable Value | units for LC temperature | ['C'] | False |
-| lc_id_value | Numeric | LC column inner diameter (microns) | | False |
-| lc_id_unit | Allowable Value | units of LC column inner diameter (typically microns) | ['um', 'mm', 'cm'] | False |
-| lc_flow_rate_value | Numeric | Value of flow rate. | | False |
-| lc_flow_rate_unit | Allowable Value | Units of flow rate. | ['nL/min', 'mL/min'] | False |
-| lc_gradient | Textfield | LC gradient | | False |
-| lc_mobile_phase_a | Textfield | Composition of mobile phase A | | False |
-| lc_mobile_phase_b | Textfield | Composition of mobile phase B | | False |
-| processing_search | Textfield | Software for analyzing and searching LC-MS/MS omics data | | True |
-| processing_protocols_io_doi | Textfield | DOI for analysis protocols.io for this assay. | | False |
-| overall_protocols_io_doi | Textfield | DOI for protocols.io for the overall process for this assay. | | False |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstreamprocessing will depend on filename extension conventions. | | True |
-
-
\ No newline at end of file
+---
+layout: page-triary
+---
+
+# LC-MS Metadata Attributes
+
+Fields that are collected for LC-MS data, available at ```Dataset.metadata.```
+
+
+* indicates a required field
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma-separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | |
+| lab_id | | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | |
+| preparation_protocol_doi * | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | ```https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1``` |
+| dataset_type * | | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium``` |
+| analyte_class * | | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA``` |
+| is_targeted * | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ```Yes``` ```No``` |
+| acquisition_instrument_vendor * | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` |
+| acquisition_instrument_model * | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` |
+| source_storage_duration_value | | How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. | |
+| source_storage_duration_unit * | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` |
+| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | |
+| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` |
+| contributors_path | | Relative path to file with ORCID IDs for contributors for this dataset. | |
+| data_path | | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | |
+| ms_ionization_technique * | | The ionization approach (i.e., sample probing method) for performing imaging mass spectrometry. | ```DESI``` ```ESI``` ```HESI``` ```LA``` ```LDI``` ```MALDI``` ```MALDI-2``` ```nanoDESI``` ```SIMS-C60``` ```SIMS-H20``` |
+| ms_scan_mode * | | Indicates whether experiment is MS, MS/MS, or other (possibly MS3 for TMT) | ```MS1``` ```MS2``` ```MS3``` |
+| mass_analysis_polarity * | | The polarity of the mass analysis (positive or negative ion modes). | ```Negative and positive ion mode``` ```Negative ion mode``` ```Positive ion mode``` |
+| mass_to_charge_range_low_value | | The low value of the scanned mass-to-charge range, for MS1. (unitless) | |
+| mass_to_charge_range_high_value | | The high value of the scanned mass-to-charge range, for MS1. (unitless) | |
+| mass_resolving_power | | The MS1 resolving power defined as m/∆m where ∆m is the FWHM for a given peak with a specified m/z (m). (unitless) | |
+| mass_to_charge_resolving_power | | The peak (m/z) used to calculate the resolving power. | |
+| ion_mobility | | Specifies whether or not ion mobility spectrometry was performed and which technology was used. Technologies for measuring ion mobility: Traveling Wave Ion Mobility Spectrometry (TWIMS), Trapped Ion Mobility Spectrometry (TIMS), High Field Asymmetric waveform ion Mobility Spectrometry (FAIMS), Drift Tube Ion Mobility Spectrometry (DTIMS), Structures for Lossless Ion Manipulations (SLIM). | ```TIMS``` ```SLIM``` ```FAIMS``` ```DTIMS``` ```cIMS``` ```TWIMS``` |
+| data_collection_mode * | | Mode of data collection in tandem MS assays. Either DDA (Data-dependent acquisition), DIA (Data-independent acquisition), MRM (multiple reaction monitoring), or PRM (parallel reaction monitoring). | ```DDA``` ```PRM``` ```DIA``` ```SRM``` |
+| label_name | | If the samples were labeled (e.g. TMT), provide the name/ID of the label on this sample. | |
+| lc_instrument_vendor | | The manufacturer of the instrument used for LC | ```Thermo Fisher Scientific``` ```Sciex``` ```In-House``` ```Agilent Technologies``` ```Waters``` ```Bruker``` ```Evosep``` |
+| lc_instrument_model | | The model number/name of the instrument used for LC | |
+| lc_column_vendor | | OPTIONAL: The manufacturer of the LC column, unless a self-packed, pulled tip capillary is used | ```Thermo Fisher Scientific``` ```In-House``` ```Waters``` ```Bruker``` ```Evosep``` ```IonOpticks``` |
+| lc_column_model | | The model number/name of the LC column. If a custom self-packed, pulled tip capillary is used, enter "Pulled tip capilary" | |
+| lc_resin | | Details of the resin used for LC, including vendor, particle size, pore size | |
+| lc_column_length_value | | Liquid chromatography column length. | |
+| lc_column_length_unit | | Units for liquid chromatography column length (typically cm). | ```um``` ```mm``` ```cm``` |
+| lc_temperature_value | | Liquid chromatography temperature. | |
+| lc_temperature_unit | | | ```celsius``` |
+| lc_inner_diameter_value | | Liquid chromatography column inner diameter. | |
+| lc_inner_diameter_unit | | | ```um``` ```mm``` ```cm``` |
+| lc_flow_rate_value | | Value of flow rate. | |
+| lc_flow_rate_unit | | Units of flow rate. | ```nL/min``` ```mL/min``` |
+| lc_gradient_value | | Liquid chromatography gradient. | |
+| lc_gradient_unit | | Unit for liquid chromatography gradient | ```minute``` |
+| lc_mobile_phase_a | | Composition of mobile phase A | |
+| lc_mobile_phase_b | | Composition of mobile phase B | |
+| spatial_sampling_technique | | | ```nanoSPLITS``` ```nanoPOTS``` ```LESA``` ```microPOTS``` ```LCM``` ```microLESA``` |
+| spatial_sampling_target | | Specifies the cell-type or functional tissue unit (FTU) that is targeted in the spatial profiling experiment. Leave blank if data are generated in imaging mode without a specific target structure. | |
+| spatial_sampling_type | | Specifies whether or not the analysis was performed in a spatially targeted manner. Spatial profiling experiments target specific tissue foci but do not necessarily generate images. Spatial imaging experiments collect data from a regular array (pixels) that can be visualized as heat maps of ion intensity at each location (molecular images). Leave blank if data are derived from bulk analysis. | ```Imaging``` ```Profiling``` |
+| analysis_protocol_doi | | A DOI to a protocols.io protocol describing the software and database(s) used to process the raw data. Example: https://dx.doi.org/10.17504/protocols.io.bsu5ney6 | |
+| acquisition_protocol_doi | | | |
+| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | |
+
+
+
+
+## Deprecated Attributes
+
+
+* indicates a field that was previously required
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| assay_category * | | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ```sequence``` |
+| description | | Free-text description of this assay. | |
+| donor_id | | HuBMAP Display ID of the donor of the assayed tissue. | |
+| execution_datetime | | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | |
+| operator | | Name of the person responsible for executing the assay. | |
+| operator_email | | Email address for the operator. | |
+| protocols_io_doi | | DOI for protocols.io referring to the protocol for this assay. | |
+| overall_protocols_io_doi | | DOI for protocols.io for the overall process for this assay. | |
+| pi | | Name of the principal investigator responsible for the data. | |
+| pi_email | | Email address for the principal investigator. | |
+| processing_search | | Software for analyzing and searching LC-MS/MS omics data | |
+| labeling | | Indicates whether samples were labeled prior to MS analysis (e.g., TMT) | |
+| dms * | | Was differential mobility spectrometry used in this assay? | ```Yes``` ```No``` |
+| resolution_x_unit | | The unit of measurement of the width of a pixel. | ```mm``` ```um``` ```nm``` |
+| resolution_x_value | | The width of a pixel. | |
+| resolution_y_unit | | The unit of measurement of the height of a pixel. | ```mm``` ```um``` ```nm``` |
+| resolution_y_value | | The height of a pixel | |
+| version * | | Version of the schema to use when validating this metadata. | ```1``` |
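
The LC-MS attribute table marks required fields with `*` and lists allowable values per attribute. As a rough illustration of how a submitted metadata row could be checked against those two rules, here is a minimal Python sketch. The `REQUIRED` and `ALLOWED` dictionaries are a hand-transcribed subset of the table, and `validate_row` is a hypothetical helper, not part of any HuBMAP tooling.

```python
from __future__ import annotations

# Assumption: a small, hand-copied subset of the LC-MS table, for illustration only.
REQUIRED = {
    "preparation_protocol_doi",
    "dataset_type",
    "is_targeted",
    "ms_scan_mode",
    "data_collection_mode",
}

ALLOWED = {
    "is_targeted": {"Yes", "No"},
    "ms_scan_mode": {"MS1", "MS2", "MS3"},
    "data_collection_mode": {"DDA", "PRM", "DIA", "SRM"},
    "lc_flow_rate_unit": {"nL/min", "mL/min"},
}


def validate_row(row: dict) -> list[str]:
    """Return human-readable problems found in one metadata row."""
    errors = []
    # Required fields must be present and non-blank.
    for field in sorted(REQUIRED):
        if not str(row.get(field, "")).strip():
            errors.append(f"missing required field: {field}")
    # Fields with an enumerated list must use one of the listed values.
    for field, allowed in ALLOWED.items():
        value = str(row.get(field, "")).strip()
        if value and value not in allowed:
            errors.append(f"{field}: {value!r} is not an allowable value")
    return errors


if __name__ == "__main__":
    example = {
        "preparation_protocol_doi": "https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1",
        "dataset_type": "LC-MS",
        "is_targeted": "Yes",
        "ms_scan_mode": "MS2",
        "data_collection_mode": "MRM",  # not in the allowable list transcribed above
    }
    for problem in validate_row(example):
        print(problem)
```

Optional attributes that are left blank pass the check; only non-empty values are compared against the enumerated lists.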
diff --git a/docs/assays/metadata/LightSheet.md b/docs/assays/metadata/LightSheet.md
index 4eb8950..110dca5 100644
--- a/docs/assays/metadata/LightSheet.md
+++ b/docs/assays/metadata/LightSheet.md
@@ -1,146 +1,67 @@
----
-layout: page
----
-# Light-Sheet
-
-NOTE: Several versions of this metadata schema have been created over time. The (Latest) version contains most attributes, but there may be some deprecated attributes in the older versions for which data has been collected. HuBMAP is in the process of creating a reference which combines all of these versions into a single view. That reference will be available here once completed.
-
- Version 3 (latest)
-
-## Version 3 (latest)
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-----------------------------------------------------|-----------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------|------------|
-| source_storage_duration_value | Numeric | How long was the source material (parent) stored, prior to this sample being processed. | | True |
-| time_since_acquisition_instrument_calibration_value | Numeric | The amount of time since the acqusition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | | False |
-| contributors_path | Textfield | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | True |
-| data_path | Textfield | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." where as for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "/TEST001-RK/" for this field. If there are multiple directory levels, use the format "/TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | True |
-| is_image_preprocessing_required | Allowable Value | Depending on if the acquisition instrument was a microscope, slide scanner, etc. will indicate whether or not any level of preprocessing was required to assemble the image (e.g., fusing image tiles) . | ```Yes``` ```No``` | True |
-| slide_id | Textfield | A unique ID denoting the slide used. This allows users the ability to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | False |
-| tiled_image_columns | Numeric | This is how many columns used in stitching. This is sometimes referred to as the grid size x. | | False |
-| tiled_image_count | Numeric | This is the total number of raw (tiled) images captured, that are to be stitched together. | | False |
-| intended_tile_overlap_percentage | Numeric | The amount of overlap between tiled images. This is the set point, where as during image acquisition there will be slight variations due to stage registration. | | False |
-| dataset_type | Allowable Value | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium```| True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA ```| True |
-| acquisition_instrument_vendor | Allowable Value | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` | True |
-| acquisition_instrument_model | Allowable Value | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` | True |
-| source_storage_duration_unit | Allowable Value | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | True |
-| time_since_acquisition_instrument_calibration_unit | Allowable Value | The time unit of measurement |```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | False |
-| tile_configuration | Allowable Value | This is how the tiles are configured for stitching. | ```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | False |
-| scan_direction | Allowable Value | This is the direction of imaging, which is required for stitching. | ```Left-and-down``` ```Left-and-up``` ```Not applicable``` ```Right-and-down``` ```Right-and-up``` | False |
-| metadata_schema_id | Textfield | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | True |
-| preparation_protocol_doi | Textfield | DOI for the protocols.io page that describes the assay or sample procurment and preparation. For example for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` | True |
-| antibodies_path | Textfield | This is the location of the antibodies.tsv file relative to the root of the top level of the upload directory structure. This path should begin with "." and would likely be something like "./extras/antibodies.tsv". | | True |
-| parent_sample_id | Textfield | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | True |
-
-
-
-Version 2
-
-## Version 2
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-------------------------------|-----------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------|------------|
-| version | Allowable Value | Version of the schema to use when validating this metadata. | ['2'] | True |
-| description | Textfield | Free-text description of this assay. | | True |
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped foldergenerated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year,MM is the month with leading 0s, and DD is the day with leading 0s, hh is thehour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories:generation of images of microscopic entities, identification & quantitation ofmolecules by mass spectrometry, imaging mass spectrometry, and determination ofnucleotide sequence. | ['imaging'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['Light Sheet'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['protein'] | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted fordetection/measurement by the assay. | ['Yes','No'] | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detectionhardware and signal processing software. Assays generate signals such as lightof various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions(models) of that instrument with different features or sensitivities. Differencesin features or sensitivities may be relevant to processing or interpretation ofthe data. | | True |
-| resolution_x_value | Numeric | The width of a pixel. | | True |
-| resolution_x_unit | Allowable Value | The unit of measurement of the width of a pixel. | ['nm', 'um'] | False |
-| resolution_y_value | Numeric | The height of a pixel | | True |
-| resolution_y_unit | Allowable Value | The unit of measurement of the height of a pixel. | ['nm', 'um'] | False |
-| range_z_value | Numeric | The total range of the z axis. | | True |
-| range_z_unit | Allowable Value | The unit of range_z_value. | ['nm', 'um'] | False |
-| step_z_value | Numeric | The number of optical sections in z axis range. | | True |
-| increment_z_value | Numeric | The distance between sequential optical sections. | | True |
-| increment_z_unit | Allowable Value | The units of increment z value. | ['nm', 'um'] | False |
-| number_of_antibodies | Numeric | Number of antibodies | | True |
-| number_of_channels | Numeric | Number of fluorescent channels imaged during each cycle. | | True |
-| antibodies_path | Textfield | Relative path to file with antibody information for this dataset. | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstreamprocessing will depend on filename extension conventions. | | True |
-
-
-
-Version 1
-
-## Version 1
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-------------------------------|-----------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------|------------|
-| version | Allowable Value | Version of the schema to use when validating this metadata. | ['1'] | True |
-| description | Textfield | Free-text description of this assay. | | True |
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped foldergenerated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year,MM is the month with leading 0s, and DD is the day with leading 0s, hh is thehour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories:generation of images of microscopic entities, identification & quantitation ofmolecules by mass spectrometry, imaging mass spectrometry, and determination ofnucleotide sequence. | ['imaging'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['Light Sheet'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['protein'] | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted fordetection/measurement by the assay. | ['Yes','No'] | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detectionhardware and signal processing software. Assays generate signals such as lightof various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions(models) of that instrument with different features or sensitivities. Differencesin features or sensitivities may be relevant to processing or interpretation ofthe data. | | True |
-| resolution_x_value | Numeric | The width of a pixel. | | True |
-| resolution_x_unit | Allowable Value | The unit of measurement of the width of a pixel. | ['nm', 'um'] | False |
-| resolution_y_value | Numeric | The height of a pixel | | True |
-| resolution_y_unit | Allowable Value | The unit of measurement of the height of a pixel. | ['nm', 'um'] | False |
-| resolution_z_value | Numeric | The distance at which two objects along the detection z-axis can bedistinguished (resolved as 2 objects). | | True |
-| resolution_z_unit | Allowable Value | The unit of distance at which two objects along the detection z-axiscan be distinguished (resolved as 2 objects). | ['mm', 'um', 'nm'] | False |
-| number_of_antibodies | Numeric | Number of antibodies | | True |
-| number_of_channels | Numeric | Number of fluorescent channels imaged during each cycle. | | True |
-| antibodies_path | Textfield | Relative path to file with antibody information for this dataset. | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstreamprocessing will depend on filename extension conventions. | | True |
-
-
-
-Version 0
-
-## Version 0
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-------------------------------|-----------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------|------------|
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped foldergenerated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year,MM is the month with leading 0s, and DD is the day with leading 0s, hh is thehour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories:generation of images of microscopic entities, identification & quantitation ofmolecules by mass spectrometry, imaging mass spectrometry, and determination ofnucleotide sequence. | ['imaging'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['Light Sheet'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['protein'] | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted fordetection/measurement by the assay. | ['Yes','No'] | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detectionhardware and signal processing software. Assays generate signals such as lightof various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions(models) of that instrument with different features or sensitivities. Differencesin features or sensitivities may be relevant to processing or interpretation ofthe data. | | True |
-| resolution_x_value | Numeric | The width of a pixel. | | True |
-| resolution_x_unit | Allowable Value | The unit of measurement of the width of a pixel. | ['nm', 'um'] | False |
-| resolution_y_value | Numeric | The height of a pixel | | True |
-| resolution_y_unit | Allowable Value | The unit of measurement of the height of a pixel. | ['nm', 'um'] | False |
-| resolution_z_value | Numeric | The distance at which two objects along the detection z-axis can bedistinguished (resolved as 2 objects). | | True |
-| resolution_z_unit | Allowable Value | The unit of distance at which two objects along the detection z-axiscan be distinguished (resolved as 2 objects). | ['mm', 'um', 'nm'] | False |
-| number_of_antibodies | Numeric | Number of antibodies | | True |
-| number_of_channels | Numeric | Number of fluorescent channels imaged during each cycle. | | True |
-| antibodies_path | Textfield | Relative path to file with antibody information for this dataset. | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstreamprocessing will depend on filename extension conventions. | | True |
-
-
+---
+layout: page-triary
+---
+
+# Light-Sheet Metadata Attributes
+
+Fields that are collected for Light-Sheet data, available at ```Dataset.metadata.```
+
+
+* indicates a required field
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples, then this should be a comma-separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | |
+| lab_id | | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | |
+| preparation_protocol_doi * | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case, the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | ```https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1``` |
+| dataset_type * | | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium``` |
+| analyte_class * | | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA``` |
+| is_targeted * | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` |
+| acquisition_instrument_vendor * | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` |
+| acquisition_instrument_model * | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` |
+| source_storage_duration_value | | How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. | |
+| source_storage_duration_unit * | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` |
+| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | |
+| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` |
+| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | |
+| data_path | | The top level directory containing the raw and/or processed data. For a single dataset upload this might be ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK", use the syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2", in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | |
+| antibodies_path | | This is the location of the antibodies.tsv file relative to the root of the top level of the upload directory structure. This path should begin with "." and would likely be something like "./extras/antibodies.tsv". | |
+| is_image_preprocessing_required * | | Indicates whether image preprocessing is necessary based on the type of acquisition instrument used, such as a microscope or slide scanner. This may involve steps like fusing image tiles to assemble the complete image. Example: Yes | ```Yes``` ```No``` |
+| slide_id | | A unique ID denoting the slide used. This allows users the ability to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | |
+| tile_configuration | | The configuration of tiles used for stitching in the assay process. If no tile configuration is applicable, enter "Not applicable". Example: Row-by-row | ```Column-by-column``` ```Not applicable``` ```Snake-by-columns``` ```Row-by-row``` ```Snake-by-rows``` |
+| scan_direction | | The direction of imaging, which is necessary for the stitching process. Example: Left-and-down | ```Left-and-down``` ```Right-and-down``` ```Not applicable``` ```Right-and-up``` ```Left-and-up``` |
+| tiled_image_columns | | The number of columns used in the stitching process of a tiled image, often referred to as the grid size in the x-dimension. Example: 5 | |
+| tiled_image_count | | The total number of raw tiled images captured, which are intended to be stitched together. Example: 75 | |
+| intended_tile_overlap_percentage | | The intended percentage of overlap between tiled images. This value serves as the set point, although slight variations may occur during image acquisition due to stage registration. Example: 5 | |
+| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | |
+
+
+
+
+## Deprecated Attributes
+
+
+* indicates a field that was previously required
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| assay_category * | | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ```sequence``` |
+| description | | Free-text description of this assay. | |
+| donor_id | | HuBMAP Display ID of the donor of the assayed tissue. | |
+| execution_datetime | | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | |
+| number_of_antibodies | | Number of antibodies | |
+| number_of_channels | | Number of fluorescent channels imaged during each cycle. | |
+| operator | | Name of the person responsible for executing the assay. | |
+| operator_email | | Email address for the operator. | |
+| principal_investigator | | The full name of the principal investigator responsible for the data. | |
+| pi_email | | Email address for the principal investigator. | |
+| resolution_x_unit | | The unit of measurement of the width of a pixel. (nm) | ```mm``` ```um``` ```nm``` |
+| resolution_x_value | | The width of a pixel. (Akoya pixel is 377nm square) | |
+| resolution_y_unit | | The unit of measurement of the height of a pixel. (nm) | ```mm``` ```um``` ```nm``` |
+| resolution_y_value | | The height of a pixel. (Akoya pixel is 377nm square) | |
+| increment_z_unit | | The units of increment z value. | ```nm``` ```um``` |
+| increment_z_value | | The distance between sequential optical sections. | |
+| step_z_value | | The number of optical sections in z axis range. | |
+| range_z_value | | The total range of the z axis. | |
+| range_z_unit | | The unit of range_z_value. | ```nm``` ```um``` |
+| version * | | Version of the schema to use when validating this metadata. | ```1``` |
diff --git a/docs/assays/metadata/MALDI.md b/docs/assays/metadata/MALDI.md
index ccf79e7..7eb1f27 100644
--- a/docs/assays/metadata/MALDI.md
+++ b/docs/assays/metadata/MALDI.md
@@ -1,171 +1,67 @@
----
-layout: page
----
-# MALDI
-
-NOTE: Several versions of this metadata schema have been created over time. The (Latest) version contains most attributes, but there may be some deprecated attributes in the older versions for which data has been collected. HuBMAP is in the process of creating a reference which combines all of these versions into a single view. That reference will be available here once completed.
-
- Maldi Version 2 (latest)
-
-## Maldi Version 2 (latest)
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-----------------------------------------------------|-----------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------|------------|
-| dataset_type | Allowable Value | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium```| True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA ```| True |
-| acquisition_instrument_vendor | Allowable Value | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` | True |
-| acquisition_instrument_model | Allowable Value | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` | True |
-| source_storage_duration_value | Numeric | How long was the source material (parent) stored, prior to this sample being processed. | | True |
-| source_storage_duration_unit | Allowable Value | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | True |
-| time_since_acquisition_instrument_calibration_value | Numeric | The amount of time since the acqusition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | | False |
-| time_since_acquisition_instrument_calibration_unit | Allowable Value | The time unit of measurement |```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | False |
-| preparation_protocol_doi | Textfield | DOI for the protocols.io page that describes the assay or sample procurment and preparation. For example for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` | True |
-| contributors_path | Textfield | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | True |
-| data_path | Textfield | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." where as for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | True |
-| mass_analysis_polarity | Allowable Value | The polarity of the mass analysis (positive or negative ion modes). | ```Negative and positive ion mode``` ```Negative ion mode``` ```Positive ion mode``` | True |
-| mass_resolving_power | Numeric | The mass resolving power m/∆m, where ∆m is defined as the full width at half-maximum (FWHM) for a given peak with a specified mass-to-charge (m/z). (unitless) | | True |
-| mass-to-charge_resolving_power | Numeric | The peak (m/z) used to calculate the resolving power. | | True |
-| ion_mobility | Allowable Value | Specifies which technology was used for ion mobility spectrometry. Technologies for measuring ion mobility: Traveling Wave Ion Mobility Spectrometry (TWIMS), Trapped Ion Mobility Spectrometry (TIMS), High Field Asymmetric waveform ion Mobility Spectrometry (FAIMS), Drift Tube Ion Mobility Spectrometry (DTIMS), Structures for Lossless Ion Manipulations (SLIM), and cyclic Ion Mobility Spectrometry (cIMS). | ```cIMS``` ```DTIMS``` ```FAIMS``` ```SLIM``` ```TIMS``` ```TWIMS``` | False |
-| matrix_deposition_method | Allowable Value | Common methods of depositing matrix for assisting in desorption and ionization in imaging mass spectrometry include robotic spotting, electrospray deposition, and sublimation. | ```Electrospray deposition``` ```Not applicable``` ```Robotic spotting``` ```Robotic spraying``` ```Sublimation``` | True |
-| preparation_instrument_vendor | Allowable Value | The manufacturer of the instrument used to prepare (staining/processing) the sample for the assay. If an automatic slide staining method was indicated this field should list the manufacturer of the instrument. | ```10x Genomics``` ```Hamamatsu``` ```HTX Technologies``` ```In-House``` ```Leica Biosystems``` ```Not applicable``` ```Roche Diagnostics``` ```SunChrom``` ```Thermo Fisher Scientific``` | False |
-| preparation_instrument_model | Allowable Value | Manufacturers of a staining system instrument may offer various versions (models) of that instrument with different features. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```AutoStainer XL``` ```Chromium Connect``` ```Chromium Controller``` ```Chromium iX``` ```Chromium X``` ```Discovery Ultra``` ```EVOS M7000``` ```M3+ Sprayer``` ```M5 Sprayer``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```Not applicable``` ```ST5020 Multistainer``` ```Sublimator``` ```SunCollect Sprayer``` ```TM-Sprayer``` ```Visium CytAssist ```| False |
-| preparation_matrix | Allowable Value | The matrix is a compound of crystallized molecules that acts like a buffer between the sample and the ionizing probe. It also helps ionize the sample, carrying it along the flight tube so it can be detected. | ```2,5-DHA (2,5-dihydroxyacetophenone)``` ```2,5-DHB (2,5-Dihydroxybenzoic acid)``` ```9-AA (9-aminoacridine)``` ```CHCA (alpha-cyano-4-hydroxy-cinnamic acid)``` ```DAN (1,5-diaminonapthalene)``` ```DMACA (4-(dimethylamino)cinnamic acid)``` ```NEDC (N-(1-naphthyl) ethylenediamine dihydrochloride)``` ```SA (sinapic acid)``` | True |
-| metadata_schema_id | Textfield | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | True |
-| mass-to-charge_range_low_value | Numeric | The low value of the scanned mass-to-charge range, for MS1. (unitless) | | False |
-| mass-to-charge_range_high_value | Numeric | The high value of the scanned mass-to-charge range, for MS1. (unitless) | | False |
-| analysis_protocol_doi | Textfield | A DOI to a protocols.io protocol describing the software and database(s) used to process the raw data. Example: https://dx.doi.org/10.17504/protocols.io.bsu5ney6 | | True |
-| ms_ionization_technique | Allowable Value | The ionization approach (i.e., sample probing method) for performing imaging mass spectrometry. | ```DESI``` ```ESI``` ```HESI``` ```LA``` ```LDI``` ```MALDI``` ```MALDI-2``` ```nanoDESI``` ```SIMS-C60``` ```SIMS-H20 ```| True |
-| ms_scan_mode | Allowable Value | MS (mass spectrometry) scan mode refers to the number of steps in the separation of fragments. | ```MS1``` ```MS2``` ```MS3``` | True |
-| parent_sample_id | Textfield | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | True |
-
-
-
-IMS Version 2
-
-## IMS Version 2
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-------------------------------|-----------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------|------------|
-| version | Allowable Value | Version of the schema to use when validating this metadata. | ['2'] | True |
-| description | Textfield | Free-text description of this assay. | | True |
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['mass_spectrometry_imaging'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['MALDI-IMS', 'SIMS-IMS', 'NanoDESI', 'DESI'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['protein', 'metabolites', 'lipids', 'peptides', 'phosphopeptides', 'glycans'] | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ['Yes','No'] | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
-| ms_source | Allowable Value | The ion source type used for surface sampling (MALDI, MALDI-2, DESI, nanoDESI or SIMS). | ['MALDI', 'MALDI-2', 'LDI', 'LA', 'SIMS-C60', 'SIMS-H2O', 'DESI', 'nanoDESI'] | True |
-| polarity | Allowable Value | The polarity of the mass analysis (positive or negative ion modes) | ['negative ion mode', 'positive ion mode', 'negative and positive ion mode'] | True |
-| mz_range_low_value | Numeric | The low value of the scanned mass range for MS1. (unitless) | | True |
-| mz_range_high_value | Numeric | The high value of the scanned mass range for MS1. (unitless) | | True |
-| mass_resolving_power | Numeric | The MS1 resolving power defined as m/∆m where ∆m is the FWHM for a given peak with a specified m/z (m). (unitless) | | True |
-| mz_resolving_power | Numeric | The peak (m/z) used to calculate the resolving power. | | True |
-| ion_mobility | Allowable Value | Specifies whether or not ion mobility spectrometry was performed and which technology was used. Technologies for measuring ion mobility: Traveling Wave Ion Mobility Spectrometry (TWIMS), Trapped Ion Mobility Spectrometry (TIMS), High Field Asymmetric waveform ion Mobility Spectrometry (FAIMS), Drift Tube Ion Mobility Spectrometry (DTIMS, Structures for Lossless Ion Manipulations (SLIM). | ['TIMS', 'TWIMS', 'FAIMS', 'DTIMS', 'SLIMS'] | False |
-| ms_scan_mode | Allowable Value | Scan mode refers to the number of steps in the separation of fragments. | ['MS', 'MS/MS', 'MS3'] | True |
-| resolution_x_value | Numeric | The width of a pixel. | | True |
-| resolution_x_unit | Allowable Value | The unit of measurement of the width of a pixel. | ['nm', 'um'] | False |
-| resolution_y_value | Numeric | The height of a pixel | | True |
-| resolution_y_unit | Allowable Value | The unit of measurement of the height of a pixel. | ['nm', 'um'] | False |
-| preparation_type | Textfield | Common methods of depositing matrix for MALDI imaging include robotic spotting, electrospray deposition, and spray-coating with an airbrush. | | False |
-| preparation_instrument_vendor | Textfield | The manufacturer of the instrument used to prepare the sample for the assay. | | False |
-| preparation_instrument_model | Textfield | The model number/name of the instrument used to prepare the sample for the assay | | False |
-| preparation_maldi_matrix | Textfield | The matrix is a compound of crystallized molecules that acts like a buffer between the sample and the laser. It also helps ionize the sample, carrying it along the flight tube so it can be detected. | | False |
-| desi_solvent | Textfield | Solvent composition for conducting nanospray desorption electrospray ionization (nanoDESI) or desorption electrospray ionization (DESI). | | False |
-| desi_solvent_flow_rate | Numeric | The rate of flow of the solvent into a spray. | | False |
-| desi_solvent_flow_rate_unit | Allowable Value | Units of the rate of solvent flow. | ['uL/minute'] | False |
-| section_prep_protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for preparing tissue sections for the assay. | | True |
-| processing_protocols_io_doi | Textfield | DOI for analysis protocols.io for this assay. | | False |
-| overall_protocols_io_doi | Textfield | DOI for protocols.io for the overall process. | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
-
-
-
-IMS Version 1
-
-## IMS Version 1
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-------------------------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------|------------|
-| version | Allowable Value | Version of the schema to use when validating this metadata. | ['1'] | True |
-| description | Textfield | Free-text description of this assay. | | True |
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['mass_spectrometry_imaging'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['MALDI-IMS'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['protein', 'metabolites', 'lipids'] | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ['Yes','No'] | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
-| ms_source | Allowable Value | The ion source type used for surface sampling (MALDI, MALDI-2, DESI, or SIMS) or LC-MS/MS data acquisition (nESI) | ['MALDI', 'MALDI-2', 'DESI', 'SIMS', 'nESI'] | True |
-| polarity | Allowable Value | The polarity of the mass analysis (positive or negative ion modes) | ['negative ion mode', 'positive ion mode', 'negative and positive ion mode'] | True |
-| mz_range_low_value | Numeric | The low value of the scanned mass range for MS1. (unitless) | | True |
-| mz_range_high_value | Numeric | The high value of the scanned mass range for MS1. (unitless) | | True |
-| resolution_x_value | Numeric | The width of a pixel. | | True |
-| resolution_x_unit | Allowable Value | The unit of measurement of the width of a pixel. | ['nm', 'um'] | False |
-| resolution_y_value | Numeric | The height of a pixel | | True |
-| resolution_y_unit | Allowable Value | The unit of measurement of the height of a pixel. | ['nm', 'um'] | False |
-| preparation_type | Textfield | Common methods of depositing matrix for MALDI imaging include robotic spotting, electrospray deposition, and spray-coating with an airbrush. | | True |
-| preparation_instrument_vendor | Textfield | The manufacturer of the instrument used to prepare the sample for the assay. | | True |
-| preparation_instrument_model | Textfield | The model number/name of the instrument used to prepare the sample for the assay | | True |
-| preparation_maldi_matrix | Textfield | The matrix is a compound of crystallized molecules that acts like a buffer between the sample and the laser. It also helps ionize the sample, carrying it along the flight tube so it can be detected. | | True |
-| section_prep_protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for preparing tissue sections for the assay. | | True |
-| overall_protocols_io_doi | Textfield | DOI for protocols.io for the overall process. | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
-
-
-
-IMS Version 0
-
-## IMS Version 0
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-------------------------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------|------------|
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['mass_spectrometry_imaging'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['MALDI-IMS'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['protein', 'metabolites', 'lipids'] | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ['Yes','No'] | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
-| ms_source | Allowable Value | The ion source type used for surface sampling (MALDI, MALDI-2, DESI, or SIMS) or LC-MS/MS data acquisition (nESI) | ['MALDI', 'MALDI-2', 'DESI', 'SIMS', 'nESI'] | True |
-| polarity | Allowable Value | The polarity of the mass analysis (positive or negative ion modes) | ['negative ion mode', 'positive ion mode', 'negative and positive ion mode'] | True |
-| mz_range_low_value | Numeric | The low value of the scanned mass range for MS1. (unitless) | | True |
-| mz_range_high_value | Numeric | The high value of the scanned mass range for MS1. (unitless) | | True |
-| resolution_x_value | Numeric | The width of a pixel. | | True |
-| resolution_x_unit | Allowable Value | The unit of measurement of the width of a pixel. | ['nm', 'um'] | False |
-| resolution_y_value | Numeric | The height of a pixel | | True |
-| resolution_y_unit | Allowable Value | The unit of measurement of the height of a pixel. | ['nm', 'um'] | False |
-| preparation_type | Textfield | Common methods of depositing matrix for MALDI imaging include robotic spotting, electrospray deposition, and spray-coating with an airbrush. | | True |
-| preparation_instrument_vendor | Textfield | The manufacturer of the instrument used to prepare the sample for the assay. | | True |
-| preparation_instrument_model | Textfield | The model number/name of the instrument used to prepare the sample for the assay | | True |
-| preparation_maldi_matrix | Textfield | The matrix is a compound of crystallized molecules that acts like a buffer between the sample and the laser. It also helps ionize the sample, carrying it along the flight tube so it can be detected. | | True |
-| section_prep_protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for preparing tissue sections for the assay. | | True |
-| overall_protocols_io_doi | Textfield | DOI for protocols.io for the overall process. | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
-
-
+---
+layout: page-triary
+---
+
+# MALDI Metadata Attributes
+
+Fields that are collected for MALDI data, available at ```Dataset.metadata.```
+
+
+* indicates a required field
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma-separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | |
+| lab_id | | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | |
+| preparation_protocol_doi * | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | ```https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1``` |
+| dataset_type * | | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium``` |
+| analyte_class * | | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA``` |
+| is_targeted * | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` |
+| acquisition_instrument_vendor * | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` |
+| acquisition_instrument_model * | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` |
+| source_storage_duration_value | | How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. | |
+| source_storage_duration_unit * | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` |
+| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | |
+| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` |
+| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | |
+| data_path | | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | |
+| ms_ionization_technique * | | The ionization approach (i.e., sample probing method) for performing imaging mass spectrometry. | ```DESI``` ```ESI``` ```HESI``` ```LA``` ```LDI``` ```MALDI``` ```MALDI-2``` ```nanoDESI``` ```SIMS-C60``` ```SIMS-H20``` |
+| ms_scan_mode * | | MS (mass spectrometry) scan mode refers to the number of steps in the separation of fragments. | ```MS1``` ```MS2``` ```MS3``` |
+| mass_analysis_polarity * | | The polarity of the mass analysis (positive or negative ion modes). | ```Negative and positive ion mode``` ```Negative ion mode``` ```Positive ion mode``` |
+| mass_to_charge_range_low_value | | The low value of the scanned mass-to-charge range, for MS1. (unitless) | |
+| mass_to_charge_range_high_value | | The high value of the scanned mass-to-charge range, for MS1. (unitless) | |
+| mass_resolving_power | | The mass resolving power m/∆m, where ∆m is defined as the full width at half-maximum (FWHM) for a given peak with a specified mass-to-charge (m/z). (unitless) | |
+| mass_to_charge_resolving_power | | The peak (m/z) used to calculate the resolving power. | |
+| ion_mobility | | Specifies which technology was used for ion mobility spectrometry. Technologies for measuring ion mobility: Traveling Wave Ion Mobility Spectrometry (TWIMS), Trapped Ion Mobility Spectrometry (TIMS), High Field Asymmetric waveform ion Mobility Spectrometry (FAIMS), Drift Tube Ion Mobility Spectrometry (DTIMS), Structures for Lossless Ion Manipulations (SLIM), and cyclic Ion Mobility Spectrometry (cIMS). | ```TIMS``` ```SLIM``` ```FAIMS``` ```DTIMS``` ```cIMS``` ```TWIMS``` |
+| matrix_deposition_method * | | Common methods of depositing matrix for assisting in desorption and ionization in imaging mass spectrometry include robotic spotting, electrospray deposition, and sublimation. | ```Electrospray deposition``` ```Not applicable``` ```Robotic spotting``` ```Robotic spraying``` ```Sublimation``` |
+| preparation_instrument_vendor * | | The manufacturer of the instrument used to prepare (staining/processing) the sample for the assay. If an automatic slide staining method was indicated this field should list the manufacturer of the instrument. | ```10x Genomics``` ```Hamamatsu``` ```HTX Technologies``` ```In-House``` ```Leica Biosystems``` ```Not applicable``` ```Roche Diagnostics``` ```SunChrom``` ```Thermo Fisher Scientific``` |
+| preparation_instrument_model * | | Manufacturers of a staining system instrument may offer various versions (models) of that instrument with different features. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```AutoStainer XL``` ```Chromium Connect``` ```Chromium Controller``` ```Chromium iX``` ```Chromium X``` ```Discovery Ultra``` ```EVOS M7000``` ```M3+ Sprayer``` ```M5 Sprayer``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```Not applicable``` ```ST5020 Multistainer``` ```Sublimator``` ```SunCollect Sprayer``` ```TM-Sprayer``` ```Visium CytAssist``` |
+| preparation_matrix * | | The matrix is a compound of crystallized molecules that acts like a buffer between the sample and the ionizing probe. It also helps ionize the sample, carrying it along the flight tube so it can be detected. | ```2,5-DHA (2,5-dihydroxyacetophenone)``` ```2,5-DHB (2,5-Dihydroxybenzoic acid)``` ```9-AA (9-aminoacridine)``` ```CHCA (alpha-cyano-4-hydroxy-cinnamic acid)``` ```DAN (1,5-diaminonapthalene)``` ```DMACA (4-(dimethylamino)cinnamic acid)``` ```NEDC (N-(1-naphthyl) ethylenediamine dihydrochloride)``` ```SA (sinapic acid)``` |
+| analysis_protocol_doi | | A DOI to a protocols.io protocol describing the software and database(s) used to process the raw data. Example: https://dx.doi.org/10.17504/protocols.io.bsu5ney6 | |
+| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | |
+
+
+
+
+## Deprecated Attributes
+
+
+* indicates a field that was previously required
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| assay_category * | | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ```sequence``` |
+| description | | Free-text description of this assay. | |
+| section_prep_protocols_io_doi | | DOI for protocols.io referring to the protocol for preparing tissue sections for the assay. | |
+| overall_protocols_io_doi | | DOI for protocols.io referring to the overall protocol for the assay. | |
+| donor_id | | HuBMAP Display ID of the donor of the assayed tissue. | |
+| execution_datetime | | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | |
+| operator | | Name of the person responsible for executing the assay. | |
+| operator_email | | Email address for the operator. | |
+| pi | | Name of the principal investigator responsible for the data. | |
+| pi_email | | Email address for the principal investigator. | |
+| resolution_x_value | | The width of a pixel. (Akoya pixel is 377nm square) | |
+| resolution_x_unit | | The unit of measurement of the width of a pixel. (nm) | ```mm``` ```um``` ```nm``` |
+| resolution_y_value | | The height of a pixel. (Akoya pixel is 377nm square) | |
+| resolution_y_unit | | The unit of measurement of the height of a pixel. (nm) | ```mm``` ```um``` ```nm``` |
+| version * | | Version of the schema to use when validating this metadata. | ```1``` |
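Reviewer note (not part of the patch): the MALDI attribute table above marks required fields with `*` and lists allowable values per attribute. A minimal sketch of how a single metadata row might be checked against those rules is shown below. The function names, the `REQUIRED` and `ALLOWED` dictionaries, and the choice of a small subset of attributes are assumptions made for illustration; this is not the validator HuBMAP uses for ingest.

```python
# Hypothetical helper: only a small subset of the MALDI attributes is covered.
REQUIRED = {"dataset_type", "is_targeted", "ms_scan_mode", "mass_analysis_polarity"}

# Allowable values copied from the table above for three attributes.
ALLOWED = {
    "is_targeted": {"Yes", "No"},
    "ms_scan_mode": {"MS1", "MS2", "MS3"},
    "mass_analysis_polarity": {
        "Negative and positive ion mode",
        "Negative ion mode",
        "Positive ion mode",
    },
}


def split_parent_sample_ids(value):
    """Split the comma-separated parent_sample_id field into individual IDs."""
    return [part.strip() for part in value.split(",") if part.strip()]


def check_row(row):
    """Return a list of human-readable problems found in one metadata row."""
    problems = []
    for name in sorted(REQUIRED):
        if not row.get(name, "").strip():
            problems.append(f"missing required field: {name}")
    for name, allowed in ALLOWED.items():
        value = row.get(name, "").strip()
        if value and value not in allowed:
            problems.append(f"{name}: {value!r} is not an allowable value")
    return problems


def mass_resolving_power(mz, fwhm):
    """Resolving power m/∆m, where ∆m is the FWHM of the peak at m/z = mz."""
    return mz / fwhm
```

For example, `split_parent_sample_ids("HBM386.ZGKG.235, HBM672.MKPK.442")` yields the two IDs from the table's example, and `check_row` returns an empty list when every covered field is present and uses an allowable value.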
diff --git a/docs/assays/metadata/MIBI.md b/docs/assays/metadata/MIBI.md
index 6c06a9b..e1d5819 100644
--- a/docs/assays/metadata/MIBI.md
+++ b/docs/assays/metadata/MIBI.md
@@ -1,104 +1,86 @@
----
-layout: page
----
-# MIBI
-
-NOTE: Several versions of this metadata schema have been created over time. The (Latest) version contains most attributes, but there may be some deprecated attributes in the older versions for which data has been collected. HuBMAP is in the process of creating a reference which combines all of these versions into a single view. That reference will be available here once completed.
-
- Version 2 (latest)
-
-## Version 2 (latest)
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-----------------------------------------------------|-----------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------|------------|
-| dataset_type | Allowable Value | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium```| True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA ```| True |
-| acquisition_instrument_vendor | Allowable Value | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` | True |
-| acquisition_instrument_model | Allowable Value | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` | True |
-| source_storage_duration_value | Numeric | How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. | | True |
-| source_storage_duration_unit | Allowable Value | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | True |
-| time_since_acquisition_instrument_calibration_value | Numeric | The amount of time since the acqusition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | | False |
-| time_since_acquisition_instrument_calibration_unit | Allowable Value | The time unit of measurement |```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | False |
-| preparation_protocol_doi | Textfield | DOI for the protocols.io page that describes the assay or sample procurment and preparation. For example for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` | True |
-| contributors_path | Textfield | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | True |
-| data_path | Textfield | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." where as for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | True |
-| parent_sample_id | Textfield | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | True |
-| number_of_antibodies | Numeric | Number of antibodies | | True |
-| number_of_channels | Numeric | The number of distinct color channels in the image. | | True |
-| slide_id | Textfield | A unique ID denoting the slide used. This allows users the ability to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | True |
-| roi_description | Textfield | A description of the anatomical structure being captured in the region of interest (ROI). | | True |
-| roi_id | Numeric | Multiple images are acquired from regions of interest (ROI1, ROI2, ROI3, etc) on a slide. The ROI ID is a number from 1 to N representing the ROI captured on a slide. | | True |
-| acquisition_id | Textfield | The acquisition_id refers to the directory containing the ROI images for a slide. Together, the "Acquisition ID" and the "ROI ID" indicate the slide-ROI represented in the image. | | True |
-| area_normalized_ion_dose_value | Numeric | Number of primary ions delivered to the sample per unit area. | | True |
-| area_normalized_ion_dose_unit | Allowable Value | Area normalized ion dose unit. | ```nA*hr/mm2``` | True |
-| data_precision_bytes | Numeric | Numerical data precision in bytes. | | True |
-| pixel_dwell_time_value | Numeric | Resident time of primary ion beam on each pixel to ionize it. | | True |
-| pixel_dwell_time_unit | Allowable Value | Pixel dwell time unit. | ```ms``` | True |
-| antibodies_path | Textfield | This is the location of the antibodies.tsv file relative to the root of the top level of the upload directory structure. This path should begin with "." and would likely be something like "./extras/antibodies.tsv". | | True |
-| metadata_schema_id | Textfield | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | True |
-
-
-
-
- Version 1
-
-## Version 1
-
-| Attribute | Type | Description | Allowable Values | Required |
-|--------------------------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------|------------|
-| version | Allowable Value | Version of the schema to use when validating this metadata. | ['1'] | True |
-| description | Textfield | Free-text description of this assay. | | True |
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['mass_spectrometry_imaging'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['MIBI'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['protein'] | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ['Yes','No'] | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
-| number_of_antibodies | Numeric | Number of antibodies | | True |
-| number_of_channels | Numeric | Number of fluorescent channels imaged during each cycle. | | True |
-| resolution_x_value | Numeric | The width of a pixel. (Akoya pixel is 377nm square) | | True |
-| resolution_x_unit | Allowable Value | The unit of measurement of width of a pixel.(nm) | ['mm', 'um', 'nm'] | False |
-| resolution_y_value | Numeric | The height of a pixel. (Akoya pixel is 377nm square) | | True |
-| resolution_y_unit | Allowable Value | The unit of measurement of height of a pixel. (nm) | ['mm', 'um', 'nm'] | False |
-| max_x_width_value | Numeric | Image width value of the ROI acquisition | | True |
-| max_x_width_unit | Allowable Value | Units of image width of the ROI acquisition | ['um'] | False |
-| max_y_height_value | Numeric | Image height value of the ROI acquisition | | True |
-| max_y_height_unit | Allowable Value | Units of image height of the ROI acquisition | ['um'] | False |
-| roi_description | Textfield | A description of the region of interest (ROI) captured in the image. | | True |
-| roi_id | Numeric | Multiple images (1-n) are acquired from regions of interest (ROI1, ROI2, ROI3, etc) on a slide. The roi_id is a number from 1-n representing the ROI captured on a slide. | | True |
-| acquisition_id | Textfield | The acquisition_id refers to the directory containing the ROI images for a slide. Together, the acquisition_id and the roi_id indicate the slide-ROI represented in the image. | | True |
-| area_normalized_ion_dose_unit | Allowable Value | Area normalized ion dose unit | ['nA*hr/mm2'] | False |
-| area_normalized_ion_dose_value | Numeric | Number of primary ions delivered to the sample per unit area | | True |
-| data_precision_bytes | Numeric | Numerical data precision in bytes | | True |
-| dual_count_start | Numeric | Threshold for dual counting. | | True |
-| end_datetime | Datetime | Time stamp indicating end of ablation for ROI | | True |
-| pixel_dwell_time_value | Numeric | Resident time of primary ion beam on each pixel. | | True |
-| pixel_dwell_time_unit | Allowable Value | Pixel dwell time unit. | ['ms'] | False |
-| pixel_size_x_value | Numeric | Width value of the pixel or voxel measurement (distinct from the image resolution_x_value). | | True |
-| pixel_size_x_unit | Allowable Value | Width unit of the pixel or voxel measurement. | ['nm'] | False |
-| pixel_size_y_value | Numeric | Length value of the pixel or voxel measurement (distinct from the image resolution_y_value). | | True |
-| pixel_size_y_unit | Allowable Value | Length unit of the pixel or voxel measurement. | ['nm'] | False |
-| preparation_instrument_vendor | Allowable Value | The manufacturer of the instrument used to prepare the sample for the assay. | ['Custom', 'Ionpath'] | True |
-| preparation_instrument_model | Allowable Value | The model number/name of the instrument used to prepare the sample for the assay | ['Custom', 'MIBIscope 1', 'MIBIscope 2'] | True |
-| primary_ion | Allowable Value | Primary ion. | ['Xe'] | True |
-| primary_ion_current_value | Numeric | Primary ion current value. | | True |
-| primary_ion_current_unit | Allowable Value | Primary ion current unit, typically nA or pA | ['nA', 'pA'] | False |
-| reagent_prep_protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for preparing reagents for the assay. | | True |
-| section_prep_protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for preparing tissue sections for the assay. | | True |
-| segment_data_format | Allowable Value | This refers to the data type, which is a "float" for the IMC counts. | ['float', 'integer', 'string'] | True |
-| signal_type | Allowable Value | Type of signal measured per channel (usually dual counts) | ['dual count', 'pulse count', 'intensity value'] | True |
-| start_datetime | Datetime | Time stamp indicating start of ablation for ROI | | True |
-| antibodies_path | Textfield | Relative path to file with antibody information for this dataset. | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
-
-
+---
+layout: page-triary
+---
+
+# MIBI Metadata Attributes
+
+Fields that are collected for MIBI data, available at ```Dataset.metadata.```
+
+
+* indicates a required field
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples, this should be a comma-separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | |
+| lab_id | | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | |
+| preparation_protocol_doi * | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | ```https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1``` |
+| dataset_type * | | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium``` |
+| analyte_class * | | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA``` |
+| is_targeted * | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ```Yes``` ```No``` |
+| acquisition_instrument_vendor * | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` |
+| acquisition_instrument_model * | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` |
+| source_storage_duration_value | | How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. | |
+| source_storage_duration_unit * | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` |
+| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | |
+| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` |
+| contributors_path | | Relative path to file with ORCID IDs for contributors for this dataset. | |
+| data_path | | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | |
+| number_of_antibodies | | Number of antibodies | |
+| number_of_channels | | Number of fluorescent channels imaged during each cycle. | |
+| slide_id | | A unique ID denoting the slide used. This allows users to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | |
+| roi_description | | A description of the region of interest (ROI) captured in the image. | |
+| roi_id | | Multiple images (1-n) are acquired from regions of interest (ROI1, ROI2, ROI3, etc) on a slide. The roi_id is a number from 1-n representing the ROI captured on a slide. | |
+| acquisition_id | | The acquisition_id refers to the directory containing the ROI images for a slide. Together, the acquisition_id and the roi_id indicate the slide-ROI represented in the image. | |
+| area_normalized_ion_dose_value | | Number of primary ions delivered to the sample per unit area | |
+| area_normalized_ion_dose_unit * | | Area normalized ion dose unit | ```nA*hr/mm2``` |
+| data_precision_bytes | | Numerical data precision in bytes | |
+| pixel_dwell_time_value | | Resident time of primary ion beam on each pixel. | |
+| pixel_dwell_time_unit * | | Pixel dwell time unit. | ```ms``` |
+| antibodies_path | | Relative path to file with antibody information for this dataset. | |
+| primary_ion * | | Primary ion. | ```Xe``` |
+| primary_ion_current_unit | | Primary ion current unit, typically nA or pA | ```nA``` ```pA``` |
+| primary_ion_current_value | | Primary ion current value. | |
+| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | |
+
+
+
+
+## Deprecated Attributes
+
+
+* indicates a field that was previously required
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| assay_category * | | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ```sequence``` |
+| description | | Free-text description of this assay. | |
+| donor_id | | HuBMAP Display ID of the donor of the assayed tissue. | |
+| execution_datetime | | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | |
+| protocol_io_doi | | | |
+| reagent_prep_protocols_io_doi | | DOI for protocols.io referring to the protocol for preparing reagents for the assay. | |
+| preparation_instrument_model * | | The model number/name of the instrument used to prepare the sample for the assay | ```AutoStainer XL``` ```Chromium Connect``` ```Chromium Controller``` ```Chromium iX``` ```Chromium X``` ```Discovery Ultra``` ```EVOS M7000``` ```M3+ Sprayer``` ```M5 Sprayer``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```Not applicable``` ```ST5020 Multistainer``` ```Sublimator``` ```SunCollect Sprayer``` ```TM-Sprayer``` ```Visium CytAssist``` |
+| preparation_instrument_vendor * | | The manufacturer of the instrument used to prepare the sample for the assay. | ```10x Genomics``` ```Hamamatsu``` ```HTX Technologies``` ```In-House``` ```Leica Biosystems``` ```Not applicable``` ```Roche Diagnostics``` ```SunChrom``` ```Thermo Fisher Scientific``` |
+| operator | | Name of the person responsible for executing the assay. | |
+| operator_email | | Email address for the operator. | |
+| pi | | Name of the principal investigator responsible for the data. | |
+| pi_email | | Email address for the principal investigator. | |
+| segment_data_format * | | This refers to the data type, which is a "float" for the IMC counts. | ```float``` ```integer``` ```string``` |
+| signal_type * | | Type of signal measured per channel (usually dual counts) | ```dual count``` ```pulse count``` ```intensity value``` |
+| dual_count_start | | Threshold for dual counting. | |
+| start_datetime | | Time stamp indicating start of ablation for ROI | |
+| end_datetime | | Time stamp indicating end of ablation for ROI | |
+| resolution_x_unit | | The unit of measurement of width of a pixel. (nm) | ```mm``` ```um``` ```nm``` |
+| resolution_x_value | | The width of a pixel. (Akoya pixel is 377nm square) | |
+| resolution_y_unit | | The unit of measurement of height of a pixel. (nm) | ```mm``` ```um``` ```nm``` |
+| resolution_y_value | | The height of a pixel. (Akoya pixel is 377nm square) | |
+| max_x_width_unit | | Units of image width of the ROI acquisition | ```um``` |
+| max_x_width_value | | Image width value of the ROI acquisition | |
+| max_y_height_unit | | Units of image height of the ROI acquisition | ```um``` |
+| max_y_height_value | | Image height value of the ROI acquisition | |
+| pixel_size_x_unit | | Width unit of the pixel or voxel measurement. | ```nm``` |
+| pixel_size_x_value | | Width value of the pixel or voxel measurement (distinct from the image resolution_x_value). | |
+| pixel_size_y_unit | | Length unit of the pixel or voxel measurement. | ```nm``` |
+| pixel_size_y_value | | Length value of the pixel or voxel measurement (distinct from the image resolution_y_value). | |
+| resolution_x_value | | The width of a pixel. (Akoya pixel is 377nm square) | |
+| resolution_y_value | | The height of a pixel. (Akoya pixel is 377nm square) | |
+| version * | | Version of the schema to use when validating this metadata. | ```1``` |
diff --git a/docs/assays/metadata/MUSIC.md b/docs/assays/metadata/MUSIC.md
index 58c7128..7d40e1f 100644
--- a/docs/assays/metadata/MUSIC.md
+++ b/docs/assays/metadata/MUSIC.md
@@ -1,58 +1,62 @@
----
-layout: page
----
-# MUSIC
-
- Current Metadata Attributes
-
-## Current Metadata Attributes
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-----------------------------------------------------|-----------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------|------------|
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA ```| True |
-| acquisition_instrument_vendor | Allowable Value | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` | True |
-| acquisition_instrument_model | Allowable Value | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` | True |
-| source_storage_duration_value | Numeric | How long was the source material (parent) stored, prior to this sample being processed. | | True |
-| source_storage_duration_unit | Allowable Value | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | True |
-| time_since_acquisition_instrument_calibration_value | Numeric | The amount of time since the acqusition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | | False |
-| time_since_acquisition_instrument_calibration_unit | Allowable Value | The time unit of measurement |```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | False |
-| contributors_path | Textfield | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | True |
-| data_path | Textfield | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." where as for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | True |
-| barcode_read | Allowable Value | Which read file contains the cell or capture spot barcode. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if the source material is barcoded. This field is used to determine which analysis pipeline to run. | ```Read 2 (R2)``` ```Read 1 (R1)``` ```Not applicable``` | True |
-| barcode_size | Allowable Value | Length of the cell or capture spot barcode in base pairs. Cell and capture spot barcodes are, for example, 3 x 8 bp sequences that are spaced by constant sequences, the offsets. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if the source material is barcoded. This field is used to determine which analysis pipeline to run. | ```14``` ```16``` ```40``` ```8,8,8``` ```8,6``` ```14-17,14,14``` ```Not applicable``` | True |
-| umi_read | Allowable Value | Which read file contains the UMI barcode. This should be included when constructing sequencing libraries with a non-commercial kit. | ```Read 2 (R2)``` ```Read 1 (R1)``` ```Not applicable``` | True |
-| umi_size | Allowable Value | Length of the umi barcode in base pairs. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if UMI are present. This field is used to determine which analysis pipeline to run. | ```8``` ```9``` ```10``` ```12``` ```Not applicable``` | True |
-| assay_input_entity | Allowable Value | This is the entity from which the analyte is being captured. For example, for bulk sequencing this would be "tissue", while it would be "single cell" for single cell sequencing. This field is used to determine which analysis pipeline to run. | ```area of interest``` ```single cell``` ```single nucleus``` ```spot``` ```tissue (bulk)``` | True |
-| number_of_input_cells_or_nuclei | Numeric | How many cells or nuclei were input to the assay? This is typically not available for preparations working with bulk tissue. | | False |
-| library_adapter_sequence | Textfield | 5’ and/or 3’ read adapter sequences used as part of the library preparation protocol to render the library compatible with the sequencing protocol and instrumentation. This should be provided as comma-separated list of key:value pairs (adapter name:sequence). | | True |
-| library_average_fragment_size | Numeric | Average size of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation. Numeric value in base pairs (bp). | | True |
-| library_input_amount_value | Numeric | The amount of cDNA, after amplification, that was used for library construction. | | False |
-| library_input_amount_unit | Allowable Value | unit of library input amount value | ```ng``` ```ul``` | False |
-| library_output_amount_value | Numeric | Total amount (eg. nanograms) of library after the clean-up step of final pcr amplification step. Answer the question: What is the Qubit measured concentration (ng/ul) times the elution volume (ul) after the final clean-up step? | | False |
-| library_output_amount_unit | Allowable Value | Units of library final yield. | ```ng``` ```ul``` | False |
-| library_concentration_value | Numeric | The concentration value of the pooled library samples submitted for sequencing. | | True |
-| library_concentration_unit | Allowable Value | Unit of library concentration value. | ```ng/ul``` ```nM``` | True |
-| library_layout | Allowable Value | Whether the library was generated for single-end or paired end sequencing | ```paired-end``` ```single-end``` | True |
-| library_preparation_kit | Allowable Value | Reagent kit used for library preparation | ```10X Genomics; Automated Library Construction Kit``` ```24 rxns; PN 1000428``` ```10X Genomics; Chromium Next GEM Automated Single Cell 5' Kit v2``` ```24 rxns; PN 1000290``` ```10X Genomics; Chromium Next GEM Automated Single Cell 5' Kit v2``` ```4 rxns; PN 1000298``` ```10X Genomics; Chromium Next GEM Single Cell 3' GEM``` ```Library & Gel Bead Kit v3.1``` ```16 rxns; PN 1000121``` ```10X Genomics; Chromium Next GEM Single Cell 3' HT Kit v3.1``` ```48 rxns; PN 1000348``` ```10X Genomics; Chromium Next GEM Single Cell 3' HT Kit v3.1``` ```8 rxns; PN 1000370``` ```10X Genomics; Chromium Next GEM Single Cell 3' Kit v3.1``` ```16 rxns; PN 1000268``` ```10X Genomics; Chromium Next GEM Single Cell 3' Kit v3.1``` ```4 rxns; PN 1000269``` ```10X Genomics; Chromium Next GEM Single Cell 5' Kit v2``` ```16 rxns; PN 1000263``` ```10X Genomics; Chromium Next GEM Single Cell 5' Kit v2``` ```4 rxns; PN 1000265``` ```10X Genomics; Chromium Next GEM Single Cell Fixed RNA Hybridization & Library Kit``` ```4 rxns; PN 1000415``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```16 rxn; PN 1000283``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```4 rxn; PN 1000285``` ```10X Genomics; Chromium Single Cell 3' GEM``` ```Library & Gel Bead Kit v3``` ```4 rxns PN 1000092``` ```10X Genomics; Chromium Single Cell 3' Library & Gel Bead Kit``` ```4 rxns; PN 120267``` ```10X Genomics; Visium CytAssist Spatial Gene Expression for FFPE``` ```Human Transcriptome``` ```11 mm``` ```2 reactions; PN 1000522``` ```10X Genomics; Visium CytAssist Spatial Gene Expression for FFPE``` ```Human Transcriptome``` ```6.5mm``` ```4 reactions; PN 1000520``` ```10X Genomics; Visium Spatial for FFPE Gene Expression Kit``` ```Human Transcriptome``` ```1 slides``` ```4 reactions; PN 1000338``` ```10X Genomics; Visium Spatial 
for FFPE Gene Expression Kit``` ```Mouse Transcriptome``` ```4 rxns; PN 1000339``` ```10X Genomics; Visium Spatial Gene Expression Slide and Reagent Kit``` ```1 slides``` ```4 reactions; PN 1000187``` ```10X Genomics; Visium Spatial Gene Expression Slide and Reagent Kit``` ```4 slides``` ```16 reactions; PN 1000184``` ```Custom``` ```Illumina; TruSeq Stranded mRNA Library Prep (48 samples); PN 20020594``` ```Illumina; TruSeq Stranded mRNA Library Prep (96 samples); PN 20020595``` ```New England BioLabs; NEBNext Ultra II RNA Library Prep Kit for Illumina; PN E7770``` ```Parse Biosciences; Evercode WT Mini v2 Kit``` ```12 rxns; PN ECW02010``` ```Parse Biosciences; Evercode WT v2 Kit``` ```48 rxns; PN ECW02030)``` | True |
-| sample_indexing_kit | Allowable Value | Indexes are needed for multiplexing sequencing libraries for simultaneous sequencing (pooling) and proper attachment to the Illumina flowcell. Each indexing kit would have a number of compatible sequences ("sample indexing sets") that are used to label some number of samples (the number of sets depend on the kit). | ```10X Genomics; Chromium i7 Sample Index Plate (96 rxn); PN 220103``` ```10X Genomics; Dual Index Kit TS``` ```Set A; PN 1000251``` ```10X Genomics; Dual Index Kit TT``` ```Set A (96 rxn); PN 1000215``` ```10X Genomics; Single Index Kit N``` ```Set A (96 rxn); PN 1000212``` ```Custom``` ```Illumina; IDT for Illumina - TruSeq RNA UD Indexes v2 (96 Indexes``` ```96 Samples); PN 20040871``` ```Illumina; TruSeq RNA CD Index Plate (96 Indexes``` ```96 Samples); PN 20019792``` ```Illumina; TruSeq RNA Single Indexes Set A (12 Indexes``` ```48 Samples); PN 20020492``` ```Illumina; TruSeq RNA Single Indexes Set B (12 Indexes``` ```48 Samples); PN 20020493``` ```Integrated DNA Technologies: Custom DNA Oligos``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-AB``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-CD``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-EF``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-GH``` ```Not applicable``` ```Parse Biosciences; Fragmentation Reagents; PN WX100``` ```Parse Biosciences; UDI Plate - WT; PN UDI1001 ```| True |
-| sample_indexing_set | Textfield | The specific sequencing barcode index set used, selected from the sample indexing kit. Example: For 10X this might be "SI-GA-A1", for Nextera "N505 - CTCCTTAC" | | False |
-| is_technical_replicate | Allowable Value | Is the sequencing reaction run in replicate, "Yes" or "No". If "Yes", FASTQ files in dataset need to be merged. | ```Yes``` ```No``` | True |
-| expected_entity_capture_count | Numeric | Number of cells, nuclei or capture spots expected to be captured by the assay. For Visium this is the total number of spots covered by tissue, within the capture area. | | False |
-| sequencing_reagent_kit | Allowable Value | Reagent kit used for sequencing |```Custom``` ```Illumina; HiSeq 3000/4000 PE Cluster Kit PE-410-1001; PN 1000283``` ```Illumina; NextSeq 1000/2000 P2 Reagent v3 Kit (100 Cycles); PN 20046811``` ```Illumina; NextSeq 1000/2000 P2 Reagent v3 Kit (200 Cycles); PN 20046812``` ```Illumina; NextSeq 1000/2000 P2 Reagent v3 Kit (300 Cycles); PN 20046813``` ```Illumina; NextSeq 2000 P3 Reagent Kit (300 Cycles); PN 20040561``` ```Illumina; NextSeq 2000 P3 Reagents Kit (100 Cycles); PN 20040559``` ```Illumina; NextSeq 500/550 Hi Output Kit 150 Cycles; v2.5; PN 20024907``` ```Illumina; NextSeq 500/550 Hi Output Kit 75 Cycles v2.5; PN 20024906``` ```Illumina; NextSeq 500/550 Mid Output Kit 150 Cycles v2.5; PN 20024904``` ```Illumina; NovaSeq 6000 S1 Reagent Kit (200 Cycles); PN 20012864``` ```Illumina; NovaSeq 6000 S1 Reagent v1.5 Kit (100 Cycles); PN 20028319``` ```Illumina; NovaSeq 6000 S1 Reagent v1.5 Kit (200 Cycles); PN 20028318``` ```Illumina; NovaSeq 6000 S1 Reagent v1.5 Kit (300 Cycles); PN 20028317``` ```Illumina; NovaSeq 6000 S2 Reagent v1.5 Kit (100 Cycles); PN 20028316``` ```Illumina; NovaSeq 6000 S4 Reagent Kit v1.5 (300 cycles); PN 20028312``` ```Illumina; NovaSeq 6000 S4 Reagent v1.5 Kit (200 Cycles); PN 20028313``` ```Illumina; NovaSeq 6000 SP Reagent v1.5 Kit (100 Cycles); PN 20028401``` ```Illumina; NovaSeq X Series 1.5B Reagent Kit (100 Cycle); PN 20104703``` ```Illumina; NovaSeq X Series 1.5B Reagent Kit (200 Cycle); PN 20104704``` ```Illumina; NovaSeq X Series 1.5B Reagent Kit (300 Cycle); PN 20104705``` ```Illumina; NovaSeq X Series 10B Reagent Kit (100 Cycle); PN 20085596``` ```Illumina; NovaSeq X Series 10B Reagent Kit (200 Cycle); PN 20085595``` ```Illumina; NovaSeq X Series 10B Reagent Kit (300 Cycle); PN 20085594``` | True |
-| sequencing_read_format | Textfield | Number of sequencing cycles in each round of sequencing (i.e., Read1, i7 index, i5 index, and Read2). This is reported as a comma-delimited list. Example: For 10X snATAC-seq (R1,Index,R2,R3) this might be: 50,8,16,50. For SNARE-seq2 this might be: 75,94,8,75 | | True |
-| sequencing_batch_id | Textfield | The ID for the sequencing run. This could, for example, be the chip ID and should allow users the ability to determine which samples were processed together in a sequencing run. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | False |
-| capture_batch_id | Textfield | A lab-generated ID to identify which cells were captured at the same time. This would, for example, be an ID to denote which datasets were derived from a single 10X Genomics Chromium Controller run. In the case of the 10X Controller this could be the chip ID and would allow users the ability to determine which samples were processed together in a Chromium controller. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | False |
-| preparation_instrument_vendor | Allowable Value | The manufacturer of the instrument used to prepare (staining/processing) the sample for the assay. If an automatic slide staining method was indicated this field should list the manufacturer of the instrument. | ```10x Genomics``` ```Hamamatsu``` ```HTX Technologies``` ```In-House``` ```Leica Biosystems``` ```Not applicable``` ```Roche Diagnostics``` ```SunChrom``` ```Thermo Fisher Scientific``` | False |
-| preparation_instrument_model | Allowable Value | Manufacturers of a staining system instrument may offer various versions (models) of that instrument with different features. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```AutoStainer XL``` ```Chromium Connect``` ```Chromium Controller``` ```Chromium iX``` ```Chromium X``` ```Discovery Ultra``` ```EVOS M7000``` ```M3+ Sprayer``` ```M5 Sprayer``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```Not applicable``` ```ST5020 Multistainer``` ```Sublimator``` ```SunCollect Sprayer``` ```TM-Sprayer``` ```Visium CytAssist ```| False |
-| metadata_schema_id | Textfield | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | True |
-| amount_of_input_analyte_value | Numeric | The amount of RNA or DNA input to the assay, typically measured by a Qubit, BioAnalyzer, or TapeStation. In most single cell/nuclei assays, this value isn't available. | | False |
-| preparation_instrument_kit | Allowable Value | The reagent kit used with the preparation instrument. | ```10X Genomics; Chromium Next GEM Chip G Single Cell Kit``` ```16 rxns; PN 1000127``` ```10X Genomics; Chromium Next GEM Chip G Single Cell Kit``` ```48 rxns; PN 1000120``` ```10X Genomics; Chromium Next GEM Chip K Automated Single Cell Kit``` ```48 rxns; PN 1000289``` ```10X Genomics; Chromium Next GEM Chip K Single Cell Kit``` ```16 rxns; PN 1000287``` ```10X Genomics; Chromium Next GEM Chip K Single Cell Kit``` ```48 rxns; PN 1000286``` ```10X Genomics; Chromium Next GEM Chip Q Single Cell Kit``` ```16 rxns; PN 1000422``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```16 rxn; PN 1000283``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```4 rxn; PN 1000285``` ```10X Genomics; Visium FFPE Reagent Kit v2-Small``` ```PN 1000436``` ```Custom``` | False |
-| preparation_protocol_doi | Textfield | DOI for the protocols.io page that describes the assay or sample procurment and preparation. For example for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` | True |
-| parent_sample_id | Textfield | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | True |
-| umi_offset | Allowable Value | Position in the read at which the UMI barcode starts. This should be included when constructing sequencing libraries with a non-commercial kit. | ```0``` ```16``` ```Not applicable``` | True |
-| dataset_type | Allowable Value | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium```| True |
-| barcode_offset | Allowable Value | Positions in the read at which the cell or capture spot barcodes start. Cell and capture spot barcodes are, for example, 3 x 8 bp sequences that are spaced by constant sequences (the offsets). First barcode at position 0, then 38, then 76. This should be included when constructing sequencing libraries with a non-commercial kit. | ```0``` ```8``` ```20``` ```1,27``` ```0,38,76``` ```10,48,78``` ```10,48,86``` ```0,20-23,41-44``` ```Not applicable``` | True |
-| amount_of_input_analyte_unit | Allowable Value | Units of amount of entity input to assay value | ```ug``` ```ng``` | False |
-
-
\ No newline at end of file
+---
+layout: page-triary
+---
+
+# MUSIC (CEDAR) Metadata Attributes
+
+Fields that are collected for MUSIC (CEDAR) data, available at ```Dataset.metadata.```
+
+
+* indicates a required field
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma-separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | |
+| lab_id | | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | |
+| preparation_protocol_doi | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | ```https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1``` |
+| dataset_type | | The specific type of dataset being produced. | |
+| analyte_class | | Analytes are the target molecules being measured with the assay. | |
+| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | |
+| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | |
+| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | |
+| source_storage_duration_value | | How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. | |
+| source_storage_duration_unit | | The time duration unit of measurement | |
+| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | |
+| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | |
+| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | |
+| data_path | | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | |
+| barcode_offset | | Positions in the read at which the cell or capture spot barcodes start. Cell and capture spot barcodes are, for example, 3 x 8 bp sequences that are spaced by constant sequences (the offsets). First barcode at position 0, then 38, then 76. This should be included when constructing sequencing libraries with a non-commercial kit. | |
+| barcode_read | | Which read file contains the cell or capture spot barcode. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if the source material is barcoded. This field is used to determine which analysis pipeline to run. | |
+| barcode_size | | Length of the cell or capture spot barcode in base pairs. Cell and capture spot barcodes are, for example, 3 x 8 bp sequences that are spaced by constant sequences, the offsets. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if the source material is barcoded. This field is used to determine which analysis pipeline to run. | |
+| umi_offset | | Position in the read at which the UMI barcode starts. This should be included when constructing sequencing libraries with a non-commercial kit. | |
+| umi_read | | Which read file contains the UMI barcode. This should be included when constructing sequencing libraries with a non-commercial kit. | |
+| umi_size | | Length of the UMI barcode in base pairs. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if UMIs are present. This field is used to determine which analysis pipeline to run. | |
+| assay_input_entity | | This is the entity from which the analyte is being captured. For example, for bulk sequencing this would be "tissue", while it would be "single cell" for single cell sequencing. This field is used to determine which analysis pipeline to run. | |
+| number_of_input_cells_or_nuclei | | How many cells or nuclei were input to the assay? This is typically not available for preparations working with bulk tissue. | |
+| amount_of_input_analyte_value | | The amount of RNA or DNA input to the assay, typically measured by a Qubit, BioAnalyzer, or TapeStation. In most single cell/nuclei assays, this value isn't available. | |
+| amount_of_input_analyte_unit | | Unit of the amount of input analyte value. | |
+| library_adapter_sequence | | 5’ and/or 3’ read adapter sequences used as part of the library preparation protocol to render the library compatible with the sequencing protocol and instrumentation. This should be provided as a comma-separated list of key:value pairs (adapter name:sequence). | |
+| library_average_fragment_size | | Average size of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation. Numeric value in base pairs (bp). | |
+| library_input_amount_value | | The amount of cDNA, after amplification, that was used for library construction. | |
+| library_input_amount_unit | | Unit of library input amount value. | |
+| library_output_amount_value | | Total amount (e.g., nanograms) of library after the clean-up step of the final PCR amplification step. Answer the question: What is the Qubit measured concentration (ng/ul) times the elution volume (ul) after the final clean-up step? | |
+| library_output_amount_unit | | Units of library final yield. | |
+| library_concentration_value | | The concentration value of the pooled library samples submitted for sequencing. | |
+| library_concentration_unit | | Unit of library concentration value. | |
+| library_layout | | Whether the library was generated for single-end or paired-end sequencing. | |
+| library_preparation_kit | | Reagent kit used for library preparation | |
+| sample_indexing_kit | | Indexes are needed for multiplexing sequencing libraries for simultaneous sequencing (pooling) and proper attachment to the Illumina flowcell. Each indexing kit would have a number of compatible sequences ("sample indexing sets") that are used to label some number of samples (the number of sets depends on the kit). | |
+| sample_indexing_set | | The specific sequencing barcode index set used, selected from the sample indexing kit. Example: For 10X this might be "SI-GA-A1", for Nextera "N505 - CTCCTTAC" | |
+| is_technical_replicate | | Is the sequencing reaction run in replicate, "Yes" or "No". If "Yes", FASTQ files in dataset need to be merged. | |
+| expected_entity_capture_count | | Number of cells, nuclei or capture spots expected to be captured by the assay. For Visium this is the total number of spots covered by tissue, within the capture area. | |
+| sequencing_reagent_kit | | Reagent kit used for sequencing | |
+| sequencing_read_format | | Number of sequencing cycles in each round of sequencing (i.e., Read1, i7 index, i5 index, and Read2). This is reported as a comma-delimited list. Example: For 10X snATAC-seq (R1,Index,R2,R3) this might be: 50,8,16,50. For SNARE-seq2 this might be: 75,94,8,75 | |
+| sequencing_batch_id | | The ID for the sequencing run. This could, for example, be the chip ID and should allow users the ability to determine which samples were processed together in a sequencing run. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | |
+| capture_batch_id | | A lab-generated ID to identify which cells were captured at the same time. This would, for example, be an ID to denote which datasets were derived from a single 10X Genomics Chromium Controller run. In the case of the 10X Controller this could be the chip ID and would allow users the ability to determine which samples were processed together in a Chromium controller. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | |
+| preparation_instrument_vendor | | The manufacturer of the instrument used to prepare (staining/processing) the sample for the assay. If an automatic slide staining method was indicated this field should list the manufacturer of the instrument. | |
+| preparation_instrument_model | | Manufacturers of a staining system instrument may offer various versions (models) of that instrument with different features. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | |
+| preparation_instrument_kit | | The reagent kit used with the preparation instrument. | |
+| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | |
+
+
+
\ No newline at end of file
diff --git a/docs/assays/metadata/RNAseq.md b/docs/assays/metadata/RNAseq.md
index 8a07abc..351b7c0 100644
--- a/docs/assays/metadata/RNAseq.md
+++ b/docs/assays/metadata/RNAseq.md
@@ -1,425 +1,95 @@
----
-layout: page
----
-# RNAseq
-
-NOTE: Several versions of this metadata schema have been created over time. The (Latest) version contains most attributes, but there may be some deprecated attributes in the older versions for which data has been collected. HuBMAP is in the process of creating a reference which combines all of these versions into a single view. That reference will be available here once completed.
-
- RNAseq Version 5 (current)
-
-## RNAseq Version 5 (current)
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-----------------------------------------------------|-----------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------|------------|
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA ```| True |
-| acquisition_instrument_vendor | Allowable Value | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` | True |
-| acquisition_instrument_model | Allowable Value | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` | True |
-| source_storage_duration_value | Numeric | How long was the source material (parent) stored, prior to this sample being processed. | | True |
-| source_storage_duration_unit | Allowable Value | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | True |
-| time_since_acquisition_instrument_calibration_value | Numeric | The amount of time since the acqusition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | | False |
-| time_since_acquisition_instrument_calibration_unit | Allowable Value | The time unit of measurement |```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | False |
-| contributors_path | Textfield | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | True |
-| data_path | Textfield | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." where as for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | True |
-| barcode_read | Allowable Value | Which read file contains the cell or capture spot barcode. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if the source material is barcoded. This field is used to determine which analysis pipeline to run. | ```Read 2 (R2)``` ```Read 1 (R1)``` ```Not applicable``` | True |
-| barcode_size | Allowable Value | Length of the cell or capture spot barcode in base pairs. Cell and capture spot barcodes are, for example, 3 x 8 bp sequences that are spaced by constant sequences, the offsets. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if the source material is barcoded. This field is used to determine which analysis pipeline to run. | ```14``` ```16``` ```40``` ```8,8,8``` ```8,6``` ```Not applicable``` | True |
-| umi_read | Allowable Value | Which read file contains the UMI barcode. This should be included when constructing sequencing libraries with a non-commercial kit. | ```Read 2 (R2)``` ```Read 1 (R1)``` ```Not applicable``` | True |
-| umi_size | Allowable Value | Length of the umi barcode in base pairs. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if UMI are present. This field is used to determine which analysis pipeline to run. | ```8``` ```9``` ```10``` ```12``` ```Not applicable``` | True |
-| assay_input_entity | Allowable Value | This is the entity from which the analyte is being captured. For example, for bulk sequencing this would be "tissue", while it would be "single cell" for single cell sequencing. This field is used to determine which analysis pipeline to run. | ```area of interest``` ```single cell``` ```single nucleus``` ```spot``` ```tissue (bulk)``` | True |
-| number_of_input_cells_or_nuclei | Numeric | How many cells or nuclei were input to the assay? This is typically not available for preparations working with bulk tissue. | | False |
-| library_adapter_sequence | Textfield | 5’ and/or 3’ read adapter sequences used as part of the library preparation protocol to render the library compatible with the sequencing protocol and instrumentation. This should be provided as comma-separated list of key:value pairs (adapter name:sequence). | | True |
-| library_average_fragment_size | Numeric | Average size of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation. Numeric value in base pairs (bp). | | True |
-| library_input_amount_value | Numeric | The amount of cDNA, after amplification, that was used for library construction. | | True |
-| library_input_amount_unit | Allowable Value | unit of library input amount value | ```ng``` ```ul``` | True |
-| library_output_amount_value | Numeric | Total amount (eg. nanograms) of library after the clean-up step of final pcr amplification step. Answer the question: What is the Qubit measured concentration (ng/ul) times the elution volume (ul) after the final clean-up step? | | False |
-| library_output_amount_unit | Allowable Value | Units of library final yield. | ```ng``` ```ul``` | False |
-| library_concentration_value | Numeric | The concentration value of the pooled library samples submitted for sequencing. | | True |
-| library_concentration_unit | Allowable Value | Unit of library concentration value. | ```ng/ul``` ```nM``` | True |
-| library_layout | Allowable Value | Whether the library was generated for single-end or paired end sequencing | ```paired-end``` ```single-end``` | True |
-| number_of_pcr_cycles_for_indexing | Numeric | Number of PCR cycles performed in order to add adapters and amplify the library. This does not include the cDNA amplification which is captured in the "number of iterations of cDNA amplification" field. | | True |
-| library_preparation_kit | Allowable Value | Reagent kit used for library preparation | ```10X Genomics; Automated Library Construction Kit``` ```24 rxns; PN 1000428``` ```10X Genomics; Chromium Next GEM Automated Single Cell 5' Kit v2``` ```24 rxns; PN 1000290``` ```10X Genomics; Chromium Next GEM Automated Single Cell 5' Kit v2``` ```4 rxns; PN 1000298``` ```10X Genomics; Chromium Next GEM Single Cell 3' GEM``` ```Library & Gel Bead Kit v3.1``` ```16 rxns; PN 1000121``` ```10X Genomics; Chromium Next GEM Single Cell 3' HT Kit v3.1``` ```48 rxns; PN 1000348``` ```10X Genomics; Chromium Next GEM Single Cell 3' HT Kit v3.1``` ```8 rxns; PN 1000370``` ```10X Genomics; Chromium Next GEM Single Cell 3' Kit v3.1``` ```16 rxns; PN 1000268``` ```10X Genomics; Chromium Next GEM Single Cell 3' Kit v3.1``` ```4 rxns; PN 1000269``` ```10X Genomics; Chromium Next GEM Single Cell 5' Kit v2``` ```16 rxns; PN 1000263``` ```10X Genomics; Chromium Next GEM Single Cell 5' Kit v2``` ```4 rxns; PN 1000265``` ```10X Genomics; Chromium Next GEM Single Cell Fixed RNA Hybridization & Library Kit``` ```4 rxns; PN 1000415``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```16 rxn; PN 1000283``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```4 rxn; PN 1000285``` ```10X Genomics; Chromium Single Cell 3' GEM``` ```Library & Gel Bead Kit v3``` ```4 rxns PN 1000092``` ```10X Genomics; Chromium Single Cell 3' Library & Gel Bead Kit``` ```4 rxns; PN 120267``` ```10X Genomics; Visium CytAssist Spatial Gene Expression for FFPE``` ```Human Transcriptome``` ```11 mm``` ```2 reactions; PN 1000522``` ```10X Genomics; Visium CytAssist Spatial Gene Expression for FFPE``` ```Human Transcriptome``` ```6.5mm``` ```4 reactions; PN 1000520``` ```10X Genomics; Visium Spatial for FFPE Gene Expression Kit``` ```Human Transcriptome``` ```1 slides``` ```4 reactions; PN 1000338``` ```10X Genomics; Visium Spatial for FFPE Gene Expression Kit``` ```Mouse Transcriptome``` ```4 rxns; PN 1000339``` ```10X Genomics; Visium Spatial Gene Expression Slide and Reagent Kit``` ```1 slides``` ```4 reactions; PN 1000187``` ```10X Genomics; Visium Spatial Gene Expression Slide and Reagent Kit``` ```4 slides``` ```16 reactions; PN 1000184``` ```Custom``` ```Illumina; TruSeq Stranded mRNA Library Prep (48 samples); PN 20020594``` ```Illumina; TruSeq Stranded mRNA Library Prep (96 samples); PN 20020595``` ```New England BioLabs; NEBNext Ultra II RNA Library Prep Kit for Illumina; PN E7770``` ```Parse Biosciences; Evercode WT Mini v2 Kit``` ```12 rxns; PN ECW02010``` ```Parse Biosciences; Evercode WT v2 Kit``` ```48 rxns; PN ECW02030)``` | True |
-| sample_indexing_kit | Allowable Value | Indexes are needed for multiplexing sequencing libraries for simultaneous sequencing (pooling) and proper attachment to the Illumina flowcell. Each indexing kit would have a number of compatible sequences ("sample indexing sets") that are used to label some number of samples (the number of sets depend on the kit). | ```10X Genomics; Chromium i7 Sample Index Plate (96 rxn); PN 220103``` ```10X Genomics; Dual Index Kit TS``` ```Set A; PN 1000251``` ```10X Genomics; Dual Index Kit TT``` ```Set A (96 rxn); PN 1000215``` ```10X Genomics; Single Index Kit N``` ```Set A (96 rxn); PN 1000212``` ```Custom``` ```Illumina; IDT for Illumina - TruSeq RNA UD Indexes v2 (96 Indexes``` ```96 Samples); PN 20040871``` ```Illumina; TruSeq RNA CD Index Plate (96 Indexes``` ```96 Samples); PN 20019792``` ```Illumina; TruSeq RNA Single Indexes Set A (12 Indexes``` ```48 Samples); PN 20020492``` ```Illumina; TruSeq RNA Single Indexes Set B (12 Indexes``` ```48 Samples); PN 20020493``` ```Integrated DNA Technologies: Custom DNA Oligos``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-AB``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-CD``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-EF``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-GH``` ```Not applicable``` ```Parse Biosciences; Fragmentation Reagents; PN WX100``` ```Parse Biosciences; UDI Plate - WT; PN UDI1001 ```| True |
-| sample_indexing_set | Textfield | The specific sequencing barcode index set used, selected from the sample indexing kit. Example: For 10X this might be "SI-GA-A1", for Nextera "N505 - CTCCTTAC" | | True |
-| is_technical_replicate | Allowable Value | Is the sequencing reaction run in replicate, "Yes" or "No". If "Yes", FASTQ files in dataset need to be merged. | ```Yes``` ```No``` | True |
-| expected_entity_capture_count | Numeric | Number of cells, nuclei or capture spots expected to be captured by the assay. For Visium this is the total number of spots covered by tissue, within the capture area. | | False |
-| sequencing_reagent_kit | Allowable Value | Reagent kit used for sequencing |```Custom``` ```Illumina; HiSeq 3000/4000 PE Cluster Kit PE-410-1001; PN 1000283``` ```Illumina; NextSeq 1000/2000 P2 Reagent v3 Kit (100 Cycles); PN 20046811``` ```Illumina; NextSeq 1000/2000 P2 Reagent v3 Kit (200 Cycles); PN 20046812``` ```Illumina; NextSeq 1000/2000 P2 Reagent v3 Kit (300 Cycles); PN 20046813``` ```Illumina; NextSeq 2000 P3 Reagent Kit (300 Cycles); PN 20040561``` ```Illumina; NextSeq 2000 P3 Reagents Kit (100 Cycles); PN 20040559``` ```Illumina; NextSeq 500/550 Hi Output Kit 150 Cycles; v2.5; PN 20024907``` ```Illumina; NextSeq 500/550 Hi Output Kit 75 Cycles v2.5; PN 20024906``` ```Illumina; NextSeq 500/550 Mid Output Kit 150 Cycles v2.5; PN 20024904``` ```Illumina; NovaSeq 6000 S1 Reagent Kit (200 Cycles); PN 20012864``` ```Illumina; NovaSeq 6000 S1 Reagent v1.5 Kit (100 Cycles); PN 20028319``` ```Illumina; NovaSeq 6000 S1 Reagent v1.5 Kit (200 Cycles); PN 20028318``` ```Illumina; NovaSeq 6000 S1 Reagent v1.5 Kit (300 Cycles); PN 20028317``` ```Illumina; NovaSeq 6000 S2 Reagent v1.5 Kit (100 Cycles); PN 20028316``` ```Illumina; NovaSeq 6000 S4 Reagent Kit v1.5 (300 cycles); PN 20028312``` ```Illumina; NovaSeq 6000 S4 Reagent v1.5 Kit (200 Cycles); PN 20028313``` ```Illumina; NovaSeq 6000 SP Reagent v1.5 Kit (100 Cycles); PN 20028401``` ```Illumina; NovaSeq X Series 1.5B Reagent Kit (100 Cycle); PN 20104703``` ```Illumina; NovaSeq X Series 1.5B Reagent Kit (200 Cycle); PN 20104704``` ```Illumina; NovaSeq X Series 1.5B Reagent Kit (300 Cycle); PN 20104705``` ```Illumina; NovaSeq X Series 10B Reagent Kit (100 Cycle); PN 20085596``` ```Illumina; NovaSeq X Series 10B Reagent Kit (200 Cycle); PN 20085595``` ```Illumina; NovaSeq X Series 10B Reagent Kit (300 Cycle); PN 20085594``` | True |
-| sequencing_read_format | Textfield | Number of sequencing cycles in each round of sequencing (i.e., Read1, i7 index, i5 index, and Read2). This is reported as a comma-delimited list. Example: For 10X snATAC-seq (R1,Index,R2,R3) this might be: 50,8,16,50. For SNARE-seq2 this might be: 75,94,8,75 | | True |
-| sequencing_batch_id | Textfield | The ID for the sequencing run. This could, for example, be the chip ID and should allow users the ability to determine which samples were processed together in a sequencing run. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | False |
-| capture_batch_id | Textfield | A lab-generated ID to identify which cells were captured at the same time. This would, for example, be an ID to denote which datasets were derived from a single 10X Genomics Chromium Controller run. In the case of the 10X Controller this could be the chip ID and would allow users the ability to determine which samples were processed together in a Chromium controller. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | False |
-| preparation_instrument_vendor | Allowable Value | The manufacturer of the instrument used to prepare (staining/processing) the sample for the assay. If an automatic slide staining method was indicated this field should list the manufacturer of the instrument. | ```10x Genomics``` ```Hamamatsu``` ```HTX Technologies``` ```In-House``` ```Leica Biosystems``` ```Not applicable``` ```Roche Diagnostics``` ```SunChrom``` ```Thermo Fisher Scientific``` | False |
-| preparation_instrument_model | Allowable Value | Manufacturers of a staining system instrument may offer various versions (models) of that instrument with different features. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```AutoStainer XL``` ```Chromium Connect``` ```Chromium Controller``` ```Chromium iX``` ```Chromium X``` ```Discovery Ultra``` ```EVOS M7000``` ```M3+ Sprayer``` ```M5 Sprayer``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```Not applicable``` ```ST5020 Multistainer``` ```Sublimator``` ```SunCollect Sprayer``` ```TM-Sprayer``` ```Visium CytAssist ```| False |
-| metadata_schema_id | Textfield | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | True |
-| amount_of_input_analyte_value | Numeric | The amount of RNA or DNA input to the assay, typically measured by a Qubit, BioAnalyzer, or TapeStation. In most single cell/nuclei assays, this value isn't available. | | False |
-| number_of_iterations_of_cdna_amplification | Numeric | This is the amplification of the cDNA prior to library construction. This is typically a PCR amplification, while for linear amplification methods like aRNA this would be the number of rounds of aRNA. | | True |
-| preparation_instrument_kit | Allowable Value | The reagent kit used with the preparation instrument. | ```10X Genomics; Chromium Next GEM Chip G Single Cell Kit``` ```16 rxns; PN 1000127``` ```10X Genomics; Chromium Next GEM Chip G Single Cell Kit``` ```48 rxns; PN 1000120``` ```10X Genomics; Chromium Next GEM Chip K Automated Single Cell Kit``` ```48 rxns; PN 1000289``` ```10X Genomics; Chromium Next GEM Chip K Single Cell Kit``` ```16 rxns; PN 1000287``` ```10X Genomics; Chromium Next GEM Chip K Single Cell Kit``` ```48 rxns; PN 1000286``` ```10X Genomics; Chromium Next GEM Chip Q Single Cell Kit``` ```16 rxns; PN 1000422``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```16 rxn; PN 1000283``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```4 rxn; PN 1000285``` ```10X Genomics; Visium FFPE Reagent Kit v2-Small``` ```PN 1000436``` ```Custom``` | False |
-| preparation_protocol_doi | Textfield | DOI for the protocols.io page that describes the assay or sample procurment and preparation. For example for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` | True |
-| parent_sample_id | Textfield | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | True |
-| umi_offset | Allowable Value | Position in the read at which the UMI barcode starts. This should be included when constructing sequencing libraries with a non-commercial kit. | ```0``` ```16``` ```36``` ```Not applicable``` | True |
-| dataset_type | Allowable Value | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium```| True |
-| barcode_offset | Allowable Value | Positions in the read at which the cell or capture spot barcodes start. Cell and capture spot barcodes are, for example, 3 x 8 bp sequences that are spaced by constant sequences (the offsets). First barcode at position 0, then 38, then 76. This should be included when constructing sequencing libraries with a non-commercial kit. | ```0``` ```8``` ```20``` ```1,27``` ```0,38,76``` ```10,48,78``` ```10,48,86``` ```Not applicable``` | True |
-| amount_of_input_analyte_unit | Allowable Value | Units of amount of entity input to assay value | ```ug``` ```ng``` | False |
-
-
-
-RNAseq Version 2
-
-## RNAseq Version 2
-
-| Attribute | Type | Description | Allowable Values | required |
-|-----------------------------------------------------|-----------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------|------------|
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA ```| True |
-| acquisition_instrument_vendor | Allowable Value | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` | True |
-| acquisition_instrument_model | Allowable Value | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` | True |
-| source_storage_duration_value | Numeric | How long was the source material (parent) stored, prior to this sample being processed. | | True |
-| source_storage_duration_unit | Allowable Value | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | True |
-| time_since_acquisition_instrument_calibration_value | Numeric | The amount of time since the acqusition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | | False |
-| time_since_acquisition_instrument_calibration_unit | Allowable Value | The time unit of measurement |```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | False |
-| contributors_path | Textfield | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | True |
-| data_path | Textfield | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." where as for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | True |
-| barcode_read | Allowable Value | Which read file contains the cell or capture spot barcode. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if the source material is barcoded. This field is used to determine which analysis pipeline to run. | ```Read 2 (R2)``` ```Read 1 (R1)``` ```Not applicable``` | True |
-| barcode_size | Allowable Value | Length of the cell or capture spot barcode in base pairs. Cell and capture spot barcodes are, for example, 3 x 8 bp sequences that are spaced by constant sequences, the offsets. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if the source material is barcoded. This field is used to determine which analysis pipeline to run. | ```14``` ```16``` ```40``` ```8,8,8``` ```8,6``` ```Not applicable``` | True |
-| umi_read | Allowable Value | Which read file contains the UMI barcode. This should be included when constructing sequencing libraries with a non-commercial kit. | ```Read 2 (R2)``` ```Read 1 (R1)``` ```Not applicable``` | True |
-| umi_size | Allowable Value | Length of the umi barcode in base pairs. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if UMI are present. This field is used to determine which analysis pipeline to run. | ```8``` ```9``` ```10``` ```12``` ```Not applicable``` | True |
-| assay_input_entity | Allowable Value | This is the entity from which the analyte is being captured. For example, for bulk sequencing this would be "tissue", while it would be "single cell" for single cell sequencing. This field is used to determine which analysis pipeline to run. | ```area of interest``` ```single cell``` ```single nucleus``` ```spot``` ```tissue (bulk)``` | True |
-| number_of_input_cells_or_nuclei | Numeric | How many cells or nuclei were input to the assay? This is typically not available for preparations working with bulk tissue. | | False |
-| library_adapter_sequence | Textfield | 5’ and/or 3’ read adapter sequences used as part of the library preparation protocol to render the library compatible with the sequencing protocol and instrumentation. This should be provided as comma-separated list of key:value pairs (adapter name:sequence). | | True |
-| library_average_fragment_size | Numeric | Average size of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation. Numeric value in base pairs (bp). | | True |
-| library_input_amount_value | Numeric | The amount of cDNA, after amplification, that was used for library construction. | | True |
-| library_input_amount_unit | Allowable Value | unit of library input amount value | ```ng``` ```ul``` | True |
-| library_output_amount_value | Numeric | Total amount (eg. nanograms) of library after the clean-up step of final pcr amplification step. Answer the question: What is the Qubit measured concentration (ng/ul) times the elution volume (ul) after the final clean-up step? | | False |
-| library_output_amount_unit | Allowable Value | Units of library final yield. | ```ng``` ```ul``` | False |
-| library_concentration_value | Numeric | The concentration value of the pooled library samples submitted for sequencing. | | True |
-| library_concentration_unit | Allowable Value | Unit of library concentration value. | ```ng/ul``` ```nM``` | True |
-| library_layout | Allowable Value | Whether the library was generated for single-end or paired end sequencing | ```paired-end``` ```single-end``` | True |
-| number_of_pcr_cycles_for_indexing | Numeric | Number of PCR cycles performed in order to add adapters and amplify the library. This does not include the cDNA amplification which is captured in the "number of iterations of cDNA amplification" field. | | True |
-| library_preparation_kit | Allowable Value | Reagent kit used for library preparation | ```10X Genomics; Automated Library Construction Kit``` ```24 rxns; PN 1000428``` ```10X Genomics; Chromium Next GEM Automated Single Cell 5' Kit v2``` ```24 rxns; PN 1000290``` ```10X Genomics; Chromium Next GEM Automated Single Cell 5' Kit v2``` ```4 rxns; PN 1000298``` ```10X Genomics; Chromium Next GEM Single Cell 3' GEM``` ```Library & Gel Bead Kit v3.1``` ```16 rxns; PN 1000121``` ```10X Genomics; Chromium Next GEM Single Cell 3' HT Kit v3.1``` ```48 rxns; PN 1000348``` ```10X Genomics; Chromium Next GEM Single Cell 3' HT Kit v3.1``` ```8 rxns; PN 1000370``` ```10X Genomics; Chromium Next GEM Single Cell 3' Kit v3.1``` ```16 rxns; PN 1000268``` ```10X Genomics; Chromium Next GEM Single Cell 3' Kit v3.1``` ```4 rxns; PN 1000269``` ```10X Genomics; Chromium Next GEM Single Cell 5' Kit v2``` ```16 rxns; PN 1000263``` ```10X Genomics; Chromium Next GEM Single Cell 5' Kit v2``` ```4 rxns; PN 1000265``` ```10X Genomics; Chromium Next GEM Single Cell Fixed RNA Hybridization & Library Kit``` ```4 rxns; PN 1000415``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```16 rxn; PN 1000283``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```4 rxn; PN 1000285``` ```10X Genomics; Chromium Single Cell 3' GEM``` ```Library & Gel Bead Kit v3``` ```4 rxns PN 1000092``` ```10X Genomics; Chromium Single Cell 3' Library & Gel Bead Kit``` ```4 rxns; PN 120267``` ```10X Genomics; Visium CytAssist Spatial Gene Expression for FFPE``` ```Human Transcriptome``` ```11 mm``` ```2 reactions; PN 1000522``` ```10X Genomics; Visium CytAssist Spatial Gene Expression for FFPE``` ```Human Transcriptome``` ```6.5mm``` ```4 reactions; PN 1000520``` ```10X Genomics; Visium Spatial for FFPE Gene Expression Kit``` ```Human Transcriptome``` ```1 slides``` ```4 reactions; PN 1000338``` ```10X Genomics; Visium Spatial for FFPE Gene Expression Kit``` ```Mouse Transcriptome``` ```4 rxns; PN 1000339``` ```10X Genomics; Visium Spatial Gene Expression Slide and Reagent Kit``` ```1 slides``` ```4 reactions; PN 1000187``` ```10X Genomics; Visium Spatial Gene Expression Slide and Reagent Kit``` ```4 slides``` ```16 reactions; PN 1000184``` ```Custom``` ```Illumina; TruSeq Stranded mRNA Library Prep (48 samples); PN 20020594``` ```Illumina; TruSeq Stranded mRNA Library Prep (96 samples); PN 20020595``` ```New England BioLabs; NEBNext Ultra II RNA Library Prep Kit for Illumina; PN E7770``` ```Parse Biosciences; Evercode WT Mini v2 Kit``` ```12 rxns; PN ECW02010``` ```Parse Biosciences; Evercode WT v2 Kit``` ```48 rxns; PN ECW02030)``` | True |
-| sample_indexing_kit | Allowable Value | Indexes are needed for multiplexing sequencing libraries for simultaneous sequencing (pooling) and proper attachment to the Illumina flowcell. Each indexing kit would have a number of compatible sequences ("sample indexing sets") that are used to label some number of samples (the number of sets depend on the kit). | ```10X Genomics; Chromium i7 Sample Index Plate (96 rxn); PN 220103``` ```10X Genomics; Dual Index Kit TS``` ```Set A; PN 1000251``` ```10X Genomics; Dual Index Kit TT``` ```Set A (96 rxn); PN 1000215``` ```10X Genomics; Single Index Kit N``` ```Set A (96 rxn); PN 1000212``` ```Custom``` ```Illumina; IDT for Illumina - TruSeq RNA UD Indexes v2 (96 Indexes``` ```96 Samples); PN 20040871``` ```Illumina; TruSeq RNA CD Index Plate (96 Indexes``` ```96 Samples); PN 20019792``` ```Illumina; TruSeq RNA Single Indexes Set A (12 Indexes``` ```48 Samples); PN 20020492``` ```Illumina; TruSeq RNA Single Indexes Set B (12 Indexes``` ```48 Samples); PN 20020493``` ```Integrated DNA Technologies: Custom DNA Oligos``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-AB``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-CD``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-EF``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-GH``` ```Not applicable``` ```Parse Biosciences; Fragmentation Reagents; PN WX100``` ```Parse Biosciences; UDI Plate - WT; PN UDI1001 ```| True |
-| sample_indexing_set | Textfield | The specific sequencing barcode index set used, selected from the sample indexing kit. Example: For 10X this might be "SI-GA-A1", for Nextera "N505 - CTCCTTAC" | | True |
-| is_technical_replicate | Allowable Value | Is the sequencing reaction run in replicate, "Yes" or "No". If "Yes", FASTQ files in dataset need to be merged. | ```Yes``` ```No``` | True |
-| expected_entity_capture_count | Numeric | Number of cells, nuclei or capture spots expected to be captured by the assay. For Visium this is the total number of spots covered by tissue, within the capture area. | | False |
-| sequencing_reagent_kit | Allowable Value | Reagent kit used for sequencing |```Custom``` ```Illumina; HiSeq 3000/4000 PE Cluster Kit PE-410-1001; PN 1000283``` ```Illumina; NextSeq 1000/2000 P2 Reagent v3 Kit (100 Cycles); PN 20046811``` ```Illumina; NextSeq 1000/2000 P2 Reagent v3 Kit (200 Cycles); PN 20046812``` ```Illumina; NextSeq 1000/2000 P2 Reagent v3 Kit (300 Cycles); PN 20046813``` ```Illumina; NextSeq 2000 P3 Reagent Kit (300 Cycles); PN 20040561``` ```Illumina; NextSeq 2000 P3 Reagents Kit (100 Cycles); PN 20040559``` ```Illumina; NextSeq 500/550 Hi Output Kit 150 Cycles; v2.5; PN 20024907``` ```Illumina; NextSeq 500/550 Hi Output Kit 75 Cycles v2.5; PN 20024906``` ```Illumina; NextSeq 500/550 Mid Output Kit 150 Cycles v2.5; PN 20024904``` ```Illumina; NovaSeq 6000 S1 Reagent Kit (200 Cycles); PN 20012864``` ```Illumina; NovaSeq 6000 S1 Reagent v1.5 Kit (100 Cycles); PN 20028319``` ```Illumina; NovaSeq 6000 S1 Reagent v1.5 Kit (200 Cycles); PN 20028318``` ```Illumina; NovaSeq 6000 S1 Reagent v1.5 Kit (300 Cycles); PN 20028317``` ```Illumina; NovaSeq 6000 S2 Reagent v1.5 Kit (100 Cycles); PN 20028316``` ```Illumina; NovaSeq 6000 S4 Reagent Kit v1.5 (300 cycles); PN 20028312``` ```Illumina; NovaSeq 6000 S4 Reagent v1.5 Kit (200 Cycles); PN 20028313``` ```Illumina; NovaSeq 6000 SP Reagent v1.5 Kit (100 Cycles); PN 20028401``` ```Illumina; NovaSeq X Series 1.5B Reagent Kit (100 Cycle); PN 20104703``` ```Illumina; NovaSeq X Series 1.5B Reagent Kit (200 Cycle); PN 20104704``` ```Illumina; NovaSeq X Series 1.5B Reagent Kit (300 Cycle); PN 20104705``` ```Illumina; NovaSeq X Series 10B Reagent Kit (100 Cycle); PN 20085596``` ```Illumina; NovaSeq X Series 10B Reagent Kit (200 Cycle); PN 20085595``` ```Illumina; NovaSeq X Series 10B Reagent Kit (300 Cycle); PN 20085594``` | True |
-| sequencing_read_format | Textfield | Number of sequencing cycles in each round of sequencing (i.e., Read1, i7 index, i5 index, and Read2). This is reported as a comma-delimited list. Example: For 10X snATAC-seq (R1,Index,R2,R3) this might be: 50,8,16,50. For SNARE-seq2 this might be: 75,94,8,75 | | True |
-| sequencing_batch_id | Textfield | The ID for the sequencing run. This could, for example, be the chip ID and should allow users the ability to determine which samples were processed together in a sequencing run. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | False |
-| capture_batch_id | Textfield | A lab-generated ID to identify which cells were captured at the same time. This would, for example, be an ID to denote which datasets were derived from a single 10X Genomics Chromium Controller run. In the case of the 10X Controller this could be the chip ID and would allow users the ability to determine which samples were processed together in a Chromium controller. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | False |
-| preparation_instrument_vendor | Allowable Value | The manufacturer of the instrument used to prepare (staining/processing) the sample for the assay. If an automatic slide staining method was indicated this field should list the manufacturer of the instrument. | ```10x Genomics``` ```Hamamatsu``` ```HTX Technologies``` ```In-House``` ```Leica Biosystems``` ```Not applicable``` ```Roche Diagnostics``` ```SunChrom``` ```Thermo Fisher Scientific``` | False |
-| preparation_instrument_model | Allowable Value | Manufacturers of a staining system instrument may offer various versions (models) of that instrument with different features. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```AutoStainer XL``` ```Chromium Connect``` ```Chromium Controller``` ```Chromium iX``` ```Chromium X``` ```Discovery Ultra``` ```EVOS M7000``` ```M3+ Sprayer``` ```M5 Sprayer``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```Not applicable``` ```ST5020 Multistainer``` ```Sublimator``` ```SunCollect Sprayer``` ```TM-Sprayer``` ```Visium CytAssist``` | False |
-| metadata_schema_id | Textfield | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | True |
-| amount_of_input_analyte_value | Numeric | The amount of RNA or DNA input to the assay, typically measured by a Qubit, BioAnalyzer, or TapeStation. In most single cell/nuclei assays, this value isn't available. | | False |
-| amount_of_input_analyte_unit | Textfield | Units of amount of entity input to assay value | | False |
-| number_of_iterations_of_cdna_amplification | Numeric | This is the amplification of the cDNA prior to library construction. This is typically a PCR amplification, while for linear amplification methods like aRNA this would be the number of rounds of aRNA. | | True |
-| preparation_instrument_kit | Allowable Value | The reagent kit used with the preparation instrument. | ```10X Genomics; Chromium Next GEM Chip G Single Cell Kit``` ```16 rxns; PN 1000127``` ```10X Genomics; Chromium Next GEM Chip G Single Cell Kit``` ```48 rxns; PN 1000120``` ```10X Genomics; Chromium Next GEM Chip K Automated Single Cell Kit``` ```48 rxns; PN 1000289``` ```10X Genomics; Chromium Next GEM Chip K Single Cell Kit``` ```16 rxns; PN 1000287``` ```10X Genomics; Chromium Next GEM Chip K Single Cell Kit``` ```48 rxns; PN 1000286``` ```10X Genomics; Chromium Next GEM Chip Q Single Cell Kit``` ```16 rxns; PN 1000422``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```16 rxn; PN 1000283``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```4 rxn; PN 1000285``` ```10X Genomics; Visium FFPE Reagent Kit v2-Small``` ```PN 1000436``` ```Custom``` | False |
-| preparation_protocol_doi | Textfield | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` | True |
-| parent_sample_id | Textfield | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | True |
-| barcode_offset | Allowable Value | Positions in the read at which the cell or capture spot barcodes start. Cell and capture spot barcodes are, for example, 3 x 8 bp sequences that are spaced by constant sequences (the offsets). First barcode at position 0, then 38, then 76. This should be included when constructing sequencing libraries with a non-commercial kit. | ```0``` ```8``` ```20``` ```1,27``` ```0,38,76``` ```10,48,78``` ```10,48,86``` ```Not applicable``` | True |
-| umi_offset | Allowable Value | Position in the read at which the UMI barcode starts. This should be included when constructing sequencing libraries with a non-commercial kit. | ```0``` ```16``` ```36``` ```Not applicable``` | True |
-| dataset_type | Allowable Value | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq (bulk)``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```PhenoCycler``` ```RNAseq (bulk)``` ```scATACseq``` ```scRNAseq``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```snATACseq``` ```snRNAseq``` ```Thick section Multiphoton MxIF``` ```Visium``` ```Xenium``` | True |
-
-
-
-bulk-RNA Version 1
-
-## bulk-RNA Version 1
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-------------------------------------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------|------------|
-| version | Allowable Value | Version of the schema to use when validating this metadata. | ['1'] | True |
-| description | Textfield | Free-text description of this assay. | | True |
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['sequence'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['bulkATACseq'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['DNA'] | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ['Yes','No'] | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
-| bulk_transposition_input_number_nuclei | Textfield | A number (no comma separators) | | True |
-| bulk_atac_cell_isolation_protocols_io_doi | Textfield | Textfield to a protocols document answering the question: How was tissue stored and processed for cell/nuclei isolation? | | True |
-| is_technical_replicate | Allowable Value | Is this a sequencing replicate? | ['Yes','No'] | True |
-| library_adapter_sequence | Textfield | Adapter sequence to be used for adapter trimming | | True |
-| library_average_fragment_size | Numeric | Average size in basepairs (bp) of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation. | | True |
-| library_concentration_value | Numeric | The concentration value of the pooled library samples submitted for sequencing. | | True |
-| library_concentration_unit | Allowable Value | Unit of library_concentration_value | ['nM'] | False |
-| library_construction_protocols_io_doi | Textfield | A link to the protocol document containing the library construction method (including version) that was used, e.g. "Smart-Seq2", "Drop-Seq", "10X v3". | | True |
-| library_creation_date | Datetime | Date and time of library creation. YYYY-MM-DD, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s. | | False |
-| library_final_yield_value | Numeric | Total number of ng of library after final pcr amplification step. This is the concentration (ng/ul) * volume (ul) | | True |
-| library_final_yield_unit | Allowable Value | Units of final library yield | ['ng'] | False |
-| library_id | Textfield | A library ID, unique within a TMC, which allows corresponding RNA and chromatin accessibility datasets to be linked. | | True |
-| library_layout | Allowable Value | State whether the library was generated for single-end or paired end sequencing. | ['single-end', 'paired-end'] | True |
-| library_pcr_cycles | Numeric | Number of PCR cycles performed in order to add adapters and amplify the library. Usually, this includes 5 pre-amplification cycles followed by 0-5 additional cycles determined by qPCR. | | True |
-| library_preparation_kit | Textfield | Reagent kit used for library preparation | | True |
-| sample_quality_metric | Textfield | This is a quality metric by visual inspection. This should answer the question: Are the nuclei intact and are the nuclei free of significant amounts of debris? This can be captured at a high level, "OK" or "not OK". | | True |
-| sequencing_phix_percent | Numeric | Percent PhiX loaded to the run | | True |
-| sequencing_read_format | Textfield | Slash-delimited list of the number of sequencing cycles for, for example, Read1, i7 index, i5 index, and Read2. | | True |
-| sequencing_read_percent_q30 | Numeric | Q30 is the weighted average of all the reads (e.g. # bases UMI * q30 UMI + # bases R2 * q30 R2 + ...) | | True |
-| sequencing_reagent_kit | Textfield | Reagent kit used for sequencing | | True |
-| transposition_kit_number | Textfield | If Tn5 came from a kit, provide the catalog number. | | False |
-| transposition_method | Textfield | Modality of capturing accessible chromatin molecules. The kit used, for example. | | True |
-| transposition_transposase_source | Textfield | The source of the Tn5 transposase and transposon used for capturing accessible chromatin. | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
-
-
-
-bulk-RNA Version 0
-
-## bulk-RNA Version 0
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-----------------------------------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------|------------|
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['sequence'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['bulk-RNA'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['RNA'] | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ['Yes','No'] | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
-| bulk_rna_isolation_protocols_io_doi | Textfield | Textfield to a protocols document answering the question: How was tissue stored and processed for RNA isolation? | | True |
-| bulk_rna_yield_value | Numeric | RNA (ng) per Weight of Tissue (mg). Answer the question: How much RNA in ng was isolated? How much tissue in mg was initially used for isolating RNA? Calculate the yield by dividing total RNA isolated by amount of tissue used to isolate RNA from (ng/mg). | | True |
-| bulk_rna_yield_units_per_tissue_unit | Allowable Value | RNA amount per Tissue input amount. Valid values should be weight/weight (ng/mg). | ['ng/mg'] | True |
-| bulk_rna_isolation_quality_metric_value | Numeric | RIN value | | True |
-| rnaseq_assay_input_value | Numeric | RNA input amount value to the assay | | True |
-| rnaseq_assay_input_unit | Allowable Value | Units of RNA input amount to the assay | ['ug'] | False |
-| rnaseq_assay_method | Textfield | The kit used for the RNA sequencing assay | | True |
-| library_construction_protocols_io_doi | Textfield | A link to the protocol document containing the library construction method (including version) that was used, e.g. "Smart-Seq2", "Drop-Seq", "10X v3". | | True |
-| library_layout | Allowable Value | State whether the library was generated for single-end or paired end sequencing. | ['single-end', 'paired-end'] | True |
-| library_adapter_sequence | Textfield | Adapter sequence to be used for adapter trimming. | | True |
-| library_pcr_cycles_for_sample_index | Numeric | Number of PCR cycles performed for library indexing | | True |
-| library_final_yield_value | Numeric | Total number of ng of library after final pcr amplification step. This is the concentration (ng/ul) * volume (ul) | | True |
-| library_final_yield_unit | Allowable Value | Units of final library yield | ['ng'] | False |
-| library_average_fragment_size | Numeric | Average size in basepairs (bp) of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation. | | True |
-| sequencing_reagent_kit | Textfield | Reagent kit used for sequencing | | True |
-| sequencing_read_format | Textfield | Slash-delimited list of the number of sequencing cycles for, for example, Read1, i7 index, i5 index, and Read2. | | True |
-| sequencing_read_percent_q30 | Numeric | Q30 is the weighted average of all the reads (e.g. # bases UMI * q30 UMI + # bases R2 * q30 R2 + ...) | | True |
-| sequencing_phix_percent | Numeric | Percent PhiX loaded to the run | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
-
-
-
-scRNAseq Version 3
-
-## scRNAseq Version 3
-
-| Attribute | Type | Description | Allowable Values | Required |
-|---------------------------------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|
-| version | Allowable Value | Version of the schema to use when validating this metadata. | ['3'] | True |
-| description | Textfield | Free-text description of this assay. | | True |
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['sequence'] | True |
-| assay_type | Allowable Value | The UMI sequence length in the 10xGenomics-v2 kit is 10 base pairs and the length in the 10xGenomics-v3 kit is 12 base pairs. | ['scRNAseq-10xGenomics-v2', 'scRNAseq-10xGenomics-v3', 'snRNAseq-10xGenomics-v2', 'snRNAseq-10xGenomics-v3', 'scRNAseq', 'sciRNAseq', 'snRNAseq', 'SNARE2-RNAseq'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['RNA'] | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ['Yes','No'] | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
-| sc_isolation_protocols_io_doi | Textfield | Textfield to a protocols document answering the question: How were single cells separated into a single-cell suspension? | | True |
-| sc_isolation_entity | Allowable Value | The type of single cell entity derived from isolation protocol | ['whole cell', 'nucleus', 'cell-cell multimer', 'spatially encoded cell barcoding'] | True |
-| sc_isolation_tissue_dissociation | Textfield | The method by which tissues are dissociated into single cells in suspension. | | True |
-| sc_isolation_enrichment | Allowable Value | The method by which specific cell populations are sorted or enriched. | ['none', 'FACS'] | True |
-| sc_isolation_quality_metric | Textfield | A quality metric by visual inspection prior to cell lysis or defined by known parameters such as wells with several cells or no cells. This can be captured at a high level. | | True |
-| sc_isolation_cell_number | Numeric | Total number of cell/nuclei yielded post dissociation and enrichment | | True |
-| rnaseq_assay_input | Numeric | Number of cell/nuclei input to the assay | | True |
-| rnaseq_assay_method | Textfield | The kit used for the RNA sequencing assay | | True |
-| library_construction_protocols_io_doi | Textfield | A link to the protocol document containing the library construction method (including version) that was used, e.g. "Smart-Seq2", "Drop-Seq", "10X v3". | | True |
-| library_layout | Allowable Value | State whether the library was generated for single-end or paired end sequencing. | ['single-end', 'paired-end'] | True |
-| library_adapter_sequence | Textfield | Adapter sequence to be used for adapter trimming | | True |
-| library_id | Textfield | A library ID, unique within a TMC, which allows corresponding RNA and chromatin accessibility datasets to be linked. | | True |
-| is_technical_replicate | Allowable Value | Is the sequencing reaction run in replicate, TRUE or FALSE | ['Yes','No'] | True |
-| cell_barcode_read | Textfield | Which read file(s) contains the cell barcode. Multiple cell_barcode_read files must be provided as a comma-delimited list (e.g. file1,file2,file3). | | False |
-| umi_read | Textfield | Which read file(s) contains the UMI (unique molecular identifier) barcode. | | True |
-| umi_offset | Numeric | Position in the read at which the umi barcode starts. | | True |
-| umi_size | Numeric | Length of the umi barcode in base pairs. | | True |
-| cell_barcode_offset | Textfield | Position(s) in the read at which the cell barcode starts. | | False |
-| cell_barcode_size | Textfield | Length of the cell barcode in base pairs | | False |
-| expected_cell_count | Numeric | How many cells are expected? This may be used in downstream pipelines to guide selection of cell barcodes or segmentation parameters. | | False |
-| library_pcr_cycles | Numeric | Number of PCR cycles to amplify cDNA | | True |
-| library_pcr_cycles_for_sample_index | Numeric | Number of PCR cycles performed for library indexing | | True |
-| library_final_yield_value | Numeric | Total number of ng of library after final pcr amplification step. This is the concentration (ng/ul) * volume (ul) | | True |
-| library_final_yield_unit | Allowable Value | Units of final library yield | ['ng'] | False |
-| library_average_fragment_size | Numeric | Average size in basepairs (bp) of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation. | | True |
-| sequencing_reagent_kit | Textfield | Reagent kit used for sequencing | | True |
-| sequencing_read_format | Textfield | Slash-delimited list of the number of sequencing cycles for, for example, Read1, i7 index, i5 index, and Read2. | | True |
-| sequencing_read_percent_q30 | Numeric | Q30 is the weighted average of all the reads (e.g. # bases UMI * q30 UMI + # bases R2 * q30 R2 + ...) | | True |
-| sequencing_phix_percent | Numeric | Percent PhiX loaded to the run | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
-
-
-
-scRNAseq Version 2
-
-## scRNAseq Version 2
-
-| Attribute | Type | Description | Allowable Values | Required |
-|---------------------------------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|
-| version | Allowable Value | Version of the schema to use when validating this metadata. | ['2'] | True |
-| description | Textfield | Free-text description of this assay. | | True |
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['sequence'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['scRNAseq-10xGenomics-v2', 'scRNAseq-10xGenomics-v3', 'snRNAseq-10xGenomics-v2', 'snRNAseq-10xGenomics-v3', 'scRNAseq', 'sciRNAseq', 'snRNAseq', 'SNARE2-RNAseq'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['RNA'] | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ['Yes','No'] | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
-| sc_isolation_protocols_io_doi | Textfield | Textfield to a protocols document answering the question: How were single cells separated into a single-cell suspension? | | True |
-| sc_isolation_entity | Allowable Value | The type of single cell entity derived from isolation protocol | ['whole cell', 'nucleus', 'cell-cell multimer', 'spatially encoded cell barcoding'] | True |
-| sc_isolation_tissue_dissociation | Textfield | The method by which tissues are dissociated into single cells in suspension. | | True |
-| sc_isolation_enrichment | Allowable Value | The method by which specific cell populations are sorted or enriched. | ['none', 'FACS'] | True |
-| sc_isolation_quality_metric | Textfield | A quality metric by visual inspection prior to cell lysis or defined by known parameters such as wells with several cells or no cells. This can be captured at a high level. | | True |
-| sc_isolation_cell_number | Numeric | Total number of cell/nuclei yielded post dissociation and enrichment | | True |
-| rnaseq_assay_input | Numeric | Number of cell/nuclei input to the assay | | True |
-| rnaseq_assay_method | Textfield | The kit used for the RNA sequencing assay | | True |
-| library_construction_protocols_io_doi | Textfield | A link to the protocol document containing the library construction method (including version) that was used, e.g. "Smart-Seq2", "Drop-Seq", "10X v3". | | True |
-| library_layout | Allowable Value | State whether the library was generated for single-end or paired end sequencing. | ['single-end', 'paired-end'] | True |
-| library_adapter_sequence | Textfield | Adapter sequence to be used for adapter trimming | | True |
-| library_id | Textfield | A library ID, unique within a TMC, which allows corresponding RNA and chromatin accessibility datasets to be linked. | | True |
-| is_technical_replicate | Allowable Value | Is the sequencing reaction run in replicate, TRUE or FALSE | ['Yes','No'] | True |
-| cell_barcode_read | Textfield | Which read file(s) contains the cell barcode. Multiple cell_barcode_read files must be provided as a comma-delimited list (e.g. file1,file2,file3). | | False |
-| cell_barcode_offset | Textfield | Position(s) in the read at which the cell barcode starts. | | False |
-| cell_barcode_size | Textfield | Length of the cell barcode in base pairs | | False |
-| expected_cell_count | Numeric | How many cells are expected? This may be used in downstream pipelines to guide selection of cell barcodes or segmentation parameters. | | False |
-| library_pcr_cycles | Numeric | Number of PCR cycles to amplify cDNA | | True |
-| library_pcr_cycles_for_sample_index | Numeric | Number of PCR cycles performed for library indexing | | True |
-| library_final_yield_value | Numeric | Total number of ng of library after final pcr amplification step. This is the concentration (ng/ul) * volume (ul) | | True |
-| library_final_yield_unit | Allowable Value | Units of final library yield | ['ng'] | False |
-| library_average_fragment_size | Numeric | Average size in basepairs (bp) of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation. | | True |
-| sequencing_reagent_kit | Textfield | Reagent kit used for sequencing | | True |
-| sequencing_read_format | Textfield | Slash-delimited list of the number of sequencing cycles for, for example, Read1, i7 index, i5 index, and Read2. | | True |
-| sequencing_read_percent_q30 | Numeric | Q30 is the weighted average of all the reads (e.g. # bases UMI * q30 UMI + # bases R2 * q30 R2 + ...) | | True |
-| sequencing_phix_percent | Numeric | Percent PhiX loaded to the run | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
-
-
-
-scRNAseq Version 1
-
-## scRNAseq Version 1
-
-| Attribute | Type | Description | Allowable Values | Required |
-|---------------------------------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------|------------|
-| version | Allowable Value | Version of the schema to use when validating this metadata. | ['1'] | True |
-| description | Textfield | Free-text description of this assay. | | True |
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['sequence'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['scRNAseq-10xGenomics', 'snRNAseq-10xGenomics-v2', 'snRNAseq-10xGenomics-v3', 'scRNAseq', 'sciRNAseq', 'snRNAseq', 'SNARE2-RNAseq'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['RNA'] | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ['Yes','No']] | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
-| sc_isolation_protocols_io_doi | Textfield | Textfield to a protocols document answering the question: How were single cells separated into a single-cell suspension? | | True |
-| sc_isolation_entity | Allowable Value | The type of single cell entity derived from isolation protocol | ['whole cell', 'nucleus', 'cell-cell multimer', 'spatially encoded cell barcoding'] | True |
-| sc_isolation_tissue_dissociation | Textfield | The method by which tissues are dissociated into single cells in suspension. | | True |
-| sc_isolation_enrichment | Allowable Value | The method by which specific cell populations are sorted or enriched. | ['none', 'FACS'] | True |
-| sc_isolation_quality_metric | Textfield | A quality metric by visual inspection prior to cell lysis or defined by known parameters such as wells with several cells or no cells. This can be captured at a high level. | | True |
-| sc_isolation_cell_number | Numeric | Total number of cell/nuclei yielded post dissociation and enrichment | | True |
-| rnaseq_assay_input | Numeric | Number of cell/nuclei input to the assay | | True |
-| rnaseq_assay_method | Textfield | The kit used for the RNA sequencing assay | | True |
-| library_construction_protocols_io_doi | Textfield | A link to the protocol document containing the library construction method (including version) that was used, e.g. "Smart-Seq2", "Drop-Seq", "10X v3". | | True |
-| library_layout | Allowable Value | State whether the library was generated for single-end or paired end sequencing. | ['single-end', 'paired-end'] | True |
-| library_adapter_sequence | Textfield | Adapter sequence to be used for adapter trimming | | True |
-| library_id | Textfield | A library ID, unique within a TMC, which allows corresponding RNA and chromatin accessibility datasets to be linked. | | True |
-| is_technical_replicate | Allowable Value | Is the sequencing reaction run in repliucate, TRUE or FALSE | ['Yes','No']] | True |
-| cell_barcode_read | Textfield | Which read file contains the cell barcode | | True |
-| cell_barcode_offset | Textfield | Position(s) in the read at which the cell barcode starts. | | True |
-| cell_barcode_size | Textfield | Length of the cell barcode in base pairs | | True |
-| library_pcr_cycles | Numeric | Number of PCR cycles to amplify cDNA | | True |
-| library_pcr_cycles_for_sample_index | Numeric | Number of PCR cycles performed for library indexing | | True |
-| library_final_yield_value | Numeric | Total number of ng of library after final pcr amplification step. This is the concentration (ng/ul) * volume (ul) | | True |
-| library_final_yield_unit | Allowable Value | Units of final library yield | ['ng'] | False |
-| library_average_fragment_size | Numeric | Average size in basepairs (bp) of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation. | | True |
-| sequencing_reagent_kit | Textfield | Reagent kit used for sequencing | | True |
-| sequencing_read_format | Textfield | Slash-delimited list of the number of sequencing cycles for, for example, Read1, i7 index, i5 index, and Read2. | | True |
-| sequencing_read_percent_q30 | Numeric | Q30 is the weighted average of all the reads (e.g. # bases UMI * q30 UMI + # bases R2 * q30 R2 + ...) | | True |
-| sequencing_phix_percent | Numeric | Percent PhiX loaded to the run | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
-
-
-
-scRNAseq Version 0
-
-## scRNAseq Version 0
-
-| Attribute | Type | Description | Allowable Values | Required |
-|---------------------------------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|
-| version | Allowable Value | Version of the schema to use when validating this metadata. | ['2'] | True |
-| description | Textfield | Free-text description of this assay. | | True |
-| donor_id | Textfield | HuBMAP Display ID of the donor of the assayed tissue. | | True |
-| tissue_id | Textfield | HuBMAP Display ID of the assayed tissue. | | True |
-| execution_datetime | Datetime | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | True |
-| protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
-| operator | Textfield | Name of the person responsible for executing the assay. | | True |
-| operator_email | Textfield | Email address for the operator. | | True |
-| pi | Textfield | Name of the principal investigator responsible for the data. | | True |
-| pi_email | Textfield | Email address for the principal investigator. | | True |
-| assay_category | Allowable Value | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['sequence'] | True |
-| assay_type | Allowable Value | The specific type of assay being executed. | ['scRNAseq-10xGenomics-v2', 'scRNAseq-10xGenomics-v3', 'snRNAseq-10xGenomics-v2', 'snRNAseq-10xGenomics-v3', 'scRNAseq', 'sciRNAseq', 'snRNAseq', 'SNARE2-RNAseq'] | True |
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['RNA'] | True |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ['Yes','No']] | True |
-| acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
-| acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
-| sc_isolation_protocols_io_doi | Textfield | Textfield to a protocols document answering the question: How were single cells separated into a single-cell suspension? | | True |
-| sc_isolation_entity | Allowable Value | The type of single cell entity derived from isolation protocol | ['whole cell', 'nucleus', 'cell-cell multimer', 'spatially encoded cell barcoding'] | True |
-| sc_isolation_tissue_dissociation | Textfield | The method by which tissues are dissociated into single cells in suspension. | | True |
-| sc_isolation_enrichment | Allowable Value | The method by which specific cell populations are sorted or enriched. | ['none', 'FACS'] | True |
-| sc_isolation_quality_metric | Textfield | A quality metric by visual inspection prior to cell lysis or defined by known parameters such as wells with several cells or no cells. This can be captured at a high level. | | True |
-| sc_isolation_cell_number | Numeric | Total number of cell/nuclei yielded post dissociation and enrichment | | True |
-| rnaseq_assay_input | Numeric | Number of cell/nuclei input to the assay | | True |
-| rnaseq_assay_method | Textfield | The kit used for the RNA sequencing assay | | True |
-| library_construction_protocols_io_doi | Textfield | A link to the protocol document containing the library construction method (including version) that was used, e.g. "Smart-Seq2", "Drop-Seq", "10X v3". | | True |
-| library_layout | Allowable Value | State whether the library was generated for single-end or paired end sequencing. | ['single-end', 'paired-end'] | True |
-| library_adapter_sequence | Textfield | Adapter sequence to be used for adapter trimming | | True |
-| library_id | Textfield | A library ID, unique within a TMC, which allows corresponding RNA and chromatin accessibility datasets to be linked. | | True |
-| is_technical_replicate | Allowable Value | Is the sequencing reaction run in replicate, TRUE or FALSE | ['Yes','No']] | True |
-| cell_barcode_read | Textfield | Which read file(s) contains the cell barcode. Multiple cell_barcode_read files must be provided as a comma-delimited list (e.g. file1,file2,file3). | | False |
-| cell_barcode_offset | Textfield | Position(s) in the read at which the cell barcode starts. | | False |
-| cell_barcode_size | Textfield | Length of the cell barcode in base pairs | | False |
-| expected_cell_count | Numeric | How many cells are expected? This may be used in downstream pipelines to guide selection of cell barcodes or segmentation parameters. | | False |
-| library_pcr_cycles | Numeric | Number of PCR cycles to amplify cDNA | | True |
-| library_pcr_cycles_for_sample_index | Numeric | Number of PCR cycles performed for library indexing | | True |
-| library_final_yield_value | Numeric | Total number of ng of library after final pcr amplification step. This is the concentration (ng/ul) * volume (ul) | | True |
-| library_final_yield_unit | Allowable Value | Units of final library yield | ['ng'] | False |
-| library_average_fragment_size | Numeric | Average size in basepairs (bp) of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation. | | True |
-| sequencing_reagent_kit | Textfield | Reagent kit used for sequencing | | True |
-| sequencing_read_format | Textfield | Slash-delimited list of the number of sequencing cycles for, for example, Read1, i7 index, i5 index, and Read2. | | True |
-| sequencing_read_percent_q30 | Numeric | Q30 is the weighted average of all the reads (e.g. # bases UMI * q30 UMI + # bases R2 * q30 R2 + ...) | | True |
-| sequencing_phix_percent | Numeric | Percent PhiX loaded to the run | | True |
-| contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
-| data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
-
-
\ No newline at end of file
+---
+layout: page-triary
+---
+
+# RNAseq Metadata Attributes
+
+Fields that are collected for RNAseq data, available at ```Dataset.metadata.```
+
+
+* indicates a required field
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples, then this should be a comma-separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | |
+| lab_id | | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | |
+| preparation_protocol_doi | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | ```https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1``` |
+| dataset_type | | The specific type of dataset being produced. | |
+| analyte_class | | Analytes are the target molecules being measured with the assay. | |
+| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | |
+| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | |
+| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | |
+| source_storage_duration_value | | How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. | |
+| source_storage_duration_unit | | The time duration unit of measurement | |
+| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | |
+| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | |
+| contributors_path | | Relative path to file with ORCID IDs for contributors for this dataset. | |
+| data_path | | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | |
+| barcode_offset | | Positions in the read at which the cell or capture spot barcodes start. Cell and capture spot barcodes are, for example, 3 x 8 bp sequences that are spaced by constant sequences (the offsets). First barcode at position 0, then 38, then 76. This should be included when constructing sequencing libraries with a non-commercial kit. | |
+| barcode_read | | Which read file contains the cell or capture spot barcode. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if the source material is barcoded. This field is used to determine which analysis pipeline to run. | |
+| barcode_size | | Length of the cell or capture spot barcode in base pairs. Cell and capture spot barcodes are, for example, 3 x 8 bp sequences that are spaced by constant sequences, the offsets. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if the source material is barcoded. This field is used to determine which analysis pipeline to run. | |
+| umi_offset | | Position in the read at which the UMI barcode starts. | |
+| umi_read | | Which read file(s) contains the UMI (unique molecular identifier) barcode. | |
+| umi_size | | Length of the UMI barcode in base pairs. | |
+| assay_input_entity | | This is the entity from which the analyte is being captured. For example, for bulk sequencing this would be "tissue", while it would be "single cell" for single cell sequencing. This field is used to determine which analysis pipeline to run. | |
+| number_of_input_cells_or_nuclei | | How many cells or nuclei were input to the assay? This is typically not available for preparations working with bulk tissue. | |
+| amount_of_input_analyte_value | | The amount of RNA or DNA input to the assay, typically measured by a Qubit, BioAnalyzer, or TapeStation. In most single cell/nuclei assays, this value isn't available. | |
+| amount_of_input_analyte_unit | | Units of amount of entity input to assay value | |
+| library_adapter_sequence | | Adapter sequence to be used for adapter trimming | |
+| library_average_fragment_size | | Average size in basepairs (bp) of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation. | |
+| library_input_amount_value | | The amount of cDNA, after amplification, that was used for library construction. | |
+| library_input_amount_unit | | Unit of library input amount value. | |
+| library_output_amount_value | | Total amount (e.g., nanograms) of library after the clean-up step of the final PCR amplification step. Answer the question: What is the Qubit measured concentration (ng/ul) times the elution volume (ul) after the final clean-up step? | |
+| library_output_amount_unit | | Units of library final yield. | |
+| library_concentration_value | | The concentration value of the pooled library samples submitted for sequencing. | |
+| library_concentration_unit | | Unit of library concentration value. | |
+| library_layout | | State whether the library was generated for single-end or paired end sequencing. | |
+| number_of_iterations_of_cdna_amplification | | This is the amplification of the cDNA prior to library construction. This is typically a PCR amplification, while for linear amplification methods like aRNA this would be the number of rounds of aRNA. | |
+| number_of_pcr_cycles_for_indexing | | Number of PCR cycles performed in order to add adapters and amplify the library. This does not include the cDNA amplification which is captured in the "number of iterations of cDNA amplification" field. | |
+| library_preparation_kit | | Reagent kit used for library preparation | |
+| sample_indexing_kit | | Indexes are needed for multiplexing sequencing libraries for simultaneous sequencing (pooling) and proper attachment to the Illumina flowcell. Each indexing kit would have a number of compatible sequences ("sample indexing sets") that are used to label some number of samples (the number of sets depends on the kit). | |
+| sample_indexing_set | | The specific sequencing barcode index set used, selected from the sample indexing kit. Example: For 10X this might be "SI-GA-A1", for Nextera "N505 - CTCCTTAC" | |
+| is_technical_replicate | | Is the sequencing reaction run in replicate, TRUE or FALSE | |
+| expected_entity_capture_count | | Number of cells, nuclei or capture spots expected to be captured by the assay. For Visium this is the total number of spots covered by tissue, within the capture area. | |
+| sequencing_reagent_kit | | Reagent kit used for sequencing | |
+| sequencing_read_format | | Slash-delimited list of the number of sequencing cycles for, for example, Read1, i7 index, i5 index, and Read2. | |
+| sequencing_batch_id | | The ID for the sequencing run. This could, for example, be the chip ID and should allow users the ability to determine which samples were processed together in a sequencing run. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | |
+| capture_batch_id | | A lab-generated ID to identify which cells were captured at the same time. This would, for example, be an ID to denote which datasets were derived from a single 10X Genomics Chromium Controller run. In the case of the 10X Controller this could be the chip ID and would allow users the ability to determine which samples were processed together in a Chromium controller. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | |
+| preparation_instrument_vendor | | The manufacturer of the instrument used to prepare (staining/processing) the sample for the assay. If an automatic slide staining method was indicated this field should list the manufacturer of the instrument. | |
+| preparation_instrument_model | | Manufacturers of a staining system instrument may offer various versions (models) of that instrument with different features. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | |
+| preparation_instrument_kit | | The reagent kit used with the preparation instrument. | |
+| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | |
+
+
+
+
+## Deprecated Attributes
+
+
+ indicates a field that was previously required
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| assay_category | | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | |
+| bulk_rna_isolation_protocols_io_doi | | Link to a protocols document answering the question: How was tissue stored and processed for RNA isolation? | |
+| bulk_rna_isolation_quality_metric_value | | RIN value | |
+| bulk_rna_yield_units_per_tissue_unit | | RNA amount per Tissue input amount. Valid values should be weight/weight (ng/mg). | |
+| bulk_rna_yield_value | | RNA (ng) per Weight of Tissue (mg). Answer the question: How much RNA in ng was isolated? How much tissue in mg was initially used for isolating RNA? Calculate the yield by dividing total RNA isolated by amount of tissue used to isolate RNA from (ng/mg). | |
+| donor_id | | HuBMAP Display ID of the donor of the assayed tissue. | |
+| execution_datetime | | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | |
+| library_construction_protocols_io_doi | | A link to the protocol document containing the library construction method (including version) that was used, e.g. "Smart-Seq2", "Drop-Seq", "10X v3". | |
+| library_id | | A library ID, unique within a TMC, which allows corresponding RNA and chromatin accessibility datasets to be linked. | |
+| operator | | Name of the person responsible for executing the assay. | |
+| operator_email | | Email address for the operator. | |
+| pi | | Name of the principal investigator responsible for the data. | |
+| pi_email | | Email address for the principal investigator. | |
+| rnaseq_assay_method | | The kit used for the RNA sequencing assay | |
+| sc_isolation_enrichment | | The method by which specific cell populations are sorted or enriched. | |
+| sc_isolation_protocols_io_doi | | Link to a protocols document answering the question: How were single cells separated into a single-cell suspension? | |
+| sc_isolation_quality_metric | | A quality metric by visual inspection prior to cell lysis or defined by known parameters such as wells with several cells or no cells. This can be captured at a high level. | |
+| sc_isolation_tissue_dissociation | | The method by which tissues are dissociated into single cells in suspension. | |
+| sc_isolation_cell_number | | Total number of cell/nuclei yielded post dissociation and enrichment | |
+| sequencing_phix_percent | | Percent PhiX loaded to the run | |
+| sequencing_read_percent_q30 | | Q30 is the weighted average of all the reads (e.g. # bases UMI * q30 UMI + # bases R2 * q30 R2 + ...) | |
+| version | | Version of the schema to use when validating this metadata. | |
+| description | | Free-text description of this assay. | |
diff --git a/docs/assays/metadata/RNAseqWithProbes.md b/docs/assays/metadata/RNAseqWithProbes.md
index fb8e974..68497e0 100644
--- a/docs/assays/metadata/RNAseqWithProbes.md
+++ b/docs/assays/metadata/RNAseqWithProbes.md
@@ -1,63 +1,95 @@
----
-layout: page
----
-# RNAseq-(with-probes)
- Version 2 (current)
-
-## Version 2 (current)
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-----------------------------------------------------|-----------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------|------------|
-| analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA ```| True |
-| acquisition_instrument_vendor | Allowable Value | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` | True |
-| acquisition_instrument_model | Allowable Value | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` | True |
-| source_storage_duration_value | Numeric | How long was the source material (parent) stored, prior to this sample being processed. | | True |
-| source_storage_duration_unit | Allowable Value | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | True |
-| time_since_acquisition_instrument_calibration_value | Numeric | The amount of time since the acqusition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | | False |
-| time_since_acquisition_instrument_calibration_unit | Allowable Value | The time unit of measurement |```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | False |
-| contributors_path | Textfield | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | True |
-| data_path | Textfield | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." where as for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | True |
-| barcode_read | Allowable Value | Which read file contains the cell or capture spot barcode. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if the source material is barcoded. This field is used to determine which analysis pipeline to run. | ```Read 2 (R2)``` ```Read 1 (R1)``` ```Not applicable``` | True |
-| barcode_size | Allowable Value | Length of the cell or capture spot barcode in base pairs. Cell and capture spot barcodes are, for example, 3 x 8 bp sequences that are spaced by constant sequences, the offsets. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if the source material is barcoded. This field is used to determine which analysis pipeline to run. | ```14``` ```16``` ```40``` ```8,8,8``` ```8,6``` ```8,8``` ```Not applicable``` | True |
-| umi_read | Allowable Value | Which read file contains the UMI barcode. This should be included when constructing sequencing libraries with a non-commercial kit. | ```Read 2 (R2)``` ```Read 1 (R1)``` ```Not applicable``` | True |
-| umi_size | Allowable Value | Length of the umi barcode in base pairs. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if UMI are present. This field is used to determine which analysis pipeline to run. | ```8``` ```9``` ```10``` ```12``` ```14``` ```Not applicable``` | True |
-| assay_input_entity | Allowable Value | This is the entity from which the analyte is being captured. For example, for bulk sequencing this would be "tissue", while it would be "single cell" for single cell sequencing. This field is used to determine which analysis pipeline to run. | ```area of interest``` ```single cell``` ```single nucleus``` ```spot``` ```tissue (bulk)``` | True |
-| number_of_input_cells_or_nuclei | Numeric | How many cells or nuclei were input to the assay? This is typically not available for preparations working with bulk tissue. | | False |
-| library_adapter_sequence | Textfield | 5’ and/or 3’ read adapter sequences used as part of the library preparation protocol to render the library compatible with the sequencing protocol and instrumentation. This should be provided as comma-separated list of key:value pairs (adapter name:sequence). | | True |
-| library_average_fragment_size | Numeric | Average size of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation. Numeric value in base pairs (bp). | | True |
-| library_input_amount_value | Numeric | The amount of cDNA, after amplification, that was used for library construction. | | False |
-| library_input_amount_unit | Allowable Value | unit of library input amount value | ```ng``` ```ul``` | False |
-| library_output_amount_value | Numeric | Total amount (eg. nanograms) of library after the clean-up step of final pcr amplification step. Answer the question: What is the Qubit measured concentration (ng/ul) times the elution volume (ul) after the final clean-up step? | | False |
-| library_output_amount_unit | Allowable Value | Units of library final yield. | ```ng``` ```ul``` | False |
-| library_concentration_value | Numeric | The concentration value of the pooled library samples submitted for sequencing. | | True |
-| library_concentration_unit | Allowable Value | Unit of library concentration value. | ```ng/ul``` ```nM``` | True |
-| library_layout | Allowable Value | Whether the library was generated for single-end or paired end sequencing | ```paired-end``` ```single-end``` | True |
-| number_of_pcr_cycles_for_indexing | Numeric | Number of PCR cycles performed in order to add adapters and amplify the library. This does not include the cDNA amplification which is captured in the "number of iterations of cDNA amplification" field. | | True |
-| library_preparation_kit | Allowable Value | Reagent kit used for library preparation | ```10X Genomics; Automated Library Construction Kit``` ```24 rxns; PN 1000428``` ```10X Genomics; Chromium Next GEM Automated Single Cell 5' Kit v2``` ```24 rxns; PN 1000290``` ```10X Genomics; Chromium Next GEM Automated Single Cell 5' Kit v2``` ```4 rxns; PN 1000298``` ```10X Genomics; Chromium Next GEM Single Cell 3' GEM``` ```Library & Gel Bead Kit v3.1``` ```16 rxns; PN 1000121``` ```10X Genomics; Chromium Next GEM Single Cell 3' HT Kit v3.1``` ```48 rxns; PN 1000348``` ```10X Genomics; Chromium Next GEM Single Cell 3' HT Kit v3.1``` ```8 rxns; PN 1000370``` ```10X Genomics; Chromium Next GEM Single Cell 3' Kit v3.1``` ```16 rxns; PN 1000268``` ```10X Genomics; Chromium Next GEM Single Cell 3' Kit v3.1``` ```4 rxns; PN 1000269``` ```10X Genomics; Chromium Next GEM Single Cell 5' Kit v2``` ```16 rxns; PN 1000263``` ```10X Genomics; Chromium Next GEM Single Cell 5' Kit v2``` ```4 rxns; PN 1000265``` ```10X Genomics; Chromium Next GEM Single Cell Fixed RNA Hybridization & Library Kit``` ```4 rxns; PN 1000415``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```16 rxn; PN 1000283``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```4 rxn; PN 1000285``` ```10X Genomics; Chromium Single Cell 3' GEM``` ```Library & Gel Bead Kit v3``` ```4 rxns PN 1000092``` ```10X Genomics; Chromium Single Cell 3' Library & Gel Bead Kit``` ```4 rxns; PN 120267``` ```10X Genomics; Visium CytAssist Spatial Gene Expression for FFPE``` ```Human Transcriptome``` ```11 mm``` ```2 reactions; PN 1000522``` ```10X Genomics; Visium CytAssist Spatial Gene Expression for FFPE``` ```Human Transcriptome``` ```6.5mm``` ```4 reactions; PN 1000520``` ```10X Genomics; Visium Spatial for FFPE Gene Expression Kit``` ```Human Transcriptome``` ```1 slides``` ```4 reactions; PN 1000338``` ```10X Genomics; Visium Spatial for FFPE Gene Expression Kit``` ```Mouse Transcriptome``` ```4 rxns; PN 1000339``` ```10X Genomics; Visium Spatial Gene Expression Slide and Reagent Kit``` ```1 slides``` ```4 reactions; PN 1000187``` ```10X Genomics; Visium Spatial Gene Expression Slide and Reagent Kit``` ```4 slides``` ```16 reactions; PN 1000184``` ```Custom``` ```Illumina; TruSeq Stranded mRNA Library Prep (48 samples); PN 20020594``` ```Illumina; TruSeq Stranded mRNA Library Prep (96 samples); PN 20020595``` ```New England BioLabs; NEBNext Ultra II RNA Library Prep Kit for Illumina; PN E7770``` ```Parse Biosciences; Evercode WT Mini v2 Kit``` ```12 rxns; PN ECW02010``` ```Parse Biosciences; Evercode WT v2 Kit``` ```48 rxns; PN ECW02030)``` | False |
-| sample_indexing_kit | Allowable Value | Indexes are needed for multiplexing sequencing libraries for simultaneous sequencing (pooling) and proper attachment to the Illumina flowcell. Each indexing kit would have a number of compatible sequences ("sample indexing sets") that are used to label some number of samples (the number of sets depend on the kit). | ```10X Genomics; Chromium i7 Sample Index Plate (96 rxn); PN 220103``` ```10X Genomics; Dual Index Kit TS``` ```Set A; PN 1000251``` ```10X Genomics; Dual Index Kit TT``` ```Set A (96 rxn); PN 1000215``` ```10X Genomics; Single Index Kit N``` ```Set A (96 rxn); PN 1000212``` ```Custom``` ```Illumina; IDT for Illumina - TruSeq RNA UD Indexes v2 (96 Indexes``` ```96 Samples); PN 20040871``` ```Illumina; TruSeq RNA CD Index Plate (96 Indexes``` ```96 Samples); PN 20019792``` ```Illumina; TruSeq RNA Single Indexes Set A (12 Indexes``` ```48 Samples); PN 20020492``` ```Illumina; TruSeq RNA Single Indexes Set B (12 Indexes``` ```48 Samples); PN 20020493``` ```Integrated DNA Technologies: Custom DNA Oligos``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-AB``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-CD``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-EF``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-GH``` ```Not applicable``` ```Parse Biosciences; Fragmentation Reagents; PN WX100``` ```Parse Biosciences; UDI Plate - WT; PN UDI1001 ```| False |
-| sample_indexing_set | Textfield | The specific sequencing barcode index set used, selected from the sample indexing kit. Example: For 10X this might be "SI-GA-A1", for Nextera "N505 - CTCCTTAC" | | False |
-| is_targeted | Allowable Value | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` | True |
-| expected_entity_capture_count | Numeric | Number of cells, nuclei or capture spots expected to be captured by the assay. For Visium this is the total number of spots covered by tissue, within the capture area. | | False |
-| sequencing_reagent_kit | Allowable Value | Reagent kit used for sequencing |```Custom``` ```Illumina; HiSeq 3000/4000 PE Cluster Kit PE-410-1001; PN 1000283``` ```Illumina; NextSeq 1000/2000 P2 Reagent v3 Kit (100 Cycles); PN 20046811``` ```Illumina; NextSeq 1000/2000 P2 Reagent v3 Kit (200 Cycles); PN 20046812``` ```Illumina; NextSeq 1000/2000 P2 Reagent v3 Kit (300 Cycles); PN 20046813``` ```Illumina; NextSeq 2000 P3 Reagent Kit (300 Cycles); PN 20040561``` ```Illumina; NextSeq 2000 P3 Reagents Kit (100 Cycles); PN 20040559``` ```Illumina; NextSeq 500/550 Hi Output Kit 150 Cycles; v2.5; PN 20024907``` ```Illumina; NextSeq 500/550 Hi Output Kit 75 Cycles v2.5; PN 20024906``` ```Illumina; NextSeq 500/550 Mid Output Kit 150 Cycles v2.5; PN 20024904``` ```Illumina; NovaSeq 6000 S1 Reagent Kit (200 Cycles); PN 20012864``` ```Illumina; NovaSeq 6000 S1 Reagent v1.5 Kit (100 Cycles); PN 20028319``` ```Illumina; NovaSeq 6000 S1 Reagent v1.5 Kit (200 Cycles); PN 20028318``` ```Illumina; NovaSeq 6000 S1 Reagent v1.5 Kit (300 Cycles); PN 20028317``` ```Illumina; NovaSeq 6000 S2 Reagent v1.5 Kit (100 Cycles); PN 20028316``` ```Illumina; NovaSeq 6000 S4 Reagent Kit v1.5 (300 cycles); PN 20028312``` ```Illumina; NovaSeq 6000 S4 Reagent v1.5 Kit (200 Cycles); PN 20028313``` ```Illumina; NovaSeq 6000 SP Reagent v1.5 Kit (100 Cycles); PN 20028401``` ```Illumina; NovaSeq X Series 1.5B Reagent Kit (100 Cycle); PN 20104703``` ```Illumina; NovaSeq X Series 1.5B Reagent Kit (200 Cycle); PN 20104704``` ```Illumina; NovaSeq X Series 1.5B Reagent Kit (300 Cycle); PN 20104705``` ```Illumina; NovaSeq X Series 10B Reagent Kit (100 Cycle); PN 20085596``` ```Illumina; NovaSeq X Series 10B Reagent Kit (200 Cycle); PN 20085595``` ```Illumina; NovaSeq X Series 10B Reagent Kit (300 Cycle); PN 20085594``` | True |
-| sequencing_read_format | Textfield | Number of sequencing cycles in each round of sequencing (i.e., Read1, i7 index, i5 index, and Read2). This is reported as a comma-delimited list. Example: For 10X snATAC-seq (R1,Index,R2,R3) this might be: 50,8,16,50. For SNARE-seq2 this might be: 75,94,8,75 | | True |
-| sequencing_batch_id | Textfield | The ID for the sequencing run. This could, for example, be the chip ID and should allow users the ability to determine which samples were processed together in a sequencing run. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | False |
-| capture_batch_id | Textfield | A lab-generated ID to identify which cells were captured at the same time. This would, for example, be an ID to denote which datasets were derived from a single 10X Genomics Chromium Controller run. In the case of the 10X Controller this could be the chip ID and would allow users the ability to determine which samples were processed together in a Chromium controller. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | False |
-| preparation_instrument_vendor | Allowable Value | The manufacturer of the instrument used to prepare (staining/processing) the sample for the assay. If an automatic slide staining method was indicated this field should list the manufacturer of the instrument. | ```10x Genomics``` ```Hamamatsu``` ```HTX Technologies``` ```In-House``` ```Leica Biosystems``` ```Not applicable``` ```Roche Diagnostics``` ```SunChrom``` ```Thermo Fisher Scientific``` | False |
-| preparation_instrument_model | Allowable Value | Manufacturers of a staining system instrument may offer various versions (models) of that instrument with different features. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```AutoStainer XL``` ```Chromium Connect``` ```Chromium Controller``` ```Chromium iX``` ```Chromium X``` ```Discovery Ultra``` ```EVOS M7000``` ```M3+ Sprayer``` ```M5 Sprayer``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```Not applicable``` ```ST5020 Multistainer``` ```Sublimator``` ```SunCollect Sprayer``` ```TM-Sprayer``` ```Visium CytAssist ```| False |
-| metadata_schema_id | Textfield | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | True |
-| amount_of_input_analyte_value | Numeric | The amount of RNA or DNA input to the assay, typically measured by a Qubit, BioAnalyzer, or TapeStation. In most single cell/nuclei assays, this value isn't available. | | False |
-| number_of_iterations_of_cdna_amplification | Numeric | This is the amplification of the cDNA prior to library construction. This is typically a PCR amplification, while for linear amplification methods like aRNA this would be the number of rounds of aRNA. | | True |
-| preparation_instrument_kit | Allowable Value | The reagent kit used with the preparation instrument. | ```10X Genomics; Chromium Next GEM Chip G Single Cell Kit``` ```16 rxns; PN 1000127``` ```10X Genomics; Chromium Next GEM Chip G Single Cell Kit``` ```48 rxns; PN 1000120``` ```10X Genomics; Chromium Next GEM Chip K Automated Single Cell Kit``` ```48 rxns; PN 1000289``` ```10X Genomics; Chromium Next GEM Chip K Single Cell Kit``` ```16 rxns; PN 1000287``` ```10X Genomics; Chromium Next GEM Chip K Single Cell Kit``` ```48 rxns; PN 1000286``` ```10X Genomics; Chromium Next GEM Chip Q Single Cell Kit``` ```16 rxns; PN 1000422``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```16 rxn; PN 1000283``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```4 rxn; PN 1000285``` ```10X Genomics; Visium FFPE Reagent Kit v2-Small``` ```PN 1000436``` ```Custom``` | False |
-| preparation_protocol_doi | Textfield | DOI for the protocols.io page that describes the assay or sample procurment and preparation. For example for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | True |
-| umi_offset | Allowable Value | Position in the read at which the UMI barcode starts. This should be included when constructing sequencing libraries with a non-commercial kit. | ```0``` ```16``` ```34``` ```36``` ```Not applicable``` | True |
-| parent_sample_id | Textfield | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | True |
-| barcode_offset | Allowable Value | Positions in the read at which the cell or capture spot barcodes start. Cell and capture spot barcodes are, for example, 3 x 8 bp sequences that are spaced by constant sequences (the offsets). First barcode at position 0, then 38, then 76. This should be included when constructing sequencing libraries with a non-commercial kit. | ```0``` ```8``` ```20``` ```1,27``` ```0,38,76``` ```10,48,86``` ```Not applicable``` | True |
-| is_custom_probes_used | Allowable Value | State ("Yes" or "No") whether custom RNA or antibody probes were used. If custom probes were used, they must be listed in the "custom_probe_set.csv" file. | ```Yes``` ```No``` | True |
-| probe_hybridization_time_value | Numeric | How long was the oligo-conjugated RNA or oligo-conjugated antibody probes hybridized with the sample? | | True |
-| probe_hybridization_time_unit | Allowable Value | The units for probe hybridization time value. | ```Hour``` ```Minute``` | True |
-| oligo_probe_panel | Allowable Value | This is the probe panel used to target genes and/or proteins. In cases where there is a core panel and add-on modules, the core panel should be selected here. If additional panels are used, then they must be included in the "additional_panels_used.csv" file that's uploaded with the dataset. | ```10x Genomics; Chromium Fixed RNA Kit``` ```Human Transcriptome``` ```4 rxns x 1 BC; PN 1000474``` ```10X Genomics; Chromium Next GEM Single Cell Fixed RNA Human Transcriptome Probe Kit``` ```16 rxns; PN 1000420``` ```10X Genomics; Chromium Next GEM Single Cell Fixed RNA Human Transcriptome Probe Kit``` ```64 rxns; PN 1000456``` ```10x Genomics; Visium Human Transcriptome Probe Kit v2 - Small; PN 1000466``` ```10x Genomics; Visium Human Transcriptome Probe Kit-Large; PN 1000364``` ```10x Genomics; Visium Human Transcriptome Probe Kit-Small; PN 1000363``` ```10x Genomics; Visium Mouse Transcriptome Probe Kit - Small; PN 1000365``` ```Custom``` ```NanoString Technologies; GeoMx Human Whole Transcriptome Atlas``` ```4 slides; PN GMX-RNA-NGS-HuWTA-4``` ```NanoString Technologies; GeoMx Mouse Whole Transcriptome Atlas``` ```4 slides; PN GMX-RNA-NGS-MsWTA-4``` | True |
-| is_custom_probes_used | Allowable Value | State ("Yes" or "No") whether custom RNA or antibody probes were used. If custom probes were used, they must be listed in the "custom_probe_set.csv" file. | ```Yes``` ```No``` | True |
-| dataset_type | Allowable Value | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium```| True |
-| amount_of_input_analyte_unit | Allowable Value | Units of amount of entity input to assay value | ```ug``` ```ng``` | False |
-
-
+---
+layout: page-triary
+---
+
+# RNAseq-(with-probes) Metadata Attributes
+
+Fields that are collected for RNAseq-(with-probes) data, available at ```Dataset.metadata.```
+
+
+\* indicates a required field
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma-separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | |
+| lab_id | | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | |
+| preparation_protocol_doi * | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | ```https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1``` |
+| dataset_type * | | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium``` |
+| analyte_class * | | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA``` |
+| is_targeted * | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ```Yes``` ```No``` |
+| acquisition_instrument_vendor * | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` |
+| acquisition_instrument_model * | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` |
+| source_storage_duration_value | | How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. | |
+| source_storage_duration_unit * | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` |
+| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | |
+| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` |
+| contributors_path | | Relative path to file with ORCID IDs for contributors for this dataset. | |
+| data_path | | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | |
+| barcode_offset * | | Positions in the read at which the cell or capture spot barcodes start. Cell and capture spot barcodes are, for example, 3 x 8 bp sequences that are spaced by constant sequences (the offsets). First barcode at position 0, then 38, then 76. This should be included when constructing sequencing libraries with a non-commercial kit. | ```0``` ```8``` ```20``` ```1,27``` ```0,38,76``` ```10,48,86``` ```Not applicable``` |
+| barcode_read * | | Which read file contains the cell or capture spot barcode. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if the source material is barcoded. This field is used to determine which analysis pipeline to run. | ```Read 2 (R2)``` ```Read 1 (R1)``` ```Not applicable``` |
+| barcode_size * | | Length of the cell or capture spot barcode in base pairs. Cell and capture spot barcodes are, for example, 3 x 8 bp sequences that are spaced by constant sequences, the offsets. This should be included when constructing sequencing libraries with a non-commercial kit. This field is required if the source material is barcoded. This field is used to determine which analysis pipeline to run. | ```14``` ```16``` ```40``` ```8,8,8``` ```8,6``` ```Not applicable``` |
+| umi_offset * | | Position in the read at which the UMI barcode starts. | ```0``` ```16``` ```36``` ```Not applicable``` |
+| umi_read * | | Which read file(s) contains the UMI (unique molecular identifier) barcode. | ```Read 2 (R2)``` ```Read 1 (R1)``` ```Not applicable``` |
+| umi_size * | | Length of the UMI barcode in base pairs. | ```8``` ```9``` ```10``` ```12``` ```Not applicable``` |
+| assay_input_entity * | | This is the entity from which the analyte is being captured. For example, for bulk sequencing this would be "tissue", while it would be "single cell" for single cell sequencing. This field is used to determine which analysis pipeline to run. | ```area of interest``` ```single cell``` ```single nucleus``` ```spot``` ```tissue (bulk)``` |
+| number_of_input_cells_or_nuclei | | How many cells or nuclei were input to the assay? This is typically not available for preparations working with bulk tissue. | |
+| amount_of_input_analyte_value | | The amount of RNA or DNA input to the assay, typically measured by a Qubit, BioAnalyzer, or TapeStation. In most single cell/nuclei assays, this value isn't available. | |
+| amount_of_input_analyte_unit | | Units of amount of entity input to assay value | ```ug``` ```ng``` |
+| library_adapter_sequence | | Adapter sequence to be used for adapter trimming | |
+| library_average_fragment_size | | Average size in base pairs (bp) of sequencing library fragments estimated via gel electrophoresis or Bioanalyzer/TapeStation. | |
+| library_input_amount_value | | The amount of cDNA, after amplification, that was used for library construction. | |
+| library_input_amount_unit * | | Unit of library input amount value. | ```ng``` ```ul``` |
+| library_output_amount_value | | Total amount (e.g., nanograms) of library after the clean-up step of the final PCR amplification. Answer the question: What is the Qubit measured concentration (ng/ul) times the elution volume (ul) after the final clean-up step? | |
+| library_output_amount_unit | | Units of library final yield. | ```ng``` ```ul``` |
+| library_concentration_value | | The concentration value of the pooled library samples submitted for sequencing. | |
+| library_concentration_unit * | | Unit of library concentration value. | ```ng/ul``` ```nM``` |
+| library_layout * | | State whether the library was generated for single-end or paired-end sequencing. | ```paired-end``` ```single-end``` |
+| number_of_iterations_of_cdna_amplification | | This is the amplification of the cDNA prior to library construction. This is typically a PCR amplification, while for linear amplification methods like aRNA this would be the number of rounds of aRNA. | |
+| number_of_pcr_cycles_for_indexing | | Number of PCR cycles performed in order to add adapters and amplify the library. This does not include the cDNA amplification which is captured in the "number of iterations of cDNA amplification" field. | |
+| library_preparation_kit * | | Reagent kit used for library preparation | ```10X Genomics; Automated Library Construction Kit``` ```24 rxns; PN 1000428``` ```10X Genomics; Chromium Next GEM Automated Single Cell 5' Kit v2``` ```24 rxns; PN 1000290``` ```4 rxns; PN 1000298``` ```10X Genomics; Chromium Next GEM Single Cell 3' GEM``` ```Library & Gel Bead Kit v3.1``` ```16 rxns; PN 1000121``` ```10X Genomics; Chromium Next GEM Single Cell 3' HT Kit v3.1``` ```48 rxns; PN 1000348``` ```8 rxns; PN 1000370``` ```10X Genomics; Chromium Next GEM Single Cell 3' Kit v3.1``` ```16 rxns; PN 1000268``` ```4 rxns; PN 1000269``` ```10X Genomics; Chromium Next GEM Single Cell 5' Kit v2``` ```16 rxns; PN 1000263``` ```4 rxns; PN 1000265``` ```10X Genomics; Chromium Next GEM Single Cell Fixed RNA Hybridization & Library Kit``` ```4 rxns; PN 1000415``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```16 rxn; PN 1000283``` ```4 rxn; PN 1000285``` ```10X Genomics; Chromium Single Cell 3' GEM``` ```Library & Gel Bead Kit v3``` ```4 rxns PN 1000092``` ```10X Genomics; Chromium Single Cell 3' Library & Gel Bead Kit``` ```4 rxns; PN 120267``` ```10X Genomics; Visium CytAssist Spatial Gene Expression for FFPE``` ```Human Transcriptome``` ```11 mm``` ```2 reactions; PN 1000522``` ```6.5mm``` ```4 reactions; PN 1000520``` ```10X Genomics; Visium Spatial for FFPE Gene Expression Kit``` ```1 slides``` ```4 reactions; PN 1000338``` ```Mouse Transcriptome``` ```4 rxns; PN 1000339``` ```10X Genomics; Visium Spatial Gene Expression Slide and Reagent Kit``` ```4 reactions; PN 1000187``` ```4 slides``` ```16 reactions; PN 1000184``` ```Custom``` ```Illumina; TruSeq Stranded mRNA Library Prep (48 samples); PN 20020594``` ```Illumina; TruSeq Stranded mRNA Library Prep (96 samples); PN 20020595``` ```New England BioLabs; NEBNext Ultra II RNA Library Prep Kit for Illumina; PN E7770``` ```Parse Biosciences; Evercode WT Mini v2 Kit``` ```12 rxns; PN ECW02010``` ```Parse Biosciences; Evercode WT v2 Kit``` ```48 rxns; PN ECW02030)``` |
+| sample_indexing_kit * | | Indexes are needed for multiplexing sequencing libraries for simultaneous sequencing (pooling) and proper attachment to the Illumina flowcell. Each indexing kit has a number of compatible sequences ("sample indexing sets") that are used to label some number of samples (the number of sets depends on the kit). | ```10X Genomics; Chromium i7 Sample Index Plate (96 rxn); PN 220103``` ```10X Genomics; Dual Index Kit TS``` ```Set A; PN 1000251``` ```10X Genomics; Dual Index Kit TT``` ```Set A (96 rxn); PN 1000215``` ```10X Genomics; Single Index Kit N``` ```Set A (96 rxn); PN 1000212``` ```Custom``` ```Illumina; IDT for Illumina - TruSeq RNA UD Indexes v2 (96 Indexes``` ```96 Samples); PN 20040871``` ```Illumina; TruSeq RNA CD Index Plate (96 Indexes``` ```96 Samples); PN 20019792``` ```Illumina; TruSeq RNA Single Indexes Set A (12 Indexes``` ```48 Samples); PN 20020492``` ```Illumina; TruSeq RNA Single Indexes Set B (12 Indexes``` ```48 Samples); PN 20020493``` ```Integrated DNA Technologies: Custom DNA Oligos``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-AB``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-CD``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-EF``` ```NanoString Technologies; GeoMx Seq Code Pack; PN GMX-NGS-SEQ-GH``` ```Not applicable``` ```Parse Biosciences; Fragmentation Reagents; PN WX100``` ```Parse Biosciences; UDI Plate - WT; PN UDI1001``` |
+| sample_indexing_set | | The specific sequencing barcode index set used, selected from the sample indexing kit. Example: For 10X this might be "SI-GA-A1", for Nextera "N505 - CTCCTTAC" | |
+| is_technical_replicate * | | Whether the sequencing reaction was run in replicate (Yes or No). | ```Yes``` ```No``` |
+| expected_entity_capture_count | | Number of cells, nuclei or capture spots expected to be captured by the assay. For Visium this is the total number of spots covered by tissue, within the capture area. | |
+| sequencing_reagent_kit * | | Reagent kit used for sequencing | ```Custom``` ```Illumina``` ```HiSeq 3000/4000 PE Cluster Kit PE-410-1001``` ```PN 1000283, Illumina``` ```NextSeq 1000/2000 P2 Reagent v3 Kit (100 Cycles)``` ```PN 20046811, Illumina``` ```NextSeq 1000/2000 P2 Reagent v3 Kit (200 Cycles)``` ```PN 20046812, Illumina``` ```NextSeq 1000/2000 P2 Reagent v3 Kit (300 Cycles)``` ```PN 20046813, Illumina``` ```NextSeq 2000 P3 Reagent Kit (300 Cycles)``` ```PN 20040561, Illumina``` ```NextSeq 2000 P3 Reagents Kit (100 Cycles)``` ```PN 20040559, Illumina``` ```NextSeq 500/550 Hi Output Kit 150 Cycles``` ```v2.5``` ```PN 20024907, Illumina``` ```NextSeq 500/550 Hi Output Kit 75 Cycles v2.5``` ```PN 20024906, Illumina``` ```NextSeq 500/550 Mid Output Kit 150 Cycles v2.5``` ```PN 20024904, Illumina``` ```NovaSeq 6000 S1 Reagent Kit (200 Cycles)``` ```PN 20012864, Illumina``` ```NovaSeq 6000 S1 Reagent v1.5 Kit (100 Cycles)``` ```PN 20028319, Illumina``` ```NovaSeq 6000 S1 Reagent v1.5 Kit (200 Cycles)``` ```PN 20028318, Illumina``` ```NovaSeq 6000 S1 Reagent v1.5 Kit (300 Cycles)``` ```PN 20028317, Illumina``` ```NovaSeq 6000 S2 Reagent v1.5 Kit (100 Cycles)``` ```PN 20028316, Illumina``` ```NovaSeq 6000 S4 Reagent Kit v1.5 (300 cycles)``` ```PN 20028312, Illumina``` ```NovaSeq 6000 S4 Reagent v1.5 Kit (200 Cycles)``` ```PN 20028313, Illumina``` ```NovaSeq 6000 SP Reagent v1.5 Kit (100 Cycles)``` ```PN 20028401, Illumina``` ```NovaSeq X Series 1.5B Reagent Kit (100 Cycle)``` ```PN 20104703, Illumina``` ```NovaSeq X Series 1.5B Reagent Kit (200 Cycle)``` ```PN 20104704, Illumina``` ```NovaSeq X Series 1.5B Reagent Kit (300 Cycle)``` ```PN 20104705, Illumina``` ```NovaSeq X Series 10B Reagent Kit (100 Cycle)``` ```PN 20085596, Illumina``` ```NovaSeq X Series 10B Reagent Kit (200 Cycle)``` ```PN 20085595, Illumina``` ```NovaSeq X Series 10B Reagent Kit (300 Cycle)``` ```PN 20085594``` |
+| sequencing_read_format | | Slash-delimited list of the number of sequencing cycles in each read, for example Read1, i7 index, i5 index, and Read2. | |
+| sequencing_batch_id | | The ID for the sequencing run. This could, for example, be the chip ID and should allow users the ability to determine which samples were processed together in a sequencing run. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | |
+| capture_batch_id | | A lab-generated ID to identify which cells were captured at the same time. This would, for example, be an ID to denote which datasets were derived from a single 10X Genomics Chromium Controller run. In the case of the 10X Controller this could be the chip ID and would allow users the ability to determine which samples were processed together in a Chromium controller. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | |
+| preparation_instrument_vendor * | | The manufacturer of the instrument used to prepare (staining/processing) the sample for the assay. If an automatic slide staining method was indicated this field should list the manufacturer of the instrument. | ```10x Genomics``` ```Hamamatsu``` ```HTX Technologies``` ```In-House``` ```Leica Biosystems``` ```Not applicable``` ```Roche Diagnostics``` ```SunChrom``` ```Thermo Fisher Scientific``` |
+| preparation_instrument_model * | | Manufacturers of a staining system instrument may offer various versions (models) of that instrument with different features. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```AutoStainer XL``` ```Chromium Connect``` ```Chromium Controller``` ```Chromium iX``` ```Chromium X``` ```Discovery Ultra``` ```EVOS M7000``` ```M3+ Sprayer``` ```M5 Sprayer``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```Not applicable``` ```ST5020 Multistainer``` ```Sublimator``` ```SunCollect Sprayer``` ```TM-Sprayer``` ```Visium CytAssist``` |
+| preparation_instrument_kit * | | The reagent kit used with the preparation instrument. | ```10X Genomics; Chromium Next GEM Chip G Single Cell Kit``` ```16 rxns; PN 1000127``` ```48 rxns; PN 1000120``` ```10X Genomics; Chromium Next GEM Chip K Automated Single Cell Kit``` ```48 rxns; PN 1000289``` ```10X Genomics; Chromium Next GEM Chip K Single Cell Kit``` ```16 rxns; PN 1000287``` ```48 rxns; PN 1000286``` ```10X Genomics; Chromium Next GEM Chip Q Single Cell Kit``` ```16 rxns; PN 1000422``` ```10X Genomics; Chromium NextGem Single Cell Multiome ATAC + Gene Expression Reagent Bundle``` ```16 rxn; PN 1000283``` ```4 rxn; PN 1000285``` ```10X Genomics; Visium FFPE Reagent Kit v2-Small``` ```PN 1000436``` ```Custom``` |
+| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | |
+
+
+
+
+## Deprecated Attributes
+
+
+* indicates a field that was previously required
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| assay_category * | | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ```sequence``` |
+| bulk_rna_isolation_protocols_io_doi | | Link to a protocols document answering the question: How was tissue stored and processed for RNA isolation? | |
+| bulk_rna_isolation_quality_metric_value | | RIN value | |
+| bulk_rna_yield_units_per_tissue_unit * | | RNA amount per Tissue input amount. Valid values should be weight/weight (ng/mg). | ```ng/mg``` |
+| bulk_rna_yield_value | | RNA (ng) per Weight of Tissue (mg). Answer the question: How much RNA in ng was isolated? How much tissue in mg was initially used for isolating RNA? Calculate the yield by dividing total RNA isolated by amount of tissue used to isolate RNA from (ng/mg). | |
+| donor_id | | HuBMAP Display ID of the donor of the assayed tissue. | |
+| execution_datetime | | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. Format: YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month, DD is the day, hh is the hour, and mm is the minutes, each with leading zeros. | |
+| library_construction_protocols_io_doi | | A link to the protocol document containing the library construction method (including version) that was used, e.g. "Smart-Seq2", "Drop-Seq", "10X v3". | |
+| library_id | | A library ID, unique within a TMC, which allows corresponding RNA and chromatin accessibility datasets to be linked. | |
+| operator | | Name of the person responsible for executing the assay. | |
+| operator_email | | Email address for the operator. | |
+| pi | | Name of the principal investigator responsible for the data. | |
+| pi_email | | Email address for the principal investigator. | |
+| rnaseq_assay_method | | The kit used for the RNA sequencing assay | |
+| sc_isolation_enrichment * | | The method by which specific cell populations are sorted or enriched. | ```none``` ```FACS``` |
+| sc_isolation_protocols_io_doi | | Link to a protocols document answering the question: How were single cells separated into a single-cell suspension? | |
+| sc_isolation_quality_metric | | A quality metric by visual inspection prior to cell lysis or defined by known parameters such as wells with several cells or no cells. This can be captured at a high level. | |
+| sc_isolation_tissue_dissociation | | The method by which tissues are dissociated into single cells in suspension. | |
+| sc_isolation_cell_number | | Total number of cells/nuclei yielded post dissociation and enrichment | |
+| sequencing_phix_percent | | Percent PhiX loaded to the run | |
+| sequencing_read_percent_q30 | | Q30 is the weighted average of all the reads (e.g. # bases UMI * q30 UMI + # bases R2 * q30 R2 + ...) | |
+| version * | | Version of the schema to use when validating this metadata. | ```1``` |
+| description | | Free-text description of this assay. | |
diff --git a/docs/assays/metadata/VisiumNoProbes.md b/docs/assays/metadata/VisiumNoProbes.md
index 2d741c4..9d24d6b 100644
--- a/docs/assays/metadata/VisiumNoProbes.md
+++ b/docs/assays/metadata/VisiumNoProbes.md
@@ -1,58 +1,36 @@
----
-layout: page
----
-# Visium-(no-probes)
-
-NOTE: Several versions of this metadata schema have been created over time. The (Latest) version contains most attributes, but there may be some deprecated attributes in the older versions for which data has been collected. HuBMAP is in the process of creating a reference which combines all of these versions into a single view. That reference will be available here once completed.
-
- Version3 (current)
-
-## Version 3
-
-| Attribute | Type | Description | Allowable Values | Required |
-|-------------------------------|-----------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------|------------|
-| preparation_protocol_doi | Textfield | DOI for the protocols.io page that describes the assay or sample procurment and preparation. For example for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | True |
-| dataset_type | Allowable Value | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium```| True |
-| contributors_path | Textfield | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | True |
-| data_path | Textfield | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." where as for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | True |
-| mapped_area_value | Numeric | For Visium, this is the area of spots that was covered by tissue within the captured area, not the total possible captured area which is fixed. For GeoMx this would be the area of the AOI being captured. For HiFi this is the summed area of the ROIs in a single flowcell lane. For CosMx, Xenium and Resolve, this is the area of the FOV (aka ROI) region being captured. | | True |
-| mapped_area_unit | Allowable Value | The unit of measurement for the mapping area. For Visium and GeoMx this is typically um^2. | ```um^2``` ```mm^2``` | True |
-| spot_size_value | Numeric | For assays where spots are used to define discrete capture areas, this is the area of a spot. | | True |
-| spot_size_unit | Allowable Value | The unit for spot size value. | ```um^2``` ```mm^2``` | True |
-| number_of_spots | Numeric | Number of capture spots within the mapped area. For Visium this would be the number of spots covered by tissue, while it's the number of spots within ROIs for HiFi. | | True |
-| spot_spacing_value | Numeric | Approximate center-to-center distance between capture spots. Synonyms: Inter-Spot distance, Spot resolution, Pit size | | True |
-| spot_spacing_unit | Allowable Value | Units corresponding to inter-spot distance | ```um``` | True |
-| capture_area_id | Allowable Value | Which capture area on the slide was used. For Visium this would be ```A1, B1, C1, D1```. For HiFi this would be the lane on the flowcell. | ```A1``` ```B1``` ```C1``` ```D1``` ```Lane 1``` ```Lane 2``` ```Lane 3``` ```Lane 4``` ```Lane 5``` ```Lane 6``` ```Lane 7``` ```Lane 8``` | True |
-| permeabilization_time_value | Numeric | Permeabilization time used for this tissue section. | | False |
-| permeabilization_time_unit | Allowable Value | The unit for the permeabilization time. | ```minute``` | False |
-| metadata_schema_id | Textfield | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | True |
-| parent_sample_id | Textfield | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | True |
-| preparation_instrument_vendor | Allowable Value | The manufacturer of the instrument used to prepare (staining/processing) the sample for the assay. If an automatic slide staining method was indicated this field should list the manufacturer of the instrument. | ```10x Genomics``` ```Hamamatsu``` ```HTX Technologies``` ```In-House``` ```Leica Biosystems``` ```Not applicable``` ```Roche Diagnostics``` ```SunChrom``` ```Thermo Fisher Scientific``` | False |
-| preparation_instrument_model | Allowable Value | Manufacturers of a staining system instrument may offer various versions (models) of that instrument with different features. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```AutoStainer XL``` ```Chromium Connect``` ```Chromium Controller``` ```Chromium iX``` ```Chromium X``` ```Discovery Ultra``` ```EVOS M7000``` ```M3+ Sprayer``` ```M5 Sprayer``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```Not applicable``` ```ST5020 Multistainer``` ```Sublimator``` ```SunCollect Sprayer``` ```TM-Sprayer``` ```Visium CytAssist ```| False |
-
-
-
- Version 2
-
-## Version 2
-
-| Attribute | Type | Description | Allowable Value | Required |
-|-----------------------------|----------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|------------|
-| preparation_protocol_doi | Textfield | DOI for the protocols.io page that describes the assay or sample procurment and preparation. For example for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | True |
-| dataset_type | Textfield | The specific type of dataset being produced. | | True |
-| contributors_path | Textfield | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | True |
-| data_path | Textfield | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." where as for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | True |
-| mapped_area_value | Numeric | For Visium, this is the area of spots that was covered by tissue within the captured area, not the total possible captured area which is fixed. For GeoMx this would be the area of the AOI being captured. For HiFi this is the summed area of the ROIs in a single flowcell lane. For CosMx, Xenium and Resolve, this is the area of the FOV (aka ROI) region being captured. | | True |
-| mapped_area_unit | Textfield | The unit of measurement for the mapping area. For Visium and GeoMx this is typically um^2. | | True |
-| spot_size_value | Numeric | For assays where spots are used to define discrete capture areas, this is the area of a spot. | | True |
-| spot_size_unit | Textfield | The unit for spot size value. | | True |
-| number_of_spots | Numeric | Number of capture spots within the mapped area. For Visium this would be the number of spots covered by tissue, while it's the number of spots within ROIs for HiFi. | | True |
-| spot_spacing_value | Numeric | Approximate center-to-center distance between capture spots. Synonyms: Inter-Spot distance, Spot resolution, Pit size | | True |
-| spot_spacing_unit | Textfield | Units corresponding to inter-spot distance | | True |
-| capture_area_id | Allowable Value | Which capture area on the slide was used. For Visium this would be [A1, B1, C1, D1]. For HiFi this would be the lane on the flowcell. | [A1, B1, C1, D1, Lane 1, Lane 2, Lane 3, Lane 4, Lane 5, Lane 6, Lane 7, Lane 8] | True |
-| permeabilization_time_value | Numeric | Permeabilization time used for this tissue section. | | False |
-| permeabilization_time_unit | Textfield | The unit for the permeabilization time. | | False |
-| metadata_schema_id | Textfield | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | True |
-| parent_sample_id | Textfield | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | True |
-
-
\ No newline at end of file
+---
+layout: page-triary
+---
+
+# Visium (no probes) Metadata Attributes
+
+Fields that are collected for Visium (no probes) data, available at ```Dataset.metadata.```
+
+
+* indicates a required field
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | |
+| lab_id | | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | |
+| preparation_protocol_doi * | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | ```https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1``` |
+| dataset_type * | | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium``` |
+| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | |
+| data_path | | The top level directory containing the raw and/or processed data. For a single dataset upload this might be ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | |
+| mapped_area_value | | For Visium, this is the area of spots that was covered by tissue within the captured area, not the total possible captured area which is fixed. For GeoMx this would be the area of the AOI being captured. For HiFi this is the summed area of the ROIs in a single flowcell lane. For CosMx, Xenium and Resolve, this is the area of the FOV (aka ROI) region being captured. | |
+| mapped_area_unit * | | The unit of measurement for the mapping area. For Visium and GeoMx this is typically um^2. | ```um^2``` ```mm^2``` |
+| spot_size_value | | For assays where spots are used to define discrete capture areas, this is the area of a spot. | |
+| spot_size_unit * | | The unit for spot size value. | ```um^2``` ```mm^2``` |
+| number_of_spots | | Number of capture spots within the mapped area. For Visium this would be the number of spots covered by tissue, while it's the number of spots within ROIs for HiFi. | |
+| spot_spacing_value | | Approximate center-to-center distance between capture spots. Synonyms: Inter-Spot distance, Spot resolution, Pit size | |
+| spot_spacing_unit * | | Units corresponding to inter-spot distance | ```um``` |
+| capture_area_id * | | Which capture area on the slide was used. For Visium this would be ```A1, B1, C1, D1```. For HiFi this would be the lane on the flowcell. | ```A1``` ```B1``` ```C1``` ```D1``` ```Lane 1``` ```Lane 2``` ```Lane 3``` ```Lane 4``` ```Lane 5``` ```Lane 6``` ```Lane 7``` ```Lane 8``` |
+| permeabilization_time_value | | Permeabilization time used for this tissue section. | |
+| permeabilization_time_unit * | | The unit for the permeabilization time. | ```minute``` |
+| preparation_instrument_vendor * | | The manufacturer of the instrument used to prepare (staining/processing) the sample for the assay. If an automatic slide staining method was indicated this field should list the manufacturer of the instrument. | ```10x Genomics``` ```Hamamatsu``` ```HTX Technologies``` ```In-House``` ```Leica Biosystems``` ```Not applicable``` ```Roche Diagnostics``` ```SunChrom``` ```Thermo Fisher Scientific``` |
+| preparation_instrument_model * | | Manufacturers of a staining system instrument may offer various versions (models) of that instrument with different features. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```AutoStainer XL``` ```Chromium Connect``` ```Chromium Controller``` ```Chromium iX``` ```Chromium X``` ```Discovery Ultra``` ```EVOS M7000``` ```M3+ Sprayer``` ```M5 Sprayer``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```Not applicable``` ```ST5020 Multistainer``` ```Sublimator``` ```SunCollect Sprayer``` ```TM-Sprayer``` ```Visium CytAssist``` |
+| non_global_files | | A semicolon separated list of non-shared files to be included in the dataset. The path assumes the files are located in the "TOP/non-global/" directory. For example, for the file TOP/non-global/lab_processed/images/1-tissue-boundary.geojson, the value of this field would be "./lab_processed/images/1-tissue-boundary.geojson". After ingest, these files will be copied to the appropriate locations within the respective dataset directory tree. This field is used for internal HuBMAP processing. Examples for GeoMx and PhenoCycler are provided in the File Locations documentation: https://docs.google.com/document/d/1n2McSs9geA9Eli4QWQaB3c9R3wo5d5U1Xd57DWQfN5Q/edit#heading=h.1u82i4axggee | |
+| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | |
+
+
+
\ No newline at end of file
diff --git a/docs/assays/metadata/index.md b/docs/assays/metadata/index.md
index db8c27f..2e4c5d2 100644
--- a/docs/assays/metadata/index.md
+++ b/docs/assays/metadata/index.md
@@ -1,6 +1,7 @@
---
layout: page
---
+
## HuBMAP Metadata by Dataset Type
A list of available dataset types (data types from multiple supported assays), with a link [](EnhancedSRS "Attribute description") to the valid metadata attributes for each dataset type. The linked assay metadata pages list all attributes, as they have occurred, across any versions of the metadata specification for the given dataset type with the most current, valid set of attributes listed first on the page. The directory schema for each dataset type is also linked in the description column.
@@ -9,8 +10,10 @@ A list of available dataset types (data types from multiple supported assays), w
| Dataset Type | Description |
|--------------|-------------|
| [4i](https://hubmapconsortium.github.io/ingest-validation-tools/4i/current/) [](4i "Attribute description")| 4i, or Iterative Indirect Immunofluorescence Imaging, is a high-resolution technique for imaging proteins within cells and tissues. |
+| 10X Multiome [](10XMultiome "Attribute description")| 10x Multiome (specifically Chromium Single Cell Multiome ATAC + Gene Expression) is a high-throughput, droplet-based, single-cell technique that simultaneously measures gene expression (RNA) and chromatin accessibility (ATAC) from the same individual nucleus.|
| [Autofluorescence (AF)](https://docs.hubmapconsortium.org/assays/af) [](AutoFluorescence "Attribute description")| Exploits endogenous fluorescence in biological tissue to capture an image. The image can be used to integrate other images from multiple modalities and align tissues within a 3D experiment. Link to [AF directory schema](https://hubmapconsortium.github.io/ingest-validation-tools/af/current/). |
| [ATACseq](https://docs.hubmapconsortium.org/assays/atacseq) [](ATACseq "Attribute description")| Assay for Transposase-Accessible Chromatin using sequencing (ATACseq) identifies accessible DNA regions, probing open chromatin with hyperactive mutant Tn5 Transposase that inserts sequencing adapters into open regions of the genome. Link to [ATACsec directory schema](https://hubmapconsortium.github.io/ingest-validation-tools/atacseq/current/). |
+| [Cell DIVE](https://docs.hubmapconsortium.org/assays/codex) [](CellDIVE "Attribute description")| Cell DIVE is a multiplexed imaging solution that enables researchers to visualize and analyze over 60 different biomarkers on a single tissue sample (FFPE). It uses an iterative workflow of staining, imaging, and signal inactivation to create high-resolution spatial maps of protein expression, crucial for studying cancer and complex tissue pathology. |
| [CODEX](https://docs.hubmapconsortium.org/assays/codex) [](CODEX "Attribute description")| Co-detection by indexing (CODEX) is a strategy for generating highly multiplexed images of fluorescently-labeled antigens. Link to [CODEX directory schema](https://hubmapconsortium.github.io/ingest-validation-tools/codex/current/). |
| COMET [](COMET "Attribute description") | COMET is a technique used to measure DNA damage in individual cells. The name comes from the shape that damaged DNA fragments form when they migrate out of a cell's nucleus under an electric field, resembling a comet with a head and a tail. This assay is widely used in genetics research to study DNA damage from factors like radiation, chemicals, and environmental exposure. |
| [CosMx Proteomics](https://hubmapconsortium.github.io/ingest-validation-tools/cosmx-proteomics/current/) [](CosMx Proteomics "Attribute description")| CosMx Proteomics is a technology that enables the high-resolution, spatial analysis of proteins within their native tissue environment. It is part of the CosMx Spatial Molecular Imager (SMI) platform, which provides single-cell and subcellular resolution to map protein expression, cell states, and cell-cell interactions in FFPE and fresh frozen tissue samples. |
diff --git a/docs/css/alt.css b/docs/css/alt.css
new file mode 100644
index 0000000..4717494
--- /dev/null
+++ b/docs/css/alt.css
@@ -0,0 +1,82 @@
+
+.altStyle div.c-container,
+.altStyle div.c-header__main{
+ max-width: 87%!important;
+}
+/* inline code badges inside table cells */
+.altStyle .requiredNote{
+ font-size: 0.75rem;
+ line-height: 1.36;
+ letter-spacing: 0.03333em;
+ color: #00000071;
+}
+
+.altStyle table code {
+ padding: 3px 4px;
+ margin: 2px 0;
+ display: inline-block;
+ box-shadow: 0 0 3px #A5A9BE;
+ border: 1px solid #444A65;
+ border-radius: 10px;
+ font-size: 0.65rem;
+ background: linear-gradient(180deg, #c4c8dc 0%, #b9bfda 100%) !important;
+ color: #444A65;
+ text-shadow: 0 0 2px #818597A7;
+}
+.altStyle table code:hover {
+ border-color: #444A65;
+ background: linear-gradient(180deg, #c4c8dc 0%, #C4CAE5 100%) !important;
+ cursor: default;
+}
+
+.altStyle table span.requiredMark {
+ color: #ff0000;
+}
+
+.altStyle table.deprecated span.requiredMark {
+ color: #00000061;
+}
+
+/* Description column styling (3rd column) */
+.altStyle table td:nth-child(3){
+ font-weight: 400;
+ font-size: 0.75rem;
+ line-height: 1.36;
+ letter-spacing: 0.03333em;
+ max-height: 150px;
+ overflow-y: auto;
+ border: 0;
+}
+
+.fa-circle-nodes{
+ transform: rotate(80deg);
+}
+/* codebox helpers */
+.codebox {
+ border: 1px solid black;
+ background-color: #EEEEFF;
+ width: 100%;
+ overflow: auto;
+ padding: 10px;
+ background-clip: content-box, padding-box;
+}
+.codebox br { display: block; margin: 0; line-height: 0; }
+.innercodebox { padding-left: 5px; padding-right: 5px; }
+
+/* Assay Metadata page Code & Table Styles */
+/* Allowable Values measured collapse/expand */
+.av-wrapper { overflow: hidden; transition: max-height 420ms cubic-bezier(0.2, 0, 0, 1), opacity 220ms ease; }
+.av-wrapper.collapsed { opacity: 1; }
+.av-content { word-break: break-word; }
+.av-content pre { white-space: pre; overflow: auto; }
+.av-toggle { display: inline-block; margin-top: 0.35rem; background: transparent; border: none; color: #0b62d1; cursor: pointer; font-size: 0.65rem; text-decoration: underline; padding: 0; font-weight: 600; font-family: 'Inter Variable', Helvetica, Arial, sans-serif; }
+.av-toggle:focus { outline: 2px solid #cfe; }
+
+/* Row Specific */
+.altStyle table th { text-align: left; }
+.altStyle table td:nth-child(1) { font-family: monospace; }
+.altStyle table td:nth-child(2) { text-align: center; color: #00000061; }
+.altStyle table td:nth-child(2) i:hover { color: #00000091; }
+
+/* Ensure table cells allow the wrapper to control overflow for Description and Allowable Values */
+.altStyle table td:nth-child(3), .altStyle table td:nth-child(4) { max-height: none; overflow: visible; }
\ No newline at end of file
diff --git a/docs/css/hm-code.css b/docs/css/hm-code.css
index 4acbfcd..002485a 100644
--- a/docs/css/hm-code.css
+++ b/docs/css/hm-code.css
@@ -26,29 +26,94 @@ pre code {
box-shadow: none
}
+
+/* codebox helpers */
.codebox {
- /* Below are styles for the codebox (not the code itself) */
border: 1px solid black;
background-color: #EEEEFF;
width: 100%;
overflow: auto;
- /* padding-left:12px;
- padding-right:12px;
- padding-top:10px;
- padding-bottom:10px; */
-
padding: 10px;
- background-image: linear-gradient(to bottom, rgb(255, 255, 255) 0%, rgb(255, 255, 255) 100%), linear-gradient(to bottom, rgb(186, 189, 193) 0%, rgb(186, 189, 193) 100%);
background-clip: content-box, padding-box;
}
+.codebox br { display: block; margin: 0; line-height: 0; }
+.innercodebox { padding-left: 5px; padding-right: 5px; }
-.codebox br {
- display: block;
- margin: 0px 0;
- line-height: 0px;
+
+
+.altStyle div.c-container,
+.altStyle div.c-header__main{
+ max-width: 87%!important;
+}
+
+.altStyle div.c-container div.c-navbar-brand{
+ top:auto;
+}
+
+/* inline code badges inside table cells */
+.altStyle .requiredNote{
+ font-size: 0.75rem;
+ line-height: 1.36;
+ letter-spacing: 0.03333em;
+ color: #00000071;
+}
+
+.altStyle table code {
+ padding: 3px 4px;
+ margin: 2px 0;
+ display: inline-block;
+ box-shadow: 0 0 3px #A5A9BE;
+ border: 1px solid #444A65;
+ border-radius: 10px;
+ font-size: 0.65rem;
+ background: linear-gradient(180deg, #c4c8dc 0%, #b9bfda 100%) !important;
+ color: #444A65;
+ text-shadow: 0 0 2px #818597A7;
}
+.altStyle table code:hover {
+ border-color: #444A65;
+ background: linear-gradient(180deg, #c4c8dc 0%, #C4CAE5 100%) !important;
+ cursor: default;
+}
+
+.altStyle table span.requiredMark {
+ color: #ff0000;
+}
+
+.altStyle table.deprecated span.requiredMark {
+ color: #00000061;
+}
+
+/* Description column styling (3rd column) */
+.altStyle table td:nth-child(3){
+ font-weight: 400;
+ font-size: 0.75rem;
+ line-height: 1.36;
+ letter-spacing: 0.03333em;
+ max-height: 150px;
+ overflow-y: auto;
+ border: 0;
+}
+
+.fa-circle-nodes{
+ transform: rotate(80deg);
+}
+/* Assay Metadata page Code & Table Styles */
+/* Allowable Values measured collapse/expand */
+.av-wrapper { overflow: hidden; transition: max-height 420ms cubic-bezier(0.2, 0, 0, 1), opacity 220ms ease; }
+.av-wrapper.collapsed { opacity: 1; }
+.av-content { word-break: break-word; }
+.av-content pre { white-space: pre; overflow: auto; }
+.av-toggle { display: inline-block; margin-top: 0.35rem; background: transparent; border: none; color: #0b62d1; cursor: pointer; font-size: 0.65rem; text-decoration: underline; padding: 0; font-weight: 600; font-family: 'Inter Variable', Helvetica, Arial, sans-serif; }
+.av-toggle:focus { outline: 2px solid #cfe; }
+
+.altStyle table { border: 2px solid #444a65; }
+.altStyle table .deprecated { border: 2px solid #818bb6; }
+/* Row Specific */
+.altStyle table th { text-align: left; background-color: #444a65!important; color: #ffffff; font-weight: 600; }
+.altStyle table td:nth-child(1) { font-family: monospace; }
+.altStyle table td:nth-child(2) { text-align: center; color: #00000061; }
+.altStyle table td:nth-child(2) i:hover { color: #00000091; }
-.innercodebox {
- padding-left: 5px;
- padding-right: 5px;
-}
\ No newline at end of file
+/* Ensure table cells allow the wrapper to control overflow for Description and Allowable Values */
+.altStyle table td:nth-child(3), .altStyle table td:nth-child(4) { max-height: none; overflow: visible; }
\ No newline at end of file
diff --git a/docs/css/releaseHighlight.css b/docs/css/releaseHighlight.css
new file mode 100644
index 0000000..aa9ae3d
--- /dev/null
+++ b/docs/css/releaseHighlight.css
@@ -0,0 +1,18 @@
+/* Styles for new-release highlighting moved out of JS into a dedicated CSS file */
+
+/* left accent on rows and padding for table rows used as context */
+table.currentMeta tr,
+.altStyle table tr {
+ border-left: 4px solid #444a65;
+}
+
+table.currentMeta td,
+.altStyle table td {
+ padding-left: .75rem;
+}
+
+.new-release-highlight {
+ background-color: #fff8e1;
+ border-left: 4px solid #ffc107;
+ transition: background-color .25s ease, border-color .25s ease;
+}
diff --git a/docs/js/alt-collapse.js b/docs/js/alt-collapse.js
new file mode 100644
index 0000000..336c0ef
--- /dev/null
+++ b/docs/js/alt-collapse.js
@@ -0,0 +1,128 @@
+/* Shared collapse/expand for Allowable Values and Description
+ - Wraps content in a measured container and adds a toggle
+ - Reusable for any table column index and max line count
+*/
+(function(){
+ 'use strict';
+
+ function setupCollapsible(colIndex, maxLines, showText, hideText) {
+ var tables = document.querySelectorAll('.altStyle table');
+ tables.forEach(function(table){
+ var rows = table.querySelectorAll('tr');
+ rows.forEach(function(row){
+ var tds = row.querySelectorAll('td');
+ if (!tds || tds.length <= colIndex) return;
+ var cell = tds[colIndex];
+ if (!cell) return;
+ if (cell.querySelector('.av-wrapper')) return; // already processed
+ var raw = cell.innerHTML.trim();
+ if (!raw) return;
+
+ // Create wrapper and content container
+ var wrapper = document.createElement('div');
+ wrapper.className = 'av-wrapper';
+ var content = document.createElement('div');
+ content.className = 'av-content';
+ content.innerHTML = raw;
+ wrapper.appendChild(content);
+ // Replace cell contents
+ cell.innerHTML = '';
+ cell.appendChild(wrapper);
+
+ // Measure line-height
+ var computed = window.getComputedStyle(content);
+ var lineHeight = parseFloat(computed.lineHeight);
+ if (!lineHeight || isNaN(lineHeight)) {
+ var span = document.createElement('span');
+ span.style.visibility = 'hidden';
+ span.style.whiteSpace = 'nowrap';
+ span.textContent = 'A';
+ content.appendChild(span);
+ lineHeight = span.getBoundingClientRect().height || 16;
+ content.removeChild(span);
+ }
+
+ var maxH = Math.round(lineHeight * maxLines);
+
+ // allow content to fully render to measure full height
+ wrapper.style.maxHeight = 'none';
+ var fullH = wrapper.scrollHeight;
+ if (fullH <= maxH + 4) {
+ // short content — no toggle needed
+ wrapper.style.maxHeight = 'none';
+ return;
+ }
+
+ // Collapse by default
+ wrapper.style.maxHeight = maxH + 'px';
+ wrapper.classList.add('collapsed');
+
+ // Create toggle button
+ var btn = document.createElement('button');
+ btn.className = 'av-toggle';
+ btn.setAttribute('aria-expanded', 'false');
+ btn.textContent = showText || 'Show more';
+ btn.addEventListener('click', function(){
+ var expanded = btn.getAttribute('aria-expanded') === 'true';
+ if (!expanded) {
+ // expand: animate to measured height, then clear to allow natural height
+ var full = wrapper.scrollHeight;
+ wrapper.style.maxHeight = full + 'px';
+ btn.setAttribute('aria-expanded','true');
+ btn.textContent = hideText || 'Show less';
+ // after the transition completes, remove the max-height limit
+ var onEnd = function(e) {
+ if (e.propertyName === 'max-height') {
+ wrapper.style.maxHeight = 'none';
+ wrapper.removeEventListener('transitionend', onEnd);
+ }
+ };
+ wrapper.addEventListener('transitionend', onEnd);
+ } else {
+ // collapse: ensure we start from a numeric height so the transition eases
+ var currentFull = wrapper.scrollHeight;
+ // If maxHeight is 'none' or unset, set it to the current full pixel height first
+ var computed = window.getComputedStyle(wrapper).maxHeight;
+ if (computed === 'none' || !computed) {
+ wrapper.style.maxHeight = currentFull + 'px';
+ // force a reflow so the browser recognizes the start height
+ wrapper.getBoundingClientRect();
+ }
+ // then animate down to collapsed height
+ wrapper.style.maxHeight = maxH + 'px';
+ btn.setAttribute('aria-expanded','false');
+ btn.textContent = showText || 'Show more';
+ }
+ });
+
+ cell.appendChild(btn);
+ });
+ });
+ }
+
+ function init() {
+ // Description: 3rd column (index 2) — 4 lines
+ setupCollapsible(2, 4, 'More...', 'Hide');
+ // Allowable Values: 4th column (index 3) — 3 lines (preserve previous behavior)
+ setupCollapsible(3, 3, 'More...', 'Less' );
+
+ // Add classes to the first and second tables on the page (if present).
+ try {
+ var allTables = document.querySelectorAll('table');
+ if (allTables && allTables.length >= 1) {
+ allTables[0].classList.add('currentMeta');
+ }
+ if (allTables && allTables.length >= 2) {
+ allTables[1].classList.add('deprecated');
+ }
+ } catch (e) {
+ // harmless if DOM isn't as expected
+ }
+ }
+
+ if (document.readyState === 'loading') {
+ document.addEventListener('DOMContentLoaded', init);
+ } else {
+ init();
+ }
+})();
diff --git a/docs/js/new-releases.js b/docs/js/new-releases.js
new file mode 100644
index 0000000..fd3971e
--- /dev/null
+++ b/docs/js/new-releases.js
@@ -0,0 +1,91 @@
+// Highlight newly released assays and provide Ctrl+ArrowUp,ArrowUp quick-toggle
+(function(){
+ 'use strict';
+ const newReleases = ["10XMultiome", "ATACseq", "AutoFluorescence", "CellDIVE", "CODEX", "DESI", "Histology", "IMC", "LC-MS", "LightSheet", "MALDI", "MIBI", "MUSIC", "RNAseq", "RNAseqWithProbes", "VisiumNoProbes"];
+
+ // Styles for new-release highlighting are now provided by /css/releaseHighlight.css
+
+ // Detect quick sequence: Ctrl + ArrowUp, ArrowUp
+ let seqCount = 0;
+ let lastTime = 0;
+ const SEQ_TIMEOUT = 800; // ms
+
+ function isAssayPage() {
+ return true
+ // return !!document.querySelector('table.currentMeta') || !!document.querySelector('.altStyle table');
+ }
+
+ function highlightMatches() {
+ console.debug('%c◉ highlighting ', 'color:#2158FF', );
+ document.querySelectorAll('.new-release-highlight').forEach(function(el){ el.classList.remove('new-release-highlight'); });
+ var tables = document.querySelectorAll('table.currentMeta, .altStyle table');
+ if (!tables || tables.length === 0) {
+ tables = document.querySelectorAll('.altStyle table');
+ }
+ if (!tables || tables.length === 0) {
+ tables = document.querySelectorAll('table');
+ }
+ console.debug('%c◉ new-releases: tables found', 'color:#2158FF', tables.length);
+ console.debug('%c◉ tablesfound ', 'color:#00ff7b', );
+ tables.forEach(function(table){
+ table.querySelectorAll('tr').forEach(function(row){
+ var links = row.querySelectorAll('a');
+ for (var i=0;i SEQ_TIMEOUT) seqCount = 0;
+ seqCount++;
+ lastTime = now;
+ if (seqCount >= 2){
+ if (isAssayPage()){
+ highlightMatches();
+ var first = document.querySelector('.new-release-highlight');
+ if (first) first.scrollIntoView({behavior: 'smooth', block: 'center'});
+ }
+ seqCount = 0;
+ }
+ } else {
+ seqCount = 0;
+ }
+ });
+
+ // Clear highlights on click or Escape
+ document.addEventListener('click', function(){
+ document.querySelectorAll('.new-release-highlight').forEach(function(el){ el.classList.remove('new-release-highlight'); });
+ });
+ document.addEventListener('keydown', function(e){ if (e.key === 'Escape') { document.querySelectorAll('.new-release-highlight').forEach(function(el){ el.classList.remove('new-release-highlight'); }); } });
+
+})();
diff --git a/scripts/newMeta2/generate_markdown_clean.py b/scripts/newMeta2/generate_markdown_clean.py
new file mode 100644
index 0000000..b38852e
--- /dev/null
+++ b/scripts/newMeta2/generate_markdown_clean.py
@@ -0,0 +1,823 @@
+#!/usr/bin/env python3
+"""Clean generator for assay JSON files (renamed from generate_json_clean).
+
+This file is the same generator but renamed to `generate_markdown_clean.py`.
+It produces per-assay JSON files and accompanying markdown in the
+configured output directories.
+"""
+import os
+import re
+import csv
+import json
+import glob
+import yaml
+import time
+from pathlib import Path
+import argparse
+import shutil
+
+# Config (paths from project)
+RSOURCE_DIR = '/home/birdie/Documents/hubmap/documentation/scripts/newMeta2/source/reharmonize-legacy-metadata/metadata'
+JSONLD_SOURCE = '/home/birdie/Documents/hubmap/documentation/scripts/newMeta2/source/dataset-metadata-spreadsheet'
+IVT_SOURCE = os.path.join(os.path.dirname(__file__), 'source', 'IVT_table_meta')
+OUTPUT_DIR = '/home/birdie/Documents/hubmap/documentation/scripts/newMeta2/JSON/'
+MARKDOWN_DIR = '/home/birdie/Documents/hubmap/documentation/scripts/newMeta2/markdown'
+
+os.makedirs(OUTPUT_DIR, exist_ok=True)
+os.makedirs(MARKDOWN_DIR, exist_ok=True)
+
+
+def _normalize_attr_key(s):
+ if not s:
+ return ''
+ return re.sub(r'\s+', ' ', str(s)).strip().lower()
+
+
+def normalize_type(t):
+ """Normalize type strings into human-friendly canonical values."""
+ if not t:
+ return ''
+ s = str(t).strip().lower()
+ mapping = {
+ 'controlled-term-field': 'Allowable value',
+ 'radio-field': 'Radio',
+ 'numeric-field': 'Numeric',
+ 'text-field': 'Textfield',
+ 'link-field': 'Textfield'
+ }
+ if s in mapping:
+ return mapping[s]
+ # Treat 'assigned value' (and variants) as allowable values
+ if s.replace('-', ' ') in ('assigned value', 'assignedvalue'):
+ return 'Allowable value'
+ if s.endswith('-field'):
+ base = s[:-6]
+ return base.capitalize()
+ return str(t).capitalize()
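
To make the normalization rules above easier to check, here is a minimal standalone sketch that restates them and runs a few inputs through. The name `normalize_type_sketch` and the sample inputs are illustrative assumptions, not part of the PR.

```python
# Illustrative restatement of the type-normalization rules; not the PR's function.
TYPE_LABELS = {
    "controlled-term-field": "Allowable value",
    "radio-field": "Radio",
    "numeric-field": "Numeric",
    "text-field": "Textfield",
    "link-field": "Textfield",
}


def normalize_type_sketch(raw):
    """Map a raw schema type string to a display label."""
    if not raw:
        return ""
    s = str(raw).strip().lower()
    if s in TYPE_LABELS:
        return TYPE_LABELS[s]
    # 'assigned value' variants are treated as allowable values
    if s.replace("-", " ") in ("assigned value", "assignedvalue"):
        return "Allowable value"
    # any other '<name>-field' becomes '<Name>'
    if s.endswith("-field"):
        return s[:-6].capitalize()
    return str(raw).capitalize()


# Hypothetical sample inputs
samples = ["controlled-term-field", "Assigned-Value", "date-field", "  text-field ", None]
labels = [normalize_type_sketch(s) for s in samples]
```

With these inputs, `labels` comes out as `["Allowable value", "Allowable value", "Date", "Textfield", ""]`.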
+
+
+def _determine_type(item):
+ """Decide the final `type` for an item.
+
+ If allowable values are exclusively Yes/No (case-insensitive), prefer `Radio`.
+ Otherwise normalize the existing type string.
+ """
+ t = item.get('type') or ''
+ allowable = item.get('allowable values') or []
+ if isinstance(allowable, (list, tuple)) and allowable:
+ low = set([str(x).strip().lower() for x in allowable if x is not None])
+ if low and (low.issubset({'yes', 'no'}) or low.issubset({'true', 'false'})):
+ return 'Radio'
+ return normalize_type(t)
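
The Yes/No detection above can be isolated into a small predicate. The following is a sketch under the assumption that only the subset check matters; `is_boolean_choice` is a hypothetical helper name.

```python
def is_boolean_choice(values):
    """Return True when every non-None value is yes/no or true/false (case-insensitive)."""
    low = {str(v).strip().lower() for v in values if v is not None}
    return bool(low) and (low <= {"yes", "no"} or low <= {"true", "false"})


checks = {
    "yes_no": is_boolean_choice(["Yes", "No"]),
    "single": is_boolean_choice(["Yes"]),            # a subset still counts
    "mixed": is_boolean_choice(["yes", "true"]),     # spans both sets, so it does not count
    "extra": is_boolean_choice(["Yes", "No", "Unknown"]),
    "empty": is_boolean_choice([]),
}
```

Note that a single-value list such as `["Yes"]` also qualifies, because the check is a subset test rather than an equality test.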
+
+
+def _format_allowable_values(vals):
+ """Format allowable values for markdown output.
+
+ Wrap each value in triple backticks and join with spaces (no commas).
+ """
+ if not isinstance(vals, (list, tuple)):
+ return ''
+ wrapped = [f"```{v}```" for v in vals]
+ return ' '.join(wrapped)
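
In `render_rows`, the formatted allowable values are written into the table cell without passing through `esc`, so a value containing `|` could split the row. The sketch below shows one possible hardening. It assumes backslash-escaped pipes are acceptable inside the backticked spans, and the helper names are hypothetical.

```python
FENCE = "`" * 3  # the triple-backtick wrapper used by the generator


def escape_pipes(text):
    """Escape table-cell delimiters so a value cannot end the cell early."""
    return str(text).replace("|", "\\|")


def format_allowable_values_sketch(vals):
    """Wrap each value in triple backticks, escaping pipes, joined by spaces."""
    if not isinstance(vals, (list, tuple)):
        return ""
    return " ".join(FENCE + escape_pipes(v) + FENCE for v in vals)


plain = format_allowable_values_sketch(["Yes", "No"])
piped = format_allowable_values_sketch(["a|b"])
not_a_list = format_allowable_values_sketch("Yes")
```

A non-list input still yields an empty string, matching the behavior of the function above.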
+
+
+def generate_markdown_from_json(assay, section, out_dir=MARKDOWN_DIR):
+ """Generate a markdown file for an assay section containing `main` and `deprecated`.
+
+ Produces `<out_dir>/<safe_name>.md`.
+ """
+ # Use AssayName (from section) for filename and title when available
+ assay_name = None
+ if isinstance(section, dict):
+ assay_name = section.get('AssayName')
+ # Replace spaces with dashes for the filename (keep title unchanged)
+ safe_name = re.sub(r"\s+", '-', (assay_name or assay))
+ fname = f"{safe_name}.md"
+ path = os.path.join(out_dir, fname)
+
+ # Build table body strings for template replacement
+ def esc(s):
+ return str(s).replace('|', '\\|')
+
+ def render_rows(rows, deprecated=False):
+ out = []
+ for it in rows:
+ term = it.get('attribute') or ''
+ try:
+ if it.get('required'):
+ color = '#00000061' if deprecated else 'red'
+ term = f"{term} *"
+ except Exception:
+ pass
+ typ = it.get('type') or ''
+ if typ is None or (isinstance(typ, str) and typ.strip() == ''):
+ typ = _determine_type(it)
+ desc = it.get('description') or ''
+ av = it.get('allowable values') or []
+ avs = _format_allowable_values(av) if av else ''
+ out.append(f"| {esc(term)} | {esc(typ)} | {esc(desc)} | {avs} |\n")
+ return ''.join(out)
+
+ main_rows = render_rows(section.get('main', []) if isinstance(section, dict) else [])
+ classic_rows = render_rows(section.get('deprecated', []) if isinstance(section, dict) else [], deprecated=True)
+
+ # Load template and substitute
+ tpl_path = os.path.join(os.path.dirname(__file__), 'Template.md')
+ if os.path.exists(tpl_path):
+ try:
+ tpl = open(tpl_path, 'r', encoding='utf-8').read()
+ title = assay_name or assay
+ content = tpl.replace('{AssayNameHere}', title).replace('{Main Table}', main_rows).replace('{Classic Table}', classic_rows)
+ with open(path, 'w', encoding='utf-8') as mf:
+ mf.write(content)
+ return path
+ except Exception:
+ pass
+
+ # Fallback to previous simple writer if template missing
+ lines = []
+ lines.append(f"# {assay_name or assay}\n\n")
+ lines.append("| Attribute | Type | Description | Allowable Values |\n")
+ lines.append("|---|---|---|---|\n")
+ lines.append(main_rows)
+ if classic_rows:
+ lines.append('\n')
+ lines.append('## Deprecated / Classic Attributes\n\n')
+ lines.append("| Attribute | Type | Description | Allowable Values |\n")
+ lines.append("|---|---|---|---|\n")
+ lines.append(classic_rows)
+ with open(path, 'w', encoding='utf-8') as mf:
+ mf.writelines(lines)
+ return path
+
+
+def get_version_folders(assay_path):
+ if not os.path.isdir(assay_path):
+ return []
+ folders = [f for f in os.listdir(assay_path) if os.path.isdir(os.path.join(assay_path, f))]
+ latest = [f for f in folders if f == 'latest']
+ versions = [f for f in folders if re.match(r'^v?\d+(?:\.\d+)*$', f)]
+ versions_sorted = sorted(versions, key=lambda v: list(map(int, re.findall(r'\d+', v))), reverse=True)
+ return latest + versions_sorted
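
The ordering produced by `get_version_folders` is easier to see on plain names than on a directory listing. This sketch applies the same pattern and sort key to a hypothetical list of folder names, with the filesystem access left out.

```python
import re

VERSION_RE = re.compile(r"^v?\d+(?:\.\d+)*$")


def sort_version_names(names):
    """Put 'latest' first, then version-like names from newest to oldest."""
    latest = [n for n in names if n == "latest"]
    versions = [n for n in names if VERSION_RE.match(n)]
    # Compare numerically so that v10 sorts above v2
    versions.sort(key=lambda v: [int(p) for p in re.findall(r"\d+", v)], reverse=True)
    return latest + versions


ordered = sort_version_names(["v2", "v10", "latest", "v2.1", "draft", "1.0"])
```

Here `ordered` is `["latest", "v10", "v2.1", "v2", "1.0"]`. Names that match neither rule, such as `draft`, are dropped.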
+
+
+def process_fmdata(csv_path):
+ """Return (main_list, deprecated_list) with canonical fields populated.
+ A row whose leftmost column is non-empty goes to main; otherwise the
+ rightmost non-empty cell in that row is recorded as deprecated.
+ """
+ main = []
+ deprecated = []
+ with open(csv_path, newline='', encoding='utf-8') as cf:
+ reader = csv.DictReader(cf)
+ cols = reader.fieldnames
+ if not cols:
+ return main, deprecated
+ left_col = cols[0]
+ for row in reader:
+ left_val = (row.get(left_col) or '').strip()
+ if left_val:
+ obj = {
+ 'attribute': left_val,
+ 'type': '',
+ 'description': '',
+ 'allowable values': [],
+ 'required': False,
+ }
+ main.append(obj)
+ continue
+ # rightmost non-empty
+ found = None
+ for c in reversed(cols):
+ if c == left_col:
+ continue
+ v = (row.get(c) or '').strip()
+ if v:
+ found = v
+ break
+ if found:
+ obj = {
+ 'attribute': found,
+ 'type': '',
+ 'description': '',
+ 'allowable values': [],
+ 'required': False,
+ }
+ deprecated.append(obj)
+ return main, deprecated
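
The main/deprecated split can be tried on an in-memory CSV. This is a simplified sketch that returns only attribute names. The column headers and field names in `sample` are made up for illustration.

```python
import csv
import io


def split_main_and_deprecated(csv_text):
    """Return (main, deprecated) attribute names using the leftmost/rightmost rule."""
    reader = csv.DictReader(io.StringIO(csv_text))
    cols = reader.fieldnames or []
    main, deprecated = [], []
    if not cols:
        return main, deprecated
    left = cols[0]
    for row in reader:
        value = (row.get(left) or "").strip()
        if value:
            main.append(value)
            continue
        # Leftmost cell is empty: take the rightmost non-empty cell instead
        for col in reversed(cols[1:]):
            legacy = (row.get(col) or "").strip()
            if legacy:
                deprecated.append(legacy)
                break
    return main, deprecated


# Hypothetical field-mapping CSV: newest schema on the left, older ones to the right
sample = (
    "current,v1,v0\n"
    "source_storage_duration_value,source_storage_time,\n"
    ",,old_field\n"
    ",legacy_a,legacy_b\n"
)
main, deprecated = split_main_and_deprecated(sample)
```

For this input, `main` is `["source_storage_duration_value"]` and `deprecated` is `["old_field", "legacy_b"]`.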
+
+
+def load_glossary_for_assay(assay):
+ mapping = {}
+ gdir = os.path.join(RSOURCE_DIR, assay, 'glossary')
+ if not os.path.isdir(gdir):
+ return mapping
+ for fn in os.listdir(gdir):
+ if not fn.lower().endswith('.csv'):
+ continue
+ path = os.path.join(gdir, fn)
+ try:
+ with open(path, newline='', encoding='utf-8') as cf:
+ r = csv.reader(cf)
+ rows = list(r)
+ if not rows:
+ continue
+ header = rows[0]
+ if len(header) > 1 and any(h.strip() for h in header):
+ cf.seek(0)
+ for d in csv.DictReader(cf):
+ keys = list(d.keys())
+ if not keys:
+ continue
+ name_key = None
+ for candidate in ('attribute', 'term', 'name', 'label'):
+ for k in keys:
+ if k.lower() == candidate:
+ name_key = k
+ break
+ if name_key:
+ break
+ if not name_key:
+ name_key = keys[0]
+ desc_key = None
+ for candidate in ('description', 'def', 'definition', 'term_definition'):
+ for k in keys:
+ if k.lower() == candidate:
+ desc_key = k
+ break
+ if desc_key:
+ break
+ term = (d.get(name_key) or '').strip()
+ desc = (d.get(desc_key) or '').strip() if desc_key else ''
+ if term and desc:
+ mapping[_normalize_attr_key(term)] = desc
+ else:
+ for row in rows:
+ if not row:
+ continue
+ term = row[0].strip() if len(row) > 0 else ''
+ desc = row[1].strip() if len(row) > 1 else ''
+ if term and desc:
+ mapping[_normalize_attr_key(term)] = desc
+ except Exception:
+ continue
+ return mapping
+
+
+def process_assay_yaml(yaml_path):
+ try:
+ with open(yaml_path, 'r', encoding='utf-8') as yf:
+ data = yaml.safe_load(yf) or {}
+ except Exception:
+ return {}
+ res = {}
+ children = data.get('children', []) if isinstance(data, dict) else []
+ def _value_label(v):
+ if not isinstance(v, dict):
+ return None
+ return v.get('termLabel') or v.get('label') or v.get('prefLabel') or v.get('name')
+ for child in children:
+ if not isinstance(child, dict):
+ continue
+ attr = child.get('key')
+ if not attr:
+ continue
+ typ = child.get('type') or ''
+ desc = child.get('description') or ''
+ config = child.get('configuration') or {}
+ required = bool(config.get('required', False))
+ allowable = []
+ if typ == 'controlled-term-field':
+ vals = child.get('values') or []
+ labels = set()
+ for v in vals:
+ lbl = _value_label(v)
+ if lbl:
+ labels.add(lbl)
+ allowable = sorted(labels)
+ res[_normalize_attr_key(attr)] = {
+ 'attribute': attr,
+ 'type': typ,
+ 'description': desc,
+ 'allowable values': allowable,
+ 'required': required,
+ }
+ return res
+
+
+def load_yaml_map_for_assay(assay):
+ assay_path = os.path.join(JSONLD_SOURCE, assay)
+ if not os.path.isdir(assay_path):
+ return {}
+ folders = get_version_folders(assay_path)
+ merged = {}
+ for v in reversed(folders):
+ vp = os.path.join(assay_path, v)
+ if not os.path.isdir(vp):
+ continue
+ for f in os.listdir(vp):
+ if f.endswith('.yml') or f.endswith('.yaml'):
+ path = os.path.join(vp, f)
+ m = process_assay_yaml(path)
+ # `item` avoids shadowing the outer folder loop variable `v`
+ for k, item in m.items():
+ existing = merged.get(k, {})
+ merged[k] = {**existing, **{kk: vv for kk, vv in item.items() if vv not in (None, '', [], {})}}
+ return merged
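
Because the folders are walked in reverse (oldest version first, `latest` last), later non-empty values overwrite earlier ones. The sketch below isolates that merge rule. `merge_non_empty` and the sample records are hypothetical.

```python
EMPTY = (None, "", [], {})


def merge_non_empty(existing, incoming):
    """Overlay incoming values on existing ones, skipping empty placeholders."""
    return {**existing, **{k: v for k, v in incoming.items() if v not in EMPTY}}


older = {"type": "text-field", "description": "Old text", "required": True}
newer = {"type": "", "description": "New text", "required": False}
merged = merge_non_empty(older, newer)
```

One consequence worth noting: `False` is not treated as empty, so a newer `required: False` replaces an older `required: True`, while the blank `type` leaves the older value in place.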
+
+
+def load_jsonld_map_for_assay(assay):
+ assay_path = os.path.join(JSONLD_SOURCE, assay)
+ if not os.path.isdir(assay_path):
+ return {}
+ folders = get_version_folders(assay_path)
+ merged = {}
+ for v in reversed(folders):
+ vp = os.path.join(assay_path, v)
+ if not os.path.isdir(vp):
+ continue
+ for f in os.listdir(vp):
+ if not f.endswith('.jsonld'):
+ continue
+ path = os.path.join(vp, f)
+ try:
+ with open(path, 'r', encoding='utf-8') as jf:
+ jd = json.load(jf)
+ except Exception:
+ continue
+ ui_pd = jd.get('_ui', {}).get('propertyDescriptions', {}) if isinstance(jd.get('_ui', {}), dict) else {}
+ props = jd.get('properties', {}) if isinstance(jd.get('properties', {}), dict) else {}
+ for pname, pval in props.items():
+ if not pname:
+ continue
+ key = _normalize_attr_key(pname)
+ entry = merged.get(key, {})
+ desc = ui_pd.get(pname)
+ if desc and not entry.get('description'):
+ entry['description'] = desc
+ t = None
+ if isinstance(pval.get('_ui', {}), dict):
+ t = pval.get('_ui', {}).get('inputType')
+ if t and not entry.get('type'):
+ entry['type'] = t
+ vc = pval.get('_valueConstraints', {}) or {}
+ branches = vc.get('branches') or []
+ vals = []
+ for b in branches:
+ if isinstance(b, dict) and 'label' in b:
+ vals.append(b.get('label'))
+ if vals and not entry.get('allowable values'):
+ entry['allowable values'] = vals
+ if not entry.get('attribute'):
+ entry['attribute'] = pname
+ merged[key] = entry
+ return merged
+
+
+def build_ivt_maps():
+ assay_map = {}
+ includes_map = {}
+ if not os.path.isdir(IVT_SOURCE):
+ return assay_map, includes_map
+ for root, dirs, files in os.walk(IVT_SOURCE):
+ for fn in files:
+ if not fn.endswith('.yaml') and not fn.endswith('.yml'):
+ continue
+ path = os.path.join(root, fn)
+ if 'includes' in Path(path).parts:
+ try:
+ with open(path, 'r', encoding='utf-8') as yf:
+ docs = list(yaml.safe_load_all(yf))
+ except Exception:
+ continue
+ for doc in docs:
+ if isinstance(doc, dict):
+ name = doc.get('name') or doc.get('attribute') or doc.get('key')
+ desc = doc.get('description') or doc.get('def') or ''
+ typ = doc.get('type') or ''
+ if name:
+ includes_map[_normalize_attr_key(name)] = {'description': desc, 'type': typ}
+ continue
+ m = re.match(r'^([a-zA-Z0-9_\-]+)-', fn)
+ if not m:
+ continue
+ assay_name = m.group(1)
+ try:
+ with open(path, 'r', encoding='utf-8') as yf:
+ docs = list(yaml.safe_load_all(yf))
+ except Exception:
+ continue
+ merged = assay_map.setdefault(assay_name, {})
+ for doc in docs:
+ if isinstance(doc, list):
+ for d in doc:
+ if isinstance(d, dict):
+ n = d.get('name') or d.get('attribute') or d.get('key')
+ if not n:
+ continue
+ k = _normalize_attr_key(n)
+ merged.setdefault(k, {})
+ if d.get('description') and not merged[k].get('description'):
+ merged[k]['description'] = d.get('description')
+ if d.get('type') and not merged[k].get('type'):
+ merged[k]['type'] = d.get('type')
+ elif isinstance(doc, dict):
+ n = doc.get('name') or doc.get('attribute') or doc.get('key')
+ if not n:
+ continue
+ k = _normalize_attr_key(n)
+ merged.setdefault(k, {})
+ if doc.get('description') and not merged[k].get('description'):
+ merged[k]['description'] = doc.get('description')
+ if doc.get('type') and not merged[k].get('type'):
+ merged[k]['type'] = doc.get('type')
+ assay_map[assay_name] = merged
+ return assay_map, includes_map
+
+
+def load_docs_index(docs_base):
+ """Build a lightweight docs index mapping normalized attribute -> (type, description).
+ Scans markdown and HTML files under `docs_base` for table rows containing attribute names.
+ """
+ mapping = {}
+ if not os.path.isdir(docs_base):
+ return mapping
+ for root, dirs, files in os.walk(docs_base):
+ for fn in files:
+ if not (fn.endswith('.md') or fn.endswith('.markdown') or fn.endswith('.html')):
+ continue
+ path = os.path.join(root, fn)
+ try:
+ text = open(path, 'r', encoding='utf-8').read()
+ except Exception:
+ continue
+ # Simple markdown table rows: | attr | Type | Description | Allowable Values | Required |
+ for line in text.splitlines():
+ if not line.strip().startswith('|'):
+ continue
+ parts = [p.strip() for p in line.split('|')[1:-1]]
+ if len(parts) < 2:
+ continue
+ attr = parts[0]
+ typ = parts[1]
+ if not attr:
+ continue
+ key = _normalize_attr_key(attr)
+ # only record non-blank types
+ if typ and typ.lower() not in ('', '-', 'none'):
+ mapping.setdefault(key, {})
+ mapping[key].setdefault('type', typ)
+ # description if available
+ if len(parts) > 2 and parts[2].strip():
+ mapping[key].setdefault('description', parts[2].strip())
+ # attempt to parse allowable values presented like "[a, b, c]"
+ # search the type, description, and allowable-values columns for bracketed lists
+ candidate_text = ' '.join(p for p in (typ, parts[2] if len(parts) > 2 else '', parts[3] if len(parts) > 3 else '') if p)
+ # 1) Python/list-like brackets: [a, b, c] or ['a','b']
+ mvals = re.search(r"\[([^\]]+)\]", candidate_text)
+ vals = []
+ if mvals:
+ raw = mvals.group(1)
+ for v in raw.split(','):
+ vv = v.strip().strip('\"\'')
+ if vv:
+ vals.append(vv)
+ # 2) backticked tokens like ```nm``` or `nm`
+ bt = re.findall(r"`{1,3}([^`]+)`{1,3}", candidate_text)
+ for b in bt:
+ bb = b.strip()
+ if bb and bb not in vals:
+ vals.append(bb)
+ if vals:
+ mapping[key].setdefault('allowable values', vals)
+ # parse Required column if present (e.g. 'True', 'Yes') in column 5 (index 4)
+ if len(parts) > 4 and parts[4].strip():
+ raw_req = parts[4].strip().lower()
+ if raw_req in ('true', 'yes', 'required', 'y'):
+ mapping[key]['required'] = True
+ # HTML tables: look for <td>attr</td> followed by <td>type</td>
+ try:
+ import re as _re
+ for m in _re.finditer(r'<td[^>]*>([^<]+)</td>\s*<td[^>]*>([^<]+)</td>\s*(?:<td[^>]*>([^<]+)</td>)?\s*(?:<td[^>]*>([^<]+)</td>)?', text, _re.IGNORECASE):
+ attr = m.group(1).strip()
+ typ = m.group(2).strip()
+ # optional third and fourth cells (description/allowable values and required)
+ third = m.group(3).strip() if m.lastindex and m.lastindex >= 3 and m.group(3) else ''
+ fourth = m.group(4).strip() if m.lastindex and m.lastindex >= 4 and m.group(4) else ''
+ if not attr:
+ continue
+ key = _normalize_attr_key(attr)
+ if typ and typ.lower() not in ('', '-', 'none'):
+ mapping.setdefault(key, {})
+ mapping[key].setdefault('type', typ)
+ # try to parse allowable values from the type cell
+ vals = []
+ mvals = re.search(r"\[([^\]]+)\]", ' '.join((typ, third)))
+ if mvals:
+ raw = mvals.group(1)
+ for v in raw.split(','):
+ vv = v.strip().strip('"\'')
+ if vv:
+ vals.append(vv)
+ # backticked tokens (search both type and third column)
+ bt = re.findall(r"`{1,3}([^`]+)`{1,3}", ' '.join((typ, third)))
+ for b in bt:
+ bb = b.strip()
+ if bb and bb not in vals:
+ vals.append(bb)
+ if vals:
+ mapping[key].setdefault('allowable values', vals)
+ # parse required from fourth column when present (only set True)
+ if fourth:
+ r = fourth.strip().lower()
+ if r in ('true', 'yes', 'required', 'y'):
+ mapping[key]['required'] = True
+ except Exception:
+ pass
+ return mapping
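
The markdown branch of `load_docs_index` boils down to splitting a table row on pipes and pulling backticked tokens out of the cells. Here is a minimal sketch of that idea on a single row. `parse_table_row` and the sample row are illustrative assumptions.

```python
import re

BACKTICKED = re.compile(r"`{1,3}([^`]+)`{1,3}")


def parse_table_row(line):
    """Parse '| attr | type | description | values | required |' into a dict."""
    if not line.strip().startswith("|"):
        return None
    parts = [p.strip() for p in line.split("|")[1:-1]]
    if len(parts) < 2 or not parts[0]:
        return None
    # Look for backticked tokens in the type, description, and values cells
    text = " ".join(parts[1:4])
    values = [tok.strip() for tok in BACKTICKED.findall(text) if tok.strip()]
    required = len(parts) > 4 and parts[4].lower() in ("true", "yes", "required", "y")
    return {
        "attribute": parts[0],
        "type": parts[1],
        "allowable values": values,
        "required": required,
    }


# Hypothetical row in the five-column layout
row = parse_table_row("| is_targeted | Radio | Whether the assay is targeted. | `Yes` `No` | True |")
```

A plain split on `|` also splits cells that contain the escaped form `\|` written by `esc` in the generator, so rows with escaped pipes may be indexed with shifted columns. Handling that case would need a split that respects the backslash.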
+
+
+def load_assay_display_name_from_docs(assay, docs_base):
+ """Find transformation-summary.html under docs_base for the assay and extract the prefix
+ from the <title> tag up to ' - Metadata Transformation Summary'. Returns None if not found.
+ """
+ if not os.path.isdir(docs_base):
+ return None
+ # Prefer file under docs_base/<assay>/transformation-summary.html
+ candidate = os.path.join(docs_base, assay, 'transformation-summary.html')
+ if os.path.exists(candidate):
+ try:
+ txt = open(candidate, 'r', encoding='utf-8').read()
+ m = re.search(r'<title>\s*(.*?)\s*-\s*Metadata Transformation Summary\s*</title>', txt, re.IGNORECASE)
+ if m:
+ return m.group(1).strip()
+ except Exception:
+ pass
+ # fallback: search any transformation-summary.html under docs_base
+ first_match = None
+ for root, dirs, files in os.walk(docs_base):
+ for fn in files:
+ if fn != 'transformation-summary.html':
+ continue
+ path = os.path.join(root, fn)
+ try:
+ txt = open(path, 'r', encoding='utf-8').read()
+ except Exception:
+ continue
+ m = re.search(r'<title>\s*(.*?)\s*-\s*Metadata Transformation Summary\s*</title>', txt, re.IGNORECASE)
+ if m:
+ # prefer files whose path contains assay name
+ if assay in path:
+ return m.group(1).strip()
+ # otherwise remember the first match and keep scanning for a better one
+ if first_match is None:
+ first_match = m.group(1).strip()
+ # after the walk: no path contained the assay name, so use the first match if any
+ if first_match is not None:
+ return first_match
+
+ # Additional fallback: look for markdown/html files with H1/title and try to match by filename or title
+ for root, dirs, files in os.walk(docs_base):
+ for fn in files:
+ if not (fn.endswith('.md') or fn.endswith('.markdown') or fn.endswith('.html')):
+ continue
+ path = os.path.join(root, fn)
+ try:
+ text = open(path, 'r', encoding='utf-8').read()
+ except Exception:
+ continue
+ # markdown H1
+ m = re.search(r'^\s*#\s*(.+)$', text, re.MULTILINE)
+ title = None
+ if m:
+ title = m.group(1).strip()
+ else:
+ m2 = re.search(r'<title>\s*(.*?)\s*</title>', text, re.IGNORECASE)
+ if m2:
+ title = m2.group(1).strip()
+ if not title:
+ continue
+ bn = os.path.splitext(fn)[0].lower()
+ # normalize strings for matching
+ n_assay = str(assay).lower()
+ n_title = title.lower()
+ # match by filename containing assay or vice-versa, or title containing assay
+ if n_assay in bn or bn in n_assay or n_assay in n_title or n_title in n_assay:
+ return title
+
+ return None
+
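The display-name lookup can be illustrated without touching the filesystem. The sketch below assumes the pattern is anchored on the page's `<title>` element, as the docstring describes; `display_name_from_html` and `pick_display_name` are hypothetical names, and the `(path, html)` pairs stand in for files found by `os.walk`.

```python
import re

# Assumed pattern: "<display name> - Metadata Transformation Summary" inside <title>.
TITLE_RE = re.compile(
    r'<title>\s*(.*?)\s*-\s*Metadata Transformation Summary\s*</title>',
    re.IGNORECASE,
)


def display_name_from_html(html):
    """Return the title prefix, or None when the page has no matching title."""
    m = TITLE_RE.search(html)
    return m.group(1).strip() if m else None


def pick_display_name(assay, pages):
    """Prefer a page whose path contains the assay name; else use the first match."""
    first = None
    for path, html in pages:
        name = display_name_from_html(html)
        if name is None:
            continue
        if assay in path:
            return name
        if first is None:
            first = name
    return first
```

The lazy `(.*?)` group means a hyphen inside the name itself (for example `SNARE-seq2`) stays part of the captured prefix, since only the hyphen directly before `Metadata Transformation Summary` can complete the match.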
+
+def generate_all(use_ivt=False, use_jsonld=False):
+ start = time.time()
+ assays = [d for d in os.listdir(RSOURCE_DIR) if os.path.isdir(os.path.join(RSOURCE_DIR, d))]
+ if use_ivt:
+ ivt_assay_map, ivt_includes_map = build_ivt_maps()
+ else:
+ ivt_assay_map, ivt_includes_map = {}, {}
+ docs_base = os.path.join(os.path.dirname(os.path.dirname(os.path.dirname(__file__))), 'docs', 'assays', 'metadata')
+ docs_index = load_docs_index(docs_base)
+ for assay in assays:
+ fm_pattern = os.path.join(RSOURCE_DIR, assay, '*-field-mappings.csv')
+ files = glob.glob(fm_pattern)
+ if not files:
+ continue
+ main_all = []
+ deprecated_all = []
+ for f in files:
+ m, d = process_fmdata(f)
+ main_all.extend(m)
+ deprecated_all.extend(d)
+
+ glossary = load_glossary_for_assay(assay)
+ yaml_map = load_yaml_map_for_assay(assay) if use_jsonld else {}
+ jsonld_map = load_jsonld_map_for_assay(assay) if use_jsonld else {}
+ ivt_assay_entries = ivt_assay_map.get(assay) or {}
+
+ def fill_item(item):
+ key = _normalize_attr_key(item.get('attribute'))
+ if not item.get('description') and glossary.get(key):
+ item['description'] = glossary.get(key)
+ y = yaml_map.get(key) or {}
+ if y:
+ # Prefer filling values from YAML/JSON-LD (y contains processed YAML entries)
+ if y.get('type'):
+ item['type'] = item.get('type') or y.get('type')
+ if y.get('description'):
+ item['description'] = item.get('description') or y.get('description')
+ if y.get('allowable values'):
+ if not item.get('allowable values'):
+ item['allowable values'] = y.get('allowable values')
+ # If YAML explicitly provides a required flag, respect True only
+ if y.get('required'):
+ item['required'] = True
+ return
+ j = jsonld_map.get(key) or {}
+ if j:
+ if j.get('type'):
+ item['type'] = item.get('type') or j.get('type')
+ if j.get('description'):
+ item['description'] = item.get('description') or j.get('description')
+ if j.get('allowable values'):
+ if not item.get('allowable values'):
+ item['allowable values'] = j.get('allowable values')
+ # Respect explicit required flag from JSON-LD when present (only set True)
+ if j.get('required'):
+ item['required'] = True
+ return
+ # Docs (added search path)
+ d = docs_index.get(key) or {}
+ if d:
+ if d.get('type'):
+ item['type'] = item.get('type') or d.get('type')
+ # If docs provide allowable values (e.g. [a, b, c]) and item lacks them, use them
+ if d.get('allowable values'):
+ if not item.get('allowable values'):
+ item['allowable values'] = d.get('allowable values')
+ if d.get('description'):
+ item['description'] = item.get('description') or d.get('description')
+ # If docs explicitly indicate requiredness, respect True only
+ if d.get('required'):
+ item['required'] = True
+ return
+ ivt = ivt_assay_entries.get(key) or {}
+ if ivt:
+ if ivt.get('description'):
+ item['description'] = item.get('description') or ivt.get('description')
+ if ivt.get('type'):
+ item['type'] = item.get('type') or ivt.get('type')
+ if ivt.get('required'):
+ item['required'] = True
+ return
+ inc = ivt_includes_map.get(key) or {}
+ if inc:
+ if inc.get('description'):
+ item['description'] = item.get('description') or inc.get('description')
+ if inc.get('type'):
+ item['type'] = item.get('type') or inc.get('type')
+ if inc.get('required'):
+ item['required'] = True
+
+
+ for itm in main_all + deprecated_all:
+ fill_item(itm)
+
+ # Normalize/finalize types for all items (detect radios from allowable values,
+ # and canonicalize type strings).
+ for itm in main_all + deprecated_all:
+ try:
+ itm['type'] = _determine_type(itm)
+ except Exception:
+ # fallback: preserve whatever was present
+ itm['type'] = itm.get('type') or ''
+
+ # Add AssayName from the transformation-summary.html when available;
+ # otherwise fall back to the assay folder name.
+ section = {'main': main_all, 'deprecated': deprecated_all}
+ assay_display = load_assay_display_name_from_docs(assay, docs_base)
+ # If not found in the docs site path, also check the source transformation-summary
+ if not assay_display:
+ src_candidate = os.path.join(RSOURCE_DIR, assay, 'transformation-summary.html')
+ if os.path.exists(src_candidate):
+ try:
+ txt = open(src_candidate, 'r', encoding='utf-8').read()
+ m = re.search(r'<title>\s*(.*?)\s*-\s*Metadata Transformation Summary\s*</title>', txt, re.IGNORECASE)
+ if m:
+ assay_display = m.group(1).strip()
+ except Exception:
+ assay_display = None
+ if assay_display:
+ section['AssayName'] = assay_display
+ else:
+ section['AssayName'] = assay
+ out = {assay: section}
+ out_path = os.path.join(OUTPUT_DIR, f"{assay}.json")
+ with open(out_path, 'w', encoding='utf-8') as outf:
+ json.dump(out, outf, indent=2)
+ print(f'Wrote {out_path}')
+ # Also generate markdown for this assay
+ try:
+ md_path = generate_markdown_from_json(assay, section)
+ print(f'Wrote {md_path}')
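+ except Exception as exc:
+ # keep going with the remaining assays, but do not hide the failure
+ print(f'Could not generate markdown for {assay}: {exc}')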
+
+ print(f'Done. Elapsed: {time.time() - start:.2f}s')
+
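The precedence used by `fill_item` can be summarised as "first source that knows the attribute wins, and it only fills gaps". The sketch below is a simplified reading of that logic, not the function itself: `fill_from_sources` is a hypothetical name, it treats every source uniformly (the original does not take allowable values from the IVT maps), and it assumes each `return` in the original sits inside the corresponding `if` branch.

```python
def fill_from_sources(item, key, sources):
    """Fill missing fields of item from the first source that has an entry for key.

    sources is an ordered list of dicts, highest priority first
    (for example: YAML, JSON-LD, docs index, IVT assay map, IVT includes map).
    """
    for source in sources:
        entry = source.get(key) or {}
        if not entry:
            continue
        # Existing values on the item are never overwritten.
        for field in ('type', 'description', 'allowable values'):
            if entry.get(field) and not item.get(field):
                item[field] = entry[field]
        # 'required' is only ever raised to True, never lowered.
        if entry.get('required'):
            item['required'] = True
        return item  # lower-priority sources are not consulted
    return item
```

One consequence of this design is that a lower-priority source cannot supply a field the winning source lacks; if that matters, the early return would need to be dropped.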
+
+if __name__ == '__main__':
+ p = argparse.ArgumentParser()
+ p.add_argument('--sources', nargs='*', choices=['ivt', 'jsonld'], help='Enable optional sources. Example: --sources ivt jsonld')
+ p.add_argument('--apply', action='store_true', help='If set, run markdown postprocessing and copy files to docs/assays/metadata/testing')
+ args = p.parse_args()
+ sources = args.sources or []
+ use_ivt = 'ivt' in sources
+ use_jsonld = 'jsonld' in sources
+ print(f'Enabled sources: ivt={use_ivt}, jsonld={use_jsonld}')
+ generate_all(use_ivt=use_ivt, use_jsonld=use_jsonld)
+
+ def _map_type_to_icon(typ):
+ if not typ:
+ return ''
+ s = str(typ).strip().lower()
+ # normalize common synonyms
+ if s in ('link', 'email'):
+ s = 'textfield'
+ if s.startswith('controlled') or 'allowable' in s:
+ s = 'allowable value'
+ icons = {
+ 'textfield': ' Textfield',
+ 'text': ' Textfield',
+ 'allowable value': ' Allowable Value',
+ 'radio': ' Radio',
+ 'numeric': ' Numeric',
+ 'checkbox': ' Checkbox',
+ 'date': ' Date',
+ }
+ return icons.get(s, typ)
+
+ def _postprocess_markdown(md_dir, deploy_target):
+ # Process markdown files in-place then optionally copy to deploy_target
+ if not os.path.isdir(md_dir):
+ print(f'Markdown dir not found: {md_dir}')
+ return
+ files = [f for f in os.listdir(md_dir) if f.endswith('.md')]
+ for fn in files:
+ path = os.path.join(md_dir, fn)
+ try:
+ text = open(path, 'r', encoding='utf-8').read()
+ except Exception:
+ print(f'Could not read {path}')
+ continue
+ lines = []
+ for line in text.splitlines():
+ if not line.strip().startswith('|'):
+ lines.append(line)
+ continue
+ parts = line.split('|')
+ # parts: ['', ' Attribute ', ' Type ', ' Description ', ' Allowable Values ', ''] typically
+ if len(parts) >= 4:
+ typ_cell = parts[2].strip()
+ mapped = _map_type_to_icon(typ_cell)
+ parts[2] = f' {mapped} '
+ line = '|'.join(parts)
+ lines.append(line)
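+ # splitlines() drops the final newline, so restore it when the file had one
+ new_text = '\n'.join(lines) + ('\n' if text.endswith('\n') else '')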
+ try:
+ with open(path, 'w', encoding='utf-8') as wf:
+ wf.write(new_text)
+ except Exception:
+ print(f'Could not write processed markdown to {path}')
+ # Copy to deploy target
+ if deploy_target:
+ os.makedirs(deploy_target, exist_ok=True)
+ for fn in files:
+ if fn == 'index.md':
+ continue
+ src = os.path.join(md_dir, fn)
+ dst = os.path.join(deploy_target, fn)
+ try:
+ shutil.copy2(src, dst)
+ print(f'Copied {src} -> {dst}')
+ except Exception:
+ print(f'Failed to copy {src} -> {dst}')
+
+ # If requested, run postprocessing and deploy
+ if getattr(args, 'apply', False):
+ docs_target = os.path.join(os.path.dirname(os.path.dirname(os.path.dirname(__file__))), 'docs', 'assays', 'metadata', 'testing')
+ print(f'Postprocessing markdown in {MARKDOWN_DIR} and copying to {docs_target}')
+ _postprocess_markdown(MARKDOWN_DIR, docs_target)
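The `--apply` postprocessing step rewrites the second column of every markdown table row. A minimal in-memory sketch of that rewrite follows; `map_type` and `rewrite_type_column` are hypothetical names, and the plain-text labels are stand-ins for whatever markup the real icon table holds.

```python
def map_type(typ):
    """Map a raw type string to a canonical label (labels here are illustrative)."""
    if not typ:
        return ''
    s = str(typ).strip().lower()
    # Normalize common synonyms before the lookup.
    if s in ('link', 'email'):
        s = 'textfield'
    if s.startswith('controlled') or 'allowable' in s:
        s = 'allowable value'
    labels = {
        'textfield': 'Textfield',
        'text': 'Textfield',
        'allowable value': 'Allowable Value',
        'radio': 'Radio',
        'numeric': 'Numeric',
        'checkbox': 'Checkbox',
        'date': 'Date',
    }
    # Unknown types pass through unchanged.
    return labels.get(s, typ)


def rewrite_type_column(markdown):
    """Rewrite the type cell (second column) of each table row in a markdown string."""
    out = []
    for line in markdown.splitlines():
        if line.strip().startswith('|'):
            # '| a | b | c |' splits into ['', ' a ', ' b ', ' c ', ''].
            parts = line.split('|')
            if len(parts) >= 4:
                parts[2] = f' {map_type(parts[2].strip())} '
                line = '|'.join(parts)
        out.append(line)
    return '\n'.join(out)
```

Header and separator rows pass through the same mapping; since `Type` and `-----` are not known keys, they come back unchanged apart from whitespace normalisation around the cell.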