diff --git a/data/datasets/desert_farm_leverage_points.csv b/data/datasets/desert_farm_leverage_points.csv
index 16f307c..f200bf8 100644
--- a/data/datasets/desert_farm_leverage_points.csv
+++ b/data/datasets/desert_farm_leverage_points.csv
@@ -1,13 +1,13 @@
 Prefix,ShortName,Time_min,Time_max,Space_min,Space_max,Color,Reference
-model,Molecular Dynamics Models,1.00E-15,1.00E-09,1.00E-30,1.00E-27,#013333,
-process,CO2 fixation,7.69E-03,2.38E-01,5.79E-26,1.19E-24,#006666,Bar-On et al. (2019) PNAS 116:4738 — kcat dependent on HCO3- pool via carbonic anhydrase (Igamberdiev 2015)
-process,Nutrient transport,1.00E-02,1.00E+00,1.00E-22,1.00E-20,#006666,Moore et al. (2013) Nature Geosci 6:701
-process,Biochemical synthesis,1.00E+00,1.00E+01,1.00E-21,1.00E-19,#006666,Moore et al. (2013) Nature Geosci 6:701
-process,Growth,1.00E+05,1.00E+06,1.00E-20,1.00E-10,#006666,Moore et al. (2013) Nature Geosci 6:701
-model,Community Metabolic Models,99,1.00E+07,1.00E-18,1.00E-03,#013333,
+model,Molecular Dynamics Models,1.00E-15,1.00E-09,1.00E-30,1.00E-22,#013333,Karplus & McCammon (2002) Nat Struct Biol 9:646
+process,CO2 fixation,6.67E-02,2.38E-01,5.79E-26,1.19E-24,#006666,Bar-On et al. (2019) PNAS 116:4738 — kcat = 3-15/s natural RuBisCO; space = 0.1×–1.7× RuBisCO L8S8 enzyme volume (~7e-25 m³)
+process,Nutrient transport,1.00E-02,1.00E+00,1.00E-22,1.00E-20,#006666,Milo & Phillips (2015) Cell Biology by the Numbers
+process,Biochemical synthesis,1.00E+00,1.00E+01,1.00E-21,1.00E-19,#006666,Milo & Phillips (2015) Cell Biology by the Numbers
+process,Photoautotroph cell growth,7.56E+03,1.00E+05,1.00E-18,1.00E-14,#006666,Milo & Phillips (2015) Cell Biology by the Numbers; Yu et al. (2015) Sci Rep 5:8132 (UTEX 2973 fastest)
+model,Community Metabolic Models,1.00E+02,1.00E+07,1.00E-18,1.00E-03,#013333,Zakem et al. (2020) ISME J 14:288
 leverage point,Community Ecology,1.00E+06,1.00E+08,1.00E-09,1.00E+00,#FF9900,Moore et al. (2013) Nature Geosci 6:701
 leverage point,Flocculation,1.00E+03,86400,1.00E-15,1.00E-09,#FF9900,Salim et al. (2011) J Appl Phycol; Vandamme et al. (2013) Biotechnol Adv 31:1680
 leverage point,Ponds,86400,3.15E+07,2000,2.00E+04,#FF9900,Design basis: 10x100x3m deep ponds (3000 m3); depth for photoinhibition protection and floc settling
-model,Biogeochemical Circulation Models,1.00E+05,1.00E+10,1.00E+09,1.00E+16,#013333,
+model,Biogeochemical Circulation Models,1.00E+05,1.00E+10,1.00E+09,1.00E+16,#013333,Levine et al. (2025) Annu Rev Earth Planet Sci 53:595
 leverage point,Sequestration,3.15E+08,1.00E+11,3.12E+09,2.91E+10,#FF9900,40 GtCO2/yr: Space_min=C as diamond (3.5 t/m3) Space_max=biomass+salt+sand (rho~1.5 t/m3) per year
-process,Extraction,3.16E+14,6.31E+15,2.03E+12,2.05E+14,#006666,Carboniferous coal (~300 Mya) + Mesozoic petroleum (~252-66 Mya); Tissot & Welte (1984)
+process,Fossil-fuel formation,3.16E+13,3.16E+15,2.03E+12,2.05E+14,#006666,Kerogen→petroleum maturation (~1-10 My) + coalification (~10-100 My); Tissot & Welte (1984)
diff --git a/data/references/README.md b/data/references/README.md
new file mode 100644
index 0000000..c77c110
--- /dev/null
+++ b/data/references/README.md
@@ -0,0 +1,31 @@
+# References
+
+Standalone reference data — not coupled to any specific dataset in this repo.
+
+## Files
+
+- `bionumbers_subset.csv` — curated phototroph-relevant entries from the [BioNumbers database](https://bionumbers.hms.harvard.edu/) (Milo & Phillips, *Cell Biology by the Numbers*) with stable `bion_id` identifiers and direct URLs.
+
+## `bionumbers_subset.csv` schema
+
+| Column | Description |
+|---|---|
+| `bion_id` | BioNumbers entry ID (stable identifier) |
+| `Properties` | Description of the measured quantity |
+| `Organism` | Organism the measurement is from |
+| `Value` | Reported value |
+| `Range` | Reported range (often empty) |
+| `Units` | Units of `Value` |
+| `URL` | Direct link to the BioNumbers entry |
+
+## Scope
+
+Curated to phototroph-relevant entries (cyanobacteria, green algae, diatoms) across three categories: cell generation/doubling times, biochemical synthesis rates (transcription/translation elongation), and nutrient transport (small-molecule diffusion, transporter kinetics). Where phototroph-specific entries were unavailable, generic or model-organism data (e.g., E. coli, Xenopus) was kept as a proxy and noted in `Organism`.
+
+The full BioNumbers database has ~14,300 entries (~27 MB HTML export); this is a focused subset kept as a resource for future work.
+
+## Attribution
+
+BioNumbers is a project of the Milo Lab (Weizmann Institute) and Harvard Medical School. Cite as: Milo R, Jorgensen P, Moran U, Weber G, Springer M (2010). *BioNumbers — the database of key numbers in molecular and cell biology*. Nucleic Acids Res. 38:D750–D753. DOI: [10.1093/nar/gkp889](https://doi.org/10.1093/nar/gkp889).
+
+For the broader synthesis: Milo R, Phillips R (2015). *Cell Biology by the Numbers*. Garland Science.
diff --git a/data/references/bionumbers_subset.csv b/data/references/bionumbers_subset.csv
new file mode 100644
index 0000000..2db5ed5
--- /dev/null
+++ b/data/references/bionumbers_subset.csv
@@ -0,0 +1,26 @@
+bion_id,Properties,Organism,Value,Range,Units,URL
+100289,Generation time,Cyanobacteria Synechocystis PCC 6803,6.0,,Hours,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=100289
+102211,Mean generation time (on acetate),Green algae chlorella,20.0,,hours,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=102211
+102212,mean generation time (photoautotrophic growth),Green algae chlorella,11.0,,hours,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=102212
+104195,Generation time,Diatom Phaeodactylum tricornutum,15.0,,hours,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=104195
+104196,Generation time,Green algae Dunaliella tertiolecta,15.0,,hours,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=104196
+105638,Generation time at 30°C,Green algae Dunaliella salina,20.0,,Hours,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=105638
+111252,Doubling time,Cyanobacteria Synechocystis PCC 6803,12.0,,hours,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=111252
+111335,Fastest generation time,Cyanobacteria Synechocystis PCC 6803,5.13,,hours,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=111335
+100233,Translation rate of beta-galactosidase in E. coli,Bacteria Escherichia coli,14.5,'13-16,aa/s,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=100233
+105067,Translation rate for short peptides,Bacteria Escherichia coli,22.0,'Table link - http://bionumbers.hms.harvard.edu/files/Maximal%20translation%20rate.pdf,1/sec,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=105067
+104300,Translation rate on culture of nitrogen bases and glucose,Budding yeast Saccharomyces cerevisiae,9.3,,aa/sec,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=104300
+104598,Translation rate,Human Homo sapiens,5.0,,aa/sec,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=104598
+100060,RNA polymerase transcription rate of stable RNA,Bacteria Escherichia coli,85.0,'Table link - http://bionumbers.hms.harvard.edu/files/Parameters%20pertaining%20to%20the%20macromolecular%20synthesis%20rates%20in%20exponentially%20growing%20E.%20coli%20Br%20as%20a%20function%20of%20growth%20rate%20at%2037%20degrees%20celsius.pdf,nt/sec,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=100060
+101904,Transcription elongation rate of rRNA operons,Bacteria Escherichia coli,42.0,'±2,nucleotides/sec,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=101904
+100664,Single molecule RNA polymerase II transcription rates,Bacteria Escherichia coli,12.8,'±4.9,nucleotides/sec,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=100664
+100662,Maximum RNA polymerase II transcription elongation rate,Mammalian tissue culture cell,71.6,,nucleotides/sec,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=100662
+104089,Diffusion coefficient of glucose in water,Generic,600.0,'Table link - http://bionumbers.hms.harvard.edu/files/Diffusion%20coefficients%20of%20various%20substances%20in%20Water.pdf,µm^2/sec,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=104089
+104437,Diffusion coefficient of ammonium in water,Generic,1860.0,'Table link - http://bionumbers.hms.harvard.edu/files/Biofilm%20model%20parameters.pdf,µm^2/sec,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=104437
+104439,Diffusion coefficient of nitrate in water,Generic,1700.0,'Table link - http://bionumbers.hms.harvard.edu/files/Biofilm%20model%20parameters.pdf,µm^2/sec,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=104439
+109504,Diffusion coefficient glucose in water at 25°C,Generic,673.0,,µm^2/sec,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=109504
+110618,Diffusion coefficient of CO2 in red blood cell,Human Homo sapiens,650.0,,µm^2/sec,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=110618
+117305,Diffusion coefficient of CO2 in air,air,1.4e-05,,m2/s,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=117305
+102428,Intracellular diffusion coefficient for glucose transporter,African clawed frog Xenopus laevis,100.0,,µm^2/sec,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=102428
+109673,Highest glucose uptake rate that still allows the maximal NADH formation rate,Bacteria Escherichia coli,2.4,,mmol/(gCDW·…h),https://bionumbers.hms.harvard.edu/bionumber.aspx?id=109673
+109686,Glucose uptake rate of strain C-3000 in minimal M9 media,Bacteria Escherichia coli,12.0,'±0.5,mmol/gDW/h,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=109686
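
Note on the unit conversions: several of the new `Time_min`/`Space_min` values in `desert_farm_leverage_points.csv` appear to follow from simple arithmetic on the quantities named in their `Reference` column. The Python sketch below reproduces a few of them as a sanity check. It is not part of this change; the helper names (`turnover_time_s`, `diamond_volume_m3`, `generation_times_s`), the year length, and the two-row `SAMPLE` excerpt are assumptions made for illustration.

```python
import csv
import io

SECONDS_PER_HOUR = 3600.0
SECONDS_PER_YEAR = 3.156e7  # assumption: ~365.25-day year


def turnover_time_s(kcat_per_s: float) -> float:
    """Time for one catalytic turnover, taken as 1 / kcat."""
    return 1.0 / kcat_per_s


def diamond_volume_m3(co2_tonnes: float, density_t_per_m3: float = 3.5) -> float:
    """Volume of the carbon in a CO2 mass if stored at diamond density."""
    carbon_tonnes = co2_tonnes * 12.011 / 44.009  # C mass fraction of CO2
    return carbon_tonnes / density_t_per_m3


# Two rows copied from bionumbers_subset.csv (hypothetical inline excerpt).
SAMPLE = """bion_id,Properties,Organism,Value,Range,Units,URL
100289,Generation time,Cyanobacteria Synechocystis PCC 6803,6.0,,Hours,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=100289
111335,Fastest generation time,Cyanobacteria Synechocystis PCC 6803,5.13,,hours,https://bionumbers.hms.harvard.edu/bionumber.aspx?id=111335
"""


def generation_times_s(csv_text: str) -> dict:
    """Map bion_id -> generation time in seconds for rows reported in hours."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return {
        int(row["bion_id"]): float(row["Value"]) * SECONDS_PER_HOUR
        for row in rows
        # Units are capitalised inconsistently ("Hours" vs "hours").
        if row["Units"].strip().lower() == "hours"
    }


if __name__ == "__main__":
    print("CO2 fixation Time_min (kcat = 15/s):", turnover_time_s(15.0))
    print("Sequestration Space_min (40 GtCO2):", diamond_volume_m3(40e9))
    print("Fossil-fuel Time_min (1 My):", 1e6 * SECONDS_PER_YEAR)
    print("Growth Time_min in hours:", 7.56e3 / SECONDS_PER_HOUR)
    print("Generation times (s):", generation_times_s(SAMPLE))
```

Under these assumptions, the upper kcat bound of 15/s matches the new `Time_min` of 6.67E-02 s, and 40 GtCO2 stored as diamond matches `Space_min` of 3.12E+09 m3. By the same 1/kcat rule, the unchanged `Time_max` of 2.38E-01 s corresponds to roughly 4.2/s rather than the 3/s lower bound quoted in the reference text.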