PEtab v2 interop (import + export) — the 'two-adapter' proof #407

@wshlavacek

Status

Updated 2026-06-19. This umbrella started life as an importer-only tracker. Two things have since reshaped it: (1) the work pivoted to exporter-first (PyBNF/BNGL → PEtab v2; ADR-0025), because exporting reads already-correct constructs that petablint can grade, whereas importing means generating BNGL with no oracle; and (2) the resulting package is a single two-way interop, not two subsystems — so the importer is the reverse reader of each asset mapper, not a parallel build. Checklist at the bottom is the source of truth.

The package is one import/export interop (the key architectural fact). pybnf/petab/ pairs each PEtab table with a neutral row dataclass (PetabParameterRow, PetabObservableRow, PetabMeasurementRow, …) and reversible asset mappers on both sides of it; export.py calls itself "disposable glue." So the hard, semantic part — prior families, noise distributions, scale conversions — is written once and run both directions. Consequences:

  • Parameters and observables are already mapped both ways and tested: petab_parameter_row ↔ free_parameter_from_row (ADR-0019), petab_observable_row ↔ noise_model_from_row (ADR-0023).
  • Measurements and conditions are export-only today, so their reverse readers are the main remaining importer code — but they slot onto the same neutral seam, not a new design.
  • Scope is BNGL-native PEtab problems: the model passes through _bngl.parse_model unchanged (no BNGL generation — the original "importer is hard" fear is dissolved). An SBML model is a separate optional adapter: export_job raises on SBML, and the inverse cannot synthesize a PyBNF model from SBML.
  • Export is lossy by design (raises on PyBNF features PEtab v2 can't express), so import↔export is identity only on the PEtab-representable subset.
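
The "neutral row plus reversible mappers" shape described above can be sketched roughly as follows. This is a minimal illustration, not PyBNF's code: `ParamRow`, `FreeParam`, `to_row`, `from_row`, and the `custom_sampler` field are hypothetical stand-ins for the real PetabParameterRow seam and its mappers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParamRow:
    """Hypothetical neutral row: plain values only, no PyBNF or petab types."""
    parameter_id: str
    lower_bound: float
    upper_bound: float
    prior_distribution: str


@dataclass(frozen=True)
class FreeParam:
    """Hypothetical stand-in for a PyBNF free parameter."""
    name: str
    family: str
    lower: float
    upper: float
    custom_sampler: Optional[str] = None  # assumed feature with no PEtab counterpart


def to_row(p: FreeParam) -> ParamRow:
    """Export direction: raise instead of silently dropping information."""
    if p.custom_sampler is not None:
        raise NotImplementedError(
            f"{p.name}: custom sampler is not expressible in PEtab v2"
        )
    return ParamRow(p.name, p.lower, p.upper, p.family)


def from_row(row: ParamRow) -> FreeParam:
    """Import direction: the reverse reader of the same mapping."""
    return FreeParam(
        row.parameter_id, row.prior_distribution, row.lower_bound, row.upper_bound
    )


# Round trip is the identity on the PEtab-representable subset.
p = FreeParam("k1", "uniform", 0.0, 10.0)
assert from_row(to_row(p)) == p
```

Because both directions share one row type, a round-trip test like the final assertion doubles as a check on the mapper itself.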

Importer (read) status:

Exporter-first track (ADR-0025; landed since this issue was filed):

Motivation

The M2 modularization gave PyBNF first-class, registry-backed Prior (ADR-0010, pybnf/priors/) and NoiseModel (ADR-0011, pybnf/noise/) abstractions, deliberately PEtab-defaulted but not PEtab-bound (ADR-0004). The payoff that justifies that shape is PEtab v2 interop: a thin adapter where a problem.yaml + its TSV tables + model and a native .conf produce the same internal objects, in either direction.

That makes it the "two-adapter" proof the refactor plan calls out — native .conf and a PEtab problem feeding one set of FreeParameter/Prior/NoiseModel/exp-data objects. If both adapters land on the same objects, the abstractions are right; if PEtab forces a special case, we learn where they're wrong. In practice this became a single reversible package (see Status): the asset mappers are the proof, run forward (export) and backward (import).

This is an umbrella/tracking issue. It scopes the whole PEtab v2 interop (import + export); each chunk splits into its own issue when work begins.

Spec correction — what PEtab v2 actually specifies

Verified against the live v2 data-format spec. Three premises in the original draft are now wrong:

  • parameterScale was removed entirely. v2 parameters are all in linear space; a scale change is expected to be done in the model file. PyBNF derives a parameter's Scale (Linear/Log10) from its prior family instead — so the original "natural-log scale gap" is moot.
  • Prior columns renamed to priorDistribution / priorParameters (from objectivePriorType / objectivePriorParameters). There is a single prior, used for the objective only (initializationPrior* was also removed).
  • Bounds truncate the prior, and the catalog is richer than the draft assumed: uniform, normal, laplace, log-normal, log-laplace, log-uniform, cauchy, gamma, exponential, chisquare, rayleigh. log-normal / log-laplace use the natural log.
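
To make "bounds truncate the prior" concrete, here is a small sketch of a two-sided truncated normal density. It only illustrates the renormalization; the function name is hypothetical and this is not a claim about how PyBNF implements truncation.

```python
from statistics import NormalDist


def truncated_normal_pdf(x, mu, sigma, lower, upper):
    """Density of Normal(mu, sigma) restricted to [lower, upper].

    Hypothetical helper: zero outside the bounds, and inside them the
    untruncated density divided by the probability mass between the bounds.
    """
    if not lower <= x <= upper:
        return 0.0
    base = NormalDist(mu, sigma)
    mass = base.cdf(upper) - base.cdf(lower)
    return base.pdf(x) / mass


# Inside the bounds the truncated density is larger than the untruncated one,
# because the same total probability is spread over a smaller interval.
inside = truncated_normal_pdf(0.0, 0.0, 1.0, -1.0, 1.0)
assert inside > NormalDist(0.0, 1.0).pdf(0.0)
assert truncated_normal_pdf(2.0, 0.0, 1.0, -1.0, 1.0) == 0.0
```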

What a PEtab v2 problem is

  • problem.yaml: references the model file(s) + the TSV tables
  • model: BNGL (first-class via ADR-0026; the supported interop target) or SBML (a separate adapter; PyBNF already imports SBML/Antimony: SbmlModel, BngsimAntimony)
  • parameters.tsv: parameterId, lowerBound, upperBound, nominalValue, estimate (true|false), priorDistribution, priorParameters (no parameterScale)
  • observables.tsv: observableId, observableFormula, noiseFormula, observableTransformation, noiseDistribution
  • measurements.tsv: observableId, simulationConditionId, measurement, time, …
  • conditions.tsv: per-condition parameter/species overrides
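
As a rough illustration of the parameters table listed above, the sketch below reads a v2-style parameters.tsv with the standard-library csv module and rejects the removed parameterScale column. The sample rows, the `read_parameter_rows` helper, and the `;` separator in priorParameters are assumptions made for this example.

```python
import csv
import io

# Illustrative table contents; the values are made up for this sketch.
TSV = (
    "parameterId\tlowerBound\tupperBound\tnominalValue\testimate"
    "\tpriorDistribution\tpriorParameters\n"
    "k1\t0.01\t100\t1.0\ttrue\tlog-uniform\t0.01;100\n"
    "k2\t0\t10\t2.0\tfalse\t\t\n"
)


def read_parameter_rows(text):
    """Hypothetical reader: parse the table into a list of dicts, one per row."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    if "parameterScale" in (reader.fieldnames or []):
        # v2 removed this column; seeing it suggests a v1 table.
        raise ValueError("parameterScale is not a PEtab v2 column")
    return list(reader)


rows = read_parameter_rows(TSV)
estimated = [r["parameterId"] for r in rows if r["estimate"] == "true"]
assert estimated == ["k1"]
```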

Mapping to PyBNF's existing abstractions (corrected for v2)

| PEtab v2 concept | PyBNF target | Status |
| --- | --- | --- |
| priorDistribution uniform / normal / laplace (linear) | Uniform / Normal / Laplace family, Linear scale | ✅ exact, both ways |
| log-uniform | loguniform_var (Uniform × Log10); params are linear bounds | ✅ exact (base-independent) |
| log-normal / log-laplace (natural log) | lognormal_var / loglaplace_var; convert μ/ln10, σ/ln10 | ✅ θ-distribution identical, no Jacobian (ADR-0003: the scale lives in the sampling parameterization) |
| omitted prior + bounds | uniform over [lowerBound, upperBound] | ✅ matches v2's default-to-uniform rule |
| estimate = false (fixed) | model constant, not a FreeParameter | ⏭ later chunk (conditions / model overrides) |
| lowerBound / upperBound truncate the prior | reflecting bounds | ✅ two-sided truncation maps (#411, ADR-0020); one-sided still raises (#417) |
| cauchy, gamma, exponential, chisquare, rayleigh | | ⚠️ 5 families PyBNF lacks (catalog parity; the 1-param ones need grammar/arity work) |
| noiseDistribution normal / laplace | Gaussian / Laplace noise | ✅ done both ways (#410, ADR-0021/0023) |
| observableTransformation lin / log / log10 | NoiseModel additive-noise-scale axis (ADR-0011) | ✅ done (ADR-0022/0023) |
| location = median (PEtab hardcodes) | Location Interpretation axis | ✅ exists |
| observableFormula / noiseFormula (sympy over model entities) | bare model-entity name | ✅ common case descoped by ADR-0025 (functions stay in the model); arbitrary expressions still need the sympy layer |
| model (BNGL) | BnglModel / _bngl.parse_model | ✅ passes through unchanged (ADR-0026) |
| model (SBML) | SbmlModel / Antimony | ⏭ separate adapter (export raises on SBML; not obtainable by inversion) |
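
The log-normal row's "convert μ/ln10, σ/ln10" can be checked numerically. If ln θ is Normal(μ, σ), then log10 θ = ln θ / ln 10 is Normal(μ/ln 10, σ/ln 10), so both parameterizations give the same quantiles of θ. The sketch below is a standalone check with a hypothetical helper name, not PyBNF code.

```python
import math
from statistics import NormalDist

LN10 = math.log(10.0)


def natural_to_log10(mu, sigma):
    """Convert natural-log Normal parameters to their log10 equivalents."""
    return mu / LN10, sigma / LN10


mu, sigma = 1.5, 0.4  # arbitrary illustrative values
mu10, sigma10 = natural_to_log10(mu, sigma)

for q in (0.05, 0.5, 0.95):
    # Quantile of theta under the natural-log parameterization.
    theta_natural = math.exp(NormalDist(mu, sigma).inv_cdf(q))
    # Same quantile under the log10 parameterization.
    theta_log10 = 10.0 ** NormalDist(mu10, sigma10).inv_cdf(q)
    assert math.isclose(theta_natural, theta_log10, rel_tol=1e-9)
```

Matching quantiles at every level means the distribution over θ is unchanged, which is consistent with the table's "no Jacobian" note.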

Step 1 — parameters table → Prior/FreeParameter (DONE)

pybnf/petab/parameters.py reads parameters.tsv and maps each estimated row to a FreeParameter carrying a Prior, driven by the prior registry: it synthesizes the *_var keyword, validates it against PRIOR_KEYWORD_MAP, and builds through the FreeParameter constructor, so the result is bit-identical to the native .conf path rather than coming from a parallel mapping table. The reader is dependency-free (stdlib csv, so it runs in the bngsim-less CI tier) and sits behind a neutral PetabParameterRow seam shared with the export direction. PEtab/PyBNF boundaries raise explicit NotImplementedErrors (the 5 unsupported families; unbounded-family truncation; estimate=false). Commit f151914, ADR-0019; 30 tests.
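
A rough sketch of the keyword-synthesis step with explicit boundaries might look like the following. `prior_keyword` and `SUPPORTED_KEYWORDS` are hypothetical; the set stands in for PRIOR_KEYWORD_MAP, whose actual contents are not reproduced here, and the uniform_var / normal_var / laplace_var names are assumed by analogy with the loguniform_var / lognormal_var / loglaplace_var keywords named in the mapping table.

```python
# Stand-in for PRIOR_KEYWORD_MAP; the real registry contents are an assumption.
SUPPORTED_KEYWORDS = {
    "uniform_var", "normal_var", "laplace_var",
    "loguniform_var", "lognormal_var", "loglaplace_var",
}
# The five PEtab v2 families listed as not yet supported.
UNSUPPORTED_FAMILIES = {"cauchy", "gamma", "exponential", "chisquare", "rayleigh"}


def prior_keyword(row):
    """Hypothetical helper: map one parameters.tsv row (a dict) to a *_var keyword."""
    if row["estimate"] != "true":
        raise NotImplementedError(
            f"{row['parameterId']}: estimate=false is handled in a later chunk"
        )
    # An omitted prior defaults to uniform over the bounds.
    family = row.get("priorDistribution") or "uniform"
    if family in UNSUPPORTED_FAMILIES:
        raise NotImplementedError(
            f"{row['parameterId']}: prior family {family!r} is not supported"
        )
    keyword = family.replace("-", "") + "_var"  # e.g. "log-normal" -> "lognormal_var"
    if keyword not in SUPPORTED_KEYWORDS:
        raise ValueError(f"{row['parameterId']}: unknown prior family {family!r}")
    return keyword


row = {"parameterId": "k1", "estimate": "true", "priorDistribution": "log-normal"}
assert prior_keyword(row) == "lognormal_var"
```

Raising at the boundary rather than guessing keeps unsupported rows visible to the caller.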

Chunks (rough order, each its own issue when reached)

Importer = the reverse readers on the shared asset seam (not a separate subsystem):

Exporter-first track (ADR-0025; not in the original scope, landed since):

Notes / constraints

  • New runtime deps (petab, python-libsbml, sympy) must be hand-mirrored into .github/actions/setup-pybnf or the tests/integration CI tiers go red (the recurring single-sync-point gotcha). Decision (ADR-0019): petab is adopted as an optional extra (pybnf[petab]) at the formula/SBML chunk, not in core. (As of the exporter-first track, petab is wired in as a test-only oracle in pybnf[tests]; core remains dependency-free — ADR-0025/0026.)
  • Keep the interop simulator-free where possible so it runs in the bngsim-less CI tier.
  • Out-of-scope framing comes from dev/refactor-plan.md. Relevant ADRs: 0003 (no Jacobian), 0004 (PEtab-defaulted not -bound), 0010 (Prior), 0011 (NoiseModel), 0019 (parameters), 0020 (truncation), 0021 (per-observable noise), 0023 (observables noise), 0025 (exporter-first), 0026 (BNGL PEtab model), 0027 (conditions/experiments), 0028 (config redesign).
