Design a first-class surface for analytical / user-defined objective functions (generalize direct_pass) #425

@wshlavacek

Context

Spun out of the #424 objective-surface discussion (mean-vs-median centering + the broader "modernize the objective surface" thread). That discussion settled a clean taxonomy for PyBNF's objective functions along two orthogonal axes:

  • Shape axis (→ config key): per-point noise model (noise_model) | column-joint profile comparison (profile_objective) | named (objective).
  • Nature axis (→ doc label): statistical objective (anchored to a likelihood — an NLL with parameter-independent constants dropped) | heuristic objective (no probability model).

A key finding is that nearly every existing objfunc is a likelihood in disguise: sos = Gaussian with σ≡1, chi_sq = Gaussian with σ taken from the data _SD column, sod = Laplace with b≡1, norm_sos/ave_norm_sos = Gaussian with a data-derived σ, kl = multinomial cross-entropy. The one genuinely non-statistical objective is direct_pass. It is the sole current member of the heuristic objective bucket, and it deserves a deliberate design rather than growth by accretion.
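
To make the "likelihood in disguise" point concrete, here is a minimal numerical sketch for the two simplest cases. The functions below are illustrative re-implementations written for this issue, not PyBNF's own code, and the factor-of-1/2 convention for the Gaussian case is an assumption; PyBNF's sos may or may not carry that factor, which does not change the minimizer.

```python
import math

def sos(sim, exp):
    """Sum of squares (illustrative, not PyBNF's implementation)."""
    return sum((s - e) ** 2 for s, e in zip(sim, exp))

def sod(sim, exp):
    """Sum of absolute differences (illustrative)."""
    return sum(abs(s - e) for s, e in zip(sim, exp))

def gaussian_nll(sim, exp, sigma=1.0):
    """Full Gaussian negative log-likelihood, constants included."""
    return sum(
        (s - e) ** 2 / (2 * sigma ** 2) + math.log(sigma) + 0.5 * math.log(2 * math.pi)
        for s, e in zip(sim, exp)
    )

def laplace_nll(sim, exp, b=1.0):
    """Full Laplace negative log-likelihood, constants included."""
    return sum(abs(s - e) / b + math.log(2 * b) for s, e in zip(sim, exp))

exp = [1.0, 2.0, 3.0]
sim_a = [1.1, 1.8, 3.3]

n = len(exp)
# Parameter-independent constants that the objective drops.
gauss_const = n * 0.5 * math.log(2 * math.pi)
laplace_const = n * math.log(2.0)

# With sigma = 1, the Gaussian NLL minus its constant is 0.5 * sos.
gauss_residual = gaussian_nll(sim_a, exp) - gauss_const
# With b = 1, the Laplace NLL minus its constant is exactly sod.
laplace_residual = laplace_nll(sim_a, exp) - laplace_const
```

Because the dropped terms do not depend on the model parameters, minimizing sos or sod selects the same parameters as minimizing the corresponding NLL.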

What direct_pass is today

DirectPassObjective (pybnf/objective.py) reads a single score cell from the simulated data and returns it as the objective, ignoring the experimental data (a dummy .exp file is still required by the config parser). It is the backbone of the analytical test tier: analytical_model.py computes a score with no external simulator, and direct_pass feeds that number straight to the optimizer/sampler (tests/integration_harness.py, test_optimizer_integration, test_sampler_integration).
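
For readers unfamiliar with the pattern, the behavior described above can be sketched roughly as follows. Everything here is a hypothetical stand-in: `DirectPassSketch`, `toy_analytical_model`, and the dict-of-columns data layout are invented for illustration and do not reflect the actual `DirectPassObjective` signature or PyBNF's data classes.

```python
class DirectPassSketch:
    """Hypothetical stand-in for a pass-through objective.

    It returns the value found in one column of the simulated output
    and never looks at the experimental data.
    """

    def __init__(self, column="score"):
        self.column = column

    def evaluate(self, sim_data, exp_data=None):
        # exp_data is accepted only for interface compatibility and ignored.
        return float(sim_data[self.column][0])


def toy_analytical_model(params):
    """Hypothetical analytical 'simulator': a quadratic bowl with its
    minimum at (x, y) = (1, 2). Returns a one-row table with a score column."""
    x, y = params["x"], params["y"]
    return {"score": [(x - 1.0) ** 2 + (y - 2.0) ** 2]}


objective = DirectPassSketch()
at_minimum = objective.evaluate(toy_analytical_model({"x": 1.0, "y": 2.0}))
off_minimum = objective.evaluate(
    toy_analytical_model({"x": 2.0, "y": 2.0}),
    exp_data={"unused": [123.0]},  # has no effect on the result
)
```

The sketch shows why a dummy .exp file carries no information in this mode: the experimental argument never enters the computation.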

Problems:

  • The name is opaque and leaks an implementation detail — a user has no idea what direct_pass means, and it should never be user-facing.
  • It is wired narrowly (a magic score column; a required dummy .exp), purely for internal/test use.

Why this deserves its own targeted discussion

Done well, this is more than a rename. A first-class surface for analytical / user-defined objective functions makes PyBNF useful far beyond biological-model calibration — optimizing or sampling arbitrary functions (analytical test functions, engineered costs, externally-computed scores, user-supplied callables) with PyBNF's full optimizer / sampler / parallel machinery. That broadens PyBNF's usefulness substantially, so it warrants a focused design.

Questions to settle here (not prescribing answers)

  • User-facing name & surface. objective = score? A dedicated key? How does a user supply a function — a magic column (today), a Python callable / entry point, a math expression, a file?
  • Dummy .exp requirement. Can it be dropped for objectives that ignore experimental data?
  • Relationship to profile_objective's geometric members (e.g. wasserstein, which is non-likelihood) — is "heuristic objective" one bucket or several?
  • Analytical test functions as first-class fitting targets (Rosenbrock, banana, Gaussian, …) — promote from test-only fixtures to a documented, user-accessible feature?
  • Transition. Should direct_pass stay as a developer/internal alias (avoiding a disruptive rename and churn in golden test outputs) while a clean surface is exposed to users?
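
As one possible shape for the "Python callable" option raised in the questions above, here is a minimal sketch of a name-to-callable registry with the Rosenbrock function as a built-in target. This is not a proposal for the final API: `register_objective`, `get_objective`, and the params-as-mapping calling convention are all hypothetical names and assumptions made up for this illustration.

```python
from typing import Callable, Dict, Mapping

# Hypothetical convention: an objective maps parameter names to a scalar score.
ObjectiveFn = Callable[[Mapping[str, float]], float]

_REGISTRY: Dict[str, ObjectiveFn] = {}


def register_objective(name: str):
    """Decorator that makes a callable available under a config-friendly name."""
    def decorator(fn: ObjectiveFn) -> ObjectiveFn:
        _REGISTRY[name] = fn
        return fn
    return decorator


def get_objective(name: str) -> ObjectiveFn:
    """Look up a registered objective, with a readable error for typos."""
    try:
        return _REGISTRY[name]
    except KeyError:
        known = ", ".join(sorted(_REGISTRY))
        raise KeyError(f"unknown objective {name!r}; known: {known}") from None


@register_objective("rosenbrock")
def rosenbrock(params: Mapping[str, float], a: float = 1.0, b: float = 100.0) -> float:
    """Rosenbrock function; global minimum of 0 at (x, y) = (a, a**2)."""
    x, y = params["x"], params["y"]
    return (a - x) ** 2 + b * (y - x ** 2) ** 2


# No experimental data is involved anywhere in the evaluation.
score_at_optimum = get_objective("rosenbrock")({"x": 1.0, "y": 1.0})
score_at_origin = get_objective("rosenbrock")({"x": 0.0, "y": 0.0})
```

A registry like this would let analytical test functions and user-supplied callables share one lookup path, but whether names, entry points, or expressions are the right user-facing mechanism is exactly what this issue is meant to settle.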

Related: #424 (objective-surface modernization / centering convention), #419 (per-family mean/median capability).
