
Botorch preset#757

Open
AdrianSosic wants to merge 20 commits into dev/gp from feature/botorch_preset

Conversation

@AdrianSosic
Collaborator

@AdrianSosic AdrianSosic commented Mar 2, 2026

DevPR, parent is #745

Adds the BOTORCH preset for GPs.

Important information

  • I think it's critical to actually assert, in the form of a test, that the preset exactly recovers the BoTorch behavior, mainly for two reasons:
    1. The construction requires quite a few things to be configured, e.g. handling both single-task/multi-task cases (the latter even requiring a new custom GPyTorch module), setting all sorts of priors correctly, etc. Blindly believing that everything is correct and then simply claiming this is the BoTorch behavior seems like a bad idea. The test ensures this explicitly.
    2. It also acts as an automatic alert mechanism for all situations where something changes on the BoTorch side, informing us about breaking changes that yield different behavior but are not fully documented (which has already happened several times).
  • I've also invested quite some effort into testing the new multitask mean logic, i.e. verifying that it not only recovers the BoTorch logic but also fills one of the gaps that will ultimately make our transfer-learning model truly scale-invariant w.r.t. the different input tasks. In particular, I made sure that the only missing piece is the noise-model stratification, which should be added in a follow-up PR and is analogous to the stratification over means shipped by this PR.
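The parity test described above boils down to asserting that two independently constructed models produce identical posterior statistics on the same data. A minimal, runnable sketch of that assertion pattern (pure NumPy stand-ins; `posterior_stats` and the data are illustrative placeholders, not BayBE code):

```python
import numpy as np

# Hypothetical sketch of a parity assertion: in the real test, the two
# inputs would come from "fit via the BayBE BOTORCH preset" and "fit via
# BoTorch directly". Here both are stubbed with the same closed-form
# computation so the assertion pattern itself is runnable.

def posterior_stats(train_y: np.ndarray) -> tuple[float, float]:
    """Stand-in for extracting posterior mean/std at a test point."""
    return float(train_y.mean()), float(train_y.std())

def assert_parity(stats_a, stats_b, atol=1e-8):
    """The core of the parity test: statistics must match to tolerance."""
    mean_a, std_a = stats_a
    mean_b, std_b = stats_b
    assert np.isclose(mean_a, mean_b, atol=atol)
    assert np.isclose(std_a, std_b, atol=atol)

y = np.array([0.1, 0.4, 0.35, 0.8])
assert_parity(posterior_stats(y), posterior_stats(y))
```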

@AdrianSosic AdrianSosic self-assigned this Mar 2, 2026
@AdrianSosic AdrianSosic added the new feature New functionality label Mar 2, 2026
@AdrianSosic AdrianSosic mentioned this pull request Mar 2, 2026
20 tasks
@AdrianSosic AdrianSosic changed the base branch from main to dev/gp March 2, 2026 15:15
@AdrianSosic AdrianSosic added the dev label Mar 2, 2026
@AdrianSosic AdrianSosic force-pushed the feature/botorch_preset branch from a149bb3 to c3885f6 Compare March 3, 2026 08:34
@AdrianSosic AdrianSosic force-pushed the feature/botorch_preset branch from c3885f6 to 4f4fd55 Compare April 17, 2026 08:18
@AdrianSosic AdrianSosic marked this pull request as ready for review April 17, 2026 10:46
Copilot AI review requested due to automatic review settings April 17, 2026 10:46
Contributor

Copilot AI left a comment


Pull request overview

Adds a new BOTORCH Gaussian Process preset intended to exactly mirror BoTorch GP behavior (single- and multi-task), alongside supporting kernel-factory refactors and validation improvements.

Changes:

  • Introduce GaussianProcessPreset.BOTORCH and implement BoTorch-aligned kernel / mean / likelihood factories (including custom GPyTorch components for multitask mean + likelihood).
  • Add parameter-kind classification (ParameterKind) and automatic parameter-kind validation for kernel factories via a new _KernelFactory base.
  • Extend tests (BoTorch parity + kernel-factory validation) and update the Streamlit surrogate-model demo to expose GP presets and multitask/transfer-learning toggles.

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| baybe/surrogates/gaussian_process/presets/botorch.py | Adds BoTorch preset factories for kernel/mean/likelihood. |
| baybe/surrogates/gaussian_process/components/_gpytorch.py | Introduces custom GPyTorch mean + multitask likelihood helper to match BoTorch behavior. |
| baybe/surrogates/gaussian_process/components/kernel.py | Adds `_KernelFactory` with parameter-kind validation; adjusts `ICMKernelFactory`. |
| baybe/surrogates/gaussian_process/presets/{edbo,edbo_smoothed}.py | Migrates kernel factories to `_KernelFactory` and updates dimension logic. |
| baybe/surrogates/gaussian_process/presets/{core,__init__.py} | Adds BOTORCH enum entry and re-exports preset factories. |
| baybe/parameters/{enum,base.py} | Adds `ParameterKind` and exposes `Parameter.kind`. |
| baybe/parameters/selectors.py | Removes `_ParameterSelectorMixin` (superseded by `_KernelFactory`). |
| tests/test_gp.py | Adds parity test asserting BOTORCH preset reproduces BoTorch posterior stats. |
| tests/test_kernel_factories.py | Adds tests for kernel-factory parameter-kind validation. |
| streamlit/surrogate_models.py | Adds GP preset selector + multitask / transfer-learning UI and logic. |
| docs/conf.py | Updates nitpick ignore list for refactor. |
| CHANGELOG.md | Documents new presets and kernel-factory parameter-kind validation. |
Comments suppressed due to low confidence (1)

baybe/surrogates/gaussian_process/presets/edbo.py:85

  • `effective_dims` is computed as `train_x.shape[-1]`, which includes the task-feature column in transfer-learning searchspaces. Since this factory explicitly excludes `TaskParameter` via `parameter_selector`, the priors should likely be based on the non-task dimensionality (i.e., exclude the task feature) to avoid skewing the dimension-dependent prior heuristics. Consider deriving `effective_dims` from the selected non-task dimensions instead (e.g., subtract task dims or count comp-rep columns for selected parameters).

```python
effective_dims = train_x.shape[-1]

switching_condition = _contains_encoding(
    searchspace.discrete, _EDBO_ENCODINGS
) and (effective_dims >= 50)
```


Comment on lines +63 to 64
```python
effective_dims = train_x.shape[-1]
```

Copilot AI Apr 17, 2026


`effective_dims = train_x.shape[-1]` will count the task-feature column in multitask searchspaces, even though this factory's default `parameter_selector` excludes `TaskParameter`. This makes the dimension-dependent prior interpolation depend on a dimension that the kernel is not actually modeling. Consider computing `effective_dims` from the selected non-task dimensions (or excluding the task feature explicitly) to keep priors consistent in transfer-learning settings.

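One possible fix along these lines is to subtract the task columns from the trailing dimension before applying the prior heuristics. A sketch only, assuming the task-column count is known (`n_task_cols` is an illustrative name, not BayBE's API):

```python
import numpy as np

# Illustrative sketch: derive the prior-relevant dimensionality by
# excluding task-feature columns instead of taking the raw trailing
# dimension of the training inputs.
def effective_dims_without_task(train_x, n_task_cols=0):
    """Number of input columns the kernel actually models."""
    return train_x.shape[-1] - n_task_cols

X = np.zeros((10, 6))  # hypothetical: 5 feature columns + 1 task column
assert effective_dims_without_task(X, n_task_cols=1) == 5
assert effective_dims_without_task(X) == 6
```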
Comment on lines +178 to +186
```python
base_idcs = (
    set(dims)
    if (dims := base_kernel.active_dims.tolist()) is not None
    else None
)
task_idcs = (
    set(dims)
    if (dims := task_kernel.active_dims.tolist()) is not None
    else None
```

Copilot AI Apr 17, 2026


`base_kernel.active_dims` / `task_kernel.active_dims` can be `None` in GPyTorch kernels. Calling `.tolist()` unconditionally will raise an `AttributeError` (`NoneType` has no attribute `tolist`). Consider guarding with an `is not None` check (or using `getattr(..., None)`) before converting to a set, and treat `None` explicitly in the validation logic.

Suggested change
```diff
-base_idcs = (
-    set(dims)
-    if (dims := base_kernel.active_dims.tolist()) is not None
-    else None
-)
-task_idcs = (
-    set(dims)
-    if (dims := task_kernel.active_dims.tolist()) is not None
-    else None
+base_active_dims = base_kernel.active_dims
+task_active_dims = task_kernel.active_dims
+base_idcs = (
+    all_idcs
+    if base_active_dims is None
+    else set(base_active_dims.tolist())
+)
+task_idcs = (
+    all_idcs
+    if task_active_dims is None
+    else set(task_active_dims.tolist())
```

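The pitfall here is evaluation order: the walrus expression calls `.tolist()` before the `is not None` check ever runs, so a `None` `active_dims` raises immediately. A minimal demonstration of the failure and the safe guard (plain NumPy stand-in; names illustrative):

```python
import numpy as np

def to_index_set(active_dims):
    """Safely convert an optional dims array to a set (None means 'all dims')."""
    # Guard BEFORE dereferencing, unlike the walrus version that calls
    # .tolist() unconditionally and raises AttributeError on None.
    return None if active_dims is None else set(active_dims.tolist())

assert to_index_set(None) is None
assert to_index_set(np.array([0, 2])) == {0, 2}

# The original walrus pattern fails on None, because active_dims.tolist()
# is evaluated before the "is not None" comparison:
active_dims = None
try:
    idcs = set(dims) if (dims := active_dims.tolist()) is not None else None
except AttributeError:
    idcs = None  # reached: NoneType has no attribute 'tolist'
assert idcs is None
```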
Comment thread tests/test_gp.py
```python
measurements_mt = create_fake_input(
    searchspace_mt.parameters, objective.targets, n_rows=100
)
baybe_kernel = ScaleKernel(AdditiveKernel([MaternKernel(), RBFKernel()]))
```

Copilot AI Apr 17, 2026


`baybe_kernel` is assigned twice; the first assignment is immediately overwritten and never used. This makes the test setup harder to read and may hide which kernel is actually used in `test_gpytorch_components`. Remove the redundant assignment or use distinct names for single-task vs. multitask kernels.

Suggested change
```diff
-baybe_kernel = ScaleKernel(AdditiveKernel([MaternKernel(), RBFKernel()]))
```

Comment on lines +110 to +112
```python
0 if st_enable_multitask else 1,
20,
0 if st_enable_multitask else 5,
```

Copilot AI Apr 17, 2026


When Multi-task is enabled, the on-task training-point slider allows 0. In the `st_use_separate_gps` branch (transfer learning unchecked), the on-task recommender is still run on the on-task subset, which will be empty and will likely raise during GP fitting/recommendation. Consider enforcing a minimum of 1 on-task point when `st_use_separate_gps` is True, or add an explicit guard that shows a Streamlit error and stops early when the on-task dataset is empty.

Suggested change
```diff
-0 if st_enable_multitask else 1,
-20,
-0 if st_enable_multitask else 5,
+1,
+20,
+5,
```

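The suggested early guard could look like the following sketch. This is a pandas-only stand-in; in the actual Streamlit app the error branch would use `st.error(...)` followed by `st.stop()`, and the function and column names here are illustrative:

```python
import pandas as pd

def require_on_task_points(df: pd.DataFrame, task_col: str = "task") -> pd.DataFrame:
    """Fail early when the on-task training subset is empty."""
    on_task = df[df[task_col] == "on-task"]
    if on_task.empty:
        # In the Streamlit demo this would be st.error(...) + st.stop()
        # instead of raising.
        raise ValueError("Need at least one on-task training point.")
    return on_task

df = pd.DataFrame({"task": ["on-task", "off-task", "off-task"]})
assert len(require_on_task_points(df)) == 1
```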
Comment thread CHANGELOG.md
```markdown
for creating a `ScaleKernel` with a fixed output scale

### Changed
- Gaussian processes no longer invoke leave-one-out training for multitask scenarios but
```

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There’s trailing whitespace at the end of this line (after “but”). This tends to cause noisy diffs and may violate whitespace linting in some setups; consider removing it.

Suggested change
```diff
-- Gaussian processes no longer invoke leave-one-out training for multitask scenarios but 
+- Gaussian processes no longer invoke leave-one-out training for multitask scenarios but
```

Comment thread tests/test_gp.py
Comment on lines +187 to +196
```python
@pytest.mark.parametrize("multitask", [False, True], ids=["single-task", "multi-task"])
def test_botorch_preset(multitask: bool, monkeypatch):
    """The BoTorch preset exactly mimics BoTorch's behavior."""
    if multitask:
        monkeypatch.setenv("BAYBE_DISABLE_CUSTOM_KERNEL_WARNING", "true")
        sp = searchspace_mt
        data = measurements_mt
    else:
        sp = searchspace
        data = measurements
```

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The multitask branch sets `BAYBE_DISABLE_CUSTOM_KERNEL_WARNING` to bypass the deprecation error. Since `BOTORCH` is a first-party preset, it would be better if it didn't require an environment-variable escape hatch in tests. Consider adjusting the "custom kernel" detection / deprecation gate so that built-in presets (e.g. `BotorchKernelFactory`) don't trigger it in the first place, and then drop this env var from the test.

Suggested change
```diff
-@pytest.mark.parametrize("multitask", [False, True], ids=["single-task", "multi-task"])
-def test_botorch_preset(multitask: bool, monkeypatch):
-    """The BoTorch preset exactly mimics BoTorch's behavior."""
-    if multitask:
-        monkeypatch.setenv("BAYBE_DISABLE_CUSTOM_KERNEL_WARNING", "true")
-        sp = searchspace_mt
-        data = measurements_mt
-    else:
-        sp = searchspace
-        data = measurements
+def test_botorch_preset():
+    """The BoTorch preset exactly mimics BoTorch's behavior."""
+    sp = searchspace
+    data = measurements
```

Comment on lines +189 to +190
```python
if base_idcs is not None and (base_idcs > allowed_base_idcs):
    raise ValueError(
```

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The subset check for base-kernel `active_dims` is incorrect: `base_idcs > allowed_base_idcs` checks for a strict superset, not "not a subset". This will miss invalid cases (e.g. `{0, task_idx}`) and potentially flag none. Use a proper subset validation (e.g. `not base_idcs <= allowed_base_idcs`) and consider a clearer error if `active_dims` is `None` (meaning "all dims").

Suggested change
```diff
-if base_idcs is not None and (base_idcs > allowed_base_idcs):
-    raise ValueError(
+if base_idcs is None:
+    raise ValueError(
+        "The base kernel's 'active_dims' must be restricted to the non-task "
+        f"indices {allowed_base_idcs}; got None, which means all dimensions."
+    )
+if not base_idcs <= allowed_base_idcs:
+    raise ValueError(
```

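Python's set comparison operators compare by inclusion, not size: `>` tests for a *strict superset*, so it never flags a set that merely contains an out-of-range index. A quick demonstration of why the buggy check passes invalid input while the subset check catches it:

```python
allowed = {0, 1, 2}  # allowed non-task dimensions
invalid = {0, 5}     # contains an index outside the allowed set

# The buggy check: strict-superset test. {0, 5} is not a superset of
# {0, 1, 2}, so the condition is False and the invalid set slips through.
assert not (invalid > allowed)

# The correct check: "not a subset" catches the out-of-range index.
assert not invalid <= allowed

# A genuinely valid configuration passes the correct check:
valid = {0, 2}
assert valid <= allowed
```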
Comment thread tests/test_gp.py
```python
train_Y = to_tensor(objective.transform(measurements, allow_extra=True))

# >>>>> Code adapted from BoTorch landing page: https://botorch.org/ >>>>>
# NOTE: We normalize according to the searchspace bounds to ensure consisteny with
```

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in comment: “consisteny” → “consistency”.

Suggested change
```diff
-# NOTE: We normalize according to the searchspace bounds to ensure consisteny with
+# NOTE: We normalize according to the searchspace bounds to ensure consistency with
```

```python
TaskParameter(
    name="task",
    values=task_names,
    active_values=["on-task"],
```

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

`TaskParameter` is created with `active_values=["on-task"]`, but the training measurements include rows with `task="off-task"`. Since `SearchSpace.transform` validates dataframe values against each parameter's `active_values`, fitting / recommending in transfer-learning mode will error on the off-task rows. For this demo to work, avoid restricting `active_values` here (or include all task names), and enforce "recommend only on-task" by filtering the recommendation set instead.

Suggested change
```diff
-    active_values=["on-task"],
```

Comment on lines +4 to +5
```python
from botorch.models.multitask import _compute_multitask_mean
from botorch.models.utils.gpytorch_modules import MIN_INFERRED_NOISE_LEVEL
```

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This module imports BoTorch's private `_compute_multitask_mean` (leading underscore). Private APIs can change without notice, which would break BayBE's multitask mean implementation. If the goal is exact BoTorch parity, consider isolating this dependency behind a small compatibility wrapper (with a version check and/or a local fallback implementation) so failures are easier to diagnose when BoTorch updates.


Labels

dev new feature New functionality

2 participants