Skip to content

Add SLAC FEL model files and preliminary workflow#83

Open
pluflou wants to merge 21 commits intomainfrom
slac-fel
Open

Add SLAC FEL model files and preliminary workflow#83
pluflou wants to merge 21 commits intomainfrom
slac-fel

Conversation

@pluflou
Copy link
Copy Markdown
Collaborator

@pluflou pluflou commented Feb 27, 2026

Summary

This pull request introduces support for the SLAC FEL continuous-learning experiment, including configuration, data utilities, and integration with the main example harness. The changes add a new TOML config file, a data utility module, and update the configuration and dependency management to support the new experiment.

Approach

  • Added examples/slac_fel/slac-fel.toml configuration file with model, data, training, continual learning, drift detection, logging, and visualization settings for the SLAC FEL experiment.
  • Added examples/slac_fel/utils.py with data-loading and preprocessing utilities, including dataset, dataloader, feature config, scaler loading, window discovery, and legacy window splitting for SLAC FEL data.
  • Updated examples/utils.py to support the SLAC FEL experiment by returning SLAC_FEL harness when cfg.data.name is slac-fel.

Configuration and dependency updates:

  • Extended ModelCfg and DataCfg dataclasses in src/config/configuration.py to support optional model config paths and window size parameters for time-windowed datasets.
  • Updated pyproject.toml to include new dependencies (types-pyyaml, botorch) required for SLAC FEL experiment and added the examples directory to the package list.

This PR also includes a fix for #85 by @Zilinghan

Dependencies

  • Added: botorch
    note: can work on implementation to remove the need for it

Testing Plan

  • Unit tests
  • Integration tests
  • e2e / smoke test
  • Manual steps: python -m app --help

Documentation

  • Docstrings updated
  • User docs / README updated
  • CHANGELOG entry

Checklist

  • Code formatted (Ruff) → ruff format --check
  • Lint passes (Ruff) → ruff check .
  • Types pass (mypy/pyright) → mypy src
  • Tests pass (pytest) → pytest -q
  • Backward compatibility considered
  • Adequate comments for tricky parts
  • CI green

@pluflou pluflou requested a review from anagainaru February 27, 2026 20:12
Copy link
Copy Markdown
Collaborator

@anagainaru anagainaru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I recommend the two changes so you pass the CI. I cannot run this since I don't have your data/model but I changed it to work with my model and it seems like it's running fine.

In addition to the changes requested, you need types-pyyaml in pyproject.

diff --git a/pyproject.toml b/pyproject.toml
index 8b252e9..e3657bf 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -20,7 +20,8 @@ dependencies = [
     "transformers (>=4.57.1,<5.0.0)",
     "river (>=0.21.0,<0.24.0)",
     "evidently (>=0.4.0,<0.8.0)",
-    "matplotlib (>=3.10.7,<4.0.0)"
+    "matplotlib (>=3.10.7,<4.0.0)",
+    "types-pyyaml (>=6.0.12.20250915,<7.0.0.0)"
 ]

 [dependency-groups]
diff --git a/poetry.lock b/poetry.lock
index 426ce3d..c232f3e 100644
--- a/poetry.lock
+++ b/poetry.lock
@@ -4874,6 +4874,18 @@ files = [
     {file = "types_pytz-2025.2.0.20251108.tar.gz", hash = "sha256:fca87917836ae843f07129567b74c1929f1870610681b4c92cb86a3df5817bdb"},
 ]

+[[package]]
+name = "types-pyyaml"
+version = "6.0.12.20250915"
+description = "Typing stubs for PyYAML"
+optional = false
+python-versions = ">=3.9"
+groups = ["main"]
+files = [
+    {file = "types_pyyaml-6.0.12.20250915-py3-none-any.whl", hash = "sha256:e7d4d9e064e89a3b3cae120b4990cd370874d2bf12fa5f46c97018dd5d3c9ab6"},
+    {file = "types_pyyaml-6.0.12.20250915.tar.gz", hash = "sha256:0f8b54a528c303f0e6f7165687dd33fafa81c807fcac23f632b63aa624ced1d3"},
+]
+
 [[package]]
 name = "types-requests"
 version = "2.32.4.20250913"

Let me know when it's ready to be merged and I will approve.

Comment on lines +184 to +185
X_scaled: Tensor = input_scaler.transform(X_raw)
y_scaled: Tensor = output_scaler.transform(y_raw)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

diff --git a/examples/slac_fel/utils.py b/examples/slac_fel/utils.py
index e52df9c..59d347d 100644
--- a/examples/slac_fel/utils.py
+++ b/examples/slac_fel/utils.py
@@ -181,8 +181,8 @@ def load_fel_data(
     X_raw = torch.tensor(df[input_cols].values, dtype=torch.float32)
     y_raw = torch.tensor(df[output_cols].values, dtype=torch.float32)

-    X_scaled: Tensor = input_scaler.transform(X_raw)
-    y_scaled: Tensor = output_scaler.transform(y_raw)
+    X_scaled: Tensor = input_scaler(X_raw)
+    y_scaled: Tensor = output_scaler(y_raw)

     return X_scaled, y_scaled, df.index

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these are botorch transformers right now (from the model source). I can save and load them differently if we don't want the added dependency

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CI was complaining, it's prob not clear to mypy what those are.

@pluflou
Copy link
Copy Markdown
Collaborator Author

pluflou commented Mar 3, 2026

I updated the way the model is being loaded (there was a bug before), and the config for the drift detector (not confident that these make sense though). There seems to be something off with the logging, so I don't want to mark it as ready to merge yet.

@anagainaru
Copy link
Copy Markdown
Collaborator

I updated the way the model is being loaded (there was a bug before), and the config for the drift detector (not confident that these make sense though). There seems to be something off with the logging, so I don't want to mark it as ready to merge yet.

I will try to run it as well tomorrow before our meeting

@pluflou pluflou marked this pull request as ready for review March 3, 2026 18:34
@pluflou
Copy link
Copy Markdown
Collaborator Author

pluflou commented Mar 3, 2026

I updated the docstrings to be in Google style, which seems to be the dominant/intended format across the codebase (although config/configuration.py and some example modules are using NumPy style)

@ScSteffen ScSteffen added the example Adding a new test case or illustration case label Mar 4, 2026
@Zilinghan Zilinghan mentioned this pull request Mar 4, 2026
16 tasks
@anagainaru
Copy link
Copy Markdown
Collaborator

anagainaru commented Mar 4, 2026

After I rebase this on main you can use in the toml file the following option (introduced in #88):

max_ckpts = 1
ckpts_path = "output/slac/"

to save checkpoints after each CL loop and only keep the most recent one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

example Adding a new test case or illustration case

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants