202 feature configuration support wildcards and regex for array definitions #206

Merged
JeanLucPons merged 13 commits into main from
202-feature-configuration---support-wildcards-and-regex-for-array-definitions
Mar 10, 2026
Conversation

@gupichon
Contributor

@gupichon gupichon commented Mar 3, 2026

Description

This pull request extends the configuration layer to improve scalability and maintainability of YAML-based definitions.

Two complementary features are introduced:

  1. Pattern-based element selection inside elements:

    • Strings containing * or ? are interpreted as wildcard patterns.
    • Strings prefixed with re: are interpreted as regular expressions.
    • Other strings are treated as explicit element identifiers (backward-compatible behavior).

  2. Python-based configuration macros via a new optional key, elements_code:

    • Allows scripted generation of configuration entries directly from YAML.
    • The embedded Python code must return either:

      • a dict (single entry), or
      • a list[dict] (expanded into the surrounding list).
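
The selection rules above could be sketched as follows. This is a minimal illustration and not the code from this PR: `classify` and `resolve` are hypothetical names, the device names are made up, wildcard matching is assumed to follow `fnmatch` semantics, and regular expressions are assumed to be applied with `re.match`.

```python
import fnmatch
import re


def classify(entry: str) -> str:
    """Return 'regex', 'wildcard' or 'literal' for one `elements` entry."""
    # The re: prefix is checked first, since a regex may itself contain * or ?.
    if entry.startswith("re:"):
        return "regex"
    if "*" in entry or "?" in entry:
        return "wildcard"
    return "literal"


def resolve(entry: str, device_names: list[str]) -> list[str]:
    """Expand one entry against the known names, keeping declaration order."""
    kind = classify(entry)
    if kind == "regex":
        pattern = re.compile(entry[len("re:"):])
        return [n for n in device_names if pattern.match(n)]
    if kind == "wildcard":
        return [n for n in device_names if fnmatch.fnmatchcase(n, entry)]
    return [entry]  # explicit ID, passed through unchanged


# Hypothetical device names, in declaration order.
devices = ["BPM_C01-01", "BPM_C01-02", "QF1A-C01", "BPM_C02-01"]
print(resolve("BPM_C01*", devices))
print(resolve("re:^BPM_C0[12]-01$", devices))
print(resolve("QF1A-C01", devices))
```

Because each pattern is expanded by scanning the declared names, the matches come out in declaration order, which is the property discussed later in this conversation.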

These changes are necessary to:

  • Avoid fragile explicit ID lists,
  • Improve maintainability when naming conventions evolve,
  • Support structured generation (loops, ranges, parametrized names),
  • Keep configuration self-contained while preserving reproducibility.

The macro mechanism does not replace external YAML generation scripts. External generation using Python remains fully possible and unaffected.


Related Issue

Features/issues described there are:

  • new feature: wildcard-based element resolution implemented by interpreting * and ? directly inside elements entries to keep the schema unchanged and preserve backward compatibility.
  • new feature: regex-based element resolution implemented using the re: prefix to explicitly distinguish regular expressions from literal IDs.
  • new feature: Python macro support implemented via elements_code, executed before object construction, replacing the macro block with the returned configuration structure.

Changes to existing functionality

  • Array resolution logic was extended to interpret wildcard and regex patterns inside elements entries.
    This was implemented at the configuration parsing level to preserve object model integrity and deterministic ordering.

  • Configuration parsing pipeline was extended with a pre-processing step detecting elements_code blocks.
    The code is executed in a controlled namespace and must return either a dict or list[dict].
    The returned structure replaces the macro block before normal parsing continues.

  • Duplicate resolution behavior after pattern expansion was made deterministic.
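
The macro pre-processing step described above might look roughly like this. It is only a sketch under stated assumptions: `run_macro` and `expand_macros` are hypothetical names, the macro string is assumed to be a function body ending in `return`, the `example.bpm` type is invented, and a fresh `exec` namespace is used to stand in for the "controlled namespace" (it is not a security sandbox).

```python
import textwrap


def run_macro(code: str) -> list[dict]:
    """Execute a macro body and normalise its result to a list of dicts."""
    # Assumption: the YAML string is wrapped into a function so it can `return`.
    source = "def __macro__():\n" + textwrap.indent(code, "    ")
    namespace: dict = {}
    exec(source, namespace)  # fresh namespace per macro; NOT a sandbox
    result = namespace["__macro__"]()
    if isinstance(result, dict):
        return [result]
    if isinstance(result, list) and all(isinstance(r, dict) for r in result):
        return result
    raise TypeError("elements_code must return a dict or a list[dict]")


def expand_macros(entries: list[dict]) -> list[dict]:
    """Replace each {'elements_code': ...} block by the entries it returns."""
    expanded = []
    for entry in entries:
        if "elements_code" in entry:
            expanded.extend(run_macro(entry["elements_code"]))
        else:
            expanded.append(entry)
    return expanded


# Hypothetical configuration list, as it might look after YAML loading.
config = [
    {"type": "example.bpm", "name": "BPM_C01-01"},
    {"elements_code": "return [{'type': 'example.bpm', 'name': f'BPM_C02-{i:02d}'} for i in range(1, 3)]"},
]
print(expand_macros(config))
```

Splicing the returned list in place keeps the surrounding entries in their original positions, so normal parsing can continue on a plain list of dicts.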

No existing YAML files are impacted. All previous configurations remain valid.


Testing

The feature is validated primarily through modifications to existing configuration files.

Only YAML configuration files were updated in order to:

  • Exercise wildcard resolution inside elements,
  • Exercise regex resolution using the re: prefix,
  • Validate macro execution via elements_code.

All existing pytest test suites pass without modification, meaning:

  • No regression was introduced,
  • The new features integrate transparently with the current test coverage,
  • Existing physics, control-system and array tests implicitly validate the new resolution logic.

In addition:

  • test_load_conf_with_code was added
    This test explicitly verifies elements_code execution and serves as a minimal documented example of configuration scripting.

This test is not strictly required for coverage, since the feature is already exercised indirectly through existing tests, but it improves clarity and provides a reference example.


Verify that your checklist complies with the project

  • New and existing unit tests pass locally
  • Tests were added where appropriate (only one explicit macro test was necessary)
  • The code is commented where appropriate
  • Any existing features are not broken (unless there is an explicit change to an existing functionality)

Contributor

@gubaidulinvadim gubaidulinvadim left a comment

I've added a test to check that duplicates are still detected if there's a wildcard in the config. Maybe it's unnecessary, I just wanted to be sure.

- BPM_C03-08
- BPM_C03-09
- BPM_C03-10
- BPM*
Contributor

This might break some examples or at least change behaviour. For BPMs, the order matters too. The correct array should be ordered starting from C04-02.

Contributor Author

In this specific case, it does work because the device declaration order is preserved. The macro at the end causes the sequence to start at C04-02. But I’ll look for a more explicit expression in two steps.

Contributor Author

I actually implemented it in my previous commit:

- re:^BPM_C(0[4-9]|1[0-9]|2[0-9]|3[0-2]).*$
- re:^BPM_C(0[1-3]).*$

I’m not sure it’s really clearer. Which version do you prefer?
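
To illustrate how the two-pattern form fixes the starting point of the array, here is a small sketch. The device names are hypothetical (two BPMs per cell, cells 01 to 32, declared in ring order) and do not reproduce the real naming; patterns are assumed to be applied one after the other, each one scanning the declared names in order.

```python
import re

# Hypothetical names declared in ring order: BPM_C01-01, BPM_C01-02, ..., BPM_C32-02.
devices = [f"BPM_C{cell:02d}-{i:02d}" for cell in range(1, 33) for i in (1, 2)]

patterns = [
    r"^BPM_C(0[4-9]|1[0-9]|2[0-9]|3[0-2]).*$",  # cells 04..32 first
    r"^BPM_C(0[1-3]).*$",                       # then cells 01..03
]

ordered = []
for pattern in patterns:
    rx = re.compile(pattern)
    # Each pattern contributes its matches in declaration order.
    ordered.extend(name for name in devices if rx.match(name))

print(ordered[0], ordered[-1], len(ordered))
```

With these made-up names the array begins in cell 04 and wraps around to end in cell 03, and no name is selected twice because the two patterns cover disjoint cell ranges.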

Contributor Author

Additionally, I wanted to make sure that the BPM* declaration actually preserves the order. And it does, because the tests would not work otherwise.

Contributor

I think the re: version is more explicit and clearer. With the macro, I would've just thought that the elements are created in that order, but I wouldn't understand that this will be the order of the elements in the array.

If it preserves the order, it's good to go. But I wasn't sure whether any tests actually check for order preservation. For example, orbit correction would still work with an "incorrect" order of BPMs. The order is more of a convention.

I like the idea of @JeanLucPons to have an exclusion wildcard. I'd imagine it's one of the really common cases. For example, let's say we have a faulty BPM and want to quickly check if everything will work correctly if we exclude it from the BPM_ORBIT array.

Contributor Author

OK, I'll look into a set of improvements.

…tion---support-wildcards-and-regex-for-array-definitions
@JeanLucPons
Contributor

Would it be possible to update EBSTune and EBSOrbit using wildcards?
For example:

EBSTune.yaml (if possible)

- type: pyaml.arrays.magnet
  name: QForTune
  elements:
    - QD2*
    - QF1*
    - !QD2A-C04 #  excluded
    - !QF1E-C03 #  excluded

- type: pyaml.arrays.magnet
  name: HCorr
  elements:
    - S*-H

- type: pyaml.arrays.magnet
  name: VCorr
  elements:
    - S*-V

etc...

@gupichon
Contributor Author

gupichon commented Mar 5, 2026

I haven’t implemented any behavior to explicitly exclude elements so far. Moreover, since the order matters, I only made very simple and obvious changes. That’s also why I added the possibility to integrate some code. It can become clearer at some point than maintaining a long list of complex regex patterns.
I’ll do my best with those files anyway.

@JeanLucPons
Contributor

JeanLucPons commented Mar 5, 2026

Forget the exclusion in EBSTune.yaml; these 2 magnets are in the lattice but not in the YAML file.
But an exclusion mechanism could be useful.
Of course it is up to the user to respect an order in the device declaration.

@gupichon
Contributor Author

gupichon commented Mar 5, 2026

I suggest the following three-step approach:

  1. Resolve all names matching the include patterns.
  2. Remove the names matching the exclude patterns (starting with !).
  3. Order the resulting list according to the device list.
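
The three steps above can be sketched as follows. This is an illustration of the proposal only, not the merged implementation: `select` is a hypothetical name, the device names are invented, matching is assumed to use `fnmatch`, and the exclusion prefix is a parameter because the thread later moves from `!` to `~`.

```python
import fnmatch


def select(patterns: list[str], device_names: list[str], exclude_prefix: str = "!") -> list[str]:
    """Resolve includes, drop excludes, then order by the device list."""
    includes = [p for p in patterns if not p.startswith(exclude_prefix)]
    excludes = [p[len(exclude_prefix):] for p in patterns if p.startswith(exclude_prefix)]

    def matches(name: str, pats: list[str]) -> bool:
        return any(fnmatch.fnmatchcase(name, p) for p in pats)

    # Step 1: every name matching at least one include pattern.
    included = {n for n in device_names if matches(n, includes)}
    # Step 2: remove the names matching an exclude pattern.
    kept = {n for n in included if not matches(n, excludes)}
    # Step 3: order the result according to the device list.
    return [n for n in device_names if n in kept]


# Hypothetical device declaration order.
devices = ["QF1A-C04", "QD2A-C04", "QF1E-C03", "QD2E-C03", "SH1A-C04"]
print(select(["QD2*", "QF1*", "!QD2A-C04"], devices))
print(select(["QD2*", "QF1*", "~QD2A-C04"], devices, exclude_prefix="~"))
```

Note that step 3 makes the order of the patterns irrelevant: listing `QD2*` before `QF1*` does not put the QD2 magnets first, which is exactly the limitation raised in the next comment.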

@gupichon
Contributor Author

gupichon commented Mar 5, 2026

I just have a doubt about the sorting. It does not allow users to freely sort the items independently of the device ordering. However, I don’t think this is really a problem in our context.
Do you have an opinion @gubaidulinvadim ?

@gubaidulinvadim
Contributor

I think in the vast majority of cases (if not all), the elements in families/arrays are sorted with respect to their position in the lattice. There could be some exceptions: for example, if you merge two arrays A1 and A2, I'd imagine the order expected by a user will be [A1[0], ..., A1[n], A2[0], ..., A2[n]] and not something that preserves position-in-the-lattice order.
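
The difference between the two orderings can be shown with a tiny sketch. The names and the lattice order below are hypothetical and only serve to contrast concatenation with lattice-position sorting.

```python
# Hypothetical ring order in which elements of A1 and A2 interleave.
lattice = ["A2_1", "A1_1", "A2_2", "A1_2"]
a1 = ["A1_1", "A1_2"]
a2 = ["A2_1", "A2_2"]

# Order a user merging two arrays would probably expect: all of A1, then all of A2.
concatenated = a1 + a2

# Order obtained when the result is re-sorted by position in the lattice.
position = {name: i for i, name in enumerate(lattice)}
lattice_sorted = sorted(a1 + a2, key=position.__getitem__)

print(concatenated)
print(lattice_sorted)
```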

@gupichon
Contributor Author

gupichon commented Mar 5, 2026

Yes, I'm currently adapting test_array.py because of such issues… It's quite tedious.

…ble to add an exclude pattern but a test must be added for that part. Also, the test `test_tune_hardware.py` is not working anymore but it is probably due to some numeric accuracy problems. Those 2 points will be solved later.
@gupichon
Contributor Author

gupichon commented Mar 5, 2026

OK, it's done, but I still have an issue with one test: test_tune_hardware.py
It is probably only due to numerical accuracy problems. I'll need the help of @JeanLucPons for that; I think it's minor.
Also, I need to add a new test.

Comment on lines +44 to +45
assert np.abs(currents[62] - 88.04522942) < 1e-8
assert np.abs(currents[63] - 88.26677735) < 1e-8
Contributor

How did currents[0] and currents[1] become currents[62] and currents[63]? From what I remember, the shape of the currents array should be (2,).

Contributor Author

For a single CFM, the shape could be (3,). In this case, however, it corresponds to an array of all correctors (QForTune, size 124). Since the order has changed, the matrix computation becomes slightly different, introducing numerical errors. I updated the indices to reflect the new element positions, but the resulting values differ slightly.
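
As a generic illustration of why element order can change results at the rounding level, floating-point addition is not associative, so accumulating the same values in a different order may give a slightly different total. The sketch below is not taken from the test in question and does not establish that this was the cause here; `running_sum` is a hypothetical helper.

```python
import math


def running_sum(values):
    """Accumulate left to right with plain floating-point addition."""
    total = 0.0
    for v in values:
        total += v
    return total


forward = running_sum([0.1, 0.2, 0.3])
backward = running_sum([0.3, 0.2, 0.1])

# The two totals differ in the last bits but agree within a tolerance.
print(forward == backward)
print(math.isclose(forward, backward, rel_tol=1e-12))
```

This is why comparisons such as the assertions above use a tolerance rather than exact equality.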

@JeanLucPons
Contributor

JeanLucPons commented Mar 5, 2026

Note that you cannot use the exclamation mark (!) in YAML. It is a reserved character (not compatible with JSON).
Sorry for my bad example.

@gupichon
Contributor Author

gupichon commented Mar 6, 2026

Ah, you're right! I'll find another solution.

@gupichon
Contributor Author

gupichon commented Mar 6, 2026

Ok, it's done. I used the ~ character.

@JeanLucPons
Contributor

JeanLucPons commented Mar 9, 2026

- type: pyaml.arrays.magnet
  name: QForTune
  elements:
    - QD2*
    - QF1*

Theoretically, I do not expect any difference between the configuration above and the one with the full list.
The tune response matrix should be identical.
If this is not the case, there is definitely a bug somewhere, either in the wildcard implementation or in the devices list. I'll check ASAP.
We already discussed this ordering problem with @GamelinAl in a previous discussion.

@gupichon
Contributor Author

gupichon commented Mar 9, 2026

- type: pyaml.arrays.magnet
  name: QForTune
  elements:
    - QD2*
    - QF1*

Theoretically, I do not expect any difference between the configuration above and the one with the full list. The tune response matrix should be identical. If this is not the case, there is definitely a bug somewhere, either in the wildcard implementation or in the devices list. I'll check ASAP. We already discussed this ordering problem with @GamelinAl in a previous discussion. I'll try to find the link.

I checked, and the two lists were different. The one in the device section and the one using the wildcard are now identical and match the ring ordering. The original QForTune list had a different order.

…tion---support-wildcards-and-regex-for-array-definitions
@JeanLucPons
Contributor

Could you explain the change below?
I do not expect any change of this kind when using wildcards.

image

@JeanLucPons
Contributor

JeanLucPons commented Mar 10, 2026

I restored the test and the native order of devices.
However, I would like a PR to address only the one point it was originally intended for (wildcards).
Code review is already difficult enough.
Thanks for understanding.

@gubaidulinvadim gubaidulinvadim self-requested a review March 10, 2026 13:31
@JeanLucPons JeanLucPons merged commit c70627f into main Mar 10, 2026
3 checks passed
@JeanLucPons
Contributor

I merged the PR.
I did not delete the branch, in case @gupichon-soleil wants to recover the inline code inside the YAML file for a future PR.

Development

Successfully merging this pull request may close these issues.

Feature: Configuration - support wildcards and regex for array definitions

4 participants