
Feat: Add option to attach BEC head to penultimate layer #5

Open
AugustinLu wants to merge 1 commit into feature/add-bec-support-14624376731387974145 from feature-bec-head-attachment-15792460321465469416

Conversation

@AugustinLu
Owner

This PR adds the option to attach the Born Effective Charge (BEC) prediction head to the penultimate convolution layer instead of the last one.

Previously, training with BEC forced the final layer of the network to output L>0 representations in order to satisfy the tensor requirements of the predict_bec module. The new train_bec_from configuration key (either 'last' or 'penultimate') lets users attach the BEC head one layer earlier. When set to 'penultimate', the final layer falls back to computationally cheaper lmax=0 representations (needed only for the energy readout), yielding a faster network with fewer parameters. A sketch of the configuration is shown below.
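The snippet below is an illustrative sketch of how the new key might appear in a model configuration; only `train_bec_from` is introduced by this PR, and the surrounding key names are assumptions for context.

```python
# Illustrative sketch only: key names other than `train_bec_from` are assumptions.
model_config = {
    "train_bec": True,
    # 'last' (the default, matching previous behaviour) keeps L>0 irreps in the
    # final layer so the BEC head can read them there.
    # 'penultimate' attaches the BEC head one layer earlier, letting the final
    # convolution fall back to lmax=0 (scalar) features for the energy readout.
    "train_bec_from": "penultimate",
}
```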

This also fixes a bug where FlashTP would crash and PyTorch would run out of memory when transitioning from lmax=3 representations in the penultimate layer directly into an lmax=0 final layer. The fix explicitly slices away the L>0 components with an IrrepsLinear filter before entering the final scalar block.
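To illustrate the scalar-filter idea, here is a minimal sketch using e3nn's `o3.Linear` as a stand-in for the project's IrrepsLinear module (that equivalence is an assumption); the irreps dimensions are made up for the example.

```python
import torch
from e3nn import o3

# Penultimate-layer features carry irreps up to lmax=3; the final block only
# needs scalars (lmax=0).
irreps_penultimate = o3.Irreps("32x0e + 16x1e + 8x2e + 4x3e")
irreps_scalar = o3.Irreps("32x0e")

# o3.Linear only builds paths between matching irreps, so with a scalar-only
# output it acts as an explicit filter that drops every L>0 component before
# the final scalar block (avoiding the FlashTP crash / OOM described above).
scalar_filter = o3.Linear(irreps_penultimate, irreps_scalar)

x = irreps_penultimate.randn(10, -1)   # 10 nodes with lmax=3 features
x_scalar = scalar_filter(x)            # shape (10, 32): scalars only
```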

Backward compatibility is preserved, as the default remains 'last'. All tests pass for both configurations.


PR created automatically by Jules for task 15792460321465469416 started by @AugustinLu

Adds `train_bec_from` configuration (defaulting to 'last', with option 'penultimate').
When set to 'penultimate', the final convolution layer avoids computing expensive
tensor features (`lmax>0`), improving efficiency while still extracting required features
for the Born Effective Charge head. Also includes a scalar filter to prevent FlashTP and
PyTorch OOM bugs when the final layer is purely scalar.

Co-authored-by: AugustinLu <59640670+AugustinLu@users.noreply.github.com>
@google-labs-jules

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.
