Fix: Replace nn.Buffer with register_buffer by Tar-ive · Pull Request #30 · sapientinc/HRM

Tar-ive · 2025-08-05T05:50:36Z

The core issue was that the original HRM source code uses nn.Buffer, a class that only exists in recent PyTorch releases. On an older installation, training failed with AttributeError: module 'torch.nn' has no attribute 'Buffer', which means the installed torch.nn module does not define Buffer at all.

The Cause: nn.Buffer Is Not Available in Every PyTorch Version
In PyTorch, a "buffer" is a tensor that is part of a model's state (like weights) but is not a parameter that gets updated during training (e.g., a running mean in a normalization layer).

The original code creates buffers like this:
self.weights = nn.Buffer(...)

This works on PyTorch versions that ship the nn.Buffer class. On earlier versions the attribute lookup fails as soon as the module is constructed, so the model cannot be built.

The Change: Using register_buffer
The long-standing way to create and register a buffer, which works on both older and newer PyTorch versions, is the self.register_buffer() method.

We fixed the code by changing lines like the one above to the following pattern:

# 1. Create the tensor you want to be a buffer
weights_tensor = trunc_normal_init_(...)

# 2. Register it as a buffer using the correct method
self.register_buffer('weights', weights_tensor)
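
As a minimal sketch of the pattern above, the toy module below registers tensors with register_buffer and checks where they show up. The class name TinyEmbedding and the shapes are made up for illustration, and torch.randn stands in for the repository's trunc_normal_init_. One point worth checking when porting: register_buffer defaults to persistent=True, so if an original nn.Buffer(...) call passed persistent=False, the same flag would have to be forwarded.

```python
import torch
from torch import nn


class TinyEmbedding(nn.Module):
    """Hypothetical module, only for illustrating register_buffer."""

    def __init__(self, num_rows: int = 4, dim: int = 3) -> None:
        super().__init__()
        # Stand-in for trunc_normal_init_(...); any tensor works here.
        weights_tensor = torch.randn(num_rows, dim)
        # Persistent buffer: saved in state_dict, never returned by parameters().
        self.register_buffer("weights", weights_tensor)
        # Non-persistent buffer: follows .to()/.cuda() but is not saved.
        self.register_buffer("scratch", torch.zeros(dim), persistent=False)

    def forward(self, idx: torch.Tensor) -> torch.Tensor:
        # After registration the buffers are available as normal attributes.
        return self.weights[idx] + self.scratch


model = TinyEmbedding()
state_keys = sorted(model.state_dict().keys())
buffer_names = sorted(name for name, _ in model.named_buffers())
num_params = len(list(model.parameters()))
out = model(torch.tensor([0, 2]))
```

Here state_keys contains only the persistent buffer, buffer_names contains both, and num_params is zero, since neither tensor is a trainable parameter.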

We applied the same fix in three files, because the nn.Buffer pattern appears throughout the repository:

models/sparse_embedding.py

models/layers.py

models/hrm/hrm_act_v1.py
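
To confirm that no other call sites remain after editing the files listed above, a small scan along these lines might help. This is only a sketch using the standard library: the helper name find_nn_buffer_calls is hypothetical, and the regular expression only catches direct calls written as nn.Buffer( or torch.nn.Buffer(, not aliased imports.

```python
import re
from pathlib import Path

# Matches direct constructor calls such as `nn.Buffer(` or `torch.nn.Buffer(`.
NN_BUFFER_CALL = re.compile(r"\bnn\.Buffer\s*\(")


def find_nn_buffer_calls(root: str) -> list[tuple[str, int, str]]:
    """Return (relative path, line number, stripped line) for each match under root."""
    root_path = Path(root)
    hits = []
    for path in sorted(root_path.rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if NN_BUFFER_CALL.search(line):
                rel = path.relative_to(root_path).as_posix()
                hits.append((rel, lineno, line.strip()))
    return hits


# Example usage from the repository root (hypothetical):
#   for rel, lineno, line in find_nn_buffer_calls("."):
#       print(f"{rel}:{lineno}: {line}")
```

An empty result means no direct nn.Buffer( calls are left in the scanned tree.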

With these changes the code no longer depends on nn.Buffer, so it also runs on PyTorch versions that lack it, and training proceeded without errors.

This commit refactors, enhances, and robustifies the unified optimization/evaluation/comparison process. It also incorporates a critical bug fix from PR sapientinc#30.

- Applied a fix from PR sapientinc#30 to address an `AttributeError` related to `nn.Buffer` by replacing it with `register_buffer`.
- Created an `optimization` directory and moved/renamed the relevant scripts.
- Created `optimization/utils.py` to deduplicate code.
- The hyperparameter search space is now loaded from a YAML file.
- Added support for parallel execution of Optuna trials.
- Improved the detail and location of the `comparison_report.md`.
- Improved the console output.
- Added robust error handling to the main scripts.
- Updated `README.md` to reflect the changes.

The final verification step was blocked by a `ModuleNotFoundError` which can be fixed by adding the project root to the Python path.
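
One way to add the project root to the Python path from inside a script is sketched below. It assumes the script lives one directory below the root (for example inside the `optimization` directory). The helper name add_project_root_to_path and the levels_up parameter are hypothetical and not taken from the commit.

```python
import sys
from pathlib import Path


def add_project_root_to_path(script_file: str, levels_up: int = 1) -> Path:
    """Prepend the directory `levels_up` levels above the script's folder to sys.path."""
    # parents[0] is the script's own folder, parents[1] is one level above it.
    project_root = Path(script_file).resolve().parents[levels_up]
    root_str = str(project_root)
    if root_str not in sys.path:
        # Insert at the front so project modules win over same-named installed ones.
        sys.path.insert(0, root_str)
    return project_root


# Example usage at the top of a script (hypothetical):
#   add_project_root_to_path(__file__)
#   import models  # now resolvable if `models` sits in the project root
```

Setting the PYTHONPATH environment variable to the project root before launching the script would be an alternative that needs no code change.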

autonull · 2025-08-25T14:39:39Z

I've applied it here: deepstupid@45a3b2c

replaced adam-atan2 with adam-atan2-pytorch which works on older and newer versions of CUDA / pytorch sapientinc/HRM#45
Replace nn.Buffer with register_buffer sapientinc/HRM#30

Fix: Replace nn.Buffer with register_buffer

7cc81cf

dribnet mentioned this pull request Oct 10, 2025

pytorch/cuda compatibility updates SamsungSAILMontreal/TinyRecursiveModels#6

Open

Tar-ive wants to merge 1 commit into sapientinc:main from Tar-ive:fix-buffer-attribute-error