feat: v0.1.0 Refactor & Packaging#89

Open
zbloss wants to merge 5 commits into
sapientinc:main from
zbloss:main
zbloss commented Oct 1, 2025

This pull request introduces infrastructure and documentation improvements to streamline development, testing, and deployment for the HRM repository. The most significant changes are the addition of a multi-stage GPU-enabled Dockerfile, a GitHub Actions CI workflow for Python linting and tests, and expanded installation and usage instructions in the README.md. There are also updates to dataset and training script usage, and minor code cleanups in the evaluation notebook.

Infrastructure & Deployment

  • Added a multi-stage Dockerfile supporting both FlashAttention 2 and 3 for Ampere and Hopper GPUs, including CUDA 12.6, Python 3.12, and optimized dependency installation using uv. This enables reproducible GPU builds for both development and production.
  • Added .dockerignore to exclude unnecessary files and directories (such as caches, data, checkpoints, notebooks, and test artifacts) from Docker build context, reducing image size and improving build performance.

Continuous Integration

  • Introduced a GitHub Actions workflow (.github/workflows/ci.yml) to automatically lint and test the codebase on Python 3.11, 3.12, and 3.13, ensuring code quality and compatibility across multiple Python versions.
  • Specified the default Python version as 3.12 in .python-version for consistent local and CI environments.

Documentation & Usage

  • Expanded the README.md with a detailed package structure, installation options (uv, pip, and Docker), FlashAttention setup instructions, and a Python API usage example. Updated the commands for dataset preparation, training, and evaluation to use the new scripts/ directory and uv run.

Notebook Cleanup

  • Minor import reordering and formatting improvements in arc_eval.ipynb for clarity and consistency.

Issues

Closes #88

zbloss commented Oct 1, 2025 (Author)

I ran examples/02_train_sudoku_extreme.py and confirmed that the code still works as expected.

WandB Results

I was not able to replicate the results in the paper. I am running on a much smaller GPU, so I had to decrease the batch size and learning rate, which I believe is the main reason for the gap in my results.
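Editorial sketch: one common way to approximate a large-batch recipe on a smaller GPU is gradient accumulation, optionally combined with the linear learning-rate scaling heuristic. The snippet below only does the arithmetic. The function names and the example numbers (512, 1e-4, 64) are hypothetical and are not taken from this PR or from the paper, and nothing here guarantees that the paper's results would be reproduced.

```python
import math


def scale_for_smaller_gpu(ref_batch_size, ref_lr, device_batch_size):
    """Return (accumulation_steps, effective_batch_size, scaled_lr).

    Hypothetical helper: accumulates gradients over several small batches
    so that the effective batch size is at least the reference one, then
    rescales the learning rate linearly to the effective batch size.
    """
    if ref_batch_size <= 0 or device_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    accum_steps = math.ceil(ref_batch_size / device_batch_size)
    effective = accum_steps * device_batch_size
    scaled_lr = ref_lr * effective / ref_batch_size
    return accum_steps, effective, scaled_lr


def linear_scaled_lr(ref_lr, ref_batch_size, new_batch_size):
    """Linear scaling heuristic for when no accumulation is used."""
    if ref_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return ref_lr * new_batch_size / ref_batch_size


if __name__ == "__main__":
    # Illustrative numbers only (assumptions, not the repository's config).
    steps, effective, lr = scale_for_smaller_gpu(512, 1e-4, 64)
    print(f"accumulate {steps} steps, effective batch {effective}, lr {lr:g}")
    print(f"lr without accumulation: {linear_scaled_lr(1e-4, 512, 64):g}")
```

Accumulation keeps the optimizer's view of the batch close to the reference setup at the cost of more forward and backward passes per update; the linear rule is a heuristic and may still need tuning.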

alexander-rakhlin commented Oct 2, 2025

@zbloss were you able to run evaluation of their ARC-2 checkpoint? I'm getting errors regarding a size mismatch.

zbloss commented Oct 2, 2025 (Author)

> @zbloss were you able to run evaluation of their ARC-2 checkpoint? I'm getting errors regarding a size mismatch.

I did not try to load the existing checkpoints. I couldn't get them to load before these changes either, because of similar issues.

alexander-rakhlin commented

> I did not try to load the existing checkpoints, I couldn't get them to load before these changes with similar issues.

So this checkpoint works after your changes?

zbloss commented Oct 2, 2025 (Author)

No, it does not work either before or after the changes.
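Editorial sketch: when a checkpoint fails to load with size-mismatch errors, comparing parameter names and shapes on both sides usually narrows down which layers disagree. The helper below works on plain name-to-shape mappings so it runs without PyTorch. The function name, the parameter names, and the shapes in the example are hypothetical, and the sketch makes no claim about the actual cause of the ARC-2 failure discussed above.

```python
def diff_shapes(model_shapes, ckpt_shapes):
    """Compare two {parameter_name: shape} mappings.

    Returns (missing, unexpected, mismatched):
      missing    - names the model expects but the checkpoint lacks
      unexpected - names in the checkpoint that the model does not have
      mismatched - {name: (model_shape, ckpt_shape)} for differing shapes
    """
    missing = sorted(k for k in model_shapes if k not in ckpt_shapes)
    unexpected = sorted(k for k in ckpt_shapes if k not in model_shapes)
    mismatched = {
        k: (tuple(model_shapes[k]), tuple(ckpt_shapes[k]))
        for k in model_shapes
        if k in ckpt_shapes and tuple(model_shapes[k]) != tuple(ckpt_shapes[k])
    }
    return missing, unexpected, mismatched


# With PyTorch, the mappings could be built like this (assuming the
# checkpoint file holds a flat state dict, which may not be the case):
#   model_shapes = {n: tuple(t.shape) for n, t in model.state_dict().items()}
#   state = torch.load(path, map_location="cpu")
#   ckpt_shapes = {n: tuple(t.shape) for n, t in state.items()}

if __name__ == "__main__":
    # Hypothetical names and shapes, for illustration only.
    model = {"embed.weight": (12, 512), "head.weight": (12, 512)}
    ckpt = {"embed.weight": (10, 512), "extra.bias": (512,)}
    missing, unexpected, mismatched = diff_shapes(model, ckpt)
    print("missing:", missing)
    print("unexpected:", unexpected)
    print("mismatched:", mismatched)
```

A mismatch confined to embedding or output layers often points to a configuration difference (for example vocabulary or identifier counts) rather than to corrupted weights, though that has to be verified case by case.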

zbloss commented Oct 4, 2025 (Author)

@alexander-rakhlin I have opened a PR to add this model to Hugging Face's Transformers library, with working checkpoints in safetensors format.

I'm waiting on the 🤗 team to review and approve, so you'll have to pull my fork if you want to use it immediately.

huggingface/transformers#41272

Weights are here:

alexander-rakhlin commented

@zbloss thank you. I also trained Sudoku and it works just fine, except for invalid puzzles with multiple solutions. Currently, I am training ARC-1. I think I found the reason why their checkpoint fails and will let you know once I verify it.

alexander-rakhlin commented

@zbloss
#90 (comment)

zbloss commented Dec 8, 2025 (Author)

Feel free to merge, @kmk142789. I don't have access to merge PRs.


Development

Successfully merging this pull request may close these issues.

Contribution Guide, Interested in helping
