Update (13 Dec): 27.5% now. Looking at the output grids, I expect 50% with scaling. Also going to run this on ARC-2 today.
Update: 20% right now
(Run still in progress; I expect improvements up to 30%)
This already beats the accuracy/cost Pareto frontier, btw.
Every DL approach on ARC today trains a supervised model[1].
This is dumb.
A self-supervised compression step will obviously perform better:
- The input grids and the private puzzles contain new information that current approaches leave uncompressed
- Test grids are distribution-shifted; compression pulls them back into distribution (minimal sketch below)
For more on the reasoning behind the approach, read My Blog.
Performance - 10% on ARC-1 public eval
Total compute cost - $0.709
- 52m on A100 for training ($0.70)
- 40s on A100 for inference ($0.009)
This is early performance; I haven't run all the ablations yet.
I should be able to push to 30% on ARC-1 and 8% on ARC-2.
[1]: CompressARC is an exception, but it compresses each task individually; mine jointly compresses all tasks together (contrast sketched below).