Add SOTA training script with BigramHash(10240), SWA, and all proven …

e22bfc5
Open

Non-record: MLX-Optimized 12L 416d with SmearGate + BigramHash (val_bpb=1.9011, Mac) #342
