Pull requests: openai/parameter-golf
feat: Ultimate SOTA submission - 10L Model, Mixed Int6 QAT, and TTT/LoRA Evaluation
#361
opened Mar 21, 2026 by
adityagupta26
Non-record: QAT & EMA negative results on SOTA stack (val_bpb=1.1426)
#360
opened Mar 21, 2026 by
MultiFe22
4 tasks done
docs: add TIPS.md and resolve environment dependency issues (#280, #82, #43)
#357
opened Mar 21, 2026 by
adityagupta26
Non-record: PR315 repro on 1xH100 PCIe, int6+zstd (val_bpb=1.8338)
#356
opened Mar 21, 2026 by
sjp611
3 tasks done
Add non-record BigramHash4096 + MLP992 + LR0.08 + Slide64 submission
#355
opened Mar 21, 2026 by
josusanmartin
[Non-record] MLA + SmearGate + BigramHash + SWA — pre-quant 1.2838 bpb
#354
opened Mar 21, 2026 by
Skrisps26
Memory Tokens + Mixed Quantization (val_bpb: 1.1659)
#352
opened Mar 21, 2026 by
sp00mm
4 of 5 tasks
LongContext 4096 + Full SOTA Stack & QAT Int4 → 16 Layers
#347
opened Mar 21, 2026 by
FlashyFlash3011
Draft
2 of 4 tasks
Non-record: DART - Differential Attention Recurrent Transformer (Student submission, Kerala)
#345
opened Mar 21, 2026 by
anandks2006
Non-record: Autoresearch Heads4 + Step-based LR + Sliding Window (1xH100)
#344
opened Mar 21, 2026 by
aryanbhosale
Non-record: MLX-Optimized 12L 416d with SmearGate + BigramHash (val_bpb=1.9011, Mac)
#342
opened Mar 21, 2026 by
adhyaay-karnwal
Add Hybrid Depth-Recurrent Transformer submission
#341
opened Mar 21, 2026 by
tobiascanavesi
V2 Prototype: SwiGLU + Dropout + MuonWD + MidLayerLoop
#340
opened Mar 21, 2026 by
starfly-web
Record: 11L XSA+EMA+TTT, sliding val_bpb=1.1254 (3-seed mean 1.1256)
#338
opened Mar 21, 2026 by
alertcat
Non-record: 11L PartialRoPE + LNScale + EMA + SWA + TTT (1xH100 107min, val_bpb=1.2207, 15.4MB)
#334
opened Mar 21, 2026 by
nathon-lee
5 tasks done
11L XSA4 + SmearGate + BigramHash + SWA + RoPE50K (mean val_bpb=1.1565, 3 seeds)
#333
opened Mar 21, 2026 by
mahsumaktas
4 tasks done
Record: 12L Gradient-Guided Quant + Partial RoPE + LN Scale + EMA + XSA4 (val_bpb: 1.1320)
#332
opened Mar 21, 2026 by
saml212
10L MLP3x + BigramHash(2048) + SWA + Stride-32: 1.1487 BPB
#331
opened Mar 21, 2026 by
Rhodrium
3 tasks