Non-record: QAT & EMA negative results on SOTA stack (val_bpb=1.1426) #360

Open
MultiFe22 wants to merge 1 commit into openai:main from MultiFe22:first-try

@MultiFe22
Summary

Results (8xH100 SXM, 600s)

| Config | Steps | val_bpb | Artifact | Delta (val_bpb vs baseline) |
|---|---|---|---|---|
| Baseline (PR #180 repro) | 6,684 | 1.1426 | 15.99 MB | |
| + QAT (warmup=500) | 6,143 | 1.1473 | 15.69 MB | +0.005 (worse) |
| + QAT + EMA | 4,546 | 1.1606 | 16.89 MB | +0.018 (worse) |
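
As a quick sanity check on the table, the Delta column and the step losses quoted in the findings can be recomputed from the Steps and val_bpb columns. The sketch below only uses numbers copied from the table; the `runs` dictionary and the `compare` helper are illustrative names, not part of this PR.

```python
# Values copied from the results table: (steps, val_bpb) per config.
runs = {
    "baseline": (6684, 1.1426),
    "qat": (6143, 1.1473),
    "qat_ema": (4546, 1.1606),
}

base_steps, base_bpb = runs["baseline"]


def compare(name):
    """Return (val_bpb delta, percent fewer steps) relative to the baseline."""
    steps, bpb = runs[name]
    delta = round(bpb - base_bpb, 3)
    fewer_steps_pct = round(100 * (1 - steps / base_steps))
    return delta, fewer_steps_pct


for name in ("qat", "qat_ema"):
    delta, pct = compare(name)
    print(f"{name}: val_bpb {delta:+.3f}, {pct}% fewer steps")
```

Both step percentages are measured against the baseline run, so the 32% figure for QAT + EMA includes the step loss already incurred by QAT.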

Key findings

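  • QAT: Better compression (15.69 vs 15.99 MB) but 8% fewer steps (6,143 vs 6,684) within the 600 s budget; val_bpb worsens by 0.005, so the net effect is negative
  • EMA: Calling .cpu().clone() on the weights every step costs throughput; the QAT + EMA run completes 32% fewer steps than the baseline (4,546 vs 6,684), val_bpb worsens by 0.018, and the artifact grows to 16.89 MB - catastrophic
  • Implication: Techniques that trade training steps for inference quality are counterproductive under the 10-minute (600 s) budget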
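
The EMA code itself is not shown in this description, so the following is only a sketch of one possible alternative to a per-step `.cpu().clone()`: keeping the averaged weights on the same device as the parameters and updating them in place. It assumes PyTorch; the `DeviceEMA` class, its method names, and the `decay` value are hypothetical, and whether this recovers the lost steps was not measured here.

```python
import torch
from torch import nn


class DeviceEMA:
    """Hypothetical helper: EMA of parameters kept on the parameters' own device."""

    def __init__(self, model: nn.Module, decay: float = 0.999):
        self.decay = decay
        # One-time copy at construction; no per-step host transfer afterwards.
        self.shadow = {
            name: p.detach().clone() for name, p in model.named_parameters()
        }

    @torch.no_grad()
    def update(self, model: nn.Module) -> None:
        for name, p in model.named_parameters():
            # shadow = decay * shadow + (1 - decay) * p, computed in place.
            self.shadow[name].mul_(self.decay).add_(p, alpha=1.0 - self.decay)

    @torch.no_grad()
    def copy_to(self, model: nn.Module) -> None:
        # Load the averaged weights into the model, e.g. before evaluation.
        for name, p in model.named_parameters():
            p.copy_(self.shadow[name])


if __name__ == "__main__":
    model = nn.Linear(4, 2)
    ema = DeviceEMA(model, decay=0.9)
    with torch.no_grad():
        for p in model.parameters():
            p.add_(1.0)  # stand-in for an optimizer step
    ema.update(model)
    ema.copy_to(model)
```

The shadow copy doubles the parameter memory on the device, and the update loop still adds per-step work, so the trade-off against the 600 s budget would need to be checked with a real run.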

Test plan

  • Baseline reproduction matches published SOTA (1.1426 vs 1.1428)
  • QAT ablation on 8xH100
  • QAT+EMA ablation on 8xH100
  • All logs included
