JAXBench: Fix no-op exploit in 19 KernelBench baselines: zero-init weights -> random by charleshong3 · Pull Request #50 · AI-Hypercomputer/accelerator-agents

charleshong3 · 2026-06-10T07:14:39Z

KernelBench-derived baselines in the 18k-50k range initialized their weight/bias tensors to `jnp.zeros` in `create_inputs`. With zero weights, `x @ W (+ b)` is identically zero and independent of the input, so the reference output is a trivial constant (all-zero, or a fixed activation thereof). Any kernel returning that constant -- including a no-op that skips the matmul/conv entirely -- passes `np.allclose`, so these benchmarks could report large but meaningless speedups without ever computing the operator.
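The failure mode can be reproduced in a few lines. The sketch below is an illustration only, not code from the affected baselines: `create_inputs_zero`, `reference`, and `noop_kernel` are hypothetical names, the shapes are arbitrary, and a linear layer followed by ReLU stands in for the real operators.

```python
import jax
import jax.numpy as jnp
import numpy as np


def create_inputs_zero(key, batch=8, in_features=16, out_features=4):
    """Hypothetical zero-init create_inputs: random x, all-zero W and b."""
    x = jax.random.normal(key, (batch, in_features), dtype=jnp.float32)
    w = jnp.zeros((in_features, out_features), dtype=jnp.float32)
    b = jnp.zeros((out_features,), dtype=jnp.float32)
    return x, w, b


def reference(x, w, b):
    """Stand-in reference workload: linear layer followed by ReLU."""
    return jax.nn.relu(x @ w + b)


def noop_kernel(x, w, b):
    """A 'kernel' that skips the matmul and returns a constant zero tensor."""
    return jnp.zeros((x.shape[0], w.shape[1]), dtype=x.dtype)


x, w, b = create_inputs_zero(jax.random.PRNGKey(0))
expected = np.asarray(reference(x, w, b))
candidate = np.asarray(noop_kernel(x, w, b))

# With zero weights the reference is all zeros whatever x is,
# so the no-op is indistinguishable from a real implementation.
noop_passes = bool(np.allclose(candidate, expected))
print("no-op passes correctness check:", noop_passes)
```

Because the reference never depends on `x`, no choice of input seed would expose the no-op under this setup.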

This PR replaces the zero-initialized weights/biases with small normally distributed random values (scale ~0.02, which makes the output input-dependent, keeps values bf16-representable, and avoids overflow). Only `create_inputs` is changed; the workload/op is untouched. After the fix, a no-op (all-zero output) fails the correctness check on all 19.
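A minimal sketch of what such an initialization might look like, under stated assumptions: `create_inputs_random` and `reference` are hypothetical names, the shapes are arbitrary, and the actual `create_inputs` functions in the PR may differ in signature and dtype handling. Only the ~0.02 scale and the normal distribution come from the description above.

```python
import jax
import jax.numpy as jnp
import numpy as np


def create_inputs_random(key, batch=8, in_features=16, out_features=4, scale=0.02):
    """Hypothetical fixed create_inputs: W and b drawn from N(0, scale^2)."""
    kx, kw, kb = jax.random.split(key, 3)
    x = jax.random.normal(kx, (batch, in_features), dtype=jnp.float32)
    w = scale * jax.random.normal(kw, (in_features, out_features), dtype=jnp.float32)
    b = scale * jax.random.normal(kb, (out_features,), dtype=jnp.float32)
    return x, w, b


def reference(x, w, b):
    """Stand-in reference workload: a plain linear layer."""
    return x @ w + b


x, w, b = create_inputs_random(jax.random.PRNGKey(0))
expected = np.asarray(reference(x, w, b))
noop_output = np.zeros_like(expected)

# The reference now depends on x, so an all-zero output no longer matches.
noop_passes = bool(np.allclose(noop_output, expected))

# Values at this scale stay finite and non-zero after a cast to bfloat16.
w_bf16 = w.astype(jnp.bfloat16)
survives_bf16 = bool(jnp.all(jnp.isfinite(w_bf16))) and bool(jnp.any(w_bf16 != 0))

print("no-op passes correctness check:", noop_passes)
print("weights survive bf16 cast:", survives_bf16)
```

Splitting the key keeps `x`, `w`, and `b` statistically independent, which avoids accidental correlations between inputs and weights.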

Scope: 19 of the affected baselines are fully fixed by non-zero weights. Five others whose *output* is intrinsically small regardless of weights -- the softmax-terminated 38k/43k/50k (row outputs ~1/N) and the structurally degenerate 25k (GroupNorm->Mean) and 42k (Max-Subtract-GELU) -- are NOT addressed here; they need a tolerance or operator change and are left to a follow-up. Megablox (11p) has a distinct input-underflow variant that is fixed separately.
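To see why random weights alone may not rescue the softmax-terminated cases, consider the sketch below. It rests on an assumption the description does not confirm: that the correctness check uses an absolute tolerance well above 1/N (here a hypothetical `ASSUMED_ATOL = 1e-2`). The shapes and `n_classes` are likewise illustrative.

```python
import jax
import jax.numpy as jnp
import numpy as np

# Hypothetical tolerance; the PR description does not state the real value.
ASSUMED_ATOL = 1e-2

kx, kw = jax.random.split(jax.random.PRNGKey(0))
n_classes = 4096  # N: width of the softmax axis (illustrative)
x = jax.random.normal(kx, (4, 16), dtype=jnp.float32)
w = 0.02 * jax.random.normal(kw, (16, n_classes), dtype=jnp.float32)

# Each softmax row sums to 1 over N entries, so entries are on the order of 1/N.
expected = np.asarray(jax.nn.softmax(x @ w, axis=-1))
noop_output = np.zeros_like(expected)

max_prob = float(expected.max())
noop_passes = bool(np.allclose(noop_output, expected, atol=ASSUMED_ATOL))

print("largest softmax entry:", max_prob)
print("no-op passes under assumed atol:", noop_passes)
```

Under that assumed tolerance every entry of the reference sits below `atol`, so an all-zero output still passes even though the weights are random; a tighter tolerance or a different terminal operator would be needed to distinguish the two.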

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

google-cla · 2026-06-10T07:14:56Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

charleshong3 wants to merge 1 commit into AI-Hypercomputer:main from charleshong3:fix-kernelbench-noop-zeros-weights
