fix(#69): fix tensorInitWithDistribution buffer overflow and add rank check to doDimensionsMatch #73
Merged
HerrCooker approved these changes Apr 14, 2026
tensorInitWithDistribution computed the element count by summing dimensions instead of multiplying them (`+=` instead of `*=`, init 0 instead of 1). This caused two simultaneous failures:

1. Buffer overflow for bias tensors with shape [1, N]: sum = 1+N > product = N, so memset wrote one float past the end of the caller's data buffer.
2. Under-initialization for weight tensors with shape [M, N] where M, N > 1: sum = M+N << product = M*N, so only a fraction of values received their intended distribution (e.g. Xavier), with the rest staying at zero.

The overflow alone would not explain the crash, but this codebase uses zero-copy shape storage: initTensorWithQFloat calls setShape(shape, dims, ...), which stores shape->dimensions = dims, a raw pointer into the caller's stack frame, not a deep copy. This is a deliberate design choice to minimize heap allocations on MCU targets where memory is scarce.

The consequence is that the 4-byte memset overflow from bias0Data into the adjacent bias0Dims array on main()'s stack corrupted dims[0] from 1 to 0. Because the tensor's shape points directly at that array, the bias permanently read shape [0, N] instead of [1, N]. The forward pass then hit doDimensionsMatch comparing output [1, N] against bias [0, N], producing the "Dim 1: 0, Dims 2: 20" error from the issue.

Additionally, doDimensionsMatch did not verify that both tensors have the same rank (numberOfDimensions). When ranks differed, the loop iterated using the first tensor's rank and read out of bounds from the second tensor's VLA. This is now a hard error (exit) rather than silent undefined behavior.

Also:

- MnistExperiment: extracted model construction into buildModel(), added a CSV header row. Data arrays are declared static so they outlive the function, as required by the zero-copy shape/data pointer design.
Tests added:

- UnitTestTensorApi: 4 tests verifying tensorInitWithDistribution initializes the correct number of elements (product, not sum) for ZEROS, ONES, NORMAL
- UnitTestArithmetic: 2 tests for doDimensionsMatch (same shape, different dims)
- UnitTestMultiLayerTraining: 3 integration tests exercising a 4-layer model (Linear→ReLU→Linear→Softmax) with CrossEntropy loss, the exact architecture that crashed in MnistExperiment