
fix(#69): fix tensorInitWithDistribution buffer overflow and add rank… #73

Merged

LeoBuron merged 1 commit into main from 69-fix-tensor-init-overflow
Apr 14, 2026

Conversation

LeoBuron (Member) commented Apr 9, 2026

… check to doDimensionsMatch

tensorInitWithDistribution computed the element count by summing the dimensions instead of multiplying them (+= instead of *=, with the accumulator initialized to 0 instead of 1). This caused two simultaneous failures:

  1. Buffer overflow for bias tensors with shape [1, N]: sum = 1+N > product = N, so memset wrote one float past the end of the caller's data buffer.

  2. Under-initialization for weight tensors with shape [M, N] where M,N > 1: sum = M+N << product = M*N, so only a fraction of values received their intended distribution (e.g. Xavier), with the rest staying at zero.

The overflow alone would not explain the crash, but this codebase uses zero-copy shape storage: initTensorWithQFloat calls setShape(shape, dims, ...) which stores shape->dimensions = dims — a raw pointer into the caller's stack frame, not a deep copy. This is a deliberate design choice to minimize heap allocations on MCU targets where memory is scarce.

The consequence is that the 4-byte memset overflow from bias0Data into the adjacent bias0Dims array on main()'s stack corrupted dims[0] from 1 to 0. Because the tensor's shape points directly at that array, the bias permanently read shape [0, N] instead of [1, N]. The forward pass then hit doDimensionsMatch comparing output [1, N] against bias [0, N], producing the "Dim 1: 0, Dims 2: 20" error from the issue.

Additionally, doDimensionsMatch did not verify that both tensors have the same rank (numberOfDimensions). When ranks differed, the loop iterated using the first tensor's rank and read out of bounds from the second tensor's VLA. This is now a hard error (exit) rather than silent undefined behavior.

Also:

  • MnistExperiment: extracted model construction into buildModel(), added CSV header row. Data arrays are declared static so they outlive the function — required by the zero-copy shape/data pointer design.

Tests added:

  • UnitTestTensorApi: 4 tests verifying tensorInitWithDistribution initializes the correct number of elements (product, not sum) for ZEROS, ONES, NORMAL
  • UnitTestArithmetic: 2 tests for doDimensionsMatch (same shape, different dims)
  • UnitTestMultiLayerTraining: 3 integration tests exercising a 4-layer model (Linear→ReLU→Linear→Softmax) with CrossEntropy loss — the exact architecture that crashed in MnistExperiment

@LeoBuron LeoBuron force-pushed the 69-fix-tensor-init-overflow branch 2 times, most recently from e0070b0 to 73ff6a8 Compare April 9, 2026 14:10
Base automatically changed from 66-training-loop-module to main April 14, 2026 14:24
@LeoBuron LeoBuron force-pushed the 69-fix-tensor-init-overflow branch 2 times, most recently from e4f403a to 09f7ba9 Compare April 14, 2026 14:40
@LeoBuron LeoBuron force-pushed the 69-fix-tensor-init-overflow branch from 09f7ba9 to b507b44 Compare April 14, 2026 15:09
@LeoBuron LeoBuron merged commit b507b44 into main Apr 14, 2026
2 checks passed
@LeoBuron LeoBuron deleted the 69-fix-tensor-init-overflow branch April 14, 2026 15:10