Unbounded allocation (header_length) + shape-product integer overflow in .npy loader, CWE-789/-190 #44

@ValheruEldarr

Loading an attacker-crafted .npy file via npy::LoadArrayFromNumpy triggers two input-validation defects, both confirmed with AddressSanitizer/UBSan on HEAD (471fe48). Both fall in the denial-of-service / undersized-allocation class; no code execution is claimed.

1. Unbounded allocation from header_length (CWE-789)

In read_header (npy.hpp ~436-449), the version-2 header length is read as a 32-bit value and used directly to size a heap buffer with no upper bound:

header_length = (le[0]) | (le[1]<<8) | (le[2]<<16) | (le[3]<<24);   // u32, attacker-controlled
if ((magic_string_length + 2 + 4 + header_length) % 16 != 0) {
    // TODO(llohse): display warning      <-- no-op; not a bound
}
...
auto buf_v = std::vector<char>(header_length);                      // npy.hpp:448
istream.read(buf_v.data(), header_length);

A 12-byte file declaring header_length = 0xFFFFFFF0 forces a ~4.29 GB allocation before any data is read, a reliable DoS (std::bad_alloc / OOM-kill). This is the same class as the acknowledged ml-explore/mlx issue #3365.
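One way to close this gap, sketched under stated assumptions: decode the length field exactly as above, then reject it before any buffer is sized from it. The helper name decode_header_length_v2 and the constant kMaxHeaderLength are hypothetical and do not exist in npy.hpp; the 1 MiB cap is the figure proposed under "Suggested fixes", not a limit taken from the .npy format.

```cpp
#include <array>
#include <cstdint>
#include <stdexcept>

// Hypothetical cap (1 MiB), taken from the bound suggested in this report.
constexpr std::uint32_t kMaxHeaderLength = 1u << 20;

// Decode the little-endian u32 length field of a version-2 header and
// reject oversized values before anything is allocated from it.
inline std::uint32_t decode_header_length_v2(
    const std::array<std::uint8_t, 4>& le) {
  const std::uint32_t header_length =
      static_cast<std::uint32_t>(le[0]) |
      (static_cast<std::uint32_t>(le[1]) << 8) |
      (static_cast<std::uint32_t>(le[2]) << 16) |
      (static_cast<std::uint32_t>(le[3]) << 24);
  if (header_length > kMaxHeaderLength) {
    throw std::runtime_error("npy: header_length exceeds sanity bound");
  }
  return header_length;
}
```

With this check in place, the bytes F0 FF FF FF (0xFFFFFFF0) raise an exception instead of reaching the std::vector<char> constructor, while an ordinary header length such as 0x76 passes through unchanged.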

2. Integer overflow in shape product (CWE-190)

comp_size (npy.hpp ~454-458) multiplies the file-declared shape dimensions with no overflow check:

ndarray_len_t size = 1;
for (ndarray_len_t i : shape) size *= i;     // wraps silently
return size;

The result drives data.resize(size) and the subsequent read(sizeof(Scalar)*size):

  • shape (2^61, 8) wraps the product to 0 (2^61 × 8 = 2^64 with a 64-bit ndarray_len_t), producing a silently undersized buffer while shape still reports huge dimensions (a write-primitive setup for any consumer that trusts shape);
  • shape (2^32+1, 2^32+1) wraps the product to 2^33 + 1 ≈ 8.6e9 elements, which for an 8-byte Scalar is a ~68.7 GB (64 GiB) resize (ASan allocation-size-too-big).
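The two wrapped values can be reproduced in isolation. The sketch below mirrors the loop in comp_size, assuming ndarray_len_t is a 64-bit unsigned type (std::uint64_t stands in for it here); wrapping_product is a hypothetical name used only for illustration.

```cpp
#include <cstdint>
#include <vector>

// Stand-in for comp_size, assuming a 64-bit unsigned ndarray_len_t.
// Unsigned overflow is well defined in C++ and wraps modulo 2^64,
// which is why neither ASan nor UBSan flags the multiplication itself.
inline std::uint64_t wrapping_product(const std::vector<std::uint64_t>& shape) {
  std::uint64_t size = 1;
  for (std::uint64_t dim : shape) size *= dim;
  return size;
}
```

Under that assumption, wrapping_product({2^61, 8}) returns 0 and wrapping_product({2^32+1, 2^32+1}) returns 8589934593 (2^33 + 1), matching the two cases listed above.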

Suggested fixes

  • Replace the no-op % 16 block with a real bound, for example reject header_length > (1u << 20).
  • In comp_size, accumulate with __builtin_mul_overflow (a GCC/Clang builtin) and throw on overflow; also reject when product * sizeof(Scalar) overflows.
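A minimal sketch of the second suggestion, assuming a GCC or Clang toolchain (for __builtin_mul_overflow) and a 64-bit unsigned ndarray_len_t. The name checked_comp_size and its Scalar template parameter are hypothetical; the real comp_size in npy.hpp may have a different signature.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical checked replacement for comp_size (not part of npy.hpp).
// Throws instead of wrapping when either the element count or the byte
// count (elements * sizeof(Scalar)) does not fit in 64 bits.
template <typename Scalar>
std::uint64_t checked_comp_size(const std::vector<std::uint64_t>& shape) {
  std::uint64_t size = 1;
  for (std::uint64_t dim : shape) {
    // Returns true if the exact product does not fit in the result type.
    if (__builtin_mul_overflow(size, dim, &size)) {
      throw std::overflow_error("npy: shape product overflows");
    }
  }
  std::uint64_t bytes = 0;
  if (__builtin_mul_overflow(
          size, static_cast<std::uint64_t>(sizeof(Scalar)), &bytes)) {
    throw std::overflow_error("npy: byte count overflows");
  }
  return size;
}
```

For example, checked_comp_size<double> would accept shape (2, 3, 4) and return 24, but throw for (2^61, 8) and for (2^32+1, 2^32+1). A caller would likely still want a separate upper bound on the resulting size, since a product that fits in 64 bits can still request far more memory than is available.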

Minimal harness, the crafted .npy PoCs, and the traces are available on request.
