Skip to content

Implemented sparse builtin and threaded the new value type throughout#376

Open
gneeri wants to merge 10 commits into
devfrom
rm-37
Open

Implemented sparse builtin and threaded the new value type throughout#376
gneeri wants to merge 10 commits into
devfrom
rm-37

Conversation

@gneeri
Copy link
Copy Markdown
Contributor

@gneeri gneeri commented May 28, 2026

Note

Medium Risk
Broad match-arm updates across runtime/VM increase regression risk; densification in find/I/O can spike memory on large sparse matrices, though CSC validation and overflow checks mitigate some failure modes.

Overview
Introduces a first-class real double sparse matrix representation (Value::SparseTensor / CSC with validation, to_dense, display) and a new sparse() builtin covering sparse(A), sparse(m,n), and triplet forms with duplicate summation, zero dropping, and GPU input gather before host CSC construction.

Runtime integration extends shape/numel/class/whos/disp/WASM previews and memory estimates for sparse values; find and several string/JSON/write paths densify when needed; nnz counts stored nonzeros and can build logical masks for dimensional forms; transpose/ctranspose rebuild CSC for sparse inputs. Many other builtins now explicitly accept, pass through, or error on sparse (e.g. double preserves sparse, integer casts reject it, flip/circshift/min/max reject dense-only paths).

Planner/accelerator marks sparse as gather-immediate (not fused); GC treats sparse like dense numeric storage without nested refs.

Reviewed by Cursor Bugbot for commit ea9193d. Bugbot is set up for automated code reviews on this repo. Configure here.

Summary by CodeRabbit

  • New Features

    • Added a native sparse-tensor type and MATLAB-compatible sparse() constructor with CSC/triplet forms, densification, nnz, transpose, and display support.
    • Many builtins updated to recognize, accept, convert, or reject sparse tensors (IO, formatting, indexing, math, FFT, logical, containers, introspection, etc.).
  • Documentation

    • Builtin metadata and examples added for sparse().
  • Tests

    • Unit tests covering construction, transpose, previews, memory estimation, and error cases.

Review Change Stack

@gneeri gneeri requested a review from nallana May 28, 2026 20:43
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 28, 2026

📝 Walkthrough

Walkthrough

Adds a CSC-format SparseTensor value, implements the MATLAB-compatible sparse() builtin, provides densification/preview/size helpers, integrates sparse handling into metadata/GC/VM/residency, adds transpose/nnz/mask logic, and updates many builtins to accept, reject, or densify sparse inputs.

Changes

SparseTensor Type and Core APIs

Layer / File(s) Summary
SparseTensor struct definition and Value variant
crates/runmat-builtins/src/lib.rs
New SparseTensor struct holds CSC storage and Value::SparseTensor variant is added.
SparseTensor construction, validation, and accessors
crates/runmat-builtins/src/lib.rs
SparseTensor::new validates CSC invariants; zeros, nnz, shape, get, and to_dense are implemented; densify overflow tested.
Type inference and value formatting
crates/runmat-builtins/src/lib.rs
Type::from_value maps sparse to tensor shape; Display prints header and nonzeros in column order.

Sparse Builtin Implementation

Layer / File(s) Summary
Sparse builtin registration and metadata
crates/runmat-runtime/src/builtins/array/creation/mod.rs, sparse.rs, builtins-json/sparse.json
Module wiring, descriptor, GPU/fusion specs, and JSON metadata for MATLAB-style sparse forms.
Builtin entrypoint and argument dispatch
crates/runmat-runtime/src/builtins/array/creation/sparse.rs
Async entry gathers GPU args then dispatches to conversion, dimension-only, or triplet-form flow.
Single-value and array conversion to CSC
crates/runmat-runtime/src/builtins/array/creation/sparse.rs
Converts dense/logical/scalar inputs into CSC by column-major scanning or logical mapping.
Triplet-form construction with duplicate and zero handling
crates/runmat-runtime/src/builtins/array/creation/sparse.rs
Parses (i,j,v) vectors, validates one-based indices, sums duplicates, drops 0.0 (keeps NaN), builds CSC arrays.
Sparse builtin unit tests
crates/runmat-runtime/src/builtins/array/creation/sparse.rs
Tests dimension construction, duplicate summation, zero dropping, dense-to-sparse ordering, GPU gathering, and overflow bounds.

Value Metadata and Introspection

Layer / File(s) Summary
Core metadata helpers (shape, size, class)
crates/runmat-core/src/value_metadata.rs, crates/runmat-runtime/src/builtins/common/shape.rs, crates/runmat-runtime/src/builtins/introspection/class.rs
matlab_class_name returns "double", value_shape returns [rows,cols], value_numel uses saturating mul, and memory bytes estimated by sparse_tensor_memory_bytes.
Introspection and state reporting (class, whos)
crates/runmat-runtime/src/builtins/introspection/whos.rs
whos sets sparse flag and reports bytes from CSC storage arrays.
Garbage collection and GPU residency
crates/runmat-gc/src/collector.rs, crates/runmat-vm/src/accel/residency.rs
Treats SparseTensor as a leaf for GC marking and GPU residency clearing/collection.

Math Operations on Sparse Tensors

Layer / File(s) Summary
Transpose and conjugate transpose
crates/runmat-runtime/src/builtins/math/linalg/ops/mod.rs, transpose.rs, ctranspose.rs
transpose_real_sparse_tensor rebuilds CSC for the transpose; transpose/ctranspose delegate to it with error mapping and tests.
Nonzero count with mask conversion
crates/runmat-runtime/src/builtins/math/reduction/nnz.rs
nnz returns sparse.nnz(); mask_from_sparse builds dense Mask bitvector from CSC with overflow checks and tests.

I/O and Formatting

Layer / File(s) Summary
Display and cell rendering
crates/runmat-runtime/src/builtins/io/disp.rs
Full disp renders sparse Display output split into lines; cell summaries show "[MxN sparse double]".
JSON encoding and WASM wire format
crates/runmat-runtime/src/builtins/io/json/jsonencode.rs, crates/runmat-wasm/src/wire/value.rs
jsonencode densifies then serializes; WASM wire format provides sparse metadata, bounded previews for colPtrs/rowIndices/values, and entry preview with truncation flags.

Sparse Tensor Support Across Builtins

Layer / File(s) Summary
Value type labeling and diagnostic names
crates/runmat-accelerate/src/native_auto.rs, crates/runmat-runtime/src/builtins/acceleration/gpu/pagefun.rs, crates/runmat-vm/src/accel/fusion.rs
value_kind/type_name now report sparse-specific labels.
Unsupported operations rejection
shape/flip/rot/tril/triu/circshift/fft & others
Many builtins now explicitly reject sparse inputs with existing error descriptors instead of falling through.
Densification and type conversion
array/indexing/find.rs, strings/core/char.rs, string.rs, io/net/write.rs, logical/ops.rs
Where appropriate, code densifies via to_dense() and proceeds through existing dense paths; logical conversion densifies and maps errors.
Numeric type conversion (double, single, int, uint)
math/elementwise/*
double() passes sparse through; other numeric conversions reject sparse with conversion errors.
Container and cell operations
cells/core/cell.rs, cellstr.rs, containers/map/containers.map.rs
Sparse treated as numeric prototype for empty cell creation; rejected in cellstr and containers.Map normalization.
Runtime dispatch and spawn handling
crates/runmat-vm/src/interpreter/dispatch/mod.rs
SparseTensor treated as leaf in GPU-handle traversal and spawn-task-id collection/liveness.
Miscellaneous
timing/pause.rs, format helpers
Sparse classified as invalid for pause() and unsupported in some formatting contexts.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Suggested reviewers

  • nallana
  • finrunsfar

🐰 I scurried through CSC fields bright,

Col‑ptrs and rows keep nonzeros tight,
Triplets fold, zeros take flight,
Dense or sparse, the kernels hum,
A rabbit cheers: new math has come!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 46.09% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main changes: implementing a sparse builtin and integrating the new SparseTensor value type throughout the codebase.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch rm-37

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit c267aba. Configure here.

Comment thread crates/runmat-core/src/value_metadata.rs Outdated
Comment thread crates/runmat-runtime/src/builtins/math/linalg/ops/transpose.rs Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
crates/runmat-runtime/src/builtins/timing/pause.rs (1)

363-371: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

[low] Pause() rejects Value::SparseTensor while dense scalars are accepted; this aligns with MATLAB (sparse inputs not supported). In crates/runmat-runtime/src/builtins/timing/pause.rs (match arm around the new | Value::SparseTensor(_)), pause([0.5]) is accepted but pause(sparse(0.5)) is rejected. MATLAB documents pause(n) as a nonnegative real scalar and does not list sparse types for this argument, so this behavior likely isn’t an inconsistency. Consider documenting this constraint and/or adding a regression test for pause(sparse(0.5)) to ensure the error behavior stays stable.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/runmat-runtime/src/builtins/timing/pause.rs` around lines 363 - 371,
The pause implementation currently rejects Value::SparseTensor in the match arm
handling input parsing (see Value::SparseTensor(_) alongside Value::Complex,
Value::ComplexTensor) while accepting dense scalar tensor inputs via
parse_tensor; update the codebase documentation and add a regression test to
assert the current behavior for sparse inputs: document in
crates/runmat-runtime/src/builtins/timing/pause.rs (near the pause parsing/match
with Value::SparseTensor) that pause only accepts nonnegative real dense scalars
and that sparse inputs intentionally error, and add a unit/integration test that
calls the pause entry point with a sparse scalar (sparse(0.5)) and verifies it
produces the expected error to prevent regressions.
crates/runmat-runtime/src/builtins/math/fft/common.rs (2)

66-91: ⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

[critical] SparseTensor rejected but documentation claims it's accepted as numeric input.

The layer description states this change "Handles SparseTensor in FFT length/dimension parsing (accepting as numeric input)" and the AI summary claims "the accepted input types...were expanded to include Value::SparseTensor(_)", but line 73 explicitly places SparseTensor in the error arm that returns "length must be numeric".

This is a direct contradiction. If the intent is to accept sparse tensors as numeric length arguments:

  • SparseTensor should be handled in a success arm (like Tensor at lines 21-29)
  • It should densify the sparse tensor and extract a scalar value
  • Or it should delegate to the scalar extraction logic

If the intent is to reject sparse tensors, the documentation is misleading.

Proposed fix to accept SparseTensor as numeric input

Add a handler before the error arm:

         Value::Complex(re, im) => {
             if im.abs() > f64::EPSILON {
                 return Err(builtin_error(
                     builtin,
                     format!("{builtin}: length must be real-valued"),
                 ));
             }
             parse_length_scalar(*re, builtin).map(Some)
         }
+        Value::SparseTensor(st) => {
+            if st.values.len() != 1 {
+                return Err(builtin_error(
+                    builtin,
+                    format!("{builtin}: length must be a scalar"),
+                ));
+            }
+            parse_length_scalar(st.values[0], builtin).map(Some)
+        }
         Value::Bool(_) | Value::LogicalArray(_) => Err(builtin_error(
             builtin,
             format!("{builtin}: length must be numeric"),
         )),

Then remove SparseTensor from the error arm at line 73.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/runmat-runtime/src/builtins/math/fft/common.rs` around lines 66 - 91,
The match currently rejects Value::SparseTensor(_) but the change log says
SparseTensor should be accepted as numeric; update the match so
Value::SparseTensor(_) is handled in the same success branch as Value::Tensor(_)
(not in the error arm), convert/densify the sparse tensor to a dense scalar or
delegate to the existing scalar extraction helper used for tensors, then return
the parsed numeric length; finally remove Value::SparseTensor(_) from the error
arm and ensure any conversion errors are propagated via builtin_error (same
pattern as Tensor handling).

771-794: ⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

[critical] SparseTensor rejected as non-numeric dimension specification.

Same issue as parse_length: line 774 places SparseTensor in the error arm with non-numeric types, contradicting the layer's stated intent to accept it as numeric input.

If dimension specifications from sparse tensors should be supported (consistent with how Tensor is handled at lines 720-750), the sparse tensor must be densified or its values extracted.

Proposed fix to accept SparseTensor for dimension specs

Add a handler before the error arm:

         Value::LogicalArray(array) => {
             if !is_vector_shape(&array.shape) && !array.data.is_empty() {
                 return Err(builtin_error(
                     builtin,
                     format!("{builtin}: dimension masks must be row or column vectors"),
                 ));
             }
             let mut dims = Vec::new();
             for (idx, &flag) in array.data.iter().enumerate() {
                 if flag != 0 {
                     dims.push(idx + 1);
                 }
             }
             Ok(dims)
         }
+        Value::SparseTensor(st) => {
+            // Densify and process as tensor
+            let dense = st.to_dense()
+                .map_err(|e| builtin_error(builtin, format!("{builtin}: {e}")))?;
+            dims_from_value(&Value::Tensor(dense), builtin)
+        }
         Value::GpuTensor(_) => Err(builtin_error(

Then remove SparseTensor from the error arm at line 774.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/runmat-runtime/src/builtins/math/fft/common.rs` around lines 771 -
794, The match currently treats Value::SparseTensor as non-numeric and returns
an error in the dimension-index parsing branch; add a new arm handling
Value::SparseTensor (e.g., in the same match that handles Tensor/Array numeric
inputs in functions like parse_length/parse_dim_indices) that converts/densifies
the sparse tensor or extracts its numeric indices/values into a numeric vector
(or scalar) usable as dimension specs, then proceed with the existing
numeric-path logic; finally remove Value::SparseTensor from the error group so
sparse tensors are accepted as numeric dimension specifications (reference the
match handling around parse_length/parse_dim_indices and the error arm
containing Value::String/Value::SparseTensor).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/runmat-builtins/src/lib.rs`:
- Around line 582-591: The to_dense method currently uses
saturating_mul(self.rows, self.cols) and may attempt to allocate an enormous
vector when rows * cols overflows; update to_dense to explicitly check for
overflow by computing self.rows.checked_mul(self.cols) and return a
Result<Tensor, Error> (or a suitable error type) instead of panicking — validate
the size before creating the vec and return an Err when checked_mul returns None
or when allocation would be impossible; adjust callers (or provide a new
to_dense_result) to handle the Result and continue to build the dense Tensor via
Tensor::new only after the size check succeeds.

In `@crates/runmat-core/src/value_metadata.rs`:
- Around line 78-82: The current Value::SparseTensor branch computes byte size
with plain usize arithmetic that can overflow before casting; replace the
arithmetic with saturating operations (e.g., use .saturating_mul() and
.saturating_add() or convert each length to u64 before multiplying) when
computing (s.values.len() * size_of::<f64>()) + (s.row_indices.len() *
size_of::<usize>()) + (s.col_ptrs.len() * size_of::<usize>()) to ensure no
intermediate overflow; also factor this logic into a shared helper like
sparse_tensor_memory_bytes(s: &SparseTensor) and call that from the
Value::SparseTensor arm to avoid duplication.

In `@crates/runmat-runtime/src/builtins/array/creation/sparse.rs`:
- Around line 449-466: numeric_vector currently accepts any Tensor or
SparseTensor and simply flattens data, which lets matrices pass for triplet-mode
sparse(i,j,v,...); change numeric_vector to enforce vector-only inputs by
checking that Value::Tensor tensors have rank/shape length == 1 (or equivalent
1-D check) and that Value::SparseTensor represents a 1-D sparse vector (e.g.,
its indices are 1-D or its shape.len()==1); if the input is not 1-D, return
Err(sparse_error(&SPARSE_ERROR_INVALID_INPUT, format!("sparse: {name} must be a
real numeric vector, got {other:?}"))). Keep existing handling for LogicalArray,
Num, Int, Bool but apply the same 1-D check for tensor-like variants; also apply
the same fix for the other similar helper (the one referenced around lines
372-381).
- Around line 486-493: The parse_size_arg and parse_subscript helpers currently
only check finiteness, non-negativity and integerness before doing Ok(raw as
usize); add an explicit upper-bound check that raw does not exceed usize::MAX
(e.g. compare raw <= (usize::MAX as f64) or another safe max you choose) and
return the same sparse_error (SPARSE_ERROR_INVALID_INPUT) with a clear message
if it does; place this check just before the Ok(...) cast to prevent saturating
large f64 values into usize::MAX and ensure indices/dimensions cannot be created
from excessively large floats.

In `@crates/runmat-runtime/src/builtins/common/shape.rs`:
- Line 43: The SparseTensor branch returns vec![t.rows, t.cols] directly which
is inconsistent with the Tensor branch that calls normalize_shape(&t.shape);
update the Value::SparseTensor arm to call normalize_shape(&[t.rows, t.cols])
(or otherwise pass the same normalized slice) so both Tensor and SparseTensor
use normalize_shape consistently; reference the Value::SparseTensor arm, the
normalize_shape function, and the Tensor branch to make the change.

In `@crates/runmat-runtime/src/builtins/io/disp.rs`:
- Around line 193-197: The Value::SparseTensor arm currently expands
sparse.to_string() into multiple lines regardless of context; modify that arm to
check the current RenderMode and, when RenderMode::Nested, return a single
compact summary string (e.g., a one-line brief representation such as shape/nnz
or a short label) instead of splitting sparse.to_string() into lines.
Concretely, in the Value::SparseTensor match in disp.rs replace the
unconditional sparse.to_string().lines().map(...).collect() with a branch that
returns vec![compact_summary] for RenderMode::Nested and preserves the
multi-line to_string() behavior for non-nested modes. Ensure you reference the
existing Value::SparseTensor match and use the RenderMode::Nested enum value to
drive the conditional.

In `@crates/runmat-runtime/src/builtins/io/json/jsonencode.rs`:
- Line 418: The current JSON encoder unconditionally calls sparse.to_dense() in
the Value::SparseTensor arm (invoking tensor_to_json) which can OOM for large
sparse tensors; change this by either implementing a sparse-aware encoding path
in the jsonencode logic (serialize shape + non-zero indices/values instead of
densifying) or, if dense output is required, add a deterministic size guard
before calling sparse.to_dense() that checks number of elements (shape product
or nnz vs threshold) and returns a clear runtime error instead of attempting to
allocate; update the match arm for Value::SparseTensor and related
tensor_to_json usage accordingly so the code either serializes sparsely or fails
fast with a clear message.

In `@crates/runmat-runtime/src/builtins/logical/ops.rs`:
- Line 163: The match arm rejecting sparse inputs (Value::SparseTensor(_)) in
crates/runmat-runtime/src/builtins/logical/ops.rs breaks MATLAB compatibility;
replace that rejection with a pathway that densifies the sparse tensor and then
reuses the existing logical conversion logic (e.g., call the SparseTensor->dense
conversion method or helper, then feed the resulting DenseTensor into the same
logical conversion used for Value::Tensor), or if you cannot convert now, update
the error path to return a clear documented limitation mentioning
logical(sparse(...)) is not supported; specifically modify the
Value::SparseTensor(_) branch (and avoid using conversion_error("sparse") there)
to either convert to dense and proceed or emit a documented limitation error.

In `@crates/runmat-runtime/src/builtins/math/linalg/ops/ctranspose.rs`:
- Around line 208-236: In ctranspose_sparse_tensor the triplets are created as
(col, row, value) but the sort closure destructures them with misleading names
|&(row, col, _)| causing confusion; fix by renaming the destructured variables
to reflect their meaning (e.g., |&(orig_col, orig_row, _)|) and then compute the
sort key as (orig_row, orig_col) so intent is clear, or add a short comment
above the sort_by_key explaining the intentional order swap; update references
to triplets, sort_by_key, and the loop that builds col_ptrs if you change
naming.

In `@crates/runmat-runtime/src/builtins/math/linalg/ops/transpose.rs`:
- Around line 205-233: The closure in transpose_sparse_tensor that sorts
triplets is destructuring tuple elements with confusing names; change the
closure in triplets.sort_by_key(|&(row, col, _)| (col, row)) to use accurate
names that reflect how the triplet was constructed (e.g., triplets.push((col,
row, value)) then sort_by_key(|&(orig_col, orig_row, _)| (orig_row, orig_col)))
or add a clarifying comment above the sort explaining the intentional reversal,
so the tuple element meanings in triplets and the sort key are clear and
maintainable.

In `@crates/runmat-runtime/src/builtins/math/reduction/nnz.rs`:
- Around line 494-508: The mask_from_sparse function currently uses
saturating_mul and unchecked index arithmetic which can overflow; replace the
size and index math with explicit checked arithmetic: compute total =
sparse.rows.checked_mul(sparse.cols) and return an Err via BuiltinResult if
None, allocate bits with total; for each column compute col_off = (col as
usize).checked_mul(sparse.rows) and then idx = col_off.checked_add(row) using
checked_add, returning Err on overflow; ensure you reference mask_from_sparse,
SparseTensor, bits, sparse.rows, sparse.cols, and the index expression row + col
* sparse.rows when implementing these checks.

In `@crates/runmat-wasm/src/wire/value.rs`:
- Around line 119-133: The Value::SparseTensor branch is emitting unbounded
arrays (col_ptrs, row_indices, values) into the JSON causing oversized WASM
payloads; modify this branch to replace those full vectors with bounded previews
(slice or use the same MAX_DATA_PREVIEW cap) or omit them and only include
lengths/nnz plus the existing preview from sparse_entry_preview; update the
block that constructs the JSON (inside Value::SparseTensor) to export either
truncated previews (e.g., col_ptrs_preview, row_indices_preview, values_preview
computed with MAX_DATA_PREVIEW) and include flags/length fields, or remove the
raw arrays entirely and keep only shape, nnz, lengths and the
entry_preview/truncated fields so the payload is size-bounded.

---

Outside diff comments:
In `@crates/runmat-runtime/src/builtins/math/fft/common.rs`:
- Around line 66-91: The match currently rejects Value::SparseTensor(_) but the
change log says SparseTensor should be accepted as numeric; update the match so
Value::SparseTensor(_) is handled in the same success branch as Value::Tensor(_)
(not in the error arm), convert/densify the sparse tensor to a dense scalar or
delegate to the existing scalar extraction helper used for tensors, then return
the parsed numeric length; finally remove Value::SparseTensor(_) from the error
arm and ensure any conversion errors are propagated via builtin_error (same
pattern as Tensor handling).
- Around line 771-794: The match currently treats Value::SparseTensor as
non-numeric and returns an error in the dimension-index parsing branch; add a
new arm handling Value::SparseTensor (e.g., in the same match that handles
Tensor/Array numeric inputs in functions like parse_length/parse_dim_indices)
that converts/densifies the sparse tensor or extracts its numeric indices/values
into a numeric vector (or scalar) usable as dimension specs, then proceed with
the existing numeric-path logic; finally remove Value::SparseTensor from the
error group so sparse tensors are accepted as numeric dimension specifications
(reference the match handling around parse_length/parse_dim_indices and the
error arm containing Value::String/Value::SparseTensor).

In `@crates/runmat-runtime/src/builtins/timing/pause.rs`:
- Around line 363-371: The pause implementation currently rejects
Value::SparseTensor in the match arm handling input parsing (see
Value::SparseTensor(_) alongside Value::Complex, Value::ComplexTensor) while
accepting dense scalar tensor inputs via parse_tensor; update the codebase
documentation and add a regression test to assert the current behavior for
sparse inputs: document in crates/runmat-runtime/src/builtins/timing/pause.rs
(near the pause parsing/match with Value::SparseTensor) that pause only accepts
nonnegative real dense scalars and that sparse inputs intentionally error, and
add a unit/integration test that calls the pause entry point with a sparse
scalar (sparse(0.5)) and verifies it produces the expected error to prevent
regressions.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 736ef82c-2dc2-49dd-b91e-126322669cee

📥 Commits

Reviewing files that changed from the base of the PR and between 8268533 and c267aba.

📒 Files selected for processing (58)
  • crates/runmat-accelerate/src/native_auto.rs
  • crates/runmat-builtins/src/lib.rs
  • crates/runmat-core/src/value_metadata.rs
  • crates/runmat-gc/src/collector.rs
  • crates/runmat-runtime/src/builtins/acceleration/gpu/pagefun.rs
  • crates/runmat-runtime/src/builtins/array/creation/meshgrid.rs
  • crates/runmat-runtime/src/builtins/array/creation/mod.rs
  • crates/runmat-runtime/src/builtins/array/creation/sparse.rs
  • crates/runmat-runtime/src/builtins/array/creation/zeros.rs
  • crates/runmat-runtime/src/builtins/array/indexing/find.rs
  • crates/runmat-runtime/src/builtins/array/shape/circshift.rs
  • crates/runmat-runtime/src/builtins/array/shape/flip.rs
  • crates/runmat-runtime/src/builtins/array/shape/fliplr.rs
  • crates/runmat-runtime/src/builtins/array/shape/flipud.rs
  • crates/runmat-runtime/src/builtins/array/shape/rot90.rs
  • crates/runmat-runtime/src/builtins/array/shape/squeeze.rs
  • crates/runmat-runtime/src/builtins/array/shape/tril.rs
  • crates/runmat-runtime/src/builtins/array/shape/triu.rs
  • crates/runmat-runtime/src/builtins/builtins-json/sparse.json
  • crates/runmat-runtime/src/builtins/cells/core/cell.rs
  • crates/runmat-runtime/src/builtins/cells/core/cellstr.rs
  • crates/runmat-runtime/src/builtins/common/format.rs
  • crates/runmat-runtime/src/builtins/common/shape.rs
  • crates/runmat-runtime/src/builtins/containers/map/containers.map.rs
  • crates/runmat-runtime/src/builtins/introspection/class.rs
  • crates/runmat-runtime/src/builtins/introspection/whos.rs
  • crates/runmat-runtime/src/builtins/io/disp.rs
  • crates/runmat-runtime/src/builtins/io/json/jsonencode.rs
  • crates/runmat-runtime/src/builtins/io/net/write.rs
  • crates/runmat-runtime/src/builtins/logical/ops.rs
  • crates/runmat-runtime/src/builtins/logical/tests/isreal.rs
  • crates/runmat-runtime/src/builtins/math/elementwise/double.rs
  • crates/runmat-runtime/src/builtins/math/elementwise/int32.rs
  • crates/runmat-runtime/src/builtins/math/elementwise/ldivide.rs
  • crates/runmat-runtime/src/builtins/math/elementwise/minus.rs
  • crates/runmat-runtime/src/builtins/math/elementwise/plus.rs
  • crates/runmat-runtime/src/builtins/math/elementwise/rdivide.rs
  • crates/runmat-runtime/src/builtins/math/elementwise/single.rs
  • crates/runmat-runtime/src/builtins/math/elementwise/times.rs
  • crates/runmat-runtime/src/builtins/math/elementwise/uint16.rs
  • crates/runmat-runtime/src/builtins/math/elementwise/uint8.rs
  • crates/runmat-runtime/src/builtins/math/fft/common.rs
  • crates/runmat-runtime/src/builtins/math/fft/fft2.rs
  • crates/runmat-runtime/src/builtins/math/fft/fftshift.rs
  • crates/runmat-runtime/src/builtins/math/fft/ifft2.rs
  • crates/runmat-runtime/src/builtins/math/fft/ifftshift.rs
  • crates/runmat-runtime/src/builtins/math/linalg/ops/ctranspose.rs
  • crates/runmat-runtime/src/builtins/math/linalg/ops/transpose.rs
  • crates/runmat-runtime/src/builtins/math/reduction/max.rs
  • crates/runmat-runtime/src/builtins/math/reduction/min.rs
  • crates/runmat-runtime/src/builtins/math/reduction/nnz.rs
  • crates/runmat-runtime/src/builtins/strings/core/char.rs
  • crates/runmat-runtime/src/builtins/strings/core/string.rs
  • crates/runmat-runtime/src/builtins/timing/pause.rs
  • crates/runmat-vm/src/accel/fusion.rs
  • crates/runmat-vm/src/accel/residency.rs
  • crates/runmat-vm/src/interpreter/dispatch/mod.rs
  • crates/runmat-wasm/src/wire/value.rs

Comment thread crates/runmat-builtins/src/lib.rs Outdated
Comment thread crates/runmat-core/src/value_metadata.rs Outdated
Comment thread crates/runmat-runtime/src/builtins/array/creation/sparse.rs
Comment thread crates/runmat-runtime/src/builtins/array/creation/sparse.rs
pub async fn value_dimensions(value: &Value) -> Result<Vec<usize>, RuntimeError> {
let dims = match value {
Value::Tensor(t) => normalize_shape(&t.shape),
Value::SparseTensor(t) => vec![t.rows, t.cols],
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | 💤 Low value

[low] Shape normalization inconsistency with dense tensors.

SparseTensor returns vec![t.rows, t.cols] directly, while Tensor (line 42) passes through normalize_shape(&t.shape). Although sparse tensors are always 2D—so the shape.len() == 1 normalization never applies—this inconsistency could cause subtle edge-case divergence (e.g., if rows or cols is zero) and makes the code harder to reason about.

Consider calling normalize_shape(&[t.rows, t.cols]) for uniformity, or document why SparseTensor intentionally skips normalization.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/runmat-runtime/src/builtins/common/shape.rs` at line 43, The
SparseTensor branch returns vec![t.rows, t.cols] directly which is inconsistent
with the Tensor branch that calls normalize_shape(&t.shape); update the
Value::SparseTensor arm to call normalize_shape(&[t.rows, t.cols]) (or otherwise
pass the same normalized slice) so both Tensor and SparseTensor use
normalize_shape consistently; reference the Value::SparseTensor arm, the
normalize_shape function, and the Tensor branch to make the change.

Comment thread crates/runmat-runtime/src/builtins/logical/ops.rs Outdated
Comment thread crates/runmat-runtime/src/builtins/math/linalg/ops/ctranspose.rs Outdated
Comment thread crates/runmat-runtime/src/builtins/math/linalg/ops/transpose.rs Outdated
Comment thread crates/runmat-runtime/src/builtins/math/reduction/nnz.rs
Comment thread crates/runmat-wasm/src/wire/value.rs
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
crates/runmat-runtime/src/builtins/strings/core/string.rs (1)

547-553: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

[high] string format arguments densify sparse tensors, risking OOM

Line 548 unconditionally densifies a SparseTensor passed as a format argument, allocating rows × cols × 8 bytes. Large sparse inputs will exhaust memory before the error is returned.

Impact: string("%d", largeSparseMatrix) crashes instead of formatting the sparse entries or rejecting the input with a clear error.

Recommendation: Add a size guard before to_dense() and return string_flow("sparse tensors exceeding N elements not supported as format arguments") if rows * cols > threshold.

As per coding guidelines, prioritize catching resource exhaustion in user-facing builtins.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/runmat-runtime/src/builtins/strings/core/string.rs` around lines 547 -
553, The code unconditionally calls s.to_dense() inside the Value::SparseTensor
arm, risking OOM; before densifying use the tensor's shape (from s.shape or
s.dimensions) to compute element_count (product of dimensions) and if
element_count > threshold (define a constant like MAX_FORMAT_TENSOR_ELEMENTS)
return Err(string_flow("sparse tensors exceeding N elements not supported as
format arguments")); only call s.to_dense() after that guard and proceed to
build ArgumentData as before (keep references to Value::SparseTensor,
s.to_dense(), ArgumentData, and string_flow so reviewers can find the changed
logic).
crates/runmat-runtime/src/builtins/math/reduction/nnz.rs (1)

494-524: ⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

[critical] mask_from_sparse allocates full dense mask, causing OOM for large sparse matrices

Lines 495-505 allocate a dense bit vector of size rows × cols to build the mask for dimension-wise nnz reduction. A 1M×1M sparse matrix with only 100 non-zero entries will attempt a 1 TB allocation and fail, even though the checked arithmetic correctly detects overflow.

Impact: nnz(largeSparseMatrix, dim) crashes with allocation failure instead of computing the result efficiently from the CSC structure.

Root cause: The mask-based reduction design assumes dense representation. Sparse matrices should use a sparse-aware algorithm that iterates the stored entries column-by-column and accumulates counts per dimension without materializing the full mask.

Recommendation: Implement a sparse-specific dimension reduction that:

  1. Allocates output counts array of size output_shape (e.g., [1, cols] for dim=1)
  2. Iterates col_ptrs and row_indices to increment the appropriate output bin
  3. Never allocates the full rows × cols mask

As per coding guidelines, prioritize memory inefficiencies and allocation hot paths.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/runmat-runtime/src/builtins/math/reduction/nnz.rs` around lines 494 -
524, mask_from_sparse currently materializes a full dense bits Vec of size
rows*cols causing OOM; instead implement a sparse-aware path that never
allocates the full mask: allocate an output counts Vec with length equal to the
reduction output (e.g., sparse.cols for column-wise nnz), iterate
sparse.col_ptrs and sparse.row_indices and increment the appropriate output bin
for each stored entry, and return that compact result (or adapt the caller to
accept counts) rather than creating bits.resize(rows*cols). Update logic in
mask_from_sparse (and related nnz reduction callsite) to use
SparseTensor.col_ptrs and SparseTensor.row_indices to accumulate counts
directly. Ensure all checked arithmetic and error construction
(nnz_descriptor_error_with_detail / NNZ_ERROR_INTERNAL) remain in place for
index overflows.
crates/runmat-runtime/src/builtins/logical/ops.rs (1)

163-198: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

[high] Put a densification guard on logical(sparse) instead of always materializing full storage.

logical_from_sparse_tensor allocates a dense Tensor and then a dense LogicalArray, so memory jumps from O(nnz) to O(rows*cols). On large sparse inputs this turns a sparse-friendly path into RM.LOGICAL.INTERNAL or an allocation failure. If sparse-logical output is not supported yet, fail fast with a targeted error or gate this fallback behind a size budget.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/runmat-runtime/src/builtins/logical/ops.rs` around lines 163 - 198,
logical_from_sparse_tensor currently always densifies the sparse input
(sparse.to_dense()) which can blow memory; instead add a densification guard in
logical_from_sparse_tensor that checks the sparse size (e.g., nnz and rows*cols
or estimated bytes) and if the dense footprint exceeds a safe budget either
return a targeted logical error (using logical_error_with_message and
LOGICAL_ERROR_INTERNAL) or short-circuit to a sparse-aware path; if you choose a
budget gate, expose a constant or config and only call sparse.to_dense() when
tensor.num_elements() * element_size is below the threshold before calling
logical_from_tensor/logical_buffer_to_host.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/runmat-runtime/src/builtins/array/creation/sparse.rs`:
- Around line 490-495: The single-argument dense path must enforce matrix-only
input to avoid silently dropping pages; add a rank check using the existing
predicate (or a simple shape.len() == 2 check) before calling
sparse_from_dense_tensor. Specifically, in the match arm Value::Tensor(tensor)
=> sparse_from_dense_tensor(&tensor), validate tensor.shape() (use
is_vector_shape or require shape.len() == 2) and return an error or panic if the
rank != 2 so 3-D+ tensors are rejected instead of truncated.
- Around line 449-495: numeric_vector currently rejects 2-D matrix triplet
inputs and prevents scalar expansion; update numeric_vector to accept
tensor-like inputs of rank <=2 by flattening them (row-major) instead of
enforcing vector-only shapes (use tensor.data.clone() or dense.data after
to_dense(), and logical.data mapped to f64) and keep numeric_vector_error for
invalid types, and update the triplet-handling code (the triplet path referenced
in the review around the other function handling i, j, v) to perform scalar
expansion: when combining i, j, v compute the target length = max(len(i),
len(j), len(v)) and if any operand has length 1 broadcastingly repeat its single
value to match target_length before the length check/assembly; leave scalar
Value::Num/Int/Bool handling as producing length-1 vectors so the broadcast
logic can handle them. Ensure to use is_vector_shape only to allow rank <=2
inputs and remove the strict “single non-1 dim” rejection that blocks matrices.

In `@crates/runmat-runtime/src/builtins/array/indexing/find.rs`:
- Around line 542-547: The current implementation unconditionally calls
sparse.to_dense() inside the Value::SparseTensor match arm (invoked in the find
path), which can OOM for large sparse matrices; replace this by a sparse-aware
path that directly iterates the sparse CSC representation (use the
sparse.col_ptrs and sparse.row_indices/values or equivalent fields on the sparse
object) to collect non-zero linear indices into the output without materializing
a dense buffer, and preserve the existing error handling via
find_error_with_message for any sparse iteration failures while returning
DataStorage::Real (or the appropriate sparse-result wrapper) and the same
boolean flag logic instead of calling sparse.to_dense().

In `@crates/runmat-runtime/src/builtins/io/json/jsonencode.rs`:
- Around line 418-423: The code unconditionally calls sparse.to_dense() inside
the Value::SparseTensor arm which can OOM for large sparse shapes; before
calling sparse.to_dense() check sparse.rows.checked_mul(sparse.cols) (or
equivalent) against a safe threshold (e.g., 10_000_000 elements) and if exceeded
return an error via jsonencode_error_with(&JSONENCODE_ERROR_INTERNAL,
"<descriptive message>") instead of densifying, or implement a sparse-aware path
that serializes only the stored entries (using the sparse structure) and calls
tensor_to_json or a new sparse_to_json helper; update the Value::SparseTensor
branch to perform the size guard and choose the sparse-encoding fallback to
prevent unbounded memory allocation.

In `@crates/runmat-runtime/src/builtins/io/net/write.rs`:
- Around line 497-499: The code path in Value::SparseTensor currently calls
s.to_dense() unconditionally which can allocate huge memory and cause OOM during
TCP write; modify the handling in the SparseTensor branch (the code around
Value::SparseTensor -> s.to_dense()) to perform a size guard before
densification by checking s.rows(), s.cols() (or whatever APIs expose shape) and
returning write_error_with_message(..., &WRITE_ERROR_INTERNAL) when rows × cols
exceeds a safe threshold, or alternatively implement serialization of only the
sparse entries (nonzero coordinates + values) instead of calling s.to_dense();
ensure the changed branch still returns the same error semantics and uses
WRITE_ERROR_INTERNAL and write_error_with_message for the guarded failure path.

In `@crates/runmat-runtime/src/builtins/strings/core/string.rs`:
- Line 685: The Value::SparseTensor arm currently calls sparse.to_dense() before
tensor_to_string_array (seen in the match branch using Value::SparseTensor and
tensor_to_string_array), which can OOM for large sparse tensors; change this by
adding a size guard that checks sparse.rows * sparse.cols against a threshold
and returns a descriptive error (e.g., "cannot convert sparse tensor XxY with Z
stored entries to dense string array") or, better, implement a sparse-aware
conversion path that iterates stored entries (using the sparse object's
stored-entry API / nnz) and emits either a sparse string summary ("sparse
{rows}x{cols} ({stored} stored)") or a sparse string array without calling
to_dense; ensure the new path preserves existing error mapping (string_flow)
semantics when failing.

---

Outside diff comments:
In `@crates/runmat-runtime/src/builtins/logical/ops.rs`:
- Around line 163-198: logical_from_sparse_tensor currently always densifies the
sparse input (sparse.to_dense()) which can blow memory; instead add a
densification guard in logical_from_sparse_tensor that checks the sparse size
(e.g., nnz and rows*cols or estimated bytes) and if the dense footprint exceeds
a safe budget either return a targeted logical error (using
logical_error_with_message and LOGICAL_ERROR_INTERNAL) or short-circuit to a
sparse-aware path; if you choose a budget gate, expose a constant or config and
only call sparse.to_dense() when tensor.num_elements() * element_size is below
the threshold before calling logical_from_tensor/logical_buffer_to_host.

In `@crates/runmat-runtime/src/builtins/math/reduction/nnz.rs`:
- Around line 494-524: mask_from_sparse currently materializes a full dense bits
Vec of size rows*cols causing OOM; instead implement a sparse-aware path that
never allocates the full mask: allocate an output counts Vec with length equal
to the reduction output (e.g., sparse.cols for column-wise nnz), iterate
sparse.col_ptrs and sparse.row_indices and increment the appropriate output bin
for each stored entry, and return that compact result (or adapt the caller to
accept counts) rather than creating bits.resize(rows*cols). Update logic in
mask_from_sparse (and related nnz reduction callsite) to use
SparseTensor.col_ptrs and SparseTensor.row_indices to accumulate counts
directly. Ensure all checked arithmetic and error construction
(nnz_descriptor_error_with_detail / NNZ_ERROR_INTERNAL) remain in place for
index overflows.

In `@crates/runmat-runtime/src/builtins/strings/core/string.rs`:
- Around line 547-553: The code unconditionally calls s.to_dense() inside the
Value::SparseTensor arm, risking OOM; before densifying use the tensor's shape
(from s.shape or s.dimensions) to compute element_count (product of dimensions)
and if element_count > threshold (define a constant like
MAX_FORMAT_TENSOR_ELEMENTS) return Err(string_flow("sparse tensors exceeding N
elements not supported as format arguments")); only call s.to_dense() after that
guard and proceed to build ArgumentData as before (keep references to
Value::SparseTensor, s.to_dense(), ArgumentData, and string_flow so reviewers
can find the changed logic).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5210ef21-cda1-43bb-a759-8b881fb7dbc8

📥 Commits

Reviewing files that changed from the base of the PR and between c267aba and ea9193d.

📒 Files selected for processing (15)
  • crates/runmat-builtins/src/lib.rs
  • crates/runmat-core/src/lib.rs
  • crates/runmat-core/src/value_metadata.rs
  • crates/runmat-runtime/src/builtins/array/creation/sparse.rs
  • crates/runmat-runtime/src/builtins/array/indexing/find.rs
  • crates/runmat-runtime/src/builtins/io/json/jsonencode.rs
  • crates/runmat-runtime/src/builtins/io/net/write.rs
  • crates/runmat-runtime/src/builtins/logical/ops.rs
  • crates/runmat-runtime/src/builtins/math/linalg/ops/ctranspose.rs
  • crates/runmat-runtime/src/builtins/math/linalg/ops/mod.rs
  • crates/runmat-runtime/src/builtins/math/linalg/ops/transpose.rs
  • crates/runmat-runtime/src/builtins/math/reduction/nnz.rs
  • crates/runmat-runtime/src/builtins/strings/core/char.rs
  • crates/runmat-runtime/src/builtins/strings/core/string.rs
  • crates/runmat-wasm/src/wire/value.rs

Comment on lines +449 to +495
fn numeric_vector(value: &Value, name: &str) -> BuiltinResult<Vec<f64>> {
match value {
Value::Tensor(tensor) => {
if !is_vector_shape(&tensor.shape) {
return Err(numeric_vector_error(value, name));
}
Ok(tensor.data.clone())
}
Value::SparseTensor(sparse) => {
if !is_vector_shape(&[sparse.rows, sparse.cols]) {
return Err(numeric_vector_error(value, name));
}
sparse
.to_dense()
.map(|dense| dense.data)
.map_err(|err| sparse_error(&SPARSE_ERROR_INTERNAL, format!("sparse: {err}")))
}
Value::LogicalArray(logical) => {
if !is_vector_shape(&logical.shape) {
return Err(numeric_vector_error(value, name));
}
Ok(logical
.data
.iter()
.map(|&bit| if bit != 0 { 1.0 } else { 0.0 })
.collect())
}
Value::Num(n) => Ok(vec![*n]),
Value::Int(i) => Ok(vec![i.to_f64()]),
Value::Bool(b) => Ok(vec![if *b { 1.0 } else { 0.0 }]),
other => Err(numeric_vector_error(other, name)),
}
}

fn numeric_vector_error(value: &Value, name: &str) -> RuntimeError {
sparse_error(
&SPARSE_ERROR_INVALID_INPUT,
format!("sparse: {name} must be a real numeric vector, got {value:?}"),
)
}

fn is_vector_shape(shape: &[usize]) -> bool {
if shape.len() > 2 {
return false;
}
shape.iter().filter(|&&dim| dim != 1).count() <= 1
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

[high] Restore MATLAB triplet semantics for matrix inputs and scalar expansion.

This new numeric_vector gate now rejects 2-D triplet inputs outright, and the triplet path still errors when exactly one of i, j, or v is scalar because singleton values are never broadcast before the length check. MATLAB accepts matrix triplets by flattening with (:), and it also allows scalar expansion for any one of i, j, or v. Flatten tensor-like inputs and broadcast singleton operands instead of enforcing vector-only shapes. (mathworks.com)

Also applies to: 612-645

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/runmat-runtime/src/builtins/array/creation/sparse.rs` around lines 449
- 495, numeric_vector currently rejects 2-D matrix triplet inputs and prevents
scalar expansion; update numeric_vector to accept tensor-like inputs of rank <=2
by flattening them (row-major) instead of enforcing vector-only shapes (use
tensor.data.clone() or dense.data after to_dense(), and logical.data mapped to
f64) and keep numeric_vector_error for invalid types, and update the
triplet-handling code (the triplet path referenced in the review around the
other function handling i, j, v) to perform scalar expansion: when combining i,
j, v compute the target length = max(len(i), len(j), len(v)) and if any operand
has length 1 broadcastingly repeat its single value to match target_length
before the length check/assembly; leave scalar Value::Num/Int/Bool handling as
producing length-1 vectors so the broadcast logic can handle them. Ensure to use
is_vector_shape only to allow rank <=2 inputs and remove the strict “single
non-1 dim” rejection that blocks matrices.

Comment on lines +490 to +495
fn is_vector_shape(shape: &[usize]) -> bool {
if shape.len() > 2 {
return false;
}
shape.iter().filter(|&&dim| dim != 1).count() <= 1
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

[critical] Reuse this shape validation for sparse(A) as well.

The file now has a shape predicate, but the single-argument dense path still accepts arbitrary-rank Tensor inputs. sparse_from_dense_tensor only iterates rows()*cols() elements, so a 3-D tensor will silently drop later pages instead of failing. Add a matrix-only rank check before Value::Tensor(tensor) => sparse_from_dense_tensor(&tensor).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/runmat-runtime/src/builtins/array/creation/sparse.rs` around lines 490
- 495, The single-argument dense path must enforce matrix-only input to avoid
silently dropping pages; add a rank check using the existing predicate (or a
simple shape.len() == 2 check) before calling sparse_from_dense_tensor.
Specifically, in the match arm Value::Tensor(tensor) =>
sparse_from_dense_tensor(&tensor), validate tensor.shape() (use is_vector_shape
or require shape.len() == 2) and return an error or panic if the rank != 2 so
3-D+ tensors are rejected instead of truncated.

Comment on lines +542 to +547
Value::SparseTensor(sparse) => Ok((
DataStorage::Real(sparse.to_dense().map_err(|e| {
find_error_with_message(format!("find: {e}"), &FIND_ERROR_INTERNAL)
})?),
false,
)),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

[high] find unconditionally densifies sparse input, risking OOM

Line 543 calls sparse.to_dense() to materialize the full dense representation before searching for non-zero indices. For large sparse matrices, this causes memory exhaustion.

Impact: find(sparseMat) on a large sparse matrix crashes instead of efficiently iterating the CSC storage.

Recommendation: Implement a sparse-aware path that iterates col_ptrs and row_indices directly to extract non-zero linear indices without allocating the dense array.

As per coding guidelines, prioritize performance and memory inefficiencies in hot paths.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/runmat-runtime/src/builtins/array/indexing/find.rs` around lines 542 -
547, The current implementation unconditionally calls sparse.to_dense() inside
the Value::SparseTensor match arm (invoked in the find path), which can OOM for
large sparse matrices; replace this by a sparse-aware path that directly
iterates the sparse CSC representation (use the sparse.col_ptrs and
sparse.row_indices/values or equivalent fields on the sparse object) to collect
non-zero linear indices into the output without materializing a dense buffer,
and preserve the existing error handling via find_error_with_message for any
sparse iteration failures while returning DataStorage::Real (or the appropriate
sparse-result wrapper) and the same boolean flag logic instead of calling
sparse.to_dense().

Comment on lines +418 to +423
Value::SparseTensor(sparse) => {
let dense = sparse.to_dense().map_err(|err| {
jsonencode_error_with(&JSONENCODE_ERROR_INTERNAL, format!("jsonencode: {err}"))
})?;
tensor_to_json(&dense, options)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

[high] Unconditional sparse densification causes OOM for large sparse inputs

Lines 419-420 call sparse.to_dense() before encoding, which allocates rows × cols × 8 bytes. A 100k×100k sparse matrix with only 100 non-zero entries will attempt an 80 GB allocation and crash before map_err can return a user-facing error.

Impact: Large sparse tensors (common in scientific/ML workloads) trigger process-level OOM instead of returning a catchable error.

Recommendation: Add a size guard before to_dense() that checks sparse.rows * sparse.cols against a threshold (e.g., 10M elements) and returns JSONENCODE_ERROR_INTERNAL with a descriptive message, or implement sparse-aware JSON encoding that serializes only the stored entries.

As per coding guidelines, prioritize catching resource exhaustion issues that can crash the runtime.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/runmat-runtime/src/builtins/io/json/jsonencode.rs` around lines 418 -
423, The code unconditionally calls sparse.to_dense() inside the
Value::SparseTensor arm which can OOM for large sparse shapes; before calling
sparse.to_dense() check sparse.rows.checked_mul(sparse.cols) (or equivalent)
against a safe threshold (e.g., 10_000_000 elements) and if exceeded return an
error via jsonencode_error_with(&JSONENCODE_ERROR_INTERNAL, "<descriptive
message>") instead of densifying, or implement a sparse-aware path that
serializes only the stored entries (using the sparse structure) and calls
tensor_to_json or a new sparse_to_json helper; update the Value::SparseTensor
branch to perform the size guard and choose the sparse-encoding fallback to
prevent unbounded memory allocation.

Comment on lines +497 to +499
Value::SparseTensor(s) => s.to_dense().map(|dense| dense.data).map_err(|err| {
write_error_with_message(format!("write: {err}"), &WRITE_ERROR_INTERNAL)
}),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

[high] Sparse densification before TCP write can cause OOM

Line 497 unconditionally densifies the sparse tensor before flattening to payload bytes. A large sparse matrix (e.g., 50k×50k with 500 nnz) will attempt a 20 GB allocation and crash the write thread.

Impact: Network I/O with sparse data triggers process-level OOM; the blocking write context means the event loop stalls during allocation.

Recommendation: Either (1) add a size guard before to_dense() and return WRITE_ERROR_INTERNAL if rows × cols > threshold, or (2) serialize only the stored sparse entries if the wire protocol can represent sparse data.

As per coding guidelines, prioritize catching resource exhaustion and I/O hot-path issues.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/runmat-runtime/src/builtins/io/net/write.rs` around lines 497 - 499,
The code path in Value::SparseTensor currently calls s.to_dense()
unconditionally which can allocate huge memory and cause OOM during TCP write;
modify the handling in the SparseTensor branch (the code around
Value::SparseTensor -> s.to_dense()) to perform a size guard before
densification by checking s.rows(), s.cols() (or whatever APIs expose shape) and
returning write_error_with_message(..., &WRITE_ERROR_INTERNAL) when rows × cols
exceeds a safe threshold, or alternatively implement serialization of only the
sparse entries (nonzero coordinates + values) instead of calling s.to_dense();
ensure the changed branch still returns the same error semantics and uses
WRITE_ERROR_INTERNAL and write_error_with_message for the guarded failure path.

Value::StringArray(sa) => Ok(sa),
Value::CharArray(ca) => char_array_to_string_array(ca, encoding),
Value::Tensor(tensor) => tensor_to_string_array(tensor),
Value::SparseTensor(sparse) => tensor_to_string_array(sparse.to_dense().map_err(string_flow)?),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

[high] convert_to_string_array densifies sparse tensors, risking OOM

Line 685 unconditionally densifies the sparse tensor before converting each element to a string. Large sparse matrices will trigger memory exhaustion.

Impact: string(largeSparseMatrix) crashes the runtime.

Recommendation: Add a size guard and fail fast with a descriptive error if sparse.rows * sparse.cols > threshold, or implement sparse-aware conversion that iterates stored entries and produces a sparse string representation (e.g., "sparse 1000x1000 (50 stored)").

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/runmat-runtime/src/builtins/strings/core/string.rs` at line 685, The
Value::SparseTensor arm currently calls sparse.to_dense() before
tensor_to_string_array (seen in the match branch using Value::SparseTensor and
tensor_to_string_array), which can OOM for large sparse tensors; change this by
adding a size guard that checks sparse.rows * sparse.cols against a threshold
and returns a descriptive error (e.g., "cannot convert sparse tensor XxY with Z
stored entries to dense string array") or, better, implement a sparse-aware
conversion path that iterates stored entries (using the sparse object's
stored-entry API / nnz) and emits either a sparse string summary ("sparse
{rows}x{cols} ({stored} stored)") or a sparse string array without calling
to_dense; ensure the new path preserves existing error mapping (string_flow)
semantics when failing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant