Skip to content

Batched (matrix) right-hand sides: LinearProblem(A, B) ≡ A \ B (v4.0.0)#1072

Draft
ChrisRackauckas-Claude wants to merge 7 commits into
SciML:mainfrom
ChrisRackauckas-Claude:batch-rhs-v4
Draft

Batched (matrix) right-hand sides: LinearProblem(A, B) ≡ A \ B (v4.0.0)#1072
ChrisRackauckas-Claude wants to merge 7 commits into
SciML:mainfrom
ChrisRackauckas-Claude:batch-rhs-v4

Conversation

@ChrisRackauckas-Claude

@ChrisRackauckas-Claude ChrisRackauckas-Claude commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Please ignore until reviewed by @ChrisRackauckas. Draft; opened by an agent at Chris's direction.

Closes #552.

Summary

solve(LinearProblem(A, B::AbstractMatrix)) and the init/solve! cache path now behave like A \ B, returning an n×k solution, for all factorization-based algorithms. Iterative/Krylov algorithms throw an informative ArgumentError at init time for matrix b instead of the previous internal MethodError (or silently wrong shapes).

Root cause of the old behavior: __init_u0_from_Ab(A, b) = similar(b, size(A, 2)) flattened a matrix b into a length-n vector u, so factorization solve! hit _ldiv!(::Vector, ::LU, ::Matrix) with no method.

What changed

  • src/common.jl: matrix-b method for __init_u0_from_Ab (zero-filled (size(A,2), size(b,2)), + SMatrix disambiguations); new _check_batched_rhs_support guard at __init for AbstractKrylovSubspaceMethod and Krylov-choosing DefaultLinearSolver.
  • src/factorization.jl: _check_residual_safety buffer made shape-generic.
  • src/default.jl: size heuristics use size(b, 1) instead of length(b) (3 sites) so batched b doesn't inflate the apparent problem size.
  • src/iterative_wrappers.jl: init_cacheval(::KrylovJL, A, b::AbstractMatrix, ...) = nothing so the default polyalgorithm's unused Krylov slots initialize for structured defaults.
  • src/mkl.jl / src/openblas.jl / src/appleaccelerate.jl: rectangular (m>n) solution copies made matrix-aware.
  • src/simplelu.jl: informative error (vector-only workspace).
  • ext/LinearSolveSparseArraysExt.jl: CHOLMOD pre-1.12 fallback widened to AbstractVecOrMat; SPQR matrix solve via \; SparseColumnPivotedQR column-by-column; size(b,1) heuristic.
  • test/Core/batch.jl (new, 85 tests, wired into Core), docs tutorial section + release note.
  • Project.toml: 3.87.0 → 4.0.0.

Batch support matrix (all verified by running)

Works (≡ A\B): LU, GenericLU (incl. BigFloat), QR (NoPivot/ColumnNorm), SVD, Cholesky (Symmetric/Hermitian), BunchKaufman, NormalCholesky, LDLt, Diagonal, DirectLdiv!, MKL/OpenBLAS/AppleAccelerate LU, UMFPACK, KLU, PureKLU, CHOLMOD, SparseColumnPivotedQR, the DEFAULT algorithm (dense/sparse/structured/non-square), static A+B, cache reuse (cache.A=, cache.b=), residual safety, singular-A Failure retcode.

Informative ArgumentError: all KrylovJL_*, SimpleGMRES, IterativeSolversJL/KrylovKitJL, SimpleLUFactorization, default→Krylov for operator A.

Left as-is: SMatrix A + heap Matrix B (StaticArrays.LU has no matrix ldiv!; SMatrix+SMatrix works); adjoint.jl matrix-b adjoints out of scope.

Why breaking (v4.0.0)

The u0 shape for matrix b changes from a flattened length-n vector to n×k, and matrix-b + iterative now errors at init. Downstream audit (OrdinaryDiffEq.jl, NonlinearSolve.jl): functionally unaffected — every call site passes _vec(...)/safe_vec-flattened vectors with explicit u0 (~60 sites audited in OrdinaryDiffEq libs, 1 production site in NonlinearSolveBase). They need only compat bumps for the major version (OrdinaryDiffEq: 10 lib Project.tomls at "3.75.0", RosenbrockTableaus "3.46", 2 test envs; NonlinearSolve: top-level + Base "3.48", FirstOrder/QuasiNewton "2.36.1, 3", gpu-test/docs envs).

Tests

GROUP=Core Pkg.test() on Julia 1.12.4 and 1.10.11: 1506 pass, 0 new failures (identical counts on both), including the 85 new batch tests. Pre-existing on unmodified main: 1 @test_broken in return codes, 2 in resize, and 1 error in forwarddiff_overloads.jl ("PureKLU direct dual path") — the latter bisected to aabf007 (#1069) and fixed separately in #1073.

Note: tests were run with the version field temporarily at 3.87.0 solely so temp test environments resolve; docs example manually verified equivalent to a passing test.

First adopter once released: phi!/PhiPadeCache in SciML/ExponentialUtilities.jl#236 (currently using lu!/ldiv! directly because the v3 cache path only solves vector b).

🤖 Generated with Claude Code

ChrisRackauckas and others added 3 commits July 3, 2026 11:42
`solve(LinearProblem(A, B))` with `B::AbstractMatrix` now computes the
equivalent of `A \ B`: `u0` is initialized as a `size(A, 2) × size(B, 2)`
matrix and the factorization-based algorithms solve all columns against a
single factorization of `A`. This is a breaking change (v4.0.0): previously a
matrix `b` initialized a flattened vector `u` and errored downstream.

- `__init_u0_from_Ab` gains matrix-`b` methods (including SMatrix
  disambiguation).
- Iterative/Krylov algorithms (`KrylovJL`, `SimpleGMRES`, and the other
  `AbstractKrylovSubspaceMethod`s, plus the default algorithm when it selects
  a Krylov method for an operator) throw an informative `ArgumentError` at
  `init` time for matrix `b`; `SimpleLUFactorization` (vector-only workspace)
  does the same.
- The default polyalgorithm's unused Krylov cacheval slots initialize to
  `nothing` for matrix `b` so structured-matrix defaults (Diagonal,
  SymTridiagonal, ...) work.
- `_check_residual_safety` sizes its residual buffer shape-generically.
- Default-algorithm size heuristics use `size(b, 1)` instead of `length(b)`
  so batched right-hand sides don't inflate the apparent problem size (same
  for the sparse `use_klulike_sparse_structure` heuristic).
- MKL/OpenBLAS/AppleAccelerate LU copy solution columns correctly for matrix
  `b` in the rectangular (`m > n`) path.
- SparseArrays extension: CHOLMOD pre-1.12 `_ldiv!` fallback widened to
  matrices, SPQR gains a matrix `_ldiv!` via `\` (no in-place matrix ldiv! on
  any Julia version), and SparseColumnPivotedQR solves matrix `b`
  column-by-column against the one factorization.

Part of SciML#552

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01EYp371jx6LurezUDhKcYRh
Covers dense LU/GenericLU/QR (NoPivot + ColumnNorm)/SVD/Cholesky/
BunchKaufman/NormalCholesky vs A \ B for Float64 and ComplexF64, the default
algorithm (dense, sparse, structured, non-square with k != n), sparse
UMFPACK/KLU/CHOLMOD and the non-square sparse QR default, cache reuse via
`cache.b = B2` and `cache.A = A2`, failure retcodes on singular A (including
the default's QR fallback), residual-safety with matrix B, informative errors
for Krylov methods and SimpleLUFactorization, static arrays, BigFloat, and
single-column-matrix vs vector consistency.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01EYp371jx6LurezUDhKcYRh
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01EYp371jx6LurezUDhKcYRh
All sublibrary, test, qa, GPU, Trim, and docs environments pinned
LinearSolve = "3[.x.y]", so every CI job died at resolution against the
4.0.0 bump ("empty intersection between LinearSolve@4.0.0 and project
compatibility"). Widen each to include 4. The docs environment also gains
[sources] entries for LinearSolve and LinearSolveAutotune (the registered
Autotune restricts LinearSolve to 3, so the docs build must use the local
copies, matching the pattern the test environments already use).

Verified locally: docs, lib/LinearSolveAutotune, and test/qa environments
all instantiate against 4.0.0.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@ChrisRackauckas-Claude

Copy link
Copy Markdown
Contributor Author

CI fallout was the predictable major-bump resolution failure: 18 in-repo environments (sublibraries, test envs, qa, GPU, Trim, docs) pinned LinearSolve = "3", so every job died at instantiate with "empty intersection between LinearSolve@4.0.0 and project compatibility". Pushed 9c30317 widening all of them to include 4, plus [sources] entries in the docs env (registered LinearSolveAutotune restricts LinearSolve to 3, so docs must build against the local copies — same pattern the test envs already use). Verified locally that the docs, Autotune, and qa environments instantiate against 4.0.0.

The Runic check failure is pre-existing on main (test/Core/forwarddiff_overloads.jl landed unformatted in aabf007/#1069) — formatted in #1073 with the rest of that commit's fallout.

ChrisRackauckas and others added 2 commits July 3, 2026 14:25
The repo-wide Runic Format Check fails on main because this file landed
unformatted in aabf007 (SciML#1069); format it here alongside the dispatch fix
for the same commit's fallout.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Every registered AlgebraicMultigrid >= 0.6 caps LinearSolve at 3, so the
root test target (which all CI test groups resolve) was unsatisfiable
against the 4.0.0 bump. AMG 0.5.x predates its LinearSolve dependency and
exposes the identical API surface the extension uses
(SmoothedAggregationAMG, RugeStubenAMG, aspreconditioner, ruge_stuben);
the AlgebraicMultigridJL testset passes on 0.5.1 (verified locally,
including the tight-tolerance case). The resolver will pick AMG 2.x again
automatically once AlgebraicMultigrid widens its own compat; the compat
comment marks the entry for removal then.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@ChrisRackauckas-Claude

Copy link
Copy Markdown
Contributor Author

Two more CI fixes pushed (0c5e242, d64bad8):

  1. Runic: the format violation in test/Core/forwarddiff_overloads.jl is pre-existing on main (landed unformatted in aabf007/Route duals-in-b PureKLU solves to a native LinearCache (uses PureKLU 1.1 mixed ldiv!) #1069); the format fix is cherry-picked here too so this PR's check can go green independently of Fix PureKLU Dual-A dispatch stolen by DualBLinearProblem opt-out #1073.

  2. The real v4 bootstrap problem: every registered AlgebraicMultigrid ≥ 0.6 caps LinearSolve at 3, and AMG sits in the root test target that all CI test groups resolve — so one registry cap poisoned every job. Rather than dropping the AMG tests or pointing at a fork, root compat now also allows AlgebraicMultigrid = 0.5, which predates AMG's LinearSolve dependency but exposes the identical API the extension uses. Verified locally: resolver picks 0.5.1 under v4, the ext loads, and the full AlgebraicMultigridJL testset passes including the tight-tolerance case. Once AMG widens its compat (needs a PR to JuliaLinearAlgebra/AlgebraicMultigrid.jl — not opened, per repo-permission rules), the resolver auto-upgrades back to 2.x and the annotated compat entry can be dropped.

Registry scan says these are the only three packages in any in-repo environment that cap LinearSolve: AlgebraicMultigrid (handled above), and the two in-repo sublibraries LinearSolveAutotune/LinearSolvePyAMG (handled via widened compat + [sources]). Full root test target (23 extras) verified to resolve locally against 4.0.0.

@ChrisRackauckas-Claude

Copy link
Copy Markdown
Contributor Author

Upstream compat PR opened at Chris's direction: JuliaLinearAlgebra/AlgebraicMultigrid.jl#135 (verified against this branch — AMG algorithms and the preconditioner path give bit-identical results under v3.87.0 and v4.0.0). Once that merges and tags, the temporary AlgebraicMultigrid = "0.5" compat entry here can be dropped and the resolver returns to AMG 2.x.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@ChrisRackauckas-Claude

Copy link
Copy Markdown
Contributor Author

CI status: 61 pass, 1 fail — and the one failure ("Downgrade / Downgrade Tests - Core") is pre-existing infrastructure, not this branch:

  • Failure mode is "The self-hosted runner lost communication with the server" ~16–17 min into the test run (twice in a row on this PR, same spot), i.e. the runner process dies — no test assertion fails.
  • The same workflow fails on main: 4 of the last 5 main runs failed (2026-07-03 ×2, 2026-06-30, 2026-06-28; last success 2026-06-29). Whatever kills the runner predates this branch — the timing is consistent with something merged ~June 30 making the downgraded Core run heavy enough to starve the runner, but that's for a separate investigation.
  • This branch's Downgrade job passes resolution and build (the phases v4 could plausibly break, and the phases that did break before the compat fixes here); locally the downgraded env resolves.

So the v4 branch is green on everything that is green on main. Attempting a local repro of the downgraded Core run to see if it's an OOM; will file findings separately.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Batch mode

2 participants