Skip to content

Fix data race in BSpline and QuadraticSpline evaluation (rebase of #533)#550

Merged
ChrisRackauckas merged 3 commits into
SciML:masterfrom
ChrisRackauckas-Claude:rebase-pr533-bspline-data-race
Jun 29, 2026
Merged

Fix data race in BSpline and QuadraticSpline evaluation (rebase of #533)#550
ChrisRackauckas merged 3 commits into
SciML:masterfrom
ChrisRackauckas-Claude:rebase-pr533-bspline-data-race

Conversation

@ChrisRackauckas-Claude

Copy link
Copy Markdown
Contributor

Please ignore until reviewed by @ChrisRackauckas.

Rebased version of #533 (original by @alecloudenback) onto current master. That PR was 18 commits behind and conflicting; this branch resolves the conflicts and updates it for the new test layout. Opened from the bot fork because I lacked push access to update #533's head branch directly. Supersedes #533 — close whichever is not merged.

Original change (#532 fix)

BSplineInterpolation, BSplineApprox, and uncached QuadraticSpline kept a scratch buffer in their sc field and overwrote it in place on every evaluation, so a "read" mutated shared state — evaluating one shared interpolant from multiple threads raced and silently returned wrong values.

  • BSpline{Interpolation,Approx}: allocate the spline-coefficient scratch per call instead of reusing A.sc, making evaluation reentrant.
  • QuadraticSpline: compute the three nonzero degree-2 B-spline basis values in local variables instead of writing to the shared buffer.
  • Thread-safety regression tests: a deterministic no-mutation-on-eval guard plus a concurrent stress test (now also covering degree-1 BSpline).

Rebase conflict resolution

Verification (run locally)

  • All 28 thread-safety tests pass with julia -t4 (concurrent stress path exercised, not skipped).
  • 2-point QuadraticSpline edge case works; cached vs uncached agree across the range.
  • Runic-formatted.

🤖 Generated with Claude Code

alecloudenback and others added 2 commits June 28, 2026 12:47
BSplineInterpolation, BSplineApprox, and QuadraticSpline (with the default
cache_parameters=false) kept a scratch buffer in their `sc` field and overwrote
it in place on every evaluation. Because a "read" (evaluation) mutated shared
state inside the object, evaluation was not reentrant: evaluating a single
shared interpolant from multiple threads raced on `A.sc` and silently returned
wrong values.

- BSpline{Interpolation,Approx} (interpolation_methods.jl, derivatives.jl):
  allocate the spline-coefficient scratch per call instead of reusing `A.sc`.
  This matches the existing ForwardDiff.Dual branch and makes eval reentrant.
- QuadraticSpline (parameter_caches.jl): compute the three nonzero degree-2
  B-spline basis values in local variables instead of writing them into the
  shared `A.sc` buffer. Keeps evaluation allocation-free (verified for scalar
  and SVector data) while making it reentrant; also covers derivatives/integrals.
- Add thread-safety regression tests: a deterministic no-mutation-on-eval check
  plus a concurrent stress test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A degree-1 B-spline goes through the same _interpolate method as higher
degrees (dispatch is on type, not degree), so it was equally subject to the
shared-buffer race; add it explicitly to document the coverage.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@ChrisRackauckas-Claude

Copy link
Copy Markdown
Contributor Author

Added a follow-up commit: Make BSpline evaluation allocation-free as well as thread-safe.

The original #532 fix made BSpline eval reentrant by allocating a per-call scratch buffer (zeros(eltype(t), n)), which removed the data race but added a heap allocation per call (848 B for a length-99 scalar spline). QuadraticSpline was already allocation-free via its local-variable rewrite; this brings BSpline to parity.

Approach: a degree-d B-spline has only d + 1 nonzero basis functions at any point, so the new bspline_nonzero_coefficients computes them into a stack-allocated MVector (StaticArrays) and returns an SVector window + control-point offset. Allocation-free and reentrant — no shared or heap state per call. All four _interpolate and four _derivative BSpline methods use it.

To keep a single statically-allocation-free path (AllocCheck sees every branch reachable from the call operator, including extrapolation paths into _derivative), the heap fallback is removed and degree is bounded at construction (d < 16). Max degree used anywhere in this repo is 3; degree ≥16 global B-splines are numerically degenerate.

StaticArrays moves from a test-only dep to a hard dep.

Allocations/call (length-99 scalar spline):

before after
BSpline value 848 B 0 B
BSpline derivative 848 B 0 B
BSplineApprox 128 B 0 B
BSpline array value 1888 B 1040 B (output array only)

Verification (all run locally):

  • New bspline_nonzero_coefficients matches the reference spline_coefficients! exactly (max diff 0.0) over degrees 1–7 incl. both endpoints
  • Derivatives match finite differences (~1e-10) and ForwardDiff (~1e-16)
  • AllocCheck.check_allocs: 0 allocation sites for value and derivative
  • test/qa/alloc_tests.jl (with new BSpline value+derivative cases): 16/16
  • test/Methods/thread_safety_tests.jl with -t4: 28/28 (concurrent stress path exercised)
  • interpolation_tests.jl: 17412 pass, 0 fail; derivative_tests.jl: 34479 pass, 0 fail

🤖 Generated with Claude Code

@ChrisRackauckas ChrisRackauckas marked this pull request as ready for review June 29, 2026 11:12
The thread-safety fix for SciML#532 replaced the shared `sc` scratch field with a
per-call `zeros(eltype(t), n)` buffer in BSpline{Interpolation,Approx} evaluation
and differentiation. That made eval reentrant but reintroduced a heap allocation
on every call (848 B for a length-99 scalar spline; QuadraticSpline was already
allocation-free via its local-variable rewrite).

A degree-`d` B-spline has only `d + 1` nonzero basis functions at any point, so a
buffer sized to the full knot vector is unnecessary. `bspline_nonzero_coefficients`
computes those `d + 1` values into a stack-allocated `MVector` (StaticArrays) and
returns them as an `SVector` window plus the control-point offset. This is both
allocation-free and reentrant — no shared or heap state per call. All four
`_interpolate` and four `_derivative` BSpline methods use it.

Because the guarantee requires a single statically-allocation-free path (AllocCheck
sees every branch reachable from the call operator, including the extrapolation
paths that dispatch to `_derivative`), the heap fallback is removed and the degree
is bounded at construction (`d < 16`). The largest degree used anywhere in the repo
is 3; degree ≥16 global B-splines are numerically degenerate.

StaticArrays moves from a test-only dependency to a hard dependency.

Allocations per call (length-99 scalar spline): value 848 B → 0 B, derivative
848 B → 0 B; BSplineApprox 128 B → 0 B. Array-valued eval still allocates its
output array, which is inherent.

Adds AllocCheck regression coverage for BSpline value and derivative evaluation to
test/qa/alloc_tests.jl.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TpSbnpEmw2Z2z5rxvHWLGn
@ChrisRackauckas-Claude ChrisRackauckas-Claude force-pushed the rebase-pr533-bspline-data-race branch from 82c269e to d60cf28 Compare June 29, 2026 11:25
@ChrisRackauckas ChrisRackauckas merged commit 4cf5128 into SciML:master Jun 29, 2026
15 of 18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants