Skip to content

Add batched and multi-dimensional rfft/irfft support#60

Merged
dlfivefifty merged 6 commits into
JuliaApproximation:mainfrom
jamesquinlan:main
Jun 17, 2026
Merged

Add batched and multi-dimensional rfft/irfft support#60
dlfivefifty merged 6 commits into
JuliaApproximation:mainfrom
jamesquinlan:main

Conversation

@jamesquinlan

Copy link
Copy Markdown
Contributor

Motivation: add support for AbstractMatrix and higher-dim AbstractArray types in rfft and irfft. It was needed to support workflows in SpeedyTransforms.jl which call plans on 2d views. Previously, generic_rfft only handled AbstractVector giving a MehtodError for matrces.

New Features:

  • Batched 1d transfomrs: Compute rffts along any dimension of an array.
  • Support transforms over regions (aligns with AbstractFFTs interface.
  • Handle plan inversions between real and complex types

Minor Bug Fixes & Improvements:

  • plan_rfft stores length of the tranformed dimension instead of the total array length
  • Fixed double-scaling bug when normalizing multi-dims array
  • Lossened pinv in DummyPlan to allow real-input plans to store complex-output inverses.

Verification:

  • added set of tests in test/fft_tests.jl of batached 1d rfft/irfft along different dimensions
  • 2d rffts
  • plan inversion
  • all existing tests passed (with a relaxation of tol = 100eps

Signed-off-by: jamesquinlan <james@stillwater-sc.com>
@milankl

milankl commented May 26, 2026

Copy link
Copy Markdown
Collaborator

Awesome thanks for this effort! Feel free to keep @maximilian-gelbrecht and me updated about this progress. We have recently implemented MatrixSpectralTransform in SpeedyTransforms as it's fast at low resolutions with the side effect of being more type flexible as it only needs a matrix vector multiplication. But being able to use GenericFFT is also great!

@codecov

codecov Bot commented May 26, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 93.33333% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.97%. Comparing base (eaae4c0) to head (43f6205).
⚠️ Report is 7 commits behind head on main.

Files with missing lines Patch % Lines
src/fft.jl 93.33% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #60      +/-   ##
==========================================
+ Coverage   76.05%   82.97%   +6.91%     
==========================================
  Files           2        2              
  Lines         213      276      +63     
==========================================
+ Hits          162      229      +67     
+ Misses         51       47       -4     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Signed-off-by: jamesquinlan <james@stillwater-sc.com>
Comment thread src/fft.jl Outdated
Comment thread src/fft.jl
Comment thread src/fft.jl Outdated
Signed-off-by: jamesquinlan <james@stillwater-sc.com>
@jamesquinlan

jamesquinlan commented May 30, 2026

Copy link
Copy Markdown
Contributor Author

My goal is to run in low-precision (added to this to the README).

A few bugs (in my case) found along the way include:

Bug 1: plan_fft not registered for real input
fft(x) for real arrays bypassed GenericFFT entirely, falling through to AbstractFFTs which ran in Float64. This made Float16 appear to give 1e-7 error when it was actually computing in Float64. Float16 has at best 1e-4.

Bug 2: Missing fft dispatch for non-FFTW types
Even after fixing plan_fft, AbstractFFTs method plan_fft(x::AbstractArray{<:Real}, region; kws...) won dispatch over GenericFFT's plan. Required explicitly defining fft(x::StridedArray{T}) where {T<:AbstractFloats}.

Bug 3: generic_fft only defined for Complex{T}, not real T
Real input fell through to the general fallback generic_fft(x) = generic_fft!(copy(complex(x))) which caused infinite recursion and stack overflow for Vector{Complex{BFloat16}}.

Bug 4: Index arithmetic computed in type T (LP formats have limited consecutive integers)
ks = range(zero(real(T)), stop=n-one(real(T)), length=n)
Wks = @. cispi(-T(ks^2/n))
For k=999, BFloat16 cannot represent 999 exactly (rounds to 1000), giving k^2 error of 576 and making twiddle factors completely wrong for the upper half of frequency bins. Fix: compute ks in Float64.

Bug 5: _conv! intermediate products exceeded low-precision type range
Internal convolution FFT produces values up to 2545 where eps(BFloat16(2545)) = 16, meaning every intermediate product is rounded to the nearest multiple of 16. Fix: promote to Float64 inside _conv!, round back to T at output.

Bug 6: Wks = T.(cispi.(...)) fails for real T
cispi returns complex values which cannot be converted to real BFloat16. Fix: always use Complex{real(T)} for twiddle factors.

Tests: Added testsets covering all six bugs for Float16, BFloat16, BigFloat

Dependencies:
(1) Moved DoubleFloats to test dependencies
(2) Added BFloat16s as test dependency

Signed-off-by: jamesquinlan <james@stillwater-sc.com>
Signed-off-by: jamesquinlan <james@stillwater-sc.com>
@jamesquinlan jamesquinlan requested a review from dlfivefifty June 12, 2026 12:37
Signed-off-by: jamesquinlan <james@stillwater-sc.com>
@jamesquinlan

Copy link
Copy Markdown
Contributor Author

Review comments addressed; removed threading and added CartesianIndices explanation. Tests for the missing coverage paths are in place.

@dlfivefifty dlfivefifty merged commit 770d6a1 into JuliaApproximation:main Jun 17, 2026
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants