
Dev m4 optimization #29

Open
DiamonDinoia wants to merge 44 commits into DEV_NEW_ALPHA_REB from DEV_M4_OPTIMIZATION


@DiamonDinoia
Collaborator

Used a Clang builtin to vectorize the iw4 accumulation.
The optimizations are disabled when compiling with GCC.
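
For readers unfamiliar with the pattern, here is a minimal sketch of a compiler-gated vectorized accumulation. It is not the code from this PR: the description does not say which builtin is used, so the sketch assumes Clang's `ext_vector_type` vector extension, and the function name `accumulate_sum` is hypothetical. With any other compiler it falls back to a plain scalar loop, mirroring the "disabled if gcc is used" behavior.

```cpp
#include <cstddef>
#include <cstring>

#if defined(__clang__)
// Clang-only path: a 4-lane double vector from Clang's extended vector types.
// (Assumption: the PR may use a different builtin or vector width.)
typedef double vec4d __attribute__((ext_vector_type(4)));

double accumulate_sum(const double* data, std::size_t n) {
  vec4d acc = {0.0, 0.0, 0.0, 0.0};
  std::size_t i = 0;
  // Process four elements per iteration with one vector addition.
  for (; i + 4 <= n; i += 4) {
    vec4d v;
    std::memcpy(&v, data + i, sizeof(v));  // load without alignment assumptions
    acc += v;
  }
  // Horizontal reduction of the four lanes.
  double sum = acc[0] + acc[1] + acc[2] + acc[3];
  // Scalar tail for the remaining (n % 4) elements.
  for (; i < n; ++i) sum += data[i];
  return sum;
}
#else
// Fallback for GCC and other compilers: plain scalar accumulation.
double accumulate_sum(const double* data, std::size_t n) {
  double sum = 0.0;
  for (std::size_t i = 0; i < n; ++i) sum += data[i];
  return sum;
}
#endif
```

Note that the vector path sums in a different order than the scalar path, so floating-point results can differ in the last bits between compilers.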

@DiamonDinoia DiamonDinoia requested a review from Wentzell May 31, 2024 21:28
@Wentzell Wentzell changed the base branch from unstable to DEV_NEW_ALPHA_REB June 3, 2024 18:53
@ahbarnett
Collaborator

How's the FINUFFT speed vs NFFT3 working out for you?

@Wentzell Wentzell force-pushed the DEV_NEW_ALPHA_REB branch from d7823eb to 1efbbd1 Compare June 18, 2024 21:07
@Wentzell Wentzell force-pushed the DEV_M4_OPTIMIZATION branch from d450d90 to 791f048 Compare June 18, 2024 21:07
@Wentzell
Member

> How's the FINUFFT speed vs NFFT3 working out for you?

Hey @ahbarnett, switching to the FINUFFT backend has been a major performance improvement!
For the measurement that we are optimizing here, the NUFFT transform is now subdominant again (~6%).

@Wentzell Wentzell force-pushed the DEV_M4_OPTIMIZATION branch from 8565f44 to 607b619 Compare June 25, 2024 16:40
@Wentzell Wentzell force-pushed the DEV_M4_OPTIMIZATION branch from 19c0e6a to e57982b Compare August 9, 2024 19:51
@Wentzell Wentzell force-pushed the DEV_M4_OPTIMIZATION branch 7 times, most recently from 03852e0 to 405df56 Compare May 2, 2025 15:02
@Wentzell Wentzell force-pushed the DEV_NEW_ALPHA_REB branch from 2e7d310 to 7ffa19f Compare May 2, 2025 15:12
@Wentzell Wentzell force-pushed the DEV_M4_OPTIMIZATION branch from 405df56 to d7e36bd Compare May 2, 2025 15:13
hmenke and others added 21 commits June 4, 2025 10:24
…er parameter (-1 = default and unlimited)

Co-authored-by: Henri Menke <henri@henrimenke.de>
Co-authored-by: Marcel Klett <m.klett@fkf.mpg.de>
Co-authored-by: Marcel Klett <m.klett@fkf.mpg.de>
Co-authored-by: Marcel Klett <m.klett@fkf.mpg.de>
Co-authored-by: Marcel Klett <m.klett@fkf.mpg.de>
…eanup

- Update reference test data

Co-authored-by: Nils Wentzell <nwentzell@flatironinstitute.org>
- Set nfft tol to 1e-15
- Use finufft guru interface
- Change default buffer size
@Wentzell Wentzell force-pushed the DEV_M4_OPTIMIZATION branch from f756c6f to ec86e6b Compare August 4, 2025 22:35
@Wentzell Wentzell force-pushed the DEV_M4_OPTIMIZATION branch from ec86e6b to c1109de Compare January 6, 2026 09:01

6 participants