Add macro-controlled SIMD support (SSE/NEON) to math library#64
Draft
- Add SKY_MATH_SIMD CMake option (ON by default) in cmake/options.cmake
- Add SKY_MATH_SIMD compile definition in cmake/configuration.cmake
- Create MathSimd.h with platform detection for SSE (x86) and NEON (ARM)
- Add __m128/float32x4_t SIMD member to Vector4 union with alignas(16)
- Optimize Vector4 operators (+=, -=, *=, /=, Dot, Normalize) with SIMD
- Optimize Matrix4 multiply (mat*mat, mat*vec) with SIMD
- Add comprehensive math unit tests for Vector4 and Matrix4 operations

Co-authored-by: bluesky013 <35895395+bluesky013@users.noreply.github.com>
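The platform detection described for MathSimd.h could look roughly like the sketch below. Only the macro names `SKY_MATH_SIMD`, `SKY_SIMD_SSE`, `SKY_SIMD_NEON`, and `SKY_SIMD_ENABLED` come from this PR; the detection conditions, the 0/1 macro convention, and the `SimdBackendName` helper are assumptions made for illustration, not the actual header.

```cpp
// Hypothetical MathSimd.h-style detection. Only the SKY_* macro names come from
// the PR; every condition below is an assumption.
#if defined(SKY_MATH_SIMD)  // assumed to arrive as a plain -DSKY_MATH_SIMD from CMake
#  if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#    include <xmmintrin.h>  // SSE intrinsics (__m128, _mm_*)
#    define SKY_SIMD_SSE 1
#  elif defined(__ARM_NEON) || defined(__ARM_NEON__) || defined(_M_ARM64)
#    include <arm_neon.h>   // NEON intrinsics (float32x4_t, v*q_f32)
#    define SKY_SIMD_NEON 1
#  endif
#endif

// Give every macro a 0/1 value so plain `#if SKY_SIMD_SSE` works everywhere.
#ifndef SKY_SIMD_SSE
#  define SKY_SIMD_SSE 0
#endif
#ifndef SKY_SIMD_NEON
#  define SKY_SIMD_NEON 0
#endif
#define SKY_SIMD_ENABLED (SKY_SIMD_SSE || SKY_SIMD_NEON)

// Hypothetical helper (not part of the PR): reports which path was compiled in.
inline const char *SimdBackendName()
{
#if SKY_SIMD_SSE
    return "SSE";
#elif SKY_SIMD_NEON
    return "NEON";
#else
    return "scalar";
#endif
}
```

With the option switched off, or on a platform with neither instruction set, both backend macros stay 0 and callers fall through to the scalar code.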
Copilot AI changed the title from "[WIP] Improve math library to support SIMD with x86 and ARM" to "Add macro-controlled SIMD support (SSE/NEON) to math library" on Feb 16, 2026.
- Create SFloat4.h: platform-agnostic SIMD float4 type wrapping __m128 (SSE), float32x4_t (NEON), or scalar fallback with Load/Store/Set/Splat, arithmetic operators, HorizontalSum, HorizontalSumSplat, InvSqrt
- Refactor Vector4.h: remove inline __m128/float32x4_t from union, include SFloat4.h instead of MathSimd.h directly
- Refactor Vector4.inl: replace #if SSE/#elif NEON branches with unified #if SKY_SIMD_ENABLED using SFloat4 operations
- Refactor Matrix4.inl: replace #if SSE/#elif NEON branches with unified #if SKY_SIMD_ENABLED using SFloat4 operations

Co-authored-by: bluesky013 <35895395+bluesky013@users.noreply.github.com>
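To make the wrapper idea in this commit concrete, here is a minimal sketch of an `SFloat4`-style type. The names `SFloat4`, `Load`, `Store`, `Splat`, `HorizontalSum`, and the `value` member appear in this PR; the `DEMO_*` macros, the `Native` alias, the exact signatures, and the store-based horizontal sum are assumptions, and the real header (later removed from the PR) may have looked quite different.

```cpp
// Local detection so the sketch is self-contained; DEMO_* names are hypothetical.
#if defined(__SSE__) || defined(_M_X64)
#  include <xmmintrin.h>
#  define DEMO_SSE 1
#elif defined(__ARM_NEON) || defined(_M_ARM64)
#  include <arm_neon.h>
#  define DEMO_NEON 1
#endif

struct SFloat4 {
#if defined(DEMO_SSE)
    using Native = __m128;
#elif defined(DEMO_NEON)
    using Native = float32x4_t;
#else
    struct Native { float f[4]; };  // scalar fallback storage
#endif
    Native value;

    // Load four floats from (possibly unaligned) memory.
    static SFloat4 Load(const float *p)
    {
        SFloat4 r;
#if defined(DEMO_SSE)
        r.value = _mm_loadu_ps(p);
#elif defined(DEMO_NEON)
        r.value = vld1q_f32(p);
#else
        for (int i = 0; i < 4; ++i) r.value.f[i] = p[i];
#endif
        return r;
    }

    // Broadcast one scalar to all four lanes.
    static SFloat4 Splat(float s)
    {
        SFloat4 r;
#if defined(DEMO_SSE)
        r.value = _mm_set1_ps(s);
#elif defined(DEMO_NEON)
        r.value = vdupq_n_f32(s);
#else
        for (int i = 0; i < 4; ++i) r.value.f[i] = s;
#endif
        return r;
    }

    // Write the four lanes back to memory.
    void Store(float *p) const
    {
#if defined(DEMO_SSE)
        _mm_storeu_ps(p, value);
#elif defined(DEMO_NEON)
        vst1q_f32(p, value);
#else
        for (int i = 0; i < 4; ++i) p[i] = value.f[i];
#endif
    }

    friend SFloat4 operator+(const SFloat4 &a, const SFloat4 &b)
    {
        SFloat4 r;
#if defined(DEMO_SSE)
        r.value = _mm_add_ps(a.value, b.value);
#elif defined(DEMO_NEON)
        r.value = vaddq_f32(a.value, b.value);
#else
        for (int i = 0; i < 4; ++i) r.value.f[i] = a.value.f[i] + b.value.f[i];
#endif
        return r;
    }

    friend SFloat4 operator*(const SFloat4 &a, const SFloat4 &b)
    {
        SFloat4 r;
#if defined(DEMO_SSE)
        r.value = _mm_mul_ps(a.value, b.value);
#elif defined(DEMO_NEON)
        r.value = vmulq_f32(a.value, b.value);
#else
        for (int i = 0; i < 4; ++i) r.value.f[i] = a.value.f[i] * b.value.f[i];
#endif
        return r;
    }

    // Simplified for the sketch: goes through memory instead of lane shuffles.
    float HorizontalSum() const
    {
        float t[4];
        Store(t);
        return (t[0] + t[1]) + (t[2] + t[3]);
    }
};
```

Under these assumptions a dot product reads as `(a * b).HorizontalSum()`, with the platform branching confined to the wrapper rather than repeated in every caller.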
Copilot AI changed the title from "Add macro-controlled SIMD support (SSE/NEON) to math library" to "Extract SFloat4 SIMD abstraction from Vector4/Matrix4" on Feb 16, 2026.
Restore the SIMD native type as a union member in Vector4 for direct access, avoiding unnecessary Load/Store memory round-trips. Vector4 operations now construct SFloat4 directly from the simd union member and write back via .value assignment.

Co-authored-by: bluesky013 <35895395+bluesky013@users.noreply.github.com>
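The union layout this commit describes can be sketched as follows. The member names `v`, `x`/`y`/`z`/`w`, and `simd` and the `alignas(16)` requirement follow the PR text; the constructor, the `DEMO_*` macros, and the decision to operate on `simd` directly (rather than through SFloat4, as this commit does) are simplifying assumptions. The sketch also relies on two widely supported compiler behaviours that go beyond strict ISO C++: anonymous structs and reading a union member other than the one last written.

```cpp
#include <cstddef>

// Local detection so the sketch is self-contained; DEMO_* names are hypothetical.
#if defined(__SSE__) || defined(_M_X64)
#  include <xmmintrin.h>
#  define DEMO_SSE 1
#elif defined(__ARM_NEON) || defined(_M_ARM64)
#  include <arm_neon.h>
#  define DEMO_NEON 1
#endif

struct alignas(16) Vector4 {
    union {
        float v[4];
        struct { float x, y, z, w; };  // anonymous struct: compiler extension
#if defined(DEMO_SSE)
        __m128 simd;                   // same 16 bytes viewed as one SSE register
#elif defined(DEMO_NEON)
        float32x4_t simd;              // same 16 bytes viewed as one NEON register
#endif
    };

    Vector4(float x_, float y_, float z_, float w_) : v{x_, y_, z_, w_} {}

    Vector4 &operator+=(const Vector4 &rhs)
    {
#if defined(DEMO_SSE)
        simd = _mm_add_ps(simd, rhs.simd);  // no explicit load/store needed
#elif defined(DEMO_NEON)
        simd = vaddq_f32(simd, rhs.simd);
#else
        for (std::size_t i = 0; i < 4; ++i) v[i] += rhs.v[i];
#endif
        return *this;
    }
};
```

Because the register type shares storage with `v` and `x`/`y`/`z`/`w`, the operator can work on `simd` directly and callers still read the result through the scalar members.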
Copilot AI changed the title from "Extract SFloat4 SIMD abstraction from Vector4/Matrix4" to "Add macro-controlled SIMD support for math library (SSE/NEON)" on Feb 16, 2026.
Remove SFloat4.h and replace all SFloat4 usage in Vector4.inl and Matrix4.inl with direct #if SSE / #elif NEON / #else SIMD intrinsics.

Co-authored-by: bluesky013 <35895395+bluesky013@users.noreply.github.com>
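As an illustration of the direct three-way branching this commit switches to, the sketch below multiplies a 4x4 matrix by a 4-component vector. It assumes column-major storage in a flat array of 16 floats, which the PR does not state; `MulMat4Vec4` and the `DEMO_*` macros are hypothetical names, and the actual Matrix4.inl code is not shown in this conversation.

```cpp
// Local detection so the sketch is self-contained; DEMO_* names are hypothetical.
#if defined(__SSE__) || defined(_M_X64)
#  include <xmmintrin.h>
#  define DEMO_SSE 1
#elif defined(__ARM_NEON) || defined(_M_ARM64)
#  include <arm_neon.h>
#  define DEMO_NEON 1
#endif

// m: 16 floats, column-major (assumption). v: 4 floats. out: 4 floats.
// Computes out = col0 * v[0] + col1 * v[1] + col2 * v[2] + col3 * v[3].
inline void MulMat4Vec4(const float *m, const float *v, float *out)
{
#if defined(DEMO_SSE)
    __m128 r = _mm_mul_ps(_mm_loadu_ps(m + 0), _mm_set1_ps(v[0]));
    r = _mm_add_ps(r, _mm_mul_ps(_mm_loadu_ps(m + 4), _mm_set1_ps(v[1])));
    r = _mm_add_ps(r, _mm_mul_ps(_mm_loadu_ps(m + 8), _mm_set1_ps(v[2])));
    r = _mm_add_ps(r, _mm_mul_ps(_mm_loadu_ps(m + 12), _mm_set1_ps(v[3])));
    _mm_storeu_ps(out, r);
#elif defined(DEMO_NEON)
    float32x4_t r = vmulq_n_f32(vld1q_f32(m + 0), v[0]);
    r = vmlaq_n_f32(r, vld1q_f32(m + 4), v[1]);   // r += col1 * v[1]
    r = vmlaq_n_f32(r, vld1q_f32(m + 8), v[2]);
    r = vmlaq_n_f32(r, vld1q_f32(m + 12), v[3]);
    vst1q_f32(out, r);
#else
    float r[4];  // temporary so that out may alias v
    for (int row = 0; row < 4; ++row) {
        r[row] = m[row] * v[0] + m[4 + row] * v[1] + m[8 + row] * v[2] + m[12 + row] * v[3];
    }
    for (int row = 0; row < 4; ++row) out[row] = r[row];
#endif
}
```

A mat×mat multiply can reuse the same pattern by applying it to each column of the right-hand matrix.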
Copilot AI changed the title from "Add macro-controlled SIMD support for math library (SSE/NEON)" to "Add macro-controlled SIMD support (SSE/NEON) to math library" on Feb 16, 2026.
bluesky013 approved these changes on Feb 17, 2026.
Adds compile-time SIMD acceleration for `Vector4` and `Matrix4` operations, with platform detection for x86 (SSE) and ARM (NEON) and a scalar fallback.

Changes

- `cmake/options.cmake`: `SKY_MATH_SIMD` option (ON by default)
- `cmake/configuration.cmake`: propagates `SKY_MATH_SIMD` as a compile definition
- `MathSimd.h`: platform detection macros `SKY_SIMD_SSE`, `SKY_SIMD_NEON`, `SKY_SIMD_ENABLED`
- `Vector4.h`: `__m128`/`float32x4_t` as a union member alongside `float v[4]` / `{x,y,z,w}`
- `Vector4.inl`: SIMD paths for `+=`, `-=`, `*=`, `/=`, `Dot`, `Normalize` (with Newton-Raphson refined `rsqrt`)
- `Matrix4.inl`: SIMD mat×mat and mat×vec multiply
- `test/core/MathTest.cpp`: tests for Vector4 arithmetic, dot, negate, and Matrix4 multiply

Design

The SIMD type lives in the `Vector4` union for zero-cost access. Operations use direct intrinsics with three-way `#if SSE / #elif NEON / #else` branching. `Normalize` uses `_mm_rsqrt_ps` / `vrsqrteq_f32` with one Newton-Raphson iteration for precision.
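The `Normalize` approach described above can be sketched as below: take the hardware reciprocal square root estimate of the squared length, then refine it once with the Newton-Raphson step y' = y * (1.5 - 0.5 * x * y * y). The intrinsics `_mm_rsqrt_ps` and `vrsqrteq_f32` are the ones named in this PR; the function name `NormalizeVec4`, the `DEMO_*` macros, the horizontal-sum shuffles, and the use of `vrsqrtsq_f32` for the NEON refinement step are assumptions. Zero-length input is not handled here, and whether the PR handles it is not shown.

```cpp
#include <cmath>

// Local detection so the sketch is self-contained; DEMO_* names are hypothetical.
#if defined(__SSE__) || defined(_M_X64)
#  include <xmmintrin.h>
#  define DEMO_SSE 1
#elif defined(__ARM_NEON) || defined(_M_ARM64)
#  include <arm_neon.h>
#  define DEMO_NEON 1
#endif

// in, out: 4 floats each. Writes in / |in| to out (no zero-length check).
inline void NormalizeVec4(const float *in, float *out)
{
#if defined(DEMO_SSE)
    const __m128 v = _mm_loadu_ps(in);
    const __m128 sq = _mm_mul_ps(v, v);
    // Horizontal sum: add neighbouring lanes, then add the two halves.
    const __m128 t = _mm_add_ps(sq, _mm_shuffle_ps(sq, sq, _MM_SHUFFLE(2, 3, 0, 1)));
    const __m128 len2 = _mm_add_ps(t, _mm_shuffle_ps(t, t, _MM_SHUFFLE(1, 0, 3, 2)));
    __m128 y = _mm_rsqrt_ps(len2);  // low-precision estimate of 1/sqrt(len2)
    // One Newton-Raphson step: y = y * (1.5 - 0.5 * len2 * y * y)
    const __m128 half = _mm_mul_ps(_mm_set1_ps(0.5f), len2);
    y = _mm_mul_ps(y, _mm_sub_ps(_mm_set1_ps(1.5f), _mm_mul_ps(half, _mm_mul_ps(y, y))));
    _mm_storeu_ps(out, _mm_mul_ps(v, y));
#elif defined(DEMO_NEON)
    const float32x4_t v = vld1q_f32(in);
    const float32x4_t sq = vmulq_f32(v, v);
    float32x2_t p = vadd_f32(vget_low_f32(sq), vget_high_f32(sq));
    p = vpadd_f32(p, p);                          // both lanes hold the full sum
    const float32x4_t len2 = vcombine_f32(p, p);
    float32x4_t y = vrsqrteq_f32(len2);           // low-precision estimate
    // vrsqrtsq_f32(a, b) returns (3 - a * b) / 2, i.e. the Newton-Raphson factor.
    y = vmulq_f32(y, vrsqrtsq_f32(vmulq_f32(len2, y), y));
    vst1q_f32(out, vmulq_f32(v, y));
#else
    const float len2 = in[0] * in[0] + in[1] * in[1] + in[2] * in[2] + in[3] * in[3];
    const float inv = 1.0f / std::sqrt(len2);
    for (int i = 0; i < 4; ++i) out[i] = in[i] * inv;
#endif
}
```

The refined estimate is still approximate, so results on the SIMD paths are best compared against the scalar path with a tolerance rather than for exact equality.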