Add macro-controlled SIMD support (SSE/NEON) to math library#64

Draft
Copilot wants to merge 9 commits into main from copilot/improve-math-library-simd-support

Copilot AI commented Feb 16, 2026

Adds compile-time SIMD acceleration for Vector4 and Matrix4 operations, with platform detection for x86 (SSE) and ARM (NEON) and scalar fallback.

Changes

  • cmake/options.cmake: SKY_MATH_SIMD option (ON by default)
  • cmake/configuration.cmake: propagates SKY_MATH_SIMD as a compile definition
  • MathSimd.h: platform detection macros SKY_SIMD_SSE, SKY_SIMD_NEON, SKY_SIMD_ENABLED
  • Vector4.h: __m128/float32x4_t as a union member alongside float v[4] and {x,y,z,w}
  • Vector4.inl: SIMD paths for +=, -=, *=, /=, Dot, Normalize (rsqrt refined with Newton-Raphson)
  • Matrix4.inl: SIMD mat×mat and mat×vec multiply
  • test/core/MathTest.cpp: tests for Vector4 arithmetic, dot, negate, and Matrix4 multiply
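
For readers unfamiliar with this kind of detection header, here is a minimal sketch of how the three macros could be derived. It is not the contents of MathSimd.h: the compiler predefines checked (`__SSE__`, `_M_X64`, `_M_IX86_FP`, `__ARM_NEON`, `_M_ARM64`) and the helper `SimdBackendName` are assumptions made for illustration. Only the macro names SKY_MATH_SIMD, SKY_SIMD_SSE, SKY_SIMD_NEON, and SKY_SIMD_ENABLED come from this PR.

```cpp
#include <cassert>

// Assumption: the build system normally passes SKY_MATH_SIMD (e.g. -DSKY_MATH_SIMD=1).
// Default to 1 here to mirror the "ON by default" CMake option.
#ifndef SKY_MATH_SIMD
#define SKY_MATH_SIMD 1
#endif

// Pick at most one backend. Each macro is always defined (0 or 1) so that
// plain "#if SKY_SIMD_SSE" works without "defined(...)".
#if SKY_MATH_SIMD && (defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1))
    #include <xmmintrin.h>
    #define SKY_SIMD_SSE 1
    #define SKY_SIMD_NEON 0
#elif SKY_MATH_SIMD && (defined(__ARM_NEON) || defined(__ARM_NEON__) || defined(_M_ARM64))
    #include <arm_neon.h>
    #define SKY_SIMD_SSE 0
    #define SKY_SIMD_NEON 1
#else
    // Scalar fallback: SIMD disabled or no supported instruction set detected.
    #define SKY_SIMD_SSE 0
    #define SKY_SIMD_NEON 0
#endif

#define SKY_SIMD_ENABLED (SKY_SIMD_SSE || SKY_SIMD_NEON)

// Hypothetical helper (not part of the PR) that reports the selected backend.
constexpr const char* SimdBackendName() {
#if SKY_SIMD_SSE
    return "SSE";
#elif SKY_SIMD_NEON
    return "NEON";
#else
    return "scalar";
#endif
}
```

Defining every macro to 0 or 1 keeps the `#if SSE / #elif NEON / #else` pattern free of `defined(...)` checks, and building with `-DSKY_MATH_SIMD=0` forces the scalar path on any platform.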

Design

SIMD type lives in the Vector4 union for zero-cost access. Operations use direct intrinsics with three-way #if SSE / #elif NEON / #else branching:

inline Vector4& Vector4::operator+=(const Vector4& rhs) {
#if SKY_SIMD_SSE
    simd = _mm_add_ps(simd, rhs.simd);
#elif SKY_SIMD_NEON
    simd = vaddq_f32(simd, rhs.simd);
#else
    x += rhs.x; y += rhs.y; z += rhs.z; w += rhs.w;
#endif
    return *this;
}
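
The union layout that makes `simd` directly addressable might look roughly like the sketch below. The type name `Vec4Sketch`, the alias `SimdFloat4`, and the simplified platform checks are hypothetical; only the member names `simd`, `v`, and `x, y, z, w` and the use of alignas(16) are taken from this PR. Anonymous structs and reading a union member other than the one last written are compiler extensions (supported by GCC, Clang, and MSVC) rather than standard C++.

```cpp
#include <cassert>

// Simplified backend selection for this sketch only.
#if defined(__SSE__) || defined(_M_X64)
    #include <xmmintrin.h>
    using SimdFloat4 = __m128;
#elif defined(__ARM_NEON) || defined(_M_ARM64)
    #include <arm_neon.h>
    using SimdFloat4 = float32x4_t;
#else
    // Scalar stand-in so the layout is identical without SIMD.
    struct alignas(16) SimdFloat4 { float lanes[4]; };
#endif

// Hypothetical stand-in for Vector4: three views of the same 16 bytes.
struct alignas(16) Vec4Sketch {
    union {
        struct { float x, y, z, w; };  // named components (compiler extension)
        float v[4];                    // indexed access
        SimdFloat4 simd;               // native SIMD register type
    };

    Vec4Sketch(float x_, float y_, float z_, float w_) : v{x_, y_, z_, w_} {}
};

static_assert(sizeof(Vec4Sketch) == 16, "all union views must share 16 bytes");
static_assert(alignof(Vec4Sketch) == 16, "aligned loads require 16-byte alignment");
```

Because the SIMD member occupies the same storage as the scalars, an operator such as += can pass `simd` straight to an intrinsic without an explicit load or store.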

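Normalize computes the reciprocal square root from the hardware estimate (_mm_rsqrt_ps on SSE, vrsqrteq_f32 on NEON) and refines it with one Newton-Raphson iteration, because the raw estimates are only approximate.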
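
The refinement step can be sketched as follows. This is an illustration under assumptions, not the code in Vector4.inl: `Normalize4` is a hypothetical free function working on a plain float array, the `SKETCH_*` macros stand in for the PR's detection macros, and zero-length input is deliberately left unhandled. For an estimate y0 of 1/sqrt(s), one Newton-Raphson step is y1 = y0 * (1.5 - 0.5 * s * y0 * y0).

```cpp
#include <cassert>
#include <cmath>

#if defined(__SSE__) || defined(_M_X64)
    #include <xmmintrin.h>
    #define SKETCH_SSE 1
#elif defined(__ARM_NEON) || defined(_M_ARM64)
    #include <arm_neon.h>
    #define SKETCH_NEON 1
#endif

// Normalizes the 4-component vector v in place.
// A zero-length input is not handled and yields non-finite values.
inline void Normalize4(float v[4]) {
#if defined(SKETCH_SSE)
    const __m128 a = _mm_loadu_ps(v);
    const __m128 sq = _mm_mul_ps(a, a);
    // Two shuffle+add steps leave the squared length in every lane.
    __m128 sum = _mm_add_ps(sq, _mm_shuffle_ps(sq, sq, _MM_SHUFFLE(2, 3, 0, 1)));
    sum = _mm_add_ps(sum, _mm_shuffle_ps(sum, sum, _MM_SHUFFLE(1, 0, 3, 2)));
    // Hardware estimate, then one Newton-Raphson step.
    const __m128 y0 = _mm_rsqrt_ps(sum);
    const __m128 syy = _mm_mul_ps(_mm_mul_ps(sum, y0), y0);
    const __m128 y1 = _mm_mul_ps(
        y0, _mm_sub_ps(_mm_set1_ps(1.5f), _mm_mul_ps(_mm_set1_ps(0.5f), syy)));
    _mm_storeu_ps(v, _mm_mul_ps(a, y1));
#elif defined(SKETCH_NEON)
    const float32x4_t a = vld1q_f32(v);
    const float32x4_t sq = vmulq_f32(a, a);
    // Pairwise adds reduce four lanes to one sum, then broadcast it.
    float32x2_t pair = vadd_f32(vget_low_f32(sq), vget_high_f32(sq));
    pair = vpadd_f32(pair, pair);
    const float32x4_t sum = vcombine_f32(pair, pair);
    const float32x4_t y0 = vrsqrteq_f32(sum);
    // vrsqrtsq_f32(p, q) computes (3 - p*q) / 2, the Newton-Raphson factor.
    const float32x4_t y1 = vmulq_f32(y0, vrsqrtsq_f32(vmulq_f32(sum, y0), y0));
    vst1q_f32(v, vmulq_f32(a, y1));
#else
    // Scalar fallback: exact square root, no refinement needed.
    const float inv = 1.0f / std::sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2] + v[3] * v[3]);
    for (int i = 0; i < 4; ++i) {
        v[i] *= inv;
    }
#endif
}
```

Results from the SIMD paths can differ from the scalar path in the last few bits, so tests comparing against expected values should use a tolerance rather than exact equality.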


Copilot AI and others added 4 commits February 16, 2026 01:34
- Add SKY_MATH_SIMD CMake option (ON by default) in cmake/options.cmake
- Add SKY_MATH_SIMD compile definition in cmake/configuration.cmake
- Create MathSimd.h with platform detection for SSE (x86) and NEON (ARM)
- Add __m128/float32x4_t SIMD member to Vector4 union with alignas(16)
- Optimize Vector4 operators (+=, -=, *=, /=, Dot, Normalize) with SIMD
- Optimize Matrix4 multiply (mat*mat, mat*vec) with SIMD
- Add comprehensive math unit tests for Vector4 and Matrix4 operations
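
As an illustration of the mat*vec path, the sketch below multiplies a 4x4 matrix by a 4-component vector by scaling each column by one vector component and accumulating. It assumes column-major storage (`m[col * 4 + row]`), which this PR does not state, and uses a hypothetical free function `MulMat4Vec4` on raw float arrays instead of the Matrix4 and Vector4 types.

```cpp
#include <cassert>

#if defined(__SSE__) || defined(_M_X64)
    #include <xmmintrin.h>
    #define SKETCH_SSE 1
#elif defined(__ARM_NEON) || defined(_M_ARM64)
    #include <arm_neon.h>
    #define SKETCH_NEON 1
#endif

// out = M * v, with M stored column-major (assumption): m[col * 4 + row].
// out must not alias v.
inline void MulMat4Vec4(const float m[16], const float v[4], float out[4]) {
#if defined(SKETCH_SSE)
    // Broadcast each vector component and scale the matching column.
    __m128 r = _mm_mul_ps(_mm_loadu_ps(m + 0), _mm_set1_ps(v[0]));
    r = _mm_add_ps(r, _mm_mul_ps(_mm_loadu_ps(m + 4), _mm_set1_ps(v[1])));
    r = _mm_add_ps(r, _mm_mul_ps(_mm_loadu_ps(m + 8), _mm_set1_ps(v[2])));
    r = _mm_add_ps(r, _mm_mul_ps(_mm_loadu_ps(m + 12), _mm_set1_ps(v[3])));
    _mm_storeu_ps(out, r);
#elif defined(SKETCH_NEON)
    // vmlaq_n_f32(acc, col, s) computes acc + col * s.
    float32x4_t r = vmulq_n_f32(vld1q_f32(m + 0), v[0]);
    r = vmlaq_n_f32(r, vld1q_f32(m + 4), v[1]);
    r = vmlaq_n_f32(r, vld1q_f32(m + 8), v[2]);
    r = vmlaq_n_f32(r, vld1q_f32(m + 12), v[3]);
    vst1q_f32(out, r);
#else
    // Scalar fallback: same column-scaling sum, one row at a time.
    for (int row = 0; row < 4; ++row) {
        out[row] = m[0 + row] * v[0] + m[4 + row] * v[1] +
                   m[8 + row] * v[2] + m[12 + row] * v[3];
    }
#endif
}
```

A mat*mat product can reuse the same routine by applying it to each column of the right-hand matrix.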

Co-authored-by: bluesky013 <35895395+bluesky013@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Improve math library to support SIMD with x86 and ARM" to "Add macro-controlled SIMD support (SSE/NEON) to math library" Feb 16, 2026
Copilot AI requested a review from bluesky013 February 16, 2026 01:36
Copilot AI and others added 2 commits February 16, 2026 02:03
- Create SFloat4.h: platform-agnostic SIMD float4 type wrapping __m128 (SSE),
  float32x4_t (NEON), or scalar fallback with Load/Store/Set/Splat,
  arithmetic operators, HorizontalSum, HorizontalSumSplat, InvSqrt
- Refactor Vector4.h: remove inline __m128/float32x4_t from union, include
  SFloat4.h instead of MathSimd.h directly
- Refactor Vector4.inl: replace #if SSE/#elif NEON branches with unified
  #if SKY_SIMD_ENABLED using SFloat4 operations
- Refactor Matrix4.inl: replace #if SSE/#elif NEON branches with unified
  #if SKY_SIMD_ENABLED using SFloat4 operations
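
A wrapper of the kind this commit describes might be shaped like the sketch below. It is a guess at the general form, not the SFloat4.h that was committed (and later removed): the type is named `SFloat4Sketch` to make that clear, only a subset of the listed operations is shown, and `HorizontalSum` goes through a store for brevity where a real implementation would more likely use shuffles.

```cpp
#include <cassert>

#if defined(__SSE__) || defined(_M_X64)
    #include <xmmintrin.h>
    #define SKETCH_SSE 1
#elif defined(__ARM_NEON) || defined(_M_ARM64)
    #include <arm_neon.h>
    #define SKETCH_NEON 1
#endif

// Hypothetical platform-agnostic float4: one interface, three backends.
struct SFloat4Sketch {
#if defined(SKETCH_SSE)
    __m128 value;
    static SFloat4Sketch Load(const float* p) { return {_mm_loadu_ps(p)}; }
    static SFloat4Sketch Splat(float s) { return {_mm_set1_ps(s)}; }
    void Store(float* p) const { _mm_storeu_ps(p, value); }
    friend SFloat4Sketch operator+(SFloat4Sketch a, SFloat4Sketch b) { return {_mm_add_ps(a.value, b.value)}; }
    friend SFloat4Sketch operator*(SFloat4Sketch a, SFloat4Sketch b) { return {_mm_mul_ps(a.value, b.value)}; }
#elif defined(SKETCH_NEON)
    float32x4_t value;
    static SFloat4Sketch Load(const float* p) { return {vld1q_f32(p)}; }
    static SFloat4Sketch Splat(float s) { return {vdupq_n_f32(s)}; }
    void Store(float* p) const { vst1q_f32(p, value); }
    friend SFloat4Sketch operator+(SFloat4Sketch a, SFloat4Sketch b) { return {vaddq_f32(a.value, b.value)}; }
    friend SFloat4Sketch operator*(SFloat4Sketch a, SFloat4Sketch b) { return {vmulq_f32(a.value, b.value)}; }
#else
    float value[4];
    static SFloat4Sketch Load(const float* p) { return {{p[0], p[1], p[2], p[3]}}; }
    static SFloat4Sketch Splat(float s) { return {{s, s, s, s}}; }
    void Store(float* p) const { for (int i = 0; i < 4; ++i) p[i] = value[i]; }
    friend SFloat4Sketch operator+(SFloat4Sketch a, SFloat4Sketch b) {
        SFloat4Sketch r;
        for (int i = 0; i < 4; ++i) r.value[i] = a.value[i] + b.value[i];
        return r;
    }
    friend SFloat4Sketch operator*(SFloat4Sketch a, SFloat4Sketch b) {
        SFloat4Sketch r;
        for (int i = 0; i < 4; ++i) r.value[i] = a.value[i] * b.value[i];
        return r;
    }
#endif
};

// Sums the four lanes. Simplification: stores to memory and adds scalars.
inline float HorizontalSum(SFloat4Sketch a) {
    float t[4];
    a.Store(t);
    return (t[0] + t[1]) + (t[2] + t[3]);
}
```

With such a type, a dot product reads as `HorizontalSum(a * b)` on every platform, which is what lets calling code collapse to a single `#if SKY_SIMD_ENABLED` branch.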

Co-authored-by: bluesky013 <35895395+bluesky013@users.noreply.github.com>
Copilot AI changed the title from "Add macro-controlled SIMD support (SSE/NEON) to math library" to "Extract SFloat4 SIMD abstraction from Vector4/Matrix4" Feb 16, 2026
Restore the SIMD native type as a union member in Vector4 for direct
access, avoiding unnecessary Load/Store memory round-trips. Vector4
operations now construct SFloat4 directly from the simd union member
and write back via .value assignment.

Co-authored-by: bluesky013 <35895395+bluesky013@users.noreply.github.com>
Copilot AI changed the title from "Extract SFloat4 SIMD abstraction from Vector4/Matrix4" to "Add macro-controlled SIMD support for math library (SSE/NEON)" Feb 16, 2026
Remove SFloat4.h and replace all SFloat4 usage in Vector4.inl and
Matrix4.inl with direct #if SSE / #elif NEON / #else SIMD intrinsics.

Co-authored-by: bluesky013 <35895395+bluesky013@users.noreply.github.com>
Copilot AI changed the title from "Add macro-controlled SIMD support for math library (SSE/NEON)" to "Add macro-controlled SIMD support (SSE/NEON) to math library" Feb 16, 2026