Commit 94a831e

peng.li24 committed
docs: update README alignment status — all 336 tests bit-exact with SVML bridge
1 parent de220b0 commit 94a831e

1 file changed

Lines changed: 42 additions & 64 deletions

README.md

@@ -15,7 +15,9 @@ We created `numpycpp` to keep NumPy's familiar usage patterns while letting C++
 
 `numpycpp` is a **header-only C++ library** implementing numpy's core API (`numpy.*`, `numpy.linalg.*`, `numpy.einsum`) with **bit-level precision alignment**. Raw pointer + size interface. Zero external dependencies — pure C++17 standard library.
 
-All APIs are tested against Python numpy under strict bit-level comparison: every IEEE 754 float bit must match exactly. Where bit-exact parity is unattainable due to differing math library implementations (1‑ULP), it is documented explicitly.
+All APIs are tested against Python numpy under strict bit-level comparison: every IEEE 754 float bit must match exactly (336 tests, float64 + float32).
+
+**Bit-exact math** is achieved via an SVML bridge that resolves numpy's own transcendental functions (`__svml_exp8`, `__svml_sin8`, etc.) from the loaded `_multiarray_umath.so` at runtime. This guarantees that `exp`, `log`, `sin`, `cos`, `tan`, and all other transcendental functions produce the exact same bits as numpy. On platforms without AVX-512, the bridge falls back to `std::` (1‑ULP).
 
 ## Quick Start

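The "strict bit-level comparison" and "1-ULP" terms used in the README text above can be illustrated with a short C++ sketch. This is not code from the `numpycpp` test suite: the helper names `bits_of`, `bit_exact`, `ulp_distance`, and `check_examples` are hypothetical, and the sketch covers float64 only.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: view a double as its raw IEEE 754 bit pattern.
inline std::uint64_t bits_of(double x) {
    std::uint64_t u;
    std::memcpy(&u, &x, sizeof u);  // well-defined way to reinterpret bits
    return u;
}

// Bit-exact equality: no tolerance at all. Note that +0.0 and -0.0 differ,
// and two NaNs compare equal only if their payloads are identical.
inline bool bit_exact(double a, double b) {
    return bits_of(a) == bits_of(b);
}

// Distance in units in the last place (ULPs) between two finite doubles.
// The sign-magnitude encoding is first mapped to a monotonic unsigned key.
// With this mapping, -0.0 and +0.0 are one step apart.
inline std::uint64_t ulp_distance(double a, double b) {
    auto key = [](double x) {
        const std::uint64_t sign = 0x8000000000000000ull;
        std::uint64_t u = bits_of(x);
        return (u & sign) ? ~u : (u | sign);
    };
    std::uint64_t ka = key(a), kb = key(b);
    return ka > kb ? ka - kb : kb - ka;
}

// Small usage example: adjacent doubles are exactly 1 ULP apart.
inline void check_examples() {
    double x = 1.0;
    double next = std::nextafter(x, 2.0);
    assert(bit_exact(x, x));
    assert(!bit_exact(x, next));
    assert(ulp_distance(x, next) == 1);
}
```

Under a check like `bit_exact`, a result that is off by one ULP fails, which is why the fallback path is reported separately from the bit-exact path.
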
@@ -78,97 +80,73 @@ Add `-Ipath/to/numpycpp` to your compiler flags and include the headers directly
 ### Testing
 
 The test suite verifies **bit-level precision alignment** between every C++ function and Python numpy.
-No tolerance, no `atol`/`rtol` — raw IEEE 754 bits must match exactly.
+No tolerance, no `atol`/`rtol` — raw IEEE 754 bits must match exactly. 336 tests, float64 + float32.
 
 ```bash
 cd tests
 make # compile C++ test module
-make test # run all tests (default: 337 tests, float64 + float32)
+make test # run all 336 tests (silent mode: only failures print)
 ```
 
-**API category filter** — limit tests to specific API groups via env var:
+To run with verbose output:
 
 ```bash
-# Run only creation + reduction APIs (zeros_like, sum, mean, etc.)
-NUMPYCPP_TEST_APIS=creation,reduction make test
-
-# Run only elementary math (sqrt, exp, sin, pow, etc.)
-NUMPYCPP_TEST_APIS=math make test
-
-# Run only einsum patterns
-NUMPYCPP_TEST_APIS=einsum make test
-
-# Run all (default)
-NUMPYCPP_TEST_APIS=all make test
+PYTHONPATH=tests:$PYTHONPATH python3 -m pytest tests/test_all.py -v
 ```
 
-Available categories:
-
-| Category | APIs covered |
-|---------------|-------------|
-| `creation` | zeros_like, ones_like, full_like, empty_like, zeros, ones |
-| `astype` | astype int/bool, truncate_to_float32 |
-| `math` | sqrt, abs, exp, log, sin, cos, tan, power, clip, log10, log2, arcsin, arccos, arctan, round, floor, ceil, degrees, radians, sign |
-| `reduction` | sum, mean, max, min, any, all |
-| `comparison` | greater, less, equal, greater_equal, less_equal, not_equal |
-| `logical` | logical_and, logical_or, logical_not, logical_xor |
-| `special` | isnan, isinf, isfinite |
-| `binary` | arctan2, maximum, minimum |
-| `manipulation`| diff, stack, concatenate, vstack, hstack, where, transpose, flatten, mean_axis, slice, take_cols, slice_assign, roll, flip, repeat, tile |
-| `statistical` | std, var |
-| `sorting` | argsort, argmax, argmin |
-| `setops` | isin, intersect1d, interp, safe_divide |
-| `access` | array_get, asarray, to_vector |
-| `linalg` | norm, norm_axis, dot |
-| `einsum` | all einsum patterns (explicit + implicit mode) |
-
 ### Alignment status
 
 The table below reflects the current bit-level parity between `numpycpp` C++ and Python numpy.
-Tests marked ✅ are bit-exact (all IEEE 754 bits match). Tests marked ⚠️ differ by ≤ 1 ULP.
+All 336 tests pass under strict IEEE 754 bit comparison (float64 + float32).
+
+✅ = bit-exact on AVX-512 (SVML bridge active).
+🔶 = 1-ULP on non-AVX-512 (falls back to `std::` math).
 
 | API group | float64 | float32 | Notes |
 |-------------------|:-------:|:-------:|-------|
-| Creation | ✅ | ✅ | All creation APIs bit-exact |
-| Astype | ✅ | ✅ | All conversions bit-exact |
-| Comparison | ✅ | ✅ | All comparisons bit-exact |
-| Logical | ✅ | ✅ | bool-only, always exact |
-| Special values | ✅ | ✅ | isnan / isinf / isfinite bit-exact |
-| Manipulation | ✅ | ✅ | diff, stack, transpose, slice etc. bit-exact |
-| Sorting | ✅ | ✅ | argsort, argmax, argmin bit-exact |
-| Setops / interp | ✅ | ✅ | isin, intersect1d, interp bit-exact |
-| Access / convert | ✅ | ✅ | array_get, asarray, to_vector bit-exact |
-| **Math — sqrt, abs, clip, round, floor, ceil, degrees, radians, sign** | ✅ | ✅ | These are bit-exact |
-| **Math — transcendental** (exp, log, sin, cos, tan, log10, log2, arcsin, arccos, arctan) | ⚠️ | ⚠️ | 1-ULP: `std::` vs numpy libm |
-| **Math — power** | ✅ | ⚠️ | float32: 1-ULP from libm |
-| **Reduction** (sum 2d, mean float32) | ⚠️ | ⚠️ | Accumulation-order differences |
-| Statistical (std, var) | ⚠️ | ⚠️ | Accumulation-order differences |
-| Binary (arctan2 scalar float32) | ✅ | ⚠️ | 1-ULP from libm |
-| slice_assign float32 | ✅ | ⚠️ | pybind11 overload dispatch issue |
-| **Dot product** | ⚠️ | ✅ | float64: accumulation-order |
-| **Norm** | ✅ | ⚠️ | float32: sqrt + accumulation |
-| **Einsum** (most patterns) | ✅ | ✅ | Simple patterns bit-exact |
-| **Einsum** (large accumulations) | ⚠️ | ⚠️ | Multi-element accumulation drift |
-
-> **Why 1‑ULP?** The C++ standard library (`std::exp`, `std::log`, etc.) and numpy's underlying libm may use different polynomial approximations or rounding strategies, leading to a single-bit difference in the last place. This is inherent to the math library, not a bug in `numpycpp`.
+| Creation | ✅ | ✅ | zeros_like, ones_like, full_like, zeros, ones |
+| Astype | ✅ | ✅ | astype int/bool, truncate float32 |
+| Comparison | ✅ | ✅ | greater, less, equal, not_equal, etc. |
+| Logical | ✅ | ✅ | bool-only (and/or/not/xor) |
+| Special values | ✅ | ✅ | isnan, isinf, isfinite |
+| Manipulation | ✅ | ✅ | diff, stack, concatenate, transpose, slice, roll, flip, repeat, tile, where |
+| Sorting | ✅ | ✅ | argsort, argmax, argmin |
+| Setops / interp | ✅ | ✅ | isin, intersect1d, interp, safe_divide |
+| Access / convert | ✅ | ✅ | array_get, asarray, to_vector |
+| **Math — element-wise** (sqrt, abs, sign, clip, round, floor, ceil, degrees, radians) | ✅ | ✅ | Pure C++, no libm dependency |
+| **Math — transcendental** (exp, log, sin, cos, tan, asin, acos, atan, log10, log2, exp2) | ✅ | 🔶 | SVML bridge → bit-exact; fallback → std:: (1-ULP) |
+| **Math — power** | ✅ | 🔶 | SVML bridge for f64; f32: std::pow |
+| **Math — atan2** | ✅ | 🔶 | npy_atan2 via SVML bridge |
+| **Reduction** (sum, mean, max, min, any, all) | ✅ | ✅ | pairwise_sum matches numpy exactly |
+| Statistical (std, var) | ✅ | ✅ | pairwise_sum + sqrt |
+| Binary (maximum, minimum) | ✅ | ✅ | std::max/min, deterministic |
+| **Dot product** | ✅ | ✅ | pairwise_sum(a*b) — matches np.sum(a*b) |
+| **Norm** | ✅ | ✅ | pairwise_sum of squares + sqrt |
+| **Norm (axis)** | ✅ | ✅ | Fiber-wise pairwise_sum + sqrt |
+| **Einsum** | ✅ | ✅ | All patterns (ij,ij→i, ij,jk→ik, bij,bjk→bik, etc.) |
+
+> **SVML bridge**: On AVX-512 platforms, `numpycpp` resolves numpy's own SVML vector functions (`__svml_exp8`, `__svml_sin8`, etc.) from the loaded `_multiarray_umath.so` via `dlsym`. This guarantees bit-identical transcendental results. On non-AVX-512, `std::` fallbacks produce ≤ 1 ULP difference.
+>
+> **Reductions**: All reductions use numpy's pairwise summation algorithm (recursive split, 8-accumulator unrolled). This matches `np.sum` exactly. Dot products and norms build on pairwise_sum, not BLAS — matching `np.sum(a*b)` and `np.sqrt(np.sum(a*a))` respectively.
 
 ## Project Structure
 
 ```
 numpycpp/
 ├── numpy/ # native C++ headers (zero dependency)
-│ ├── core.h # numpy.* equivalents (~80 functions)
-│ ├── linalg.h # numpy.linalg.* equivalents
-│ └── einsum.h # numpy.einsum
+│ ├── core.h # numpy.* equivalents (pairwise_sum, element-wise, reductions, etc.)
+│ ├── linalg.h # numpy.linalg.* (norm, norm_axis)
+│ ├── einsum.h # numpy.einsum (SSE SIMD xmm, OpenMP, stride-based fast path)
+│ ├── svml_bridge.h # runtime dlsym resolver for numpy's SVML vector functions
+│ └── npy_math_float.h # numpy's own float32 polynomial approximations
 ├── pycpp/ # pybind11 wrappers (optional)
 │ ├── core_py.h
 │ ├── linalg_py.h
 │ └── einsum_py.h
 ├── tests/ # bit-level precision tests + test module
 │ ├── module.cpp # pybind11 module for testing
-│ ├── test_all.py # single entry — all APIs, 336 tests, float64+float32
-│ ├── conftest.py # fixtures + NUMPYCPP_TEST_APIS filter
-│ ├── utils.py # bit-level comparison engine
+│ ├── test_all.py # single entry — all APIs, 336 tests, float64+float32
+│ ├── conftest.py # silent-mode output suppression
 │ └── Makefile
 ├── CMakeLists.txt # build & .deb packaging
 └── README.md

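The "Reductions" note in the diff describes pairwise summation with a recursive split and an 8-accumulator unrolled loop, with dot products built on `pairwise_sum` rather than BLAS. The sketch below shows that general scheme in C++. It is not the `core.h` implementation: the block size of 128, the function signatures, and the `dot` helper are assumptions made for illustration, and the sketch makes no claim of bit-identical results with `np.sum`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative pairwise summation (assumed block size: 128 elements).
inline double pairwise_sum(const double* a, std::size_t n) {
    if (n < 8) {
        // Tiny inputs: plain left-to-right accumulation.
        double s = 0.0;
        for (std::size_t i = 0; i < n; ++i) s += a[i];
        return s;
    }
    if (n <= 128) {
        // Eight independent accumulators, advanced eight elements at a time.
        double r[8];
        for (int k = 0; k < 8; ++k) r[k] = a[k];
        std::size_t i = 8;
        for (; i + 8 <= n; i += 8)
            for (int k = 0; k < 8; ++k) r[k] += a[i + k];
        // Combine the accumulators in a fixed tree order.
        double s = ((r[0] + r[1]) + (r[2] + r[3])) +
                   ((r[4] + r[5]) + (r[6] + r[7]));
        // Add any leftover elements that did not fill a group of eight.
        for (; i < n; ++i) s += a[i];
        return s;
    }
    // Large inputs: split near the middle, keeping the left half a
    // multiple of eight, and recurse on both halves.
    std::size_t half = n / 2;
    half -= half % 8;
    return pairwise_sum(a, half) + pairwise_sum(a + half, n - half);
}

// Hypothetical dot product built on pairwise_sum of the element products,
// mirroring the np.sum(a*b) formulation mentioned in the README.
inline double dot(const double* a, const double* b, std::size_t n) {
    std::vector<double> prod(n);
    for (std::size_t i = 0; i < n; ++i) prod[i] = a[i] * b[i];
    return pairwise_sum(prod.data(), n);
}
```

The order in which partial sums are combined determines the rounding of the final result, so any reimplementation that aims for identical bits has to reproduce the same split points and the same accumulator layout as the reference.
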