faster prime sieve part 1 by oscardssmith · Pull Request #120 · JuliaMath/Primes.jl

oscardssmith · 2022-06-13T16:29:38Z

Before, `_primesmask(2^30)` took 2.726 seconds; after, it took 2.358 seconds. Although this is a relatively small improvement overall, it removes most of the time spent on the small primes (250ms vs 30ms, roughly 88%), which means that this will continue to show large gains once we optimize the larger primes.

I've also separated sieving into its own file since I expect the code will become more complex as we move to better sieves.
@haampie since this is essentially part 1 of #87.

codecov · 2026-06-16T19:48:01Z

Codecov Report

❌ Patch coverage is 98.78788% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 94.67%. Comparing base (c5b95b9) to head (60a3197).

Files with missing lines	Patch %	Lines
src/sieve.jl	98.73%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #120      +/-   ##
==========================================
+ Coverage   94.38%   94.67%   +0.28%     
==========================================
  Files           2        3       +1     
  Lines         463      563     +100     
==========================================
+ Hits          437      533      +96     
- Misses         26       30       +4

Split sieve tests into test/sieve_tests.jl for a fast dev loop; runtests.jl includes it. Tests cross-check primes/primesmask against an isprime-based reference across small, window-crossing, and boundary ranges. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
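The cross-check idea described in this commit can be sketched roughly as follows. This is a minimal illustration, not the contents of test/sieve_tests.jl: `naive_isprime` and `sieve_primes` are hypothetical stand-ins (in the package, the comparison would presumably be between `primes(lo, hi)` and `filter(isprime, lo:hi)`), and the ranges are arbitrary examples.

```julia
using Test

# Hypothetical reference: trial division, deliberately simple and slow.
function naive_isprime(n::Integer)
    n < 2 && return false
    d = 2
    while d * d <= n
        n % d == 0 && return false
        d += 1
    end
    return true
end

# Hypothetical stand-in for the function under test: a plain Eratosthenes
# sieve that returns the primes in lo:hi.
function sieve_primes(lo::Int, hi::Int)
    hi < 2 && return Int[]
    mask = trues(hi)
    mask[1] = false
    p = 2
    while p * p <= hi
        if mask[p]
            for m in (p * p):p:hi
                mask[m] = false
            end
        end
        p += 1
    end
    return Int[n for n in max(lo, 1):hi if mask[n]]
end

# Compare the sieve against the reference on small, boundary, and larger ranges.
@testset "cross-check against reference" begin
    for (lo, hi) in [(1, 1), (1, 2), (2, 100), (90, 110), (1, 10_000)]
        @test sieve_primes(lo, hi) == filter(naive_isprime, lo:hi)
    end
end
```

Keeping the reference independent of the sieve is the point of the design: a bug in the sieve's window or boundary handling is unlikely to be mirrored in a trial-division check.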

_sieve_range crosses off primes in cache-sized 64-bit-chunk windows, resuming each base prime's (next-multiple, phase) state across windows. _segment_primes collects base primes recursively. Added alongside the old _primesmask; callers are rewired in following commits. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
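As a rough illustration of resuming each base prime's next multiple across windows, here is a simplified sketch. It is not the PR's `_sieve_range`: `windowed_primes` and its `window` keyword are hypothetical names, the buffer holds one bit per integer rather than a wheel-compressed mask, and the phase part of the state is omitted because no wheel is used.

```julia
# Hypothetical sketch: sieve 2:hi one fixed-size window at a time.
function windowed_primes(hi::Int; window::Int = 1 << 10)
    hi < 2 && return Int[]
    # Base primes up to isqrt(hi), found with a small plain sieve.
    r = isqrt(hi)
    small = trues(r)
    small[1] = false
    for p in 2:isqrt(r)
        small[p] || continue
        for m in (p * p):p:r
            small[m] = false
        end
    end
    base = Int[p for p in 2:r if small[p]]
    # next_mult[i] is the next multiple of base[i] that still has to be crossed off.
    next_mult = Int[p * p for p in base]

    out = Int[]
    buf = falses(window)
    lo = 2
    while lo <= hi
        top = min(lo + window - 1, hi)
        fill!(buf, true)
        for i in eachindex(base)
            p, m = base[i], next_mult[i]
            while m <= top
                buf[m - lo + 1] = false
                m += p
            end
            next_mult[i] = m   # resume from here in the next window
        end
        for n in lo:top
            buf[n - lo + 1] && push!(out, n)
        end
        lo = top + 1
    end
    return out
end
```

Because each prime's position is carried over, no window has to recompute its first multiple with a division, and the result does not depend on the window size.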

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Both _primesmask methods are fully replaced by _sieve_range; no callers remain. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Pack each sieving prime's (prime, next-multiple, phase) into one tuple array so the crossing loop streams a single contiguous region; drop presieve primes and out-of-range primes at construction (_sieving_primes) to remove the per-window skip branch. Extract _presieve_fill!/_cross_window!. Restore 7,11,13,17,19 once after the loop instead of per-window. SEGMENT_CHUNKS 4096 -> 2048 (16 KiB). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Replace _sieve_range (which allocated the full O(hi) wheel mask) with a SegmentedSieve iterator holding only O(sqrt hi) base-prime state + one reusable SEGMENT_CHUNKS buffer. primes/primesmask/_segment_primes stream segments from it, scanning set bits with trailing_zeros (cost tracks prime count, not window width). SegmentedSieve accepts an optional precomputed base-prime list.
Back-to-back @btime vs 58ad969:
  primes(10^7)                18.6ms -> 9.7ms
  primes(10^8)                214ms -> 129ms
  primes(10^9)                3.58s -> 1.39s (2.6x)
  primesmask(10^9-10^7,10^9)  22ms -> 12.6ms
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
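The set-bit scan mentioned in this commit can be illustrated with a short sketch. `each_setbit` here is a hypothetical helper written for this example (it is not claimed to match the PR's `_scan_segment` or `_each_setbit`), and it reads the `chunks` field of a `BitVector`, which is an internal detail of Julia's Base rather than a documented API.

```julia
# Hypothetical helper: call f(i) for every set bit i (1-based) in the chunks.
function each_setbit(f, chunks::Vector{UInt64})
    for (k, chunk) in enumerate(chunks)
        base = (k - 1) * 64
        while chunk != 0
            # trailing_zeros gives the 0-based position of the lowest set bit.
            f(base + trailing_zeros(chunk) + 1)
            chunk &= chunk - 1     # clear the lowest set bit
        end
    end
    return nothing
end

bits = falses(200)
for i in (3, 64, 65, 130, 200)
    bits[i] = true
end

found = Int[]
each_setbit(i -> push!(found, i), bits.chunks)
```

The inner loop runs once per set bit rather than once per bit position, which is why the cost follows the number of primes in a window instead of the window width.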

Split iterate into the no-state (first window) and stateful (subsequent) methods so the 0x1f restore lives in the first-window path instead of being re-tested on every window. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

- _small_sieve_primes: single-buffer self-referential sieve for n that fits one window; _segment_primes uses it instead of recursing into SegmentedSieve, so base-prime generation no longer nests segmented sieves for small isqrt(hi).
- SegmentedSieve(lo,hi,base_primes) now extends/recomputes the list when its max is below isqrt(hi); the no-base-primes constructor skips that check.
- Extract _presieve_mask; _scan_segment now takes the buffer directly (reused by both the iterator consumers and the base case).
- Collapse iterate back to one method with a once-per-window first-window check.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Readability pass over the segmented sieve:
- rename _sieving_primes -> _crossing_state (it returns crossing records, not primes), _segment_primes -> _base_primes, _small_sieve_primes -> _small_primes; drop the _segmented_sieve helper for a single SegmentedSieve constructor.
- the window buffer is now a BitVector, using ordinary buf[b] / buf[b]=false and buf.chunks instead of hand-rolled _getbit/_clearbit!.
- rename cryptic locals (m -> hi_val, whi -> hi_wi, jj/r/L -> widx/q/start, bps -> base_primes, seg_base -> base) and trim the verbose comments.
No behavior change; 25 sieve tests pass, perf unchanged.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Interleaved @btime sweep (min over rounds, one process to control for load):
  chunks  primes(1e8)  primes(1e9)
  1024    163ms        2928ms
  2048    172ms        2610ms
  4096    166ms        1998ms  <- chosen
  8192    282ms        1888ms
  16384   190ms        1904ms
  32768   194ms        1811ms
2048->4096 is the big jump (~24% at 1e9); above 4096 is a flat, noisy plateau. 4096 keeps the window L1d-resident.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

eachprime([lo,]hi) lazily streams primes in increasing order, backed by a SegmentedSieve it owns (single-pass, mutating cursor). The iteration state counts yielded primes. SegmentedSieve now yields an empty sieve for hi<7 (no throw), so EachPrime needs no Union. primes() drives eachprime with a presized result vector. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
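For readers unfamiliar with Julia's iteration protocol, a lazy prime stream can be sketched as below. This is only an illustration of the protocol: `LazyPrimes` and `is_prime_td` are hypothetical names, trial division is swapped in for the segmented sieve, and the state is the next candidate rather than the count of yielded primes described in the commit.

```julia
# Hypothetical lazy iterator over the primes in lo:hi.
struct LazyPrimes
    lo::Int
    hi::Int
end

# Hypothetical primality check by trial division; stands in for the sieve.
function is_prime_td(n::Int)
    n < 2 && return false
    d = 2
    while d * d <= n
        n % d == 0 && return false
        d += 1
    end
    return true
end

# The iteration state is the next candidate to test.
function Base.iterate(it::LazyPrimes, n::Int = max(it.lo, 2))
    while n <= it.hi
        is_prime_td(n) && return (n, n + 1)
        n += 1
    end
    return nothing
end

# The number of primes is not known up front, so declare the size unknown.
Base.IteratorSize(::Type{LazyPrimes}) = Base.SizeUnknown()
Base.eltype(::Type{LazyPrimes}) = Int

# Only as much work as needed to produce three primes is done here.
first_three = collect(Iterators.take(LazyPrimes(90, 200), 3))
```

Unlike the single-pass, mutating cursor described above, this sketch is stateless between calls, so it can be iterated more than once at the cost of redoing the work.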

A clean back-to-back showed primes=collect(eachprime) is ~28% slower (per-element iterate vs the batch _each_setbit callback), so primes keeps the windowed batch collection while eachprime remains the lazy iterator; both are thin layers on SegmentedSieve. Also restore the si increment in eachprime's 2,3,5 loop (without it, an out-of-range small spun forever for lo > 2). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

oscardssmith changed the title from "Oscardssmith faster sieve part 1" to "faster prime sieve part 1" on Jun 13, 2022

oscardssmith mentioned this pull request Jul 3, 2023

I would like to re-use Primes.Factorization for polynomials #126

Closed

oscardssmith closed this Jul 3, 2023

oscardssmith reopened this Jul 3, 2023

oscardssmith and others added 5 commits June 16, 2026 15:45

15% speedup for _primesmask

d30e48e

move sieve to separate file

7593e87

bugfix for small sieve sizes.

c63ee36

fix ci failures

f550ad6

minor cleanup

58ad969

oscardssmith force-pushed the oscardssmith-faster-sieve-part-1 branch from 10613fb to 58ad969 on June 16, 2026 19:45

oscardssmith and others added 13 commits June 16, 2026 16:41

refactor: route primes() through the segmented sieve core

dc71b02

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

refactor: route primesmask() through the segmented sieve core

64abbb8

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

refactor: drop the now-unused _primesmask methods

ed72860

oscardssmith wants to merge 18 commits into main from oscardssmith-faster-sieve-part-1
