Performance update for permutations.jl #205
Update docstrings
Update for string comparison in `derangements`
Codecov Report

❌ Patch coverage is

Additional details and impacted files

Coverage Diff

| | master | #205 | +/- |
|---|---|---|---|
| Coverage | 97.17% | 97.19% | +0.02% |
| Files | 8 | 8 | |
| Lines | 813 | 857 | +44 |
| Hits | 790 | 833 | +43 |
| Misses | 23 | 24 | +1 |
Delete non-mutating `nextpermutation()`
Provide a derangement-specific implementation, improving performance and providing further functionality.
I noticed that Some benefits of the new implementation:
Benchmarking

Unique set elements

Current implementation (

BenchmarkTools.Trial: 10 samples with 1 evaluation per sample.
Range (min … max): 404.076 ms … 703.335 ms ┊ GC (min … max): 24.18% … 44.09%
Time (median): 496.590 ms ┊ GC (median): 28.75%
Time (mean ± σ): 517.083 ms ± 91.147 ms ┊ GC (mean ± σ): 31.96% ± 6.32%
█ █ █ █ ██ █ █ █ █
█▁▁▁▁▁█▁▁▁▁█▁▁█▁▁▁██▁▁▁▁█▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
404 ms Histogram: frequency by time 703 ms <
Memory estimate: 1.16 GiB, allocs estimate: 19479014.

New

BenchmarkTools.Trial: 31 samples with 1 evaluation per sample.
Range (min … max): 112.177 ms … 278.214 ms ┊ GC (min … max): 0.00% … 59.36%
Time (median): 159.637 ms ┊ GC (median): 27.81%
Time (mean ± σ): 168.818 ms ± 43.172 ms ┊ GC (mean ± σ): 32.69% ± 16.97%
▁▁ ▁▄ █ ▁ ▁
▆▁██▆▆▁▁▆▁▁▆▁██▆▁█▆▁▁▁▁▆▁▁▁▆▁▁▁▁▆▆▁▁█▁█▁▁▆▁▁▁▁▁▁▁▁▁▁▁▁▁▁▆▁▁▁▆ ▁
112 ms Histogram: frequency by time 278 ms <
Memory estimate: 218.85 MiB, allocs estimate: 2669983.

Multiset elements

Current implementation (

BenchmarkTools.Trial: 90 samples with 1 evaluation per sample.
Range (min … max): 31.160 ms … 86.085 ms ┊ GC (min … max): 0.00% … 38.91%
Time (median): 57.960 ms ┊ GC (median): 29.18%
Time (mean ± σ): 55.699 ms ± 13.952 ms ┊ GC (mean ± σ): 24.51% ± 15.05%
▃ ▁ ▃ ▁ ▆ ▁█ ▁ ▁ ▃
▄█▄▄▇▄▇▁▁█▄▇▁█▄▁▄▇▇▄▁▇█▁▇▄▄▁█▁▄██▄█▇▄▇▄▇▇▇▄█▇▄▁▁▇▁█▄▁▁▄▄▄▄▇ ▁
31.2 ms Histogram: frequency by time 81 ms <
Memory estimate: 142.82 MiB, allocs estimate: 2352077.

New

BenchmarkTools.Trial: 731 samples with 1 evaluation per sample.
Range (min … max): 4.366 ms … 21.888 ms ┊ GC (min … max): 0.00% … 71.92%
Time (median): 5.298 ms ┊ GC (median): 0.00%
Time (mean ± σ): 6.828 ms ± 3.301 ms ┊ GC (mean ± σ): 17.23% ± 19.83%
██▆▆▄▃▂▂▁▄▃▁▂▂ ▁ ▁
██████████████████████▆▁▅▄▄█▅▇▇▅▇▆▆▇▆▇▇▅▇▅▅▆▆▇▆▆▅▆▄▁▅▇▄▁▄▅ █
4.37 ms Histogram: log(frequency) by time 18.2 ms <
Memory estimate: 13.37 MiB, allocs estimate: 168114.
Iteration has been streamlined a bit more and some comments have been added to help with future maintenance.
Note: I've changed the three argument If it is meant to be exported, I believe it would need a docstring which explains how to construct
Limits constructors for the types `Derangements` and `DerangementsIterState` to one inner constructor apiece.
This update contains improvements to the performance of `permutations.jl`, keeping the underlying algorithm (`nextpermutation()`) in place. See below for various benchmarks (more can be found in Issue #204, where the benchmarking after the first post is directly relevant to this implementation).

The main strategy was to move potential overhead away from the performance-critical methods `nextpermutation` and `iterate` to the constructors by standardizing input. One step involves separating `Permutations` from `MultiSetPermutations` by reverting to a previous version of the `iterate` method for `Permutations` (resulting in a large performance boost).

Special attention was paid to reducing the number of allocations made by both `permutations()` and `multiset_permutations()`. To this end, one of the larger changes is that the `state` is now modified in place during iteration (while the data in the structs remains unchanged). I can't currently see this as an issue, since the algorithm is serial and can't be parallelized.

In total, these modifications cut allocations to 1/3 and 1/2 of their current numbers for `permutations()` and `multiset_permutations()`, respectively, and bring their performance into line with Heap's Permutation Algorithm.

Other notes:

- `multiset_permutations()` and `permutations()` now have the same performance on collections with unique elements (where `v.1.1.0` has `multiset_permutations()` outperforming `permutations()`).
- `data` and `m` always being a `Vector{T}` where `T` is the element type of the input.
- Vectors, Multidimensional Arrays, Sparse Arrays and Offset Arrays (i.e. covering `LinearIndex`, `CartesianIndex` and offset indices).
- `permutations()` is now type safe (in line with `multiset_permutations()`), always returning a `Permutations` type.
- `multiset_permutations(m, t)` is now linear (vs the current

Benchmarks
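As context for the comparisons listed under this heading, here is a minimal sketch of what modifying the iteration state in place can look like. It is not the package's `nextpermutation()`: the names `next_lex_perm!` and `all_perms` are hypothetical, and the sketch assumes the state is a vector of integer ranks that starts out sorted. The input is normalized once with `collect`, echoing the idea of moving overhead out of the hot loop and into setup.

```julia
# Hypothetical helper (not the package's `nextpermutation`): advances `state`
# to the next lexicographic arrangement in place. Returns `false` when
# `state` already holds the last arrangement, leaving it unchanged.
function next_lex_perm!(state::Vector{Int})
    n = length(state)
    # Find the rightmost position i with state[i] < state[i + 1].
    i = n - 1
    while i >= 1 && state[i] >= state[i + 1]
        i -= 1
    end
    i <= 0 && return false
    # Find the rightmost element strictly larger than state[i] and swap.
    j = n
    while state[j] <= state[i]
        j -= 1
    end
    state[i], state[j] = state[j], state[i]
    # Reverse the tail so that it becomes the smallest possible suffix.
    reverse!(state, i + 1, n)
    return true
end

# Hypothetical driver: normalizes the input once, then reuses a single
# `state` vector for the whole enumeration. Only the emitted permutations
# allocate; `items` is never modified.
function all_perms(data::AbstractVector{T}) where {T}
    items = collect(data)
    state = collect(1:length(items))
    out = Vector{Vector{T}}()
    while true
        push!(out, items[state])
        next_lex_perm!(state) || break
    end
    return out
end
```

Because the comparisons use `>=` and `<=`, a state with repeated ranks such as `[1, 1, 2]` produces each distinct arrangement once, which corresponds to the multiset case.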
`collect(permutations(1:10))`

Before (v1.1.0)

After

`collect(multiset_permutations(1:10))`

Before (v1.1.0)

After

`collect(multiset_permutations([2,4,5,6,3,4,1,8,9,3,2]))`

Before (v1.1.0)

After
Heap's Algorithm performance comparison
`heapermutations(1:10)`

`collect(permutations(1:10))`

Note: Below is the implementation of Heap's algorithm which I used to compare performance. It was written to be comparable in structure to the `nextpermutation()` used in this update, but it's not actually correct for `permlen < length(data)`, since it always produces all permutations (i.e. there are potential duplicates).
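The author's listing is not reproduced here. For readers unfamiliar with the technique, the following is only a minimal sketch of an iterative Heap's algorithm, not the code used for the comparison. The name `heap_permutations` is hypothetical, and, like the implementation described in the note, it always generates full-length permutations.

```julia
# Hypothetical sketch of iterative Heap's algorithm. Each new permutation is
# obtained from the previous one by a single swap on a working copy.
function heap_permutations(data::AbstractVector{T}) where {T}
    a = collect(data)          # working copy, mutated in place
    n = length(a)
    c = zeros(Int, n)          # c[k] counts the swaps performed at level k
    out = [copy(a)]            # the initial ordering is the first permutation
    k = 2
    while k <= n
        if c[k] < k - 1
            if isodd(k)
                # Odd 1-based level: swap with the first element.
                a[1], a[k] = a[k], a[1]
            else
                # Even 1-based level: swap with the element picked by the counter.
                a[c[k] + 1], a[k] = a[k], a[c[k] + 1]
            end
            push!(out, copy(a))
            c[k] += 1
            k = 2              # restart from the lowest level
        else
            c[k] = 0           # level exhausted: reset and move up
            k += 1
        end
    end
    return out
end
```

The permutations come out in swap order rather than lexicographic order, so a comparison against `collect(permutations(1:10))` is about throughput and allocations, not output ordering.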