Performance update for permutations.jl by depial · Pull Request #205 · JuliaMath/Combinatorics.jl

depial · 2025-12-18T13:58:19Z

This update improves the performance of permutations.jl while keeping the underlying algorithm (nextpermutation()) in place. Benchmarks are given below; more can be found in Issue #204, where the benchmarking after the first post is directly relevant to this implementation.

The main strategy was to move potential overhead out of the performance-critical methods nextpermutation and iterate and into the constructors, by standardizing the input there. One step involves separating Permutations from MultiSetPermutations by reverting to a previous version of the iterate method for Permutations (resulting in a large performance boost).

Special attention was paid to reducing the number of allocations made by both permutations() and multiset_permutations(). To this end, one of the larger changes is that the iteration state is now modified in place (while the data in the structs remains unchanged). I can't currently see this being an issue, since the algorithm is serial and can't be parallelized anyway.
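
To make the in-place idea concrete, here is a minimal sketch, not the code from this PR. `PermSketch` and `nextperm!` are hypothetical names, and the sketch assumes a plain lexicographic next-permutation step on an index vector. The state vector is created once and mutated on every step, the stored data is never touched, and the only per-step allocation is the vector handed back to the caller.

```julia
# Hypothetical iterator type: holds the data, never modified during iteration.
struct PermSketch{T}
    data::Vector{T}
end

Base.length(p::PermSketch) = factorial(length(p.data))
Base.eltype(::Type{PermSketch{T}}) where {T} = Vector{T}

# Advance idx to the next lexicographic permutation in place.
# Returns false when idx is already the last permutation.
function nextperm!(idx::Vector{Int})
    n = length(idx)
    k = n - 1
    while k ≥ 1 && idx[k] ≥ idx[k+1]
        k -= 1
    end
    k < 1 && return false
    l = n
    while idx[l] ≤ idx[k]
        l -= 1
    end
    idx[k], idx[l] = idx[l], idx[k]
    reverse!(idx, k + 1, n)
    return true
end

# First step: allocate the state once.
function Base.iterate(p::PermSketch)
    state = collect(1:length(p.data))
    return p.data[state], state
end

# Later steps: mutate the same state vector and reuse it.
function Base.iterate(p::PermSketch, state::Vector{Int})
    nextperm!(state) || return nothing
    return p.data[state], state
end

data = ['a', 'b', 'c']
perms = collect(PermSketch(data))
```

Because `p.data[state]` builds a fresh vector for each result, collecting the iterator stays safe even though the state itself is shared between steps.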

In total, these modifications cut allocations to roughly 1/3 and 1/2 of their v1.1.0 counts for permutations() and multiset_permutations(), respectively, and bring their performance into line with Heap's Permutation Algorithm.
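
The 1/3 and 1/2 figures can be checked against the `allocs estimate` values reported in the benchmarks further down. The snippet below only does that arithmetic; the variable names are illustrative.

```julia
nperms = factorial(10)            # number of permutations of 1:10

# allocs estimates for collect(permutations(1:10))
perm_before, perm_after = 21772839, 7257607
# allocs estimates for collect(multiset_permutations(1:10))
multi_before, multi_after = 14515235, 7257644

perm_ratio  = perm_after / perm_before     # close to 1/3
multi_ratio = multi_after / multi_before   # close to 1/2

# Allocations per collected permutation after the change
per_perm_after = perm_after / nperms       # close to 2
```

By the same arithmetic, the v1.1.0 counts work out to about six allocations per permutation for permutations() and about four for multiset_permutations().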

Other notes:

- multiset_permutations() and permutations() now have the same performance on collections with unique elements (in v1.1.0, multiset_permutations() outperforms permutations()).
- The constructors now homogenize the input to the structs, with data and m always being a Vector{T}, where T is the element type of the input.
- Care has been taken to accept any input which is indexable (i.e. it does not need to be iterable).
- Tested and working with Vectors, Multidimensional Arrays, Sparse Arrays and Offset Arrays (i.e. covering LinearIndex, CartesianIndex and offset indices).
- permutations() is now type safe (in line with multiset_permutations()), always returning a Permutations type.
- multiset_permutations(m, t) is now linear (vs the current $O(n^2)$ version). This is likely not terribly important, since input size is highly limited.
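
As a rough illustration of what homogenizing the input could look like (an assumption about the approach, not the constructor code in this PR): the hypothetical helper `standardize` below relies only on `eachindex` and `getindex`, so the same line handles linear and Cartesian indexing and always produces a flat Vector.

```julia
# Hypothetical helper: copy any indexable input into a flat Vector.
# eachindex picks whichever index style the input supports; vec flattens
# the result when the indices have more than one dimension.
standardize(a) = vec([a[i] for i in eachindex(a)])

from_matrix = standardize([1 2; 3 4])    # column-major order
from_tuple  = standardize((10, 20, 30))
from_string = standardize("abc")
```

Doing this once in the constructor means the iteration code only ever sees a 1-based Vector, which is the point of moving the overhead out of iterate.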

Benchmarks

`collect(permutations(1:10))`

Before (v1.1.0)

BenchmarkTools.Trial: 6 samples with 1 evaluation per sample.
 Range (min … max):  570.045 ms …    1.280 s  ┊ GC (min … max):  0.00% … 55.55%
 Time  (median):     940.828 ms               ┊ GC (median):    37.16%
 Time  (mean ± σ):   917.961 ms ± 231.521 ms  ┊ GC (mean ± σ):  36.47% ± 18.44%

  ▁                   ▁         ▁  █                          ▁  
  █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁█▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  570 ms           Histogram: frequency by time          1.28 s <

 Memory estimate: 1.49 GiB, allocs estimate: 21772839.

After

BenchmarkTools.Trial: 12 samples with 1 evaluation per sample.
 Range (min … max):  359.795 ms … 588.693 ms  ┊ GC (min … max): 35.43% … 50.36%
 Time  (median):     377.968 ms               ┊ GC (median):    35.71%
 Time  (mean ± σ):   418.433 ms ±  75.727 ms  ┊ GC (mean ± σ):  39.63% ±  6.85%

  ▁▁█ █▁   ▁             ▁ ▁                      ▁           ▁  
  ███▁██▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁█▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁█ ▁
  360 ms           Histogram: frequency by time          589 ms <

 Memory estimate: 526.03 MiB, allocs estimate: 7257607.

`collect(multiset_permutations(1:10))`

Before (v1.1.0)

BenchmarkTools.Trial: 8 samples with 1 evaluation per sample.
 Range (min … max):  321.493 ms … 807.490 ms  ┊ GC (min … max):  0.00% … 45.44%
 Time  (median):     670.635 ms               ┊ GC (median):    41.88%
 Time  (mean ± σ):   625.491 ms ± 149.446 ms  ┊ GC (mean ± σ):  38.67% ± 15.86%

  ▁                     ▁                 ▁▁   ▁█             ▁  
  █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁██▁▁▁██▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  321 ms           Histogram: frequency by time          807 ms <

 Memory estimate: 1.00 GiB, allocs estimate: 14515235.

After

BenchmarkTools.Trial: 13 samples with 1 evaluation per sample.
 Range (min … max):  257.694 ms … 564.993 ms  ┊ GC (min … max):  0.00% … 56.52%
 Time  (median):     378.982 ms               ┊ GC (median):    37.36%
 Time  (mean ± σ):   396.513 ms ±  80.342 ms  ┊ GC (mean ± σ):  38.71% ± 13.44%

  █         █  █       █████     █  █      █    █             █  
  █▁▁▁▁▁▁▁▁▁█▁▁█▁▁▁▁▁▁▁█████▁▁▁▁▁█▁▁█▁▁▁▁▁▁█▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  258 ms           Histogram: frequency by time          565 ms <

 Memory estimate: 526.03 MiB, allocs estimate: 7257644.

`collect(multiset_permutations([2,4,5,6,3,4,1,8,9,3,2]))`

Before (v1.1.0)

BenchmarkTools.Trial: 6 samples with 1 evaluation per sample.
 Range (min … max):  518.051 ms …    1.083 s  ┊ GC (min … max):  0.00% … 47.98%
 Time  (median):     805.651 ms               ┊ GC (median):    36.66%
 Time  (mean ± σ):   834.476 ms ± 200.360 ms  ┊ GC (mean ± σ):  35.33% ± 16.90%

  █                           ██  █                    █      █  
  █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁██▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁█ ▁
  518 ms           Histogram: frequency by time          1.08 s <

 Memory estimate: 1.38 GiB, allocs estimate: 19958439.

After

BenchmarkTools.Trial: 10 samples with 1 evaluation per sample.
 Range (min … max):  494.901 ms … 668.593 ms  ┊ GC (min … max): 29.28% … 49.77%
 Time  (median):     533.120 ms               ┊ GC (median):    36.22%
 Time  (mean ± σ):   556.938 ms ±  60.735 ms  ┊ GC (mean ± σ):  38.80% ±  6.97%

  ▁▁▁        █  ▁           ▁                ▁ ▁              ▁  
  ███▁▁▁▁▁▁▁▁█▁▁█▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  495 ms           Histogram: frequency by time          669 ms <

 Memory estimate: 723.29 MiB, allocs estimate: 9979241.

Heap's Algorithm performance comparison

`heapermutations(1:10)`

BenchmarkTools.Trial: 13 samples with 1 evaluation per sample.
 Range (min … max):  206.685 ms … 598.413 ms  ┊ GC (min … max):  0.00% … 65.33%
 Time  (median):     382.252 ms               ┊ GC (median):    41.20%
 Time  (mean ± σ):   403.536 ms ±  94.463 ms  ┊ GC (mean ± σ):  45.00% ± 15.82%

  ▁                   █   ▁▁▁▁ ▁         ▁▁█                  ▁  
  █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁████▁█▁▁▁▁▁▁▁▁▁███▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  207 ms           Histogram: frequency by time          598 ms <

 Memory estimate: 570.24 MiB, allocs estimate: 7257632.

`collect(permutations(1:10))`

BenchmarkTools.Trial: 13 samples with 1 evaluation per sample.
 Range (min … max):  246.140 ms … 585.436 ms  ┊ GC (min … max):  0.00% … 59.40%
 Time  (median):     371.256 ms               ┊ GC (median):    35.27%
 Time  (mean ± σ):   395.530 ms ±  89.370 ms  ┊ GC (mean ± σ):  39.57% ± 15.16%

  ▁           ▁   ▁▁  ▁ █▁      ▁ ▁      ▁         ▁          ▁  
  █▁▁▁▁▁▁▁▁▁▁▁█▁▁▁██▁▁█▁██▁▁▁▁▁▁█▁█▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁█ ▁
  246 ms           Histogram: frequency by time          585 ms <

 Memory estimate: 526.03 MiB, allocs estimate: 7257607.

Note: Below is the implementation of Heap's algorithm which I used to compare performance. It was written to be comparable in structure to the nextpermutation() used in this update, but it's not actually correct for permlen < length(data): it always generates every permutation of the full input, so the truncated outputs can contain duplicates.

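```julia
function heapermutations(data, permlen=length(data), perm=collect(eachindex(data)), datalen=length(data))
    # state[i] counts the swaps already made at level i (stands in for the recursion stack)
    state = ones(Int, datalen)
    output = [data[view(perm, 1:permlen)]]
    i = 1
    while i ≤ datalen
        @inbounds(if state[i] < i
            # Heap's swap rule, shifted to 1-based indexing
            if isodd(i)
                perm[1], perm[i] = perm[i], perm[1]
            else
                perm[state[i]], perm[i] = perm[i], perm[state[i]]
            end
            # record the first permlen entries of the new arrangement
            push!(output, data[view(perm, 1:permlen)])
            state[i] += 1
            i = 1
        else
            # level i is exhausted: reset it and move up one level
            state[i] = 1
            i += 1
        end)
    end
    output
end
```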
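
The duplicate issue mentioned in the note is easy to see on a small input. The block below carries a condensed copy of the function (same logic, fewer arguments, no `@inbounds`) so that it runs on its own, then compares the number of outputs with the number of distinct outputs.

```julia
# Condensed copy of heapermutations, for a self-contained demonstration.
function heapermutations(data, permlen=length(data))
    perm = collect(eachindex(data))
    datalen = length(data)
    state = ones(Int, datalen)
    output = [data[view(perm, 1:permlen)]]
    i = 1
    while i ≤ datalen
        if state[i] < i
            j = isodd(i) ? 1 : state[i]
            perm[j], perm[i] = perm[i], perm[j]
            push!(output, data[view(perm, 1:permlen)])
            state[i] += 1
            i = 1
        else
            state[i] = 1
            i += 1
        end
    end
    output
end

full    = heapermutations(1:4)      # every ordering of four elements
partial = heapermutations(1:4, 2)   # length-2 prefixes of those orderings

full_counts    = (length(full), length(unique(full)))        # (24, 24)
partial_counts = (length(partial), length(unique(partial)))  # (24, 12)
```

With four elements and permlen = 2, each length-2 arrangement appears twice, once for each ordering of the two elements that were cut off.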

codecov · 2025-12-18T14:19:00Z

Codecov Report

❌ Patch coverage is 98.87640% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 97.19%. Comparing base (b808ce2) to head (b9479c4).

Files with missing lines	Patch %	Lines
src/permutations.jl	98.87%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #205      +/-   ##
==========================================
+ Coverage   97.17%   97.19%   +0.02%     
==========================================
  Files           8        8              
  Lines         813      857      +44     
==========================================
+ Hits          790      833      +43     
- Misses         23       24       +1

depial · 2025-12-19T17:33:05Z

I noticed that derangements() was a filter of multiset_permutations(), so I've coded up a derangement-specific implementation which mirrors that of permutations and multiset permutations. It now has its own type Derangements and nextderangement() method, which is an iterative version of Rohl's algorithm described here in recursive form.

Some benefits of the new implementation:

- More performant in both time and space.
- Increasingly more performant as multiplicity increases in multisets.
- Includes support for derangements of size t.
- Similar type implementation to Permutations and MultiSetPermutations.
- Only one inner constructor.
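
To show why generating derangements directly beats filtering, here is a small backtracking sketch. It is not Rohl's algorithm and not the nextderangement() from this PR; `derangements_sketch` is a hypothetical name, and the approach (choose each position's value from the remaining multiset, skipping the value originally at that position) is just one simple way to avoid ever building a non-derangement.

```julia
# Hypothetical sketch: enumerate derangements of a (multi)set by backtracking.
# A derangement here is a rearrangement p of a with p[i] != a[i] for every i.
function derangements_sketch(a::AbstractVector{T}) where {T}
    vals = unique(a)                              # distinct values
    counts = [count(==(v), a) for v in vals]      # remaining copies of each value
    n = length(a)
    current = Vector{T}(undef, n)
    out = Vector{Vector{T}}()

    function place!(i)
        if i > n
            push!(out, copy(current))             # only completed derangements are stored
            return
        end
        for k in eachindex(vals)
            # use a value only if a copy is left and it differs from the original
            if counts[k] > 0 && vals[k] != a[i]
                counts[k] -= 1
                current[i] = vals[k]
                place!(i + 1)
                counts[k] += 1
            end
        end
    end

    place!(1)
    return out
end

unique_case   = derangements_sketch([1, 2, 3, 4])
multiset_case = derangements_sketch([1, 1, 2, 2])
```

Iterating over distinct values rather than positions also means repeated elements never produce the same derangement twice, which fits the observation that the gain grows with multiplicity.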

Benchmarking

Unique set elements

Current implementation (v1.1.0) run on collect(derangements(1:10))

BenchmarkTools.Trial: 10 samples with 1 evaluation per sample.
 Range (min … max):  404.076 ms … 703.335 ms  ┊ GC (min … max): 24.18% … 44.09%
 Time  (median):     496.590 ms               ┊ GC (median):    28.75%
 Time  (mean ± σ):   517.083 ms ±  91.147 ms  ┊ GC (mean ± σ):  31.96% ±  6.32%

  █     █    █  █   ██    █ █                    █            █  
  █▁▁▁▁▁█▁▁▁▁█▁▁█▁▁▁██▁▁▁▁█▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  404 ms           Histogram: frequency by time          703 ms <

 Memory estimate: 1.16 GiB, allocs estimate: 19479014.

New Derangements implementation run on collect(derangements(1:10))

BenchmarkTools.Trial: 31 samples with 1 evaluation per sample.
 Range (min … max):  112.177 ms … 278.214 ms  ┊ GC (min … max):  0.00% … 59.36%
 Time  (median):     159.637 ms               ┊ GC (median):    27.81%
 Time  (mean ± σ):   168.818 ms ±  43.172 ms  ┊ GC (mean ± σ):  32.69% ± 16.97%

    ▁▁         ▁▄  █                  ▁ ▁                        
  ▆▁██▆▆▁▁▆▁▁▆▁██▆▁█▆▁▁▁▁▆▁▁▁▆▁▁▁▁▆▆▁▁█▁█▁▁▆▁▁▁▁▁▁▁▁▁▁▁▁▁▁▆▁▁▁▆ ▁
  112 ms           Histogram: frequency by time          278 ms <

 Memory estimate: 218.85 MiB, allocs estimate: 2669983.
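
For context on the unique-element case, the number of derangements of n distinct elements follows the standard recurrence D(n) = (n - 1)(D(n - 1) + D(n - 2)). The helper below (`count_derangements`, a name chosen for illustration) evaluates it, which shows how much of the candidate space a filter over all 10! permutations has to throw away.

```julia
# Count derangements of n distinct elements via D(n) = (n-1) * (D(n-1) + D(n-2)).
function count_derangements(n::Integer)
    n == 0 && return 1
    prev, cur = 1, 0                  # D(0), D(1)
    for k in 2:n
        prev, cur = cur, (k - 1) * (cur + prev)
    end
    return cur
end

d10 = count_derangements(10)          # 1334961
kept_fraction = d10 / factorial(10)   # about 0.368, close to 1/e
```

So a filter-based approach generates roughly 3.6 million permutations of 1:10 to keep about 1.3 million of them.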

Multiset elements

Current implementation (v1.1.0) run on collect(derangements([1, 1, 2, 4, 5, 5, 6, 7, 7, 9]))

BenchmarkTools.Trial: 90 samples with 1 evaluation per sample.
 Range (min … max):  31.160 ms … 86.085 ms  ┊ GC (min … max):  0.00% … 38.91%
 Time  (median):     57.960 ms              ┊ GC (median):    29.18%
 Time  (mean ± σ):   55.699 ms ± 13.952 ms  ┊ GC (mean ± σ):  24.51% ± 15.05%

   ▃       ▁   ▃        ▁     ▆  ▁█ ▁        ▁      ▃          
  ▄█▄▄▇▄▇▁▁█▄▇▁█▄▁▄▇▇▄▁▇█▁▇▄▄▁█▁▄██▄█▇▄▇▄▇▇▇▄█▇▄▁▁▇▁█▄▁▁▄▄▄▄▇ ▁
  31.2 ms         Histogram: frequency by time          81 ms <

 Memory estimate: 142.82 MiB, allocs estimate: 2352077.

New Derangements implementation run on collect(derangements([1, 1, 2, 4, 5, 5, 6, 7, 7, 9]))

BenchmarkTools.Trial: 731 samples with 1 evaluation per sample.
 Range (min … max):  4.366 ms … 21.888 ms  ┊ GC (min … max):  0.00% … 71.92%
 Time  (median):     5.298 ms              ┊ GC (median):     0.00%
 Time  (mean ± σ):   6.828 ms ±  3.301 ms  ┊ GC (mean ± σ):  17.23% ± 19.83%

  ██▆▆▄▃▂▂▁▄▃▁▂▂ ▁ ▁                                          
  ██████████████████████▆▁▅▄▄█▅▇▇▅▇▆▆▇▆▇▇▅▇▅▅▆▆▇▆▆▅▆▄▁▅▇▄▁▄▅ █
  4.37 ms      Histogram: log(frequency) by time     18.2 ms <

 Memory estimate: 13.37 MiB, allocs estimate: 168114.

depial · 2025-12-21T17:18:32Z

Note: I've changed the three argument multiset_permutations(m::Vector, f::Vector{<:Integer}, t::Integer) to be more clearly an outer constructor MultiSetPermutations(m::Vector, f::Vector{<:Integer}, t::Integer) since it appears this method is not actually meant to be exported.

If it is meant to be exported, I believe it would need a docstring which explains how to construct m and f in the way it is done in the two argument method multiset_permutations(a, t::Integer).
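
For readers unfamiliar with the (m, f) form, the sketch below assumes that m holds the distinct elements and f their multiplicities; that reading is suggested by the names but is not spelled out in this thread. `tally_multiset` is a hypothetical helper, not a function in Combinatorics.jl. It builds both vectors in a single pass with a Dict, one possible way to get linear rather than quadratic construction.

```julia
# Hypothetical helper: split a collection into distinct elements m and
# their multiplicities f, preserving first-occurrence order.
function tally_multiset(a)
    T = eltype(a)
    m = Vector{T}()
    f = Int[]
    pos = Dict{T,Int}()               # element => its index in m
    for x in a
        k = get(pos, x, 0)
        if k == 0
            push!(m, x)
            push!(f, 1)
            pos[x] = length(m)
        else
            f[k] += 1
        end
    end
    return m, f
end

m, f = tally_multiset([2, 4, 5, 6, 3, 4, 1, 8, 9, 3, 2])
```

For the benchmark input used above, the multiplicities sum to 11, the length of the input.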

depial added 3 commits December 18, 2025 07:42

Update permutations.jl

2eb3b82

Update permutations.jl

9091144

Update docstrings

Update permutations.jl

2a634d7

Update for string comparison in `derangements`

depial added 2 commits December 18, 2025 09:32

Update permutations.jl

6139b8c

Update permutations.jl

819f2eb

Delete non-mutating `nextpermutation()`

depial mentioned this pull request Dec 18, 2025

Performance regression: in iterate method for Permutations. #204

Open

Update permutations.jl

ba1fdb1

depial changed the title ~~Update permutations.jl~~ Performance update for permutations.jl Dec 18, 2025

depial added 2 commits December 19, 2025 11:09

Update permutations.jl

d4898d8

Provide a derangement-specific implementation. Improving performance and providing further functionality.

Update derangements

c84b4cb

depial added 3 commits December 19, 2025 12:36

Replace isnothing for Julia 1.0 compatibility

c1d6350

add DerangementsIterState type

287b265

Further optimization of derangements

51426e8

Iteration has been streamlined a bit more and some comments have been added to help with future maintenance.

Update permutations.jl

153f827

depial mentioned this pull request Dec 30, 2025

Combinatorics performance update #207

Open

Update api for Derangements

b9479c4

Limits constructors for the types `Derangements` and `DerangementsIterState` to one inner constructor apiece.

depial mentioned this pull request Feb 15, 2026

Improves backwards compatibility in alphametics after addition of Combinatorics.jl exercism/julia#1055

Merged

depial wants to merge 13 commits into JuliaMath:master from depial:permutations-update