Tests fail on macOS (arm64 architecture) #208

@lamBOOO

After playing around with GridapPETSc (see gridap/GridapPETSc.jl#102), I noticed that on macOS (Apple M4 chip), MPI operations fail when custom reduction operators are used (see the error message below).

These errors can be reproduced by running the PartitionedArrays test suite either:

  1. using a GitHub macOS (arm64) runner, as done in https://github.com/lamBOOO/PartitionedArrays.jl/actions/runs/16026309753/job/45215040606
  2. or locally with Docker, using the docker run --rm -it --platform linux/amd64 julia command followed by the usual ] test command within Julia.

The problem seems to be known, see JuliaParallel/MPI.jl#404 or the MPI.jl docs.

I came up with 2 ways to fix the tests:

  1. Using @RegisterOp as recommended in the MPI.jl docs (also done here): Statically register MPI reduction ops for distances and fix global AN… lamBOOO/PartitionedArrays.jl#3
  2. Using a try/catch that uses a fallback when running on macOS: add macos fix (not as performant) lamBOOO/PartitionedArrays.jl#1
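
For reference, the first option can be sketched roughly as follows. This is a minimal illustration, not the code from the linked PRs: the function name logical_and is made up for this example, and the sketch assumes an MPI.jl version that provides MPI.@RegisterOp (the macro named in the error message below). The key point is that the reduction function has to be a named top-level function rather than a closure, and the macro has to be called as a top-level statement.

```julia
using MPI

# Hypothetical reduction function for this sketch. It is defined at top level
# (not as a closure inside another function) so it can be registered statically.
logical_and(a::Bool, b::Bool) = a && b

# Statically register the operator for Bool. This adds a method to MPI.Op,
# so no runtime closure needs to be passed to C.
MPI.@RegisterOp(logical_and, Bool)

MPI.Init()
comm = MPI.COMM_WORLD

# Every rank contributes `true`; the custom operator combines the values.
all_ok = MPI.Allreduce(true, logical_and, comm)
```

Since the registration is tied to a concrete element type, each (function, type) pair used in a reduction would need its own @RegisterOp call.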

I think the second option is not optimal in terms of runtime. Do you have any other ideas for getting this to work?

Details
Error: 
│   exception =
│    User-defined reduction operators are currently not supported on non-Intel architectures.
│    See https://github.com/JuliaParallel/MPI.jl/issues/404 for more details.
│    
│    You may want to use `@RegisterOp` to statically register `f`.
│    
│    Stacktrace:
│      [1] error(s::String)
│        @ Base ./error.jl:35
│      [2] MPI.Op(f::Function, T::Type; iscommutative::Bool)
│        @ MPI ~/.julia/packages/MPI/TKXAj/src/operators.jl:109
│      [3] MPI.Op(f::Function, T::Type)
│        @ MPI ~/.julia/packages/MPI/TKXAj/src/operators.jl:102
│      [4] reduction_impl(op::PartitionedArrays.var"#and#68", a::PartitionedArrays.MPIArray{Bool, 1}, destination::Symbol; init::Bool)
│        @ PartitionedArrays ~/work/PartitionedArrays.jl/PartitionedArrays.jl/src/mpi_array.jl:481
│      [5] #reduction#49
│        @ ~/work/PartitionedArrays.jl/PartitionedArrays.jl/src/primitives.jl:682 [inlined]
│      [6] (::PartitionedArrays.var"#is_included#65")(snd_ids_a::PartitionedArrays.MPIArray{Vector{Int32}, 1}, snd_ids_b::PartitionedArrays.MPIArray{Vector{Int32}, 1})
│        @ PartitionedArrays ~/work/PartitionedArrays.jl/PartitionedArrays.jl/src/primitives.jl:811
│      [7] ExchangeGraph_impl_with_neighbors(snd_ids::PartitionedArrays.MPIArray{Vector{Int32}, 1}, neighbors::PartitionedArrays.ExchangeGraph{PartitionedArrays.MPIArray{Vector{Int32}, 1}})
│        @ PartitionedArrays ~/work/PartitionedArrays.jl/PartitionedArrays.jl/src/primitives.jl:817
│      [8] #ExchangeGraph#57
│        @ ~/work/PartitionedArrays.jl/PartitionedArrays.jl/src/primitives.jl:797 [inlined]
│      [9] ExchangeGraph
│        @ ~/work/PartitionedArrays.jl/PartitionedArrays.jl/src/primitives.jl:787 [inlined]
│     [10] #compute_assembly_neighbors#153
│        @ ~/work/PartitionedArrays.jl/PartitionedArrays.jl/src/p_range.jl:448 [inlined]
│     [11] compute_assembly_neighbors
│        @ ~/work/PartitionedArrays.jl/PartitionedArrays.jl/src/p_range.jl:436 [inlined]
│     [12] #assembly_neighbors#146
│        @ ~/work/PartitionedArrays.jl/PartitionedArrays.jl/src/p_range.jl:423 [inlined]
│     [13] assembly_neighbors
│        @ ~/work/PartitionedArrays.jl/PartitionedArrays.jl/src/p_range.jl:417 [inlined]
│     [14] psparse(f::PartitionedArrays.var"#f#410"{SparseArrays.SparseMatrixCSC{Float64, Int64}}, I::PartitionedArrays.MPIArray{Vector{Int64}, 1}, J::PartitionedArrays.MPIArray{Vector{Int64}, 1}, V::PartitionedArrays.MPIArray{Vector{Float64}, 1}, rows::PartitionedArrays.MPIArray{PartitionedArrays.LocalIndicesWithVariableBlockSize{1}, 1}, cols::PartitionedArrays.MPIArray{PartitionedArrays.LocalIndicesWithVariableBlockSize{1}, 1}; split_format::Val{true}, subassembled::Bool, assembled::Bool, assemble::Bool, indices::Symbol, restore_ids::Bool, assembly_neighbors_options_rows::@NamedTuple{neighbors::PartitionedArrays.ExchangeGraph{PartitionedArrays.MPIArray{Vector{Int32}, 1}}}, assembly_neighbors_options_cols::@NamedTuple{neighbors::PartitionedArrays.ExchangeGraph{PartitionedArrays.MPIArray{Vector{Int32}, 1}}}, assembled_rows::Nothing, reuse::Bool)
│        @ PartitionedArrays ~/work/PartitionedArrays.jl/PartitionedArrays.jl/src/p_sparse_matrix.jl:1192
│     [15] psparse
│        @ ~/work/PartitionedArrays.jl/PartitionedArrays.jl/src/p_sparse_matrix.jl:1150 [inlined]
│     [16] #psparse#409
│        @ ~/work/PartitionedArrays.jl/PartitionedArrays.jl/src/p_sparse_matrix.jl:1139 [inlined]
│     [17] psparse
│        @ ~/work/PartitionedArrays.jl/PartitionedArrays.jl/src/p_sparse_matrix.jl:1137 [inlined]
│     [18] #psparse#408
│        @ ~/work/PartitionedArrays.jl/PartitionedArrays.jl/src/p_sparse_matrix.jl:1134 [inlined]
│     [19] psystem(I::PartitionedArrays.MPIArray{Vector{Int64}, 1}, J::PartitionedArrays.MPIArray{Vector{Int64}, 1}, V::PartitionedArrays.MPIArray{Vector{Float64}, 1}, I2::PartitionedArrays.MPIArray{Vector{Int64}, 1}, V2::PartitionedArrays.MPIArray{Vector{Float64}, 1}, rows::PartitionedArrays.MPIArray{PartitionedArrays.LocalIndicesWithVariableBlockSize{1}, 1}, cols::PartitionedArrays.MPIArray{PartitionedArrays.LocalIndicesWithVariableBlockSize{1}, 1}; subassembled::Bool, assembled::Bool, assemble::Bool, indices::Symbol, restore_ids::Bool, assembly_neighbors_options_rows::@NamedTuple{neighbors::PartitionedArrays.ExchangeGraph{PartitionedArrays.MPIArray{Vector{Int32}, 1}}}, assembly_neighbors_options_cols::@NamedTuple{neighbors::PartitionedArrays.ExchangeGraph{PartitionedArrays.MPIArray{Vector{Int32}, 1}}}, assembled_rows::Nothing, reuse::Bool)
│        @ PartitionedArrays ~/work/PartitionedArrays.jl/PartitionedArrays.jl/src/p_sparse_matrix.jl:2818
│     [20] fem_example(distribute::PartitionedArrays.var"#105#106"{MPI.Comm, Bool})
│        @ Main.MPIArrayFEMExample ~/work/PartitionedArrays.jl/PartitionedArrays.jl/test/fem_example.jl:314
│     [21] with_mpi(f::typeof(Main.MPIArrayFEMExample.fem_example); comm::MPI.Comm, duplicate_comm::Bool)
│        @ PartitionedArrays ~/work/PartitionedArrays.jl/PartitionedArrays.jl/src/mpi_array.jl:73
│     [22] with_mpi(f::Function)
│        @ PartitionedArrays ~/work/PartitionedArrays.jl/PartitionedArrays.jl/src/mpi_array.jl:64
│     [23] top-level scope
│        @ ~/work/PartitionedArrays.jl/PartitionedArrays.jl/test/mpi_array/drivers/fem_example.jl:7
│     [24] include(mod::Module, _path::String)
│        @ Base ./Base.jl:557
│     [25] exec_options(opts::Base.JLOptions)
│        @ Base ./client.jl:323
│     [26] _start()
│        @ Base ./client.jl:531
└ @ PartitionedArrays ~/work/PartitionedArrays.jl/PartitionedArrays.jl/src/mpi_array.jl:75
