forked from julesghub/underworld3-old
Open
Description
Hi team,
Here's a difficult bug to reproduce! A script that normally runs fine on a small number of processors fails when I run it across a large number of processors. It fails at the SNES solve stage with PETSc error code 62 (although error code 73 is sometimes printed instead).
Things I've discovered:
- It runs successfully with mesh refinement < 5 on the cubed sphere mesh
- It runs successfully with mesh refinement < 3 on the spherical shell mesh
- It's not a memory usage limitation
I've attached the modified Darcy flow benchmark I used to reproduce this error, but again, it's difficult to reproduce without access to >= 128 cores.
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: Invalid argument
[0]PETSC ERROR: Scalar value must be same on all processes, argument # 3
[0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program crashed before usage or a spelling mistake, etc!
[0]PETSC ERROR: Option left: name:-Solver_14_mg_levels_ksp_converged_maxits (no value) source: code
[0]PETSC ERROR: Option left: name:-Solver_14_mg_levels_ksp_max_it value: 3 source: code
[0]PETSC ERROR: Option left: name:-Solver_14_pc_mg_type value: additive source: code
[0]PETSC ERROR: Option left: name:-Solver_15_ksp_rtol value: 0.001 source: code
[0]PETSC ERROR: Option left: name:-Solver_15_ksp_type value: gmres source: code
[0]PETSC ERROR: Option left: name:-Solver_15_mg_levels_ksp_converged_maxits (no value) source: code
[0]PETSC ERROR: Option left: name:-Solver_15_mg_levels_ksp_max_it value: 3 source: code
[0]PETSC ERROR: Option left: name:-Solver_15_pc_gamg_agg_nsmooths value: 2 source: code
[0]PETSC ERROR: Option left: name:-Solver_15_pc_gamg_repartition value: true source: code
[0]PETSC ERROR: Option left: name:-Solver_15_pc_gamg_type value: agg source: code
[0]PETSC ERROR: Option left: name:-Solver_15_pc_mg_type value: additive source: code
[0]PETSC ERROR: Option left: name:-Solver_15_pc_type value: gamg source: code
[0]PETSC ERROR: Option left: name:-Solver_15_snes_atol value: 1e-08 source: code
[0]PETSC ERROR: Option left: name:-Solver_15_snes_rtol value: 0.001 source: code
[0]PETSC ERROR: Option left: name:-Solver_15_snes_type value: newtonls source: code
[0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.21.6, unknown
[0]PETSC ERROR: 05-assimilate-conductivity.py on a named ip-0A3A580C by ben.r.mather Mon Jun 16 00:33:30 2025
[0]PETSC ERROR: Configure options --with-debugging=1 --prefix=/shared/home/ben.r.mather/petsc-3.21.5-hpcx-mt3-debug --COPTFLAGS="-g -O3" --CXXOPTFLAGS="-g -O3" --FOPTFLAGS="-g -O3" --with-petsc4py=1 --with-zlib=1 --with-shared-libraries=1 --with-cxx-dialect=C++11 --with-make-np=4 --download-bison --download-hdf5=https://github.com/HDFGroup/hdf5/archive/refs/tags/hdf5-1_10_8.tar.gz --download-mumps=1 --download-parmetis=1 --download-metis=1 --download-superlu=1 --download-hypre=1 --download-scalapack=1 --download-superlu_dist=1 --download-pragmatic=1 --download-ctetgen=1 --download-eigen --download-triangle --download-ptscotch --download-fblaslapack
[0]PETSC ERROR: #1 VecMAXPYAsync_Private() at /shared/home/ben.r.mather/petsc/src/vec/vec/interface/rvector.c:1242
[0]PETSC ERROR: #2 VecMAXPY() at /shared/home/ben.r.mather/petsc/src/vec/vec/interface/rvector.c:1286
[0]PETSC ERROR: #3 KSPGMRESClassicalGramSchmidtOrthogonalization() at /shared/home/ben.r.mather/petsc/src/ksp/ksp/impls/gmres/borthog2.c:73
[0]PETSC ERROR: #4 KSPGMRESCycle() at /shared/home/ben.r.mather/petsc/src/ksp/ksp/impls/gmres/gmres.c:149
[0]PETSC ERROR: #5 KSPSolve_GMRES() at /shared/home/ben.r.mather/petsc/src/ksp/ksp/impls/gmres/gmres.c:227
[0]PETSC ERROR: #6 KSPSolve_Private() at /shared/home/ben.r.mather/petsc/src/ksp/ksp/interface/itfunc.c:905
[0]PETSC ERROR: #7 KSPSolve() at /shared/home/ben.r.mather/petsc/src/ksp/ksp/interface/itfunc.c:1078
[0]PETSC ERROR: #8 PCGAMGOptProlongator_AGG() at /shared/home/ben.r.mather/petsc/src/ksp/pc/impls/gamg/agg.c:1365
[0]PETSC ERROR: #9 PCSetUp_GAMG() at /shared/home/ben.r.mather/petsc/src/ksp/pc/impls/gamg/gamg.c:710
[0]PETSC ERROR: #10 PCSetUp() at /shared/home/ben.r.mather/petsc/src/ksp/pc/interface/precon.c:1079
[0]PETSC ERROR: #11 KSPSetUp() at /shared/home/ben.r.mather/petsc/src/ksp/ksp/interface/itfunc.c:415
[0]PETSC ERROR: #12 KSPSolve_Private() at /shared/home/ben.r.mather/petsc/src/ksp/ksp/interface/itfunc.c:831
[0]PETSC ERROR: #13 KSPSolve() at /shared/home/ben.r.mather/petsc/src/ksp/ksp/interface/itfunc.c:1078
[0]PETSC ERROR: #14 SNESSolve_NEWTONLS() at /shared/home/ben.r.mather/petsc/src/snes/impls/ls/ls.c:221
[0]PETSC ERROR: #15 SNTraceback (most recent call last):
File "/shared/home/ben.r.mather/mge/data-engineering/feature_pool/fluid-flow-modelling/05-assimilate-conductivity.py", line 414, in <module>
Traceback (most recent call last):
File "/shared/home/ben.r.mather/miniforge3/envs/uw3/lib/python3.11/site-packages/underworld3/systems/solvers.py", line 319, in solve
darcy.solve()
File "src/underworld3/cython/petsc_generic_snes_solvers.pyx", line 898, in underworld3.cython.generic_solvers.SNES_Scalar.solve
super().solve(zero_init_guess, _force_setup)
File "petsc4py/PETSc/SNES.pyx", line 1601, in petsc4py.PETSc.SNES.solve
petsc4py.PETSc.Error: error code 62
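For anyone trying to narrow this down: the log reports "Scalar value must be same on all processes, argument # 3" from VecMAXPY(), whose third argument is the array of scalar coefficients. As far as I understand it, that message means the ranks did not agree on one of those coefficients, and the check is only active in debug builds (this one was configured with --with-debugging=1). The sketch below is not PETSc's own check; it is a hypothetical helper, `check_same_on_all_ranks`, showing one way to test whether a scalar gathered from every rank is identical and which ranks disagree. It takes a plain list so it can be tried without MPI.

```python
import math


def check_same_on_all_ranks(per_rank_values, rel_tol=0.0):
    """Check that one scalar, gathered from every rank, is the same everywhere.

    per_rank_values: one float per MPI rank (hypothetical input; in a real
    run this could be the list returned by an allgather of a local scalar).
    rel_tol: allowed relative spread; 0.0 demands exact equality.

    Returns (ok, spread, bad_ranks).
    """
    if not per_rank_values:
        raise ValueError("need at least one value")

    # Non-finite values (NaN/Inf) are reported first, since they make any
    # min/max comparison meaningless.
    non_finite = [r for r, v in enumerate(per_rank_values) if not math.isfinite(v)]
    if non_finite:
        return False, math.nan, non_finite

    lo, hi = min(per_rank_values), max(per_rank_values)
    spread = hi - lo
    scale = max(abs(lo), abs(hi))
    ok = spread <= rel_tol * scale

    # When the check fails, list the ranks whose value differs from rank 0's.
    reference = per_rank_values[0]
    bad_ranks = [] if ok else [r for r, v in enumerate(per_rank_values) if v != reference]
    return ok, spread, bad_ranks


if __name__ == "__main__":
    print(check_same_on_all_ranks([0.25, 0.25, 0.25]))
    print(check_same_on_all_ranks([0.25, 0.25, 0.5]))
```

In an MPI run, the input list could be built with mpi4py, for example `MPI.COMM_WORLD.allgather(local_value)`, where `local_value` is whatever scalar is suspected of diverging between ranks. That scalar is an assumption here, since the issue does not identify which quantity differs.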