Here is the promised follow-up information on the crashes, this time with a model that I know converges (the "Crankshaft" example, which I got from Rescale but I believe is one of your own examples).
I get the following error when I run on a node with AMD Zen1 CPUs:
OMP: Error #100: Fatal system error detected.
OMP: System error #22: Invalid argument
forrtl: error (76): Abort trap signal
Image PC Routine Line Source
libifcoremt.so.5 00007F14872F1555 for__signal_handl Unknown Unknown
libpthread-2.28.s 00007F144EDD9CE0 Unknown Unknown Unknown
libc-2.28.so 00007F144C5C0A4F gsignal Unknown Unknown
libc-2.28.so 00007F144C593DB5 abort Unknown Unknown
libiomp5.so 00007F1484DA4B23 Unknown Unknown Unknown
libiomp5.so 00007F1484D8FD17 Unknown Unknown Unknown
libiomp5.so 00007F1484D310A8 Unknown Unknown Unknown
libiomp5.so 00007F1484DE5E57 Unknown Unknown Unknown
libiomp5.so 00007F1484D2962D Unknown Unknown Unknown
libiomp5.so 00007F1484D1F119 Unknown Unknown Unknown
libiomp5.so 00007F1484D1E68B Unknown Unknown Unknown
libiomp5.so 00007F1484DA3B1F Unknown Unknown Unknown
libiomp5.so 00007F1484D8698E omp_get_num_procs Unknown Unknown
libansOpenMP.so 00007F146B886CEC ppinit_ Unknown Unknown
libansys.so 00007F147287D7EC smpstart_ Unknown Unknown
ansys.e 00000000004113F0 Unknown Unknown Unknown
ansys.e 000000000040EE28 MAIN__ Unknown Unknown
ansys.e 000000000040ED22 main Unknown Unknown
libc-2.28.so 00007F144C5ACCA3 __libc_start_main Unknown Unknown
ansys.e 000000000040EC39 Unknown Unknown Unknown
/uufs/chpc.utah.edu/sys/installdir/ansys/22.2/v222/ansys/bin/ansysdis222: line 77: 2638867 Aborted (core dumped) /uufs/chpc.utah.edu/sys/installdir/ansys/22.2/v222/ansys/bin/linx64/ansys.e -b nolist -s noread -i "dummy.dat" -o "solve.out" -dis -p ansys
This looks like an issue with the Intel OpenMP library, but since it only occurs with the distributed solver, not the shared-memory one, I suspect it may be due to some interaction with the Intel MPI that drives the distributed run. We have had quite a few issues with Intel MPI on Rocky 8: some versions work and some don't, and with the versions that do work we need to set FI_FABRICS=verbs explicitly.
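For context, the Rocky 8 workaround mentioned above is just an environment variable set before the solve starts; a minimal job-script fragment, assuming the variable name exactly as we use it in our setup:

```shell
# Workaround used on our Rocky 8 nodes with the Intel MPI versions that work:
# set the fabric selection variable before launching the distributed solve.
# (Variable name as used in our job scripts; adjust if your release differs.)
export FI_FABRICS=verbs
```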
I am wondering if there's a way to tell the Mechanical IDE to use OpenMPI instead of Intel MPI. I know there's a command-line option for that, but I can't find whether the IDE can set it.
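As a stopgap until we find the IDE setting, our own batch wrappers could pick the MPI flavor per node. This is only a sketch under assumptions: that this Ansys release accepts an option like "-mpi openmpi" / "-mpi intelmpi" on the solver command line (please correct me if the spelling differs), and that the vendor string comes from /proc/cpuinfo:

```shell
#!/bin/sh
# Sketch of a node-aware launcher helper: choose the MPI flag for the
# solver command line based on the CPU vendor reported by the OS.
# The "-mpi openmpi" / "-mpi intelmpi" options are an assumption about
# this Ansys release; adjust to whatever the release actually supports.

pick_mpi_flag() {
    # $1 is the CPU vendor string, e.g. obtained with:
    #   awk -F: '/vendor_id/ {gsub(/ /,"",$2); print $2; exit}' /proc/cpuinfo
    case "$1" in
        AuthenticAMD) echo "-mpi openmpi" ;;   # AMD nodes: avoid Intel MPI
        *)            echo "-mpi intelmpi" ;;  # Intel nodes: current default works
    esac
}
```

This only covers batch submissions that go through our own scripts, of course; the question about the IDE setting still stands.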
The same example runs fine on nodes with Intel CPUs.
So, I think we are more or less good. I'll instruct the user to stay on the Intel nodes and to fix their model to improve convergence, and I'll reiterate my encouragement to support Rocky Linux in future Ansys releases, on both Intel and AMD CPUs.