randyk
Ansys Employee

Hi Ehsan,

Please consider using a newer version of AEDT; there have been changes to improve the SLURM integration.

I would expect the following three items might be playing a role in your MPI failure.
1. AEDT defaults to passwordless SSH communication.
- For SLURM, I recommend against using passwordless SSH. Instead, specify "tight integration", which lets the scheduler initiate the remote processes. This is done by adding the batchoption '/RemoteSpawnCommand'='scheduler' (prefixed with the product name, as in the example).
- ex: -batchoptions " 'HFSS/RemoteSpawnCommand'='scheduler'"

2. Intel MPI crashing when run on a newer OS variant such as RHEL 8.x.
- AEDT defaults to Intel MPI 2018u3 (the binaries are integrated, not installed separately). This version requires additional libraries when run under newer OS releases.
- If your OS is RHEL 8.x, I would recommend using the Intel MPI 2021.6 option.
--  AEDT 2022R2 enables the beta flag for Intel MPI 2021 with the environment variable:    ANSYSEM_FEATURE_F539685_MPI_INTEL21_ENABLE=1
--  AEDT 2023R1 and newer enable it with the batchoption: '/MPIVersion'='2021'
     ex:   'HFSS/MPIVersion'='2021'
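
If you generate your submit command from a script, the two batchoptions from items 1 and 2 can be combined into one -batchoptions argument. Below is a minimal Python sketch, assuming that multiple entries are passed as space-separated 'key'='value' pairs inside one quoted string. The helper name build_batchoptions is hypothetical and is not part of AEDT; please check the resulting string against the documentation for your AEDT version.

```python
def build_batchoptions(options):
    """Join {key: value} pairs into one -batchoptions argument string.

    Hypothetical helper; assumes entries are space-separated
    'key'='value' pairs, as in the single-option examples above.
    """
    return " ".join(f"'{key}'='{value}'" for key, value in options.items())


# Option names are taken from items 1 and 2 (HFSS product prefix).
opts = {
    "HFSS/RemoteSpawnCommand": "scheduler",  # tight integration with SLURM
    "HFSS/MPIVersion": "2021",               # AEDT 2023R1 and newer
}

batchoptions = build_batchoptions(opts)
print(batchoptions)
# 'HFSS/RemoteSpawnCommand'='scheduler' 'HFSS/MPIVersion'='2021'
```

The printed string is what you would place inside the double quotes after -batchoptions.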

3. The execution hosts have multiple active networks.
- It is common for scheduler hosts to have both Ethernet and high-speed networks; in that case you need to specify the CIDR value of the preferred network.
- This is done with the batchoption "Desktop/Settings/ProjectOptions/AnsysEMPreferredSubnetAddress".
- ex: If your preferred network is 192.168.16.123/255.255.255.0, you would set -batchoptions " 'Desktop/Settings/ProjectOptions/AnsysEMPreferredSubnetAddress'='192.168.16.0/24'"
- There are many CIDR calculators on the web if you need assistance converting the IP/subnet mask to CIDR.
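
If you would rather not use a web calculator, the conversion can also be done with Python's standard ipaddress module. This is a small sketch using the example address from above; the function name to_cidr is just an illustration of mine.

```python
import ipaddress


def to_cidr(address, netmask):
    """Return the network in CIDR notation for a host address and netmask."""
    # ip_interface accepts "address/netmask"; .network drops the host bits.
    return str(ipaddress.ip_interface(f"{address}/{netmask}").network)


cidr = to_cidr("192.168.16.123", "255.255.255.0")
print(cidr)  # 192.168.16.0/24

# The value then goes into the batchoption named in item 3.
option = f"'Desktop/Settings/ProjectOptions/AnsysEMPreferredSubnetAddress'='{cidr}'"
print(option)
```

Run it on one of the execution hosts with that host's own address and netmask for the high-speed interface.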

I also suggest changes to your script. Since you pasted a screen capture, I will post my example script in a separate response; you can modify it as needed.

Thanks
Randy