Ansys Products

Troubleshooting RSM problems (for Intel MPI)

    • apr37
      Subscriber
      Hello,
      I have ANSYS EDT 2021 R1 installed on a few computers, along with RSM service and the Intel MPI. Last week I was able to successfully analyze a simulation using multiple machines in a HPC configuration, but this week some of the compute node machines are not cooperating. While running an analysis, message manager tells me
      [error] Unable to locate or start COM engine on '[compute node]' : Unable to reach AnsoftRSMService.
      And when I go to 'Test Machines' in the analysis configuration, the problem machines say ANS_CANNOT_CONNECTTO_ANSOFTRSMSERVICE
      I check for proper MPI behavior on the compute nodes using the following commands in command prompt:
      C:\Windows\system32>hydra_service -status
      Response: hydra service running on [compute node]
      and
      mpiexec -validate
      Response: SUCCESS
      Firewalls on all machines have been turned off.
      What troubleshooting steps can I take from here to try and narrow down the problem? I remember when I used to use the IBM MPI, I could use the following command to check for MPI function between two machines:
      %MPI_ROOT%\bin\mpirun -hostlist localhost:2,[compute node DNS name]:2 %ANSYSEM_ROOT201%\schedulers\diagnostics\Utils\pcmpi_test.exe
      Is there a similar command for use with the Intel MPI?
      Any other suggestions are also appreciated.
      Thank you!
      -Alex
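Since the error above is "Unable to reach AnsoftRSMService", one way to narrow things down is to check whether each compute node accepts a TCP connection on the port the RSM service listens on. The following is only a minimal sketch: the function names are made up for illustration, and the port number is not given anywhere in this thread, so it has to be looked up in your own RSM configuration and passed in.

```python
import socket
import sys


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_nodes(nodes, port, timeout=3.0):
    """Map each node name to whether its port accepted a connection."""
    return {node: can_connect(node, port, timeout) for node in nodes}


if __name__ == "__main__":
    # Usage (hypothetical script name): python check_rsm.py PORT node1 node2 ...
    # PORT is an assumption you must supply; it is not stated in this thread.
    if len(sys.argv) >= 3:
        port = int(sys.argv[1])
        for node, ok in check_nodes(sys.argv[2:], port).items():
            print(f"{node}: {'reachable' if ok else 'NOT reachable'} on port {port}")
    else:
        print("usage: python check_rsm.py PORT node1 [node2 ...]")
```

A successful connection only shows that something is listening on that port. It does not prove that the RSM service is healthy or that the engines are registered, so a node that passes this check can still fail 'Test Machines'.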
    • ANSYS_MMadore
      Ansys Employee
      Please make sure you have installed Ansoft RSM on all machines and registered the engines.
      http:/storage.ansys.com/doclinks/videos.html?code=InsElecRSMonWindows-VLU-K0a
      To test Intel MPI:
      Try this, first with the existing computer and then with the new one added. You can add or remove a computer by adding or removing everything from ': -n 2 -host computername "C:\Program Files\AnsysEM\AnsysEM21.1\Win64\schedulers\diagnostics\Utils\intelmpi_test.exe"' onward for that computer.
      "C:\Program Files\AnsysEM\AnsysEM21.1\Win64\common\fluent_mpi\multiport\mpi\win64\intel\bin\mpiexec" -n 2 -host computer1 "C:\Program Files\AnsysEM\AnsysEM21.1\Win64\schedulers\diagnostics\Utils\intelmpi_test.exe" : -n 2 -host computer2 "C:\Program Files\AnsysEM\AnsysEM21.1\Win64\schedulers\diagnostics\Utils\intelmpi_test.exe"

      Output should be something like:
      Intel MPI
      Hello world! I'm rank 0 of 4 running on computer1
      Hello world! I'm rank 1 of 4 running on computer1
      Hello world! I'm rank 2 of 4 running on computer2
      Hello world! I'm rank 3 of 4 running on computer2

      Thank you,
      Matt
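With more than two machines, the test command in the reply above gets long, so it may help to generate it and check the output automatically. The sketch below assumes the install paths quoted in that reply and the "Hello world!" output format shown there. The helper names (`build_command`, `count_ranks`, `missing_hosts`) are made up for illustration and are not part of any Ansys or Intel tool.

```python
import re
import subprocess
import sys
from collections import Counter

# Paths as quoted in the reply above; adjust them to your installation.
MPIEXEC = (r"C:\Program Files\AnsysEM\AnsysEM21.1\Win64\common\fluent_mpi"
           r"\multiport\mpi\win64\intel\bin\mpiexec")
TEST_EXE = (r"C:\Program Files\AnsysEM\AnsysEM21.1\Win64\schedulers"
            r"\diagnostics\Utils\intelmpi_test.exe")

RANK_LINE = re.compile(r"Hello world! I'm rank (\d+) of (\d+) running on (\S+)")


def build_command(hosts, ranks=2, mpiexec=MPIEXEC, test_exe=TEST_EXE):
    """Build the argument list: one '-n N -host H exe' segment per host, joined by ':'."""
    cmd = [mpiexec]
    for i, host in enumerate(hosts):
        if i > 0:
            cmd.append(":")
        cmd += ["-n", str(ranks), "-host", host, test_exe]
    return cmd


def count_ranks(output):
    """Count the 'Hello world!' lines reported by each host (lower-cased)."""
    return Counter(m.group(3).lower() for m in RANK_LINE.finditer(output))


def missing_hosts(output, hosts, ranks=2):
    """Return the hosts that reported fewer ranks than expected."""
    counts = count_ranks(output)
    return [h for h in hosts if counts[h.lower()] < ranks]


if __name__ == "__main__":
    # Usage (hypothetical script name): python mpi_check.py computer1 computer2 ...
    hosts = sys.argv[1:]
    if hosts:
        # Passing a list avoids quoting problems with the spaces in the paths.
        result = subprocess.run(build_command(hosts), capture_output=True, text=True)
        print(result.stdout)
        print("hosts with missing ranks:", missing_hosts(result.stdout, hosts))
```

The host comparison ignores case but is otherwise exact, so if the test program reports a fully qualified name while you passed a short name (or the other way round), that host will show up as missing even though it responded.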
    • apr37
      Subscriber
      Alright! Looks like even though I had installed and registered RSM on all the computers, somewhere along the line something happened that required me to do it again. At first I attempted to go straight to registering with RSM, and on the affected computers several errors were returned along the lines of:
      C:/Program Files/AnsysEM/AnsysEM21.1/Win64/RXPRTCOMENGINE.exe:Error obtaining status
      > Please make sure that Remote Simulation Manager is installed and running
      I did a repair installation of RSM, re-registered successfully, and now I am once again able to do distributed solves with those machines.
      Also, your suggested Intel MPI test command worked as expected for me. Thanks for that, too.