General Mechanical

Using ARC to distribute a job on a multi-node cluster

    • ahmed.desoki
      Subscriber

      Dear folks

      I have 4 identical workstations, each with 16 cores. I want to use them as execution nodes in an Ansys RSM Cluster (ARC) so that a job is distributed across all 4 nodes. To set this up, I followed the "Ansys Remote Solve Manager User's Guide". I can submit a job to ARC successfully, but on inspection I noticed the following:

      • The job runs on only one execution node (say, node 1).
      • However, the "RSM Cluster Load Monitoring" application incorrectly reports that all 4 execution nodes and 64 cores are in use.
      • I therefore ran the "ARC Configuration" application, removed node 1 from the cluster and the queue, and re-submitted the job to ARC. The job again ran on only one execution node (say, node 2).
      • Again, the "RSM Cluster Load Monitoring" application incorrectly reports that all 3 remaining execution nodes and 48 cores are in use.
      • I then opened the "RSM Configuration Utility" and ran the queue test, and found that the test job is also submitted to only one execution node.

      So, please help me answer the following questions:
      1) What should I do to run the job on all the execution nodes?
      2) Why does "RSM Cluster Load Monitoring" show wrong information?

      I can attach all log files or any information you need.

      Thanks and regards

    • Mike Rife
      Ansys Employee

      Hi @ahmed.desoki, please copy and paste the RSM job log from one of the jobs here. I am specifically looking for the command line that was issued.

      Mike
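
The command line Mike asks for can be checked quickly with a short script. What follows is only a minimal sketch in Python. It assumes the job log contains an MAPDL launch line with a `-machines host1:n1:host2:n2` argument, as described in the MAPDL Parallel Processing Guide. The sample line and the function name `hosts_from_command_line` are made up for illustration and are not taken from the poster's logs.

```python
import re


def hosts_from_command_line(command_line: str) -> dict[str, int]:
    """Return {host: cores} parsed from a `-machines host:n:host:n` argument.

    Returns an empty dict when no `-machines` argument is present.
    """
    match = re.search(r"-machines\s+(\S+)", command_line)
    if match is None:
        return {}
    fields = match.group(1).split(":")
    # Fields alternate between host name and core count.
    return {fields[i]: int(fields[i + 1]) for i in range(0, len(fields) - 1, 2)}


# Hypothetical launch line, NOT copied from a real RSM job log.
sample = "ansys.exe -b -dis -mpi msmpi -machines node1:16:node2:16 -i input.dat"

allocation = hosts_from_command_line(sample)
print(allocation)
print(len(allocation), "hosts,", sum(allocation.values()), "cores")
```

If the parsed allocation names only one host, that would suggest the solver was launched on a single node, whatever the load monitor reports.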

    • ahmed.desoki
      Subscriber

       

      Hi @Mike

      Thanks for your reply. Please find the log files here.

      Thanks and regards

       

      • Mike Rife
        Ansys Employee

        Hi @ahmed.desoki

        OK, this is expected. Please see the MAPDL Parallel Processing Guide, chapter 4.1, table 4.2, which lists the supported MPI types. For MS Windows based computers, and this version of WB Mechanical with the MAPDL solver, a distributed parallel solve across multiple compute nodes is supported only with MS MPI on a Windows Server OS, using the Microsoft HPC job scheduler (freely available).

        Mike

         
