Fluids

CFX local parallel run hangs when writing results

    • jamesgr
      Subscriber

      Hi everyone,


      I am trying to run a parallel simulation of a centrifugal pump. I have already performed the simulation in serial, and it works as I expect. However, when I attempt to do a local parallel run using just the cores on my machine, the run hangs after it has finished calculating and right before writing results.


      I am using the Intel MPI Local Parallel start method with 6 processes (my machine has 8 cores). The relevant portion of the out file looks like this:



      SIMULATION CONTROL:
         EXECUTION CONTROL:
           EXECUTABLE SELECTION:
             Double Precision = No
             Large Problem = No
           END
           INTERPOLATOR STEP CONTROL:
             Runtime Priority = Standard
             MEMORY CONTROL:
               Memory Allocation Factor = 1.0
             END
           END
           PARALLEL HOST LIBRARY:
             HOST DEFINITION: jg847pc
               Host Architecture String = linux-amd64
               Installation Root = /ansys_inc/v%v/CFX
             END
           END
           PARTITIONER STEP CONTROL:
             Multidomain Option = Automatic
             Runtime Priority = Standard
             MEMORY CONTROL:
               Memory Allocation Factor = 1.5
             END
             PARTITION SMOOTHING:
               Maximum Partition Smoothing Sweeps = 100
               Option = Smooth
             END
             PARTITIONING TYPE:
               MeTiS Type = k-way
               Option = MeTiS
               Partition Size Rule = Automatic
               Partition Weight Factors = 0.16667, 0.16667, 0.16667, 0.16667,
                 0.16667, 0.16667
             END
           END
           RUN DEFINITION:
             Run Mode = Full
             Solver Input File = CFX.def
             Solver Results File =
               /home/jg847/CFD/Ansys/ICEM/Impeller_pending/dp0_CFX_1_Solution_1-5/CF
               X_004.res
           END
           SOLVER STEP CONTROL:
             Runtime Priority = Standard
             MEMORY CONTROL:
               Memory Allocation Factor = 1.0
             END
             PARALLEL ENVIRONMENT:
               Number of Processes = 6
               Start Method = Intel MPI Local Parallel
               Parallel Host List = jg847pc*6
             END
           END
         END
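
      For reference, I believe the equivalent launch straight from a terminal (rather than through Workbench) would look something like the line below, with the start-method string matching the entry in the listing above:

      # sketch only: assumes the same def file and start method as in the listing above
      cfx5solve -def CFX.def -part 6 -start-method "Intel MPI Local Parallel"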



      The difficult thing is that because it just hangs, there is not even an error message to go on. Even if I leave it for a while, there is no response. Checking the CPU usage shows that all 6 processes (solver-mpi.exe) are sitting at 100%, even though the calculation itself has finished. I was under the impression that writing the results file was largely the job of the master process, with the other partitions simply sending their data to the master.
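
      In case it helps with diagnosis, I suppose I could also look at what the processes are doing while they spin, along the lines of the commands below (solver-mpi.exe is the process name I see in top; attaching strace needs suitable permissions):

      # list the solver processes and what the kernel reports them waiting on
      ps -C solver-mpi.exe -o pid,stat,%cpu,wchan
      # attach to one of the reported PIDs and watch its system calls for a few seconds
      strace -p <PID>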


      As a side note, I am using Ubuntu, which Ansys does not actively support, so there is not much help available in that regard.


      Does anyone have any advice on how to fix this?


      Regards,


      James

    • DrAmine
      Ansys Employee

      Hi,


      Which version are you using?  What about RAM  usage? Can you run the case outside the WB?


      A.

    • jamesgr
      Subscriber

      Hi abenhadj!


      Thank you for taking the time to assist me. I really appreciate your help.


      I am using Ansys V18.2 and Intel MPI version 5.1.3.223. I have 64GB of RAM so I doubt that is the problem.


      I have just run it via command line using:


      cfx5solve -def Pump.def -par-local -partition 6

      However, the same issue persists. That is, the calculations complete as expected, with the out file showing that the maximum number of iterations has been reached. The final output in the out file is:


       +--------------------------------------------------------------------+
       |                     Variable Range Information                     |
       +--------------------------------------------------------------------+

       Domain Name : Rotating
       +--------------------------------------------+-----------+-----------+
       |      Variable Name                         |    min    |    max    |
       +--------------------------------------------+-----------+-----------+
       | Density                                    |  9.97E+02 |  9.97E+02 |
       | Specific Heat Capacity at Constant Pressure|  4.18E+03 |  4.18E+03 |
       | Dynamic Viscosity                          |  8.90E-04 |  8.90E-04 |
       | Thermal Conductivity                       |  6.07E-01 |  6.07E-01 |
       | Static Entropy                             |  0.00E+00 |  0.00E+00 |
       | Velocity u                                 | -1.40E+01 |  1.41E+01 |
       | Velocity v                                 | -1.11E+01 |  1.12E+01 |
       | Velocity w                                 | -7.41E+00 |  7.25E+00 |
       | Pressure                                   |  1.11E+04 |  1.20E+05 |
       | Temperature                                |  2.98E+02 |  2.98E+02 |
       +--------------------------------------------+-----------+-----------+

      After this, the solver is unresponsive, although CPU usage stays at 100% for all 6 solver-mpi.exe processes. Eventually I have to kill the processes to free up the cores. I have also tried running in verbose mode by changing the command-line options in start-methods.ccl; however, the output is not particularly helpful, and nothing further is printed once the solver starts, nor when it hangs. Regardless, I have attached the verbose output from the case in case it helps.
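
      One more knob I could try is Intel MPI's own diagnostics, which, if I understand correctly, are controlled through the I_MPI_DEBUG environment variable and should be inherited by the launcher, for example:

      # assumption: the mpirun started by cfx5solve picks up I_MPI_DEBUG from the environment
      export I_MPI_DEBUG=5
      cfx5solve -def Pump.def -par-local -partition 6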


      Do you happen to have any suggestions as to what I can do to troubleshoot this?


      Regards,


      James


       

    • DrAmine
      Ansys Employee

      Hi,


      Can you use some other MPI method and check for any output in your console?


      Can you check if it is case specific?


      As Ubuntu is not a supported platform, there is really not much more we can add here.


       


      Amine

    • jamesgr
      Subscriber

      Hi Amine,


      I have also tried the IBM MPI method. When I run it from the command line, I am prompted for my root password. Upon entering it, I get the following error:



      /usr/ansys_inc/v182/CFX/bin/linux-amd64/ifort/solver-mpi.exe: error while loading shared libraries: libmport.so: cannot open shared object file: No such file or directory
      /usr/ansys_inc/v182/CFX/bin/linux-amd64/ifort/solver-mpi.exe: error while loading shared libraries: libmport.so: cannot open shared object file: No such file or directory
      MPI Application rank 0 exited before MPI_Init() with status 127
      mpirun: Broken pipe
      An error has occurred in cfx5solve:

      The ANSYS CFX solver exited with return code 127.   No results file has
      been created.



      I guess I am missing the libmport.so necessary to run the IBM method. I do not have any other options for MPI methods.
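
      Before giving up on the IBM method, I suppose I could check whether the library exists elsewhere in the installation tree and, if it does, point the loader at it, roughly like this:

      # search the install root (taken from the error message) for the missing library
      find /usr/ansys_inc/v182 -name "libmport.so*" 2>/dev/null
      # if it turns up, add its directory to the loader path; the directory here is hypothetical
      export LD_LIBRARY_PATH=/path/to/directory/with/libmport:$LD_LIBRARY_PATH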


      As for checking whether it is case specific, I was hoping to find a benchmark case I could test against; however, I couldn't find any available to students. I have been referred to the customer portal login, but I am using Ansys through my university's license and as such did not receive any login details. Will I have to ask the staff member responsible for our Ansys licenses for this information?
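
      One thing I might still try is one of the example definition files that I understand ship with the CFX installation (assuming my install includes them), for example:

      # StaticMixer.def and the examples path are assumptions about a default install
      cfx5solve -def /usr/ansys_inc/v182/CFX/examples/StaticMixer.def -part 6 -start-method "Intel MPI Local Parallel"

      If that also hangs while writing results, at least I would know the problem is not specific to my pump case.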


      Regards,


      James
