General Mechanical

Issues with running ANSYS Structural on HPC cluster

    • helen.durand

      We are running ANSYS Structural simulations on an HPC cluster with a user interface. We are sometimes able to complete the simulation with multiple cores, and sometimes we are not. Specifically:

      1. For a 32-core job, we can run it successfully if we ask for 16 cores in the distributed option.
      2. If we ask for 7 cores or 32 cores in the distributed option for a 32-core job, the simulation becomes stuck at 10% completion of “Building the Mathematical Model” and never progresses.
      3. If we ask for 27 cores in the distributed option for a 32-core job, the simulation completes several load steps and then suddenly shuts down Structural with no warning or listed errors.

      Here are our questions:

      1. Why can we sometimes complete the simulation in a parallel fashion, and sometimes not? There are no security programs on the HPC cluster.
      2. What should be the maximum number of cores we can request with the distributed option for a job with 32 cores? Should we be able to request all 32, or do some need to be available for some “background” tasks of ANSYS?
      3. Why does the program sometimes crash and sometimes instead become stuck at 10% completion of “Building the Mathematical Model”?
      4. Is there any rule about the number of cores we can ask for in the distributed option compared to the total number of cores in the job (for example, does it need to be a divisor of the total number of cores)? If yes, why?
    • Mike Rife
      Ansys Employee
      What OS is being used by the cluster, and which job scheduler? Often these types of issues are OS dependent. Also, can you please define the term 'job'? Usually, when we say a solve is run on, say, 7 of the 32 physical CPU cores that the compute node has, we call that a 7-core job.
      The 'rules of thumb' for the number of CPU cores to use can depend on the type of physics being solved, the hardware being used, any possible bottleneck in the solution, etc. For example, we usually want to run the solution 'in-core', meaning the FEA matrices and vectors are kept wholly in RAM. But doing so may require running on more CPU cores than the FE model itself warrants, in order to check out enough compute nodes to get the needed RAM.
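      The in-core trade-off above can be illustrated with a minimal sketch. It assumes a scheduler that hands out whole compute nodes, and every name and number in it (`nodes_and_cores_for_in_core`, 300 GB of matrix memory, 128 GB and 32 cores per node) is hypothetical and not taken from Ansys documentation or from this cluster.

```python
import math


def nodes_and_cores_for_in_core(model_ram_gb, ram_per_node_gb,
                                cores_per_node, cores_wanted):
    """Estimate how many whole nodes (and therefore cores) must be
    checked out so the solver matrices fit in RAM.

    Assumption: the scheduler allocates whole nodes, so RAM can only
    be obtained in multiples of ram_per_node_gb.
    """
    # Nodes needed purely to hold the matrices and vectors in memory.
    nodes_for_ram = math.ceil(model_ram_gb / ram_per_node_gb)
    # Nodes needed purely to supply the cores the solve would like.
    nodes_for_cores = math.ceil(cores_wanted / cores_per_node)
    # The larger of the two requirements drives the request.
    nodes = max(nodes_for_ram, nodes_for_cores)
    return nodes, nodes * cores_per_node


# Hypothetical case: the model only warrants 16 cores, but needs 300 GB.
nodes, cores = nodes_and_cores_for_in_core(
    model_ram_gb=300, ram_per_node_gb=128,
    cores_per_node=32, cores_wanted=16)
print(f"Request {nodes} nodes ({cores} cores checked out)")
```

      In this made-up case the memory requirement, not the core count, decides the size of the request: three nodes are checked out even though a single node would supply enough cores.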