Ansys Products

Ansys Fluent running slow on HPC

    • mkhademi
      Subscriber

      Hello,

      I have a question about how to improve solution speed on our campus HPC cluster. I ran several Fluent cases on my own workstation with 48 CPU cores and then transferred them to the HPC. Running the same cases with 128 CPU cores on the HPC is even slower than running them on my workstation. I contacted our administrator, who offered the following explanation:

      "Fluent is a vendor supplied binary; like most proprietary software packages it is not compiled locally and is given to us basically as a "black box". It uses MPI for parallelization, but relies on its own MPI libraries shipped with the rest of the package (in general, MPI enabled packages want to run with the same MPI used for compilation). We have found that a number of proprietary packages appear to have been compiled with Intel compilers, and therefore (as the Intel compilers only optimize well for Intel CPUs, and are suspected of not optimizing well at all for AMD processors) do not perform well on AMD based clusters like the one on campus. But as we do not control the compilation of the package, there is little we can do about it."

      I am also using Open MPI for parallelization on the HPC. Do you think the AMD-based cluster could be the main reason for this issue? Can you suggest some general guidelines? For example, when I prepare the case and data files, should I create them with the same version of Ansys as the one on the cluster? Would it make any difference if I had Fluent read the mesh on the cluster, built the case there, and ran it, instead of preparing it on a Windows PC?

      Thank you so much,

      Mahdi.

       

    • Rob
      Ansys Employee

      How many cells are in the model? Are the cores all "real" as opposed to virtual, and is the data transfer "stuff" up to the task of passing all of the data? We've seen some hardware where running on around half of the cores gave far better performance because the data handling was lacking.
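
      One way to check the points above is a small scaling test: run the same case for a fixed number of iterations on several core counts, note the wall-clock time per iteration, and compare. The sketch below is only an illustration, not an Ansys tool. The function names are hypothetical, and the timings in the usage example are placeholders to be replaced with your own measurements.

```python
def cells_per_core(n_cells, n_cores):
    """Average number of mesh cells handled by each core."""
    return n_cells / n_cores


def scaling_table(timings):
    """Compute speedup and parallel efficiency from measured timings.

    timings: dict mapping core count -> seconds per iteration
             (measured by the user; nothing here is read from Fluent).
    Returns a list of (cores, speedup, efficiency) tuples, relative to
    the smallest core count in the dict.
    """
    base_cores = min(timings)
    base_time = timings[base_cores]
    rows = []
    for cores in sorted(timings):
        speedup = base_time / timings[cores]
        # Efficiency of 1.0 means ideal scaling from the baseline run.
        efficiency = speedup * base_cores / cores
        rows.append((cores, speedup, efficiency))
    return rows


if __name__ == "__main__":
    # Placeholder numbers for illustration only; substitute real timings.
    example_timings = {32: 4.0, 64: 2.0, 128: 2.5}
    for cores, speedup, eff in scaling_table(example_timings):
        print(f"{cores:4d} cores  speedup {speedup:5.2f}  efficiency {eff:5.2f}")
```

      If efficiency drops sharply between two core counts, or the run on half the cores is faster than the run on all of them, that points toward the interconnect, memory bandwidth, or virtual cores rather than the mesh being too small for the core count.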
