    • Joan Rico
      Good morning,
      I am running ANSYS Fluent CFD simulations on an HPC cluster.
      I am trying to simulate particle dispersion around an urban area, with a domain size of 10 x 10 km and 8M cells.
      This week I was ready to launch my final simulation on the cluster, but I have run into an unexpected issue.
      In all my testing so far, I have been running the simulation on 4 nodes with 20 tasks per node. Now, to get faster results for the real simulation, I wanted to increase the number of nodes to 8-12, but I just realised that when I do so, the simulation diverges and crashes after a few minutes.
      I first thought that it could be a problem with the mesh of my simulation, but then I do not understand why the simulation with 4(x20) runs without problems (currently running for 10h, see attached residuals evolution). On the other hand, I have tried launching exactly the same simulation (copy-paste) with 5, 6, 7, 8, 10, 12 and 16(x20) nodes, and they all crash after ~ 200 iterations (see also attached a typical profile). Does this make sense? Am I missing something?
      I tried to find information online about this kind of issue, but I didn't manage to find any explanation.
      Do you think it could be a parallelization issue? In that case, do you have any advice for me? Should I use more or fewer tasks per node when increasing the number of nodes?
      Thank you very much in advance.
      Details of my simulation: transient, LES, double precision, PISO pressure-velocity coupling, second-order upwind for momentum, second-order pressure.
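      For reference, the node counts listed above translate into the following average partition sizes. This is only a rough sketch: it assumes one solver process per task and a perfectly even split of the 8M cells, neither of which is confirmed in the post, and the function name is made up for illustration.

```python
# Rough partition-size arithmetic for the runs described in the post.
# Assumptions (not confirmed by the post): one solver process per task,
# and cells split evenly across all processes.
TOTAL_CELLS = 8_000_000   # mesh size quoted in the post
TASKS_PER_NODE = 20       # tasks per node quoted in the post


def cells_per_process(nodes, tasks_per_node=TASKS_PER_NODE, total_cells=TOTAL_CELLS):
    """Average number of cells each process would own."""
    return total_cells / (nodes * tasks_per_node)


if __name__ == "__main__":
    # Node counts tried in the post: 4 runs, the rest crash.
    for nodes in (4, 5, 6, 7, 8, 10, 12, 16):
        procs = nodes * TASKS_PER_NODE
        print(f"{nodes:>2} nodes x {TASKS_PER_NODE} tasks = {procs:>3} processes, "
              f"~{cells_per_process(nodes):,.0f} cells each")
```

      Under these assumptions, the working 4-node run has about 100,000 cells per process, while the 16-node run has about 25,000, so partition interfaces will typically make up a larger share of each subdomain as the node count grows.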
    • SRP
      Ansys Employee
      As the number of nodes increases, Fluent automatically decomposes your computational domain into more, smaller subdomains to distribute the computational load. If the domain decomposition is not appropriate, inefficient communication between nodes might produce instability or divergence.
      Troubleshooting parallel CFD simulations can be difficult, and it frequently involves a mix of modifying simulation parameters, optimising the domain decomposition, and taking hardware limits into account. When increasing the number of nodes, it is critical to test alternative configurations methodically and monitor performance to determine the underlying cause of the divergence and crashes.
      Thank you.
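      One way to organise such a methodical test is to generate the full matrix of node and tasks-per-node combinations up front, so that each run differs from the last in only one setting. The sketch below is illustrative only: the journal file `run.jou`, the hosts file `hosts.txt`, and the helper names are hypothetical, and the launcher options used (`3ddp`, `-g`, `-t`, `-cnf`, `-i`) should be checked against the Fluent version installed on the cluster.

```python
from itertools import product


def fluent_command(nodes, tasks_per_node, journal="run.jou", hostfile="hosts.txt"):
    """Build a batch-mode Fluent launch line for one configuration.

    journal and hostfile are hypothetical placeholder names.
    """
    total_processes = nodes * tasks_per_node
    # 3ddp: 3D double precision, -g: no GUI, -t: process count,
    # -cnf: hosts file, -i: journal file to execute.
    return f"fluent 3ddp -g -t{total_processes} -cnf={hostfile} -i {journal}"


def sweep(node_counts, tasks_options):
    """Return (nodes, tasks_per_node, command) for every combination."""
    return [(n, t, fluent_command(n, t))
            for n, t in product(node_counts, tasks_options)]


if __name__ == "__main__":
    # Start from the known-good 4 x 20 case and step outwards.
    for nodes, tasks, cmd in sweep((4, 5, 8), (10, 20)):
        print(f"{nodes} nodes x {tasks} tasks: {cmd}")
```

      Comparing, for example, 8 nodes x 10 tasks against 4 nodes x 20 tasks keeps the total process count at 80 while changing only the node count, which may help separate a partitioning effect from an inter-node communication effect.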
    • Joan Rico

      Thanks a lot for your answer!

