February 6, 2023 at 9:01 pm | rohant | Subscriber
I am working on a 2-phase flow simulation on an HPC cluster. The simplest case uses a single node with 24 cores and 8 GB of RAM per core (I reserved the entire node). When I start the simulation I see a wall-clock time per iteration of around 0.017 s. Visually, the console shows quick residual printouts per iteration that match this speed. Over time, however, the printouts in the console slow down. When I check the Parallel -> Usage tab I see roughly the same wall-clock time per iteration, but when I time the iterations with a stopwatch I get a completely different (much higher) value. If I pause and restart the simulation, the console printouts appear to speed up again.
What's the issue here? I originally had monitors running every 100 iterations. I ran a test without monitors for some time and saw the same phenomenon. On my workstation I did not see any noticeable slowdown of the residual printouts in the console. Is there something to look out for when running on HPC that could cause this? Any insight would be helpful, thank you.
February 7, 2023 at 12:28 pm | DrAmine | Ansys Employee
Can you please add screenshots describing the issue you are currently facing?
What do you mean by pause and restart: are you restarting in a new Fluent session?
Which release are you using?
February 7, 2023 at 3:08 pm | rohant | Subscriber
It is tough to show exactly what I mean, since the time/iteration and the Parallel -> Timer -> Usage -> "Average wall-clock time per iteration" values are roughly the same throughout the simulation. However, when I watch the console in real time it is clear that there is a slowdown.
"Slowed Down" Speed: [screenshot not preserved]
By pause and restart I mean clicking "stop at the end of time-step", waiting for the simulation to stop, and then pressing Calculate again. I am sure that we would see the same speed increase after saving the case and data and reopening them in another instance of Fluent.
I am using 2021 R2.
I am not sure whether this is a Linux cache issue as well. Have you seen anything similar on Linux HPC systems?
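To make the stopwatch comparison concrete, here is a minimal Python sketch that contrasts an average taken over the whole run with an average taken over only the most recent iterations. The durations are made up for illustration, and the function names are hypothetical. Whether Fluent's timer averages this way is an assumption that this thread does not confirm; the sketch only shows that a whole-run average can stay low while recent iterations are much slower.

```python
def cumulative_average(durations):
    """Average over everything seen so far, after each iteration."""
    averages = []
    total = 0.0
    for count, d in enumerate(durations, start=1):
        total += d
        averages.append(total / count)
    return averages


def window_average(durations, window):
    """Average over only the most recent `window` iterations."""
    averages = []
    for end in range(1, len(durations) + 1):
        recent = durations[max(0, end - window):end]
        averages.append(sum(recent) / len(recent))
    return averages


# Made-up durations (seconds): 900 fast iterations, then 100 that are 5x slower.
durations = [0.017] * 900 + [0.085] * 100

print(f"cumulative average at end: {cumulative_average(durations)[-1]:.4f}")
print(f"last-50 average at end:    {window_average(durations, 50)[-1]:.4f}")
```

Logging a time stamp every N iterations and computing the windowed value is one way to check a stopwatch impression against the reported average.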
February 8, 2023 at 9:24 am | DrAmine | Ansys Employee
When you refer to the speed slowing down, are you referring to the time/iter column?
If so, please do not rely on it. The definitive test is to run the case from a journal in batch mode and insert time stamps so that you can compare specific parts of the journal. You can also use the benchmark command, or create Scheme variables that store the time before and after a section and print the accumulated time. A slowdown can have several causes: software, OS, or hardware.
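The before/after timing pattern described here can be sketched outside Fluent. The Python below is only an illustration of the idea, not Fluent's API or journal syntax: `SectionTimer` and `run_block` are hypothetical names, and `run_block` is a stand-in for whatever a section of the journal does.

```python
import time


class SectionTimer:
    """Accumulate wall-clock time per named section."""

    def __init__(self):
        self.totals = {}

    def measure(self, name, func, *args):
        start = time.perf_counter()            # time stamp before the section
        result = func(*args)
        elapsed = time.perf_counter() - start  # time stamp after the section
        self.totals[name] = self.totals.get(name, 0.0) + elapsed
        return result

    def report(self):
        return {name: round(total, 3) for name, total in self.totals.items()}


def run_block(n_iterations):
    # Hypothetical stand-in for a block of solver iterations.
    total = 0
    for i in range(n_iterations):
        total += i * i
    return total


timer = SectionTimer()
for _ in range(3):
    timer.measure("iterate", run_block, 10_000)
timer.measure("write-data", time.sleep, 0.01)  # stand-in for an I/O step
print(timer.report())
```

Accumulating per section, rather than timing the whole run, shows whether the extra time sits in the iterations themselves or in steps such as file output.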
February 9, 2023 at 3:49 pm | rohant | Subscriber
I output /parallel/timer/usage and (benchmark'(iterate 10)) every 1000 time steps to see where the issue could be occurring. I also compared running in batch mode with running in the cluster's OpenOnDemand GUI. I have typically been running interactive Fluent in the cluster GUI, since it makes it easier to see simulation progress over time. The GUI appears to have a similar parallel usage time but higher benchmark values. The following outputs were taken after the same number of time steps (1999):
I am not sure about the meaning of the benchmark values (cpu-time, solver, elapsed). Can you please let me know? I haven't found much documentation on this online.
February 10, 2023 at 7:08 am | DrAmine | Ansys Employee
Elapsed time is the total time, including all I/O and solver time. It seems that the batch run requires much less time than the GUI run for the 10 iterations. The other output is not very helpful, because you recorded it for different numbers of iterations (or your time steps required completely different numbers of outer iterations), but it does not show huge differences.
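As a general illustration of why elapsed time can exceed CPU time, the Python sketch below measures both for a compute-bound call and for a call that only waits. This shows the generic distinction between the two clocks, not Fluent's documented definition of its benchmark fields; `measure`, `busy_work`, and `waiting_work` are hypothetical names, and the sleep is a stand-in for waiting on I/O.

```python
import time


def measure(func, *args):
    """Return (cpu_seconds, elapsed_seconds) for one call of func."""
    cpu_start = time.process_time()
    wall_start = time.perf_counter()
    func(*args)
    cpu = time.process_time() - cpu_start
    elapsed = time.perf_counter() - wall_start
    return cpu, elapsed


def busy_work(n):
    # Pure computation: CPU time and elapsed time should be close.
    return sum(i * i for i in range(n))


def waiting_work(seconds):
    # Stand-in for waiting on I/O: the process sleeps and uses almost no CPU.
    time.sleep(seconds)


cpu_busy, elapsed_busy = measure(busy_work, 200_000)
cpu_wait, elapsed_wait = measure(waiting_work, 0.2)
print(f"busy: cpu={cpu_busy:.3f}s elapsed={elapsed_busy:.3f}s")
print(f"wait: cpu={cpu_wait:.3f}s elapsed={elapsed_wait:.3f}s")
```

A large gap between the two values points to time spent waiting rather than computing.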
You could run a standard case as a benchmark on your resources to check whether the timings you are getting are appropriate.
Back to your first question about starting, slowing down, stopping, and then continuing: it is hard for me to debug that from here.
© 2023 Copyright ANSYS, Inc. All rights reserved.