Influence of number of cores and processor speed (GHz) on thermal simulations in ANSYS APDL

rodrigo_searchs (Member)


I am facing a problem that I cannot find clearly addressed in the Ansys documentation, so I hope someone here on the forum can give me some information.

I currently work with 2 computers with the following configurations:


RYZEN 2700 with 8 cores at 3.2GHz with 16 GB of RAM


XEON GOLD 5120 with 28 cores at 2.2GHz with 64 GB of RAM

Both have hyperthreading disabled, the simulations run in-core, and the licenses are set up correctly.

My simulation is a transient non-linear thermal analysis using SOLID70 and SURF152 elements, with ~210,000 nodes and ~270,000 elements. Conductivity and enthalpy are functions of temperature, and I also use element birth and death. I use the JCG solver with default settings (I already tested the sparse solver and it is slower).

The point is that, with 8 cores (SMP) on both PCs, the Ryzen machine is approximately 30% faster ("Elapsed time spent computing solution(s)" in the output file).

If I increase the number of cores on the Xeon to 16, the difference remains the same, or even gets slightly worse.

I know that after a certain point, increasing the number of cores does not decrease computational time.

So, my questions:

1- Is there a way to predict the optimal number of cores for a simulation without running manual tests? (Like running with 4, 8, and 16 cores, noting the times, and plotting a curve.)

2- Does the processor clock (GHz) have that much influence on the solution? Because when I use 8 cores on both PCs, the Ryzen solves much faster.

3- The main reason for my doubts is that a Ryzen desktop is currently faster than an expensive Xeon workstation. Are we wasting money here?



  • huwang (Forum Coordinator)

    Your model, ~210K nodes and ~270K elements, for a thermal analysis (1 DOF per node) is too small to take advantage of distributed MAPDL on a large number of cores.


    Simulation has too few DOF (degrees of freedom) — Some analyses (such as transient analyses) may require long compute times, not because the number of DOF is large, but because a large number of calculations are performed (that is, a very large number of time steps). Generally, if the number of DOF is relatively small, parallel processing will not significantly decrease the solution time. Consequently, for small models with many time steps, parallel performance may be poor because the model size is too small to fully utilize a large number of cores.
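    A rough way to quantify that diminishing return is plain Amdahl's law (not Ansys-specific; the serial fraction here is a hypothetical value you would fit from a couple of your own timing runs):

```python
# Amdahl's law: speedup on n cores when a fraction s of the work is serial.
# The serial fraction 0.2 below is an illustrative assumption, not measured.
def amdahl_speedup(n_cores: int, serial_fraction: float) -> float:
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_cores)

for n in (1, 4, 8, 16, 28):
    print(n, round(amdahl_speedup(n, 0.2), 2))
```

With 20% serial work, 16 cores give only about 1.2x over 8 cores, which matches the pattern of adding cores without gaining much time.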

    For the "XEON GOLD 5120 with 28 cores at 2.2GHz with 64 GB of RAM with hyperthreading disabled": assuming it is a dual-socket workstation, it has 2x6 = 12 memory channels, which should all be populated with equal-size memory modules for maximum memory bandwidth. 64 GB does not seem like a balanced RAM configuration; 12x8GB, 12x16GB, etc. are recommended. The Xeon should still be faster for larger models (in the millions of DOF).
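    A back-of-the-envelope view of why DIMM population matters (DDR4-2400 is assumed here for the Xeon Gold 5120, and the 4-channel case is a hypothetical partially populated layout; check your actual DIMM speed):

```python
# Peak DDR4 bandwidth estimate: channels * transfer rate (MT/s) * 8 bytes.
# DDR4-2400 is an assumption for illustration; verify against your DIMMs.
def peak_bandwidth_gbs(channels: int, mts: int = 2400) -> float:
    return channels * mts * 8 / 1000.0  # GB/s

print(peak_bandwidth_gbs(12))  # all 12 channels populated: 230.4 GB/s
print(peak_bandwidth_gbs(4))   # only 4 channels populated: 76.8 GB/s
```

Sparse-matrix solvers are typically memory-bandwidth bound, so leaving two thirds of the channels empty costs far more than the raw GHz numbers suggest.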

  • mrife (Forum Coordinator)

    @rodrigo_searchs SMP parallel stops scaling at around 4-6 CPU cores depending on the make/model of the CPU. My guess is that on either system you won't see any difference if you step back a few cores. And all things being equal, which they are not, that Ryzen has a roughly 45% faster base clock. So that is the difference in the solve times.
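    For reference, the base-clock ratio (base clocks only; real boost behavior will differ) works out to:

```python
ryzen_ghz, xeon_ghz = 3.2, 2.2
print(f"{ryzen_ghz / xeon_ghz:.2f}x")  # ~1.45x higher base clock
```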


  • rodrigo_searchs (Member)

    Thank you for the explanation!

    Let's go point by point:

    1: You are right, 210K nodes and 270K elements with 1 DOF (TEMP).

    2: My next models will reach at most 1 million nodes, so the parallel advantage will increase a little.

    3: I have a LOT of time steps. Since my thermal history is important and influences everything, I am simulating 20 minutes of my process in time steps of 0.3 seconds.

    4: You are right, it's dual-socket, sorry about that. And the person who bought the workstation thinks he knows everything about PCs, so the RAM is VERY unbalanced: it's 4x16GB.
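    For scale, the time-stepping described in point 3 implies thousands of full nonlinear solves:

```python
# Rough count of time steps implied by the setup above.
process_time_s = 20 * 60   # 20 minutes of simulated process time
dt_s = 0.3                 # time-step size in seconds
n_steps = round(process_time_s / dt_s)
print(n_steps)             # 4000 steps, each a nonlinear thermal solve
```

This is exactly the "small model, many time steps" regime where per-step serial overhead dominates and clock speed matters more than core count.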

    Do you know of any paper that demonstrates a way of predicting adequate use of the system (number of cores and amount of RAM), based on the number of DOFs and the physics of the problem (non-linear, transient, etc.)?

    Or is it just that each case is different and I have to do my own tests every time (that's how I always do it)?

    In summary, I believe that for the type of simulations I run, processor speed should be given more priority.

    And just for future reference: going from SMP (shared memory) to the Distributed Ansys option decreased the time by almost half!

  • rodrigo_searchs (Member)


    Thanks for the reply.

    Actually, with SMP I had good scaling up to 8 cores. And you are right, frequency plays a huge role for smaller simulations with many time steps. But as I said in the other reply, going from SMP (shared memory) to the Distributed Ansys option decreased the time by almost half! However, it uses a lot more RAM.
