Mike Rife
Ansys Employee

Hi Matt

OK, that model is very, very small as far as GPU acceleration of a solve goes. Without knowing anything about the cluster, other than that the compute nodes have some model of Intel i7 CPU, I'd suggest running a test to see what one GPU actually buys you.

First, change the model so that it is not solving all 3050 sub-steps. We only need to solve a few in order to compare compute performance, so change the load setup to solve maybe 10 sub-steps, or maybe just the first 2-3 load steps.

Next, I usually start with 50,000 degrees of freedom (DOF) per CPU core as a baseline test. If the CPU were a leading-edge model I'd take that down to around 30,000. With 50k DOF per core I'd try solving on 8 CPU cores to start (I also prefer even numbers!). When that is done, make a copy of the output and pcs files, then solve again on 8 CPU cores plus 1 of the GPUs and make a copy of the resulting output and pcs files. Lastly, try 4 CPU cores and 1 GPU. Save those files too, then report back the total CPU time and the total elapsed time for each solution.
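If it helps with the bookkeeping, here is a rough Python sketch of the arithmetic behind that test plan. It is not an Ansys tool: the function names `suggested_cores` and `compare_runs`, the `elapsed_s` key, and the round-up-to-even rule are just my own illustration of the 50k-DOF-per-core rule of thumb and the three-way comparison. The timings have to be read by hand from your output files.

```python
import math


def suggested_cores(total_dof, dof_per_core=50_000):
    """Estimate a starting CPU core count from the model's total DOF.

    Uses the rule of thumb of roughly 50,000 DOF per core (try 30,000
    for a leading-edge CPU) and rounds odd results up to an even count.
    """
    cores = max(1, math.ceil(total_dof / dof_per_core))
    return cores + (cores % 2)  # bump an odd count to the next even number


def compare_runs(runs):
    """Return the elapsed-time speedup of each run relative to the first.

    `runs` maps a label (e.g. "8 cores") to a dict holding "elapsed_s",
    the total elapsed time in seconds copied from that run's output file.
    A value above 1.0 means the run finished faster than the baseline.
    """
    baseline_label = next(iter(runs))  # first entry is the CPU-only baseline
    baseline = runs[baseline_label]["elapsed_s"]
    return {label: baseline / run["elapsed_s"] for label, run in runs.items()}
```

For example, a hypothetical 400,000 DOF model gives `suggested_cores(400_000) == 8`. Once the three runs are done, pass their elapsed times to `compare_runs` with the 8-core CPU-only run listed first; if the GPU runs come back at or below 1.0, the model is too small for the GPU to pay off.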