
Hardware resource optimization

    • Kaisarbek Omirzakhov

      I am using Lumerical for 3D FDTD simulations. I have the following
      problem when running 2 simulations at the same time.

      Simulation 1 alone using 32 cores takes 10 hours.
      Simulation 2 alone using 32 cores takes 10 hours.
      When simulation 1 and simulation 2 are running at the same time using
      32 cores each, it takes 20 hours for each one to finish.

      Is there a way to optimize the resources for parallel simulations?
      Do you have suggestions for optimizing resources for big simulations?

      Here are the details of the simulation machine:
      OS - Red Hat Enterprise Linux Server 7.9
      Memory - 252 GiB
      Processor - Intel Xeon(R) Gold 6314U CPU @ 2.3 GHz x 64
      GNOME - v 3.28.2
      OS type - 64-bit
      Disk - 1.6 TB

    • Guilin Sun
      Ansys Employee

      I guess that you have only one engine license. If so, since one engine license allows up to 32 cores, it is reasonable that when the two simulations run at the same time, each needs twice as long as when running alone. Please refer to

      and check your license.


      Please also check whether one simulation really needs 32 cores. Maybe 16 or even 8 is sufficient. Because parallel scaling is limited, not every simulation benefits from more than 4 or 8 cores: the simulation is divided into blocks (each core simulates one block of the original file), and the data communication among blocks also takes time. When that communication time is not negligible, the speedup deviates from linear scaling, and the more processes are used, the further the speedup falls short of the ideal that is proportional to the number of cores. The simulation efficiency therefore declines. It might be better to run the two files at the same time with only 16 cores each. Please set the cores/processes appropriately under "Resources" after some testing.
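
      One way to do the testing suggested above is to time the same file at several core counts and compute speedup and parallel efficiency. The following is a minimal sketch in Python. Only the 32-core, 10-hour figure comes from this thread; the other timings and the helper name `scaling_table` are hypothetical and only illustrate the calculation.

```python
def scaling_table(timings):
    """Return (cores, speedup, efficiency) rows from measured run times.

    timings: dict mapping core count -> wall-clock hours.
    Speedup and efficiency are relative to the smallest core count measured.
    """
    base_cores = min(timings)
    base_time = timings[base_cores]
    rows = []
    for cores in sorted(timings):
        speedup = base_time / timings[cores]   # how much faster than the baseline run
        ideal = cores / base_cores             # speedup under perfect linear scaling
        rows.append((cores, speedup, speedup / ideal))
    return rows


# Hypothetical measurements, except 32 cores -> 10 h, which is from the thread.
measured = {4: 40.0, 8: 22.0, 16: 14.0, 32: 10.0}

for cores, speedup, efficiency in scaling_table(measured):
    print(f"{cores:>2} cores: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

      If efficiency drops sharply between two core counts, the lower count is probably the better choice when several simulations have to share the machine.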

    • Kaisarbek Omirzakhov

      Hi Guilin!

      Thanks for your feedback. I have figured out the issue. I will summarize my findings here for other people who face a similar problem.

      1. I have enough licenses, so licensing does not affect the simulation time.
      2. In short, the bottleneck for simulation speed is memory bandwidth. As you mentioned, increasing the number of cores does not reduce the simulation time linearly. In my case, 32 cores give the fastest simulation time for a specific design. When I simulate another similar design in parallel, using another 32 cores on the same machine, the simulation speed drops by half. This is because the traffic between CPU and RAM was already at its cap for a single simulation, so adding a second simulation in parallel makes each one take about twice as long.
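
      The numbers above fit a very simple model in which run time stays flat until the combined bandwidth demand exceeds what the machine can supply, and then grows in proportion to the excess. This is a rough sketch, not a measurement. The function name `concurrent_runtime` and the parameter `solo_bw_fraction` are made up for illustration, and the model assumes identical jobs that are limited only by memory bandwidth.

```python
def concurrent_runtime(solo_hours, solo_bw_fraction, n_jobs):
    """Estimate per-job run time when n_jobs identical jobs share one machine.

    solo_hours:        run time of one job alone.
    solo_bw_fraction:  share of the machine's memory bandwidth one job
                       uses when running alone (assumed, between 0 and 1).
    n_jobs:            number of identical jobs running at the same time.
    """
    demand = solo_bw_fraction * n_jobs   # total demand relative to capacity
    slowdown = max(1.0, demand)          # no penalty until bandwidth saturates
    return solo_hours * slowdown


# Case from this thread: one job already saturates the bandwidth.
print(concurrent_runtime(10, 1.0, 2))   # 20.0 hours each, no gain over running back to back

# Hypothetical case: each job needs only half the bandwidth.
print(concurrent_runtime(10, 0.5, 2))   # 10.0 hours each, running together pays off
```

      Under this model, running two saturating jobs together finishes both at 20 hours, the same total as running them one after the other.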

      I hope this helps other people as well. I am attaching some useful links from the Ansys website regarding performance optimization.

      Information on Hardware Specifications

      Getting the Best FDTD Performance



    • Guilin Sun
      Ansys Employee

      Thank you for your summary. Memory bandwidth is indeed one main factor that affects simulation speed.

      If you have enough licenses and enough cores (it seems you have 64), the simulations should not slow down, since they should use different cores. On clusters, you can use different nodes, with 32 cores used on each node.
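
      To make sure two simulations really do use different cores, the CPU IDs can be split into disjoint groups before the jobs are launched. The following Python sketch only computes the groups. The helper name `split_cores` is hypothetical, and the assumption that the machine exposes CPU IDs 0 to 63 is a guess based on the "x 64" in the hardware list above. How the groups are handed to the solver is left open here.

```python
def split_cores(core_ids, n_jobs):
    """Split CPU IDs into n_jobs disjoint, contiguous, near-equal groups."""
    core_ids = sorted(core_ids)
    size, extra = divmod(len(core_ids), n_jobs)
    groups, start = [], 0
    for job in range(n_jobs):
        # The first `extra` groups get one additional CPU when the split is uneven.
        end = start + size + (1 if job < extra else 0)
        groups.append(core_ids[start:end])
        start = end
    return groups


# Assumed layout: 64 CPU IDs numbered 0..63, shared by two simulations.
groups = split_cores(range(64), 2)
print(groups[0][0], groups[0][-1])   # first and last CPU ID for simulation 1
print(groups[1][0], groups[1][-1])   # first and last CPU ID for simulation 2

# On Linux, a group could be applied to a running process with
# os.sched_setaffinity(pid, set(group)); that step is not performed here.
```

      Separate cores do not add memory bandwidth, so pinning alone may not remove the slowdown described earlier in this thread.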

      Please explore more.
