Running concurrent Lumerical simulations on cluster scales terribly

    • Pernilla Ekborg-Tanner


      I'm running Lumerical on an HPC cluster and get terrible scalability when running multiple concurrent simulations. If I run something like this on a compute node:

      mpirun -n 3 ~/opt/lumerical/v231/bin/fdtd-engine-ompi-lcl -t 1 test1.fsp &

      mpirun -n 3 ~/opt/lumerical/v231/bin/fdtd-engine-ompi-lcl -t 1 test2.fsp &

      I can see that 6 processes are started, each running at about 50% CPU efficiency. If I were to run 4 simulations at once, the CPU usage would drop to 25%. As a result, the wall time per simulation is greatly increased. I run on a single node, and this happens regardless of how many cores I have available. I do not have the same issue on my local workstation, where I use the built-in MPICH2 and the MPICH2-nem FDTD engine. Are there any known issues with OpenMPI? Do you have any idea what might be the problem? One of the cluster technicians and I have been working on this for weeks.

      The cluster runs RHEL 8 with OpenMPI 4.1.4, and I run Lumerical 23.1.
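      For context, one common cause of exactly this 50%/25% pattern is OpenMPI's default process binding: each mpirun invocation binds its ranks to cores starting from the same place, unaware of other concurrent jobs, so two jobs end up pinned to the same cores. A minimal sketch of how to check and work around this, assuming that is the culprit (the `--bind-to` and `--report-bindings` flags are standard OpenMPI 4.x options; the engine path and file names are taken from the post above):

```shell
ENGINE=~/opt/lumerical/v231/bin/fdtd-engine-ompi-lcl

# First, inspect where each rank actually lands; overlapping core
# lists across the two jobs would confirm the binding conflict:
#   mpirun -n 3 --report-bindings "$ENGINE" -t 1 test1.fsp

# Workaround sketch: disable binding so the OS scheduler can spread
# the ranks of both jobs across all available cores.
mpirun -n 3 --bind-to none "$ENGINE" -t 1 test1.fsp &
mpirun -n 3 --bind-to none "$ENGINE" -t 1 test2.fsp &
wait
```

      Alternatively, concurrent jobs can be kept bound but steered to disjoint cores with OpenMPI's `--cpu-set` option, which avoids giving up binding entirely.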

      Thanks in advance and best regards,

      Pernilla Ekborg-Tanner

    • Lito Yap
      Ansys Employee

      @Pernilla Ekborg-Tanner,

      Have you tried running with the bundled MPICH2 nemesis on the cluster?

      Or launch all your simulations with a single invocation of the FDTD executable (e.g. to run 4 simulations concurrently on 1 node with 16 cores):

      /[installpath]/lumerical/v231/bin/fdtd-engine-mpich2nem -t 4 sweepfile1.fsp sweepfile2.fsp sweepfile3.fsp sweepfile4.fsp 

      Or run with Intel MPI: > Running simulations using terminal in Linux (Using Intel MPI) – Ansys Optics 

      Otherwise, the best way to run concurrent FDTD simulations is to run 1 simulation on 1 machine/node. 
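      The one-simulation-per-node approach above is typically expressed through the cluster's batch scheduler. A purely illustrative sketch, assuming a Slurm scheduler (not mentioned in the thread; adapt the directives to your site's batch system), that runs each sweep file as its own single-node job via a job array:

```shell
#!/bin/sh
#SBATCH --nodes=1          # one node per simulation
#SBATCH --ntasks=3         # 3 MPI ranks, as in the original commands
#SBATCH --array=1-4        # one array task per sweep file

# Each array task gets a whole node to itself, so concurrent
# simulations can no longer contend for the same cores.
mpirun -n 3 ~/opt/lumerical/v231/bin/fdtd-engine-ompi-lcl \
    -t 1 "test${SLURM_ARRAY_TASK_ID}.fsp"
```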

