Fluids

Running Fluent 2022R2 across multiple nodes

    • dheryadi
      Subscriber

      Hello,

      I manage Ansys installation and administration at our site. Our cluster runs RHEL 8 and uses Univa Grid Engine (UGE) 8.6.18 as its job scheduler. Our end users report that they cannot run Fluent across multiple nodes: all Fluent processes run on a single node instead of being distributed across the allocated nodes. I understand that Fluent ships with its own MPI implementation, and I was wondering whether this is a known issue with this software on RHEL 8 and Grid Engine 8.6. Any suggestions on how to resolve this (e.g. command-line options to use) would be greatly appreciated.

       

      Thanks. 

    • Hunter Wang
      Ansys Employee

      If you use custom SGE scripts instead of relying on the standard Fluent option (either the -scheduler=sge option from the command line or the Use SGE option in Fluent Launcher, as described in the preceding sections), your environment variables related to the job scheduler will not be used unless you include the -scheduler_custom_script option with the Fluent options in your script.

      Also try -scheduler_tight_coupling, which internally invokes -scheduler_custom_script.

    • dheryadi
      Subscriber
      Thank you for the response.
       
      I'm using the standard fluent option, like so:
               fluent 3ddp -g -i journal_file -t$NSLOTS
       
      where NSLOTS is the core count reserved in the job script. This option used to work in the older version of SGE (8.6.2), but not the later one that we're running (SGE 8.6.18).  I'm currently testing Grid Engine 8.7.2 and the same Fluent issue is unfortunately encountered in this version. According to this table (https://www.ansys.com/content/dam/it-solutions/platform-support/2022-r2-job-schedulers-queuing-systems-support.pdf), Fluent supports SGE 8.6. This support is probably for the earlier release of 8.6, not the very last one (8.6.18). 
       
      Please note that I was able to run the following Ansys 2022R2's mpitest command across multiple hosts:
                 mpitest222  -mpi openmpi -scheduler=sge -np $NSLOTS
       
      Using our own OpenMPI installation, I was also able to run another Ansys 2022R2 product (called CFX) across multiple hosts with the following command:
                      cfx5solve -parallel -part $NSLOTS -par-local -start-method "Open MPI Local Parallel" . . .
       
      I've tried many different Fluent options, but to no avail.  
    • Hunter Wang
      Ansys Employee

      I was able to replicate the issue you reported. It was fixed by adding either -scheduler_tight_coupling or -scheduler_custom_script.

      Change your:

      fluent 3ddp -g -i journal_file -t$NSLOTS

      to:

      fluent 3ddp -g -i journal_file -t$NSLOTS -scheduler_tight_coupling

      or:

      fluent 3ddp -g -i journal_file -t$NSLOTS -scheduler_custom_script

      The issue should not be related to the OS version or the UGE version, but to the Fluent version. Both flags work for 2021 R2 and newer releases.
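      For reference, the change above could sit in a Grid Engine job script along the following lines. This is only a sketch under assumptions: the parallel environment name (`mpi`), the slot count, and the job name are placeholders that depend on the site configuration, and `build_fluent_cmd` is a hypothetical helper used here to print the command before running it.

```shell
#!/bin/bash
#$ -cwd
#$ -N fluent_job          # placeholder job name
#$ -pe mpi 64             # placeholder PE name and slot count; site-specific

# Hypothetical helper: assemble the Fluent command line from the thread,
# with the tight-coupling flag appended.
build_fluent_cmd() {
    local journal="$1" slots="$2"
    printf 'fluent 3ddp -g -i %s -t%s -scheduler_tight_coupling\n' "$journal" "$slots"
}

# NSLOTS is set by Grid Engine inside a job; fall back to 1 outside one.
cmd=$(build_fluent_cmd journal_file "${NSLOTS:-1}")
echo "Running: $cmd"

# Uncomment on the cluster to actually launch Fluent.
# eval "$cmd"
```

      Printing the command first makes it easy to confirm from the job's output file that the flag and the slot count were passed as intended.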

       

    • dheryadi
      Subscriber

      Thank you so much for the suggestions. The '-scheduler_tight_coupling' flag (with the default Intel MPI) appears to work on both versions of Grid Engine: 8.6.18 and 8.7.2.
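      One way to check what the scheduler actually allocated, before comparing it with where the Fluent processes end up, is to summarize `$PE_HOSTFILE` inside the job. The sketch below assumes the usual Grid Engine host file layout (one line per host: hostname, slots, queue, processor range); `summarize_pe_hosts` is a hypothetical helper name.

```shell
# Summarize the hosts Grid Engine allocated to this parallel job.
# Assumes each line of the host file reads: hostname slots queue processor-range
summarize_pe_hosts() {
    awk '{ hosts++; slots += $2 }
         END { printf "%d hosts, %d slots\n", hosts, slots }' "$1"
}

# PE_HOSTFILE is only set inside a parallel-environment job.
if [ -n "${PE_HOSTFILE:-}" ] && [ -r "$PE_HOSTFILE" ]; then
    summarize_pe_hosts "$PE_HOSTFILE"
fi
```

      If this reports several hosts while all Fluent processes still appear on one node, the allocation is fine and the problem lies in how Fluent is launched, which is the case the flag addresses.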
