Explicit dynamics on HPC

Hi,

I run Mechanical models on the HPC cluster with no problem. In the client RSM configuration, the shared-memory parallel parameter is "thread" and the distributed-parallel parameter is "openmpi".

However, I can't run Explicit Dynamics simulations on more than one core on the cluster. If I request more than one core, the simulation does not start: the attached solution output shows that the pre-solver steps all complete, but the solve itself never starts. With one core on the cluster it runs fine, so something seems to be wrong with the distributed-parallel setting.

Do I need to change the openmpi parameter, or is something else not set up correctly? Thanks.

Comments

  • tsiriaks Forum Coordinator
    edited December 2019

    Ali,

    Can you post the full RSM Job Report inline as text (or as screenshots if the forum doesn't allow you to do so)?

    Thanks,

    Win

  • alitabei Member
    edited December 2019

    Hello Win, 

    Please find below a screenshot of the part of the RSM report with errors/warnings.

    Thanks

  • tsiriaks Forum Coordinator
    edited December 2019

    Ali,

    Please post the entire RSM Job Report, including the line numbers.

    Thanks,

    Win

  • alitabei Member
    edited December 2019

    Win, 

    Since it won't fit here, I am posting a public link to the full RSM report. Can you open it?

    Please let me know if I should provide it to you via other methods. 

    Thanks


  • tsiriaks Forum Coordinator
    edited December 2019

    Ali,

    Sorry, we are not allowed to open and download it. 

    How many lines are in this report? If it's only a few hundred, can you take screenshots and post them here? Otherwise, please post a few screenshots of the very top part of the report.

    Thanks,

    Win

  • alitabei Member
    edited December 2019

    Win, 

    Please see them below:

  • tsiriaks Forum Coordinator
    edited December 2019

    Thank you for the info.

    Does this compute-j-1 node have 32 cores?

    Can you try with 4 cores first?

    Also, for your static structural analysis, please check whether it also solves on the same compute node (compute-j-1) without issue.

    Thanks,

    Win

  • alitabei Member
    edited December 2019

    Hello Win, 

    Yes, compute-j-1 has 32 cores. My queue has two other compute nodes, each with 32 cores. I successfully ran a structural analysis on either of those nodes (the RSM report screenshot is below).

    The Explicit Dynamics run on 4 cores also fails (the screenshots are again below).

    I noticed one thing: my desktop has 8 cores, and when an explicit job finishes on my desktop it reports that it used Intel MPI (screenshot below), while for my structural analyses on the cluster I am asking RSM to use OpenMPI (screenshot below). Does the Autodyn solver only distribute the solution with Intel MPI? Maybe I need to change this in my RSM configuration?

    Thanks for your help.

    The structural job on 33 cores on the cluster:

    And the explicit job on 4 cores, which again failed:

  • tsiriaks Forum Coordinator
    edited December 2019

    Thanks for the info, Ali.
    It seems that explicit dynamics + RSM adds the switch -mpi -ibmmpi to the submission command, while your static structural job does not have this switch (you can see this in the 'Running Solver' line of each RSM Job Report). So your static structural analyses must be using OpenMPI correctly, but your explicit dynamics analyses are using IBM MPI, which runs into this issue; a quick way to check the report for that line is sketched at the end of this post. I will have to do a bit of research on how to control the MPI for explicit dynamics analyses, and I will get back to you.

    Thanks,

    Win
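
    A minimal sketch of that check, assuming the RSM Job Report has been saved as a plain-text file (the filename below is just a placeholder):

        # Scan a saved RSM Job Report for the 'Running Solver' line and show
        # which MPI-related keywords appear in the solver command.
        from pathlib import Path

        report = Path("rsm_job_report.txt").read_text(errors="ignore")  # placeholder filename

        for line in report.splitlines():
            if "Running Solver" in line:
                print(line.strip())
                for keyword in ("ibmmpi", "openmpi", "intelmpi", "-mpi"):
                    if keyword in line.lower():
                        print("  contains:", keyword)

    Comparing the output for a structural job and an explicit dynamics job should make the difference in MPI switches obvious.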

  • alitabei Member
    edited December 2019

    Thanks very much, Win. I will ask our admins whether we have IBM MPI on our cluster and whether it can be added to my compute nodes.

    Please let me know what you find out about this too. I am looking forward to having this issue fixed.

    best

  • tsiriaks Forum Coordinator
    edited December 2019

    Hi Ali,

    Can you also post a screenshot of your WB project schematic, and a screenshot of the first tab of your RSM Configuration GUI?

    Another thing: can you try taking the entire solver command from the 'Running Solver' line in an explicit dynamics RSM Job Report (4 or 32 cores is fine) and then running that command manually on the submit node (submit-3 something)? You will have to make sure you execute the command from the staging directory and that the admodel_0.ad input file exists there. If you don't have those files or that folder, you can enable the option in the RSM Configuration GUI to keep the files in the staging directory after the job is done, then submit a job, let it fail, and use the solver command and files from there; a rough sketch of this manual check is at the end of this post. If you have questions about this, let us know.

    Also, it turns out that only IBM MPI is supported for explicit dynamics analyses on a Linux cluster; see:

    https://ansyshelp.ansys.com/account/secured?returnurl=/Views/Secured/corp/v195/adyn_para/adyn_para_config.html

    so your IT team will have to make sure that the cluster can run jobs with IBM MPI.

    Thanks,

    Win
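
    A minimal sketch of the manual check described above, assuming it is run on the submit node; the staging path and the solver command string are placeholders to be replaced with the values from your own failed job:

        # Re-run the captured solver command from inside the staging directory,
        # after confirming the admodel_0.ad input file is present, and print
        # everything the solver writes so any MPI startup error is visible.
        import subprocess
        from pathlib import Path

        staging_dir = Path("/path/to/rsm/staging/dir")                 # placeholder
        solver_cmd = "<paste the full 'Running Solver' command here>"  # placeholder

        if not (staging_dir / "admodel_0.ad").exists():
            raise SystemExit("admodel_0.ad not found in the staging directory")

        result = subprocess.run(solver_cmd, shell=True, cwd=staging_dir,
                                capture_output=True, text=True)
        print(result.stdout)
        print(result.stderr)

    Whatever error the solver or MPI layer prints here is usually much more informative than the RSM report alone.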

  • alitabei Member
    edited December 2019

    Hi Win, 

    Please find both screenshots attached. Our admin says of IBM MPI: "It's in /opt/intel on all of the cluster machines"; but when I tried ibmmpi or impi in the first tab of the RSM GUI, I saw the error in the screenshot below.

    I will try the manual command (if I understood the instructions correctly) and will let you know what happens. 

    Thanks


  • tsiriaks Forum Coordinator
    edited December 2019

    Ali, 

    The 'PE' in the first tab of the RSM Configuration GUI is a parallel environment that your cluster admin manually specifies/creates for the UGE/SGE scheduler; it is not where you specify the MPI implementation for the ANSYS solver. So you can't just change openmpi to ibmmpi there. Ask your cluster admin for the correct PE to be used with IBM MPI (if it's not the same 'openmpi'). These two boxes only tell the UGE/SGE scheduler what environment to set up for the parallel portion of the job; one way to list the PEs defined on the cluster is sketched at the end of this post. The ANSYS explicit solver always defaults to IBM MPI, so you don't have to worry about how to tell the solver to use IBM MPI.

    Yes, please try manual command submission and let us know what error you get.

    Thanks,

    Win
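
    A small sketch of how one might list the PEs defined on an SGE/UGE cluster and inspect one of them, assuming the standard qconf utility is available on the submit node (the PE name 'openmpi' here is simply the one currently used in the RSM configuration):

        # List all parallel environments known to the SGE/UGE scheduler,
        # then print the definition of one of them (slots, allocation_rule, ...).
        import subprocess

        pes = subprocess.run(["qconf", "-spl"], capture_output=True, text=True)
        print("Available PEs:\n" + pes.stdout)

        detail = subprocess.run(["qconf", "-sp", "openmpi"], capture_output=True, text=True)
        print(detail.stdout or detail.stderr)

    Your admin can then tell you which of these PEs, if any, is intended for IBM MPI jobs.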

  • alitabei Member
    edited December 2019

    Hi Win,

    I got a correction from our HPC admin: we do not have IBM MPI! We have Intel MPI.

    They are short-staffed right now, and installing IBM MPI will happen by the end of January.

    I guess there is nothing else that I can do until then? If that is the case, should I close this thread now and open another one in Jan/Feb, or leave it open?

    thanks

  • tsiriaks Forum Coordinator
    edited December 2019

    Hi Ali,

    Unfortunately, that's it for now, since other MPI implementations are not supported for explicit dynamics on the cluster.

    You can either respond here again when IBM MPI is installed, or create a new thread and just reference the URL of this one. That is entirely up to you.

    Thanks,

    Win

  • alitabei Member
    edited December 2019

    Thanks, Win. I will post questions here after we get IBM MPI.

  • tsiriaks Forum Coordinator
    edited December 2019

    Sounds good, Ali.
