Ansys Products

RSM with rsh

    • heechangna
      Subscriber

      Dear,


       


      We have an rsh wrapper (pbsrsh) on our cluster.


      We would like to use pbsrsh instead of ssh or rsh for RSM.


      Is there an option to change it?


      I found a check box for choosing ssh or rsh in the RSM configuration application, but we need to use "pbsrsh" explicitly.


      Thanks!


       


      -Heechang

    • heechangna
      Subscriber

      I am wondering if I can get any support on this matter.


      Thanks.


       


      -Heechang

    • George Karnos
      Ansys Employee

      Hello Heechang,


      What solver type will you be using (Fluent, Mechanical, etc.)?


      What version of the ANSYS software?
      What cluster type (PBS Pro or Torque with Moab)?
      What operating system?


      Best, Geo  

    • heechangna
      Subscriber

      Hi Geo,


       


      Thank you so much for the response!


       


      I think this makes my question more direct.


      The page at https://ansyshelp.ansys.com/account/secured?returnurl=/Views/Secured/corp/v194/rsm_tutr/rsmtut_custserv_setupclientmgr.html?q=rsh


      says that rsh is the default and that we can change to ssh by clicking the check box.


      But we want to use our own rsh wrapper, called pbsrsh. Is this possible?


       


      Anyway,


      Solver type: EnSight


      Ansys version: 2019R1


      Cluster type: Torque with Moab


      OS: RHEL 7.6


       


      Thank you so much!


       


      -Heechang 


       



       
    • tsiriaks
      Ansys Employee

      Hi Heechang,


      I'm not familiar with EnSight. Can you check ANSYS Knowledge Material (KM) # 2057599 on the customer portal to see if it gives you any direction?


      Thanks,


      Win

    • heechangna
      Subscriber

      Hi Win,


       


      Thank you for this!


      I tried searching for "2057599" on the Ansys customer portal, but the ID number does not seem to be searchable.


      I also tried the Solutions Search under the Knowledge Resources tab, but that did not work either.


      Can you please give me a link to the page?


      Thanks again.


       


      -Heechang


       

    • heechangna
      Subscriber

      Hi Win,


       


      Can you please give me a link, or instructions for finding ANSYS Knowledge Material (KM) # 2057599?


      I tried many different ways, but I cannot get the information.


      Thank you so much!


       


      -Heechang

    • tsiriaks
      Ansys Employee

      Sorry, I was on vacation.


      Here is the info in that KM


      ###########################


      Background information: The default remote launcher of distributed Mechanical APDL from the master compute node to the slave node(s) is ssh. The -usersh flag can be added to change this to rsh, which is rarely used in modern Linux clusters. In the process tree of a PBS Pro job for distributed Mechanical APDL, Intel MPI processes on the master compute node are launched under pbs_mom, but Intel MPI processes on a slave node are launched from sshd, which is not monitored or managed by PBS Pro (pbs_mom). As a result, killing the PBS Pro job (e.g. through qdel jobid) can sometimes kill only the processes on the master node, leaving zombie processes on the slave node(s), because those were launched by sshd rather than by the PBS Pro pbs_mom.


      Solution:


      -- Set the following two environment variables:


      export I_MPI_HYDRA_BOOTSTRAP=rsh


      export I_MPI_HYDRA_BOOTSTRAP_EXEC=/opt/pbs/default/bin/pbs_tmrsh


      -- Add -usersh to the distributed MAPDL command line.


      -- Back up and edit {installed_path}/ansys_inc/v193/ansys/bin/anssh.ini: go to line # 483 and change it from


      ${rshcmd} -n ${i} mkdir -p "'${dirpath}'"


      to


      /opt/pbs/default/bin/pbs_tmrsh -n ${i} mkdir -p "'${dirpath}'"


      When a distributed Mechanical APDL job is launched by PBS Pro and runs on multiple compute nodes, MPI processes on both the master node and the slave node(s) will then be launched by pbs_mom.


      ###########################
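
      The file edit in the last step of the KM could be scripted roughly as below. This is only a sketch: patch_remote_mkdir is a hypothetical helper name, and it matches the text of the line rather than relying on line # 483, on the assumption that the line number may differ between installations.

```shell
#!/bin/sh
# Sketch only: back up a launcher script, then swap the ${rshcmd} prefix on the
# remote "mkdir -p" line for an explicit wrapper path.
# patch_remote_mkdir and its arguments are hypothetical names for illustration.
patch_remote_mkdir() {
    file=$1      # e.g. the anssh.ini named in the KM
    wrapper=$2   # e.g. /opt/pbs/default/bin/pbs_tmrsh, or a site wrapper
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    # Keep an untouched copy next to the original before changing anything.
    cp -p "$file" "$file.bak" || return 1
    # '|' is the sed delimiter because the wrapper path contains '/'.
    # Leading indentation on the matched line is preserved via \1.
    sed 's|^\([[:space:]]*\)\${rshcmd} -n \${i} mkdir -p|\1'"$wrapper"' -n ${i} mkdir -p|' \
        "$file.bak" > "$file"
}
```

      It would be called as patch_remote_mkdir /path/to/anssh.ini /opt/pbs/default/bin/pbs_tmrsh. The wrapper path must not contain '|', '&' or a backslash, since those are special to sed here.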


      You are using EnSight, so the second and third steps will certainly be different, but they may follow a similar pattern (adding or modifying a particular switch and a particular file that EnSight uses to control its solver).
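
      For the first step, a job script could set the two variables through a small guard like the one below. This is a sketch, not a tested recipe: use_rsh_wrapper is a hypothetical helper name, and the wrapper path passed to it (pbs_tmrsh in the KM, or a site wrapper such as pbsrsh) is whatever exists on your cluster.

```shell
#!/bin/sh
# Sketch only: point Intel MPI's Hydra launcher at a scheduler-aware rsh wrapper.
# use_rsh_wrapper is a hypothetical helper name; the two variables are the ones
# quoted in the KM.
use_rsh_wrapper() {
    wrapper=$1
    # Refuse to export a path that cannot be executed, so a typo fails early
    # instead of at MPI start-up.
    if [ ! -x "$wrapper" ]; then
        echo "wrapper not executable: $wrapper" >&2
        return 1
    fi
    export I_MPI_HYDRA_BOOTSTRAP=rsh
    export I_MPI_HYDRA_BOOTSTRAP_EXEC="$wrapper"
}
```

      In a job script this would be used as use_rsh_wrapper /opt/pbs/default/bin/pbs_tmrsh || exit 1, before the solver command. These variables are read by Intel MPI, so a solver started with a different MPI would not be affected by them.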

    • heechangna
      Subscriber

      Hi Win,


       


      Thank you so much for the information. This definitely helps.


      So, I gather that each component or solver in ANSYS (Fluent, EnSight, etc.) has its own rules for setting ssh/rsh.


      We always want to use the rsh wrapper. If we don't, there will be stray jobs. (This is mentioned in your previous message.)


      This is very difficult to set up, because each solver is different.


       


      Even with the information you gave, I cannot make EnSight work with the wrapper. Is there any support for this?


      We have the same issue with Fluent as well. Can you please help?


      For Fluent, I have a command like


       


      fluent 3ddp -t$ncpus -pinfiniband.ofed -cnf=pnodes -g < input_file


       


      Then, in the output file, we can see that it uses "ssh":


       


      /usr/local/ansys/2019R1/v193/fluent/fluent19.3.0/bin/fluent -r19.3.0 3ddp -t40 -pinfiniband.ofed -cnf=pnodes -g


      /usr/local/ansys/2019R1/v193/fluent/fluent19.3.0/cortex/lnamd64/cortex.19.3.0 -f fluent -g (fluent "3ddp -pinfiniband  -host -r19.3.0 -t40 -mpi=ibmmpi -cnf=pnodes -path/usr/local/ansys/2019R1/v193/fluent -ssh")


      /usr/local/ansys/2019R1/v193/fluent/fluent19.3.0/bin/fluent -r19.3.0 3ddp -pinfiniband -host -t40 -mpi=ibmmpi -cnf=pnodes -path/usr/local/ansys/2019R1/v193/fluent -ssh -cx o0314.ten.osc.edu:412516606


      Starting /usr/local/ansys/2019R1/v193/fluent/fluent19.3.0/lnamd64/3ddp_host/fluent.19.3.0 host -cx o0314.ten.osc.edu:412516606 "(list (rpsetvar (QUOTE parallel/function) "fluent 3ddp -flux -node -r19.3.0 -t40 -pinfiniband -mpi=ibmmpi -cnf=pnodes -ssh") (rpsetvar (QUOTE parallel/rhost) "") (rpsetvar (QUOTE parallel/ruser) "") (rpsetvar (QUOTE parallel/nprocs_string) "40") (rpsetvar (QUOTE parallel/auto-spawn?) #t) (rpsetvar (QUOTE parallel/trace-level) 0) (rpsetvar (QUOTE parallel/remote-shell) 1) (rpsetvar (QUOTE parallel/path) "/usr/local/ansys/2019R1/v193/fluent") (rpsetvar (QUOTE parallel/hostsfile) "pnodes") )"
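
      When trying different switches, a quick check of which remote-shell switch ended up in the start-up lines can save reading the whole log. The sketch below only pattern-matches text; detect_remote_shell is a hypothetical helper name, and it assumes an rsh launch would show up as "-rsh" in the same place that "-ssh" appears in the output here.

```shell
#!/bin/sh
# Sketch only: report which remote-shell switch appears in a Fluent start-up log.
# detect_remote_shell is a hypothetical helper; it only pattern-matches text.
detect_remote_shell() {
    log=$1
    # Look for the switch as a whole word: preceded by a space, followed by a
    # space, a double quote, or end of line. If both switches appear, ssh wins.
    if grep -Eq ' -ssh( |"|$)' "$log"; then
        echo ssh
    elif grep -Eq ' -rsh( |"|$)' "$log"; then
        echo rsh
    else
        echo unknown
    fi
}
```

      For example, detect_remote_shell output_file prints ssh for the log lines shown in this post.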


       


      I tried adding "-rsh" and "-usersh" and setting the environment variables, but no luck so far.


      I tried reading through the manual, but it does not have much information about this.


      Any comments or information would be very helpful.


      Thanks.


       


      -Heechang

    • tsiriaks
      Ansys Employee

      Heechang,


      There might be a way for EnSight to use rsh, but it can't be used with RSM. Please refer to this guide:


      https://ansyshelp.ansys.com/account/secured?returnurl=/Views/Secured/corp/v201/en/ensight_ht/HT-Run-Connect.html?q=rsh


      Similarly for Fluent, you can't use it with RSM, but you can try adding a switch to the submission command. See:


      https://ansyshelp.ansys.com/account/secured?returnurl=/Views/Secured/corp/v193/flu_ug/flu_ug_sec_parallel_unix_command.html?q=rsh


      Thanks,


      Win
