Mike Rife
Ansys Employee
Hi. How was the model launched on the HPC system? Is RSM configured to submit to the cluster? If so, do you still have the RSM log, and are there any clues in it?

If you submitted the job manually, was it a direct submission, i.e. did you issue the MAPDL command to start the batch solve? Or did you submit the job via a job scheduler? If a job scheduler was used, are there any logs available from it?

Review the solve output file; a clue may not take the form of an error or warning message.

If a job scheduler was used, either by direct submission or from Mechanical via RSM, does the job scheduler queue have a time limit?

Is the HPC system running Linux? If so, ask the cluster admin about any cgroup rules in place, or any OS rules on hardware usage. A cgroup rule on, say, RAM usage (e.g. don't let a compute node use more than 95% of RAM) can lead the OS to kill a process that is using a lot of RAM. So the OS may have killed the job. That may or may not be evident in the solve output file, the job scheduler log, or the RSM log.

You can restart the solution from time 0.11 at least. But you may want to find out about any time and/or hardware usage rules first, so you don't run into this again in a few days.

Mike
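
If the cluster admin can give you a copy of the kernel log from the compute node (for example, the output of `dmesg` or `journalctl -k` saved to a text file), a rough way to check whether the OS killed the solver is to scan it for the Linux OOM-killer messages. The sketch below is only an illustration, not an Ansys tool: the function name `find_oom_kills` and the process name `ansys.e` in the sample are hypothetical, and it assumes the usual kernel wording "Out of memory: Killed process <pid> (<name>)", which can vary between kernel versions.

```python
import re

# Typical kernel wording, e.g.:
#   "Memory cgroup out of memory: Killed process 4321 (ansys.e) ..."
#   "Out of memory: Kill process 99 (python) score 900 ..."
# Assumption: the exact text differs between kernel versions.
OOM_RE = re.compile(
    r"out of memory: kill(?:ed)? process (\d+) \(([^)]+)\)",
    re.IGNORECASE,
)


def find_oom_kills(lines):
    """Return a list of (pid, process_name) for each OOM-kill message found."""
    kills = []
    for line in lines:
        match = OOM_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills


# Hypothetical sample lines, for illustration only (not from a real log).
sample = [
    "[1234.5] Memory cgroup out of memory: Killed process 4321 (ansys.e) total-vm:1kB",
    "[1235.0] eth0: link up",
    "[2000.1] Out of memory: Kill process 99 (python) score 900 or sacrifice child",
]
print(find_oom_kills(sample))
```

To use it on a real log, pass the lines of the saved file, e.g. `find_oom_kills(open("kernel.log"))`, where `kernel.log` is whatever file the admin provides. An empty result does not rule out a time-limit kill by the job scheduler, so the scheduler log is still worth checking.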