We have configured the ARC master on the login node and arcnode on the compute nodes. When we run the RSM queue test, it fails with errors saying that the job cannot be submitted and that the arcnode failed to start.
RSM Version: 19.3.328.0, Build Date: 11/18/2018 14:06:25
Job Name: RSM Queue Test Job
Client Directory: /hpchome/hpcadmin/.local/share/Temp/RsmConfigTest/2r2bm361.x9s
Client Machine: login
Queue: RSM Queue [localhost, default]
Cluster Configuration: localhost [localhost]
Cluster Type: ARC
Custom Keyword: blank
Transfer Option: None
Staging Directory: blank
Local Scratch Directory: /workarea/workarea/ansys
Using SSH for inter-node communication on cluster
Cluster Submit Options: blank
Normal Inputs: [*,commands.xml,*.in]
Cancel Inputs: [-]
Excluded Inputs: [-]
Normal Outputs: [*]
Failure Outputs: [-]
Cancel Outputs: [-]
Excluded Outputs: [-]
Submission in progress...
Job Owner: hpcadmin
Submit Time: Thursday, 17 December 2020 15:26
2.67 KB, .05 sec (55.81 KB/sec)
JobType is: SERVERTEST
Final command platform: Linux
Distributed mode requested: True
Running 5 commands
Job working directory: /hpchome/hpcadmin/.local/share/Temp/RsmConfigTest/2r2bm361.x9s
Number of CPU requested: 1
Testing writability of working directory...
If you can read this, file was written successfully to working directory
Writability test complete
Checking queue default exists ...
Job will run locally on each node in: /workarea/workarea/ansys/ack7tpu4.7hx
JobId was parsed as: 1
External operation: 'queryStatus' has failed. This may or may not become a fatal error
Status parsing failed to parse the primary command output: '
External operation: 'parseStatus' has failed. This may or may not become a fatal error
Parser could not parse the job status. Checking for completed job exitcode...
Problem during Status. The parser was unable to parse the output and did not output the variable: RSM_HPC_OUTPUT_STATUS.
Error: Please check that the master service is started and that there is no firewall blocking access on ports 11193, 12193, or 13193
The status command failed to get single job status to the master service on: login:11193.
Job working directory: /hpchome/hpcadmin/.local/share/Temp/RsmConfigTest/gdqlpi3q.1d8
Checking queue local exists ...
Submit parsing failed to parse the primary command output: 'ArcMaster process running as rsmadmin
External operation: 'parseSubmit' has failed. This may or may not become a fatal error
ArcMaster process running as rsmadmin
ARCNode Process could not be reached
Skipping autostart because Master processes is started as a service.
Job not submitted. Error: Job is too big to fit in the queue local with the currently assigned machines.
Failed to submit job to cluster
Problem during Submit. The parser was unable to parse the output and did not output the variable: RSM_HPC_OUTPUT_JOBID.
Output: ArcMaster process running as rsmadmin
Exec Node Name  Associated Master  State    Service User  Avail  Max  Avail  Max  Avail  Max
nid00033        login              Running  root          72     50   *      *    *      *
nid00034        login              Running  root          72     50   *      *    *      *
* Indicates that resources have not been set up. Any resource request will be accepted.
Updating Users and Groups...
Groups matching *
Users matching *
Name     Status  Priority  Start Time  End Time  Max Jobs  Allowed Machines         Allowed Users
default  Active  0         00:00:00    23:59:59  *         login:nid00033:nid00034  all
local    Active  0         00:00:00    23:59:59  *         login                    all
Please refer to the RSM documentation and search for "Example: Setting Up a Multi-Node ANSYS RSM Cluster (ARC)". Please double-check that every line in Step #1 was followed. The arcmaster node needs to be able to communicate with the arcnode.
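As a quick way to check that communication requirement, here is a minimal Python sketch that tries a plain TCP connection to each of the ports named in the error message (11193, 12193, 13193). The host names are taken from the queue listing in the log, and the helper names `port_open` and `check_hosts` are made up for this illustration; they are not part of RSM or ARC. A successful connection only shows that something is listening and that no firewall is blocking the port. It does not prove the ARC service itself is healthy.

```python
import socket

# Ports named in the RSM error message; the message does not say
# which ARC service listens on which port.
ARC_PORTS = (11193, 12193, 13193)

# Host names as they appear in the queue listing; adjust for your cluster.
HOSTS = ("login", "nid00033", "nid00034")


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_hosts(hosts=HOSTS, ports=ARC_PORTS, timeout=3.0):
    """Map each (host, port) pair to True/False depending on reachability."""
    return {(h, p): port_open(h, p, timeout) for h in hosts for p in ports}


if __name__ == "__main__":
    for (host, port), ok in check_hosts().items():
        state = "reachable" if ok else "NOT reachable"
        print(f"{host}:{port} {state}")
```

Run it once from the login node and once from each compute node, since a firewall may block traffic in only one direction. Not every port is necessarily expected to be open on every host, so read the results alongside the port assignments in the RSM documentation.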
I have set up the ARC master and client nodes, and they can communicate with each other. I can see the execution nodes, and I have set up the queues using arcconfigui.
I have configured RSM and I can see the ARC queue in it. However, once I submit a job from RSM, the ARC master service stops, and the job fails with a message that it cannot communicate with the ARC master.