Check whether the “exclusive” flag in the HPC Pack 2019 job template is set to “true”. We were seeing the same thing running on MS HPC Pack with MS-MPI, and noticed about a 30% boost in performance once the flag was set to true.
The flag is part of the job template, so to check or set it you need administrative access to HPC Pack Cluster Manager (or reach out to whoever manages the HPC cluster).
We have a 256-core cluster at the moment and mostly run 128-core jobs. The jobs were slower previously, when this was set to false. As an example, say you have a 32-core server in your cluster and you submit a job that requests 24 cores. With exclusive set to true, nobody else can use the remaining 8 cores on that server while your job runs, but other users can still use the unused servers in the cluster. This is at least my understanding of it.
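To make the 32-core example concrete, here is a small Python sketch of the arithmetic. It is only a toy model of my understanding described above, not how the HPC Pack scheduler is actually implemented, and the function name `cores_blocked_for_others` is made up for illustration.

```python
def cores_blocked_for_others(node_cores: int, requested_cores: int, exclusive: bool) -> int:
    """Return how many idle cores on the node other jobs cannot use.

    Toy model (an assumption, not HPC Pack code): with exclusive=True the
    job holds the whole node, so every core it does not use is blocked.
    With exclusive=False the leftover cores stay available to other jobs.
    """
    if requested_cores > node_cores:
        raise ValueError("job does not fit on a single node")
    unused = node_cores - requested_cores
    return unused if exclusive else 0


# 32-core server, 24-core job
print(cores_blocked_for_others(32, 24, exclusive=True))   # 8
print(cores_blocked_for_others(32, 24, exclusive=False))  # 0
```

The trade-off is that the 8 idle cores are wasted while the job runs, in exchange for the job not sharing the server with anyone else's work.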