Tagged: fluent-error, mpi, parallel, parallel-computing
March 12, 2021 at 11:57 am
heisenmech
Subscriber
Hi all,
I've been experiencing MPI problems on clusters, leading to failed simulations. The behaviour is oddly inconsistent: sometimes a case runs for days before failing, and sometimes it fails almost instantly. The cases run on this cluster were also tested on a different cluster, where everything was fine.
I usually have a structured mesh with 30+ million nodes, for LES. I was curious whether anyone else is experiencing such parallelisation problems with Fluent (v20.1).
Please see below for the error.
Best,
Oguzha

fluent_mpi.20.1.0: Rank 0:84: MPI_Bcast: 863: IBV connection to 96 (pid 20275) on channel 0 is broken. ibv_poll_cq(): bad status 12
fluent_mpi.20.1.0: Rank 0:84: MPI_Bcast: self cnode1033 peer cnode1034 (rank: 96)
fluent_mpi.20.1.0: Rank 0:84: MPI_Bcast: error message: transport retry exceeded error
fluent_mpi.20.1.0: Rank 0:84: MPI_Bcast: Internal MPI error
srun: forcing job termination
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
slurmstepd: error: *** STEP 2498796.0 ON cnode1005 CANCELLED AT 2021-03-06T06:56:26 ***
srun: error: cnode1101: tasks 147,156,158: Killed
srun: Terminating job step 2498796.0
srun: error: cnode1101: task 146: Killed
srun: error: cnode1005: tasks 2,5-6,13: Killed
srun: error: cnode1100: task 140: Killed
srun: error: cnode1034: task 102: Killed
srun: error: cnode1101: tasks 150,157: Killed
srun: error: cnode1005: tasks 3,7,9-11,14: Killed
srun: error: cnode1006: tasks 17-19,22,24,27,31: Killed
srun: error: cnode1100: tasks 133,135,141-143: Killed
srun: error: cnode1024: task 45: Killed
srun: error: cnode1032: tasks 65-66,74,77: Killed
srun: error: cnode1033: task 84: Exited with exit code 16
srun: error: cnode1033: tasks 87,89,92,94-95: Killed
srun: error: cnode1101: task 152: Killed
srun: error: cnode1025: tasks 48,51,59-60: Killed
srun: error: cnode1005: tasks 1,4: Killed
srun: error: cnode1034: tasks 98,108,110: Killed
srun: error: cnode1060: task 116: Killed
srun: error: cnode1032: tasks 67-68,73,75: Killed
srun: error: cnode1033: tasks 85,91: Killed
srun: error: cnode1101: task 155: Killed
srun: error: cnode1025: tasks 52,61: Killed
srun: error: cnode1034: task 101: Killed
srun: error: cnode1100: tasks 131,134,136,138: Killed
srun: error: cnode1032: task 79: Killed
The fluent process could not be started.

real    1476m53.643s
user    5m24.318s
sys     2m54.340s
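Since the failures are intermittent, it may help to check whether the same node pair shows up each time the link breaks. The following is only a minimal sketch, assuming the Fluent MPI messages keep the "self ... peer ... (rank: ...)" layout seen in the log above; the function names broken_links and node_counts are hypothetical helpers, not part of Fluent or Slurm.

```python
import re
from collections import Counter

# Matches lines of the form seen in the log above, e.g.
# "fluent_mpi.20.1.0: Rank 0:84: MPI_Bcast: self cnode1033 peer cnode1034 (rank: 96)"
# Assumption: other failures print the same layout.
LINK_RE = re.compile(r"Rank \d+:(\d+): \w+: self (\S+) peer (\S+) \(rank: (\d+)\)")


def broken_links(log_text):
    """Return (self_rank, self_node, peer_node, peer_rank) for each matching line."""
    return [
        (int(m.group(1)), m.group(2), m.group(3), int(m.group(4)))
        for m in LINK_RE.finditer(log_text)
    ]


def node_counts(links):
    """Count how often each node appears on either end of a broken link."""
    counts = Counter()
    for _, self_node, peer_node, _ in links:
        counts[self_node] += 1
        counts[peer_node] += 1
    return counts


# Two lines copied from the log in the post above.
sample = (
    "fluent_mpi.20.1.0: Rank 0:84: MPI_Bcast: self cnode1033 peer cnode1034 (rank: 96)\n"
    "fluent_mpi.20.1.0: Rank 0:84: MPI_Bcast: error message: transport retry exceeded error\n"
)

links = broken_links(sample)
print(links)
print(node_counts(links).most_common())
```

Feeding the concatenated output of several failed jobs into broken_links and then node_counts would show whether one node keeps appearing, which could be useful information to hand to the cluster administrators. A single log, as here, cannot establish that on its own.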
March 12, 2021 at 2:14 pm
Rob
Ansys Employee
If it's random, check on the system side. You're looking for RAM leaks (I'm not aware of any issues) and random acts of IT. Is the head node also on the cluster?
March 13, 2021 at 9:52 pm
heisenmech
Subscriber
We've been testing different solvers as well, and have had no issues with them at all. The IT people wanted me to check with ANSYS whether this is some sort of bug in the parallelisation. Yes, the head node is on the cluster.