Partitioning with journal file

bujarp, Member, Posts: 3

I used this command in the TUI to try a partition that does not cross into different zones.

/parallel/partition/method/principal-axes 40

/parallel/partition/set/across-zones

After the second command, Fluent says that with Metis it isn't possible to change the across-zones setting, although I have used the command before.

I want to test whether disabling partitioning across zones can improve simulation time.

Thank you


Answers

  • Rob UK, Posts: 11,730, Forum Coordinator

    It's unlikely to help, but will depend on the rest of the model. How many cells have you got, and how many zones? Are those zones moving?

  • bujarp, Posts: 24, Member

    16 cell zones. There are two sliding meshes and also periodic boundaries. Maybe a partition boundary that cuts through the rotating zone leads to large communication time?

    Mesh info: 218,718 nodes and 1,110,183 elements. My question was more about the TUI, though.

    Thanks anyway

  • Rob UK, Posts: 11,730, Forum Coordinator

    With around 1 million cells and non-conformal interfaces for the sliding mesh, I'd not go over about 10 partitions for efficient use of the cores. You'll still see some speed-up beyond that.

    Re the two commands: why are you partitioning first and then setting the across-zones option? Also, whilst journals are very powerful, it's usually easier to set up the model locally, partition manually, and then send it off to the cluster to run.
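    For what it's worth, if the goal is to test the across-zones behaviour, one ordering to try is to choose the setting first and partition afterwards. This is a sketch only: whether the setting survives a Metis-based auto-partition, and the exact prompts, should be checked against the Fluent TUI documentation for your version.

    ```
    ; Hypothetical journal fragment: choose the across-zones setting first,
    ; then run the partition method so the setting applies to it.
    /parallel/partition/set/across-zones no
    /parallel/partition/method principal-axes 40
    ```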

  • bujarp, Posts: 24, Member

    Thanks. There is also another problem: if I simulate with 1 node (20 tasks per node, so 20 partitions), I get this error message:


    ===================================================================================
    =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
    =   PID 4882 RUNNING AT cstd02-049
    =   EXIT CODE: 9
    =   CLEANING UP REMAINING PROCESSES
    =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
    ===================================================================================

      Intel(R) MPI Library troubleshooting guide:

          https://software.intel.com/node/561764

    When simulating with one node it uses too much memory:

    > time        kbmemused

    > 10:40:01 AM 11605588

    > 10:50:01 AM 14114168

    > 11:00:01 AM 16619788

    > 11:10:01 AM 18385440

    > 11:20:01 AM 20677200

    > 11:30:01 AM 22551200

    > 11:40:01 AM 24563312

    > 11:50:01 AM 27044424

    > 12:00:01 PM 28601752

    > 12:10:02 PM 30850688

    > 12:20:01 PM 32915072

    > 12:30:01 PM 35065452

    > 12:40:01 PM 36657816

    > 12:50:01 PM 38874056

    > 01:00:01 PM 41414584

    > 01:10:01 PM 43445944

    > 01:20:01 PM 44990388

    > 01:30:01 PM 47385076

    > 01:40:01 PM 49660092

    > 01:50:01 PM 51335148

    > 02:00:01 PM 53959052

    > 02:10:01 PM 56032252

    > 02:20:01 PM 58121960

    > 02:30:01 PM 59675572

    > 02:40:01 PM 61839556

    Then there is not enough RAM available for the problem.
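    The kbmemused column above grows steadily instead of plateauing after initialisation, which points at something accumulating rather than a fixed working set. A quick sketch to quantify the growth per 10-minute sample (the log file name memlog.txt and the exact column layout are assumptions based on the paste):

    ```shell
    #!/bin/sh
    # Sketch: average kbmemused growth per sampling interval from a
    # sar-style log with lines like "> 10:40:01 AM 11605588".
    awk '$4 ~ /^[0-9]+$/ {
            if (prev != "") { sum += $4 - prev; n++ }
            prev = $4
         }
         END { if (n) printf "avg growth: %d kB per interval\n", sum / n }' memlog.txt
    ```

    On the figures above this comes out at roughly 1.7 GB per 10-minute interval, which over a few hours easily exhausts a 60 GB node.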

    If I simulate with 2 nodes (40 partitions), the simulation runs fine.


    I really appreciate your help.

  • Rob UK, Posts: 11,730, Forum Coordinator

    What solver models have you got turned on? 1M cells shouldn't use much more than 3GB RAM unless you're running chemistry or multiphase.

  • bujarp, Posts: 24, Member

    It is the multiphase Eulerian model. For turbulence I used the mixture realizable k-epsilon model.

  • Rob UK, Posts: 11,730, Forum Coordinator

    2-phase Euler with no chemistry shouldn't need any more than 4GB RAM, so I'm not sure why it's failing. Can you get more info from IT to see if it's something on their side?

  • bujarp, Posts: 24, Member

    I talked with someone from IT, but he said it is more likely something to do with Fluent. There is also one thing in Fluent I want to ask about, since you mentioned chemistry:

    The species model in Fluent is deactivated; nevertheless, I get the warnings:

    Warning: Create_Material: (water-liquid . species-mixture) already exists

    and

    Warning: Create_Material: (buac-liquid-tracer . species-mixture) already exists


    around 30 times. That's because I activated the species model and then deactivated it again. But I think the problem with the simulation time was also present before that.

  • Rob UK, Posts: 11,730, Forum Coordinator

    A repeat like that tends to be the parallel nodes trying to open something. If you turned species off, the mixture should be kept in the case but shouldn't be available as an option. I have seen another case with a stuck phase/species and wonder if you've glitched the scheme call when turning the model off. Check what species are present in the materials and see what is used in the cell zone(s).
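    To see what is actually defined in the case, the materials can be listed from the TUI. The command path below is from memory and may differ by Fluent version, so treat it as a sketch:

    ```
    ; List all materials in the case, including any leftover species
    ; mixtures that no longer show in the GUI.
    /define/materials/list-materials
    ```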

  • bujarp, Posts: 24, Member

    For both phase materials it looks fine, I think. There is no mixture defined.


  • Rob UK, Posts: 11,730, Forum Coordinator

    Not sure, there's something not set up correctly somewhere. How many materials are there?

  • bujarp, Posts: 24, Member

    The problem is that you cannot see that in the GUI, but it is visible from the TUI. Deleting is not possible. I decided to make a new Fluent setup, so I can be sure that only the desired materials are active.

    I am not sure whether the simulation will then be faster, but I hope so.

  • Rob UK, Posts: 11,730, Forum Coordinator

    Something's very scrambled in the case. It's probably recoverable but it'll be quicker for you to rebuild from the mesh file.

  • bujarp, Posts: 24, Member

    Hello,

    It's me again. I still have the same problem with the too-long simulation time. I use periodic boundaries with periodic repeats and sliding mesh. Any ideas? That's the only problem I have now. The issue with the species models is resolved; I only have two phases there. The periodic offset is 120 degrees.


  • Rob UK, Posts: 11,730, Forum Coordinator

    Define "too long": CFD runs typically take hours or even weeks depending on what you're modelling.

  • bujarp, Posts: 24, Member

    9 hours for 1 second of simulated time (time step = 0.001 s; 1000 time steps).

    Two sliding meshes

    I use an HPC cluster. I simulate on 2 nodes with one task per node. I don't know why it takes so long.

    I use the Eulerian approach with two phases and the realizable k-epsilon model.
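    As a sanity check on those numbers, 9 hours of wall time over 1000 time steps works out to about 32 seconds per step, which for ~1 million cells with Eulerian multiphase and two sliding meshes is slow but not absurd if many iterations per step are needed:

    ```shell
    # Wall-clock seconds per time step: 9 h over 1000 steps.
    echo "scale=1; 9 * 3600 / 1000" | bc
    ```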

  • bujarp, Posts: 24, Member

    I simulate with ca. 1 million cells.

  • bujarp, Posts: 24, Member

    That's the file for starting the job on the cluster:


    #!/bin/bash -l

    #SBATCH --job-name 181121_2

    #SBATCH --partition=long

    #SBATCH --time 18:00:00

    ##SBATCH --exclusive

    #SBATCH --constraint="[cstd01|cstd02]"

    #SBATCH --mem=60000

    #SBATCH --nodes=2

    #SBATCH --ntasks-per-node=20

    #SBATCH -e error_file.e

    #SBATCH -o output_file.o


    ## Gather the number of nodes and tasks

    numnodes=$SLURM_JOB_NUM_NODES

    numtasks=$SLURM_NTASKS

    mpi_tasks_per_node=$(echo "$SLURM_TASKS_PER_NODE" | sed -e 's/^\([0-9][0-9]*\).*$/\1/')

    ## store hostname in txt file

    srun hostname -s > slurmhosts.$SLURM_JOB_ID.txt

    ## calculate slurm task count

    if [ "x$SLURM_NPROCS" = "x" ]; then

       if [ "x$SLURM_NTASKS_PER_NODE" = "x" ];then

          SLURM_NTASKS_PER_NODE=1

       fi

       SLURM_NPROCS=`expr $SLURM_JOB_NUM_NODES \* $SLURM_NTASKS_PER_NODE`

    fi


    export OMP_NUM_THREADS=1

    export I_MPI_PIN_DOMAIN=omp:compact # Domains are $OMP_NUM_THREADS cores in size

    export I_MPI_PIN_ORDER=scatter # Adjacent domains have minimal sharing of caches/sockets

    # Number of MPI tasks to be started by the application per node and in total (do not change):

    np=$(( numnodes * mpi_tasks_per_node ))

    # load necessary modules

    module purge

    module add intel/mpi/2019.1

    module add fluent/2021R1

    # run the fluent simulation

    fluent 3ddp -ssh -t$np -mpi=intel -pib -cnf=slurmhosts.$SLURM_JOB_ID.txt -g -i journal.jou

    # delete temp file

    rm slurmhosts.$SLURM_JOB_ID.txt
