Ansys Products

Fluent (ANSYS 2023R1) fails to acquire a license on the 1st run, but succeeds on the 2nd

    • Tomas Llano-Rios
      Subscriber

      Hello,

      I'm trying to run Fluent (ANSYS 2023R1) in batch mode on a cluster (more on the cluster setup later). On the head node, Fluent runs successfully on the first try and on every subsequent try. On a compute node, the first run fails, and a run only succeeds if I start it immediately after that failure. The error message is:

                    Welcome to ANSYS Fluent 2023 R1

                    Copyright 1987-2023 ANSYS, Inc. All Rights Reserved.
                    Unauthorized use, distribution or duplication is prohibited.
                    This product is subject to U.S. laws governing export and re-export.
                    For full Legal Notice, see documentation.

      Build Time: Nov 28 2022 09:30:46 EST  Build Id: 10208
      Cannot initialize ANSYS Licensing context
      Connected License Server List:
      ANSYS LICENSE MANAGER ERROR:ANSYSLI exited or could not read server port ansyscl.cluster-c01.14070.12506.
      Please refer /home/user/.ansys/ansyscl.cluster-c01.14070.12506.log for more information.

      The cluster setup is as follows:

      • OS: All nodes run Ubuntu Server 20.04, no GUI.
      • License: 4-way parallel academic license
      • Network topology: There are two networks: a public network and a private network. The head node is directly connected to both networks, while the compute nodes are connected to the private one only. The head node routes traffic from the private network to the public network.
      • ANSYS Setup: ANSYS is installed on the head node, at folder /apps/ansys/2023r1. The head node runs an NFSv4 server and shares folder /apps  to all compute nodes. The product was installed using the "silent install" mode.
      • ANSYS licensing server: It is connected to the public network only and uses ports 2325 and 1055.

      Here are the steps I have already followed to troubleshoot the issue:

      • telnet 2325 and telnet 1055, when run from any node, both succeeded.
      • /apps/ansys/2023r1/shared_files/licensing/lic_admin/bin/licensing_diagnostics.sh, when run from any node, succeeded.
      • traceroute -T (TCP) and traceroute -I (ICMP), when run from the compute nodes, both reached the license server within two hops: the 1st hop to the head node, the 2nd from the head node to the license server.
      • When executing ldd /apps/ansys/2023r1/v231/licensingclient/linx64/ansyscl, there isn't any (apparent) missing or broken .so library. I double checked the ansyscl binary after reading https://forum.ansys.com/forums/topic/ansysli-exited-or-could-not-read-server-port-anasyscl-mylab-604472-17495/.
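
The port checks in the list above can also be scripted without telnet, which makes them easy to repeat from every compute node. This is a minimal sketch, assuming bash (for its /dev/tcp redirection) and the coreutils timeout command. The function name port_open and the LICENSE_SERVER variable are hypothetical placeholders; the ports are the ones listed in the cluster setup.

```shell
#!/usr/bin/env bash
# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# Relies on bash's /dev/tcp pseudo-device, so neither telnet nor nc is needed.
port_open() {
    local host=$1 port=$2
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# LICENSE_SERVER is a placeholder (assumption): export it with the hostname
# or IP of your license server before running this script.
if [ -n "${LICENSE_SERVER:-}" ]; then
    for port in 2325 1055; do
        if port_open "$LICENSE_SERVER" "$port"; then
            echo "$port reachable"
        else
            echo "$port NOT reachable"
        fi
    done
fi
```

A successful connect only shows that the port accepts TCP connections; it does not exercise the licensing protocol itself.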

      Any help or suggestion you can provide me with is highly appreciated :)

      Thank you,

      Tomas

       

    • Mangesh Bhide
      Ansys Employee

      Please ensure that the compute nodes can reach the license server on all ports used by the license server.

       

      The last line said

      Please refer /home/user/.ansys/ansyscl.cluster-c01.14070.12506.log for more information.

      Does that file list any errors?

      If you are unable to locate that file, try running again; you should get a new file with a similar name.

       

      • Tomas Llano-Rios
        Subscriber

         

         

        Hello Mangesh Bhide,

        I think NFS could be part of the issue, since the nodes can reach all relevant ports on the licensing server. Also, I found a workaround by increasing the timeout-related variables in /apps/ansys/2023r1/shared_files/licensing/ansyslmd.ini:

        SERVER=1055@license-server-ip
        ANSYSLI_SERVERS=2325@license-server-ip
        ANSYSLI_FNP_IP_ENV=1
        ANSYSLI_TIMEOUT_FLEXLM=20
        ANSYSLI_FLEXLM_TIMEOUT_ENV=2000000
        ANSYSCL_TIMEOUT_CONNECT=60
        ANSYSCL_TIMEOUT_RESPONSE=300

        Now ANSYS can retrieve the license on the compute nodes after approximately 2 to 3 minutes. I suspect that, before I applied this workaround, the second run succeeded because the READs were cached. Here is a small test that gives an idea of what READ speeds look like on the NFS:

        cluster-c01$ dd if=/dev/urandom of=/apps/ansys/tmpfile bs=1M count=1024

        cluster-c01$ for i in {0..4}; do dd if=/apps/ansys/tmpfile of=/dev/null bs=1M count=1024; done
        1024+0 records in
        1024+0 records out
        1073741824 bytes (1.1 GB, 1.0 GiB) copied, 9.35959 s, 115 MB/s
        1024+0 records in
        1024+0 records out
        1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.253889 s, 4.2 GB/s
        1024+0 records in
        1024+0 records out
        1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.199752 s, 5.4 GB/s
        1024+0 records in
        1024+0 records out
        1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.194783 s, 5.5 GB/s

        As you can see, the first READ is slow compared to the others, but since the nodes only have 1 GbE NICs, 115 MB/s (~920 Mbps) is acceptable given the hardware. A similar test for WRITEs gives 89.6 MB/s on average. Here is how much activity I see on the ansys mount:

        cluster-c01$ nfsiostat
        :/opt/apps/ansys mounted on /apps/ansys:

                   ops/s       rpc bklog
                 466.680           0.000

        read:              ops/s            kB/s           kB/op         retrans    avg RTT (ms)    avg exe (ms)
                          31.864         261.736           8.214        0 (0.0%)           0.916           1.007
        write:             ops/s            kB/s           kB/op         retrans    avg RTT (ms)    avg exe (ms)
                           0.005           0.016           3.137        0 (0.0%)           0.333           0.417

        This suggests the problem is related exclusively to READs, but I wonder how much data ANSYS needs to read from its installation folder before it contacts the license server. Does this seem reasonable to you?
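
One way to put a rough number on that question is to measure how much data sits under the licensing client directory and how long a single read of it takes from a compute node. The sketch below is only an estimate under stated assumptions: it assumes GNU find, awk, and date (as shipped with Ubuntu), it reads every file in the directory even though ansyscl probably touches only some of them (plus shared libraries elsewhere), and the function names and the ANSYS_LIC_DIR variable are made up for illustration.

```shell
#!/usr/bin/env bash
# Print the total size in bytes of all regular files under a directory.
dir_bytes() {
    find "$1" -type f -printf '%s\n' | awk '{ s += $1 } END { printf "%.0f\n", s }'
}

# Read every regular file under a directory once; print elapsed seconds.
time_read() {
    local start end
    start=$(date +%s.%N)
    find "$1" -type f -exec cat {} + > /dev/null
    end=$(date +%s.%N)
    awk -v a="$start" -v b="$end" 'BEGIN { printf "%.3f\n", b - a }'
}

# ANSYS_LIC_DIR is a hypothetical override; the default is the directory
# that holds the ansyscl binary mentioned in the thread.
dir="${ANSYS_LIC_DIR:-/apps/ansys/2023r1/v231/licensingclient/linx64}"
if [ -d "$dir" ]; then
    echo "bytes:   $(dir_bytes "$dir")"
    echo "seconds: $(time_read "$dir")"
fi
```

For a cold-cache figure, the page cache on the compute node has to be emptied first (as root: sync; echo 3 > /proc/sys/vm/drop_caches); otherwise a repeated run mostly measures cached READs, as in the dd test above.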

        Regarding the file you mention, there are no errors. This is what it shows:

        2023/05/17 16:01:28    INFO                ANSYSLI_CMD=/apps/ansys/2023r1/v231/licensingclient/linx64/ansyscl -acl 1739221.88242 -nodaemon -log /home/user/.ansys/ansyscl.cluster-c01.1739221.88242.log
        2023/05/17 16:01:28    INFO                ANSYSLI_INITIALIZATION_FILE='/apps/ansys/2023r1/shared_files/licensing/ansyslmd.ini'
        2023/05/17 16:01:28    INFO                ANSYSLI_PRODORD_FILE='/apps/ansys/2023r1/shared_files/licensing/prodord/ansysli.prodord.xml'
        2023/05/17 16:01:28    INFO                ANSYSLI_TIMEOUT_FLEXLM=20
        2023/05/17 16:01:28    INFO                Configuring ACL Core
        2023/05/17 16:01:28    INFO                ACL Core Initialized
        2023/05/17 16:01:28    INFO                ANSYSCL_PORT=54013
        2023/05/17 16:01:28    INFO                Listen Socket Created
        2023/05/17 16:01:28    INFO                ANSYSLI_CLIENT_IDLE_TIMEOUT=0
        2023/05/17 16:01:28    INFO                ANSYSCL_FNP_PATH=1055@license-server-hostname:1055@license-server-ip
        2023/05/17 16:01:28    INFO                Ready to accept connections.
        2023/05/17 16:01:28    INFO                ANSYSLI_IP_OVERRIDE option is off.
        2023/05/17 16:01:28    CLIENT_ACCEPT                                                                                            1/1/1/1                                                                              16:127.0.0.1
        2023/05/17 16:01:28    CLIENT_CONNECT                                                                                           1/1/1/1   1739221:1739221:FLUENT:user@cluster-c01.maas:linx64                    16:127.0.0.1
        2023/05/17 16:01:28    RECONNECT           CLIENT_IDLE                                                                          1/1/1/1   1739221:1739221:FLUENT:user@cluster-c01.maas:linx64                    16:127.0.0.1

        Here is the content of another relevant log file:

        /home/user/.ansys/licdebug.cluster-c01.FLUENT.231.out

        2023/05/16 15:22:35    INFO                Starting Licensing Client Proxy server.
        2023/05/16 15:22:35    INFO                /apps/ansys/2023r1/v231/licensingclient/linx64/ansyscl -acl 736341.91737 -nodaemon -log /home/user/.ansys/ansyscl.cluster-c01.736341.91737.log
        2023/05/16 15:22:35    INFO                Started ANSYSLI server.
        2023/05/16 15:22:50    CONNECT_ERROR                                                                                            0/0/0/0   736341:FLUENT:user@cluster-c01.maas:linx64             0:127.0.1.1
                        ANSYSLI exited or could not read server port ansyscl.cluster-c01.736341.91737.
                        Please refer /home/user/.ansys/ansyscl.cluster-c01.736341.91737.log for more information.
        2023/05/16 15:22:50    CONNECT_ERROR                                                                                            0/0/0/0   736341:FLUENT:user@cluster-c01.maas:linx64             0:127.0.1.1
                        ANSYSLI exited or could not read server port ansyscl.cluster-c01.736341.91737.
                        Please refer /home/user/.ansys/ansyscl.cluster-c01.736341.91737.log for more information.
        2023/05/16 15:22:50    INFO                Starting Licensing Client Proxy server.
        2023/05/16 15:22:50    INFO                /apps/ansys/2023r1/v231/licensingclient/linx64/ansyscl -acl 736341.91737 -nodaemon -log /home/user/.ansys/ansyscl.cluster-c01.736341.91737.log
        2023/05/16 15:22:50    INFO                Started ANSYSLI server.
        2023/05/16 15:23:05    CONNECT_ERROR                                                                                            0/0/0/0   736341:FLUENT:user@cluster-c01.maas:linx64             0:127.0.1.1
                        ANSYSLI exited or could not read server port ansyscl.cluster-c01.736341.91737.
                        Please refer /home/user/.ansys/ansyscl.cluster-c01.736341.91737.log for more information.
        2023/05/16 15:23:05    CONNECT_ERROR                                                                                            0/0/0/0   736341:FLUENT:user@cluster-c01.maas:linx64             0:127.0.1.1
                        ANSYSLI exited or could not read server port ansyscl.cluster-c01.736341.91737.
                        Please refer /home/user/.ansys/ansyscl.cluster-c01.736341.91737.log for more information.
        2023/05/16 15:23:05    INFO                Starting Licensing Client Proxy server.
        2023/05/16 15:23:05    INFO                /apps/ansys/2023r1/v231/licensingclient/linx64/ansyscl -acl 736341.91737 -nodaemon -log /home/user/.ansys/ansyscl.cluster-c01.736341.91737.log
        2023/05/16 15:23:05    INFO                Started ANSYSLI server.
        2023/05/16 15:23:20    CONNECT_ERROR       cfd_solve_level2                                                                     0/0/0/0   736341:FLUENT:user@cluster-c01.maas:linx64             0:127.0.1.1
                        ANSYSLI exited or could not read server port ansyscl.cluster-c01.736341.91737.
                        Please refer /home/user/.ansys/ansyscl.cluster-c01.736341.91737.log for more information.
        2023/05/16 15:23:20    CONNECT_ERROR       cfd_solve_level2                                                                     0/0/0/0   736341:FLUENT:user@cluster-c01.maas:linx64             0:127.0.1.1
                        ANSYSLI exited or could not read server port ansyscl.cluster-c01.736341.91737.
                        Please refer /home/user/.ansys/ansyscl.cluster-c01.736341.91737.log for more information.
        2023/05/16 15:25:23    INFO                Starting Licensing Client Proxy server.
        2023/05/16 15:25:23    INFO                /apps/ansys/2023r1/v231/licensingclient/linx64/ansyscl -acl 736962.66239 -nodaemon -log /home/user/.ansys/ansyscl.cluster-c01.736962.66239.log
        2023/05/16 15:25:23    INFO                Started ANSYSLI server.
        2023/05/16 15:25:24    CLIENT_CONNECT                                                                                           1/1/1/1   736962:FLUENT:user@cluster-c01.maas:linx64             16:127.0.1.1
        2023/05/16 15:25:24    NEW_CONNECTION      Connected to Licensing Client Proxy server: 48569@127.0.0.1.
        2023/05/16 15:25:24    INFO                Parent Child context created with id e855bde0-bd25-4471-9240-6adbb51ea260.
        2023/05/16 15:25:24    CHECKOUT            cfd_solve_level1                23.1 (2022.1114)            1/1/1/25                 1/1/1/1   736962:FLUENT:user@cluster-c01.maas:linx64             16:127.0.1.1
        2023/05/16 15:25:24    CHECKOUT            cfd_solve_level1 (Share)        23.1 (2022.1114)             1/-/-/-                 1/1/1/1   736962:FLUENT:user@cluster-c01.maas:linx64             16:127.0.1.1
        2023/05/16 15:25:24    CHECKOUT            cfd_base                        23.1 (2022.1114)            1/1/1/25                 1/1/1/1   736962:FLUENT:user@cluster-c01.maas:linx64             16:127.0.1.1
        2023/05/16 15:25:24    CHECKOUT            cfd_base (Share)                23.1 (2022.1114)             1/-/-/-                 1/1/1/1   736962:FLUENT:user@cluster-c01.maas:linx64             16:127.0.1.1
        2023/05/16 15:25:24    CHECKOUT            cfd_solve_level2                23.1 (2022.1114)            1/1/1/25                 1/1/1/1   736962:FLUENT:user@cluster-c01.maas:linx64             16:127.0.1.1
        2023/05/16 15:25:24    CHECKOUT            cfd_solve_level2 (Share)        23.1 (2022.1114)             1/-/-/-                 1/1/1/1   736962:FLUENT:user@cluster-c01.maas:linx64             16:127.0.1.1
        2023/05/16 15:25:42    INFO                Parent Child context, id e855bde0-bd25-4471-9240-6adbb51ea260, has been closed
        2023/05/16 15:25:42    CLIENT_EXIT                                                                                              0/1/1/1   736962:FLUENT:user@cluster-c01.maas:linx64             16:127.0.1.1
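
The timing pattern in the licdebug log above can be extracted mechanically, which helps when comparing runs before and after a configuration change. This is a minimal sketch, assuming the log layout shown above (date, time, event type as the first three whitespace-separated fields) and a log that does not span midnight. The function name startup_gaps is hypothetical.

```shell
#!/usr/bin/env bash
# For each "Started ANSYSLI server." line, print the type of the first
# CONNECT_ERROR or CLIENT_CONNECT event that follows it and the number
# of seconds between the two.
startup_gaps() {
    awk '
        function secs(t,   p) { split(t, p, ":"); return p[1] * 3600 + p[2] * 60 + p[3] }
        # Skip continuation lines, which do not begin with a date.
        $1 !~ /^[0-9][0-9][0-9][0-9]\// { next }
        /Started ANSYSLI server/ { start = secs($2); waiting = 1; next }
        waiting && ($3 == "CONNECT_ERROR" || $3 == "CLIENT_CONNECT") {
            print $3, secs($2) - start
            waiting = 0
        }
    ' "$1"
}

# Example: startup_gaps /home/user/.ansys/licdebug.cluster-c01.FLUENT.231.out
```

Run against the excerpt above, this should report three 15-second gaps ending in CONNECT_ERROR followed by a 1-second gap ending in CLIENT_CONNECT, which is consistent with the client giving up roughly 15 seconds after launching ansyscl in the failed run.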