  1. OpenFlowPlugin
  2. OPNFLWPLUG-607

[Clustering]: Unrecoverable cluster flow provisioning failure with 30 switches, 1000 flows/switch (tried Helium design only)

Details

    • Bug
    • Status: Resolved
    • Resolution: Done
    • None
    • None
    • General
    • None
    • Operating System: All
      Platform: All

    • 5114
    • Highest

    Description

      Build used :
      ===================
      Karaf distro from latest ODL Beryllium code

      Unrecoverable flow provisioning failure in ODL cluster with 30 switches and 1000 flows per switch.

      Test Type :
      ===================
      Flow provisioning fails when 30 switches are connected across 3 cluster nodes with 1000 flows per switch (30,000 flows in total).

      Test setup :
      ====================
      Build used: Beryllium
      OF-Plugin Used: Helium
      OF-HA used: NO

      Objective of test :
      ===================
      Verify whether flow provisioning is stable when we scale the number of switches per controller.

      Test Steps :
      ============
      1. Bring up a healthy 3-node cluster, say c1, c2 and c3.
      2. Bring up Mininet with the commands listed in the Commands section and connect the switches to controllers c1 (10.183.181.41), c2 (10.183.181.42) and c3 (10.183.181.43). c2 is the leader of inventory-config-shard.
         Each controller is connected to 10 switches, so there are 30 switches overall across the cluster.
      3. Provision 30000 flows from c1 (a follower).
         Note: We are provisioning flows via the binding-aware API of the OpenFlow inventory model.
      4. Check whether 30000 flows (1000 flows per switch) have been provisioned across the 30 switches.

      Commands
      ========
      To connect 10 switches per controller, I use a custom Mininet command for each controller (c1, c2, c3):
      sudo mn --custom /home/mininet/mininet/custom/mytopo.py --topo mytopo --controller remote,ip=10.183.181.41 --switch ovsk,protocols=OpenFlow13
      sudo mn --custom /home/mininet/mininet/custom/mytopo.py --topo mytopo --controller remote,ip=10.183.181.42 --switch ovsk,protocols=OpenFlow13
      sudo mn --custom /home/mininet/mininet/custom/mytopo.py --topo mytopo --controller remote,ip=10.183.181.43 --switch ovsk,protocols=OpenFlow13

      Note: mytopo.py is attached for quick reference. Please modify mytopo.py accordingly.

      We use the command below to check the number of flows provisioned per controller:
      dpctl dump-aggregate -O OpenFlow13
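The per-switch check in step 4 can be scripted rather than read by eye. The following is a minimal sketch, assuming the `dpctl dump-aggregate` output contains one `flow_count=N` token per switch (as Open vSwitch aggregate replies do); the function names and the defaults of 10 switches and 1000 flows are illustrative and come from the test setup, not from any attached tooling.

```python
import re

# Assumption: each switch's aggregate reply contains a "flow_count=N" token.
FLOW_COUNT_RE = re.compile(r"flow_count=(\d+)")


def sum_flow_counts(dump_text):
    """Return (per-switch counts in output order, total) from dump-aggregate text."""
    counts = [int(n) for n in FLOW_COUNT_RE.findall(dump_text)]
    return counts, sum(counts)


def check_provisioning(dump_text, expected_switches=10, flows_per_switch=1000):
    """Compare observed flow counts with the expected per-controller load."""
    counts, total = sum_flow_counts(dump_text)
    # A switch may carry extra default flows, so only counts below the
    # target are reported as short.
    short = [i for i, c in enumerate(counts) if c < flows_per_switch]
    return {
        "switches_seen": len(counts),
        "total_flows": total,
        "expected_total": expected_switches * flows_per_switch,
        "short_switch_indexes": short,
        "ok": len(counts) == expected_switches and not short,
    }
```

Saving the output of the command on each Mininet host and passing the text to `check_provisioning` gives one summary per controller; the three `total_flows` values should add up to 30000 when provisioning succeeds.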

      Test Results:
      =============
      1. I could not see 30000 flows (1000 flows per switch), and I get the exception below (the full stack trace can be seen in the logs):
      Caused by: akka.pattern.AskTimeoutException: Ask timed out on [ActorSelection[Anchor(akka.tcp://opendaylight-cluster-data@10.183.181.41:2550/), Path(/user/shardmanager-operational/member-1-shard-inventory-operational/shard-member-2-chn-157-txn-1#1338207586)]] after [5000 ms]

      2. Once this condition occurs, no further flow provisioning via the config DS works, and the system requires a complete cluster reboot to restore normalcy.

      Attaching the karaf logs of all 3 controller nodes.
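When going through the karaf logs of the three nodes, it may help to tally the `AskTimeoutException` lines by target member and shard. The sketch below is built only from the single exception line quoted in the test results; the regular expression, the assumption that the shard actor is the third path segment, and the function names are all illustrative.

```python
import re
from collections import Counter

# Pattern derived from the one exception line quoted in this report.
ASK_TIMEOUT_RE = re.compile(
    r"AskTimeoutException: Ask timed out on \[ActorSelection\[Anchor\("
    r"akka\.tcp://(?P<system>[^@]+)@(?P<host>[^:]+):(?P<port>\d+)/\), "
    r"Path\((?P<path>[^)]*)\)\]\] after \[(?P<timeout_ms>\d+) ms\]"
)


def parse_ask_timeout(line):
    """Return target host, port, shard and timeout for one log line, or None."""
    m = ASK_TIMEOUT_RE.search(line)
    if m is None:
        return None
    segments = [s for s in m.group("path").split("/") if s]
    return {
        "host": m.group("host"),
        "port": int(m.group("port")),
        # Assumption: path is /user/<shard manager>/<shard>/<transaction actor>.
        "shard": segments[2] if len(segments) > 2 else None,
        "timeout_ms": int(m.group("timeout_ms")),
    }


def count_timeouts(lines):
    """Count ask timeouts per (host, shard) over an iterable of log lines."""
    hits = Counter()
    for line in lines:
        info = parse_ask_timeout(line)
        if info is not None:
            hits[(info["host"], info["shard"])] += 1
    return hits
```

Passing an open log file from each node to `count_timeouts` shows whether the timeouts concentrate on one member or one shard, which is a quick first cut before reading the full stack traces.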

      Thanks & Regards,
      Saibal Roy.

      Attachments

        1. logs.zip
          154 kB
        2. logs.zip
          353 kB

          People

            Avishnoi Anil Vishnoi
            saibal.roy@ericsson.com Saibal Roy
            Votes: 0
            Watchers: 4
