  1. genius
  2. GENIUS-263

tunnels down after bouncing ODL nodes in netvirt csit 3node HA job


    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: High
    • None
    • None
    • ITM

      In our 3-node netvirt jobs, the default tunnels all end up showing as 'down'
      after the first suite that bounces ODL instances has run.

      example job

      The test code uses odltools to check whether all the tunnels are up. The command
      and response:

      odltools netvirt analyze tunnels -i 10.30.170.109 -t 8181 -u admin -w admin --path /tmp/07_ha_l3_Suite_Setup
      
      2019-01-03 03:11:41,564 | ERR | common.rest_client   | 0052 | 404 Client Error: Not Found for url: http://10.30.170.109:8181/restconf/config/itm-state:dpn-teps-state
      Analysing transport-zone:default-transport-zone
      ..Interface tun65d79967da1 is down between 10.30.170.163 and 10.30.170.27
      ..Interface tun8186ae8b8b0 is down between 10.30.170.27 and 10.30.170.163
      ..Interface tunaddd45e0aa2 is down between 10.30.170.163 and 10.30.170.170
      ..Interface tun0a682004fbe is down between 10.30.170.27 and 10.30.170.170
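
      For comparing runs, the "is down" lines in the response above can be pulled out
      with a small script. This is only a sketch: the regular expression is inferred
      from the four lines shown here, and parse_down_tunnels is a hypothetical helper,
      not part of odltools.

```python
import re

# Assumption: every affected tunnel is reported on a line shaped like
# "..Interface <name> is down between <ip> and <ip>", as in the output above.
DOWN_LINE = re.compile(r"Interface (\S+) is down between (\S+) and (\S+)")


def parse_down_tunnels(output):
    """Return (interface, ip_a, ip_b) tuples for every 'is down' line."""
    matches = (DOWN_LINE.search(line) for line in output.splitlines())
    return [m.groups() for m in matches if m]
```

      Feeding it the captured output of each suite gives two lists that can be diffed
      directly.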
      

      But comparing the debug output from the preceding suite with the debug output
      from the failure, I can't find any difference.

      Taking one interface, "tun65d79967da1", as an example, here is some output:

      operational/itm-state:tunnels_state

      {
                      "dst-info": {
                          "tep-device-id": "140946245075061",
                          "tep-device-type": "itm-state:tep-type-internal",
                          "tep-ip": "10.30.170.27"
                      },
                      "oper-state": "unknown",
                      "src-info": {
                          "tep-device-id": "62509838011292",
                          "tep-device-type": "itm-state:tep-type-internal",
                          "tep-ip": "10.30.170.163"
                      },
                      "transport-type": "odl-interface:tunnel-type-vxlan",
                      "tunnel-interface-name": "tun65d79967da1",
                      "tunnel-state": false
      },
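
      A minimal sketch of how entries like the one above could be checked in bulk.
      It assumes each entry is a dict with the same keys as shown here, and that a
      healthy tunnel reports "tunnel-state": true with "oper-state": "up"; both
      function names are hypothetical.

```python
def tunnel_is_up(entry):
    """True only when both state fields report the tunnel as up.

    Assumption: "up" is the healthy oper-state value; the entry above shows
    "unknown" together with tunnel-state false.
    """
    return entry.get("tunnel-state") is True and entry.get("oper-state") == "up"


def tunnels_not_up(entries):
    """Return the tunnel-interface-name of every entry that is not up."""
    return [e.get("tunnel-interface-name") for e in entries if not tunnel_is_up(e)]
```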
      

      ovs-ofctl show

      2(tun65d79967da1): addr:52:fe:71:a9:ad:14
           config:     0
           state:      LIVE
           speed: 0 Mbps now, 0 Mbps max
      

      ovs-vsctl show

      Port "tun65d79967da1"
                  Interface "tun65d79967da1"
                      type: vxlan
                      options: {key=flow, local_ip="10.30.170.163", remote_ip="10.30.170.27"}
      

      It does seem that at least one ODL may be in trouble after it comes back up:
      I see some clustering INFO messages that seem to indicate something is out of sync.
      For example:

      2019-01-03T03:04:38,348 | INFO  | opendaylight-cluster-data-shard-dispatcher-21 | Shard                            | 229 - org.opendaylight.controller.sal-clustering-commons - 1.8.2 | member-3-shard-default-config (Follower): The log is not empty but the prevLogIndex 19042 was not found in it - lastIndex: 17875, snapshotIndex: -1
      2019-01-03T03:04:38,348 | INFO  | opendaylight-cluster-data-shard-dispatcher-21 | Shard                            | 229 - org.opendaylight.controller.sal-clustering-commons - 1.8.2 | member-3-shard-default-config (Follower): Follower is out-of-sync so sending negative reply: AppendEntriesReply [term=23, success=false, followerId=member-3-shard-default-config, logLastIndex=17875, logLastTerm=4, forceInstallSnapshot=false, payloadVersion=9, raftVersion=3]
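
      To gauge how far behind the follower is, the two indices in the first message
      can be compared. A rough sketch, assuming the message wording stays exactly as
      logged above; log_gap is a hypothetical helper.

```python
import re

# Assumption: the Shard message always carries both indices in this wording.
GAP = re.compile(r"prevLogIndex (\d+) was not found in it - lastIndex: (\d+)")


def log_gap(line):
    """Return prevLogIndex minus lastIndex, or None if the line does not match."""
    m = GAP.search(line)
    if m is None:
        return None
    prev_index, last_index = map(int, m.groups())
    return prev_index - last_index
```

      For the message above this gives 19042 - 17875 = 1167 entries between the
      follower's last index and the index the leader asked about.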
      

            enidadh nidhi adhvaryu
            jluhrsen Jamo Luhrsen
            Votes: 0
            Watchers: 3


                Original Estimate: Not Specified
                Remaining Estimate: 0m
                Time Spent: 1d