[NETVIRT-130] Netvirt fails to add all the ODL nodes as controllers to br-int Created: 09/Sep/16 Updated: 09/Mar/18 Resolved: 09/Mar/18 |
|
| Status: | Resolved |
| Project: | netvirt |
| Component/s: | None |
| Affects Version/s: | Boron |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Medium |
| Reporter: | Venkatrangan Govindarajan | Assignee: | Unassigned |
| Resolution: | Won't Do | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Operating System: All |
||
| Attachments: |
|
| External issue ID: | 6685 |
| Description |
|
Step 1: Set up a 3-node ODL cluster.
Step 2: Stack three OpenStack nodes.
Step 3: On one of the nodes, `sudo ovs-vsctl show` displayed br-int with only one ODL node added as controller. This is a bug because if that ODL node goes down, the instances on that compute node will not work correctly.
Please note: a failover was actually performed on ODL1 for some testing, so the entity-owner data was collected after the failover and does not reflect the situation before the failover, i.e. when the problem occurred. The karaf logs should still be helpful.
When the ODL managers are set in local.conf as ODL_OVS_MANAGERS=10.128.0.9,10.128.0.5,10.128.0.6, the expectation is that any br-int created by Netvirt will have all of these ODL nodes set as OpenFlow controllers. Among the three OpenStack nodes, the br-int on one node had only 10.128.0.9 as its OpenFlow controller. |
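The expectation above can be checked mechanically on each OVS node. The sketch below iterates over the three manager IPs from local.conf and flags any that are missing from br-int's controller list. The `controllers` variable holds a sample of what `sudo ovs-vsctl get-controller br-int` would have returned on the faulty node (only 10.128.0.9), reconstructed from this report; on a live node you would substitute the real command output.

```shell
# Controller list as reported on the faulty compute node; on a live
# node this would come from: sudo ovs-vsctl get-controller br-int
controllers="tcp:10.128.0.9:6653"

# IPs taken from ODL_OVS_MANAGERS in local.conf
for ip in 10.128.0.9 10.128.0.5 10.128.0.6; do
  if printf '%s\n' "$controllers" | grep -q "tcp:${ip}:6653"; then
    echo "controller ${ip}: present"
  else
    echo "controller ${ip}: MISSING"
  fi
done
```

With the reported state, the loop flags 10.128.0.5 and 10.128.0.6 as missing, matching the symptom described above.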
| Comments |
| Comment by Venkatrangan Govindarajan [ 09/Sep/16 ] |
|
Attachment odl1_log.tgz has been added with description: ODL1 Logs |
| Comment by Venkatrangan Govindarajan [ 09/Sep/16 ] |
|
Attachment odl2_log.tgz has been added with description: ODL2 logs |
| Comment by Venkatrangan Govindarajan [ 09/Sep/16 ] |
|
Attachment inventory_contents.json has been added with description: Inventory DS contents |
| Comment by Venkatrangan Govindarajan [ 09/Sep/16 ] |
|
Attachment topology_config.json has been added with description: Topology Config created by Netvirt |
| Comment by Venkatrangan Govindarajan [ 09/Sep/16 ] |
|
Attachment topology.json has been added with description: Operational topology from device |
| Comment by Venkatrangan Govindarajan [ 09/Sep/16 ] |
|
Attachment odl3_log.tgz has been added with description: ODL3 logs |
| Comment by Venkatrangan Govindarajan [ 09/Sep/16 ] |
|
The ovs-vsctl show output from the three OpenStack nodes.

Control node:
[gvrangan@openstack-control-node devstack]$ sudo ovs-vsctl show
    Port "vxlan-10.128.0.7"
ovs_version: "2.5.0"

[gvrangan@compute2 devstack]$ sudo ovs-vsctl list open_vswitch
iface_types : [geneve, gre, internal, ipsec_gre, lisp, patch, stt, system, tap, vxlan]
ovs_version : "2.5.0" |
| Comment by Venkatrangan Govindarajan [ 09/Sep/16 ] |
|
Node on which the problem was seen:
[gvrangan@compute1 devstack]$ sudo ovs-vsctl show
    Port br-int
    Port "tap0682360f-2c"
iface_types : [geneve, gre, internal, ipsec_gre, lisp, patch, stt, system, tap, vxlan]
ovs_version : "2.5.0" |
| Comment by Venkatrangan Govindarajan [ 09/Sep/16 ] |
|
Final node:
[gvrangan@compute2 devstack]$ sudo ovs-vsctl list open_vswitch
iface_types : [geneve, gre, internal, ipsec_gre, lisp, patch, stt, system, tap, vxlan]
ovs_version : "2.5.0"
[gvrangan@compute2 devstack]$
    Port "vxlan-10.128.0.4"
    Port br-int |
| Comment by Venkatrangan Govindarajan [ 09/Sep/16 ] |
|
(In reply to Venkatrangan Govindarajan from comment #0) Also note: I had created some instances for testing. |
| Comment by Venkatrangan Govindarajan [ 09/Sep/16 ] |
|
The topology information and inventory information were collected after the failover only. |
| Comment by ranjithkumar_t [ 19/Sep/16 ] |
|
I have analysed the karaf logs attached to the bug. My observations are as follows.

• The node 10.128.0.9 is not a member of the cluster. Lines observed in the karaf logs:

2016-09-09 18:46:00,472 | WARN | ult-dispatcher-5 | ReliableDeliverySupervisor | 154 - com.typesafe.akka.slf4j - 2.4.7 | Association with remote system [akka.tcp://opendaylight-cluster-data@10.128.0.9:2550] has failed, address is now gated for [5000] ms. Reason: [Association failed with [akka.tcp://opendaylight-cluster-data@10.128.0.9:2550]] Caused by: [Connection refused: /10.128.0.9:2550]

2016-09-09 19:52:55,245 | INFO | ult-dispatcher-7 | kka://opendaylight-cluster-data) | 154 - com.typesafe.akka.slf4j - 2.4.7 | Cluster Node [akka.tcp://opendaylight-cluster-data@10.128.0.5:2550] - Leader can currently not perform its duties, reachability status: [akka.tcp://opendaylight-cluster-data@10.128.0.5:2550 -> akka.tcp://opendaylight-cluster-data@10.128.0.9:2550: Unreachable [Unreachable] (1), akka.tcp://opendaylight-cluster-data@10.128.0.6:2550 -> akka.tcp://opendaylight-cluster-data@10.128.0.9:2550: Unreachable [Unreachable] (1)], member status: [akka.tcp://opendaylight-cluster-data@10.128.0.5:2550 Up seen=true, akka.tcp://opendaylight-cluster-data@10.128.0.6:2550 Up seen=true, akka.tcp://opendaylight-cluster-data@10.128.0.9:2550 Up seen=false]

• The controller entries in the operational data also show 10.128.0.9 as disconnected:

[_value=tcp:10.128.0.6:6653], isIsConnected=true, augmentations={}}, ControllerEntry{getControllerUuid=Uuid [_value=b56f65ee-999a-48f1-ae43-1e529c5cb48a], getTarget=Uri [_value=tcp:10.128.0.5:6653], isIsConnected=true, augmentations={}}, ControllerEntry{getControllerUuid=Uuid [_value=82d358f0-629d-44da-8cd0-923002bd68f5], getTarget=Uri [_value=tcp:10.128.0.9:6653], isIsConnected=false, augmentations={}}], getDatapathId=DatapathId [_value=00:00:b4:9c:52:d0:cb:95]

I have tested this many times manually, but the bug was not reproduced. |
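The unreachable-member diagnosis above can be partly automated by scanning karaf.log for Akka association failures. The sketch below runs the extraction over a sample line copied verbatim from the log excerpt in this comment; on a real deployment you would point grep at data/log/karaf.log instead of a here-string.

```shell
# Sample WARN line copied from the karaf log excerpt in this comment.
log='2016-09-09 18:46:00,472 | WARN | ult-dispatcher-5 | ReliableDeliverySupervisor | Association with remote system [akka.tcp://opendaylight-cluster-data@10.128.0.9:2550] has failed, address is now gated for [5000] ms.'

# Extract the cluster members that failed association; on a live node:
#   grep 'Association with remote system' data/log/karaf.log | grep -o ...
printf '%s\n' "$log" \
  | grep -o 'opendaylight-cluster-data@[0-9.]*:2550' \
  | sort -u
```

For the sample line this prints the single unreachable member, opendaylight-cluster-data@10.128.0.9:2550, consistent with the analysis above.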
| Comment by Sam Hague [ 09/Mar/18 ] |
|
Legacy NetVirt is deprecated. |