[OVSDB-329] br-int not created in clustered setup Created: 13/Apr/16  Updated: 28/Jun/16  Resolved: 28/Jun/16

Status: Resolved
Project: ovsdb
Component/s: Southbound.Open_vSwitch
Affects Version/s: unspecified
Fix Version/s: None

Type: Bug
Reporter: Jamo Luhrsen Assignee: Vinh Nguyen
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Attachments: File 3ControllerLogs.tar.gz    
External issue ID: 5721

 Description   

Three ODL controllers (features odl-mdsal-clustering, odl-jolokia, odl-ovsdb-openstack) were freshly
unzipped and started, and then the steps below were performed. This is with a recent (4/12)
stable/beryllium distribution.

ODL_1=172.18.182.15
ODL_2=172.18.182.3
ODL_3=172.18.182.19

Two OVS nodes, A=172.18.182.7 and B=172.18.182.17, were each configured with these two commands:

ovs-vsctl set Open_vSwitch $OVS_UUID other_config:local_ip=$OVS_SYSTEM_IP
ovs-vsctl set-manager tcp:$ODL_1:6640 tcp:$ODL_2:6640 tcp:$ODL_3:6640
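
The manager list above can be assembled from the controller addresses. This is a minimal sketch, not part of the original report: manager_targets is a hypothetical helper name, and port 6640 is taken from the command above.

```shell
#!/bin/sh
# Controller addresses as listed in this report.
ODL_1=172.18.182.15
ODL_2=172.18.182.3
ODL_3=172.18.182.19

# manager_targets (hypothetical helper): print one "tcp:<ip>:6640" target per
# argument, space separated, in the form passed to ovs-vsctl set-manager.
manager_targets() {
    targets=""
    for ip in "$@"; do
        targets="${targets:+$targets }tcp:${ip}:6640"
    done
    printf '%s\n' "$targets"
}

# Usage on an OVS node (not executed here):
#   ovs-vsctl set-manager $(manager_targets "$ODL_1" "$ODL_2" "$ODL_3")
```

Keeping the addresses in one place makes it harder to mistype a variable name when the same list is applied to several OVS nodes.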

A received the br-int config and netvirt pipeline, but B did not.

ovs output from A:
-------------------------

  # ovs-vsctl show
    ce273766-0e83-4f5f-87d7-e5014531dc60
        Manager "tcp:172.18.182.19:6640"
            is_connected: true
        Manager "tcp:172.18.182.3:6640"
            is_connected: true
        Manager "tcp:172.18.182.15:6640"
            is_connected: true
        Bridge br-int
            Controller "tcp:172.18.182.15:6653"
                is_connected: true
            Controller "tcp:172.18.182.3:6653"
                is_connected: true
            Controller "tcp:172.18.182.19:6653"
                is_connected: true
            fail_mode: secure
            Port br-int
                Interface br-int
                    type: internal
        ovs_version: "2.0.2"
  # ovs-ofctl -OOpenflow13 dump-flows br-int
    OFPST_FLOW reply (OF1.3) (xid=0x2):
    cookie=0x0, duration=496.032s, table=0, n_packets=0, n_bytes=0, dl_type=0x88cc actions=CONTROLLER:65535
    cookie=0x0, duration=496.032s, table=0, n_packets=7, n_bytes=558, priority=0 actions=goto_table:20
    cookie=0x0, duration=496.022s, table=20, n_packets=7, n_bytes=558, priority=0 actions=goto_table:30
    cookie=0x0, duration=496.003s, table=30, n_packets=7, n_bytes=558, priority=0 actions=goto_table:40
    cookie=0x0, duration=495.976s, table=40, n_packets=7, n_bytes=558, priority=0 actions=goto_table:50
    cookie=0x0, duration=495.95s, table=50, n_packets=7, n_bytes=558, priority=0 actions=goto_table:60
    cookie=0x0, duration=495.929s, table=60, n_packets=7, n_bytes=558, priority=0 actions=goto_table:70
    cookie=0x0, duration=495.901s, table=70, n_packets=7, n_bytes=558, priority=0 actions=goto_table:80
    cookie=0x0, duration=495.877s, table=80, n_packets=7, n_bytes=558, priority=0 actions=goto_table:90
    cookie=0x0, duration=495.837s, table=90, n_packets=7, n_bytes=558, priority=0 actions=goto_table:100
    cookie=0x0, duration=495.818s, table=100, n_packets=7, n_bytes=558, priority=0 actions=goto_table:110
    cookie=0x0, duration=495.798s, table=110, n_packets=7, n_bytes=558, priority=0 actions=drop
    root@jamo-mininet-ubuntu14:/home/ubuntu#

ovs output from B:
-------------------------

  # ovs-vsctl show
    d2f2225b-9b42-4cb7-b0d2-8e4f5f03761a
        Manager "tcp:172.18.182.19:6640"
            is_connected: true
        Manager "tcp:172.18.182.3:6640"
            is_connected: true
        Manager "tcp:172.18.182.15:6640"
            is_connected: true
        ovs_version: "2.4.0"
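
The difference between A and B is whether a "Bridge br-int" entry appears in the ovs-vsctl show output. A minimal sketch of that check follows; has_bridge is a hypothetical helper name, and it only inspects text on stdin, so it can be run against saved output.

```shell
#!/bin/sh
# has_bridge (hypothetical helper): succeed if the `ovs-vsctl show` text on
# stdin contains a "Bridge <name>" line, with or without quotes around the name.
has_bridge() {
    grep -Eq "^[[:space:]]*Bridge \"?$1\"?[[:space:]]*\$"
}

# Usage on an OVS node (not executed here):
#   ovs-vsctl show | has_bridge br-int || echo "br-int missing"
```

On a live node, ovs-vsctl br-exists br-int gives the same answer directly through its exit status.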

On repeated attempts, the problem remained. After doing a "del-manager" followed by a "set-manager", I could
see two controllers log that they were "NOT an OWNER of the device". The third controller would log
this exception:

2016-04-12 23:19:19,704 | WARN | pool-37-thread-3 | OvsdbConnectionManager | 252 - org.opendaylight.ovsdb.southbound-impl - 1.2.3.SNAPSHOT | OVSDB entity Entity{type='ovsdb', id=/(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)network-topology/topology/topology[{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)topology-id=ovsdb:1}]/node/node[{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node-id=ovsdb://uuid/d2f2225b-9b42-4cb7-b0d2-8e4f5f03761a}]} was already registered for ownership
org.opendaylight.controller.md.sal.common.api.clustering.CandidateAlreadyRegisteredException: Candidate has already been registered for Entity{type='ovsdb', id=/(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)network-topology/topology/topology[{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)topology-id=ovsdb:1}]/node/node[{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node-id=ovsdb://uuid/d2f2225b-9b42-4cb7-b0d2-8e4f5f03761a}]}
at org.opendaylight.controller.cluster.datastore.entityownership.DistributedEntityOwnershipService.registerCandidate(DistributedEntityOwnershipService.java:142)[165:org.opendaylight.controller.sal-distributed-datastore:1.3.2.SNAPSHOT]
at org.opendaylight.ovsdb.southbound.OvsdbConnectionManager.registerEntityForOwnership(OvsdbConnectionManager.java:515)[252:org.opendaylight.ovsdb.southbound-impl:1.2.3.SNAPSHOT]
at org.opendaylight.ovsdb.southbound.OvsdbConnectionManager.connected(OvsdbConnectionManager.java:115)[252:org.opendaylight.ovsdb.southbound-impl:1.2.3.SNAPSHOT]
at org.opendaylight.ovsdb.lib.impl.OvsdbConnectionService$5.run(OvsdbConnectionService.java:379)[249:org.opendaylight.ovsdb.library:1.2.3.SNAPSHOT]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)[:1.8.0_77]
at java.util.concurrent.FutureTask.run(FutureTask.java:266)[:1.8.0_77]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)[:1.8.0_77]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)[:1.8.0_77]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)[:1.8.0_77]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)[:1.8.0_77]
at java.lang.Thread.run(Thread.java:745)[:1.8.0_77]
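
To see which of the three controllers hit this exception, each karaf.log can be searched for the exception class name. This is a minimal sketch: count_ownership_conflicts is a hypothetical helper name, and the log path is whatever the local Karaf installation uses.

```shell
#!/bin/sh
# count_ownership_conflicts (hypothetical helper): print how many lines of the
# given log file mention CandidateAlreadyRegisteredException.
count_ownership_conflicts() {
    grep -c 'CandidateAlreadyRegisteredException' "$1"
}

# Usage on a controller (not executed here; the path is an assumed default):
#   count_ownership_conflicts data/log/karaf.log
```

The exception class appears once per failed registration in the trace above, so the count approximates the number of rejected connection attempts on that controller.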

The workaround was to stop ovs, delete the /etc/openvswitch/conf.db, start ovs and set the managers
again. This time br-int was created and the netvirt pipeline showed up.
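
The workaround can be sketched as a script. This is an illustration under stated assumptions, not a verified procedure: reset_ovs_and_reconnect and run are hypothetical helper names, and the service name openvswitch-switch is assumed from Debian/Ubuntu packaging. Only the conf.db path and the set-manager step come from this report.

```shell
#!/bin/sh
# run (hypothetical helper): with DRY_RUN=1, print the command instead of
# executing it, so the sequence can be reviewed first.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# reset_ovs_and_reconnect (hypothetical helper): stop OVS, delete the local
# database, start OVS, then set the managers given as arguments.
# ASSUMPTION: the service is named openvswitch-switch (Debian/Ubuntu).
reset_ovs_and_reconnect() {
    run service openvswitch-switch stop
    run rm -f /etc/openvswitch/conf.db
    run service openvswitch-switch start
    run ovs-vsctl set-manager "$@"
}

# Usage (dry run, prints the four commands):
#   DRY_RUN=1 reset_ovs_and_reconnect tcp:172.18.182.15:6640
```

Deleting conf.db discards all local OVS configuration, so the other_config:local_ip setting presumably has to be reapplied as well; the report does not say whether that was done.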



 Comments   
Comment by Jamo Luhrsen [ 13/Apr/16 ]

Attachment 3ControllerLogs.tar.gz has been added with description: controller logs

Comment by Venkatrangan Govindarajan [ 11/May/16 ]

I am also able to reproduce the scenario at my end using Beryllium SR2 and a 3-node deployment with feature: odl-ovsdb-openstack

There was an exception:

ifier KeyedInstanceIdentifier{targetType=interface org.opendaylight.yang.gen.v1.urn.tbd.params.xml.ns.yang.network.topology.rev131021.network.topology.topology.Node, path=[org.opendaylight.yang.gen.v1.urn.tbd.params.xml.ns.yang.network.topology.rev131021.NetworkTopology, org.opendaylight.yang.gen.v1.urn.tbd.params.xml.ns.yang.network.topology.rev131021.network.topology.Topology[key=TopologyKey [_topologyId=Uri [_value=ovsdb:1]]], org.opendaylight.yang.gen.v1.urn.tbd.params.xml.ns.yang.network.topology.rev131021.network.topology.topology.Node[key=NodeKey [_nodeId=Uri [_value=ovsdb://uuid/4e1fa111-de62-4b68-89b5-e95ed2404459]]]]} generated for device connection ConnectionInfo [Remote-address=10.128.0.7, Remote-port=47579, Local-address10.128.0.6, Local-port=6640, type=PASSIVE]
2016-05-11 20:13:06,817 | WARN | pool-38-thread-7 | OvsdbConnectionManager | 238 - org.opendaylight.ovsdb.southbound-impl - 1.2.3.Beryllium-SR2 | OVSDB entity Entity{type='ovsdb', id=/(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)network-topology/topology/topology[{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)topology-id=ovsdb:1}]/node/node[{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node-id=ovsdb://uuid/4e1fa111-de62-4b68-89b5-e95ed2404459}]} was already registered for ownership
org.opendaylight.controller.md.sal.common.api.clustering.CandidateAlreadyRegisteredException: Candidate has already been registered for Entity{type='ovsdb', id=/(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)network-topology/topology/topology[{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)topology-id=ovsdb:1}]/node/node[{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node-id=ovsdb://uuid/4e1fa111-de62-4b68-89b5-e95ed2404459}]}
at org.opendaylight.controller.cluster.datastore.entityownership.DistributedEntityOwnershipService.registerCandidate(DistributedEntityOwnershipService.java:142)[152:org.opendaylight.controller.sal-distributed-datastore:1.3.2.Beryllium-SR2]
at org.opendaylight.ovsdb.southbound.OvsdbConnectionManager.registerEntityForOwnership(OvsdbConnectionManager.java:515)[238:org.opendaylight.ovsdb.southbound-impl:1.2.3.Beryllium-SR2]
at org.opendaylight.ovsdb.southbound.OvsdbConnectionManager.connected(OvsdbConnectionManager.java:115)[238:org.opendaylight.ovsdb.southbound-impl:1.2.3.Beryllium-SR2]
at org.opendaylight.ovsdb.lib.impl.OvsdbConnectionService$5.run(OvsdbConnectionService.java:379)[235:org.opendaylight.ovsdb.library:1.2.3.Beryllium-SR2]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)[:1.7.0_101]
at java.util.concurrent.FutureTask.run(FutureTask.java:262)[:1.7.0_101]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)[:1.7.0_101]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)[:1.7.0_101]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)[:1.7.0_101]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)[:1.7.0_101]
at java.lang.Thread.run(Thread.java:745)[:1.7.0_101]
2016-05-11 21:16:39,825 | INFO | ult-dispatcher-6 | Shard

Comment by Venkatrangan Govindarajan [ 11/May/16 ]

Steps that led to the failure:

a. Set up a 3-node ODL cluster
b. Stack up the control node
c. Stack up the compute node
d. Bring down the compute node
e. Bring the compute node up again; br-int was not created, and karaf.log showed the exception.

Comment by Vinh Nguyen [ 17/May/16 ]

Code review:
https://git.opendaylight.org/gerrit/#/c/39002/

Comment by Anil Vishnoi [ 28/Jun/16 ]

This bug is fixed by the following patches:

master : https://git.opendaylight.org/gerrit/#/c/40541/2
stable/beryllium : https://git.opendaylight.org/gerrit/#/c/39002/

Generated at Wed Feb 07 20:36:07 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.