[OVSDB-370] br-int not created when del-manager and set-manager in quick succession Created: 01/Sep/16  Updated: 19/Oct/17  Resolved: 06/Oct/16

Status: Verified
Project: ovsdb
Component/s: Clustering
Affects Version/s: unspecified
Fix Version/s: None

Type: Bug
Reporter: Vinh Nguyen Assignee: Unassigned
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Attachments: Zip Archive LOG.zip     File odl1.log.tar.gz     File odl2.log.tar.gz     File odl3.log.tar.gz    
External issue ID: 6603

 Description   

br-int is not created when an OVS switch is disconnected from a 3-node ODL cluster and then quickly reconnected. The reproduction steps are below:

1) Start a 3-node cluster, say
10.138.0.2, 10.138.0.3, 10.138.0.4
2) Connect an OVS switch to the cluster
ovs-vsctl set-manager tcp:10.138.0.2:6640 tcp:10.138.0.3:6640 tcp:10.138.0.4:6640
3) br-int is properly created on the OVS switch
    Manager "tcp:10.138.0.2:6640"
    Manager "tcp:10.138.0.3:6640"
    Manager "tcp:10.138.0.4:6640"
    Bridge br-int
        Controller "tcp:10.138.0.2:6653"
            is_connected: true
        Controller "tcp:10.138.0.4:6653"
            is_connected: true
        Controller "tcp:10.138.0.3:6653"
            is_connected: true
        fail_mode: secure
        Port br-int
            Interface br-int
                type: internal
    ovs_version: "2.0.2"

4) Delete br-int, disconnect the OVS, and reconnect it in quick succession, i.e.
ovs-vsctl del-br br-int; ovs-vsctl del-manager; ovs-vsctl set-manager tcp:10.138.0.2:6640 tcp:10.138.0.3:6640 tcp:10.138.0.4:6640

5) Intermittently (one out of five tries), br-int is not created after the OVS connects to the cluster:

0af0ec21-563d-40cf-900c-6a66ef91a34d
    Manager "tcp:10.138.0.2:6640"
        is_connected: true
    Manager "tcp:10.138.0.3:6640"
        is_connected: true
    Manager "tcp:10.138.0.4:6640"
        is_connected: true
    ovs_version: "2.0.2"

When the issue in 5) happens, there is no entry for the OVS connection in the operational DS. That means netvirt never gets the ADD node event that triggers creation of br-int in the first place. The delete and create events for the same node might have clobbered each other.
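
One way to check for the missing operational DS entry is to query the ovsdb topology over RESTCONF. The following is a minimal sketch, not something taken from this report: it assumes RESTCONF listens on port 8181, that the default admin/admin credentials are in use, and that the topology ID is ovsdb:1. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch: look for OVSDB nodes in the ODL operational datastore.
# Assumptions (not from this report): RESTCONF on port 8181, admin/admin
# credentials, and the topology ID "ovsdb:1".

# Build the operational-DS URL for the ovsdb topology on one controller.
ovsdb_topology_url() {
    printf 'http://%s:8181/restconf/operational/network-topology:network-topology/topology/ovsdb:1\n' "$1"
}

# Fetch the topology as JSON; curl -f turns HTTP errors into a non-zero exit.
fetch_ovsdb_topology() {
    curl -sf -u admin:admin -H 'Accept: application/json' "$(ovsdb_topology_url "$1")"
}

# Count "node-id" keys in the JSON read from stdin (0 means no node entries).
count_node_ids() {
    grep -o '"node-id"' | wc -l | tr -d ' '
}
```

For example, `fetch_ovsdb_topology 10.138.0.2 | count_node_ids` printing 0 right after step 4 would be consistent with the OVS node never reaching the operational DS.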

I am testing with the latest Boron code as of Aug 30, 2016.



 Comments   
Comment by Vinh Nguyen [ 01/Sep/16 ]

Attachment odl1.log.tar.gz has been added with description: karaf.log odl1

Comment by Vinh Nguyen [ 01/Sep/16 ]

Attachment odl2.log.tar.gz has been added with description: karaf.log odl2

Comment by Vinh Nguyen [ 01/Sep/16 ]

Attachment odl3.log.tar.gz has been added with description: karaf.log ODL3

Comment by Arthi Bhattacharjee [ 02/Sep/16 ]

Vinh,
Setup:

  • 3-node cluster setup and a control node.
  • OpenStack versions: Mitaka and Liberty.
  • Distribution patch: distribution-karaf-0.5.0-20160902.020649-4739.tar.gz

We tested with the above distribution and could not reproduce the bug. We disconnected and reconnected the OVS 10 times each on both Mitaka and Liberty.

Can you please let us know if there are any specific steps to follow to reproduce the issue?

Comment by Arthi Bhattacharjee [ 02/Sep/16 ]

Attachment LOG.zip has been added with description: Karaf logs

Comment by Vinh Nguyen [ 02/Sep/16 ]

Hi Arthi,

I noticed in the attached logs that the interval between disconnecting and reconnecting the OVS switch is a few seconds or more. Please retry with a smaller interval, say 1 second or none. Also, please remove br-int before disconnecting the OVS. The steps I use are listed in the description section.
Please run the command in step 4 repeatedly until br-int no longer appears in the output of the 'show' command.

sudo ovs-vsctl del-br br-int; sudo ovs-vsctl del-manager; sudo ovs-vsctl set-manager tcp:<odl1-ip>:6640 tcp:<odl2-ip>:6640 tcp:<odl3-ip>:6640; sleep 2; sudo ovs-vsctl show
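
Since the failure is intermittent, the command above could be wrapped in a loop that stops at the first miss. This is a rough sketch only: the function names and the attempt limit are hypothetical, the 2-second wait mirrors the command above, and `--if-exists` is added so that a missing br-int does not make `del-br` report an error.

```shell
#!/bin/sh
# Sketch: repeat the step 4 sequence until br-int fails to appear.
# Assumptions: function names and the attempt limit are illustrative only.

# Succeeds when the `ovs-vsctl show` output on stdin lists the br-int bridge.
has_br_int() {
    grep -Eq '^[[:space:]]*Bridge "?br-int"?[[:space:]]*$'
}

# Run one delete/disconnect/reconnect cycle; arguments are the manager targets.
run_cycle() {
    sudo ovs-vsctl --if-exists del-br br-int
    sudo ovs-vsctl del-manager
    sudo ovs-vsctl set-manager "$@"
    sleep 2   # give the controller time to recreate br-int
}

# Repeat up to $1 times; stop at the first cycle where br-int is missing.
# Returns 0 if the issue was hit, 1 if br-int came back every time.
repro_loop() {
    max=$1; shift
    i=1
    while [ "$i" -le "$max" ]; do
        run_cycle "$@"
        if ! sudo ovs-vsctl show | has_br_int; then
            echo "br-int missing after attempt $i"
            return 0
        fi
        i=$((i + 1))
    done
    echo "br-int present after all $max attempts"
    return 1
}
```

For example: `repro_loop 20 tcp:<odl1-ip>:6640 tcp:<odl2-ip>:6640 tcp:<odl3-ip>:6640`.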

Comment by Arthi Bhattacharjee [ 17/Sep/16 ]

Yes, I deleted the bridges before setting the manager.
I also ran the step 4 commands all at once. The bug still did not reproduce.

Comment by Vinh Nguyen [ 05/Oct/16 ]

The bug cannot be reproduced with the latest Boron release: distribution-karaf-0.5.0-Boron
