[NETVIRT-770] The time taken for Logical Switch creation and tunnel I/F creation increases dramatically when repeating l2gw node addition/deletion. Created: 07/Jul/17  Updated: 08/Nov/19  Resolved: 08/Nov/19

Status: Resolved
Project: netvirt
Component/s: General
Affects Version/s: Carbon
Fix Version/s: None

Type: Bug Priority: Medium
Reporter: Ran Xiao Assignee: Unassigned
Resolution: Won't Do Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Attachments: File data.tar.gz     File karaf.log.1.xz     File karaf.log.2.xz     File karaf.log.3.xz     File karaf.log.xz    
External issue ID: 8819

 Description   

Issue:
We repeatedly added and deleted an l2gw node on an Open vSwitch HWVTEP Emulator.
For the first 10 iterations, the Logical Switch and tunnel I/F were created about 6 seconds after l2gw connection creation.
After 100 iterations, Logical Switch and tunnel I/F creation took more than 300 seconds.

Environment details:
OpenStack Version:stable/ocata
ODL Version:Carbon-FR + patch
patch: https://git.opendaylight.org/gerrit/#/c/56773/
https://git.opendaylight.org/gerrit/#/c/56710/
https://git.opendaylight.org/gerrit/#/c/58787/
https://git.opendaylight.org/gerrit/#/c/59598/
HWVTEP: Open vSwitch 2.6.1 HWVTEP Emulator
: HA Cluster

What we did (Steps):
1. L2GW Node Initialization:
a. delete L2GW GATEWAY/CONNECTION
b. stop OVS process and initialize
2. Network/subnet deletion:
neutron net-delete test-nw
3. Network/subnet creation:
neutron net-create test-nw
neutron subnet-create test-nw 192.168.0.0/24 --enable-dhcp
4. L2GW Node configuration
L2GW/LACP environment configuration
L2GW GATEWAY/CONNECTION creation
5. Wait until creation is completed
Poll every 3 seconds, up to 100 times, until the VXLAN I/F is created in the OVS of the L2GW Node
6. Communication confirmation (ping)
7. Go back to step 1
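
The wait in step 5 can be sketched as a small polling helper. This is a minimal sketch and not the script used for this report: the function names poll_until and vxlan_ready are hypothetical, and the host 172.16.1.50 and the interface name vx1 are taken from the detailed steps in the comments of this ticket.

```shell
#!/bin/sh
# poll_until MAX_TRIES INTERVAL_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it exits 0 or MAX_TRIES attempts have failed.
# On success, prints the number of attempts used and returns 0.
# On timeout, prints nothing and returns 1.
poll_until() {
    max_tries=$1
    interval=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_tries" ]; do
        if "$@" >/dev/null 2>&1; then
            echo "$attempt"
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$interval"
    done
    return 1
}

# Hypothetical readiness check: succeeds once a port named vx1 shows up
# in the OVS of the L2GW Node (host and port name as used in this ticket).
vxlan_ready() {
    ssh root@172.16.1.50 'ovs-vsctl show' | grep -q 'vx1'
}

# Usage matching step 5 (3 seconds x 100 times):
#   poll_until 100 3 vxlan_ready || echo "VXLAN I/F not created in time"
```

The number of attempts printed on success, multiplied by the interval, gives a rough creation time per iteration.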

Please check the log files attached.



 Comments   
Comment by Ran Xiao [ 07/Jul/17 ]

Attachment karaf.log.xz has been added with description: karaflog_0

Comment by Ran Xiao [ 07/Jul/17 ]

Attachment karaf.log.1.xz has been added with description: karaflog_1

Comment by Ran Xiao [ 07/Jul/17 ]

Attachment karaf.log.2.xz has been added with description: karaflog_2

Comment by Ran Xiao [ 07/Jul/17 ]

Attachment karaf.log.3.xz has been added with description: karaflog_3

Comment by Ran Xiao [ 02/Aug/17 ]

The cause of this bug is that information about deleted L2GW Nodes is still left in ODL.

We found that, once this issue occurs, the creation time can be reduced by deleting that data from ODL.
The data left in ODL, and the effect of deleting each on the creation time, are listed below:
1. MD-SAL network-topology(CONFIGURATION)
Deleting this data will not reduce the creation time
2. MD-SAL itm(transport-zones)
Deleting this data will reduce the creation time
3. Memory cache(L2GWCONN, HA)
This data cannot be deleted externally.
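
The deletions in items 1 and 2 could be issued over RESTCONF roughly as follows. This is a sketch under assumptions that are not stated in this ticket: RESTCONF listening on port 8181 with admin/admin credentials, the config datastore paths itm:transport-zones and network-topology:network-topology/topology/hwvtep:1, and the OVSDB manager address 172.16.1.30 also being the RESTCONF host. The helper names odl_config_url and odl_config_delete are hypothetical.

```shell
#!/bin/sh
# Assumed defaults (not from the report); override via the environment.
ODL_HOST=${ODL_HOST:-172.16.1.30}
ODL_PORT=${ODL_PORT:-8181}
ODL_AUTH=${ODL_AUTH:-admin:admin}

# odl_config_url PATH
# Prints the config-datastore RESTCONF URL for PATH.
odl_config_url() {
    printf 'http://%s:%s/restconf/config/%s\n' "$ODL_HOST" "$ODL_PORT" "$1"
}

# odl_config_delete PATH
# Sends an HTTP DELETE for PATH in the config datastore.
odl_config_delete() {
    curl -s -u "$ODL_AUTH" -X DELETE "$(odl_config_url "$1")"
}

# Usage (assumed paths):
#   odl_config_delete 'itm:transport-zones'
#   odl_config_delete 'network-topology:network-topology/topology/hwvtep:1'
```

Deleting a whole container this way removes every entry under it, so it is only suitable for a disposable test setup like the one described here.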

Comment by suneel verma [ 02/Aug/17 ]

Hi Ran,
Can you give the exact steps that were followed,
such as the REST calls that were used and the southbound config that was done?
Thanks,
Suneelu

Comment by Ran Xiao [ 04/Aug/17 ]

Hi Suneelu

This issue also occurred in a non-HA Cluster environment (single L2GW Node environment).

The following are the detailed commands we used to collect data and the REST API used for deletion.

The contents of the attached data file are listed below:

  • Attached data file list
    100.* Data after L2GW Node registration / deletion was repeated about 100 times
    del1.* Data after deleting "MD-SAL network-topology (CONFIGURATION)" at about 107 times
    150.* Data after L2GW Node registration / deletion was repeated about 150 times
    del2.* Data after deleting "MD-SAL network-topology (CONFIGURATION)" at about 151 times

The detailed reproduction steps are below:

  • Detailed reproduction steps
    1. L2GW Node Initialization
    Delete L2GW GATEWAY / CONNECTION
    neutron l2-gateway-connection-delete `neutron l2-gateway-connection-list --segmentation_id=2222 -c id -f value`
    neutron l2-gateway-delete gw1

Stop OVS processes and initialize:
pkill ovsdb-server
pkill ovs-vswitchd
ps -ef | grep ovs-vtep | grep -v grep | awk '{print "kill "$2}' | sh
ps -ef | grep ovs | grep -v grep
ps -ef | grep -v '\[' | grep -v grep
rm -f /etc/openvswitch/ovs.db /etc/openvswitch/vtep.db
ovsdb-tool create /etc/openvswitch/ovs.db /usr/share/openvswitch/vswitch.ovsschema
ovsdb-tool create /etc/openvswitch/vtep.db /usr/share/openvswitch/vtep.ovsschema
ovs-dpctl del-flows system@ovs-system
rm -f /var/log/openvswitch/*

2. Network / subnet deletion
neutron net-delete test-nw

3. Network / subnet creation
neutron net-create test-nw
neutron subnet-create test-nw 192.168.0.0/24 --enable-dhcp

4. L2GW Node setup
Create L2GW / LACP environment:
ulimit -n 65535
vtep=10.0.0.50
BRIDGE=ocata-l2gw1
ovsdb-server --pidfile --detach --log-file \
--remote=punix:/var/run/openvswitch/db.sock \
--remote=db:hardware_vtep,Global,managers \
--remote=ptcp:6632 /etc/openvswitch/ovs.db /etc/openvswitch/vtep.db
ovs-vswitchd --log-file --detach --pidfile \
unix:/var/run/openvswitch/db.sock
sleep 2
ovs-vsctl add-br $BRIDGE
ovs-vsctl add-port $BRIDGE eth2
vtep-ctl add-ps $BRIDGE
vtep-ctl set Physical_Switch $BRIDGE tunnel_ips=$vtep
sleep 2
/usr/share/openvswitch/scripts/ovs-vtep --log-file=/var/log/openvswitch/ovs-vtep.log \
--pidfile=/var/run/openvswitch/ovs-vtep.pid \
--detach $BRIDGE
vtep-ctl set-manager tcp:172.16.1.30:6640

Create L2GW GATEWAY / CONNECTION:
neutron l2-gateway-create gw1 --device name=ocata-l2gw1,interface_names=eth2
neutron l2-gateway-connection-create gw1 test-nw --default-segmentation-id 2222

5. Wait for creation completion
ssh root@172.16.1.50 'ovs-vsctl show'
ssh root@172.16.1.50 'ovsdb-client dump hardware_vtep'
*wait until the tunnel I/F (vx1) in the Logical Switch is created

6. Ping communication confirmation
ip netns exec qdhcp-`neutron net-show test-nw -c id -f value` ping -c 5 192.168.0.200
sleep 20
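
To see the slowdown across iterations, each step can be wrapped in a small timer that records elapsed seconds. This is a minimal sketch and not part of the original procedure: the name time_step and the output format are hypothetical.

```shell
#!/bin/sh
# time_step LABEL COMMAND [ARGS...]
# Runs COMMAND (its stdout is redirected to stderr) and prints one line
# on stdout: "LABEL <elapsed seconds> <exit status>".
# Returns the exit status of COMMAND.
time_step() {
    label=$1
    shift
    start=$(date +%s)
    status=0
    "$@" >&2 || status=$?
    end=$(date +%s)
    echo "$label $((end - start)) $status"
    return "$status"
}

# Usage: time the wait in step 5 for one iteration and append it to a log.
#   time_step "iter-$i" sh -c \
#     'until ssh root@172.16.1.50 "ovs-vsctl show" | grep -q vx1; do sleep 3; done' \
#     >> creation-times.log
```

With one line per iteration, the growth from about 6 seconds to more than 300 seconds reported in the description can be read directly from the log.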

Comment by Ran Xiao [ 04/Aug/17 ]

Attachment data.tar.gz has been added with description: Data file

Comment by Abhinav Gupta [ 08/Nov/19 ]

Old bug, will reopen if seen again
