[NETVIRT-5] Instances in Compute node not able to reach dhcp Created: 17/Feb/16  Updated: 08/Apr/19  Resolved: 01/Sep/16

Status: Resolved
Project: netvirt
Component/s: None
Affects Version/s: Beryllium
Fix Version/s: None

Type: Bug
Reporter: Venkatrangan Govindarajan Assignee: balakrishnan k
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Attachments: Text File 26-5 FLows.txt     Zip Archive ODL1.zip     Zip Archive ODL2.zip     Zip Archive ODL3.zip    
External issue ID: 5376

 Description   

Steps to recreate the problem:

1. Create Network

2. Create instances so that some are placed on the control node and some on the compute node

3. ping the dhcp IP address from the instances

local.conf: https://gist.github.com/gvranganvtn/ccbf76d20d18b6c670c7
(control)
(compute)

ovs-vsctl show in both nodes: https://gist.github.com/gvranganvtn/f4cccecc76bb27195b06

dump-flows in both nodes:
https://gist.github.com/gvranganvtn/0d8dd22ba19491b39c61

Test:
ubuntu@odl-os-control-node:~/devstack$ sudo ip netns exec qdhcp-6b6f54de-f04a-4dd6-8baf-7725a51df8fa ping 1.2.3.6
PING 1.2.3.6 (1.2.3.6) 56(84) bytes of data.
64 bytes from 1.2.3.6: icmp_seq=1 ttl=64 time=5.63 ms
^C
--- 1.2.3.6 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 5.631/5.631/5.631/0.000 ms
ubuntu@odl-os-control-node:~/devstack$ sudo ip netns exec qdhcp-6b6f54de-f04a-4dd6-8baf-7725a51df8fa ping 1.2.3.5
PING 1.2.3.5 (1.2.3.5) 56(84) bytes of data.
^C
--- 1.2.3.5 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

1.2.3.5 is the instance created on the compute node, and 1.2.3.6 is the one on the control node.
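
The two ping runs above can be compared mechanically. The following is a minimal sketch, assuming the summary line format shown in the output above; the function name packet_loss is hypothetical and not part of any tool mentioned in this ticket.

```python
import re

def packet_loss(ping_output):
    """Return (transmitted, received) parsed from a ping summary, or None."""
    m = re.search(r"(\d+) packets transmitted, (\d+) received", ping_output)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))

# Summary lines copied from the test output above.
control_vm = "1 packets transmitted, 1 received, 0% packet loss, time 0ms"
compute_vm = "1 packets transmitted, 0 received, 100% packet loss, time 0ms"

for ip, summary in (("1.2.3.6", control_vm), ("1.2.3.5", compute_vm)):
    sent, received = packet_loss(summary)
    print(ip, "reachable" if received > 0 else "unreachable")
```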



 Comments   
Comment by Venkatrangan Govindarajan [ 17/Feb/16 ]

Additional details of the scenario

-> ODL in cluster mode

-> Control Node up

-> Created network in OpenStack

-> Bring up compute node

-> create instances

-> IP addresses were assigned to the instances, but the instance on the compute node is not able to reach the dhcp IP

Comment by Sam Hague [ 17/May/16 ]

Looking at the flows, it seems the flow to get back from the compute node to the control node for the dhcp mac is not in table 110.

Can we enable TRACE level logging on the netvirt code (at least for Of13Provider)? That might show us why that flow is not programmed.

      • On the control node, port 1 is the dhcp port with mac=dl_src=fa:16:3e:d0:1f:ad

ubuntu@odl-os-control-node:~/devstack$ sudo ovs-ofctl dump-flows br-int -OOpenFlow13
OFPST_FLOW reply (OF1.3) (xid=0x2):
cookie=0x0, duration=2004.335s, table=0, n_packets=67, n_bytes=8000, in_port=1,dl_src=fa:16:3e:d0:1f:ad actions=set_field:0x426->tun_id,load:0x1->NXM_NX_REG0[],goto_table:20
cookie=0x0, duration=1182.896s, table=0, n_packets=44, n_bytes=3754,
cookie=0x0, duration=1179.898s, table=110, n_packets=13, n_bytes=1800,

      • Flow to go out tunnel to reach vm on compute node:

tun_id=0x426,dl_dst=fa:16:3e:9e:f6:70 actions=output:3

      • On the compute node, port 2 is the vm port with mac=dl_src=fa:16:3e:9e:f6:70

ubuntu@odl-os-compute-node:~/devstack$ sudo ovs-ofctl dump-flows br-int -OOpenFlow13
OFPST_FLOW reply (OF1.3) (xid=0x2):
cookie=0x0, duration=1217.089s, table=0, n_packets=58, n_bytes=4928, in_port=2,dl_src=fa:16:3e:9e:f6:70 actions=set_field:0x426->tun_id,load:0x1->NXM_NX_REG0[],goto_table:20

      • But on the compute node there is no flow to send traffic back to the dhcp mac over the tunnel. All we have are the two flows below. The first forwards traffic over the tunnel to the vm on the control node at 1.2.3.6. The second forwards traffic locally to the 1.2.3.5 vm.

cookie=0x0, duration=1219.169s, table=110, n_packets=3, n_bytes=294, tun_id=0x426,dl_dst=fa:16:3e:de:47:a5 actions=output:1
cookie=0x0, duration=1216.963s, table=110, n_packets=13, n_bytes=1800, tun_id=0x426,dl_dst=fa:16:3e:9e:f6:70 actions=output:2
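
The check described above (is there a table 110 unicast flow for a given destination mac?) can be scripted against saved dump-flows output. This is a minimal sketch, assuming the text format printed by ovs-ofctl in this ticket; the helper name l2_forward_macs is hypothetical, and the dhcp mac is the one seen on port 1 of the control node.

```python
import re

# Table 110 flows copied from the compute node dump above.
COMPUTE_FLOWS = """\
cookie=0x0, duration=1219.169s, table=110, n_packets=3, n_bytes=294, tun_id=0x426,dl_dst=fa:16:3e:de:47:a5 actions=output:1
cookie=0x0, duration=1216.963s, table=110, n_packets=13, n_bytes=1800, tun_id=0x426,dl_dst=fa:16:3e:9e:f6:70 actions=output:2
"""

# Unmasked dl_dst only; a trailing "/" means a masked (multicast) match.
FLOW_RE = re.compile(r"table=(\d+),.*?dl_dst=([0-9a-f:]{17})(?!/)")

def l2_forward_macs(dump, table=110):
    """Collect destination macs that have a unicast flow in the given table."""
    macs = set()
    for line in dump.splitlines():
        m = FLOW_RE.search(line)
        if m and int(m.group(1)) == table:
            macs.add(m.group(2))
    return macs

DHCP_MAC = "fa:16:3e:d0:1f:ad"  # dhcp port mac from the control node dump
macs = l2_forward_macs(COMPUTE_FLOWS)
print("dhcp mac has a table 110 flow:", DHCP_MAC in macs)
```

With the two flows above as input, the dhcp mac is not in the returned set, which matches the missing return path.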

Comment by Venkatrangan Govindarajan [ 20/May/16 ]

After disabling iptables on all compute nodes, all instances are now reachable when using ODL without clustering configuration.

Problems still seem to exist when ODL runs as a 3-node cluster: multiple exceptions are observed during port event handling and other processing.

Link: https://jenkins.opendaylight.org/releng/view/netvirt/job/netvirt-csit-3node-openstack-liberty-openstack-beryllium/30/artifact/odl1_karaf.log.xz

Some important pipeline flows are missing on all the OpenStack nodes. This needs to be investigated immediately to get a stable 3node job.

Comment by balakrishnan k [ 26/May/16 ]

ODL1 log

Comment by balakrishnan k [ 26/May/16 ]

Attachment ODL1.zip has been added with description: ODL1 log

Comment by balakrishnan k [ 26/May/16 ]

Hi Sam,
I have created a 3-node cluster setup locally and tested the scenario below.

1. Ran 3 ODL controllers as a cluster behind HAProxy, connected to 1 control and 2 compute nodes.
2. Created 2 networks.
3. net1 subnet range 10.0.0.0/24.
4. net2 subnet range 20.0.0.0/24.
5. Brought down ODL1.
6. Created 3 VM instances.

The first 2 VM instances (10.0.0.3, 10.0.0.4) get IP addresses, and their flows are created properly on the OpenStack nodes.
The 3rd VM (10.0.0.5) does not get an IP address in the VM console, and in the flow dump
only ARP flows are present for it; there are no flow entries in tables 40, 90 and 110.

Enabled debug logging for southboundhandler and porthandler.
In the log I can see some failures in the southbound event dispatcher.

OFPST_FLOW reply (OF1.3) (xid=0x2):
cookie=0x0, duration=300.163s, table=20, n_packets=1, n_bytes=42, priority=1024,arp,tun_id=0x444,arp_tpa=10.0.0.5,arp_op=1 actions=move:NXM_OF_ETH_SRC[]->NXM_OF_ETH_DST[],set_field:fa:16:3e:f5:cf:02->eth_src,load:0x2->NXM_OF_ARP_OP[],move:NXM_NX_ARP_SHA[]->NXM_NX_ARP_THA[],move:NXM_OF_ARP_SPA[]->NXM_OF_ARP_TPA[],load:0xfa163ef5cf02->NXM_NX_ARP_SHA[],load:0xa000005->NXM_OF_ARP_SPA[],IN_PORT
cookie=0x0, duration=1676.577s, table=20, n_packets=0, n_bytes=0, priority=1024,arp,tun_id=0x444,arp_tpa=10.0.0.2,arp_op=1 actions=move:NXM_OF_ETH_SRC[]->NXM_OF_ETH_DST[],set_field:fa:16:3e:ea:30:af->eth_src,load:0x2->NXM_OF_ARP_OP[],move:NXM_NX_ARP_SHA[]->NXM_NX_ARP_THA[],move:NXM_OF_ARP_SPA[]->NXM_OF_ARP_TPA[],load:0xfa163eea30af->NXM_NX_ARP_SHA[],load:0xa000002->NXM_OF_ARP_SPA[],IN_PORT
cookie=0x0, duration=1590.239s, table=20, n_packets=0, n_bytes=0, priority=1024,arp,tun_id=0x3fe,arp_tpa=20.0.0.2,arp_op=1 actions=move:NXM_OF_ETH_SRC[]->NXM_OF_ETH_DST[],set_field:fa:16:3e:9f:9d:7f->eth_src,load:0x2->NXM_OF_ARP_OP[],move:NXM_NX_ARP_SHA[]->NXM_NX_ARP_THA[],move:NXM_OF_ARP_SPA[]->NXM_OF_ARP_TPA[],load:0xfa163e9f9d7f->NXM_NX_ARP_SHA[],load:0x14000002->NXM_OF_ARP_SPA[],IN_PORT
cookie=0x0, duration=683.569s, table=20, n_packets=1, n_bytes=42, priority=1024,arp,tun_id=0x444,arp_tpa=10.0.0.4,arp_op=1 actions=move:NXM_OF_ETH_SRC[]->NXM_OF_ETH_DST[],set_field:fa:16:3e:43:9a:8a->eth_src,load:0x2->NXM_OF_ARP_OP[],move:NXM_NX_ARP_SHA[]->NXM_NX_ARP_THA[],move:NXM_OF_ARP_SPA[]->NXM_OF_ARP_TPA[],load:0xfa163e439a8a->NXM_NX_ARP_SHA[],load:0xa000004->NXM_OF_ARP_SPA[],IN_PORT
cookie=0x0, duration=1135.884s, table=20, n_packets=1, n_bytes=42, priority=1024,arp,tun_id=0x444,arp_tpa=10.0.0.3,arp_op=1 actions=move:NXM_OF_ETH_SRC[]->NXM_OF_ETH_DST[],set_field:fa:16:3e:58:a9:49->eth_src,load:0x2->NXM_OF_ARP_OP[],move:NXM_NX_ARP_SHA[]->NXM_NX_ARP_THA[],move:NXM_OF_ARP_SPA[]->NXM_OF_ARP_TPA[],load:0xfa163e58a949->NXM_NX_ARP_SHA[],load:0xa000003->NXM_OF_ARP_SPA[],IN_PORT
cookie=0x0, duration=2632.505s, table=20, n_packets=118, n_bytes=12483, priority=0 actions=goto_table:30
cookie=0x0, duration=2632.505s, table=30, n_packets=118, n_bytes=12483, priority=0 actions=goto_table:40
cookie=0x0, duration=2632.505s, table=40, n_packets=118, n_bytes=12483, priority=0 actions=goto_table:50
cookie=0x0, duration=2632.505s, table=50, n_packets=118, n_bytes=12483, priority=0 actions=goto_table:60
cookie=0x0, duration=2632.505s, table=60, n_packets=118, n_bytes=12483, priority=0 actions=goto_table:70
cookie=0x0, duration=2632.505s, table=70, n_packets=118, n_bytes=12483, priority=0 actions=goto_table:80
cookie=0x0, duration=2632.505s, table=80, n_packets=118, n_bytes=12483, priority=0 actions=goto_table:90
cookie=0x0, duration=2632.505s, table=90, n_packets=118, n_bytes=12483, priority=0 actions=goto_table:100
cookie=0x0, duration=2632.505s, table=100, n_packets=118, n_bytes=12483, priority=0 actions=goto_table:110
cookie=0x0, duration=1673.671s, table=110, n_packets=31, n_bytes=2882, tun_id=0x444,dl_dst=fa:16:3e:ea:30:af actions=output:1
cookie=0x0, duration=1565.622s, table=110, n_packets=0, n_bytes=0, tun_id=0x3fe,dl_dst=fa:16:3e:9f:9d:7f actions=output:5
cookie=0x0, duration=998.128s, table=110, n_packets=19, n_bytes=2311, tun_id=0x444,dl_dst=fa:16:3e:58:a9:49 actions=output:3
cookie=0x0, duration=643.071s, table=110, n_packets=16, n_bytes=2017, tun_id=0x444,dl_dst=fa:16:3e:43:9a:8a actions=output:4
cookie=0x0, duration=1672.376s, table=110, n_packets=25, n_bytes=2326, priority=16384,reg0=0x2,tun_id=0x444,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 actions=output:1
cookie=0x0, duration=1672.378s, table=110, n_packets=0, n_bytes=0, priority=16383,reg0=0x1,tun_id=0x444,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 actions=output:1,output:3,output:4
cookie=0x0, duration=1565.611s, table=110, n_packets=0, n_bytes=0, priority=16384,reg0=0x2,tun_id=0x3fe,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 actions=output:5
cookie=0x0, duration=1565.601s, table=110, n_packets=0, n_bytes=0, priority=16383,reg0=0x1,tun_id=0x3fe,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 actions=output:5,output:3,output:4
cookie=0x0, duration=1672.285s, table=110, n_packets=2, n_bytes=196, priority=8192,tun_id=0x444 actions=drop
cookie=0x0, duration=1564.369s, table=110, n_packets=0, n_bytes=0, priority=8192,tun_id=0x3fe actions=drop
cookie=0x0, duration=2632.505s, table=110, n_packets=25, n_bytes=2751, priority=0 actions=drop
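
The dump above can be cross-checked by listing every IP that has an ARP responder flow in table 20 but no unicast forwarding flow in table 110. This is a minimal sketch: the sample lines are abbreviated from the dump above (counters and all actions except set_field and output dropped), and the function name missing_forwarding is hypothetical.

```python
import re

# Abbreviated excerpt of the dump above: only the fields used by the check.
DUMP = """\
table=20, arp,tun_id=0x444,arp_tpa=10.0.0.5,arp_op=1 actions=set_field:fa:16:3e:f5:cf:02->eth_src
table=20, arp,tun_id=0x444,arp_tpa=10.0.0.2,arp_op=1 actions=set_field:fa:16:3e:ea:30:af->eth_src
table=20, arp,tun_id=0x3fe,arp_tpa=20.0.0.2,arp_op=1 actions=set_field:fa:16:3e:9f:9d:7f->eth_src
table=20, arp,tun_id=0x444,arp_tpa=10.0.0.4,arp_op=1 actions=set_field:fa:16:3e:43:9a:8a->eth_src
table=20, arp,tun_id=0x444,arp_tpa=10.0.0.3,arp_op=1 actions=set_field:fa:16:3e:58:a9:49->eth_src
table=110, tun_id=0x444,dl_dst=fa:16:3e:ea:30:af actions=output:1
table=110, tun_id=0x3fe,dl_dst=fa:16:3e:9f:9d:7f actions=output:5
table=110, tun_id=0x444,dl_dst=fa:16:3e:58:a9:49 actions=output:3
table=110, tun_id=0x444,dl_dst=fa:16:3e:43:9a:8a actions=output:4
table=110, priority=16384,reg0=0x2,tun_id=0x444,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 actions=output:1
"""

# "-?>" also accepts dumps in which the dash of "->" was lost in transit.
ARP_RE = re.compile(
    r"table=20,.*?arp_tpa=([\d.]+).*?set_field:([0-9a-f:]{17})-?>eth_src")
# Unmasked dl_dst only; masked multicast matches are skipped.
FWD_RE = re.compile(r"table=110,.*?dl_dst=([0-9a-f:]{17})(?!/)")

def missing_forwarding(dump):
    """Return IPs with an ARP responder flow but no table 110 unicast flow."""
    arp, fwd = {}, set()
    for line in dump.splitlines():
        m = ARP_RE.search(line)
        if m:
            arp[m.group(2)] = m.group(1)
        m = FWD_RE.search(line)
        if m:
            fwd.add(m.group(1))
    return sorted(ip for mac, ip in arp.items() if mac not in fwd)

print(missing_forwarding(DUMP))  # ['10.0.0.5']
```

For this excerpt the only IP reported is 10.0.0.5, the 3rd VM that did not get an address.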

Comment by balakrishnan k [ 26/May/16 ]

Attachment ODL2.zip has been added with description: ODL2 log

Comment by balakrishnan k [ 26/May/16 ]

Attachment ODL3.zip has been added with description: ODL3 log

Comment by balakrishnan k [ 26/May/16 ]

Attachment 26-5 FLows.txt has been added with description: dump flows for VM creation

Comment by Hideyuki Tai [ 27/Aug/16 ]

I think one of the causes of this issue was a performance problem in the OVSDB Southbound plugin.

The following patches have dramatically improved the performance of the OVSDB Southbound Plugin, so I think this issue now occurs less often.
https://git.opendaylight.org/gerrit/#/c/43580/
https://git.opendaylight.org/gerrit/#/c/44374/

Comment by Prashanth Jakkam [ 01/Sep/16 ]

We have tested with the latest distribution after the patches below were merged:

https://git.opendaylight.org/gerrit/#/c/43580/
https://git.opendaylight.org/gerrit/#/c/44374/

The issue was not observed during multiple runs. All created VM instances acquired IP addresses.

The following bugs have been raised for issues found in fail-over scenario testing.

https://bugs.opendaylight.org/show_bug.cgi?id=6596

https://bugs.opendaylight.org/show_bug.cgi?id=6601

Hence, this bug can be closed.
