[GENIUS-20] Cannot have same vtep both on vxlan TZ and vxlan-gpe TZ Created: 12/Sep/16  Updated: 27/Feb/18  Resolved: 27/Feb/18

Status: Resolved
Project: genius
Component/s: General
Affects Version/s: (unspecified)
Fix Version/s: None

Type: Bug
Reporter: Jaime Caamaño Ruiz Assignee: Hema Gopalakrishnan
Resolution: Won't Do Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 6697

 Description   

In the context of:

  • ODL (Boron)
  • netvirt enabled as neutron provider
  • SFC enabled
  • genius transport zones being used to configure the tunnel meshes both for netvirt and SFC
  • Yi Yang OVS patches applied (https://github.com/yyang13/ovs_nsh_patches)

consider the following transport zone configuration:

{
  "transport-zones": {
    "transport-zone": [
      {
        "zone-name": "dovs-tz",
        "tunnel-type": "odl-interface:tunnel-type-vxlan-gpe",
        "subnets": [
          {
            "prefix": "172.17.0.0/16",
            "vteps": [
              { "dpn-id": 263936438496426, "portname": "dovs-vtep-1", "ip-address": "172.17.0.2" },
              { "dpn-id": 5274128771036, "portname": "dovs-vtep-2", "ip-address": "172.17.0.3" }
            ],
            "vlan-id": 0,
            "gateway-ip": "0.0.0.0"
          }
        ]
      },
      {
        "zone-name": "49e000d3-2812-4c81-bc7a-3c113e98e350",
        "tunnel-type": "odl-interface:tunnel-type-vxlan",
        "subnets": [
          {
            "prefix": "10.0.0.0/24",
            "vteps": [
              { "dpn-id": 263936438496426, "portname": "tunnel_port", "ip-address": "172.17.0.2" },
              { "dpn-id": 5274128771036, "portname": "tunnel_port", "ip-address": "172.17.0.3" }
            ],
            "gateway-ip": "0.0.0.0",
            "vlan-id": 0
          }
        ]
      }
    ]
  }
}

Zone "49e000d3-2812-4c81-bc7a-3c113e98e350" is created by netvirt. Zone "dovs-tz" is created for SFC.
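
A configuration like the one above can be checked for vteps that appear under both tunnel types before it is applied. The following is a minimal Python sketch of such a check; the function name `find_shared_vteps` is hypothetical and not part of genius, and only the JSON field names are taken from the configuration above.

```python
from collections import defaultdict


def find_shared_vteps(config):
    """Map (dpn-id, ip-address) to the tunnel types it appears under,
    keeping only vteps that appear under more than one tunnel type."""
    seen = defaultdict(set)
    for zone in config["transport-zones"]["transport-zone"]:
        for subnet in zone.get("subnets", []):
            for vtep in subnet.get("vteps", []):
                seen[(vtep["dpn-id"], vtep["ip-address"])].add(zone["tunnel-type"])
    return {vtep: sorted(types) for vtep, types in seen.items() if len(types) > 1}


# Trimmed copy of the configuration above (only the fields the check reads).
config = {
    "transport-zones": {
        "transport-zone": [
            {
                "zone-name": "dovs-tz",
                "tunnel-type": "odl-interface:tunnel-type-vxlan-gpe",
                "subnets": [{"vteps": [
                    {"dpn-id": 263936438496426, "ip-address": "172.17.0.2"},
                    {"dpn-id": 5274128771036, "ip-address": "172.17.0.3"},
                ]}],
            },
            {
                "zone-name": "49e000d3-2812-4c81-bc7a-3c113e98e350",
                "tunnel-type": "odl-interface:tunnel-type-vxlan",
                "subnets": [{"vteps": [
                    {"dpn-id": 263936438496426, "ip-address": "172.17.0.2"},
                    {"dpn-id": 5274128771036, "ip-address": "172.17.0.3"},
                ]}],
            },
        ]
    }
}

shared = find_shared_vteps(config)
for (dpn_id, ip), types in shared.items():
    print(dpn_id, ip, types)
```

Both vteps are reported here, since each one is listed under the vxlan zone and the vxlan-gpe zone.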

We look into the resulting OVS configuration of DPN 263936438496426:


> ovs-vsctl show

fbeb6060-1a19-4a35-a37f-3a6913dee417
    Manager "tcp:10.0.2.2:6640"
        is_connected: true
    Bridge br-int
        Controller "tcp:10.0.2.2:6653"
            is_connected: true
        fail_mode: secure
        Port "tapbfc5baaa-a1"
            Interface "tapbfc5baaa-a1"
        Port "tunb32255ba66c"
            Interface "tunb32255ba66c"
                type: vxlan
                options: {key=flow, local_ip="172.17.0.2", remote_ip="172.17.0.3"}
        Port "tun445ad3536e6"
            Interface "tun445ad3536e6"
                type: vxlan
                options: {exts=gpe, key=flow, local_ip="172.17.0.2", "nshc1"=flow, "nshc2"=flow, "nshc3"=flow, "nshc4"=flow, nsi=flow, nsp=flow, remote_ip="172.17.0.3"}
        Port br-int
            Interface br-int
                type: internal
    ovs_version: "2.5.90"


As you can see, we have two vxlan ports with identical source IP, destination IP and key. The main difference is that one of them is plain vxlan while the other is vxlan-gpe.

However, when we check the state at the OpenFlow level:


> ovs-ofctl show -O OpenFlow13 br-int

OFPT_FEATURES_REPLY (OF1.3) (xid=0x2): dpid:0000ac2034e48141
n_tables:254, n_buffers:256
capabilities: FLOW_STATS TABLE_STATS PORT_STATS GROUP_STATS QUEUE_STATS
OFPST_PORT_DESC reply (OF1.3) (xid=0x3):
 1(tapbfc5baaa-a1): addr:7e:1e:2a:b4:8c:09
     config: 0
     state: 0
     current: 10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 2(tun445ad3536e6): addr:a2:42:54:79:62:3e
     config: 0
     state: 0
     speed: 0 Mbps now, 0 Mbps max
 LOCAL(br-int): addr:ac:20:34:e4:81:41
     config: PORT_DOWN
     state: LINK_DOWN
     speed: 0 Mbps now, 0 Mbps max
OFPT_GET_CONFIG_REPLY (OF1.3) (xid=0x5): frags=normal miss_send_len=0

One of the tunnel ports, "tunb32255ba66c", is missing its OpenFlow definition. This causes various problems down the line.
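
The comparison between the two outputs can be automated. The following is a minimal Python sketch, assuming the text of `ovs-vsctl show` and `ovs-ofctl show` has already been captured; the helper name and the regular expressions are illustrative assumptions based only on the output format seen above.

```python
import re

# Matches lines such as:  Port "tunb32255ba66c"   or   Port br-int
PORT_RE = re.compile(r'^[ \t]*Port[ \t]+"?([^"\s]+)"?[ \t]*$', re.MULTILINE)
# Matches lines such as:  2(tun445ad3536e6): addr:...   or   LOCAL(br-int): addr:...
OF_PORT_RE = re.compile(r'^[ \t]*(?:\d+|LOCAL)\(([^)]+)\):', re.MULTILINE)


def ports_missing_from_openflow(vsctl_show, ofctl_show):
    """Return the port names configured in OVSDB but absent at OpenFlow level."""
    configured = set(PORT_RE.findall(vsctl_show))
    known_to_openflow = set(OF_PORT_RE.findall(ofctl_show))
    return sorted(configured - known_to_openflow)


# Abbreviated copies of the two outputs from this report.
VSCTL_SHOW = '''\
Bridge br-int
    Port "tapbfc5baaa-a1"
        Interface "tapbfc5baaa-a1"
    Port "tunb32255ba66c"
        Interface "tunb32255ba66c"
    Port "tun445ad3536e6"
        Interface "tun445ad3536e6"
    Port br-int
        Interface br-int
'''

OFCTL_SHOW = '''\
OFPST_PORT_DESC reply (OF1.3) (xid=0x3):
 1(tapbfc5baaa-a1): addr:7e:1e:2a:b4:8c:09
 2(tun445ad3536e6): addr:a2:42:54:79:62:3e
 LOCAL(br-int): addr:ac:20:34:e4:81:41
'''

missing = ports_missing_from_openflow(VSCTL_SHOW, OFCTL_SHOW)
print(missing)
```

For the outputs in this report, the only port reported as missing is `tunb32255ba66c`.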

It looks like OVS forms a tuple from some of the tunnel port configuration parameters and uses it to distinguish one tunnel port from another. In our case the only difference, gpe vs. non-gpe, does not seem to be part of this tuple, which results in the problem observed.
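
To illustrate the hypothesis, here is a minimal Python sketch that groups the two tunnel ports shown above by a matching tuple. The exact set of fields OVS uses is not confirmed in this report; the tuple of type, local IP, remote IP and key used here is an assumption made for illustration only.

```python
def match_tuple(port):
    """Assumed identity of a tunnel port; note that "exts" is deliberately
    left out, mirroring the behaviour suspected in this report."""
    options = port["options"]
    return (port["type"], options.get("local_ip"),
            options.get("remote_ip"), options.get("key"))


def find_collisions(ports):
    """Return {tuple: [port names]} for tuples shared by more than one port."""
    by_tuple = {}
    for port in ports:
        by_tuple.setdefault(match_tuple(port), []).append(port["name"])
    return {key: names for key, names in by_tuple.items() if len(names) > 1}


# The two tunnel ports from the ovs-vsctl output in this report.
ports = [
    {"name": "tunb32255ba66c", "type": "vxlan",
     "options": {"key": "flow", "local_ip": "172.17.0.2",
                 "remote_ip": "172.17.0.3"}},
    {"name": "tun445ad3536e6", "type": "vxlan",
     "options": {"exts": "gpe", "key": "flow", "local_ip": "172.17.0.2",
                 "remote_ip": "172.17.0.3", "nshc1": "flow", "nshc2": "flow",
                 "nshc3": "flow", "nshc4": "flow", "nsi": "flow", "nsp": "flow"}},
]

collisions = find_collisions(ports)
for key, names in collisions.items():
    print(key, names)
```

Under that assumption both ports map to the same tuple, which would be consistent with only one of them receiving an OpenFlow port.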

There are some alternatives:

  • The TZ definition could allow a port number to be specified. OVS would presumably be able to tell the two tunnel port definitions apart if they had different port numbers.
  • Genius could add the gpe extension to an existing vxlan tunnel port, instead of creating a new tunnel port, if a vxlan port with the same characteristics already exists. Conversely, it would not need to create a vxlan port if there is already a vxlan-gpe one with the same characteristics. For this to be considered, a vxlan-gpe endpoint must be able to exchange standard vxlan traffic with a vxlan endpoint. My own tests indicate that this might indeed be the case.
  • This may be an OVS bug, in which case it should be solved there.
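
The second alternative can be sketched as a small decision function. This is a hypothetical Python illustration, not genius code: the name `plan_tunnel` and the action labels are invented here, and the logic relies on the unconfirmed observation that a vxlan-gpe endpoint can also carry standard vxlan traffic.

```python
GPE_OPTIONS = {
    "exts": "gpe",
    "nshc1": "flow", "nshc2": "flow", "nshc3": "flow", "nshc4": "flow",
    "nsi": "flow", "nsp": "flow",
}


def plan_tunnel(existing_ports, local_ip, remote_ip, want_gpe):
    """Return (action, port_name, options) for a requested tunnel.

    action is "create", "add-gpe" or "reuse" (labels chosen for this sketch).
    """
    base = {"key": "flow", "local_ip": local_ip, "remote_ip": remote_ip}
    for port in existing_ports:
        options = port["options"]
        same_endpoints = all(options.get(k) == v for k, v in base.items())
        if port["type"] == "vxlan" and same_endpoints:
            if want_gpe and options.get("exts") != "gpe":
                # Upgrade the existing plain vxlan port instead of adding one.
                return ("add-gpe", port["name"], dict(GPE_OPTIONS))
            # An equivalent (or gpe-capable) port already exists.
            return ("reuse", port["name"], {})
    options = dict(base)
    if want_gpe:
        options.update(GPE_OPTIONS)
    return ("create", None, options)


existing = [{"name": "tunb32255ba66c", "type": "vxlan",
             "options": {"key": "flow", "local_ip": "172.17.0.2",
                         "remote_ip": "172.17.0.3"}}]

plan = plan_tunnel(existing, "172.17.0.2", "172.17.0.3", want_gpe=True)
print(plan[0], plan[1])
```

With the plain vxlan port from this report already present, a request for a vxlan-gpe tunnel between the same endpoints yields an "add-gpe" action on that port rather than a second port.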

Also, it looks like genius is using the port-name of each vtep inside the transport zone definition (e.g. dovs-vtep-1 in the example above) to generate the tunnel port name (e.g. tunb32255ba66c). That is probably not the best approach, as the vtep port name is not a meaningful parameter at the OVS level. Genius also uses the tunnel type for the tunnel port name, and vxlan and vxlan-gpe should not be considered different tunnel types for this purpose either. Basically, the data used to generate the port name should be the same data that OVS uses to tell tunnel ports apart.



 Comments   
Comment by Jaime Caamaño Ruiz [ 21/Nov/16 ]

I ran a test making sure to use the same portnames for both the vxlan transport zone created by netvirt and the extra vxlan-gpe transport zone created manually for SFC:

{
  "transport-zones": {
    "transport-zone": [
      {
        "zone-name": "dovs-tz",
        "tunnel-type": "odl-interface:tunnel-type-vxlan-gpe",
        "subnets": [
          {
            "prefix": "172.17.0.0/16",
            "vteps": [
              { "dpn-id": 263936438496426, "portname": "tunnel_port", "ip-address": "172.17.0.2" },
              { "dpn-id": 5274128771036, "portname": "tunnel_port", "ip-address": "172.17.0.3" }
            ],
            "vlan-id": 0,
            "gateway-ip": "0.0.0.0"
          }
        ]
      },
      {
        "zone-name": "49e000d3-2812-4c81-bc7a-3c113e98e350",
        "tunnel-type": "odl-interface:tunnel-type-vxlan",
        "subnets": [
          {
            "prefix": "10.0.0.0/24",
            "vteps": [
              { "dpn-id": 263936438496426, "portname": "tunnel_port", "ip-address": "172.17.0.2" },
              { "dpn-id": 5274128771036, "portname": "tunnel_port", "ip-address": "172.17.0.3" }
            ],
            "gateway-ip": "0.0.0.0",
            "vlan-id": 0
          }
        ]
      }
    ]
  }
}

This example is just for two compute nodes and a tunnel between them. The result was a normal vxlan tunnel endpoint on one side and a vxlan-gpe endpoint on the other:

> docker exec dovs-node-2 ovs-vsctl show
c5be3492-cb81-4bc3-b42e-6199ec956f9c
    Manager "tcp:10.0.2.2:6640"
        is_connected: true
    Bridge br-int
        Controller "tcp:10.0.2.2:6653"
            is_connected: true
        fail_mode: secure
        Port "tunb1ca1a033ef"
            Interface "tunb1ca1a033ef"
                type: vxlan
                options: {key=flow, local_ip="172.17.0.3", remote_ip="172.17.0.2"}
        Port br-int
            Interface br-int
                type: internal
        Port "tapc4860dbb-5b"
            Interface "tapc4860dbb-5b"
    ovs_version: "2.5.90"

> docker exec dovs-node-1 ovs-vsctl show
9ae7d6f9-2733-4715-a9b2-1d64d76dc8bf
    Manager "tcp:10.0.2.2:6640"
        is_connected: true
    Bridge br-int
        Controller "tcp:10.0.2.2:6653"
            is_connected: true
        fail_mode: secure
        Port br-int
            Interface br-int
                type: internal
        Port "tun0d32793f3c1"
            Interface "tun0d32793f3c1"
                type: vxlan
                options: {exts=gpe, key=flow, local_ip="172.17.0.2", "nshc1"=flow, "nshc2"=flow, "nshc3"=flow, "nshc4"=flow, nsi=flow, nsp=flow, remote_ip="172.17.0.3"}
        Port "tap5de268d2-bc"
            Interface "tap5de268d2-bc"
    ovs_version: "2.5.90"

This gave me the perfect opportunity to test whether a vxlan endpoint could talk to a vxlan-gpe endpoint (in both directions). And, indeed, it does seem to work:

> sudo ip netns exec dovs-node-1-guest-1 ping -c 1 10.0.0.2
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=3.40 ms

> sudo ip netns exec dovs-node-2-guest-1 ping -c 1 10.0.0.1
PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=5.38 ms

After that, I could add the gpe extensions to the existing vxlan interface:

> docker exec dovs-node-2 ovs-vsctl set interface tunb1ca1a033ef options:exts=gpe options:nshc1=flow options:nshc2=flow options:nshc3=flow options:nshc4=flow options:nsi=flow options:nsp=flow

> docker exec dovs-node-2 ovs-vsctl show
2016-09-02T15:30:34Z|00001|vsctl|WARN|/proc/0/cmdline: open failed (No such file or directory)
c5be3492-cb81-4bc3-b42e-6199ec956f9c
    Manager "tcp:10.0.2.2:6640"
        is_connected: true
    Bridge br-int
        Controller "tcp:10.0.2.2:6653"
            is_connected: true
        fail_mode: secure
        Port "tunb1ca1a033ef"
            Interface "tunb1ca1a033ef"
                type: vxlan
                options: {exts=gpe, key=flow, local_ip="172.17.0.3", "nshc1"=flow, "nshc2"=flow, "nshc3"=flow, "nshc4"=flow, nsi=flow, nsp=flow, remote_ip="172.17.0.2"}
        Port br-int
            Interface br-int
                type: internal
        Port "tapc4860dbb-5b"
            Interface "tapc4860dbb-5b"
    ovs_version: "2.5.90"

And we still have connectivity:
> sudo ip netns exec dovs-node-1-guest-1 ping -c 1 10.0.0.2
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=2.08 ms

> sudo ip netns exec dovs-node-2-guest-1 ping -c 1 10.0.0.1
PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.153 ms
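
The `ovs-vsctl set interface` command used above to add the gpe extensions can be assembled programmatically. This is a minimal Python sketch; the helper name `add_gpe_command` is hypothetical, and the option list is copied from the command in this comment rather than from any OVS documentation.

```python
import shlex

# Options taken verbatim from the manual command in this comment.
GPE_OPTIONS = {
    "exts": "gpe",
    "nshc1": "flow", "nshc2": "flow", "nshc3": "flow", "nshc4": "flow",
    "nsi": "flow", "nsp": "flow",
}


def add_gpe_command(interface):
    """Build the argument vector that adds the gpe options to an interface."""
    argv = ["ovs-vsctl", "set", "interface", interface]
    argv += [f"options:{key}={value}" for key, value in GPE_OPTIONS.items()]
    return argv


command = add_gpe_command("tunb1ca1a033ef")
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run`, prefixed with `docker exec dovs-node-2` in a setup like the one used for this test.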

So I think this may be workable with a few genius patches:

  • Currently genius generates the OVS tunnel interface name from the source IP, destination IP, transport zone vtep port name and tunnel type. It should only use the parameters that OVS also uses to distinguish interfaces, that is, source IP, destination IP and tunnel type (with vxlan and vxlan-gpe treated as the same tunnel type: vxlan).
  • Genius should make sure that it adds the gpe extension if there is already a non-gpe vxlan interface with otherwise identical characteristics.
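
The first patch can be sketched as follows. This Python example is illustrative only: the hashing scheme (SHA-1 truncated to 11 hex characters after a "tun" prefix) is an assumption that merely mirrors the shape of the names seen in the outputs above and is not genius's actual algorithm; the function names are hypothetical.

```python
import hashlib

VXLAN = "odl-interface:tunnel-type-vxlan"
VXLAN_GPE = "odl-interface:tunnel-type-vxlan-gpe"


def normalize_tunnel_type(tunnel_type):
    """Treat vxlan and vxlan-gpe as the same tunnel type for naming."""
    return "vxlan" if tunnel_type in (VXLAN, VXLAN_GPE) else tunnel_type


def tunnel_port_name(local_ip, remote_ip, tunnel_type):
    """Derive a port name from source IP, destination IP and normalized
    tunnel type only; the vtep portname is deliberately not an input."""
    seed = ":".join((local_ip, remote_ip, normalize_tunnel_type(tunnel_type)))
    digest = hashlib.sha1(seed.encode("utf-8")).hexdigest()
    return "tun" + digest[:11]


name_vxlan = tunnel_port_name("172.17.0.2", "172.17.0.3", VXLAN)
name_gpe = tunnel_port_name("172.17.0.2", "172.17.0.3", VXLAN_GPE)
print(name_vxlan == name_gpe)
```

Because both transport zones now resolve to the same name for a given pair of endpoints, genius would see a single tunnel port and could decide whether to add the gpe extension to it.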
Comment by Jaime Caamaño Ruiz [ 27/Feb/18 ]

I am closing this due to inactivity. Reopen if necessary. This has not been a problem for a long time, since specific meshes for SFC are no longer used; instead, gpe is enabled deployment-wide for all meshes.
