[NETVIRT-966] In conntrack mode, SNAT communication fails if router gateway is set before virtual NW creation or VM creation in a fresh environment. Created: 24/Oct/17  Updated: 05/Apr/18  Resolved: 05/Apr/18

Status: Resolved
Project: netvirt
Component/s: General
Affects Version/s: Carbon
Fix Version/s: Carbon

Type: Bug Priority: Medium
Reporter: Ran Xiao Assignee: Bertrand Low
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Environment Details:
OpenStack Version: stable/ocata
ODL Version: Carbon-SR2
Node:
OpenStack Controller Node: 1 node
OpenStack Compute Node: 1 node
ODL Node: 3 nodes(cluster)



 Description   

We found the following issue when running tests with ODL Carbon SR2.
In conntrack mode, SNAT communication fails if the router gateway is set before virtual NW creation or VM creation in a fresh environment.
The following WARN messages appear in the karaf log when configuring the router gateway and interface.
   2017-10-24 10:09:47,079 | WARN | nPool-1-worker-0 | ExternalNetworksChangeListener | 369
   org.opendaylight.netvirt.natservice-impl - 0.4.2.Carbon | associateExternalNetworkWithVPN :
   primary napt Switch not found for router Uuid [_value=e81d29f5-19cb-412b-aa7e-4c9377e00ee4] on dpn: 0
   2017-10-24 10:11:10,857 | WARN | eChangeHandler-0 | AbstractSnatService | 369
   org.opendaylight.netvirt.natservice-impl - 0.4.2.Carbon | getTunnelInterfaceName :
   RPC Call to getTunnelInterfaceId returned with Errors []
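
To pick warnings like these out of a larger karaf log, a small filter along the following lines might help. It is only a sketch: it assumes each entry sits on one line in the pipe-separated layout seen above (timestamp, level, thread, logger, bundle, message), and the names `LogEntry`, `parse_line` and `nat_warnings` are made up for illustration. The third sample line is invented to show the filtering.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class LogEntry(NamedTuple):
    timestamp: str
    level: str
    thread: str
    logger: str
    bundle: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Split one pipe-separated log line into six fields, or return None."""
    # At most 5 splits, so any "|" inside the message text is preserved.
    parts = [part.strip() for part in line.split("|", 5)]
    if len(parts) != 6:
        return None
    return LogEntry(*parts)


def nat_warnings(lines: Iterable[str]) -> Iterator[LogEntry]:
    """Yield WARN entries whose bundle field mentions natservice."""
    for line in lines:
        entry = parse_line(line)
        if entry and entry.level == "WARN" and "natservice" in entry.bundle:
            yield entry


SAMPLE = [
    # The two entries from this report, joined onto one line each.
    "2017-10-24 10:09:47,079 | WARN | nPool-1-worker-0 | ExternalNetworksChangeListener | 369 "
    "org.opendaylight.netvirt.natservice-impl - 0.4.2.Carbon | associateExternalNetworkWithVPN : "
    "primary napt Switch not found for router Uuid [_value=e81d29f5-19cb-412b-aa7e-4c9377e00ee4] on dpn: 0",
    "2017-10-24 10:11:10,857 | WARN | eChangeHandler-0 | AbstractSnatService | 369 "
    "org.opendaylight.netvirt.natservice-impl - 0.4.2.Carbon | getTunnelInterfaceName : "
    "RPC Call to getTunnelInterfaceId returned with Errors []",
    # Invented entry, not from the report; it should be filtered out.
    "2017-10-24 10:12:00,000 | INFO | example-thread | ExampleClass | 1 example.bundle - 0.0.0 | made-up entry",
]

if __name__ == "__main__":
    for entry in nat_warnings(SAMPLE):
        print(entry.timestamp, entry.logger, entry.message)
```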

○Possible cause
 Based on both the OVS and ODL logs, we think the cause might be as follows.
  On the controller node, the patch ports between br-int and br-ex appear to be created when the first virtual NW is created.
  On the compute node, the patch ports are created when the first VM is created on that node.
  Processing does not complete correctly when the router gateway is set before the patch ports exist on each node.
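
One way to test this hypothesis would be to list the patch interfaces on each node just before the router gateway is set. The sketch below assumes `ovs-vsctl` is on the PATH and relies on its `find` command with the `--bare` and `--columns` options. The function names and the injectable `run` parameter are illustrative only, and the code has not been run against the environment in this report.

```python
import subprocess
from typing import Callable, List

# Ask OVS for the names of all interfaces whose type is "patch".
FIND_PATCH_CMD = ["ovs-vsctl", "--bare", "--columns=name", "find", "Interface", "type=patch"]


def _run_ovs_vsctl(cmd: List[str]) -> str:
    """Run the command and return its standard output as text."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def parse_names(output: str) -> List[str]:
    """Keep the non-empty lines; blank lines separate records in --bare output."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def patch_interfaces(run: Callable[[List[str]], str] = _run_ovs_vsctl) -> List[str]:
    """Return the names of the patch interfaces known to the local OVS."""
    return parse_names(run(FIND_PATCH_CMD))


if __name__ == "__main__":
    names = patch_interfaces()
    if names:
        print("patch interfaces present:", ", ".join(names))
    else:
        print("no patch interfaces yet")
```

An empty result on a node before the gateway is configured would be consistent with the suspected cause, though it would not prove it.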

○Reproduction steps:
1. neutron-server and related agent restart (controller node)
2. OVS initialization (controller node and compute node)
3. ODL initialization (all 3 odl nodes)
4. ODL 3 node cluster start
5. OVS configuration (controller node and compute node)
6. External NW creation, External NW subnet creation
7. Virtual NW creation
8. Router creation
9. Router GW configuration
10. Router IF configuration
11. VM creation
    Confirm SNAT communication
12. All resources deletion
13. OVS stop (controller node and compute node)
14. ODL stop
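
The ordering condition in the title can be written as a small predicate over the creation steps above. This is a sketch only: the step labels are shorthand invented here, not ODL or Neutron identifiers.

```python
from typing import Sequence

GATEWAY = "router_gw"
# Per the title, both of these must already exist when the gateway is set.
PREREQUISITES = ("virtual_nw", "vm")


def gateway_set_too_early(steps: Sequence[str]) -> bool:
    """Return True if the router gateway is set before a prerequisite step."""
    if GATEWAY not in steps:
        return False
    gateway_index = steps.index(GATEWAY)
    for prerequisite in PREREQUISITES:
        # A prerequisite that is missing or comes later counts as "too early".
        if prerequisite not in steps or steps.index(prerequisite) > gateway_index:
            return True
    return False


# Steps 6 to 11 of the reproduction procedure, in the reported order.
REPORTED_ORDER = ["external_nw", "virtual_nw", "router", "router_gw", "router_if", "vm"]

if __name__ == "__main__":
    print(gateway_set_too_early(REPORTED_ORDER))
```

With the reported order the predicate is true, because the VM is created after the router gateway is configured.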

○log files
We will upload them if necessary.



 Comments   
Comment by Bertrand Low [ 26/Nov/17 ]

Hi there,

the bug reproduction steps appear to contradict the bug title and description. Can you confirm if step 7 (Virtual NW creation) is in the correct order, or if it should come after step 9 or even 10 (as per title "...SNAT communication fails if router gateway is set before creating virtual NW...")?

thanks,

Bertrand

Comment by Ran Xiao [ 27/Nov/17 ]

Hi Bertrand,
The bug reproduction steps are correct. The issue is that SNAT communication only works when the router gateway is set after both virtual NW and VM creation.
The steps we shared above are one of the resource creation orders in which SNAT does not work.
Sorry for the confusion.
I've modified the bug title and description.

 

Comment by Bertrand Low [ 29/Nov/17 ]

Hi Ran,

thanks for the clarification. How do you verify the SNAT communication?

Comment by yogalakshmi swetha [ 29/Nov/17 ]

Hi Bert,

To verify SNAT communication, please follow the steps below:

1. ODL initialization (all 3 odl nodes)
2. ODL 3 node cluster start
3. neutron-server and related agent restart (controller node)
4. OVS configuration (controller node and compute node)
5. Create Internal Network and Subnet (net1)
6. Create 2 VMs for this network (vm1).
7. Create external network and subnet (externl_net1)
8. Create a router, set its gateway to the external network, and add an interface to the internal network (net1)
9. Check communication from vm1 to the external host by pinging the external host from vm1. In conntrack mode, SNAT can be verified using ping.
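
Step 9 could be scripted from inside vm1 roughly as follows. This is a sketch, assuming a `ping` that accepts `-c` (count) and `-W` (reply timeout in seconds) and exits with status 0 when at least one reply arrives. The helper name, the injectable `run` parameter and the example address (taken from the documentation range 192.0.2.0/24) are placeholders, not part of the setup described here.

```python
import subprocess
from typing import Callable, List


def _run(cmd: List[str]) -> int:
    """Run the command quietly and return its exit status."""
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode


def snat_reachable(
    host: str,
    count: int = 3,
    timeout_s: int = 2,
    run: Callable[[List[str]], int] = _run,
) -> bool:
    """Ping the external host and report whether the ping command succeeded."""
    cmd = ["ping", "-c", str(count), "-W", str(timeout_s), host]
    return run(cmd) == 0


if __name__ == "__main__":
    external_host = "192.0.2.1"  # placeholder address, replace with the real external host
    print("SNAT OK" if snat_reachable(external_host) else "SNAT FAILED")
```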
 

Comment by Ran Xiao [ 30/Nov/17 ]

Hi Yogalakshmi,

Thanks for your quick reply to Bertrand's question.

Hi Bertrand,

The verification steps I executed are the same as those Yogalakshmi shared.
If there is any other information I can provide, please let me know.

 

Comment by Bertrand Low [ 03/Dec/17 ]

Hi Ran,

I have followed the reproduction steps indicated in this bug report, and I have used both ssh and ping to test the SNAT connection from the VM instances to the external host.

I can verify that the bug is no longer reproduced using the following snapshots:

  • distribution-karaf-0.6.3-20171128.170008-250.tar.gz (Carbon)
  • karaf-0.7.1-20171122.210509-337.tar.gz (Nitrogen)

SNAT communication is successful.

I recommend closing this bug as it is no longer reproducible.

thanks,

Bertrand

Generated at Wed Feb 07 20:22:54 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.