[OPNFLWPLUG-891] Xid cannot be reserved after OVS connection flaps Created: 16/May/17 Updated: 27/Sep/21 Resolved: 28/Aug/17 |
|
| Status: | Resolved |
| Project: | OpenFlowPlugin |
| Component/s: | General |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | ||
| Reporter: | Vinh Nguyen | Assignee: | Jozef Bacigal |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Operating System: All |
||
| Attachments: |
|
| External issue ID: | 8478 |
| Description |
|
When defining large number of networks/subnets in bulk, OVS connection flaps frequently and eventually remains disconnected. Environment: single openstack node and single ODL. To reproduce: 1) using script to define large number of networks/subnets: neutron net-create vx-net$j-$i --provider:network_type vxlan where j is [1..15] and j is [1..100] 2) Around 1000 networks, OVS connection flaps and eventually remains disconneted. |
| Comments |
| Comment by Vinh Nguyen [ 16/May/17 ] |
|
Attachment karaf.log.tar has been added with description: karar.log.tar |
| Comment by Jozef Bacigal [ 05/Jun/17 ] |
|
Hi Vinh, can you please run this again on latest snapshots with bigger log size and DEBUG log for org.opendayligh.openflowplugin.impl Thanks Jozef |
| Comment by Vinh Nguyen [ 10/Jun/17 ] |
|
Hi Jozef, I am not able to reproduce this problem with the latest snapshot. However since this issue happened intermittently when I logged it, I'll try some more runs to make sure it is gone in the latest code base |
| Comment by Vinh Nguyen [ 14/Jun/17 ] |
|
Attachment karaf.log.18.7z has been added with description: karaf.log.18 |
| Comment by Vinh Nguyen [ 14/Jun/17 ] |
|
Attachment karaf.log.17.7z has been added with description: karaf.log.17 |
| Comment by Vinh Nguyen [ 14/Jun/17 ] |
|
Attachment karaf.log.16.7z has been added with description: karaf.log.16 |
| Comment by Vinh Nguyen [ 14/Jun/17 ] |
|
Attachment karaf.log.15.7z has been added with description: karaf.log.15 |
| Comment by Vinh Nguyen [ 14/Jun/17 ] |
|
Attachment karaf.log.14.7z has been added with description: karaf.log.14 |
| Comment by Vinh Nguyen [ 14/Jun/17 ] |
|
Attachment karaf.log.13.7z has been added with description: karaf.log.13 |
| Comment by Vinh Nguyen [ 14/Jun/17 ] |
|
Attachment karaf.log.12.7z has been added with description: karaf.log.12 |
| Comment by Vinh Nguyen [ 14/Jun/17 ] |
|
Attachment karaf.log.11.7z has been added with description: karaf.log.11 |
| Comment by Vinh Nguyen [ 14/Jun/17 ] |
|
Attachment karaf.log.10.7z has been added with description: karaf.log.10 |
| Comment by Vinh Nguyen [ 14/Jun/17 ] |
|
I cannot reproduce the original issue with latest ODL snapshot. However I encounter another issue: When defining large number of networks/subnets in bulk, OVS connection flaps frequently and new requests for network/subnet creation results in ERROR/WARNING in log files: "No queue present, failing request" Environment: single openstack node and single ODL. To reproduce: 1) using script to define large number of networks/subnets: neutron net-create vx-net$j-$i --provider:network_type vxlan where j is [1..30] and j is [1..100] 2) OVS connection to OFPlugin flaps several times but ODL recovers and keeps up the with continuous network creation requests 3) Around 3000 network creations, OVS connection flaps, this time new network creation requests result in ERROR/WARNING in karaf.log: "No queue present, failing request" I attached the karaf log files captured before and when the issus happens. Some interesting notes: 1) The "Xid cannot be reserved " issue happens in karaf.log.10 2) There are two consecutive ConnectionEvent for "Device connected" in the last connection flap. Normally, the karaf log shows two events: one for disconnection and one for connection when the connection flaps: > grep ConnectionEvent 3) In between the two "Device Connected" ConnectionEvents described above, there is an handshake attempt from the OVS. OF Plugin sees thatthis device is already connected and closing all connections: 2017-06-13 06:10:23,680 | INFO | entLoopGroup-7-7 | ConnectionAdapterImpl | 295 - org.opendaylight.openflowjava.openflow-protocol-impl - 0.9.1.SNAPSHOT | Hello received |
| Comment by Jozef Bacigal [ 26/Jun/17 ] |
|
Recheck this please after this patch |
| Comment by Vinh Nguyen [ 29/Jun/17 ] |
|
The problem still exists after the patch: https://git.opendaylight.org/gerrit/#/c/57814/ The "Xid cannot be reserved " problem still happens after the duplicate connection is received: "Device openflow:158835981449481 already connected. Closing all connection to the device" |
| Comment by Jozef Bacigal [ 03/Jul/17 ] |
|
OK and now the double connection problem https://git.opendaylight.org/gerrit/#/c/59557/ Try please this patch. Thanks |
| Comment by Vinh Nguyen [ 05/Jul/17 ] |
|
Karaf.log for patch 59557 |
| Comment by Vinh Nguyen [ 05/Jul/17 ] |
|
Attachment karaf.log.7z has been added with description: karaf.log.patch59557 |
| Comment by Vinh Nguyen [ 05/Jul/17 ] |
|
The patch 59557 doesn't fix the problem. New flows are still not pushed when new connection handshake received for already-connected OVS device. Please see the attached karaf logs (last 10 logs, the already-connected connection issue happens in karaf.log file) |
| Comment by Vinh Nguyen [ 05/Jul/17 ] |
|
(In reply to Vinh Nguyen from comment #16) Please ignore this attachment |
| Comment by Vinh Nguyen [ 05/Jul/17 ] |
|
Attachment karaf.7z has been added with description: New karaf.log for 59557 |
| Comment by Vinh Nguyen [ 05/Jul/17 ] |
|
(In reply to Vinh Nguyen from comment #17) The error messages "Xid cannot be reserved.." are not seen in the logs but the problem now seems that odl connection are dropped but new connection is not processed properly. |
| Comment by Jozef Bacigal [ 14/Aug/17 ] |
|
Closing this, connection bug is 8805 |