-
Bug
-
Resolution: Done
-
Medium
-
None
-
Fluorine
-
None
-
When running following test case in downstream scale/perf setup
Create a network
Create a sbunet
Create a router
Attach router to subnet and oublic network
Boot VM with floating IP
Ping VMwe see that some VMs remain unpingable even after 300 seconds and we see a lot of the following exceptions in karaf logs
2018-04-13T22:57:05,411 | ERROR | org.opendaylight.yang.gen.v1.urn.opendaylight.netvirt.natservice.rev160111.napt.switches.RouterToNaptSwitch_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0 | AsyncDataTreeChangeListenerBase | 263 - org.opendaylight.genius.mdsalutil-api - 0.4.0 | Thread terminated due to uncaught exception: org.opendaylight.yang.gen.v1.urn.opendaylight.netvirt.natservice.rev160111.napt.switches.RouterToNaptSwitch_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0
java.lang.IllegalArgumentException: Invalid range: -1, expected: [[0..18446744073709551615]].
at org.opendaylight.yang.gen.v1.urn.opendaylight.model.match.types.rev131026.match.TunnelBuilder.checkTunnelIdRange(TunnelBuilder.java:76) [391:org.opendaylight.openflowplugin.model.flow-base:0.6.0]
at org.opendaylight.yang.gen.v1.urn.opendaylight.model.match.types.rev131026.match.TunnelBuilder.setTunnelId(TunnelBuilder.java:81) [391:org.opendaylight.openflowplugin.model.flow-base:0.6.0]
at org.opendaylight.genius.mdsalutil.matches.MatchTunnelId.populateBuilder(MatchTunnelId.java:40) [263:org.opendaylight.genius.mdsalutil-api:0.4.0]
at org.opendaylight.genius.mdsalutil.matches.MatchTunnelId.populateBuilder(MatchTunnelId.java:18) [263:org.opendaylight.genius.mdsalutil-api:0.4.0]
at org.opendaylight.genius.mdsalutil.matches.MatchInfoHelper.createInnerMatchBuilder(MatchInfoHelper.java:29) [263:org.opendaylight.genius.mdsalutil-api:0.4.0]
at org.opendaylight.genius.mdsalutil.MDSALUtil.buildMatches(MDSALUtil.java:336) [263:org.opendaylight.genius.mdsalutil-api:0.4.0]
at org.opendaylight.genius.mdsalutil.FlowEntity.getFlowBuilder(FlowEntity.java:87) [263:org.opendaylight.genius.mdsalutil-api:0.4.0]
at org.opendaylight.genius.mdsalutil.internal.MDSALManager.writeFlowEntityInternal(MDSALManager.java:180) [264:org.opendaylight.genius.mdsalutil-impl:0.4.0]
at org.opendaylight.genius.mdsalutil.internal.MDSALManager.installFlowInternal(MDSALManager.java:142) [264:org.opendaylight.genius.mdsalutil-impl:0.4.0]
at org.opendaylight.genius.mdsalutil.internal.MDSALManager.installFlow(MDSALManager.java:609) [264:org.opendaylight.genius.mdsalutil-impl:0.4.0]
at Proxya2c1fd24_dd19_4d37_b669_2a61174cba99.installFlow(Unknown Source) [?:?]
at Proxy8f8e72d8_5b27_48d7_a9e4_a78c5736e1ae.installFlow(Unknown Source) [?:?]
at org.opendaylight.netvirt.natservice.internal.AbstractSnatService.syncFlow(AbstractSnatService.java:329) [360:org.opendaylight.netvirt.natservice-impl:0.6.0]
at org.opendaylight.netvirt.natservice.internal.ConntrackBasedSnatService.installTerminatingServiceTblEntry(ConntrackBasedSnatService.java:148) [360:org.opendaylight.netvirt.natservice-impl:0.6.0]
at org.opendaylight.netvirt.natservice.internal.ConntrackBasedSnatService.installSnatSpecificEntriesForNaptSwitch(ConntrackBasedSnatService.java:73) [360:org.opendaylight.netvirt.natservice-impl:0.6.0]
at org.opendaylight.netvirt.natservice.internal.AbstractSnatService.handleSnat(AbstractSnatService.java:126) [360:org.opendaylight.netvirt.natservice-impl:0.6.0]
at org.opendaylight.netvirt.natservice.internal.FlatVlanConntrackBasedSnatService.handleSnat(FlatVlanConntrackBasedSnatService.java:49) [360:org.opendaylight.netvirt.natservice-impl:0.6.0]
at org.opendaylight.netvirt.natservice.internal.AbstractSnatService.handleSnatAllSwitch(AbstractSnatService.java:105) [360:org.opendaylight.netvirt.natservice-impl:0.6.0]
at org.opendaylight.netvirt.natservice.internal.FlatVlanConntrackBasedSnatService.handleSnatAllSwitch(FlatVlanConntrackBasedSnatService.java:39) [360:org.opendaylight.netvirt.natservice-impl:0.6.0]
at org.opendaylight.netvirt.natservice.internal.SnatServiceManagerImpl.notify(SnatServiceManagerImpl.java:57) [360:org.opendaylight.netvirt.natservice-impl:0.6.0]
at org.opendaylight.netvirt.natservice.ha.SnatCentralizedSwitchChangeListener.add(SnatCentralizedSwitchChangeListener.java:87) [360:org.opendaylight.netvirt.natservice-impl:0.6.0]
at org.opendaylight.netvirt.natservice.ha.SnatCentralizedSwitchChangeListener.add(SnatCentralizedSwitchChangeListener.java:32) [360:org.opendaylight.netvirt.natservice-impl:0.6.0]
at org.opendaylight.genius.datastoreutils.AsyncDataTreeChangeListenerBase$DataTreeChangeHandler.run(AsyncDataTreeChangeListenerBase.java:167) [263:org.opendaylight.genius.mdsalutil-api:0.4.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:?]Core issue is getting gw port notification before router was created in mdsal as confirmed by following log entry:
2018-05-03T13:22:28,967 | WARN | org.opendaylight.yang.gen.v1.urn.opendaylight.neutron.ports.rev150712.ports.attributes.ports.Port_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0 | NeutronPortChangeListener | 362 - org.opendaylight.netvirt.neutronvpn-impl - 0.6.0 | No router found for router GW port c69fd64f-4c6a-43a6-ae21-c72f11d29a72 for router 29ec1425-72df-4b11-9327-527cf41d58a9
{{ Router router = neutronvpnUtils.getNeutronRouter(routerId);
if (router == null) {
LOG.warn("No router found for router GW port {} for router {}", routerGwPort.getUuid().getValue(),
routerId.getValue());
return;
}
gwMacResolver.sendArpRequestsToExtGateways(router);setExternalGwMac(routerGwPort, routerId);
}private void setExternalGwMac(Port routerGwPort, Uuid routerId) {
// During full-sync networking-odl syncs routers before ports. As such,
// the MAC of the router's gw port is not available to be set when the
// router is written. We catch that here.}}The most likely reason for this is n-odl sending out of order and should be addressed in n-odl/NN. But even then it shouldn't lead to unhandled exception and thread termination. Since it is a race condition, a simple retry might be enough to avoid this problem.
When running following test case in downstream scale/perf setup Create a network Create a sbunet Create a router Attach router to subnet and oublic network Boot VM with floating IP Ping VM we see that some VMs remain unpingable even after 300 seconds and we see a lot of the following exceptions in karaf logs 2018-04-13T22:57:05,411 | ERROR | org.opendaylight.yang.gen.v1.urn.opendaylight.netvirt.natservice.rev160111.napt.switches.RouterToNaptSwitch_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0 | AsyncDataTreeChangeListenerBase | 263 - org.opendaylight.genius.mdsalutil-api - 0.4.0 | Thread terminated due to uncaught exception: org.opendaylight.yang.gen.v1.urn.opendaylight.netvirt.natservice.rev160111.napt.switches.RouterToNaptSwitch_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0 java.lang.IllegalArgumentException: Invalid range: -1, expected: [ [0..18446744073709551615] ]. at org.opendaylight.yang.gen.v1.urn.opendaylight.model.match.types.rev131026.match.TunnelBuilder.checkTunnelIdRange(TunnelBuilder.java:76) [391:org.opendaylight.openflowplugin.model.flow-base:0.6.0] at org.opendaylight.yang.gen.v1.urn.opendaylight.model.match.types.rev131026.match.TunnelBuilder.setTunnelId(TunnelBuilder.java:81) [391:org.opendaylight.openflowplugin.model.flow-base:0.6.0] at org.opendaylight.genius.mdsalutil.matches.MatchTunnelId.populateBuilder(MatchTunnelId.java:40) [263:org.opendaylight.genius.mdsalutil-api:0.4.0] at org.opendaylight.genius.mdsalutil.matches.MatchTunnelId.populateBuilder(MatchTunnelId.java:18) [263:org.opendaylight.genius.mdsalutil-api:0.4.0] at org.opendaylight.genius.mdsalutil.matches.MatchInfoHelper.createInnerMatchBuilder(MatchInfoHelper.java:29) [263:org.opendaylight.genius.mdsalutil-api:0.4.0] at org.opendaylight.genius.mdsalutil.MDSALUtil.buildMatches(MDSALUtil.java:336) [263:org.opendaylight.genius.mdsalutil-api:0.4.0] at org.opendaylight.genius.mdsalutil.FlowEntity.getFlowBuilder(FlowEntity.java:87) [263:org.opendaylight.genius.mdsalutil-api:0.4.0] at org.opendaylight.genius.mdsalutil.internal.MDSALManager.writeFlowEntityInternal(MDSALManager.java:180) [264:org.opendaylight.genius.mdsalutil-impl:0.4.0] at org.opendaylight.genius.mdsalutil.internal.MDSALManager.installFlowInternal(MDSALManager.java:142) [264:org.opendaylight.genius.mdsalutil-impl:0.4.0] at org.opendaylight.genius.mdsalutil.internal.MDSALManager.installFlow(MDSALManager.java:609) [264:org.opendaylight.genius.mdsalutil-impl:0.4.0] at Proxya2c1fd24_dd19_4d37_b669_2a61174cba99.installFlow(Unknown Source) [?:?] at Proxy8f8e72d8_5b27_48d7_a9e4_a78c5736e1ae.installFlow(Unknown Source) [?:?] at org.opendaylight.netvirt.natservice.internal.AbstractSnatService.syncFlow(AbstractSnatService.java:329) [360:org.opendaylight.netvirt.natservice-impl:0.6.0] at org.opendaylight.netvirt.natservice.internal.ConntrackBasedSnatService.installTerminatingServiceTblEntry(ConntrackBasedSnatService.java:148) [360:org.opendaylight.netvirt.natservice-impl:0.6.0] at org.opendaylight.netvirt.natservice.internal.ConntrackBasedSnatService.installSnatSpecificEntriesForNaptSwitch(ConntrackBasedSnatService.java:73) [360:org.opendaylight.netvirt.natservice-impl:0.6.0] at org.opendaylight.netvirt.natservice.internal.AbstractSnatService.handleSnat(AbstractSnatService.java:126) [360:org.opendaylight.netvirt.natservice-impl:0.6.0] at org.opendaylight.netvirt.natservice.internal.FlatVlanConntrackBasedSnatService.handleSnat(FlatVlanConntrackBasedSnatService.java:49) [360:org.opendaylight.netvirt.natservice-impl:0.6.0] at org.opendaylight.netvirt.natservice.internal.AbstractSnatService.handleSnatAllSwitch(AbstractSnatService.java:105) [360:org.opendaylight.netvirt.natservice-impl:0.6.0] at org.opendaylight.netvirt.natservice.internal.FlatVlanConntrackBasedSnatService.handleSnatAllSwitch(FlatVlanConntrackBasedSnatService.java:39) [360:org.opendaylight.netvirt.natservice-impl:0.6.0] at org.opendaylight.netvirt.natservice.internal.SnatServiceManagerImpl.notify(SnatServiceManagerImpl.java:57) [360:org.opendaylight.netvirt.natservice-impl:0.6.0] at org.opendaylight.netvirt.natservice.ha.SnatCentralizedSwitchChangeListener.add(SnatCentralizedSwitchChangeListener.java:87) [360:org.opendaylight.netvirt.natservice-impl:0.6.0] at org.opendaylight.netvirt.natservice.ha.SnatCentralizedSwitchChangeListener.add(SnatCentralizedSwitchChangeListener.java:32) [360:org.opendaylight.netvirt.natservice-impl:0.6.0] at org.opendaylight.genius.datastoreutils.AsyncDataTreeChangeListenerBase$DataTreeChangeHandler.run(AsyncDataTreeChangeListenerBase.java:167) [263:org.opendaylight.genius.mdsalutil-api:0.4.0] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:?] Core issue is getting gw port notification before router was created in mdsal as confirmed by following log entry: 2018-05-03T13:22:28,967 | WARN | org.opendaylight.yang.gen.v1.urn.opendaylight.neutron.ports.rev150712.ports.attributes.ports.Port_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0 | NeutronPortChangeListener | 362 - org.opendaylight.netvirt.neutronvpn-impl - 0.6.0 | No router found for router GW port c69fd64f-4c6a-43a6-ae21-c72f11d29a72 for router 29ec1425-72df-4b11-9327-527cf41d58a9 {{ Router router = neutronvpnUtils.getNeutronRouter(routerId); if (router == null) { LOG.warn("No router found for router GW port {} for router {}", routerGwPort.getUuid().getValue(), routerId.getValue()); return; } gwMacResolver.sendArpRequestsToExtGateways(router); setExternalGwMac(routerGwPort, routerId); } private void setExternalGwMac(Port routerGwPort, Uuid routerId) { // During full-sync networking-odl syncs routers before ports. As such, // the MAC of the router's gw port is not available to be set when the // router is written. We catch that here.}} The most likely reason for this is n-odl sending out of order and should be addressed in n-odl/NN. But even then it shouldn't lead to unhandled exception and thread termination. Since it is a race condition, a simple retry might be enough to avoid this problem.
-
Normal
When running downstream scale/perf testing we see that some VMs remain unpingable even after 300 seconds and we see a lot of the exceptions in karaf logs when running following tests and ping fails
Create a network
Create a sbunet
Create a router
Attach router to subnet and oublic network
Boot VM with floating IP
Ping VM
Exceptions
--------------
2018-04-13T22:57:05,411 | ERROR | org.opendaylight.yang.gen.v1.urn.opendaylight.netvirt.natservice.rev160111.napt.switches.RouterToNaptSwitch_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0 | AsyncDataTreeChangeListenerBase | 263 - org.opendaylight.genius.mdsalutil-api - 0.4.0.redhat-5 | Thread terminated due to uncaught exception: org.opendaylight.yang.gen.v1.urn.opendaylight.netvirt.natservice.rev160111.napt.switches.RouterToNaptSwitch_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0 java.lang.IllegalArgumentException: Invalid range: -1, expected: [[0..18446744073709551615]]. at org.opendaylight.yang.gen.v1.urn.opendaylight.model.match.types.rev131026.match.TunnelBuilder.checkTunnelIdRange(TunnelBuilder.java:76) at org.opendaylight.yang.gen.v1.urn.opendaylight.model.match.types.rev131026.match.TunnelBuilder.setTunnelId(TunnelBuilder.java:81) at org.opendaylight.genius.mdsalutil.matches.MatchTunnelId.populateBuilder(MatchTunnelId.java:40) at org.opendaylight.genius.mdsalutil.matches.MatchTunnelId.populateBuilder(MatchTunnelId.java:18) at org.opendaylight.genius.mdsalutil.matches.MatchInfoHelper.createInnerMatchBuilder(MatchInfoHelper.java:29) [ at org.opendaylight.genius.mdsalutil.MDSALUtil.buildMatches(MDSALUtil.java:336) at org.opendaylight.genius.mdsalutil.FlowEntity.getFlowBuilder(FlowEntity.java:87) at org.opendaylight.genius.mdsalutil.internal.MDSALManager.writeFlowEntityInternal(MDSALManager.java:180) at org.opendaylight.genius.mdsalutil.internal.MDSALManager.installFlowInternal(MDSALManager.java:142) at org.opendaylight.genius.mdsalutil.internal.MDSALManager.installFlow(MDSALManager.java:609) at Proxya2c1fd24_dd19_4d37_b669_2a61174cba99.installFlow(Unknown Source) [?:?] at Proxy8f8e72d8_5b27_48d7_a9e4_a78c5736e1ae.installFlow(Unknown Source) [?:?] at org.opendaylight.netvirt.natservice.internal.AbstractSnatService.syncFlow(AbstractSnatService.java:329) at org.opendaylight.netvirt.natservice.internal.ConntrackBasedSnatService.installTerminatingServiceTblEntry(ConntrackBasedSnatService.java:148) at org.opendaylight.netvirt.natservice.internal.ConntrackBasedSnatService.installSnatSpecificEntriesForNaptSwitch(ConntrackBasedSnatService.java:73) at org.opendaylight.netvirt.natservice.internal.AbstractSnatService.handleSnat(AbstractSnatService.java:126) at org.opendaylight.netvirt.natservice.internal.FlatVlanConntrackBasedSnatService.handleSnat(FlatVlanConntrackBasedSnatService.java:49) at org.opendaylight.netvirt.natservice.internal.AbstractSnatService.handleSnatAllSwitch(AbstractSnatService.java:105) at org.opendaylight.netvirt.natservice.internal.FlatVlanConntrackBasedSnatService.handleSnatAllSwitch(FlatVlanConntrackBasedSnatService.java:39) at org.opendaylight.netvirt.natservice.internal.SnatServiceManagerImpl.notify(SnatServiceManagerImpl.java:57) at org.opendaylight.netvirt.natservice.ha.SnatCentralizedSwitchChangeListener.add(SnatCentralizedSwitchChangeListener.java:87) at org.opendaylight.netvirt.natservice.ha.SnatCentralizedSwitchChangeListener.add(SnatCentralizedSwitchChangeListener.java:32) at org.opendaylight.genius.datastoreutils.AsyncDataTreeChangeListenerBase$DataTreeChangeHandler.run(AsyncDataTreeChangeListenerBase.java:167) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:?]
Looking at rest of the logs, the core issue is getting notifications for gw mac port creation before neutron router is created.
2018-05-03T13:22:28,967 | WARN | org.opendaylight.yang.gen.v1.urn.opendaylight.neutron.ports.rev150712.ports.attributes.ports.Port_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0 | NeutronPortChangeListener | 362 - org.opendaylight.netvirt.neutronvpn-impl - 0.6.0 | *No router found for router GW port c69fd64f-4c6a-43a6-ae21-c72f11d29a72 for router *29ec1425-72df-4b11-9327-527cf41d58a9 2018-05-03T13:22:29,263 | INFO | org.opendaylight.yang.gen.v1.urn.opendaylight.neutron.l3.rev150712.routers.attributes.routers.Router_AsyncClusteredDataTreeChangeListenerBase-DataTreeChangeHandler-0 | NeutronRouterChangeListener | 356 - org.opendaylight.netvirt.ipv6service-impl - 0.6.0 | Add Router notification handler is invoked Uuid [_value=29ec1425-72df-4b11-9327-527cf41d58a9]. 2018-05-03T13:22:29,283 | INFO | ForkJoinPool-2-worker-34 | VpnInstanceListener | 370 - org.opendaylight.netvirt.vpnmanager-impl - 0.6.0| VPN-ADD: addVpnInstance: VPN Id 106159 generated for VpnInstanceName 29ec1425-72df-4b11-9327-527cf41d58a9
Router router = neutronvpnUtils.getNeutronRouter(routerId);
if (router == null) {
LOG.warn("No router found for router GW port {} for router {}", routerGwPort.getUuid().getValue(),
routerId.getValue());
return;
}
gwMacResolver.sendArpRequestsToExtGateways(router);setExternalGwMac(routerGwPort, routerId);
}private void setExternalGwMac(Port routerGwPort, Uuid routerId) {
// During full-sync networking-odl syncs routers before ports. As such,
// the MAC of the router's gw port is not available to be set when the
// router is written. We catch that here.
This means it is not an issue of notifications out of order, but getting gw port notification before router was created in mdsal. The most likely reason for this is n-odl sending out of order, but it shouldn't result in unhandled exception and thread termination.