[NETVIRT-867] NPE at org.opendaylight.netvirt.elan.arp.responder.ArpResponderUtil.getMatchCriteria(ArpResponderUtil.java:178) Created: 23/Aug/17  Updated: 05/Dec/17  Resolved: 05/Dec/17

Status: Resolved
Project: netvirt
Component/s: General
Affects Version/s: Carbon
Fix Version/s: None

Type: Bug
Reporter: Michael Vorburger Assignee: Vinoth B
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 9035

 Description   

In the logs of CONTROLLER-1756 I'm seeing some of these, which should be looked at:

2017-08-22 17:11:50,048 | ERROR | Pool-1-worker-45 | DataStoreJobCoordinator | 319 - org.opendaylight.genius.mdsalutil-api - 0.2.2.Carbon | Exception when executing jobEntry: JobEntry{key='VPNINTERFACE- 55ef861d-a228-4f3c-9b1c-110017f6604e', mainWorker=org.opendaylight.netvirt.vpnmanager.VpnInterfaceManager$$Lambda$584/679401846@6c234385, rollbackWorker=null, retryCount=0, futures=null}

java.lang.NullPointerException
at org.opendaylight.netvirt.elan.arp.responder.ArpResponderUtil.getMatchCriteria(ArpResponderUtil.java:178)
at org.opendaylight.netvirt.elan.utils.ElanUtils.addArpResponderFlow(ElanUtils.java:2333)
at org.opendaylight.netvirt.elan.internal.ElanServiceProvider.addArpResponderFlow(ElanServiceProvider.java:821)
at Proxy6027908b_d3c4_41a8_ad12_9b2b178ad779.addArpResponderFlow(Unknown Source)
at Proxy0897b899_5377_482b_9328_0843142bc04b.addArpResponderFlow(Unknown Source)
at org.opendaylight.netvirt.vpnmanager.arp.responder.ArpResponderHandler.addArpResponderFlow(ArpResponderHandler.java:95)
at org.opendaylight.netvirt.vpnmanager.VpnInterfaceManager.processVpnInterfaceAdjacencies(VpnInterfaceManager.java:633)
at org.opendaylight.netvirt.vpnmanager.VpnInterfaceManager.processVpnInterfaceUp(VpnInterfaceManager.java:354)
at org.opendaylight.netvirt.vpnmanager.VpnInterfaceManager.lambda$addVpnInterface$0(VpnInterfaceManager.java:238)
at org.opendaylight.genius.datastoreutils.DataStoreJobCoordinator$MainTask.run(DataStoreJobCoordinator.java:285)[319:org.opendaylight.genius.mdsalutil-api:0.2.2.Carbon]
at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)[:1.8.0_141]
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)[:1.8.0_141]
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)[:1.8.0_141]
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)[:1.8.0_141]
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)[:1.8.0_141]



 Comments   
Comment by Vinoth B [ 30/Aug/17 ]

After analyzing the karaf log, I found that the ELANInstance is null for the particular interfaces. The VPN interface add event for these interfaces may have failed because of an OOM. This causes the NPE while retrieving the ELAN instances.

I have added the null check before adding the ARP flow.

Please mention the steps to reproduce to verify the fix.
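
For illustration only, the kind of null check described above could look roughly like the sketch below. This is not the code from the actual patch: the class `ArpFlowNullGuardSketch`, the stub type `ElanInstanceStub`, the map `ELAN_BY_INTERFACE`, and the method `buildArpMatch` are all hypothetical stand-ins, and the map lookup merely simulates a datastore read that can return null.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of a null guard; none of these names come from netvirt.
public class ArpFlowNullGuardSketch {

    /** Hypothetical stand-in for the ELAN instance object read from the datastore. */
    static final class ElanInstanceStub {
        final String name;
        final long elanTag;

        ElanInstanceStub(String name, long elanTag) {
            this.name = name;
            this.elanTag = elanTag;
        }
    }

    /** Hypothetical in-memory lookup that simulates the datastore read (assumption). */
    static final Map<String, ElanInstanceStub> ELAN_BY_INTERFACE = new HashMap<>();

    /**
     * Builds a textual match description for an ARP responder flow.
     * Returns an empty Optional instead of dereferencing a missing ELAN instance.
     */
    static Optional<String> buildArpMatch(String interfaceName, String ipAddress) {
        ElanInstanceStub elan = ELAN_BY_INTERFACE.get(interfaceName);
        if (elan == null) {
            // Guard: without this check, the elan.elanTag access below would throw an NPE.
            System.err.println("No ELAN instance for interface " + interfaceName
                    + "; skipping ARP responder flow");
            return Optional.empty();
        }
        return Optional.of("elanTag=" + elan.elanTag + ",arp_tpa=" + ipAddress);
    }

    public static void main(String[] args) {
        ELAN_BY_INTERFACE.put("port-a", new ElanInstanceStub("elan-1", 5000L));

        // Known interface: a match description is produced.
        System.out.println(buildArpMatch("port-a", "10.0.0.5"));

        // Unknown interface: the guard skips the flow instead of throwing.
        System.out.println(buildArpMatch("port-missing", "10.0.0.6"));
    }
}
```

A guard like this only avoids the exception; it does not explain why the ELAN instance was missing in the first place, which is the open question in this issue.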

Comment by Michael Vorburger [ 30/Aug/17 ]

> The VPN interface add event for these interfaces may have failed because of an OOM

That is very unlikely, because in case of an OOM we shut down Karaf completely; we don't just skip certain things. So this explanation doesn't really make much sense.

> Please mention the steps to reproduce to verify the fix.

I have no idea. To quote CONTROLLER-1756: "Our scenario is a cluster of 3 nodes with odl-netvirt-openstack being stress tested by OpenStack's rally benchmarking tool." So I would suggest that you propose a Gerrit change doing what you think would fix this and get it reviewed and merged. We then close this issue, and re-open it if the problem is still seen after merging your fix.

Comment by Vinoth B [ 31/Aug/17 ]

Patch pushed:

https://git.opendaylight.org/gerrit/#/c/62484/
