[NETVIRT-1297] NullPointerException: Vpn Instance not available b7029bdc-4bd6-4767-902a-ed64717bf6c8 Created: 04/Jun/18  Updated: 19/Nov/19  Resolved: 03/Oct/18

Status: Resolved
Project: netvirt
Component/s: vpnmanager
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: High
Reporter: Sam Hague
Assignee: Abhinav Gupta
Resolution: Cannot Reproduce
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

https://jenkins.opendaylight.org/releng/job/netvirt-csit-1node-openstack-queens-upstream-stateful-fluorine/582/

2018-06-04T07:42:49,904 | WARN  | org.opendaylight.yang.gen.v1.urn.ericsson.params.xml.ns.yang.ebgp.rev150901.bgp.Networks_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0 | BgpConfigurationManager          | 355 - org.opendaylight.netvirt.bgpmanager-impl - 0.7.0.SNAPSHOT | networks : configuration received when BGP is inactive
2018-06-04T07:42:51,899 | ERROR | ForkJoinPool-1-worker-3 | VpnInterfaceManager              | 382 - org.opendaylight.netvirt.vpnmanager-impl - 0.7.0.SNAPSHOT | processVpnInterfaceDown: Unable to process delete/down for interface 464fcc02-51ae-4a0b-9944-7b089c9dca49 on dpn 83601713340639 as it is not available in operational data store
2018-06-04T07:42:51,900 | ERROR | ForkJoinPool-1-worker-0 | VpnInterfaceManager              | 382 - org.opendaylight.netvirt.vpnmanager-impl - 0.7.0.SNAPSHOT | processVpnInterfaceDown: Unable to process delete/down for interface 74a81a1c-6471-4d4b-863a-faac81af4956 on dpn 75779649151086 as it is not available in operational data store
2018-06-04T07:42:51,900 | ERROR | ForkJoinPool-1-worker-2 | VpnInterfaceManager              | 382 - org.opendaylight.netvirt.vpnmanager-impl - 0.7.0.SNAPSHOT | processVpnInterfaceDown: Unable to process delete/down for interface 67d03c23-2e67-4109-b3f8-fe3d5354c331 on dpn 80177565169773 as it is not available in operational data store
2018-06-04T07:42:51,937 | ERROR | org.opendaylight.yang.gen.v1.urn.opendaylight.netvirt.fibmanager.rev150330.vrfentries.VrfEntry_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0 | AsyncDataTreeChangeListenerBase  | 274 - org.opendaylight.genius.mdsalutil-api - 0.5.0.SNAPSHOT | Thread terminated due to uncaught exception: org.opendaylight.yang.gen.v1.urn.opendaylight.netvirt.fibmanager.rev150330.vrfentries.VrfEntry_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0
java.lang.NullPointerException: Vpn Instance not available b7029bdc-4bd6-4767-902a-ed64717bf6c8
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:900) [32:com.google.guava:23.6.0.jre]
at org.opendaylight.netvirt.fibmanager.RouterInterfaceVrfEntryHandler.installRouterFibEntries(RouterInterfaceVrfEntryHandler.java:72) [365:org.opendaylight.netvirt.fibmanager-impl:0.7.0.SNAPSHOT]
at org.opendaylight.netvirt.fibmanager.RouterInterfaceVrfEntryHandler.removeFlows(RouterInterfaceVrfEntryHandler.java:66) [365:org.opendaylight.netvirt.fibmanager-impl:0.7.0.SNAPSHOT]
at org.opendaylight.netvirt.fibmanager.VrfEntryListener.removeFibEntries(VrfEntryListener.java:251) [365:org.opendaylight.netvirt.fibmanager-impl:0.7.0.SNAPSHOT]
at org.opendaylight.netvirt.fibmanager.VrfEntryListener.remove(VrfEntryListener.java:232) [365:org.opendaylight.netvirt.fibmanager-impl:0.7.0.SNAPSHOT]
at org.opendaylight.netvirt.fibmanager.VrfEntryListener.remove(VrfEntryListener.java:107) [365:org.opendaylight.netvirt.fibmanager-impl:0.7.0.SNAPSHOT]
at org.opendaylight.genius.datastoreutils.AsyncDataTreeChangeListenerBase$DataTreeChangeHandler.run(AsyncDataTreeChangeListenerBase.java:158) [274:org.opendaylight.genius.mdsalutil-api:0.5.0.SNAPSHOT]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:?]
at java.lang.Thread.run(Thread.java:748) [?:?]


 Comments   
Comment by Vivekanandan Narasimhan [ 05/Jun/18 ]

Hi Sam,

This looks to be a genuine failure exposed simply by the timing between the following two threads…

The VRFEntry for router-interface 10.20.20.1/32 was removed from the FIB before cleanupDpnOnVpn succeeded for the last DPN on the VPN.
So the flows for the router-interface IP were not removed from the last DPN.

And when the VRFEntry remove() was being processed inside the VRFEngine for the router-interface holding 10.20.20.1, the vpnInstanceOpData had already
been removed, so even that path could not remove the flows from Table 19 and Table 21 correctly.
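
To illustrate the failure mode described above, here is a minimal sketch, not the netvirt code. The class, map, and method names are hypothetical, and `Objects.requireNonNull` stands in for Guava's `Preconditions.checkNotNull` so the example has no dependencies. It contrasts a strict lookup, which throws when the op-data entry is already gone, with a tolerant remove path that reports the miss and returns. The tolerant variant would only keep the listener thread alive; it would not by itself remove the stale flows.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; none of these names are the netvirt API. */
final class RemovePathSketch {
    /** Stand-in for the operational VPN instance data, keyed by RD. */
    static final Map<String, Long> VPN_ID_BY_RD = new ConcurrentHashMap<>();

    /** Mirrors the failing pattern: a missing entry throws NullPointerException. */
    static long strictLookup(String rd) {
        return Objects.requireNonNull(VPN_ID_BY_RD.get(rd),
                "Vpn Instance not available " + rd);
    }

    /**
     * Tolerant variant for the remove path: if the entry was already deleted
     * by another thread, return false instead of throwing.
     */
    static boolean removeFlowsIfVpnPresent(String rd, String prefix) {
        Long vpnId = VPN_ID_BY_RD.get(rd);
        if (vpnId == null) {
            // Op-data is gone; the caller can log this and continue.
            return false;
        }
        // A real handler would remove the flows for (vpnId, prefix) here.
        return true;
    }
}
```

With an empty map, `strictLookup("rd-1")` throws, while `removeFlowsIfVpnPresent("rd-1", "10.20.20.1/32")` returns `false` and lets the caller carry on.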

One way to address this problem is to include router-interfaces in the vpn-to-router-list inside vpn-instance-op-data and then remove such router-interfaces
as their IP addresses are cleaned up from Table 19 (L3GWMAC) and Table 21 (ICMP) by the VRFEngine, but that is a fair amount of code change.
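
The idea above could be sketched roughly as follows. This is an assumption-laden illustration, not the proposed patch: `VpnOpDataSketch` and all of its methods are hypothetical, and an in-memory map stands in for vpn-instance-op-data. The op-data entry tracks the router-interface IPs whose flows are still installed, and a delete request is deferred until every tracked IP has been reported as cleaned.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of deferring op-data deletion; not the netvirt API. */
final class VpnOpDataSketch {
    // Router-interface IPs per VPN whose flows have not been cleaned up yet.
    private final Map<String, Set<String>> pendingRouterIps = new HashMap<>();
    // VPNs whose deletion was requested but had to be deferred.
    private final Set<String> deleteRequested = new HashSet<>();

    synchronized void addRouterInterface(String vpnName, String ip) {
        pendingRouterIps.computeIfAbsent(vpnName, k -> new HashSet<>()).add(ip);
    }

    /** Returns true if the entry was deleted now, false if unknown or deferred. */
    synchronized boolean requestDelete(String vpnName) {
        if (!pendingRouterIps.containsKey(vpnName)) {
            return false;
        }
        deleteRequested.add(vpnName);
        return deleteIfDrained(vpnName);
    }

    /** Called after the flows for ip were removed; true if this completed a deferred delete. */
    synchronized boolean routerInterfaceCleaned(String vpnName, String ip) {
        Set<String> ips = pendingRouterIps.get(vpnName);
        if (ips != null) {
            ips.remove(ip);
        }
        return deleteIfDrained(vpnName);
    }

    synchronized boolean isPresent(String vpnName) {
        return pendingRouterIps.containsKey(vpnName);
    }

    // Deletes the entry only when deletion was requested and no IPs remain.
    private boolean deleteIfDrained(String vpnName) {
        Set<String> ips = pendingRouterIps.get(vpnName);
        if (ips != null && ips.isEmpty() && deleteRequested.contains(vpnName)) {
            pendingRouterIps.remove(vpnName);
            deleteRequested.remove(vpnName);
            return true;
        }
        return false;
    }
}
```

In this sketch the VRFEntry remove path would always find the entry while its router-interface IP is still tracked, so the ordering between the two threads would no longer matter.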

You can raise a JIRA and assign it to me so that I can route it to folks who will be able to address this properly.

@KiranN and @Hanumant, what do you think?

From [1]
2018-06-04T07:42:51,937 | INFO | org.opendaylight.yang.gen.v1.urn.opendaylight.netvirt.fibmanager.rev150330.vrfentries.VrfEntry_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0 | VrfEntryListener | 365 - org.opendaylight.netvirt.fibmanager-impl - 0.7.0.SNAPSHOT | REMOVE: Removed Fib Entry rd b7029bdc-4bd6-4767-902a-ed64717bf6c8 prefix 10.20.20.1/32 route-paths [RoutePaths{getLabel=100399, getNexthopAddress=0.0.0.0, augmentations={}}]

2018-06-04T07:42:51,937 | INFO | ForkJoinPool-1-worker-4 | VpnFootprintService$DpnEnterExitVpnWorker | 382 - org.opendaylight.netvirt.vpnmanager-impl - 0.7.0.SNAPSHOT | onSuccess: FootPrint cleared for vpn b7029bdc-4bd6-4767-902a-ed64717bf6c8 rd b7029bdc-4bd6-4767-902a-ed64717bf6c8 on dpn 80177565169773

2018-06-04T07:42:51,940 | INFO | CommitFutures-6 | VpnOpStatusListener$PostDeleteVpnInstanceWorker | 382 - org.opendaylight.netvirt.vpnmanager-impl - 0.7.0.SNAPSHOT | onSuccess: VpnId for VpnName b7029bdc-4bd6-4767-902a-ed64717bf6c8 is released to IdManager successfully.

2018-06-04T07:42:51,937 | ERROR | org.opendaylight.yang.gen.v1.urn.opendaylight.netvirt.fibmanager.rev150330.vrfentries.VrfEntry_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0 | AsyncDataTreeChangeListenerBase | 274 - org.opendaylight.genius.mdsalutil-api - 0.5.0.SNAPSHOT | Thread terminated due to uncaught exception: org.opendaylight.yang.gen.v1.urn.opendaylight.netvirt.fibmanager.rev150330.vrfentries.VrfEntry_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0
java.lang.NullPointerException: Vpn Instance not available b7029bdc-4bd6-4767-902a-ed64717bf6c8
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:900) [32:com.google.guava:23.6.0.jre]
at org.opendaylight.netvirt.fibmanager.RouterInterfaceVrfEntryHandler.installRouterFibEntries(RouterInterfaceVrfEntryHandler.java:72) [365:org.opendaylight.netvirt.fibmanager-impl:0.7.0.SNAPSHOT]
at org.opendaylight.netvirt.fibmanager.RouterInterfaceVrfEntryHandler.removeFlows(RouterInterfaceVrfEntryHandler.java:66) [365:org.opendaylight.netvirt.fibmanager-impl:0.7.0.SNAPSHOT]
at org.opendaylight.netvirt.fibmanager.VrfEntryListener.removeFibEntries(VrfEntryListener.java:251) [365:org.opendaylight.netvirt.fibmanager-impl:0.7.0.SNAPSHOT]

[1] https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-1node-openstack-queens-upstream-stateful-fluorine/582/odl_1/odl1_karaf.log.gz

[2] https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-1node-openstack-queens-upstream-stateful-fluorine/582/robot-plugin/log_10_arp_learning.html.gz#s1-t12-k11-k3-k1-k1-k1-k6-k1-k3

List of commits from May 31 till today on NetVirt. None of these commits seem to be the culprit.

The same is the case with Genius: nothing identifiable as a culprit.


Thanks,

Vivek

From: Sam Hague <shague@redhat.com>
Sent: Monday, June 04, 2018 5:19 PM
To: N Vivekanandan <n.vivekanandan@ericsson.com>
Subject: arp tests failing

Vivek,

Could you look at the CSIT in the ARP suite? I merged some patches over the weekend and somewhere along the way the ARP tests started failing. Ignore the SSH exception in the L2 suite, as that is an infra issue.

Thanks, Sam

[1] https://jenkins.opendaylight.org/releng/job/netvirt-csit-1node-openstack-queens-upstream-stateful-fluorine/582/

Comment by Vivekanandan Narasimhan [ 05/Jun/18 ]

Hi Abhinav,

Can you kindly take a look and help raise a fix? I have described the approach to fix this issue in the email thread quoted above.

Vivek

Comment by Sam Hague [ 06/Jun/18 ]

Another fail from today: https://jenkins.opendaylight.org/releng/job/netvirt-csit-1node-openstack-queens-upstream-stateful-fluorine/595/

Comment by Sam Hague [ 09/Jun/18 ]

Another fail: https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-1node-openstack-queens-upstream-stateful-fluorine/612/
