[NETVIRT-77] [Legacy] br-int not getting created when Netvirt used as 3node Created: 18/Aug/16  Updated: 09/Mar/18  Resolved: 24/Aug/16

Status: Resolved
Project: netvirt
Component/s: None
Affects Version/s: Boron
Fix Version/s: None

Type: Bug
Reporter: Venkatrangan Govindarajan Assignee: Bertrand Low
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Attachments: File odl1_bug_6455_log.tgz     File odl2_bug_6455_log.tgz     File odl3_bug_6455_log.tgz    
External issue ID: 6455

 Description   

distribution used for testing: https://nexus.opendaylight.org/content/repositories/opendaylight.snapshot/org/opendaylight/integration/distribution-karaf/0.5.0-SNAPSHOT/distribution-karaf-0.5.0-20160817.134922-4396.zip

Steps Tried out

1. Deploy 3 node as Cluster

2. Installed odl-ovsdb-openstack, odl-jolokia in 3nodes

3. Ensured the clustering is fine by checking sync status

4. Added the 3 node as Managers to OVS

Observation1: br-int not getting created

Observation2: Manually created br-int and then checked dump-fl;ows, the pipeline initial flows are nnot geting added.

Observation 3: saw this is ODL2 log
--------------------------------------------------------
2016-08-17 20:10:47,707 | WARN | rint Extender: 3 | NetvirtProvidersProvider | 314 - org.opendaylight.netvirt.openstack.net-virt-providers - 1.3.0.SNAPSHOT | Failed to start Netvirt:
java.lang.NullPointerException
at org.opendaylight.netvirt.openstack.netvirt.impl.ProviderNetworkManagerImpl.providerAdded(ProviderNetworkManagerImpl.java:74)
at org.opendaylight.netvirt.openstack.netvirt.providers.openflow13.OF13Provider.setDependencies(OF13Provider.java:2017)
at org.opendaylight.netvirt.openstack.netvirt.providers.ConfigActivator$1.addingService(ConfigActivator.java:186)
at org.osgi.util.tracker.ServiceTracker$Tracked.customizerAdding(ServiceTracker.java:932)[karaf-org.osgi.core.jar:]
at org.osgi.util.tracker.ServiceTracker$Tracked.customizerAdding(ServiceTracker.java:864)[karaf-org.osgi.core.jar:]
at org.osgi.util.tracker.AbstractTracked.trackAdding(AbstractTracked.java:256)[karaf-org.osgi.core.jar:]
at org.osgi.util.tracker.AbstractTracked.trackInitial(AbstractTracked.java:183)[karaf-org.osgi.core.jar:]
at org.osgi.util.tracker.ServiceTracker.open(ServiceTracker.java:317)[karaf-org.osgi.core.jar:]
at org.osgi.util.tracker.ServiceTracker.open(ServiceTracker.java:261)[karaf-org.osgi.core.jar:]
at org.opendaylight.netvirt.openstack.netvirt.providers.ConfigActivator.start(ConfigActivator.java:191)
at org.opendaylight.netvirt.openstack.netvirt.providers.NetvirtProvidersProvider.start(NetvirtProvidersProvider.java:97)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)[:1.8.0_101]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)[:1.8.0_101]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)[:1.8.0_101]
at java.lang.reflect.Method.invoke(Method.java:498)[:1.8.0_101]
at org.apache.aries.blueprint.utils.ReflectionUtils.invoke(ReflectionUtils.java:299)[15:org.apache.aries.blueprint.core:1.4.4]
at org.apache.aries.blueprint.container.BeanRecipe.invoke(BeanRecipe.java:956)[15:org.apache.aries.blueprint.core:1.4.4]
at org.apache.aries.blueprint.container.BeanRecipe.runBeanProcInit(BeanRecipe.java:712)[15:org.apache.aries.blueprint.core:1.4.4]
at org.apache.aries.blueprint.container.BeanRecipe.internalCreate2(BeanRecipe.java:824)[15:org.apache.aries.blueprint.core:1.4.4]
at org.apache.aries.blueprint.container.BeanRecipe.internalCreate(BeanRecipe.java:787)[15:org.apache.aries.blueprint.core:1.4.4]
at org.apache.aries.blueprint.di.AbstractRecipe$1.call(AbstractRecipe.java:79)[15:org.apache.aries.blueprint.core:1.4.4]
at java.util.concurrent.FutureTask.run(FutureTask.java:266)[:1.8.0_101]
at org.apache.aries.blueprint.di.AbstractRecipe.create(AbstractRecipe.java:88)[15:org.apache.aries.blueprint.core:1.4.4]
at org.apache.aries.blueprint.container.BlueprintRepository.createInstances(BlueprintRepository.java:247)[15:org.apache.aries.blueprint.core:1.4.4]
at org.apache.aries.blueprint.container.BlueprintRepository.createAll(BlueprintRepository.java:183)[15:org.apache.aries.blueprint.core:1.4.4]
at org.apache.aries.blueprint.container.BlueprintContainerImpl.instantiateEagerComponents(BlueprintContainerImpl.java:682)[15:org.apache.aries.blueprint.core:1.4.4]
at org.apache.aries.blueprint.container.BlueprintContainerImpl.doRun(BlueprintContainerImpl.java:377)[15:org.apache.aries.blueprint.core:1.4.4]
at org.apache.aries.blueprint.container.BlueprintContainerImpl.run(BlueprintContainerImpl.java:269)[15:org.apache.aries.blueprint.core:1.4.4]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)[:1.8.0_101]
at java.util.concurrent.FutureTask.run(FutureTask.java:266)[:1.8.0_101]
at org.apache.aries.blueprint.container.ExecutorServiceWrapper.run(ExecutorServiceWrapper.java:106)[15:org.apache.aries.blueprint.core:1.4.4]
at org.apache.aries.blueprint.utils.threading.impl.DiscardableRunnable.run(DiscardableRunnable.java:48)[15:org.apache.aries.blueprint.core:1.4.4]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)[:1.8.0_101]
at java.util.concurrent.FutureTask.run(FutureTask.java:266)[:1.8.0_101]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)[:1.8.0_101]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)[:1.8.0_101]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)[:1.8.0_101]

------------------------------------------------

5. Checked the entity owners , ODL2 was the owner of netvirt-provider entity

6. Killed karaf in ODL2 and restarted

7. ODL1 became owner of netvirt-provider
Observation: br-int got created and pipeline flows got added



 Comments   
Comment by Venkatrangan Govindarajan [ 18/Aug/16 ]

Attachment odl1_bug_6455_log.tgz has been added with description: ODL1 logs

Comment by Venkatrangan Govindarajan [ 18/Aug/16 ]

Attachment odl2_bug_6455_log.tgz has been added with description: ODL2 logs

Comment by Venkatrangan Govindarajan [ 18/Aug/16 ]

Attachment odl3_bug_6455_log.tgz has been added with description: ODL3 logs

Comment by Venkatrangan Govindarajan [ 19/Aug/16 ]

The issue occurs in releng also now

https://logs.opendaylight.org/releng/jenkins092/netvirt-csit-3node-openstack-mitaka-openstack-boron/118/

Comment by Bertrand Low [ 23/Aug/16 ]

The error appears to be happening because of a timing issue between O13Provider and ProviderNetworkManagerImpl
1) An O13Provider calls setDependency when ProviderNetworkManagerImpl service registers
2) O13Provider::setDependency calls ProviderNetworkManagerImpl::providerAdd
3) ProviderNetworkManagerImpl::providerAdd calls ovsdbInventoryService.providersReady().
This statement causes the NullPointerException because ovsdbInventoryService is null. OvsdbInventoryService is a dependency for ProviderNetworkManagerImpl but when this exception occurs, it has not yet been added as such.

The effect of this error is that Netvirt will fail to start properly. In this particular case, because ovsdbInventoryService is null, OvsdbInventoryServiceImpl will never get notified that its providers are ready, and hence the following will be true: the ovsdbDataChangeListener will not start, and both the NeutronModelsDataChangeListeners and the NetvirtTopology will not be initialized. In general though, seeing the "Failed to start Netvirt" log does not tell us which component of Netvirt has failed; it only tells us that Netvirt will not behave in an expected way.

Likely, fixing the NullPointerException will allow Netvirt to start up properly and hence the successfully creation of br-int.

Comment by Bertrand Low [ 23/Aug/16 ]

Sam pointed out that blueprint migration may have contributed to this behaviour because blueprint causes the order of ConfigActivator.start() in Netvirt and in Netvirt-Providers to be random (see https://lists.opendaylight.org/pipermail/netvirt-dev/2016-August/001274.html). That is, sometimes, Netvirt-Providers will start up before Netvirt starts up.

This random order is in itself not the cause of the problem, as we have observed successful starts independent of whether Netvirt-Providers starts up before Netvirt or vice versa. However, in the instances where the timing issue described above occurs, Netvirt-Providers has been observed to start up before Netvirt.

We will confirm if adding the OvsdbInventoryService dependency to netvirt-provider.xml works to properly set all dependencies before startup, thus avoiding the NullPointerException.

Comment by Bertrand Low [ 24/Aug/16 ]

Patch submitted.

Inject dependency of OvsdbInventoryService to blueprint for NetvirtProvidersProvider

https://git.opendaylight.org/gerrit/#/c/44562/4

The patch has been tested and the bug has not been observed after applying the patch

Comment by Bertrand Low [ 24/Aug/16 ]

master branch:
https://git.opendaylight.org/gerrit/#/c/44572/1
https://git.opendaylight.org/gerrit/44623

Generated at Wed Feb 07 20:20:38 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.