[GENIUS-15] DataStoreJobCoordinator: java.lang.reflect.UndeclaredThrowableException Created: 29/Aug/16  Updated: 06/Apr/17  Resolved: 06/Apr/17

Status: Resolved
Project: genius
Component/s: General
Affects Version/s: (unspecified)
Fix Version/s: None

Type: Bug
Reporter: Sam Hague Assignee: Michael Vorburger
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Attachments: File karaf.tar.gz    
External issue ID: 6564

 Description   

2016-08-29 08:14:56,152 | ERROR | nPool-1-worker-2 | DataStoreJobCoordinator | 313 - org.opendaylight.genius.mdsalutil-api - 0.2.0.SNAPSHOT | Exception when executing jobEntry: JobEntry{key='org.opendaylight.controller.config.yang.config.legacy_entity_ownership_service_provider.LegacyEntityOwnershipServiceProviderModule$1@50468e80', mainWorker=org.opendaylight.genius.utils.clustering.ClusteringUtils$CheckEntityOwnerTask@19edf62a, rollbackWorker=null, retryCount=0, futures=null}, exception: [org.opendaylight.controller.config.yang.config.legacy_entity_ownership_service_provider.$Proxy52.getOwnershipState(Unknown Source), org.opendaylight.genius.utils.clustering.ClusteringUtils$CheckEntityOwnerTask.call(ClusteringUtils.java:83), org.opendaylight.genius.utils.clustering.ClusteringUtils$CheckEntityOwnerTask.call(ClusteringUtils.java:63), org.opendaylight.genius.datastoreutils.DataStoreJobCoordinator$MainTask.run(DataStoreJobCoordinator.java:248), java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1423), java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289), java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:902), java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1689), java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1644), java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)]
java.lang.reflect.UndeclaredThrowableException
at org.opendaylight.controller.config.yang.config.legacy_entity_ownership_service_provider.$Proxy52.getOwnershipState(Unknown Source)
at org.opendaylight.genius.utils.clustering.ClusteringUtils$CheckEntityOwnerTask.call(ClusteringUtils.java:83)
at org.opendaylight.genius.utils.clustering.ClusteringUtils$CheckEntityOwnerTask.call(ClusteringUtils.java:63)
at org.opendaylight.genius.datastoreutils.DataStoreJobCoordinator$MainTask.run(DataStoreJobCoordinator.java:248)
at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1423)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:902)
at java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1689)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1644)
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.GeneratedMethodAccessor118.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.opendaylight.controller.config.yang.config.legacy_entity_ownership_service_provider.LegacyEntityOwnershipServiceProviderModule$1.handleInvocation(LegacyEntityOwnershipServiceProviderModule.java:53)
at com.google.common.reflect.AbstractInvocationHandler.invoke(AbstractInvocationHandler.java:87)
... 10 more
Caused by: org.osgi.service.blueprint.container.ServiceUnavailableException: The Blueprint container is being or has been destroyed: (objectClass=org.opendaylight.mdsal.eos.dom.api.DOMEntityOwnershipService)
at org.apache.aries.blueprint.container.ReferenceRecipe.getService(ReferenceRecipe.java:241)
at org.apache.aries.blueprint.container.ReferenceRecipe.access$000(ReferenceRecipe.java:56)
at org.apache.aries.blueprint.container.ReferenceRecipe$ServiceDispatcher.call(ReferenceRecipe.java:306)
at Proxy590a948f_3def_402c_973a_9ad6f6115b8e.getOwnershipState(Unknown Source)
at org.opendaylight.controller.md.sal.dom.clustering.impl.LegacyEntityOwnershipServiceAdapter.getOwnershipState(LegacyEntityOwnershipServiceAdapter.java:60)
at Proxy7b737b0a_4c17_4053_bb8d_a311809db80d.getOwnershipState(Unknown Source)
... 15 more



 Comments   
Comment by Sam Hague [ 29/Aug/16 ]

Attachment karaf.tar.gz has been added with description: karaf.log

Comment by Faseela K [ 29/Aug/16 ]

Please specify the steps to reproduce

Comment by Sam Hague [ 30/Aug/16 ]

This was captured when using the NetvirtIT. The test simply creates a neutron network and two VMs on the network, then issues a ping from one VM to the other. To use the IT you need docker and docker-compose installed on the host. Build the netvirt code and then run the commands below. ovsdb.controller.address is the address shown for the docker0 interface when you run ip addr on the host.

cd netvirt/vpnservice
mvn -nsu -f it/impl/pom.xml verify -Pintegrationtest -Dovsdb.controller.address=172.17.0.1 -Dit.test=NetvirtIT#testNeutronNet

Comment by Michael Vorburger [ 16/Mar/17 ]

Done a quick first analysis of this... it's some (interesting!) mix up of checked and unchecked exceptions related to not 1 but 2 instances of java.lang.reflect.Proxy, one by Aries BP and one from org.opendaylight.controller.config.yang.config.legacy_entity_ownership_service_provider.LegacyEntityOwnershipServiceProviderModule.createInstance() ...

... not sure yet what the right fix is. Also the LegacyEntityOwnershipServiceProviderModule is from some pre-BP pure CSS times, should ideally be removed, but let's try to fix it as is first. The actual bug and thus fix may be in Controller, not Genius (I'll move the Bugzilla Product if I'm sure).

> Caused by: org.osgi.service.blueprint.container.ServiceUnavailableException:
> The Blueprint container is being or has been destroyed:

Sam, have you seen this one "in the wild" (during real operations) or only during NetvirtIT ? I think this MAY (I'm not sure) be one of our (too many... yes) problems happening (only) during shutdown / feature un-install? It certainly would be good to fix it either way, but the priority would be a little different.
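
For illustration only, here is a minimal sketch of how a java.lang.reflect.Proxy can turn an unchecked failure into an UndeclaredThrowableException. It is not the ODL code: the OwnershipService interface, the failing target and both handlers are hypothetical names made up for this example. The assumption is that the handler calls Method.invoke and lets the checked InvocationTargetException escape from a method that declares no checked exceptions, which would be consistent with the "Caused by" chain in the Description.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

public class ProxyUnwrapDemo {

    /** Hypothetical service interface; the method declares no checked exceptions. */
    public interface OwnershipService {
        String getOwnershipState(String entity);
    }

    /** Hypothetical target that fails with an unchecked exception. */
    static final OwnershipService FAILING_TARGET = entity -> {
        throw new IllegalStateException("service unavailable: " + entity);
    };

    /** Handler that lets the checked InvocationTargetException escape from Method.invoke. */
    static OwnershipService naiveProxy(OwnershipService target) {
        InvocationHandler handler = (proxy, method, args) -> method.invoke(target, args);
        return newProxy(handler);
    }

    /** Handler that unwraps InvocationTargetException and rethrows the original cause. */
    static OwnershipService unwrappingProxy(OwnershipService target) {
        InvocationHandler handler = (proxy, method, args) -> {
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause();
            }
        };
        return newProxy(handler);
    }

    private static OwnershipService newProxy(InvocationHandler handler) {
        return (OwnershipService) Proxy.newProxyInstance(
                OwnershipService.class.getClassLoader(),
                new Class<?>[] {OwnershipService.class},
                handler);
    }

    /** Returns the simple class name of whatever the call throws, or "none". */
    static String thrownBy(OwnershipService service) {
        try {
            service.getOwnershipState("entity-1");
            return "none";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println("naive:      " + thrownBy(naiveProxy(FAILING_TARGET)));
        System.out.println("unwrapping: " + thrownBy(unwrappingProxy(FAILING_TARGET)));
    }
}
```

With the naive handler the caller only sees the wrapper exception; with the unwrapping handler the original unchecked exception reaches the caller directly, which makes the real cause easier to spot in a log.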

Comment by Michael Vorburger [ 16/Mar/17 ]

> one of our (too many... yes) problems happening (only)
> during shutdown / feature un-install

actually, even if I "fix" this (the Exception above), then another one will just appear. The real issue here probably is that a background thread in DataStoreJobCoordinator is still trying to access EntityOwnershipService when that bundle's "Blueprint container is being or has been destroyed" - Karaf is already shutting down.

The real fix here then is EITHER to stop DataStoreJobCoordinator correctly, which I've proposed in https://git.opendaylight.org/gerrit/#/c/52976/ (that's the right generic solution IMHO if whatever that background job does actually isn't really important), OR maybe for NetvirtIT to wait for the DataStoreJobCoordinator to be done (that's the right local fix if the NetvirtIT does care about the outcome of that background job - but then it really should be extended with an assert of something that happens in that job?).
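
As a rough sketch of what "stop the coordinator correctly" could look like in general (this is not the content of the Gerrit change above, and JobCoordinator, enqueue and stop are hypothetical names): the coordinator owns its executor, and on shutdown it stops accepting jobs, gives queued jobs a bounded time to finish, and only then lets the container go away. A fixed thread pool is assumed here for simplicity; the stack trace shows the real coordinator running on a ForkJoinPool.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class StoppableCoordinatorDemo {

    /** Hypothetical stand-in for a job coordinator; not the genius class. */
    static final class JobCoordinator {
        private final ExecutorService pool = Executors.newFixedThreadPool(2);

        void enqueue(Runnable job) {
            pool.execute(job);
        }

        /** Stops accepting jobs, waits briefly for queued ones, then cancels the rest. */
        void stop() {
            pool.shutdown();
            try {
                if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
                    pool.shutdownNow();
                }
            } catch (InterruptedException e) {
                pool.shutdownNow();
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Enqueues the given number of trivial jobs, stops, and returns how many ran. */
    static int runAndStop(int jobs) {
        AtomicInteger completed = new AtomicInteger();
        JobCoordinator coordinator = new JobCoordinator();
        for (int i = 0; i < jobs; i++) {
            coordinator.enqueue(completed::incrementAndGet);
        }
        coordinator.stop();
        return completed.get();
    }

    /** Returns true if a job enqueued after stop() is rejected instead of run. */
    static boolean rejectsAfterStop() {
        JobCoordinator coordinator = new JobCoordinator();
        coordinator.stop();
        try {
            coordinator.enqueue(() -> { });
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("jobs completed before stop returned: " + runAndStop(10));
        System.out.println("rejects jobs after stop: " + rejectsAfterStop());
    }
}
```

The point of the design is ordering: once stop() has returned, no background thread of the coordinator can still call into a service whose container is being destroyed.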

Comment by Michael Vorburger [ 16/Mar/17 ]

https://git.opendaylight.org/gerrit/#/c/53412/

Comment by Faseela K [ 17/Mar/17 ]

will this resolve the problem?

https://git.opendaylight.org/gerrit/#/c/53412/

Comment by Michael Vorburger [ 20/Mar/17 ]

> will c/53412 resolve the problem?

with this now merged, I expect another exception showing the actual root cause will probably appear now for this (not tested; perhaps Sam or Faseela you could copy/paste the new exception if you see it happen?).

The real issue here is that a background thread in DataStoreJobCoordinator is still trying to access EntityOwnershipService when that bundle's "Blueprint container is being or has been destroyed" - Karaf is already shutting down.

The real fix here then is EITHER to stop DataStoreJobCoordinator correctly, which I've proposed in https://git.opendaylight.org/gerrit/#/c/52976/ (that's the right generic solution IMHO if whatever that background job does actually isn't really important), OR maybe for NetvirtIT to wait for the DataStoreJobCoordinator to be done (that's the right local fix if the NetvirtIT does care about the outcome of that background job - but then it really should be extended with an assert of something that happens in that job?).

Comment by Michael Vorburger [ 06/Apr/17 ]

https://git.opendaylight.org/gerrit/#/c/52976/ is now also merged, so the error above won't occur anymore like that. Another exception may now appear at the end of the NetvirtIT (not tested), but it will look very different, so let's track that in a new bug to be opened when someone hits it.

PS: Maybe NetvirtIT would have to wait for the DataStoreJobCoordinator to be done (that's the right local fix if the NetvirtIT does care about the outcome of that background job - but then it really should be extended with an assert of something that happens in that job?).
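
A sketch of that local option, under the assumption that the test can get hold of something to wait on: the test blocks (with a timeout) until the background job is done and then asserts on an effect of that job. Everything here is hypothetical (ownerState, startBackgroundJob, the "IS_OWNER" value); it is not NetvirtIT code and not an API of DataStoreJobCoordinator.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicReference;

public class AwaitJobInTestDemo {

    /** Hypothetical piece of state that the background job is expected to update. */
    static final AtomicReference<String> ownerState = new AtomicReference<>("UNKNOWN");

    /** Hypothetical background job; returns a future the test can wait on. */
    static CompletableFuture<Void> startBackgroundJob() {
        return CompletableFuture.runAsync(() -> ownerState.set("IS_OWNER"));
    }

    /** Waits for the job with a timeout, then returns the state it should have written. */
    static String waitAndRead() {
        CompletableFuture<Void> job = startBackgroundJob();
        try {
            job.get(10, TimeUnit.SECONDS);
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            throw new AssertionError("background job did not finish cleanly", e);
        }
        return ownerState.get();
    }

    public static void main(String[] args) {
        String state = waitAndRead();
        // The assertion on the job's effect is what makes the wait meaningful in a test.
        if (!"IS_OWNER".equals(state)) {
            throw new AssertionError("unexpected state: " + state);
        }
        System.out.println("state after waiting: " + state);
    }
}
```

Waiting without asserting would only hide the shutdown race; asserting on the job's outcome is what would justify the test caring about the job at all.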
