Details
-
Improvement
-
Status: Open
-
High
-
Resolution: Unresolved
-
Oxygen
-
None
-
None
-
Operating System: All
Platform: All
Description
While working on https://git.opendaylight.org/gerrit/#/c/60303/ and adding the new JobCoordinatorTestModule to the ResourceManagerTest I came across an interesting concurrency issue which this fast running component test evidences, but which if I understand correctly what's going on is a potential real run-time production not just test issue:
So if you pull https://git.opendaylight.org/gerrit/#/c/60303/ (or after its merged, hopefully soon) and un-comment its LogCaptureRule, you'll find that its testAllocateResource(), and sometimes but not always its testReleaseResource(), fail because the LogCaptureRule detects below in the ERROR log.
This is because with c/60303 the ResourceManagerTest doesn't use the stupid only static DataStoreJobCoordinator anymore who's big problem in component tests was that enqueued jobs could "survive" between @Tests, but thanks to the JobCoordinatorTestModule now properly starts and shuts down the JobCoordinator for each @Test, thus finally guaranteeing real test isolation - and finding this issue!
The issue is that the ResourceManager submits jobs and then does not await its completion; it cannot, due to missing dependant INFRAUTILS-16. (That await doesn't have to be blocking; could and should be part of the Future returned by ResourceManager RPC, for callers to decide; a component test would do a blocking await; other code may choose not to.)
So in a fast component test like the ResourceManagerTest, the IdManager operations are still running when the test finishes, leading to below. But even in production, should the ResourceManager not wait for the IdManager to be done creating pools, locking and unlocking etc. ?
[ForkJoinPool-6-worker-3] ERROR org.opendaylight.genius.lockmanager.LockManager - unlock() failed: tablesorg.opendaylight.yang.gen.v1.urn.opendaylight.genius.resourcemanager.rev160622.ResourceTypeTableIds
java.util.concurrent.ExecutionException: Operation was interrupted
at org.opendaylight.yangtools.util.concurrent.MappingCheckedFuture.wrapInExecutionException(MappingCheckedFuture.java:65)
at org.opendaylight.yangtools.util.concurrent.MappingCheckedFuture.get(MappingCheckedFuture.java:74)
at org.opendaylight.genius.lockmanager.LockManager.unlock(LockManager.java:112)
at org.opendaylight.genius.idmanager.IdUtils.unlock(IdUtils.java:237)
at org.opendaylight.genius.idmanager.jobs.UpdateIdEntryJob.call(UpdateIdEntryJob.java:76)
at org.opendaylight.genius.idmanager.jobs.UpdateIdEntryJob.call(UpdateIdEntryJob.java:1)
at org.opendaylight.infrautils.jobcoordinator.internal.JobCoordinatorImpl$MainTask.run(JobCoordinatorImpl.java:279)
at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: ReadFailedException{message=read was interupted., errorList=[RpcError [message=read was interupted., severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=java.lang.InterruptedException]]}
at org.opendaylight.controller.md.sal.common.api.data.ReadFailedException$1.newWithCause(ReadFailedException.java:26)
at org.opendaylight.controller.md.sal.common.api.data.ReadFailedException$1.newWithCause(ReadFailedException.java:1)
at org.opendaylight.yangtools.util.concurrent.ExceptionMapper.apply(ExceptionMapper.java:95)
at org.opendaylight.yangtools.util.concurrent.ExceptionMapper.apply(ExceptionMapper.java:38)
at org.opendaylight.yangtools.util.concurrent.MappingCheckedFuture.mapException(MappingCheckedFuture.java:61)
... 12 more
Caused by: java.lang.InterruptedException
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:458)
at com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:79)
at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:62)
at org.opendaylight.yangtools.util.concurrent.MappingCheckedFuture.get(MappingCheckedFuture.java:71)
... 10 more
[JobCoordinator-JobQueueHandler-0] ERROR org.opendaylight.infrautils.jobcoordinator.internal.JobCoordinatorImpl - Exception while executing the tasks
java.util.concurrent.RejectedExecutionException
at java.util.concurrent.ForkJoinPool.externalSubmit(ForkJoinPool.java:2328)
at java.util.concurrent.ForkJoinPool.externalPush(ForkJoinPool.java:2419)
at java.util.concurrent.ForkJoinPool.execute(ForkJoinPool.java:2648)
at org.opendaylight.infrautils.jobcoordinator.internal.JobCoordinatorImpl$JobQueueHandler.run(JobCoordinatorImpl.java:329)
at java.lang.Thread.run(Thread.java:748)
[main] INFO org.opendaylight.infrautils.jobcoordinator.internal.JobCoordinatorImpl - JobCoordinator now closed for business.
[ForkJoinPool-6-worker-3] ERROR org.opendaylight.infrautils.jobcoordinator.internal.JobCoordinatorImpl - Exception when executing jobEntry: JobEntry{key='tables', mainWorker=org.opendaylight.genius.idmanager.jobs.UpdateIdEntryJob@65d9dd07, rollbackWorker=null, retryCount=6, futures=null}
TransactionCommitFailedException{message=commit was interupted., errorList=[RpcError [message=commit was interupted., severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=java.lang.InterruptedException]]}
at org.opendaylight.controller.md.sal.dom.broker.impl.TransactionCommitFailedExceptionMapper.newWithCause(TransactionCommitFailedExceptionMapper.java:37)
at org.opendaylight.controller.md.sal.dom.broker.impl.TransactionCommitFailedExceptionMapper.newWithCause(TransactionCommitFailedExceptionMapper.java:1)
at org.opendaylight.yangtools.util.concurrent.ExceptionMapper.apply(ExceptionMapper.java:95)
at org.opendaylight.yangtools.util.concurrent.ExceptionMapper.apply(ExceptionMapper.java:38)
at org.opendaylight.yangtools.util.concurrent.MappingCheckedFuture.mapException(MappingCheckedFuture.java:61)
at org.opendaylight.yangtools.util.concurrent.MappingCheckedFuture.wrapInExecutionException(MappingCheckedFuture.java:65)
at org.opendaylight.yangtools.util.concurrent.MappingCheckedFuture.get(MappingCheckedFuture.java:74)
at com.google.common.util.concurrent.AbstractCheckedFuture.checkedGet(AbstractCheckedFuture.java:85)
at org.opendaylight.genius.idmanager.jobs.UpdateIdEntryJob.call(UpdateIdEntryJob.java:67)
at org.opendaylight.genius.idmanager.jobs.UpdateIdEntryJob.call(UpdateIdEntryJob.java:1)
at org.opendaylight.infrautils.jobcoordinator.internal.JobCoordinatorImpl$MainTask.run(JobCoordinatorImpl.java:279)
at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: java.lang.InterruptedException
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:476)
at com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:79)
at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:62)
at org.opendaylight.yangtools.util.concurrent.MappingCheckedFuture.get(MappingCheckedFuture.java:71)
... 9 more
Attachments
Issue Links
- is blocked by
-
INFRAUTILS-16 JobCoordinator enqueueJob must return a ListenableFuture instead of void
-
- Resolved
-