[GENIUS-91] IdManager operations such as createIdPool and allocateIdRange should return a Future letting clients await finishing of the internal DataStoreJobCoordinator.enqueueJob() ? Created: 22/Sep/17  Updated: 08/Jan/20

Status: Open
Project: genius
Component/s: General
Affects Version/s: Oxygen
Fix Version/s: None

Type: Improvement Priority: High
Reporter: Michael Vorburger Assignee: Ankit Jain
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Issue Links:
Blocks
is blocked by INFRAUTILS-16 JobCoordinator enqueueJob must return... Resolved

 Description   

While working on https://git.opendaylight.org/gerrit/#/c/60303/ and adding the new JobCoordinatorTestModule to the ResourceManagerTest I came across an interesting concurrency issue which this fast running component test evidences, but which if I understand correctly what's going on is a potential real run-time production not just test issue:

So if you pull https://git.opendaylight.org/gerrit/#/c/60303/ (or after its merged, hopefully soon) and un-comment its LogCaptureRule, you'll find that its testAllocateResource(), and sometimes but not always its testReleaseResource(), fail because the LogCaptureRule detects below in the ERROR log.

This is because with c/60303 the ResourceManagerTest doesn't use the stupid only static DataStoreJobCoordinator anymore who's big problem in component tests was that enqueued jobs could "survive" between @Tests, but thanks to the JobCoordinatorTestModule now properly starts and shuts down the JobCoordinator for each @Test, thus finally guaranteeing real test isolation - and finding this issue!

The issue is that the ResourceManager submits jobs and then does not await its completion; it cannot, due to missing dependant INFRAUTILS-16. (That await doesn't have to be blocking; could and should be part of the Future returned by ResourceManager RPC, for callers to decide; a component test would do a blocking await; other code may choose not to.)

So in a fast component test like the ResourceManagerTest, the IdManager operations are still running when the test finishes, leading to below. But even in production, should the ResourceManager not wait for the IdManager to be done creating pools, locking and unlocking etc. ?

[ForkJoinPool-6-worker-3] ERROR org.opendaylight.genius.lockmanager.LockManager - unlock() failed: tablesorg.opendaylight.yang.gen.v1.urn.opendaylight.genius.resourcemanager.rev160622.ResourceTypeTableIds
java.util.concurrent.ExecutionException: Operation was interrupted
	at org.opendaylight.yangtools.util.concurrent.MappingCheckedFuture.wrapInExecutionException(MappingCheckedFuture.java:65)
	at org.opendaylight.yangtools.util.concurrent.MappingCheckedFuture.get(MappingCheckedFuture.java:74)
	at org.opendaylight.genius.lockmanager.LockManager.unlock(LockManager.java:112)
	at org.opendaylight.genius.idmanager.IdUtils.unlock(IdUtils.java:237)
	at org.opendaylight.genius.idmanager.jobs.UpdateIdEntryJob.call(UpdateIdEntryJob.java:76)
	at org.opendaylight.genius.idmanager.jobs.UpdateIdEntryJob.call(UpdateIdEntryJob.java:1)
	at org.opendaylight.infrautils.jobcoordinator.internal.JobCoordinatorImpl$MainTask.run(JobCoordinatorImpl.java:279)
	at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: ReadFailedException{message=read was interupted., errorList=[RpcError [message=read was interupted., severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=java.lang.InterruptedException]]}
	at org.opendaylight.controller.md.sal.common.api.data.ReadFailedException$1.newWithCause(ReadFailedException.java:26)
	at org.opendaylight.controller.md.sal.common.api.data.ReadFailedException$1.newWithCause(ReadFailedException.java:1)
	at org.opendaylight.yangtools.util.concurrent.ExceptionMapper.apply(ExceptionMapper.java:95)
	at org.opendaylight.yangtools.util.concurrent.ExceptionMapper.apply(ExceptionMapper.java:38)
	at org.opendaylight.yangtools.util.concurrent.MappingCheckedFuture.mapException(MappingCheckedFuture.java:61)
	... 12 more
Caused by: java.lang.InterruptedException
	at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:458)
	at com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:79)
	at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:62)
	at org.opendaylight.yangtools.util.concurrent.MappingCheckedFuture.get(MappingCheckedFuture.java:71)
	... 10 more
[JobCoordinator-JobQueueHandler-0] ERROR org.opendaylight.infrautils.jobcoordinator.internal.JobCoordinatorImpl - Exception while executing the tasks
java.util.concurrent.RejectedExecutionException
	at java.util.concurrent.ForkJoinPool.externalSubmit(ForkJoinPool.java:2328)
	at java.util.concurrent.ForkJoinPool.externalPush(ForkJoinPool.java:2419)
	at java.util.concurrent.ForkJoinPool.execute(ForkJoinPool.java:2648)
	at org.opendaylight.infrautils.jobcoordinator.internal.JobCoordinatorImpl$JobQueueHandler.run(JobCoordinatorImpl.java:329)
	at java.lang.Thread.run(Thread.java:748)
[main] INFO org.opendaylight.infrautils.jobcoordinator.internal.JobCoordinatorImpl - JobCoordinator now closed for business.
[ForkJoinPool-6-worker-3] ERROR org.opendaylight.infrautils.jobcoordinator.internal.JobCoordinatorImpl - Exception when executing jobEntry: JobEntry{key='tables', mainWorker=org.opendaylight.genius.idmanager.jobs.UpdateIdEntryJob@65d9dd07, rollbackWorker=null, retryCount=6, futures=null}
TransactionCommitFailedException{message=commit was interupted., errorList=[RpcError [message=commit was interupted., severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=java.lang.InterruptedException]]}
	at org.opendaylight.controller.md.sal.dom.broker.impl.TransactionCommitFailedExceptionMapper.newWithCause(TransactionCommitFailedExceptionMapper.java:37)
	at org.opendaylight.controller.md.sal.dom.broker.impl.TransactionCommitFailedExceptionMapper.newWithCause(TransactionCommitFailedExceptionMapper.java:1)
	at org.opendaylight.yangtools.util.concurrent.ExceptionMapper.apply(ExceptionMapper.java:95)
	at org.opendaylight.yangtools.util.concurrent.ExceptionMapper.apply(ExceptionMapper.java:38)
	at org.opendaylight.yangtools.util.concurrent.MappingCheckedFuture.mapException(MappingCheckedFuture.java:61)
	at org.opendaylight.yangtools.util.concurrent.MappingCheckedFuture.wrapInExecutionException(MappingCheckedFuture.java:65)
	at org.opendaylight.yangtools.util.concurrent.MappingCheckedFuture.get(MappingCheckedFuture.java:74)
	at com.google.common.util.concurrent.AbstractCheckedFuture.checkedGet(AbstractCheckedFuture.java:85)
	at org.opendaylight.genius.idmanager.jobs.UpdateIdEntryJob.call(UpdateIdEntryJob.java:67)
	at org.opendaylight.genius.idmanager.jobs.UpdateIdEntryJob.call(UpdateIdEntryJob.java:1)
	at org.opendaylight.infrautils.jobcoordinator.internal.JobCoordinatorImpl$MainTask.run(JobCoordinatorImpl.java:279)
	at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: java.lang.InterruptedException
	at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:476)
	at com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:79)
	at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:62)
	at org.opendaylight.yangtools.util.concurrent.MappingCheckedFuture.get(MappingCheckedFuture.java:71)
	... 9 more

Generated at Wed Feb 07 19:59:53 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.