-
Bug
-
Resolution: Cannot Reproduce
-
Medium
This technically isn't an "odlparent bug", but I don't know which other JIRA project to put it into. I believe it is useful to have an issue in which to discuss this problem, note findings, link possible future external issues, and perhaps, one fine day in the far future, track a Pax Exam upgrade.
We hit this kind of problem relatively regularly, either from Pax Exam ITs (rarely, but only because we have very few real ITs) or from org.opendaylight.odlparent.featuretest.SingleFeatureTest (AKA the SFT; it hits this reasonably frequently, but only because we have lots of them, run implicitly and automatically for each odl-* feature in all ODL sub-projects):
java.rmi.NotBoundException: 8c248bd1-bb85-4fbe-a552-f46e7c70ee25
    at sun.rmi.registry.RegistryImpl.lookup(RegistryImpl.java:227)
    at sun.rmi.registry.RegistryImpl_Skel.dispatch(RegistryImpl_Skel.java:115)
    at sun.rmi.server.UnicastServerRef.oldDispatch(UnicastServerRef.java:472)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:299)
    at sun.rmi.transport.Transport$1.run(Transport.java:200)
    at sun.rmi.transport.Transport$1.run(Transport.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
    at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:283)
    at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:260)
    at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:375)
    at sun.rmi.registry.RegistryImpl_Stub.lookup(RegistryImpl_Stub.java:119)
    at org.ops4j.pax.exam.rbc.client.intern.RemoteBundleContextClientImpl.getRemoteBundleContext(RemoteBundleContextClientImpl.java:248)
    at org.ops4j.pax.exam.rbc.client.intern.RemoteBundleContextClientImpl.waitForState(RemoteBundleContextClientImpl.java:218)
    at org.ops4j.pax.exam.karaf.container.internal.KarafTestContainer.waitForState(KarafTestContainer.java:646)
    at org.ops4j.pax.exam.karaf.container.internal.KarafTestContainer.startKaraf(KarafTestContainer.java:253)
    at org.ops4j.pax.exam.karaf.container.internal.KarafTestContainer.start(KarafTestContainer.java:187)
    at org.ops4j.pax.exam.spi.reactors.AllConfinedStagedReactor.invoke(AllConfinedStagedReactor.java:79)
    at org.ops4j.pax.exam.junit.impl.ProbeRunner$2.evaluate(ProbeRunner.java:267)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at org.ops4j.pax.exam.junit.impl.ProbeRunner.run(ProbeRunner.java:98)
    at org.ops4j.pax.exam.junit.PaxExam.run(PaxExam.java:93)
    at org.opendaylight.odlparent.featuretest.PerFeatureRunner.run(PerFeatureRunner.java:72)
    at org.opendaylight.odlparent.featuretest.PerRepoTestRunner.runChild(PerRepoTestRunner.java:153)
    at org.opendaylight.odlparent.featuretest.PerRepoTestRunner.runChild(PerRepoTestRunner.java:28)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
    at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
We suspect that these are "just" transient problems: on slower Jenkins agent build VMs, Karaf sometimes takes longer to come up than Pax Exam (e.g. in SFT) is willing to wait. At least I (vorburger) have never hit this locally.
One solution for this may be to further increase some timeout, e.g. in SFT.
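To make the timeout idea concrete, here is a minimal sketch of a lookup that keeps retrying on NotBoundException until a deadline passes. It is not Pax Exam or SFT code: the class LookupRetry, the methods retryUntilBound and simulate, and the returned string are all hypothetical names made up for illustration. Only java.rmi.NotBoundException is the real exception from the trace above.

```java
import java.rmi.NotBoundException;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; this is NOT how Pax Exam implements its wait.
final class LookupRetry {
    private LookupRetry() {
    }

    /**
     * Repeats the lookup until it succeeds or timeoutMillis has elapsed.
     * A NotBoundException is treated as "container still booting, try again".
     */
    static <T> T retryUntilBound(Callable<T> lookup, long timeoutMillis, long pauseMillis)
            throws Exception {
        final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            try {
                return lookup.call();
            } catch (NotBoundException e) {
                if (System.nanoTime() - deadline >= 0) {
                    // Out of time: surface the same exception the test run reports.
                    throw e;
                }
                Thread.sleep(pauseMillis);
            }
        }
    }

    /**
     * Simulates a registry whose name only becomes bound on the given attempt.
     * Returns the number of attempts made, or -1 if the wait gave up.
     */
    static int simulate(int boundOnAttempt, long timeoutMillis) {
        AtomicInteger attempts = new AtomicInteger();
        try {
            retryUntilBound(() -> {
                if (attempts.incrementAndGet() < boundOnAttempt) {
                    throw new NotBoundException("simulated: name not bound yet");
                }
                return "remote-bundle-context";
            }, timeoutMillis, 10);
            return attempts.get();
        } catch (Exception e) {
            return -1;
        }
    }
}
```

With a generous timeout, simulate(3, 2000) succeeds on the third attempt; with a timeout shorter than the simulated boot, it gives up. A larger timeout only makes the failure less likely on slow VMs; it does not remove the underlying guess about how long Karaf takes to boot.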
On https://lists.opendaylight.org/pipermail/release/2018-February/014115.html rovarga opines that this is something that could possibly be fixed in Pax Exam itself:
Yes, I do believe this is an issue with pax-exam integration. The fact
that the container is still booting should be known within the framework
(it is driven via pax-exam-container-karaf) and hence an attempt to
connect should not be made until the container is brought up.
Slow build VMs are actually very good at uncovering such racey
assumptions (i.e. it will boot in 2 minutes for sure).
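As a rough illustration of the alternative described in the quote, the sketch below has the component that boots the container signal readiness explicitly, so the client only connects after that signal instead of guessing a boot time. This is an assumption about what such a fix could look like, not Pax Exam's actual design; ReadinessGate, markReady, markFailed and awaitReady are hypothetical names.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; not part of Pax Exam or odlparent.
final class ReadinessGate {
    private final CompletableFuture<String> ready = new CompletableFuture<>();

    /** Called by whatever boots the container, once the registry name is bound. */
    void markReady(String registryName) {
        ready.complete(registryName);
    }

    /** Called by whatever boots the container if the boot fails outright. */
    void markFailed(Throwable cause) {
        ready.completeExceptionally(cause);
    }

    /**
     * Waits for the readiness signal. Returns the name to look up, or an empty
     * Optional if the signal did not arrive in time. The timeout here is a
     * safety net against a hung boot, not an estimate of the boot duration.
     */
    Optional<String> awaitReady(long timeout, TimeUnit unit) {
        try {
            return Optional.of(ready.get(timeout, unit));
        } catch (TimeoutException e) {
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Optional.empty();
        } catch (ExecutionException e) {
            throw new IllegalStateException("container failed to boot", e.getCause());
        }
    }
}
```

The client would call awaitReady before its first registry lookup, so a slow VM delays the test rather than failing it with NotBoundException.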