ODLPARENT-143

Pax Exam NotBoundException failures

Details

    • Bug
    • Status: Resolved
    • Medium
    • Resolution: Cannot Reproduce
    • None
    • None
    • SFT
    • None

    Description

      This technically isn't an "odlparent bug", but I don't know which other JIRA project it belongs in. I believe it is useful to have an issue where we can discuss this problem, note findings, link possible future external issues, and perhaps, one fine day in a far, far away future, upgrade Pax Exam under it.

      We hit this kind of problem relatively regularly, both from Pax Exam ITs (rarely, but only because we have very few real ITs) and from org.opendaylight.odlparent.featuretest.SingleFeatureTest (AKA the SFT; it hits the problem reasonably frequently, but only because we have lots of SFTs that run implicitly and automatically for each odl-* feature in all ODL sub-projects):

      java.rmi.NotBoundException: 8c248bd1-bb85-4fbe-a552-f46e7c70ee25
      	at sun.rmi.registry.RegistryImpl.lookup(RegistryImpl.java:227)
      	at sun.rmi.registry.RegistryImpl_Skel.dispatch(RegistryImpl_Skel.java:115)
      	at sun.rmi.server.UnicastServerRef.oldDispatch(UnicastServerRef.java:472)
      	at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:299)
      	at sun.rmi.transport.Transport$1.run(Transport.java:200)
      	at sun.rmi.transport.Transport$1.run(Transport.java:197)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
      	at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
      	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
      	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      	at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:283)
      	at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:260)
      	at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:375)
      	at sun.rmi.registry.RegistryImpl_Stub.lookup(RegistryImpl_Stub.java:119)
      	at org.ops4j.pax.exam.rbc.client.intern.RemoteBundleContextClientImpl.getRemoteBundleContext(RemoteBundleContextClientImpl.java:248)
      	at org.ops4j.pax.exam.rbc.client.intern.RemoteBundleContextClientImpl.waitForState(RemoteBundleContextClientImpl.java:218)
      	at org.ops4j.pax.exam.karaf.container.internal.KarafTestContainer.waitForState(KarafTestContainer.java:646)
      	at org.ops4j.pax.exam.karaf.container.internal.KarafTestContainer.startKaraf(KarafTestContainer.java:253)
      	at org.ops4j.pax.exam.karaf.container.internal.KarafTestContainer.start(KarafTestContainer.java:187)
      	at org.ops4j.pax.exam.spi.reactors.AllConfinedStagedReactor.invoke(AllConfinedStagedReactor.java:79)
      	at org.ops4j.pax.exam.junit.impl.ProbeRunner$2.evaluate(ProbeRunner.java:267)
      	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
      	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
      	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
      	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
      	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
      	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
      	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
      	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
      	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
      	at org.ops4j.pax.exam.junit.impl.ProbeRunner.run(ProbeRunner.java:98)
      	at org.ops4j.pax.exam.junit.PaxExam.run(PaxExam.java:93)
      	at org.opendaylight.odlparent.featuretest.PerFeatureRunner.run(PerFeatureRunner.java:72)
      	at org.opendaylight.odlparent.featuretest.PerRepoTestRunner.runChild(PerRepoTestRunner.java:153)
      	at org.opendaylight.odlparent.featuretest.PerRepoTestRunner.runChild(PerRepoTestRunner.java:28)
      	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
      	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
      	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
      	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
      	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
      	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
      	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
      	at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
      	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
      	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
      	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
      	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
      	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)

      We suspect that these are "just" transient problems: on slower Jenkins build agent VMs, Karaf sometimes takes longer to come up than Pax Exam (e.g. in SFT) is willing to wait. At least I (vorburger) have never hit this locally.

      One solution for this may be to further increase some timeout, e.g. in SFT.
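
      As a rough illustration of what a longer timeout amounts to, the sketch below retries a lookup for as long as it fails with java.rmi.NotBoundException and gives up only once a deadline has passed. This is not Pax Exam's or SFT's actual code; the class BoundedRetry, the method lookupWithTimeout and its parameters are hypothetical names made up for this example.

```java
import java.rmi.NotBoundException;
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper for illustration only; not part of Pax Exam or odlparent. */
final class BoundedRetry {
    private BoundedRetry() {
    }

    /**
     * Calls {@code lookup} repeatedly until it stops throwing NotBoundException
     * or until {@code timeoutMillis} has elapsed. At least one attempt is always made.
     */
    static <T> T lookupWithTimeout(Callable<T> lookup, long timeoutMillis, long pollMillis)
            throws Exception {
        final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        NotBoundException last;
        do {
            try {
                return lookup.call();
            } catch (NotBoundException e) {
                // The name is not registered yet: the remote side may still be booting.
                last = e;
            }
            Thread.sleep(pollMillis);
        } while (deadline - System.nanoTime() > 0);

        // Keep the last NotBoundException as the cause so the failure stays diagnosable.
        TimeoutException timeout =
                new TimeoutException("name still not bound after " + timeoutMillis + " ms");
        timeout.initCause(last);
        throw timeout;
    }
}
```

      With a helper like this, raising the timeout only changes how long a slow boot is tolerated; a container that never registers still fails, just later.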

      On https://lists.opendaylight.org/pipermail/release/2018-February/014115.html rovarga opines that this is something that could possibly be fixed in Pax Exam itself:

      Yes, I do believe this is an issue with pax-exam integration. The fact
      that the container is still booting should be known within the framework
      (it is driven via pax-exam-container-karaf) and hence an attempt to
      connect should not be made until the container is brought up.

      Slow build VMs are actually very good at uncovering such racey
      assumptions (i.e. it will boot in 2 minutes for sure).
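
      The alternative rovarga describes, where no connection attempt is made until the container is known to be up, could look roughly like the sketch below. It only illustrates the idea with a plain CountDownLatch; ReadinessGate, markStarted and connectWhenStarted are hypothetical names, and how pax-exam-container-karaf would actually learn that boot has finished is not covered here.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical sketch: connect only after the container has signalled that it is up. */
final class ReadinessGate {
    private final CountDownLatch started = new CountDownLatch(1);

    /** To be called by whatever drives the container, once boot has completed. */
    void markStarted() {
        started.countDown();
    }

    /** Waits for markStarted() before running {@code connect}; never connects early. */
    <T> T connectWhenStarted(Supplier<T> connect, long timeout, TimeUnit unit)
            throws InterruptedException {
        if (!started.await(timeout, unit)) {
            throw new IllegalStateException(
                    "container did not report startup within " + timeout + " " + unit);
        }
        return connect.get();
    }
}
```

      The timeout here bounds only how long to wait for a boot that never finishes, instead of encoding a guess about how long a healthy boot takes.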

          People

            Assignee: Unassigned
            Reporter: Michael Vorburger (vorburger)