[ODLPARENT-123] After hard reset, ODL fails to connect to config subsystem netconf server Created: 26/Sep/17  Updated: 22/Aug/19  Resolved: 22/Aug/19

Status: Resolved
Project: odlparent
Component/s: General
Affects Version/s: 3.0.2
Fix Version/s: 4.0.0

Type: Bug
Reporter: Vratko Polak Assignee: Unassigned
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 9213

 Description   

This is seen in Carbon 3node all job. Most of the tests connect to the netconf testtool (as opposed to the config subsystem), so this affects only the Netconfready suite, which keeps failing even after 5 minutes of trying.

This is similar to ODLPARENT-113 in that it leads to CSIT failures only after performing a hard reset. A hard reset means killing ODL, deleting data/, snapshots/, journal/ and several other locations that might persist data, and then starting ODL again.

In the Netconfready suite, ODL claims the mount point does not exist [0]. The first copy of the Netconfready suite (before any resets) does see [1] the mounted data.

Looking at karaf.log [2] (and ignoring NETCONF-450 symptom) we see repeated NoClassDefFoundError:
2017-09-25 13:39:17,091 | WARN | o-group-thread-2 | ServerSession | 30 - org.apache.sshd.core - 0.14.0 | Exception caught
java.lang.NoClassDefFoundError: org/bouncycastle/openssl/PEMParser
at org.apache.sshd.server.keyprovider.PEMGeneratorHostKeyProvider.doReadKeyPair(PEMGeneratorHostKeyProvider.java:58)[30:org.apache.sshd.core:0.14.0]
at org.apache.sshd.server.keyprovider.AbstractGeneratorHostKeyProvider.readKeyPair(AbstractGeneratorHostKeyProvider.java:137)[30:org.apache.sshd.core:0.14.0]
at org.apache.sshd.server.keyprovider.AbstractGeneratorHostKeyProvider.loadKeys(AbstractGeneratorHostKeyProvider.java:117)[30:org.apache.sshd.core:0.14.0]
at org.apache.sshd.common.keyprovider.AbstractKeyPairProvider.getKeyTypes(AbstractKeyPairProvider.java:53)[30:org.apache.sshd.core:0.14.0]
at org.apache.sshd.server.session.ServerSession.sendKexInit(ServerSession.java:106)[30:org.apache.sshd.core:0.14.0]
at org.apache.sshd.server.session.ServerSession.readIdentification(ServerSession.java:168)[30:org.apache.sshd.core:0.14.0]
at org.apache.sshd.common.session.AbstractSession.messageReceived(AbstractSession.java:302)[30:org.apache.sshd.core:0.14.0]
at org.apache.sshd.common.AbstractSessionIoHandler.messageReceived(AbstractSessionIoHandler.java:54)[30:org.apache.sshd.core:0.14.0]
at org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:184)[30:org.apache.sshd.core:0.14.0]
at org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:170)[30:org.apache.sshd.core:0.14.0]
at org.apache.sshd.common.io.nio2.Nio2CompletionHandler$1.run(Nio2CompletionHandler.java:32)
at java.security.AccessController.doPrivileged(Native Method)[:1.8.0_141]
at org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:30)[30:org.apache.sshd.core:0.14.0]
at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)[:1.8.0_141]
at sun.nio.ch.Invoker$2.run(Invoker.java:218)[:1.8.0_141]
at sun.nio.ch.AsynchronousChannelGroupImpl$1.run(AsynchronousChannelGroupImpl.java:112)[:1.8.0_141]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)[:1.8.0_141]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)[:1.8.0_141]
at java.lang.Thread.run(Thread.java:748)[:1.8.0_141]
Caused by: java.lang.ClassNotFoundException: org.bouncycastle.openssl.PEMParser
at org.eclipse.osgi.internal.loader.BundleLoader.findClassInternal(BundleLoader.java:501)[org.eclipse.osgi-3.8.2.v20130124-134944.jar:]
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:421)[org.eclipse.osgi-3.8.2.v20130124-134944.jar:]
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:412)[org.eclipse.osgi-3.8.2.v20130124-134944.jar:]
at org.eclipse.osgi.internal.baseadaptor.DefaultClassLoader.loadClass(DefaultClassLoader.java:107)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)[:1.8.0_141]
... 19 more
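
The trace above ends in a ClassNotFoundException raised by the bundle class loader, so the question is whether org.bouncycastle.openssl.PEMParser is visible to the loader doing the lookup. A minimal sketch of such a visibility check follows. ClassProbe and isLoadable are hypothetical names, not part of ODL, Karaf or Apache SSHD, and running this outside the container only tests a flat classpath. Inside OSGi the result depends on which bundle's class loader is passed in.

```java
// Hypothetical diagnostic helper; not part of ODL, Karaf, or Apache SSHD.
final class ClassProbe {
    private ClassProbe() {
    }

    /** Returns true if the named class can be resolved through the given class loader. */
    static boolean isLoadable(String className, ClassLoader loader) {
        try {
            // initialize=false: resolve the class only, do not run its static initializers.
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError raised while resolving dependencies.
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the stack trace; any other name can be passed as an argument.
        String name = args.length > 0 ? args[0] : "org.bouncycastle.openssl.PEMParser";
        ClassLoader loader = ClassProbe.class.getClassLoader();
        System.out.println(name + " loadable: " + isLoadable(name, loader));
    }
}
```

To mirror the failing lookup more closely, the loader argument would have to be the class loader of the org.apache.sshd.core bundle, which this sketch does not obtain.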

Interestingly, the same error also occurs during the first Netconfready suite, but only once (at 13:35:29,627).
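
Comparing how often the error occurs before and after the reset can be done by scanning karaf.log for the exception line and noting the timestamp of the log entry it belongs to. The following is a rough sketch, assuming the log has already been decompressed and read into a list of lines, and that entry headers look like the one quoted above (timestamp first, fields separated by " | "). KarafLogScan and errorTimestamps are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for inspecting karaf.log; not part of any ODL tooling.
final class KarafLogScan {
    static final String MARKER =
            "java.lang.NoClassDefFoundError: org/bouncycastle/openssl/PEMParser";

    private KarafLogScan() {
    }

    /** Returns, for each MARKER line, the timestamp of the most recent entry header before it. */
    static List<String> errorTimestamps(List<String> lines) {
        List<String> hits = new ArrayList<>();
        String lastTimestamp = null;
        for (String line : lines) {
            int bar = line.indexOf(" | ");
            if (bar > 0 && Character.isDigit(line.charAt(0))) {
                // Entry header: remember the text before the first field separator.
                lastTimestamp = line.substring(0, bar);
            } else if (line.startsWith(MARKER)) {
                hits.add(lastTimestamp == null ? "unknown" : lastTimestamp);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Two lines copied from the log excerpt in this ticket.
        List<String> sample = Arrays.asList(
                "2017-09-25 13:39:17,091 | WARN | o-group-thread-2 | ServerSession"
                        + " | 30 - org.apache.sshd.core - 0.14.0 | Exception caught",
                MARKER);
        List<String> hits = errorTimestamps(sample);
        System.out.println(hits.size() + " occurrence(s): " + hits);
    }
}
```

Only lines that start with the exception text are counted, so the "Caused by:" line further down the same trace does not count as a second occurrence.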

Possibly the difference is that the first Netconfready happens later in the ODL boot sequence, which might be important considering we are running all CSIT jobs.

[0] https://logs.opendaylight.org/releng/jenkins092/netconf-csit-3node-clustering-all-carbon/413/log.html.gz#s1-s5-t1-k3-k1-k1-k1-k1-k4-k1-k1-k1-k3-k4-k1
[1] https://logs.opendaylight.org/releng/jenkins092/netconf-csit-3node-clustering-all-carbon/413/log.html.gz#s1-s1-t1-k3-k1-k1-k1-k1-k5-k1-k2-k1-k1-k3-k4-k1
[2] https://logs.opendaylight.org/releng/jenkins092/netconf-csit-3node-clustering-all-carbon/413/odl1_karaf.log.gz



 Comments   
Comment by Vratko Polak [ 26/Sep/17 ]

> in Carbon 3node all job

Notably, Nitrogen tests are passing without anything bad in karaf log. But of course Nitrogen is Karaf 4 with different upstream ssh libraries.

Comment by Tomas Cere [ 04/Oct/17 ]

With Karaf 4 the loading of bouncycastle has changed, iirc, so if the changes that odlparent made are replicated on Carbon, it might be possible to fix this.

Comment by Andrej Vanko [ 09/Oct/17 ]

Moved to the odlparent project.

Comment by Robert Varga [ 22/Aug/19 ]

Seems to have been fixed through the switch to Karaf 4.

Generated at Wed Feb 07 20:27:41 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.