<!-- 
RSS generated by JIRA (8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d) at Wed Feb 07 19:08:53 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>OpenDaylight JIRA</title>
    <link>https://jira.opendaylight.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>8.20.10</version>
        <build-number>820010</build-number>
        <build-date>22-06-2022</build-date>
    </build-info>


<item>
            <title>[AAA-195] aaa bundles failing to start</title>
                <link>https://jira.opendaylight.org/browse/AAA-195</link>
                <project id="10102" key="AAA">aaa</project>
                    <description>&lt;p&gt;This issue is seen in 3node (aka cluster) CSIT and happens when bringing up a node&lt;br/&gt;
that had previously been killed. With a trimmed down version of &lt;a href=&quot;https://jenkins.opendaylight.org/releng/job/ovsdb-csit-3node-upstream-clustering-only-magnesium&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;this ovsdb job &lt;/a&gt;&lt;br/&gt;
I was only able to hit the problem once in ~75 tries so it does seem to be very&lt;br/&gt;
infrequent.&lt;/p&gt;

&lt;p&gt;I have noticed this in a 3node openflowplugin job, as well as in the sodium branch.&lt;/p&gt;

&lt;p&gt;The problem reproduced in the sandbox (logs saved for 6 months):&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/builder-copy-sandbox-logs/753/ovsdb-csit-3node-upstream-clustering-only-magnesium/51/robot-plugin/log.html.gz&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;robot log &lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;https://logs.opendaylight.org/sandbox/vex-yul-odl-jenkins-2/ovsdb-csit-3node-upstream-clustering-only-magnesium/51/odl_2/odl2_karaf.log.gz&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;karaf log of node having trouble &lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The full log is linked above; here is a rough walkthrough of what I can see:&lt;/p&gt;

&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;Karaf is killed, then started a short time later:
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
2020-03-17T09:53:47,235 | INFO  | pipe-log:log &lt;span class=&quot;code-quote&quot;&gt;&quot;ROBOT MESSAGE: Starting test ovsdb-upstream-clustering.txt.Start OVS Multiple Connections&quot;&lt;/span&gt; | core                             | 119 - org.apache.karaf.log.core - 4.2.6 | ROBOT MESSAGE: Starting test ovsdb-upstream-clustering.txt.Kill Candidate Instance
2020-03-17T09:53:47,535 | INFO  | pipe-log:log &lt;span class=&quot;code-quote&quot;&gt;&quot;ROBOT MESSAGE: Starting test ovsdb-upstream-clustering.txt.Start OVS Multiple Connections&quot;&lt;/span&gt; | core                             | 119 - org.apache.karaf.log.core - 4.2.6 | ROBOT MESSAGE: Killing ODL2 10.30.170.78
Mar 17, 2020 9:54:16 AM org.apache.karaf.main.lock.SimpleFileLock lock
INFO: Trying to lock /tmp/karaf-0.12.0-SNAPSHOT/lock
Mar 17, 2020 9:54:17 AM org.apache.karaf.main.lock.SimpleFileLock lock
INFO: Lock acquired
Mar 17, 2020 9:54:17 AM org.apache.karaf.main.Main$KarafLockCallback lockAcquired
INFO: Lock acquired. Setting startlevel to 100
2020-03-17T09:54:18,567 | INFO  | Start Level: Equinox Container: 19e92b2f-ab93-4cb9-a1bd-1a16ef11e074 | BlueprintContainerImpl           | 82 - org.apache.aries.blueprint.core - 1.10.2 | Blueprint bundle org.apache.aries.blueprint.cm/1.3.1 has been started
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Skimming the logs after startup, normal startup operations appear to be&lt;br/&gt;
underway, but after some time the log gets flooded with these:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
2020-03-17T09:54:32,561 | WARN  | opendaylight-cluster-data-akka.actor.&lt;span class=&quot;code-keyword&quot;&gt;default&lt;/span&gt;-dispatcher-35 | EndpointReader                   | 47 - com.typesafe.akka.slf4j - 2.5.26 | Discarding inbound message to [Actor[akka:&lt;span class=&quot;code-comment&quot;&gt;//opendaylight-cluster-data/]] in read-only association to [akka.tcp://opendaylight-cluster-data@10.30.170.82:2550]. If &lt;span class=&quot;code-keyword&quot;&gt;this&lt;/span&gt; happens often you may consider using akka.remote.use-passive-connections=off or use Artery TCP.
&lt;/span&gt;2020-03-17T09:54:32,590 | WARN  | opendaylight-cluster-data-akka.actor.&lt;span class=&quot;code-keyword&quot;&gt;default&lt;/span&gt;-dispatcher-35 | EndpointReader                   | 47 - com.typesafe.akka.slf4j - 2.5.26 | Discarding inbound message to [Actor[akka:&lt;span class=&quot;code-comment&quot;&gt;//opendaylight-cluster-data/]] in read-only association to [akka.tcp://opendaylight-cluster-data@10.30.170.82:2550]. If &lt;span class=&quot;code-keyword&quot;&gt;this&lt;/span&gt; happens often you may consider using akka.remote.use-passive-connections=off or use Artery TCP.
&lt;/span&gt;2020-03-17T09:54:32,591 | WARN  | opendaylight-cluster-data-akka.actor.&lt;span class=&quot;code-keyword&quot;&gt;default&lt;/span&gt;-dispatcher-35 | EndpointReader                   | 47 - com.typesafe.akka.slf4j - 2.5.26 | Discarding inbound message to [Actor[akka:&lt;span class=&quot;code-comment&quot;&gt;//opendaylight-cluster-data/]] in read-only association to [akka.tcp://opendaylight-cluster-data@10.30.170.82:2550]. If &lt;span class=&quot;code-keyword&quot;&gt;this&lt;/span&gt; happens often you may consider using akka.remote.use-passive-connections=off or use Artery TCP.&lt;/span&gt;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
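
&lt;p&gt;To gauge how heavy that flood is, a small script can tally the warnings per remote peer. This is a minimal sketch and not part of the original analysis: the helper name count_discards is hypothetical, and the sample lines are abridged copies of the excerpt above. A real run would read the karaf log linked earlier instead.&lt;/p&gt;

```python
import re
from collections import Counter

# Hypothetical helper (not from the issue): tally Akka
# "Discarding inbound message" warnings per remote peer address.
PEER = re.compile(r"read-only association to \[(akka\.tcp://[^\]]+)\]")


def count_discards(lines):
    """Return a Counter mapping remote peer address to number of discard warnings."""
    counts = Counter()
    for line in lines:
        if "Discarding inbound message" not in line:
            continue
        match = PEER.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Abridged sample lines based on the log excerpt in this issue (assumption:
# the raw karaf log carries the same text without the HTML highlighting).
sample = [
    "2020-03-17T09:54:32,590 | WARN  | EndpointReader | Discarding inbound message to "
    "[Actor[akka://opendaylight-cluster-data/]] in read-only association to "
    "[akka.tcp://opendaylight-cluster-data@10.30.170.82:2550].",
    "2020-03-17T09:54:32,591 | WARN  | EndpointReader | Discarding inbound message to "
    "[Actor[akka://opendaylight-cluster-data/]] in read-only association to "
    "[akka.tcp://opendaylight-cluster-data@10.30.170.82:2550].",
    "2020-03-17T09:56:00,372 | ERROR | AbstractDataStore | Shard leaders failed to settle "
    "in 90 seconds, giving up",
]

for peer, count in count_discards(sample).items():
    print(peer, count)
```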

&lt;p&gt;In the middle of that flood, this error shows up:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
2020-03-17T09:56:00,372 | ERROR | Blueprint Extender: 3 | AbstractDataStore                | 227 - org.opendaylight.controller.sal-distributed-datastore - 1.11.0.SNAPSHOT | Shard leaders failed to settle in 90 seconds, giving up
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Eventually some TimeoutExceptions appear, approx. 2.5 minutes after starting:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
2020-03-17T09:56:21,549 | WARN  | ForkJoinPool.commonPool-worker-3 | AbstractShardBackendResolver     | 227 - org.opendaylight.controller.sal-distributed-datastore - 1.11.0.SNAPSHOT | Failed to resolve shard
java.util.concurrent.TimeoutException: Shard has no current leader
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;A few more of those TimeoutException warnings show up, then a blueprint-related&lt;br/&gt;
ERROR:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
2020-03-17T09:58:01,620 | ERROR | Blueprint Extender: 3 | PrefixedShardConfigWriter        | 227 - org.opendaylight.controller.sal-distributed-datastore - 1.11.0.SNAPSHOT | Unable to write initial shard config parent.
java.util.concurrent.ExecutionException: org.opendaylight.controller.cluster.access.client.RequestTimeoutException: ModifyTransactionRequest{target=member-2-datastore-Shard-prefix-configuration-shard-fe-2-chn-1-txn-0-0, sequence=1, replyTo=Actor[akka:&lt;span class=&quot;code-comment&quot;&gt;//opendaylight-cluster-data/user/$c#222048557], modifications=0, protocol=SIMPLE} timed out after 120.005185571 seconds. The backend &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; prefix-configuration-shard is not available.&lt;/span&gt;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Finally, after approx. 5.5 minutes, AAA bundles log ERRORs for failing to start, although&lt;br/&gt;
my guess is that they are just the first in line to fail, as the errors mention a timeout&lt;br/&gt;
waiting for dependencies. Examples:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
2020-03-17T09:59:21,848 | ERROR | Blueprint Extender: 1 | BlueprintContainerImpl           | 82 - org.apache.aries.blueprint.core - 1.10.2 | Unable to start container &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; blueprint bundle org.opendaylight.aaa.cert/0.11.0.SNAPSHOT due to unresolved dependencies [(objectClass=org.opendaylight.mdsal.binding.api.DataBroker), (&amp;amp;(|(type=&lt;span class=&quot;code-keyword&quot;&gt;default&lt;/span&gt;)(!(type=*)))(objectClass=org.opendaylight.mdsal.binding.api.DataBroker)), (objectClass=org.opendaylight.aaa.encrypt.AAAEncryptionService)]
java.util.concurrent.TimeoutException: &lt;span class=&quot;code-keyword&quot;&gt;null&lt;/span&gt;
	at org.apache.aries.blueprint.container.BlueprintContainerImpl$1.run(BlueprintContainerImpl.java:393) [82:org.apache.aries.blueprint.core:1.10.2]
	at org.apache.aries.blueprint.utils.threading.impl.DiscardableRunnable.run(DiscardableRunnable.java:45) [82:org.apache.aries.blueprint.core:1.10.2]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
	at java.lang.&lt;span class=&quot;code-object&quot;&gt;Thread&lt;/span&gt;.run(&lt;span class=&quot;code-object&quot;&gt;Thread&lt;/span&gt;.java:834) [?:?]
2020-03-17T09:59:21,851 | WARN  | Blueprint Event Dispatcher: 1 | BlueprintBundleTracker           | 204 - org.opendaylight.controller.blueprint - 0.12.0.SNAPSHOT | Blueprint container &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; bundle org.opendaylight.aaa.cert_0.11.0.SNAPSHOT [192] timed out waiting &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; dependencies - restarting it
2020-03-17T09:59:21,854 | INFO  | BlueprintContainerRestartService | BlueprintExtender                | 82 - org.apache.aries.blueprint.core - 1.10.2 | Destroying container &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; blueprint bundle org.opendaylight.aaa.cert/0.11.0.SNAPSHOT
2020-03-17T09:59:21,864 | ERROR | Blueprint Extender: 2 | BlueprintContainerImpl           | 82 - org.apache.aries.blueprint.core - 1.10.2 | Unable to start container &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; blueprint bundle org.opendaylight.aaa.encrypt-service-impl/0.11.0.SNAPSHOT due to unresolved dependencies [(objectClass=org.opendaylight.mdsal.binding.api.DataBroker), (&amp;amp;(|(type=&lt;span class=&quot;code-keyword&quot;&gt;default&lt;/span&gt;)(!(type=*)))(objectClass=org.opendaylight.mdsal.binding.api.DataBroker))]
java.util.concurrent.TimeoutException: &lt;span class=&quot;code-keyword&quot;&gt;null&lt;/span&gt;
	at org.apache.aries.blueprint.container.BlueprintContainerImpl$1.run(BlueprintContainerImpl.java:393) [82:org.apache.aries.blueprint.core:1.10.2]
	at org.apache.aries.blueprint.utils.threading.impl.DiscardableRunnable.run(DiscardableRunnable.java:45) [82:org.apache.aries.blueprint.core:1.10.2]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
	at java.lang.&lt;span class=&quot;code-object&quot;&gt;Thread&lt;/span&gt;.run(&lt;span class=&quot;code-object&quot;&gt;Thread&lt;/span&gt;.java:834) [?:?]
2020-03-17T09:59:21,865 | WARN  | Blueprint Event Dispatcher: 1 | BlueprintBundleTracker           | 204 - org.opendaylight.controller.blueprint - 0.12.0.SNAPSHOT | Blueprint container &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; bundle org.opendaylight.aaa.encrypt-service-impl_0.11.0.SNAPSHOT [194] timed out waiting &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; dependencies - restarting it
2020-03-17T09:59:21,869 | INFO  | BlueprintContainerRestartService | BlueprintContainerImpl           | 82 - org.apache.aries.blueprint.core - 1.10.2 | Blueprint bundle org.opendaylight.aaa.cert/0.11.0.SNAPSHOT is waiting &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; dependencies [(&amp;amp;(|(type=&lt;span class=&quot;code-keyword&quot;&gt;default&lt;/span&gt;)(!(type=*)))(objectClass=org.opendaylight.mdsal.binding.api.DataBroker)), (objectClass=org.opendaylight.mdsal.binding.api.DataBroker), (objectClass=org.opendaylight.aaa.encrypt.AAAEncryptionService)]
2020-03-17T09:59:21,870 | INFO  | BlueprintContainerRestartService | BlueprintExtender                | 82 - org.apache.aries.blueprint.core - 1.10.2 | Destroying container &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; blueprint bundle org.opendaylight.aaa.encrypt-service-impl/0.11.0.SNAPSHOT
2020-03-17T09:59:21,875 | INFO  | BlueprintContainerRestartService | BlueprintContainerImpl           | 82 - org.apache.aries.blueprint.core - 1.10.2 | Blueprint bundle org.opendaylight.aaa.encrypt-service-impl/0.11.0.SNAPSHOT is waiting &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; dependencies [(objectClass=org.opendaylight.mdsal.binding.api.DataBroker), (&amp;amp;(|(type=&lt;span class=&quot;code-keyword&quot;&gt;default&lt;/span&gt;)(!(type=*)))(objectClass=org.opendaylight.mdsal.binding.api.DataBroker))]
2020-03-17T09:59:21,900 | INFO  | awaitility[checkBundleDiagInfos] | KarafSystemReady                 | 239 - org.opendaylight.infrautils.ready-impl - 1.7.0.SNAPSHOT | checkBundleDiagInfos: Elapsed time 298s, remaining time 1s, diag: Booting {Installed=0, Resolved=7, Unknown=0, GracePeriod=17, Waiting=0, Starting=0, Active=356, Stopping=0, Failure=0}
2020-03-17T09:59:21,959 | WARN  | opendaylight-cluster-data-akka.actor.&lt;span class=&quot;code-keyword&quot;&gt;default&lt;/span&gt;-dispatcher-5 | EndpointReader                   | 47 - com.typesafe.akka.slf4j - 2.5.26 | Discarding inbound message to [Actor[akka:&lt;span class=&quot;code-comment&quot;&gt;//opendaylight-cluster-data/]] in read-only association to [akka.tcp://opendaylight-cluster-data@10.30.170.82:2550]. If &lt;span class=&quot;code-keyword&quot;&gt;this&lt;/span&gt; happens often you may consider using akka.remote.use-passive-connections=off or use Artery TCP.
&lt;/span&gt;2020-03-17T09:59:21,998 | WARN  | opendaylight-cluster-data-akka.actor.&lt;span class=&quot;code-keyword&quot;&gt;default&lt;/span&gt;-dispatcher-5 | EndpointReader                   | 47 - com.typesafe.akka.slf4j - 2.5.26 | Discarding inbound message to [Actor[akka:&lt;span class=&quot;code-comment&quot;&gt;//opendaylight-cluster-data/]] in read-only association to [akka.tcp://opendaylight-cluster-data@10.30.170.82:2550]. If &lt;span class=&quot;code-keyword&quot;&gt;this&lt;/span&gt; happens often you may consider using akka.remote.use-passive-connections=off or use Artery TCP.
&lt;/span&gt;2020-03-17T09:59:22,037 | ERROR | Blueprint Extender: 2 | BlueprintContainerImpl           | 82 - org.apache.aries.blueprint.core - 1.10.2 | Unable to start container &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; blueprint bundle org.opendaylight.aaa.password-service-impl/0.11.0.SNAPSHOT due to unresolved dependencies [(&amp;amp;(|(type=&lt;span class=&quot;code-keyword&quot;&gt;default&lt;/span&gt;)(!(type=*)))(objectClass=org.opendaylight.mdsal.binding.api.DataBroker))]
java.util.concurrent.TimeoutException: &lt;span class=&quot;code-keyword&quot;&gt;null&lt;/span&gt;
	at org.apache.aries.blueprint.container.BlueprintContainerImpl$1.run(BlueprintContainerImpl.java:393) [82:org.apache.aries.blueprint.core:1.10.2]
	at org.apache.aries.blueprint.utils.threading.impl.DiscardableRunnable.run(DiscardableRunnable.java:45) [82:org.apache.aries.blueprint.core:1.10.2]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
	at java.lang.&lt;span class=&quot;code-object&quot;&gt;Thread&lt;/span&gt;.run(&lt;span class=&quot;code-object&quot;&gt;Thread&lt;/span&gt;.java:834) [?:?]
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The log continues to show these kinds of troubles until the job gives up, which is approx. 8.5 minutes&lt;br/&gt;
after the controller was started.&lt;/p&gt;</description>
                <environment></environment>
        <key id="32496">AAA-195</key>
            <summary>aaa bundles failing to start</summary>
                <type id="10104" iconUrl="https://jira.opendaylight.org/secure/viewavatar?size=xsmall&amp;avatarId=10303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="1" iconUrl="https://jira.opendaylight.org/images/icons/priorities/blocker.svg">Highest</priority>
                        <status id="10004" iconUrl="https://jira.opendaylight.org/images/icons/status_generic.gif" description="">Verified</status>
                    <statusCategory id="3" key="done" colorName="green"/>
                                    <resolution id="10000">Done</resolution>
                                        <assignee username="rovarga">Robert Varga</assignee>
                                    <reporter username="jluhrsen">Jamo Luhrsen</reporter>
                        <labels>
                            <label>CSIT</label>
                            <label>csit:3node</label>
                    </labels>
                <created>Tue, 17 Mar 2020 23:06:53 +0000</created>
                <updated>Mon, 23 Mar 2020 18:06:18 +0000</updated>
                            <resolved>Mon, 23 Mar 2020 18:06:14 +0000</resolved>
                                                    <fixVersion>Magnesium SR1</fixVersion>
                                    <component>General</component>
                        <due></due>
                            <votes>0</votes>
                                    <watches>5</watches>
                                                                                                                <comments>
                            <comment id="67971" author="jluhrsen" created="Wed, 18 Mar 2020 16:04:06 +0000"  >&lt;p&gt;I finally have a semi-reliable way to reproduce this. &lt;a href=&quot;https://git.opendaylight.org/gerrit/c/integration/test/+/88447&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;This patch &lt;/a&gt; essentially is stripping all ovsdb&lt;br/&gt;
work/test-cases from the suite and just repeating the kill/restart steps 10x. The frequency of occurrence is not higher, but after running only 30 jobs (approx. 8 hours if they run back to back) I saw the&lt;br/&gt;
problem 6 times.&lt;/p&gt;

&lt;p&gt;I will start a job with a revert of our &lt;a href=&quot;https://git.opendaylight.org/gerrit/c/aaa/+/87733&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;first suspect from AAA-180 &lt;/a&gt; and see what that looks like after repeating&lt;br/&gt;
the job.&lt;/p&gt;</comment>
                            <comment id="67972" author="jluhrsen" created="Thu, 19 Mar 2020 03:01:38 +0000"  >&lt;p&gt;The problem is still there with both the distro from the revert patch (which is Magnesium based) as well as in&lt;br/&gt;
sodium SR2. I am trying the original Sodium release now to see if it is also there or not.&lt;/p&gt;</comment>
                            <comment id="67973" author="gvrangan" created="Thu, 19 Mar 2020 07:42:35 +0000"  >&lt;p&gt;Akka Ports are unreachable&lt;/p&gt;


&lt;p&gt;,103 | INFO  | opendaylight-cluster-data-akka.actor.default-dispatcher-41 | Cluster(akka://opendaylight-cluster-data) | 47 - com.typesafe.akka.slf4j - 2.5.26 | Cluster Node &lt;span class=&quot;error&quot;&gt;&amp;#91;akka.tcp://opendaylight-cluster-data@10.30.170.23:2550&amp;#93;&lt;/span&gt; - Leader can currently not perform its duties, reachability status: [akka.tcp://opendaylight-cluster-data@10.30.170.68:2550 -&amp;gt; akka.tcp://opendaylight-cluster-data@10.30.170.8:2550: Unreachable &lt;span class=&quot;error&quot;&gt;&amp;#91;Unreachable&amp;#93;&lt;/span&gt; (1), akka.tcp://opendaylight-cluster-data@10.30.170.8:2550 -&amp;gt; akka.tcp://opendaylight-cluster-data@10.30.170.68:2550: Unreachable &lt;span class=&quot;error&quot;&gt;&amp;#91;Unreachable&amp;#93;&lt;/span&gt; (3)], member status: &lt;span class=&quot;error&quot;&gt;&amp;#91;akka.tcp://opendaylight-cluster-data@10.30.170.23:2550 Up seen=true, akka.tcp://opendaylight-cluster-data@10.30.170.68:2550 Up seen=true, akka.tcp://opendaylight-cluster-data@10.30.170.8:2550 Up seen=true&amp;#93;&lt;/span&gt;&lt;/p&gt;</comment>
                            <comment id="67977" author="jluhrsen" created="Thu, 19 Mar 2020 21:25:44 +0000"  >&lt;p&gt;Wondering if it&apos;s because of the broken controller doing this:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
2020-03-17T09:54:30,760 | WARN  | opendaylight-cluster-data-akka.actor.&lt;span class=&quot;code-keyword&quot;&gt;default&lt;/span&gt;-dispatcher-35 | EndpointReader                   | 47 - com.typesafe.akka.slf4j - 2.5.26 | Discarding inbound message to [Actor[akka:&lt;span class=&quot;code-comment&quot;&gt;//opendaylight-cluster-data/]] in read-only association to [akka.tcp://opendaylight-cluster-data@10.30.170.82:2550]. If &lt;span class=&quot;code-keyword&quot;&gt;this&lt;/span&gt; happens often you may consider using akka.remote.use-passive-connections=off or use Artery TCP.&lt;/span&gt;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I see what looks like all three controllers noticing the controller coming up and starting to sync with each&lt;br/&gt;
other. I see leadership decision messages (Candidate to Follower or Leader, etc.). I also see messages&lt;br/&gt;
like this coming from the broken controller:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
2020-03-17T09:54:28,670 | INFO  | opendaylight-cluster-data-akka.actor.&lt;span class=&quot;code-keyword&quot;&gt;default&lt;/span&gt;-dispatcher-44 | Cluster(akka:&lt;span class=&quot;code-comment&quot;&gt;//opendaylight-cluster-data) | 47 - com.typesafe.akka.slf4j - 2.5.26 | Cluster Node [akka.tcp://opendaylight-cluster-data@10.30.170.59:2550] - Received InitJoin message from [Actor[akka.tcp://opendaylight-cluster-data@10.30.170.78:2550/system/cluster/core/daemon/joinSeedNodeProcess-1#1614826351]] to [akka.tcp://opendaylight-cluster-data@10.30.170.59:2550]
&lt;/span&gt;2020-03-17T09:54:28,671 | INFO  | opendaylight-cluster-data-akka.actor.&lt;span class=&quot;code-keyword&quot;&gt;default&lt;/span&gt;-dispatcher-44 | Cluster(akka:&lt;span class=&quot;code-comment&quot;&gt;//opendaylight-cluster-data) | 47 - com.typesafe.akka.slf4j - 2.5.26 | Cluster Node [akka.tcp://opendaylight-cluster-data@10.30.170.59:2550] - Sending InitJoinAck message from node [akka.tcp://opendaylight-cluster-data@10.30.170.59:2550] to [Actor[akka.tcp://opendaylight-cluster-data@10.30.170.78:2550/system/cluster/core/daemon/joinSeedNodeProcess-1#1614826351]] (version [2.5.26])&lt;/span&gt;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href=&quot;https://jira.opendaylight.org/secure/ViewProfile.jspa?name=rovarga&quot; class=&quot;user-hover&quot; rel=&quot;rovarga&quot;&gt;rovarga&lt;/a&gt; has a &lt;a href=&quot;https://git.opendaylight.org/gerrit/c/controller/+/88437&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;patch to disable &quot;passive&quot; connections &lt;/a&gt; which is clearly suggested in that discard log above. I&apos;ll try&lt;br/&gt;
reproducing this with that patch next. It&apos;s merged in Aluminium/master now.&lt;/p&gt;

&lt;p&gt;BTW, this issue was also seen in the original Sodium release so clearly it&apos;s been around for a while now and we are&lt;br/&gt;
just now noticing and following up. Originally I thought it might be some new issue, but I was wrong.&lt;/p&gt;</comment>
                            <comment id="67978" author="jluhrsen" created="Fri, 20 Mar 2020 17:41:46 +0000"  >&lt;p&gt;The patch to disable passive connections seems to resolve this. 400+ iterations without the failure now.&lt;br/&gt;
Recall that before the patch I saw the problem 6 times in 300 tries. I&apos;ll let it keep running through the day just in&lt;br/&gt;
case.&lt;/p&gt;
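
&lt;p&gt;As a rough sanity check on those numbers, one can ask how likely 400 clean iterations would be if the old failure rate still applied. The sketch below is illustrative arithmetic only: it assumes every try is independent, that the 400 iterations are counted in the same unit as the earlier 300 tries, and that 6 in 300 reflects the true rate before the patch. The function name prob_all_clean is hypothetical.&lt;/p&gt;

```python
# Illustrative arithmetic only; assumes independent tries and that the
# earlier 6-in-300 observation reflects the true pre-patch failure rate.
failures_before = 6
tries_before = 300
clean_tries_after = 400


def prob_all_clean(failures, tries, clean_tries):
    """Probability of clean_tries consecutive passes at the old failure rate."""
    rate = failures / tries
    return (1 - rate) ** clean_tries


p = prob_all_clean(failures_before, tries_before, clean_tries_after)
print(f"old failure rate: {failures_before / tries_before:.1%}")
print(f"chance of {clean_tries_after} clean tries by luck: {p:.5f}")
```

&lt;p&gt;Under these assumptions the chance comes out well below 1%, which is consistent with the patch, rather than luck, having removed the failure.&lt;/p&gt;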

&lt;p&gt;Here are the cherry-picks to sodium and magnesium that I think we want to get in as well:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://git.opendaylight.org/gerrit/c/controller/+/88304&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://git.opendaylight.org/gerrit/c/controller/+/88304&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;https://git.opendaylight.org/gerrit/c/controller/+/88305&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://git.opendaylight.org/gerrit/c/controller/+/88305&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="67979" author="jluhrsen" created="Sat, 21 Mar 2020 04:37:20 +0000"  >&lt;p&gt;totally convinced now &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.opendaylight.org/images/icons/emoticons/smile.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt; 80+ without a failure. once the cherry-picks are merged I will close this as resolved.&lt;/p&gt;</comment>
                            <comment id="67981" author="jluhrsen" created="Mon, 23 Mar 2020 18:06:14 +0000"  >&lt;p&gt;thanks for the fix. I love when we fix those random sporadic failures that end up annoying us at the worst times (like when releasing)&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                            <customfield id="customfield_11400" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10000" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>0|i03rjj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>