<!-- 
RSS generated by JIRA (8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d) at Wed Feb 07 19:56:14 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92">
<channel>
    <title>OpenDaylight JIRA</title>
    <link>https://jira.opendaylight.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>8.20.10</version>
        <build-number>820010</build-number>
        <build-date>22-06-2022</build-date>
    </build-info>


<item>
            <title>[CONTROLLER-1706] Large transaction traffic prevents leader from being moved</title>
                <link>https://jira.opendaylight.org/browse/CONTROLLER-1706</link>
                <project id="10113" key="CONTROLLER">controller</project>
                    <description>&lt;p&gt;This symptom is affecting current cluster testing, but after fixing other bugs it might no longer be critical. This has only been seen on Sandbox so far.&lt;/p&gt;

&lt;p&gt;The robot symptom &quot;Leader not found&quot; &lt;span class=&quot;error&quot;&gt;&amp;#91;0&amp;#93;&lt;/span&gt; is similar to &lt;a href=&quot;https://jira.opendaylight.org/browse/CONTROLLER-1693&quot; title=&quot;UnreachableMember during remove-shard-replica prevents new leader to get elected&quot; class=&quot;issue-link&quot; data-issue-key=&quot;CONTROLLER-1693&quot;&gt;&lt;del&gt;CONTROLLER-1693&lt;/del&gt;&lt;/a&gt; but this time there is no &quot;UnreachableMember&quot; seen in karaf.log &lt;span class=&quot;error&quot;&gt;&amp;#91;1&amp;#93;&lt;/span&gt; (perhaps because &lt;a href=&quot;https://jira.opendaylight.org/browse/CONTROLLER-1703&quot; title=&quot;Tweak Akka and Java timeouts to a reasonable compromise between stability and failure detection&quot; class=&quot;issue-link&quot; data-issue-key=&quot;CONTROLLER-1703&quot;&gt;CONTROLLER-1703&lt;/a&gt;).&lt;br/&gt;
&lt;a href=&quot;https://jira.opendaylight.org/browse/CONTROLLER-1675&quot; title=&quot;Leadership transfer failed: Follower is not ready to become leader&quot; class=&quot;issue-link&quot; data-issue-key=&quot;CONTROLLER-1675&quot;&gt;CONTROLLER-1675&lt;/a&gt; is similar, but here the leader-moving call (remove-shard-replica in this case) does not fail, and karaf.log just states that the leadership transfer was not successful:&lt;/p&gt;

&lt;p&gt;2017-06-06 07:51:36,160 | WARN  | lt-dispatcher-28 | aftActorLeadershipTransferCohort | 193 - org.opendaylight.controller.sal-akka-raft - 1.5.1.SNAPSHOT | member-1-shard-default-config: Failed to transfer leadership in 10.01 s&lt;/p&gt;

&lt;p&gt;I suspect the large transaction is the creation of a large list at the start of write-transactions, see &lt;a href=&quot;https://jira.opendaylight.org/browse/CONTROLLER-1703&quot; title=&quot;Tweak Akka and Java timeouts to a reasonable compromise between stability and failure detection&quot; class=&quot;issue-link&quot; data-issue-key=&quot;CONTROLLER-1703&quot;&gt;CONTROLLER-1703&lt;/a&gt;. But I am not sure whether the following proves or disproves that:&lt;/p&gt;

&lt;p&gt;2017-06-06 07:52:40,474 | INFO  | ternal.Finalizer | lientBackedTransaction$Finalizer | 199 - org.opendaylight.controller.sal-distributed-datastore - 1.5.1.SNAPSHOT | Aborted orphan transaction ClientSnapshot&lt;/p&gt;
{identifier=member-1-datastore-config-fe-0-txn-7-0}

&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;0&amp;#93;&lt;/span&gt; &lt;a href=&quot;https://logs.opendaylight.org/sandbox/jenkins091/controller-csit-3node-clustering-only-carbon/4/log.html.gz#s1-s18-t1-k2-k12-k1-k3-k1&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://logs.opendaylight.org/sandbox/jenkins091/controller-csit-3node-clustering-only-carbon/4/log.html.gz#s1-s18-t1-k2-k12-k1-k3-k1&lt;/a&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;1&amp;#93;&lt;/span&gt; &lt;a href=&quot;https://logs.opendaylight.org/sandbox/jenkins091/controller-csit-3node-clustering-only-carbon/4/odl1_karaf.log.gz&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://logs.opendaylight.org/sandbox/jenkins091/controller-csit-3node-clustering-only-carbon/4/odl1_karaf.log.gz&lt;/a&gt;&lt;/p&gt;</description>
                <environment>&lt;p&gt;Operating System: All&lt;br/&gt;
Platform: All&lt;/p&gt;</environment>
        <key id="26260">CONTROLLER-1706</key>
            <summary>Large transaction traffic prevents leader from being moved</summary>
                <type id="10104" iconUrl="https://jira.opendaylight.org/secure/viewavatar?size=xsmall&amp;avatarId=10303&amp;avatarType=issuetype">Bug</type>
                                                <status id="5" iconUrl="https://jira.opendaylight.org/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="green"/>
                                    <resolution id="10000">Done</resolution>
                                        <assignee username="jmorvay@cisco.com">Jakub Morvay</assignee>
                                    <reporter username="vrpolak">Vratko Polak</reporter>
                        <labels>
                    </labels>
                <created>Tue, 6 Jun 2017 12:49:32 +0000</created>
                <updated>Tue, 25 Jul 2023 08:24:41 +0000</updated>
                            <resolved>Wed, 21 Jun 2017 09:12:20 +0000</resolved>
                                                                    <component>clustering</component>
                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                                                                <comments>
                            <comment id="52330" author="vrpolak" created="Tue, 6 Jun 2017 12:58:19 +0000"  >&lt;p&gt;Note that this happens both for module-based and prefix-based shards (tell-based protocol), but only when 3 writers/producers (and a single listener) are used. A similar suite with a single writer/producer passes (mostly).&lt;/p&gt;</comment>
                            <comment id="52331" author="vrpolak" created="Wed, 7 Jun 2017 15:02:37 +0000"  >&lt;p&gt;With the initial big transaction removed &lt;span class=&quot;error&quot;&gt;&amp;#91;2&amp;#93;&lt;/span&gt; (now merged to stable/carbon), this failure still affects remove-shard-replica when there are frequent small transactions, but only for module-based shards (not prefix-based ones) for some reason.&lt;/p&gt;

&lt;p&gt;Recent sandbox, robot &lt;span class=&quot;error&quot;&gt;&amp;#91;3&amp;#93;&lt;/span&gt;, member-1 karaf.log (huge) &lt;span class=&quot;error&quot;&gt;&amp;#91;4&amp;#93;&lt;/span&gt;:&lt;/p&gt;

&lt;p&gt;2017-06-06 19:14:27,516 | WARN  | lt-dispatcher-25 | aftActorLeadershipTransferCohort | 193 - org.opendaylight.controller.sal-akka-raft - 1.5.1.SNAPSHOT | member-1-shard-default-config: Failed to transfer leadership in 10.03 s&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;2&amp;#93;&lt;/span&gt; &lt;a href=&quot;https://git.opendaylight.org/gerrit/58355&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://git.opendaylight.org/gerrit/58355&lt;/a&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;3&amp;#93;&lt;/span&gt; &lt;a href=&quot;https://logs.opendaylight.org/sandbox/jenkins091/controller-csit-3node-clustering-only-carbon/6/log.html.gz#s1-s18-t1-k2-k12-k1-k3-k1&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://logs.opendaylight.org/sandbox/jenkins091/controller-csit-3node-clustering-only-carbon/6/log.html.gz#s1-s18-t1-k2-k12-k1-k3-k1&lt;/a&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;4&amp;#93;&lt;/span&gt; &lt;a href=&quot;https://logs.opendaylight.org/sandbox/jenkins091/controller-csit-3node-clustering-only-carbon/6/odl1_karaf.log.gz&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://logs.opendaylight.org/sandbox/jenkins091/controller-csit-3node-clustering-only-carbon/6/odl1_karaf.log.gz&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="52332" author="vrpolak" created="Wed, 7 Jun 2017 15:24:11 +0000"  >&lt;p&gt;&amp;gt; (not prefix-based ones) for some reason.&lt;/p&gt;

&lt;p&gt;For comparison, prefix shard performance:&lt;/p&gt;

&lt;p&gt;2017-06-06 19:23:12,676 | INFO  | ult-dispatcher-3 | aftActorLeadershipTransferCohort | 193 - org.opendaylight.controller.sal-akka-raft - 1.5.1.SNAPSHOT | member-3-shard-id-ints!-config: Successfully transferred leadership to null in 2.031 s&lt;/p&gt;</comment>
                            <comment id="52333" author="jmorvay@cisco.com" created="Fri, 9 Jun 2017 12:31:55 +0000"  >&lt;p&gt;I have analyzed logs from run &lt;span class=&quot;error&quot;&gt;&amp;#91;0&amp;#93;&lt;/span&gt;. The only differences between this job and the jobs mentioned above by Vratko are that we are using PoisonPill to kill shard leader replica (not remove-shard-replica rpc) and we have debug logs for raft messages.&lt;/p&gt;

&lt;p&gt;Moreover, I have been looking at the Clean Leader Shutdown suite, not the Remote Listener suite mentioned by Vratko, but I believe they are failing for the same reason.&lt;/p&gt;

&lt;p&gt;The scenario goes as follows. We have a three-node cluster and, on the leader, we start writing data to the shard-default-config shard (that&apos;s member-3 in this particular run). Then we kill the shard-default-config shard on the leader node (we send a PoisonPill to the member-3-shard-default-config shard). We expect the followers to elect a new leader quickly, within a small number of election timeouts. But this is not the case: a leader is not elected within one minute and the test fails.&lt;/p&gt;

&lt;p&gt;From the logs we can see the leader shard is going down in 11:06:31,494&lt;br/&gt;
2017-06-08 11:06:31,494 | INFO  | lt-dispatcher-20 | Shard                            | 192 - org.opendaylight.controller.sal-clustering-commons - 1.5.1.SNAPSHOT | Stopping Shard member-3-shard-default-config&lt;br/&gt;
2017-06-08 11:06:31,494 | DEBUG | lt-dispatcher-20 | Shard                            | 192 - org.opendaylight.controller.sal-clustering-commons - 1.5.1.SNAPSHOT | member-3-shard-default-config: Aborting 307 pending queued transactions&lt;/p&gt;

&lt;p&gt;When we look at the followers, they are still processing AppendEntries messages for the shard-default-config shard from the leader for a short time. But we can see they time out elections properly, roughly 10 seconds after the leader goes down. We can see this on member-2:&lt;/p&gt;

&lt;p&gt;2017-06-08 11:06:41,479 | TRACE | lt-dispatcher-30 | Shard                            | 192 - org.opendaylight.controller.sal-clustering-commons - 1.5.1.SNAPSHOT | Received message ElectionTimeout&lt;br/&gt;
2017-06-08 11:06:41,479 | DEBUG | lt-dispatcher-30 | Shard                            | 192 - org.opendaylight.controller.sal-clustering-commons - 1.5.1.SNAPSHOT | member-2-shard-default-config (Follower): Checking for leader akka.tcp://opendaylight-cluster-data@10.29.15.136:2550 in the cluster unreachable set []&lt;br/&gt;
2017-06-08 11:06:41,481 | DEBUG | lt-dispatcher-30 | Shard                            | 192 - org.opendaylight.controller.sal-clustering-commons - 1.5.1.SNAPSHOT | member-2-shard-default-config (Follower): Leader akka.tcp://opendaylight-cluster-data@10.29.15.136:2550 cluster status is Up - leader is available&lt;br/&gt;
2017-06-08 11:06:41,481 | DEBUG | lt-dispatcher-30 | Shard                            | 192 - org.opendaylight.controller.sal-clustering-commons - 1.5.1.SNAPSHOT | member-2-shard-default-config (Follower): Received ElectionTimeout but leader appears to be available&lt;/p&gt;

&lt;p&gt;And this repeats every 10 seconds until the test ends in failure. We can see the same logs on member-1.&lt;/p&gt;

&lt;p&gt;When we look closely at the implementation of election timeout handling on followers, we can see that followers also check the cluster status of the leader before switching to candidate. And they do this for 180 seconds (if the election timeout is 10 seconds). This was introduced in &lt;a href=&quot;https://git.opendaylight.org/gerrit/#/c/43265/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://git.opendaylight.org/gerrit/#/c/43265/&lt;/a&gt; as a fix for &lt;a href=&quot;https://jira.opendaylight.org/browse/CONTROLLER-1495&quot; title=&quot;Prevent Follower from becoming Candidate when Akka Cluster reports Leader as Reachable&quot; class=&quot;issue-link&quot; data-issue-key=&quot;CONTROLLER-1495&quot;&gt;&lt;del&gt;CONTROLLER-1495&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Clearly, this prevents quick elections in the case where the leader shard actor is down but its cluster node is still up and the leadership transfer was not successful. For example, this is the case if you send a PoisonPill to the leader shard actor, or if RaftActor&apos;s pauseLeader &lt;span class=&quot;error&quot;&gt;&amp;#91;1&amp;#93;&lt;/span&gt; method fails during a leadership transfer attempt after sending a Shutdown message to the leader shard actor (this can happen as part of the remove-shard-replica rpc operation).&lt;/p&gt;


&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;0&amp;#93;&lt;/span&gt; &lt;a href=&quot;https://logs.opendaylight.org/sandbox/jenkins091/controller-csit-3node-clustering-only-carbon-2nd/17/log.html.gz#s1-s2&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://logs.opendaylight.org/sandbox/jenkins091/controller-csit-3node-clustering-only-carbon-2nd/17/log.html.gz#s1-s2&lt;/a&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;1&amp;#93;&lt;/span&gt; &lt;a href=&quot;https://github.com/opendaylight/controller/blob/87551f3a44856d7494ffef678b63bb01a1b4ab2e/opendaylight/md-sal/sal-akka-raft/src/main/java/org/opendaylight/controller/cluster/raft/RaftActorLeadershipTransferCohort.java#L103&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/opendaylight/controller/blob/87551f3a44856d7494ffef678b63bb01a1b4ab2e/opendaylight/md-sal/sal-akka-raft/src/main/java/org/opendaylight/controller/cluster/raft/RaftActorLeadershipTransferCohort.java#L103&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="52334" author="tpantelis" created="Fri, 9 Jun 2017 13:32:38 +0000"  >&lt;p&gt;yes - I warned in the patch about the side effects of just blindly killing the shard in the tests with PoisonPill. On election timeout, followers check akka&apos;s cluster state - this is to avoid spurious elections if the leader shard is busy for a period of time. We want to avoid deposing the leader unless we can be reasonably sure the node is not available. Normally in production the leader node either is shut down or becomes unavailable due to a network partition. In either case, akka&apos;s cluster state will mark the node as unreachable/down, so followers will initiate a new election on timeout. On graceful leader shutdown, we try to transfer leadership to speed up the process.&lt;/p&gt;

&lt;p&gt;We do have an upper deadline to initiate a new election on election timeout as a fail-safe, as you mentioned. This would occur if the leader shard actor had a hard failure and couldn&apos;t recover, or if the process is hosed (low memory, out of memory, etc.), either of which should be uncommon.&lt;/p&gt;

&lt;p&gt;In a 3-node setup, what are you really trying to test by killing the leader shard actor? Why don&apos;t you just shut the leader node down, or kill the process, or remove the shard replica? If you really want to stop the shard while keeping the process up, then I would suggest stopping it gracefully so it transfers leadership.&lt;/p&gt;</comment>
                            <comment id="52335" author="tpantelis" created="Fri, 9 Jun 2017 13:41:29 +0000"  >&lt;p&gt;(In reply to Jakub Morvay from comment #4)&lt;/p&gt;

&lt;p&gt;&amp;gt; Clearly, this prevents quick elections in the case the leader shard actor is&lt;br/&gt;
&amp;gt; down but its cluster node is still up and the leadership transfer was not&lt;br/&gt;
&amp;gt; successful. For example, this is the case if you send PoisonPill to leader&lt;br/&gt;
&amp;gt; shard actor or if RaftActor&apos;s pauseLeader &lt;span class=&quot;error&quot;&gt;&amp;#91;1&amp;#93;&lt;/span&gt; method fails during leadership&lt;br/&gt;
&amp;gt; transfer attempt after sending Shutdown message to leader shard actor (this&lt;br/&gt;
&amp;gt; can happen as a part of remove-shard-replica rpc operation).&lt;br/&gt;
&amp;gt; &lt;/p&gt;

&lt;p&gt;yes - but a failure in pauseLeader or leadership transfer indicates follower unavailability (normally at least): either the followers are down or the leader is partitioned. For the latter, the followers would detect the partition as well and initiate a new election.&lt;/p&gt;</comment>
                            <comment id="52336" author="jmorvay@cisco.com" created="Fri, 9 Jun 2017 14:45:15 +0000"  >&lt;p&gt;(In reply to Tom Pantelis from comment #5)&lt;br/&gt;
&amp;gt; In a 3-node setup, what are you really trying to test by killing the leader&lt;br/&gt;
&amp;gt; shard actor? Why don&apos;t you just shut the leader node down, or kill the&lt;br/&gt;
&amp;gt; process, or remove the shard replica? If you really want stop the shard&lt;br/&gt;
&amp;gt; while keeping the process up then I would suggest stopping it gracefully so&lt;br/&gt;
&amp;gt; it transfers leadership.&lt;/p&gt;

&lt;p&gt;We have used the remove-shard-replica rpc for this and it is still used on releng. We are going to try to stop the shard gracefully in our tests; I&apos;m currently working on such rpcs. However, I believe this failure will still be present, because the pauseLeader method will fail.&lt;/p&gt;</comment>
                            <comment id="52337" author="jmorvay@cisco.com" created="Fri, 9 Jun 2017 14:58:17 +0000"  >&lt;p&gt;(In reply to Tom Pantelis from comment #6)&lt;br/&gt;
&amp;gt; yes - but a failure in pauseLeader or leadership transfer indicates follower&lt;br/&gt;
&amp;gt; unavailability (normally at least), either followers are down or the leader&lt;br/&gt;
&amp;gt; is partitioned. For the latter, the followers would detect the partition as&lt;br/&gt;
&amp;gt; well and initiate a new election.&lt;/p&gt;

&lt;p&gt;Well, is that true for a pauseLeader method &quot;timeout&quot;? If the leader cannot complete all pending tasks within the election timeout period, it just aborts the leadership transfer and does nothing (apart from shutting itself down, becoming non-voting, etc.). If we know the leader is stepping down anyway, shouldn&apos;t we send a TimeoutNow message to a random follower to speed things up, rather than wait 180 seconds for a new leader to emerge?&lt;/p&gt;</comment>
                            <comment id="52338" author="tpantelis" created="Fri, 9 Jun 2017 15:05:48 +0000"  >&lt;p&gt;(In reply to Jakub Morvay from comment #7)&lt;br/&gt;
&amp;gt; (In reply to Tom Pantelis from comment #5)&lt;br/&gt;
&amp;gt; &amp;gt; In a 3-node setup, what are you really trying to test by killing the leader&lt;br/&gt;
&amp;gt; &amp;gt; shard actor? Why don&apos;t you just shut the leader node down, or kill the&lt;br/&gt;
&amp;gt; &amp;gt; process, or remove the shard replica? If you really want stop the shard&lt;br/&gt;
&amp;gt; &amp;gt; while keeping the process up then I would suggest stopping it gracefully so&lt;br/&gt;
&amp;gt; &amp;gt; it transfers leadership.&lt;br/&gt;
&amp;gt; &lt;br/&gt;
&amp;gt; We have used remove-shard-replica rpc for this and it is still used on&lt;br/&gt;
&amp;gt; releng. We are going to try to stop shard gracefully in our tests, I&apos;m&lt;br/&gt;
&amp;gt; currently working on such rpcs. However I believe this failure will be still&lt;br/&gt;
&amp;gt; present, because pauseLeader method will fail.&lt;/p&gt;

&lt;p&gt;pauseLeader in Shard tries to wait for all pending transactions to complete within the election timeout interval. Why would you expect this to time out in the test suite?&lt;/p&gt;

&lt;p&gt;In any event, if you anticipate leadership transfer could fail then you can increase the expected deadline in the suite to take into account the potential 180 second timeout.&lt;/p&gt;</comment>
                            <comment id="52339" author="jmorvay@cisco.com" created="Fri, 9 Jun 2017 15:31:29 +0000"  >&lt;p&gt;(In reply to Tom Pantelis from comment #9) &lt;br/&gt;
&amp;gt; pauseLeader in Shard tries to wait for all pending transactions to complete&lt;br/&gt;
&amp;gt; within the election timeout interval. Why would you to expect this to time&lt;br/&gt;
&amp;gt; out in the test suite? &lt;br/&gt;
&amp;gt; &lt;br/&gt;
&amp;gt; In any event, if you anticipate leadership transfer could fail then you can&lt;br/&gt;
&amp;gt; increase the expected deadline in the suite to take into account the&lt;br/&gt;
&amp;gt; potential 180 second timeout.&lt;/p&gt;

&lt;p&gt;Yeah well, we have seen such a case, where the leader couldn&apos;t complete all txs within the election timeout interval.&lt;/p&gt;</comment>
                            <comment id="52340" author="tpantelis" created="Fri, 9 Jun 2017 15:40:33 +0000"  >&lt;p&gt;(In reply to Jakub Morvay from comment #8)&lt;br/&gt;
&amp;gt; (In reply to Tom Pantelis from comment #6)&lt;br/&gt;
&amp;gt; &amp;gt; yes - but a failure in pauseLeader or leadership transfer indicates follower&lt;br/&gt;
&amp;gt; &amp;gt; unavailability (normally at least), either followers are down or the leader&lt;br/&gt;
&amp;gt; &amp;gt; is partitioned. For the latter, the followers would detect the partition as&lt;br/&gt;
&amp;gt; &amp;gt; well and initiate a new election.&lt;br/&gt;
&amp;gt; &lt;br/&gt;
&amp;gt; Well, is that true for pauseLeader method &quot;timeout&quot;? If leader cannot&lt;br/&gt;
&amp;gt; complete all pending tasks in election timeout period, it just aborts the&lt;br/&gt;
&amp;gt; leadership transfer and do nothing (apart from shutting itself, become&lt;br/&gt;
&amp;gt; non-voting etc.). If we know, that leader is stepping down anyway, shouldn&apos;t&lt;br/&gt;
&amp;gt; we send TimeoutNow message to random follower to speed things up and not&lt;br/&gt;
&amp;gt; wait 180 seconds for new leader to emerge?&lt;/p&gt;

&lt;p&gt;Keep in mind that hitting the 180 second deadline should be very uncommon. You hit it b/c you blindly killed the leader actor ungracefully while keeping the process up, which is abnormal.&lt;/p&gt;

&lt;p&gt;If pauseLeader fails it most likely means it can&apos;t get consensus for a transaction, in which case leadership transfer would fail as well, since there aren&apos;t enough available nodes to elect a new leader. Of course it could time out for uncommon reasons, eg akka&apos;s transport is backed up, the followers are too busy or in a bad state, or the volume of transactions is large.&lt;/p&gt;

&lt;p&gt;Just sending TimeoutNow message to a random follower isn&apos;t ideal as the follower may not be able to collect the votes to become leader. This is why the leader first attempts to &quot;catch up&quot; at least one follower. &lt;/p&gt;

&lt;p&gt;If leadership transfer times out, could we just send TimeoutNow to all followers? We could, but there could be unwanted side effects. In the common case where a majority of followers isn&apos;t available, sending TimeoutNow probably won&apos;t help any. However, the TimeoutNow message may get queued in akka&apos;s transport layer and eventually be delivered at some point, which may not be desirable. The same could happen for the uncommon reasons above.&lt;/p&gt;</comment>
                            <comment id="52341" author="tpantelis" created="Fri, 9 Jun 2017 15:48:33 +0000"  >&lt;p&gt;(In reply to Jakub Morvay from comment #10)&lt;br/&gt;
&amp;gt; (In reply to Tom Pantelis from comment #9) &lt;br/&gt;
&amp;gt; &amp;gt; pauseLeader in Shard tries to wait for all pending transactions to complete&lt;br/&gt;
&amp;gt; &amp;gt; within the election timeout interval. Why would you to expect this to time&lt;br/&gt;
&amp;gt; &amp;gt; out in the test suite? &lt;br/&gt;
&amp;gt; &amp;gt; &lt;br/&gt;
&amp;gt; &amp;gt; In any event, if you anticipate leadership transfer could fail then you can&lt;br/&gt;
&amp;gt; &amp;gt; increase the expected deadline in the suite to take into account the&lt;br/&gt;
&amp;gt; &amp;gt; potential 180 second timeout.&lt;br/&gt;
&amp;gt; &lt;br/&gt;
&amp;gt; Yeah well, we have seen such case, where leader couldn&apos;t complete all txs&lt;br/&gt;
&amp;gt; within the election timeout interval.&lt;/p&gt;

&lt;p&gt;When you&apos;re testing scale you usually have to be more lenient with the test expectations and deadlines. The test VM environment could randomly cause delays wrt system and network resources and result in intermittent failures. Even if you forced a TimeoutNow message, if there&apos;s a large volume of pending transactions then the follower&apos;s queue will likely be backed up, so there&apos;s no guarantee when the TimeoutNow message would be delivered.&lt;/p&gt;</comment>
                            <comment id="52342" author="tpantelis" created="Sat, 10 Jun 2017 13:35:41 +0000"  >&lt;p&gt;Thinking about this some more, if pauseLeader times out, I think it does make sense to continue with leadership transfer instead of aborting. The shard may have a lot of transactions queued up which it can&apos;t finish in time, but there may still be a follower that is caught up (ie whose matchIndex equals the leader&apos;s lastIndex) or would be caught up if leadership transfer continued. The worst case is that no follower is available and the &quot;catch up&quot; phase of leadership transfer also times out, which would lengthen shutdown time, but that should be fine. I can look into that.&lt;/p&gt;

&lt;p&gt;As a last resort if leadership transfer times out, we could force the TimeoutNow message to &quot;active&quot; followers, ie those that have responded recently so should be up.&lt;/p&gt;</comment>
                            <comment id="52343" author="vrpolak" created="Mon, 12 Jun 2017 13:22:35 +0000"  >&lt;p&gt;&quot;Failed to transfer leadership&quot; still happens occasionally (&lt;span class=&quot;error&quot;&gt;&amp;#91;2&amp;#93;&lt;/span&gt; at 09:14:29,865). For example after remove-shard-replica with a single transaction-writer &lt;span class=&quot;error&quot;&gt;&amp;#91;3&amp;#93;&lt;/span&gt; (on a node separate from the leader of a module-based shard, tell-based protocol).&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;2&amp;#93;&lt;/span&gt; &lt;a href=&quot;https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-clustering-only-carbon/740/odl1_karaf.log.gz&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-clustering-only-carbon/740/odl1_karaf.log.gz&lt;/a&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;3&amp;#93;&lt;/span&gt; &lt;a href=&quot;https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-clustering-only-carbon/740/log.html.gz#s1-s20-t3-k2-k9&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-clustering-only-carbon/740/log.html.gz#s1-s20-t3-k2-k9&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="52344" author="vrpolak" created="Mon, 12 Jun 2017 13:34:40 +0000"  >&lt;p&gt;Also the original &quot;Leader not found&quot; symptom is occasionally seen &lt;span class=&quot;error&quot;&gt;&amp;#91;4&amp;#93;&lt;/span&gt;.&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;4&amp;#93;&lt;/span&gt; &lt;a href=&quot;https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-clustering-only-carbon/740/log.html.gz#s1-s36-t1-k2-k12-k1-k3-k1&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-clustering-only-carbon/740/log.html.gz#s1-s36-t1-k2-k12-k1-k3-k1&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="52345" author="tpantelis" created="Mon, 12 Jun 2017 13:35:19 +0000"  >&lt;p&gt;(In reply to Vratko Pol&#225;k from comment #14)&lt;br/&gt;
&amp;gt; &quot;Failed to transfer leadership&quot; still happens occasionally (&lt;span class=&quot;error&quot;&gt;&amp;#91;2&amp;#93;&lt;/span&gt; at&lt;br/&gt;
&amp;gt; 09:14:29,865). For example after remove-shard-replica with single&lt;br/&gt;
&amp;gt; transaction-writer &lt;span class=&quot;error&quot;&gt;&amp;#91;3&amp;#93;&lt;/span&gt; (on a node separate to leader of module-based shard,&lt;br/&gt;
&amp;gt; tell-based protocol).&lt;br/&gt;
&amp;gt; &lt;br/&gt;
&amp;gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;2&amp;#93;&lt;/span&gt;&lt;br/&gt;
&amp;gt; &lt;a href=&quot;https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-&lt;/a&gt;&lt;br/&gt;
&amp;gt; clustering-only-carbon/740/odl1_karaf.log.gz&lt;br/&gt;
&amp;gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;3&amp;#93;&lt;/span&gt;&lt;br/&gt;
&amp;gt; &lt;a href=&quot;https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-&lt;/a&gt;&lt;br/&gt;
&amp;gt; clustering-only-carbon/740/log.html.gz#s1-s20-t3-k2-k9&lt;/p&gt;

&lt;p&gt;No matter what we do, leadership transfer is best-effort and may not be able to complete successfully in time under load, for the reasons I&apos;ve mentioned. You have to take that into account in the tests with more lenient deadlines.&lt;/p&gt;</comment>
                            <comment id="52346" author="tpantelis" created="Mon, 12 Jun 2017 19:00:11 +0000"  >&lt;p&gt;Submitted &lt;a href=&quot;https://git.opendaylight.org/gerrit/#/c/58740/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://git.opendaylight.org/gerrit/#/c/58740/&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10000">
                    <name>Blocks</name>
                                            <outwardlinks description="blocks">
                                        <issuelink>
            <issuekey id="26247">CONTROLLER-1693</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is blocked by">
                                        <issuelink>
            <issuekey id="26257">CONTROLLER-1703</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                            <customfield id="customfield_11400" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10208" key="com.atlassian.jira.plugin.system.customfieldtypes:textfield">
                        <customfieldname>External issue ID</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>8606</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10201" key="com.atlassian.jira.plugin.system.customfieldtypes:url">
                        <customfieldname>External issue URL</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[https://bugs.opendaylight.org/show_bug.cgi?id=8606]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10206" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Issue Type</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10300"><![CDATA[Bug]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10000" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>0|i02s87:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>