Details
-
Bug
-
Status: Resolved
-
Resolution: Done
-
Operating System: All
Platform: All
-
8540
Description
CSIT testing has shown that we can end up trying to reconnect a connection which is still connecting:
2017-05-23 12:25:46,658 | DEBUG | ult-dispatcher-5 | AbstractClientConnection | 197 - org.opendaylight.controller.cds-access-client - 1.1.0.SNAPSHOT | Connection ConnectingClientConnection{client=ClientIdentifier{frontend=member-2-frontend-datastore-config, generation=0}, cookie=0} has not seen activity from backend for 30032074900 nanoseconds, timing out
2017-05-23 12:25:46,659 | ERROR | ult-dispatcher-5 | OneForOneStrategy | 174 - com.typesafe.akka.slf4j - 2.4.17 | Attempted to reconnect a connecting connection
java.lang.UnsupportedOperationException: Attempted to reconnect a connecting connection
at org.opendaylight.controller.cluster.access.client.ConnectingClientConnection.lockedReconnect(ConnectingClientConnection.java:35)[197:org.opendaylight.controller.cds-access-client:1.1.0.SNAPSHOT]
at org.opendaylight.controller.cluster.access.client.AbstractClientConnection.runTimer(AbstractClientConnection.java:267)[197:org.opendaylight.controller.cds-access-client:1.1.0.SNAPSHOT]
at org.opendaylight.controller.cluster.access.client.ClientActorBehavior.onReceiveCommand(ClientActorBehavior.java:120)[197:org.opendaylight.controller.cds-access-client:1.1.0.SNAPSHOT]
at org.opendaylight.controller.cluster.access.client.ClientActorBehavior.onReceiveCommand(ClientActorBehavior.java:44)[197:org.opendaylight.controller.cds-access-client:1.1.0.SNAPSHOT]
at org.opendaylight.controller.cluster.access.client.AbstractClientActor.onReceiveCommand(AbstractClientActor.java:59)[197:org.opendaylight.controller.cds-access-client:1.1.0.SNAPSHOT]
at akka.persistence.UntypedPersistentActor.onReceive(PersistentActor.scala:170)[180:com.typesafe.akka.persistence:2.4.17]
at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165)[173:com.typesafe.akka.actor:2.4.17]
at akka.actor.Actor$class.aroundReceive(Actor.scala:497)[173:com.typesafe.akka.actor:2.4.17]
at akka.persistence.UntypedPersistentActor.akka$persistence$Eventsourced$$super$aroundReceive(PersistentActor.scala:168)[180:com.typesafe.akka.persistence:2.4.17]
at akka.persistence.Eventsourced$$anon$1.stateReceive(Eventsourced.scala:664)[180:com.typesafe.akka.persistence:2.4.17]
at akka.persistence.Eventsourced$class.aroundReceive(Eventsourced.scala:183)[180:com.typesafe.akka.persistence:2.4.17]
at akka.persistence.UntypedPersistentActor.aroundReceive(PersistentActor.scala:168)[180:com.typesafe.akka.persistence:2.4.17]
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)[173:com.typesafe.akka.actor:2.4.17]
at akka.actor.ActorCell.invoke(ActorCell.scala:495)[173:com.typesafe.akka.actor:2.4.17]
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)[173:com.typesafe.akka.actor:2.4.17]
at akka.dispatch.Mailbox.run(Mailbox.scala:224)[173:com.typesafe.akka.actor:2.4.17]
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)[173:com.typesafe.akka.actor:2.4.17]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)[169:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)[169:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)[169:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)[169:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
Caused by: org.opendaylight.controller.cluster.access.concepts.RuntimeRequestException: Backend connection timed out
... 20 more
Caused by: java.util.concurrent.TimeoutException
... 20 more
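This happens because the timer keeps running (to service request timeouts) but does not take into account that a connecting connection is not actually transmitting, so there is no way for it to hear from the leader. The expected silence from the backend is then treated as a timeout, which triggers the reconnect attempt on a connection that is still connecting.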
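One way to picture the missing check is the minimal sketch below. It is not the cds-access-client code: the class TimerSketch, the method onTimer, the transmitting flag and the Action enum are all hypothetical names made up for illustration, and the 30-second threshold is only inferred from the "30032074900 nanoseconds" figure in the log above.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; none of these names exist in cds-access-client. */
final class TimerSketch {
    /** Possible outcomes of a single timer run. */
    enum Action { NONE, RECONNECT }

    // Assumed threshold, inferred from the roughly 30 s of silence reported in the log.
    static final long BACKEND_ALIVE_TIMEOUT_NANOS = TimeUnit.SECONDS.toNanos(30);

    /**
     * Decide what the timer should do for one connection.
     *
     * @param transmitting   whether the connection is actually sending requests to a backend
     * @param lastHeardNanos time at which the backend was last heard from
     * @param nowNanos       current time, on the same clock as lastHeardNanos
     */
    static Action onTimer(boolean transmitting, long lastHeardNanos, long nowNanos) {
        if (!transmitting) {
            // Nothing has been sent, so silence from the backend is expected
            // and must not be interpreted as a backend timeout.
            return Action.NONE;
        }
        return nowNanos - lastHeardNanos >= BACKEND_ALIVE_TIMEOUT_NANOS
                ? Action.RECONNECT
                : Action.NONE;
    }
}
```

With such a guard, a connection that is still connecting would never reach the reconnect path from the timer, while a connected one that has been silent for the full timeout still would. Whether the actual fix takes this shape is not stated in this report.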