[CONTROLLER-1696] tell-based-protocol: Attempting to reconnect a connecting connection Created: 23/May/17  Updated: 25/Jul/23  Resolved: 30/May/17

Status: Resolved
Project: controller
Component/s: clustering
Affects Version/s: None
Fix Version/s: None

Type: Bug
Reporter: Robert Varga Assignee: Robert Varga
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 8540

 Description   

CSIT testing has shown that we can end up trying to reconnect a connection which is already connecting:

2017-05-23 12:25:46,658 | DEBUG | ult-dispatcher-5 | AbstractClientConnection | 197 - org.opendaylight.controller.cds-access-client - 1.1.0.SNAPSHOT | Connection ConnectingClientConnection{client=ClientIdentifier{frontend=member-2-frontend-datastore-config, generation=0}, cookie=0} has not seen activity from backend for 30032074900 nanoseconds, timing out
2017-05-23 12:25:46,659 | ERROR | ult-dispatcher-5 | OneForOneStrategy | 174 - com.typesafe.akka.slf4j - 2.4.17 | Attempted to reconnect a connecting connection
java.lang.UnsupportedOperationException: Attempted to reconnect a connecting connection
at org.opendaylight.controller.cluster.access.client.ConnectingClientConnection.lockedReconnect(ConnectingClientConnection.java:35)[197:org.opendaylight.controller.cds-access-client:1.1.0.SNAPSHOT]
at org.opendaylight.controller.cluster.access.client.AbstractClientConnection.runTimer(AbstractClientConnection.java:267)[197:org.opendaylight.controller.cds-access-client:1.1.0.SNAPSHOT]
at org.opendaylight.controller.cluster.access.client.ClientActorBehavior.onReceiveCommand(ClientActorBehavior.java:120)[197:org.opendaylight.controller.cds-access-client:1.1.0.SNAPSHOT]
at org.opendaylight.controller.cluster.access.client.ClientActorBehavior.onReceiveCommand(ClientActorBehavior.java:44)[197:org.opendaylight.controller.cds-access-client:1.1.0.SNAPSHOT]
at org.opendaylight.controller.cluster.access.client.AbstractClientActor.onReceiveCommand(AbstractClientActor.java:59)[197:org.opendaylight.controller.cds-access-client:1.1.0.SNAPSHOT]
at akka.persistence.UntypedPersistentActor.onReceive(PersistentActor.scala:170)[180:com.typesafe.akka.persistence:2.4.17]
at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165)[173:com.typesafe.akka.actor:2.4.17]
at akka.actor.Actor$class.aroundReceive(Actor.scala:497)[173:com.typesafe.akka.actor:2.4.17]
at akka.persistence.UntypedPersistentActor.akka$persistence$Eventsourced$$super$aroundReceive(PersistentActor.scala:168)[180:com.typesafe.akka.persistence:2.4.17]
at akka.persistence.Eventsourced$$anon$1.stateReceive(Eventsourced.scala:664)[180:com.typesafe.akka.persistence:2.4.17]
at akka.persistence.Eventsourced$class.aroundReceive(Eventsourced.scala:183)[180:com.typesafe.akka.persistence:2.4.17]
at akka.persistence.UntypedPersistentActor.aroundReceive(PersistentActor.scala:168)[180:com.typesafe.akka.persistence:2.4.17]
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)[173:com.typesafe.akka.actor:2.4.17]
at akka.actor.ActorCell.invoke(ActorCell.scala:495)[173:com.typesafe.akka.actor:2.4.17]
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)[173:com.typesafe.akka.actor:2.4.17]
at akka.dispatch.Mailbox.run(Mailbox.scala:224)[173:com.typesafe.akka.actor:2.4.17]
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)[173:com.typesafe.akka.actor:2.4.17]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)[169:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)[169:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)[169:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)[169:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
Caused by: org.opendaylight.controller.cluster.access.concepts.RuntimeRequestException: Backend connection timed out
... 20 more
Caused by: java.util.concurrent.TimeoutException
... 20 more

This happens because the timer keeps running (to service request timeouts) but does not take into account that a connecting connection is not actually transmitting. Since nothing is sent, there is no way for us to hear from the leader, and the backend inactivity check eventually trips.
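
The failure mode can be illustrated with a minimal, self-contained sketch. It is not the cds-access-client code and not necessarily what the linked change does: all class, method and constant names below are hypothetical, and the 30 second timeout is an assumption inferred from the 30032074900 nanoseconds reported in the log. The sketch assumes one possible remedy, namely that a connecting connection reports zero backend silence so that the shared timer never attempts to reconnect it.

```java
import java.util.concurrent.TimeUnit;

// Illustrative sketch only; names are hypothetical and do not mirror the real classes.
public class ConnectionTimerSketch {
    // Assumed backend inactivity timeout (the log shows a trip at roughly 30 s).
    static final long BACKEND_ALIVE_TIMEOUT_NANOS = TimeUnit.SECONDS.toNanos(30);

    abstract static class Connection {
        private final long lastReceivedNanos;

        Connection(final long lastReceivedNanos) {
            this.lastReceivedNanos = lastReceivedNanos;
        }

        // How long the backend has been silent, as seen by the timer.
        long backendSilentNanos(final long now) {
            return now - lastReceivedNanos;
        }

        abstract void reconnect();

        // Shared timer body: returns true if a reconnect was triggered.
        final boolean runTimer(final long now) {
            if (backendSilentNanos(now) >= BACKEND_ALIVE_TIMEOUT_NANOS) {
                reconnect();
                return true;
            }
            return false;
        }
    }

    // A connection that is transmitting: backend silence is meaningful.
    static final class ConnectedConnection extends Connection {
        boolean reconnected;

        ConnectedConnection(final long lastReceivedNanos) {
            super(lastReceivedNanos);
        }

        @Override
        void reconnect() {
            reconnected = true;
        }
    }

    // A connection that is still connecting: nothing is transmitted, so the
    // backend cannot answer and silence must not count as a timeout.
    static final class ConnectingConnection extends Connection {
        ConnectingConnection(final long lastReceivedNanos) {
            super(lastReceivedNanos);
        }

        @Override
        long backendSilentNanos(final long now) {
            return 0; // assumption: suppress the inactivity check while connecting
        }

        @Override
        void reconnect() {
            throw new UnsupportedOperationException("Attempted to reconnect a connecting connection");
        }
    }

    public static void main(final String[] args) {
        final long now = TimeUnit.SECONDS.toNanos(31);
        System.out.println("connected reconnects: " + new ConnectedConnection(0L).runTimer(now));
        System.out.println("connecting reconnects: " + new ConnectingConnection(0L).runTimer(now));
    }
}
```

Without the `backendSilentNanos` override in the hypothetical `ConnectingConnection`, the same `runTimer` call would reach `reconnect()` and throw the `UnsupportedOperationException` seen in the trace above.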



 Comments   
Comment by Robert Varga [ 23/May/17 ]

https://git.opendaylight.org/gerrit/57722
