[INTDIST-64] All 3-node jobs broken in integration due to "Use Akka artery for remote transport" Created: 09/Jan/17  Updated: 20/Oct/17  Resolved: 11/Jan/17

Status: Resolved
Project: integration-distribution
Component/s: Scripts
Affects Version/s: unspecified
Fix Version/s: None

Type: Bug
Reporter: Peter Gubka Assignee: Ravit Peretz
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 7493

 Description   

Based on the information I received, the patch set https://git.opendaylight.org/gerrit/#/c/49466/ prevents Akka from creating/synchronizing the cluster in 3-node jobs.

All 3-node jobs in https://jenkins.opendaylight.org/releng/view/controller/
or https://jenkins.opendaylight.org/releng/view/bgpcep/ are failing, and the karaf log contains:

2017-01-06 22:57:36,630 | ERROR | lt-dispatcher-16 | ClusterActorRefProvider | 202 - com.typesafe.akka.slf4j - 2.4.16 | No root guardian at [akka.tcp://opendaylight-cluster-data@10.29.12.92:2550]
java.lang.IllegalArgumentException: Wrong protocol of [akka.tcp://opendaylight-cluster-data@10.29.12.92:2550/], expected [akka]
at akka.remote.RemoteActorRef.<init>(RemoteActorRefProvider.scala:504)[214:com.typesafe.akka.remote:2.4.16]
at akka.remote.RemoteActorRefProvider.rootGuardianAt(RemoteActorRefProvider.scala:367)[214:com.typesafe.akka.remote:2.4.16]
at akka.actor.ActorRefFactory$class.actorSelection(ActorRefProvider.scala:356)[201:com.typesafe.akka.actor:2.4.16]
at akka.actor.ActorCell.actorSelection(ActorCell.scala:374)[201:com.typesafe.akka.actor:2.4.16]
at akka.cluster.JoinSeedNodeProcess$$anonfun$receive$4$$anonfun$applyOrElse$1.applyOrElse(ClusterDaemon.scala:1271)[212:com.typesafe.akka.cluster:2.4.16]
at akka.cluster.JoinSeedNodeProcess$$anonfun$receive$4$$anonfun$applyOrElse$1.applyOrElse(ClusterDaemon.scala:1270)[212:com.typesafe.akka.cluster:2.4.16]
at scala.PartialFunction$$anonfun$runWith$1.apply(PartialFunction.scala:141)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.PartialFunction$$anonfun$runWith$1.apply(PartialFunction.scala:140)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.collection.Iterator$class.foreach(Iterator.scala:893)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.collection.TraversableLike$class.collect(TraversableLike.scala:271)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.collection.AbstractTraversable.collect(Traversable.scala:104)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at akka.cluster.JoinSeedNodeProcess$$anonfun$receive$4.applyOrElse(ClusterDaemon.scala:1270)[212:com.typesafe.akka.cluster:2.4.16]
at akka.actor.Actor$class.aroundReceive(Actor.scala:496)[201:com.typesafe.akka.actor:2.4.16]
at akka.cluster.JoinSeedNodeProcess.aroundReceive(ClusterDaemon.scala:1254)[212:com.typesafe.akka.cluster:2.4.16]
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)[201:com.typesafe.akka.actor:2.4.16]
at akka.actor.ActorCell.invoke(ActorCell.scala:495)[201:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)[201:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.Mailbox.run(Mailbox.scala:224)[201:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)[201:com.typesafe.akka.actor:2.4.16]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
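
The exception above shows a peer address with the `akka.tcp` scheme being rejected because the transport expects `akka`, which is the scheme Artery uses. A minimal sketch of rewriting such addresses, assuming the stale scheme comes from seed-node entries in an akka.conf written by a test script. The helper name and the sample line are hypothetical and not taken from the integration scripts:

```python
import re

# Matches the classic-remoting scheme at the start of an actor-system address.
# Hypothetical helper, not part of any OpenDaylight script.
_CLASSIC_SCHEME = re.compile(r"akka\.tcp://")


def to_artery_scheme(conf_text: str) -> str:
    """Return conf_text with every akka.tcp:// address rewritten to akka://."""
    return _CLASSIC_SCHEME.sub("akka://", conf_text)


# Assumed shape of a seed-nodes line; the address is the one from the log above.
sample = 'seed-nodes = ["akka.tcp://opendaylight-cluster-data@10.29.12.92:2550"]'
print(to_artery_scheme(sample))
# prints: seed-nodes = ["akka://opendaylight-cluster-data@10.29.12.92:2550"]
```

Text that already uses `akka://` passes through unchanged, so the rewrite is safe to apply more than once.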



 Comments   
Comment by Peter Gubka [ 09/Jan/17 ]

Lowering importance from blocker to normal, as the CSIT jobs manipulate akka.conf files and this manipulation may have an unintended impact. To be verified on the "test" side first.

Comment by Ravit Peretz [ 09/Jan/17 ]

Please review:
https://git.opendaylight.org/gerrit/50129

Comment by Vladimir Lavor [ 10/Jan/17 ]

I tried today with the latest changes, but I'm getting this error on every node, for every other cluster machine:

2017-01-10 13:17:05,891 | ERROR | lt-dispatcher-22 | kka://opendaylight-cluster-data) | 204 - com.typesafe.akka.slf4j - 2.4.16 | Outbound message stream to [akka://opendaylight-cluster-data@192.168.1.6:2550] failed. Restarting it. Handshake with [akka://opendaylight-cluster-data@192.168.1.6:2550] did not complete within 20000 ms
akka.remote.artery.OutboundHandshake$HandshakeTimeoutException: Handshake with [akka://opendaylight-cluster-data@192.168.1.6:2550] did not complete within 20000 ms
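
To see which peers are affected across a long karaf log, the timeout lines can be tallied per peer. A rough sketch, assuming the message format matches the line quoted above; the function name is hypothetical:

```python
import re
from collections import Counter

# Pattern for the Artery handshake-timeout message quoted above.
_TIMEOUT = re.compile(
    r"Handshake with \[akka://(?P<system>[^@\]]+)@(?P<host>[^:\]]+):(?P<port>\d+)\] "
    r"did not complete within (?P<ms>\d+) ms"
)


def count_handshake_timeouts(log_lines):
    """Count handshake-timeout lines per peer 'host:port' (one count per line)."""
    counts = Counter()
    for line in log_lines:
        match = _TIMEOUT.search(line)
        if match:
            counts[f"{match['host']}:{match['port']}"] += 1
    return counts


lines = [
    "Outbound message stream to [akka://opendaylight-cluster-data@192.168.1.6:2550] "
    "failed. Restarting it. Handshake with "
    "[akka://opendaylight-cluster-data@192.168.1.6:2550] did not complete within 20000 ms",
    "some unrelated log line",
]
print(dict(count_handshake_timeouts(lines)))
# prints: {'192.168.1.6:2550': 1}
```

Because both the ERROR line and the exception line carry the same message, a real log would typically count each failure twice.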

Full logs here:
Node 192.168.1.6: http://pastebin.com/CRCYDiEQ
Node 192.168.1.8: http://pastebin.com/F0psBYw7
Node 192.168.1.7: http://pastebin.com/1WGqYzs4

Comment by Vratko Polak [ 10/Jan/17 ]

Distribution is currently in agreement with the Controller akka.conf,
so I slightly prefer opening a new Bug against Controller (as opposed to adding more symptoms to this one).

Comment by Vratko Polak [ 11/Jan/17 ]

> opening a new Bug against Controller

https://bugs.opendaylight.org/show_bug.cgi?id=7518
