bgpcep / BGPCEP-878

BGP does not reconnect after partitioned cluster heals


      Steps to reproduce:

      1. 3-node cluster, with the bgp and open-config shards local
      2. BGP connection is established to node1
      3. node1 gets isolated from node2 and node3
      4. The BGP connection drops
      5. Isolation is removed and node1 rejoins the cluster
      6. The BGP connection is never re-established
      7. The NPE below is seen in karaf.log:
      2019-08-20T19:35:34,739 | INFO  | opendaylight-cluster-data-shard-dispatcher-46 | ShardManager                     | 282 - org.opendaylight.controller.sal-distributed-datastore - 1.8.1 | shard-manager-operational Received follower initial sync status for member-1-shard-default-operational status sync done true
      2019-08-20T19:35:34,748 | WARN  | opendaylight-cluster-data-akka.actor.default-dispatcher-52 | ClusterSingletonServiceGroupImpl | 335 - org.opendaylight.mdsal.singleton-dom-impl - 2.5.1 | Service group bgp-rib-service-group service org.opendaylight.protocol.bgp.rib.impl.config.BGPClusterSingletonService@17397d1 failed to start, attempting to continue
      java.lang.NullPointerException: null
              at org.opendaylight.protocol.bgp.rib.impl.AdjRibInWriter.transform(AdjRibInWriter.java:149) ~[223:org.opendaylight.bgpcep.bgp-rib-impl:0.10.1]
              at org.opendaylight.protocol.bgp.rib.impl.ApplicationPeer.instantiateServiceInstance(ApplicationPeer.java:154) ~[223:org.opendaylight.bgpcep.bgp-rib-impl:0.10.1]
              at org.opendaylight.protocol.bgp.rib.impl.config.AppPeer$BgpAppPeerSingletonService.instantiateServiceInstance(AppPeer.java:135) ~[223:org.opendaylight.bgpcep.bgp-rib-impl:0.10.1]
              at org.opendaylight.protocol.bgp.rib.impl.config.AppPeer.instantiateServiceInstance(AppPeer.java:88) ~[223:org.opendaylight.bgpcep.bgp-rib-impl:0.10.1]
              at java.util.HashMap$Values.forEach(HashMap.java:981) [?:?]
              at org.opendaylight.protocol.bgp.rib.impl.config.BGPClusterSingletonService.instantiateServiceInstance(BGPClusterSingletonService.java:98) [223:org.opendaylight.bgpcep.bgp-rib-impl:0.10.1]
              at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.ensureServicesStarting(ClusterSingletonServiceGroupImpl.java:636) [335:org.opendaylight.mdsal.singleton-dom-impl:2.5.1]
              at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.tryReconcileState(ClusterSingletonServiceGroupImpl.java:563) [335:org.opendaylight.mdsal.singleton-dom-impl:2.5.1]
              at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.reconcileState(ClusterSingletonServiceGroupImpl.java:458) [335:org.opendaylight.mdsal.singleton-dom-impl:2.5.1]
              at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.ownershipChanged(ClusterSingletonServiceGroupImpl.java:339) [335:org.opendaylight.mdsal.singleton-dom-impl:2.5.1]
              at org.opendaylight.mdsal.singleton.dom.impl.AbstractClusterSingletonServiceProviderImpl.ownershipChanged(AbstractClusterSingletonServiceProviderImpl.java:238) [335:org.opendaylight.mdsal.singleton-dom-impl:2.5.1]
              at org.opendaylight.mdsal.singleton.dom.impl.DOMClusterSingletonServiceProviderImpl.ownershipChanged(DOMClusterSingletonServiceProviderImpl.java:23) [335:org.opendaylight.mdsal.singleton-dom-impl:2.5.1]
              at org.opendaylight.controller.cluster.datastore.entityownership.EntityOwnershipListenerActor.onEntityOwnershipChanged(EntityOwnershipListenerActor.java:44) [282:org.opendaylight.controller.sal-distributed-datastore:1.8.1]
              at org.opendaylight.controller.cluster.datastore.entityownership.EntityOwnershipListenerActor.handleReceive(EntityOwnershipListenerActor.java:33) [282:org.opendaylight.controller.sal-distributed-datastore:1.8.1]
              at org.opendaylight.controller.cluster.common.actor.AbstractUntypedActor.onReceive(AbstractUntypedActor.java:38) [274:org.opendaylight.controller.sal-clustering-commons:1.8.1]
              at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167) [37:com.typesafe.akka.actor:2.5.11]
              at akka.actor.Actor.aroundReceive(Actor.scala:517) [37:com.typesafe.akka.actor:2.5.11]
              at akka.actor.Actor.aroundReceive$(Actor.scala:515) [37:com.typesafe.akka.actor:2.5.11]
              at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:97) [37:com.typesafe.akka.actor:2.5.11]
              at akka.actor.ActorCell.receiveMessage(ActorCell.scala:590) [37:com.typesafe.akka.actor:2.5.11]
              at akka.actor.ActorCell.invoke(ActorCell.scala:559) [37:com.typesafe.akka.actor:2.5.11]
              at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257) [37:com.typesafe.akka.actor:2.5.11]
              at akka.dispatch.Mailbox.run(Mailbox.scala:224) [37:com.typesafe.akka.actor:2.5.11]
              at akka.dispatch.Mailbox.exec(Mailbox.scala:234) [37:com.typesafe.akka.actor:2.5.11]
              at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [37:com.typesafe.akka.actor:2.5.11]
              at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [37:com.typesafe.akka.actor:2.5.11]
              at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [37:com.typesafe.akka.actor:2.5.11]
              at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [37:com.typesafe.akka.actor:2.5.11] 
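
In the trace above, the NPE propagates out of a HashMap values forEach call inside BGPClusterSingletonService.instantiateServiceInstance, and the service group then logs "failed to start, attempting to continue". The sketch below only illustrates the general Java behaviour this implies: an unchecked exception thrown for one element aborts forEach, so later elements are never visited. It is not the bgpcep code. The class, the PeerService interface, and the peer names are all hypothetical, and the report does not confirm that this is why the connection is not re-established.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: none of these types come from bgpcep or mdsal.
class ForEachAbortSketch {

    /** Hypothetical stand-in for a per-peer service started by a service group. */
    interface PeerService {
        void instantiateServiceInstance();
    }

    /** Builds three hypothetical peers; the second one throws an NPE on start. */
    static Map<String, PeerService> samplePeers(List<String> started) {
        // LinkedHashMap keeps insertion order so the result is deterministic;
        // a plain HashMap gives no ordering guarantee.
        Map<String, PeerService> peers = new LinkedHashMap<>();
        peers.put("peer-a", () -> started.add("peer-a"));
        peers.put("peer-b", () -> { throw new NullPointerException(); });
        peers.put("peer-c", () -> started.add("peer-c"));
        return peers;
    }

    /** Starts peers with a single forEach; returns the names that started. */
    static List<String> startAllOrAbort() {
        List<String> started = new ArrayList<>();
        try {
            samplePeers(started).values().forEach(PeerService::instantiateServiceInstance);
        } catch (NullPointerException e) {
            // Mirrors a caller that logs the failure and carries on.
        }
        return started;
    }

    /** Variant that isolates each peer's failure so later peers still start. */
    static List<String> startEachIndependently() {
        List<String> started = new ArrayList<>();
        for (PeerService peer : samplePeers(started).values()) {
            try {
                peer.instantiateServiceInstance();
            } catch (RuntimeException e) {
                // Record or log the failure, then continue with the next peer.
            }
        }
        return started;
    }

    public static void main(String[] args) {
        System.out.println(startAllOrAbort());        // prints [peer-a]
        System.out.println(startEachIndependently()); // prints [peer-a, peer-c]
    }
}
```

The second method shows one possible way to keep a single failing peer from blocking the others. Whether that would be an appropriate fix here depends on why AdjRibInWriter.transform hits a null after the partition heals, which the log alone does not show.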

            ajayslele Ajay Lele