[BGPCEP-878] BGP does not reconnect after partitioned cluster heals Created: 28/Aug/19 Updated: 14/Dec/19 Resolved: 27/Nov/19 |
|
| Status: | Resolved |
| Project: | bgpcep |
| Component/s: | BGP |
| Affects Version/s: | None |
| Fix Version/s: | Neon SR3, Magnesium, Sodium SR1 |
| Type: | Bug | Priority: | Medium |
| Reporter: | Ajay Lele | Assignee: | Ajay Lele |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
Steps are as follows:
2019-08-20T19:35:34,739 | INFO | opendaylight-cluster-data-shard-dispatcher-46 | ShardManager | 282 - org.opendaylight.controller.sal-distributed-datastore - 1.8.1 | shard-manager-operational Received follower initial sync status for member-1-shard-default-operational status sync done true
2019-08-20T19:35:34,748 | WARN | opendaylight-cluster-data-akka.actor.default-dispatcher-52 | ClusterSingletonServiceGroupImpl | 335 - org.opendaylight.mdsal.singleton-dom-impl - 2.5.1 | Service group bgp-rib-service-group service org.opendaylight.protocol.bgp.rib.impl.config.BGPClusterSingletonService@17397d1 failed to start, attempting to continue
java.lang.NullPointerException: null
at org.opendaylight.protocol.bgp.rib.impl.AdjRibInWriter.transform(AdjRibInWriter.java:149) ~[223:org.opendaylight.bgpcep.bgp-rib-impl:0.10.1]
at org.opendaylight.protocol.bgp.rib.impl.ApplicationPeer.instantiateServiceInstance(ApplicationPeer.java:154) ~[223:org.opendaylight.bgpcep.bgp-rib-impl:0.10.1]
at org.opendaylight.protocol.bgp.rib.impl.config.AppPeer$BgpAppPeerSingletonService.instantiateServiceInstance(AppPeer.java:135) ~[223:org.opendaylight.bgpcep.bgp-rib-impl:0.10.1]
at org.opendaylight.protocol.bgp.rib.impl.config.AppPeer.instantiateServiceInstance(AppPeer.java:88) ~[223:org.opendaylight.bgpcep.bgp-rib-impl:0.10.1]
at java.util.HashMap$Values.forEach(HashMap.java:981) [?:?]
at org.opendaylight.protocol.bgp.rib.impl.config.BGPClusterSingletonService.instantiateServiceInstance(BGPClusterSingletonService.java:98) [223:org.opendaylight.bgpcep.bgp-rib-impl:0.10.1]
at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.ensureServicesStarting(ClusterSingletonServiceGroupImpl.java:636) [335:org.opendaylight.mdsal.singleton-dom-impl:2.5.1]
at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.tryReconcileState(ClusterSingletonServiceGroupImpl.java:563) [335:org.opendaylight.mdsal.singleton-dom-impl:2.5.1]
at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.reconcileState(ClusterSingletonServiceGroupImpl.java:458) [335:org.opendaylight.mdsal.singleton-dom-impl:2.5.1]
at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.ownershipChanged(ClusterSingletonServiceGroupImpl.java:339) [335:org.opendaylight.mdsal.singleton-dom-impl:2.5.1]
at org.opendaylight.mdsal.singleton.dom.impl.AbstractClusterSingletonServiceProviderImpl.ownershipChanged(AbstractClusterSingletonServiceProviderImpl.java:238) [335:org.opendaylight.mdsal.singleton-dom-impl:2.5.1]
at org.opendaylight.mdsal.singleton.dom.impl.DOMClusterSingletonServiceProviderImpl.ownershipChanged(DOMClusterSingletonServiceProviderImpl.java:23) [335:org.opendaylight.mdsal.singleton-dom-impl:2.5.1]
at org.opendaylight.controller.cluster.datastore.entityownership.EntityOwnershipListenerActor.onEntityOwnershipChanged(EntityOwnershipListenerActor.java:44) [282:org.opendaylight.controller.sal-distributed-datastore:1.8.1]
at org.opendaylight.controller.cluster.datastore.entityownership.EntityOwnershipListenerActor.handleReceive(EntityOwnershipListenerActor.java:33) [282:org.opendaylight.controller.sal-distributed-datastore:1.8.1]
at org.opendaylight.controller.cluster.common.actor.AbstractUntypedActor.onReceive(AbstractUntypedActor.java:38) [274:org.opendaylight.controller.sal-clustering-commons:1.8.1]
at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167) [37:com.typesafe.akka.actor:2.5.11]
at akka.actor.Actor.aroundReceive(Actor.scala:517) [37:com.typesafe.akka.actor:2.5.11]
at akka.actor.Actor.aroundReceive$(Actor.scala:515) [37:com.typesafe.akka.actor:2.5.11]
at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:97) [37:com.typesafe.akka.actor:2.5.11]
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:590) [37:com.typesafe.akka.actor:2.5.11]
at akka.actor.ActorCell.invoke(ActorCell.scala:559) [37:com.typesafe.akka.actor:2.5.11]
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257) [37:com.typesafe.akka.actor:2.5.11]
at akka.dispatch.Mailbox.run(Mailbox.scala:224) [37:com.typesafe.akka.actor:2.5.11]
at akka.dispatch.Mailbox.exec(Mailbox.scala:234) [37:com.typesafe.akka.actor:2.5.11]
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [37:com.typesafe.akka.actor:2.5.11]
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [37:com.typesafe.akka.actor:2.5.11]
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [37:com.typesafe.akka.actor:2.5.11]
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [37:com.typesafe.akka.actor:2.5.11]
|
| Comments |
| Comment by Ajay Lele [ 14/Dec/19 ] |
|
The change made to BgpPeer.class in the original commit causes the following regression in a CSIT test. Further investigation is in progress.
2019-12-13T03:32:33,408 | ERROR | opendaylight-cluster-data-notification-dispatcher-51 | DataTreeChangeListenerActor | 292 - org.opendaylight.controller.sal-clustering-commons - 1.9.3 | member-1-shard-default-config: Error notifying listener org.opendaylight.protocol.bgp.rib.impl.config.BgpDeployerImpl@70bb4e76
java.lang.IllegalStateException: Previous peer instance was not closed.
at com.google.common.base.Preconditions.checkState(Preconditions.java:507) ~[36:com.google.guava:25.1.0.jre]
at org.opendaylight.protocol.bgp.rib.impl.config.BgpPeer.start(BgpPeer.java:137) ~[238:org.opendaylight.bgpcep.bgp-rib-impl:0.11.3]
at org.opendaylight.protocol.bgp.rib.impl.config.BgpPeer.restart(BgpPeer.java:149) ~[238:org.opendaylight.bgpcep.bgp-rib-impl:0.11.3]
at org.opendaylight.protocol.bgp.rib.impl.config.BGPClusterSingletonService.restartNeighbors(BGPClusterSingletonService.java:367) ~[238:org.opendaylight.bgpcep.bgp-rib-impl:0.11.3]
at org.opendaylight.protocol.bgp.rib.impl.config.BgpDeployerImpl.lambda$rebootNeighbors$6(BgpDeployerImpl.java:209) ~[238:org.opendaylight.bgpcep.bgp-rib-impl:0.11.3]
at java.util.HashMap$Values.forEach(HashMap.java:981) ~[?:?]
at org.opendaylight.protocol.bgp.rib.impl.config.BgpDeployerImpl.rebootNeighbors(BgpDeployerImpl.java:209) ~[238:org.opendaylight.bgpcep.bgp-rib-impl:0.11.3]
at org.opendaylight.protocol.bgp.rib.impl.config.BgpDeployerImpl.handlePeersChange(BgpDeployerImpl.java:195) ~[238:org.opendaylight.bgpcep.bgp-rib-impl:0.11.3]
at org.opendaylight.protocol.bgp.rib.impl.config.BgpDeployerImpl.handleModifications(BgpDeployerImpl.java:160) ~[238:org.opendaylight.bgpcep.bgp-rib-impl:0.11.3]
at org.opendaylight.protocol.bgp.rib.impl.config.BgpDeployerImpl.onDataTreeChanged(BgpDeployerImpl.java:144) ~[238:org.opendaylight.bgpcep.bgp-rib-impl:0.11.3]
at org.opendaylight.controller.md.sal.binding.impl.BindingDOMDataTreeChangeListenerAdapter.onDataTreeChanged(BindingDOMDataTreeChangeListenerAdapter.java:42) ~[287:org.opendaylight.controller.sal-binding-broker-impl:1.9.3]
at org.opendaylight.controller.sal.core.compat.LegacyDOMDataBrokerAdapter$ProxyListener.onDataTreeChanged(LegacyDOMDataBrokerAdapter.java:353) ~[298:org.opendaylight.controller.sal-core-compat:1.9.3]
at org.opendaylight.controller.cluster.datastore.DataTreeChangeListenerActor.dataChanged(DataTreeChangeListenerActor.java:82) [300:org.opendaylight.controller.sal-distributed-datastore:1.9.3]
at org.opendaylight.controller.cluster.datastore.DataTreeChangeListenerActor.handleReceive(DataTreeChangeListenerActor.java:43) [300:org.opendaylight.controller.sal-distributed-datastore:1.9.3]
at org.opendaylight.controller.cluster.common.actor.AbstractUntypedActor.onReceive(AbstractUntypedActor.java:40) [292:org.opendaylight.controller.sal-clustering-commons:1.9.3]
at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167) [41:com.typesafe.akka.actor:2.5.26]
at akka.actor.Actor.aroundReceive(Actor.scala:539) [41:com.typesafe.akka.actor:2.5.26]
at akka.actor.Actor.aroundReceive$(Actor.scala:537) [41:com.typesafe.akka.actor:2.5.26]
at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:97) [41:com.typesafe.akka.actor:2.5.26]
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:612) [41:com.typesafe.akka.actor:2.5.26]
at akka.actor.ActorCell.invoke(ActorCell.scala:581) [41:com.typesafe.akka.actor:2.5.26]
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:268) [41:com.typesafe.akka.actor:2.5.26]
at akka.dispatch.Mailbox.run(Mailbox.scala:229) [41:com.typesafe.akka.actor:2.5.26]
at akka.dispatch.Mailbox.exec(Mailbox.scala:241) [41:com.typesafe.akka.actor:2.5.26]
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [41:com.typesafe.akka.actor:2.5.26]
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [41:com.typesafe.akka.actor:2.5.26]
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [41:com.typesafe.akka.actor:2.5.26]
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [41:com.typesafe.akka.actor:2.5.26]
|