[BGPCEP-760] Deadlock in manypeers_changecount test Created: 20/Feb/18 Updated: 18/Apr/18 Resolved: 01/Mar/18 |
|
| Status: | Verified |
| Project: | bgpcep |
| Component/s: | BGP |
| Affects Version/s: | Nitrogen, Carbon, Oxygen |
| Fix Version/s: | Nitrogen, Carbon, Oxygen |
| Type: | Bug | Priority: | Medium |
| Reporter: | Tomas Markovic | Assignee: | Claudio David Gasparini |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Description |
|
Temporary link for the sandbox (runs 1 and 2): https://jenkins.opendaylight.org/sandbox/job/tomas-bgpcep-csit-1node-periodic-bgp-ingest-all-oxygen/

First errors from the sandbox test:

2018-02-20T10:57:38,373 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher | 356 - org.opendaylight.infrautils.metrics-impl - 1.3.0.SNAPSHOT | Oh nose - there are 2 deadlocked threads!! :-(
2018-02-20T10:57:38,377 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher | 356 - org.opendaylight.infrautils.metrics-impl - 1.3.0.SNAPSHOT | Deadlocked thread stack trace: opendaylight-cluster-data-notification-dispatcher-92 locked on org.opendaylight.protocol.bgp.rib.impl.ExportPolicyPeerTrackerImpl@43eaef86 (owned by epollEventLoopGroup-10-7):
    at org.opendaylight.protocol.bgp.rib.impl.ExportPolicyPeerTrackerImpl.getPeerGroup(ExportPolicyPeerTrackerImpl.java:114)
    at org.opendaylight.protocol.bgp.mode.spi.AbstractRouteEntry.getRoutePeerIdRole(AbstractRouteEntry.java:96)
    at org.opendaylight.protocol.bgp.mode.impl.base.BaseAbstractRouteEntry.lambda$fillAdjRibsOut$0(BaseAbstractRouteEntry.java:187)
    at org.opendaylight.protocol.bgp.mode.impl.base.BaseAbstractRouteEntry$$Lambda$1418/229233901.accept(Unknown Source)
    at org.opendaylight.protocol.bgp.rib.impl.PeerExportGroupImpl.forEach(PeerExportGroupImpl.java:48)
    at org.opendaylight.protocol.bgp.mode.impl.base.BaseAbstractRouteEntry.fillAdjRibsOut(BaseAbstractRouteEntry.java:186)
    at org.opendaylight.protocol.bgp.mode.impl.base.BaseAbstractRouteEntry.addPathToDataStore(BaseAbstractRouteEntry.java:161)
    at org.opendaylight.protocol.bgp.mode.impl.base.BaseAbstractRouteEntry.updateRoute(BaseAbstractRouteEntry.java:111)
    at org.opendaylight.protocol.bgp.rib.impl.LocRibWriter.walkThrough(LocRibWriter.java:276)
    at org.opendaylight.protocol.bgp.rib.impl.LocRibWriter.onDataTreeChanged(LocRibWriter.java:179)
    at org.opendaylight.controller.cluster.datastore.DataTreeChangeListenerActor.dataChanged(DataTreeChangeListenerActor.java:67)
    at org.opendaylight.controller.cluster.datastore.DataTreeChangeListenerActor.handleReceive(DataTreeChangeListenerActor.java:41)
    at org.opendaylight.controller.cluster.common.actor.AbstractUntypedActor.onReceive(AbstractUntypedActor.java:38)
    at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:166)
    at akka.actor.Actor.aroundReceive(Actor.scala:514)
    at akka.actor.Actor.aroundReceive$(Actor.scala:512)
    at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:96)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:527)
    at akka.actor.ActorCell.invoke(ActorCell.scala:496)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
    at akka.dispatch.Mailbox.run(Mailbox.scala:224)
    at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
    at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

Locally I reproduced this deadlock on Nitrogen with the following features installed: odl-restconf odl-bgpcep-bgp

I added scripts for quick configuration. The deadlock occurs almost immediately after start, and ODL does not recover from it even after the script is killed. I am attaching deadlock.log from the YourKit profiler and a Wireshark capture of the communication. There is some problem with the OPEN message exchange between the script and ODL.

I also added countroutes.sh, a RESTCONF script that returns the number of routes that went through. With it, it is easy to spot when the deadlock occurred. |
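For readers unfamiliar with how a report such as "there are 2 deadlocked threads" can be produced, the JDK exposes deadlock detection through ThreadMXBean. The following is a minimal sketch only: it is not the infrautils ThreadsWatcher implementation, and the class name DeadlockProbe, the thread names, and the two-lock ordering inversion used to provoke a deadlock are hypothetical and do not claim to mirror the bgpcep code paths in the trace above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class DeadlockProbe {

    /** Returns info for the threads the JVM reports as deadlocked (empty if none). */
    public static ThreadInfo[] findDeadlocks() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // null when nothing is deadlocked
        return ids == null ? new ThreadInfo[0] : mx.getThreadInfo(ids);
    }

    /** Hypothetical trigger: two daemon threads take two monitors in opposite order. */
    public static void provokeDeadlock() {
        Object lockA = new Object();
        Object lockB = new Object();
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        startDaemon("demo-1", lockA, lockB, bothHoldFirstLock);
        startDaemon("demo-2", lockB, lockA, bothHoldFirstLock);
    }

    private static void startDaemon(String name, Object first, Object second,
                                    CountDownLatch latch) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                latch.countDown();
                try {
                    latch.await(); // wait until the other thread holds its first lock
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                synchronized (second) {
                    // never reached: each thread waits for the monitor the other holds
                }
            }
        }, name);
        t.setDaemon(true); // daemon threads do not keep the JVM alive
        t.start();
    }

    /** Polls until a deadlock is reported or the timeout elapses. */
    public static ThreadInfo[] awaitDeadlock(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        ThreadInfo[] found = findDeadlocks();
        while (found.length == 0 && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            found = findDeadlocks();
        }
        return found;
    }

    public static void main(String[] args) {
        provokeDeadlock();
        for (ThreadInfo info : awaitDeadlock(5000)) {
            System.out.println(info.getThreadName() + " locked on " + info.getLockName()
                    + " (owned by " + info.getLockOwnerName() + ")");
        }
    }
}
```

Running main prints one line per deadlocked thread in roughly the same "locked on ... (owned by ...)" shape as the log entry above. The identity hash codes in the lock names differ between runs.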
| Comments |
| Comment by Claudio David Gasparini [ 26/Feb/18 ] |
|
carbon: https://git.opendaylight.org/gerrit/#/c/68665/
nitrogen: https://git.opendaylight.org/gerrit/#/c/68667/
master: https://git.opendaylight.org/gerrit/#/c/68648/
|
| Comment by Claudio David Gasparini [ 26/Feb/18 ] |
|
Hi Tomas, please confirm that the fix worked as expected for all versions, and close the bug.
Regards, |
| Comment by Michael Vorburger [ 28/Feb/18 ] |
|
tomas.markovic and cdgasparini I'm glad that PS FYI also |