[BGPCEP-689] ConcurrentModificationException when closing bgp-state-provider-service-group service instance Created: 14/Sep/17  Updated: 03/Mar/19  Resolved: 16/Jan/18

Status: Verified
Project: bgpcep
Component/s: BGP
Affects Version/s: Bugzilla Migration
Fix Version/s: Bugzilla Migration

Type: Bug
Reporter: Vratko Polak Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 9164

 Description   

This was discovered while investigating a failure in CSIT. Not sure whether this is the primary reason for the failure, but such exceptions should not happen anyway.

The test isolates and then rejoins a cluster member while ExaBGP is connected. The failure [0] happens when Robot reads bgp-rib for the peer just after the rejoin:

{"errors":{"error":[{"error-type":"application","error-tag":"data-missing","error-message":"Request could not be completed because the relevant data model content does not exist "}]}}

The karaf.log [1] of the affected member-1 contains this exception:

2017-09-13 07:34:20,705 | WARN | ult-dispatcher-4 | ClusterSingletonServiceGroupImpl | 288 - org.opendaylight.mdsal.singleton-dom-impl - 2.3.0 | Service group bgp-state-provider-service-group service org.opendaylight.protocol.bgp.state.StateProviderImpl@2d0e1d16 failed to stop, attempting to continue
java.util.ConcurrentModificationException
at java.util.HashMap$KeySet.forEach(HashMap.java:935)[:1.8.0_141]
at org.opendaylight.protocol.bgp.state.StateProviderImpl.closeServiceInstance(StateProviderImpl.java:177)[190:org.opendaylight.bgpcep.bgp-openconfig-state:0.8.0]
at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.stopServices(ClusterSingletonServiceGroupImpl.java:515)[288:org.opendaylight.mdsal.singleton-dom-impl:2.3.0]
at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.lostOwnership(ClusterSingletonServiceGroupImpl.java:487)[288:org.opendaylight.mdsal.singleton-dom-impl:2.3.0]
at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.serviceOwnershipChanged(ClusterSingletonServiceGroupImpl.java:410)[288:org.opendaylight.mdsal.singleton-dom-impl:2.3.0]
at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.lockedOwnershipChanged(ClusterSingletonServiceGroupImpl.java:354)[288:org.opendaylight.mdsal.singleton-dom-impl:2.3.0]
at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.ownershipChanged(ClusterSingletonServiceGroupImpl.java:337)[288:org.opendaylight.mdsal.singleton-dom-impl:2.3.0]
at org.opendaylight.mdsal.singleton.dom.impl.AbstractClusterSingletonServiceProviderImpl.ownershipChanged(AbstractClusterSingletonServiceProviderImpl.java:234)[288:org.opendaylight.mdsal.singleton-dom-impl:2.3.0]
at org.opendaylight.mdsal.singleton.dom.impl.DOMClusterSingletonServiceProviderImpl.ownershipChanged(DOMClusterSingletonServiceProviderImpl.java:23)[288:org.opendaylight.mdsal.singleton-dom-impl:2.3.0]
at org.opendaylight.controller.cluster.datastore.entityownership.EntityOwnershipListenerActor.onEntityOwnershipChanged(EntityOwnershipListenerActor.java:44)[237:org.opendaylight.controller.sal-distributed-datastore:1.6.0]
at org.opendaylight.controller.cluster.datastore.entityownership.EntityOwnershipListenerActor.handleReceive(EntityOwnershipListenerActor.java:33)[237:org.opendaylight.controller.sal-distributed-datastore:1.6.0]
at org.opendaylight.controller.cluster.common.actor.AbstractUntypedActor.onReceive(AbstractUntypedActor.java:38)[230:org.opendaylight.controller.sal-clustering-commons:1.6.0]
at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165)[35:com.typesafe.akka.actor:2.4.18]
at akka.actor.Actor$class.aroundReceive(Actor.scala:502)[35:com.typesafe.akka.actor:2.4.18]
at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)[35:com.typesafe.akka.actor:2.4.18]
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)[35:com.typesafe.akka.actor:2.4.18]
at akka.actor.ActorCell.invoke(ActorCell.scala:495)[35:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)[35:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.Mailbox.run(Mailbox.scala:224)[35:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)[35:com.typesafe.akka.actor:2.4.18]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)[323:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)[323:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)[323:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)[323:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]

[0] https://logs.opendaylight.org/releng/jenkins092/bgpcep-csit-3node-periodic-bgpclustering-ha-only-nitrogen/152/log.html.gz#s1-s2-t10-k2-k2
[1] https://logs.opendaylight.org/releng/jenkins092/bgpcep-csit-3node-periodic-bgpclustering-ha-only-nitrogen/152/odl1_karaf.log.gz
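
For readers unfamiliar with this failure mode, here is a minimal sketch of one way HashMap$KeySet.forEach can end in a ConcurrentModificationException: the map is structurally modified while forEach is walking its key set. This is not the StateProviderImpl code. The class CmeSketch, both method names, and the peer keys are hypothetical. The trace alone does not show whether the modification came from the callback itself or from another thread. The second method shows one common way to avoid the exception, iterating over a snapshot of the keys. It is not necessarily the fix that was applied.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not taken from the bgpcep sources.
final class CmeSketch {

    // Removes entries from the map from inside keySet().forEach.
    // HashMap detects the structural change and throws
    // ConcurrentModificationException; returns true when that happens.
    static boolean removeWhileIterating(Map<String, String> instances) {
        try {
            instances.keySet().forEach(instances::remove);
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Copies the keys first, then removes entries while iterating over
    // the copy, so the live key set is never walked during modification.
    // Returns the number of keys processed.
    static int removeViaSnapshot(Map<String, String> instances) {
        List<String> snapshot = new ArrayList<>(instances.keySet());
        snapshot.forEach(instances::remove);
        return snapshot.size();
    }

    public static void main(String[] args) {
        Map<String, String> first = new HashMap<>();
        first.put("peer-1", "state-1");
        first.put("peer-2", "state-2");
        System.out.println("CME while iterating: " + removeWhileIterating(first));

        Map<String, String> second = new HashMap<>();
        second.put("peer-1", "state-1");
        second.put("peer-2", "state-2");
        System.out.println("Closed via snapshot: " + removeViaSnapshot(second));
    }
}
```

A snapshot only protects the iteration itself. If another thread can modify the map concurrently, the map also needs synchronization or a concurrent implementation such as java.util.concurrent.ConcurrentHashMap.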



 Comments   
Comment by Vratko Polak [ 14/Sep/17 ]

> Not sure whether this is the primary reason for the failure

The same Robot failure happens (with roughly 30% chance) also on Carbon [2], but karaf.log [3] shows a different warning:

2017-09-12 08:02:37,027 | WARN | rd-dispatcher-35 | ShardDataTree | 199 - org.opendaylight.controller.sal-distributed-datastore - 1.5.2.Carbon | member-1-shard-default-operational: Store Tx member-3-datastore-operational-fe-0-chn-26-txn-2-0: Data validation failed for path /(urn:opendaylight:params:xml:ns:yang:bgp-rib?revision=2013-09-25)bgp-rib/rib/rib[{(urn:opendaylight:params:xml:ns:yang:bgp-rib?revision=2013-09-25)id=example-bgp-rib}]/peer/peer[{(urn:opendaylight:params:xml:ns:yang:bgp-rib?revision=2013-09-25)peer-id=bgp://10.29.13.156}].
org.opendaylight.yangtools.yang.data.api.schema.tree.ModifiedNodeDoesNotExistException: Node /(urn:opendaylight:params:xml:ns:yang:bgp-rib?revision=2013-09-25)bgp-rib/rib/rib[{(urn:opendaylight:params:xml:ns:yang:bgp-rib?revision=2013-09-25)id=example-bgp-rib}]/peer/peer[{(urn:opendaylight:params:xml:ns:yang:bgp-rib?revision=2013-09-25)peer-id=bgp://10.29.13.156}] does not exist. Cannot apply modification to its children.

[2] https://logs.opendaylight.org/releng/jenkins092/bgpcep-csit-3node-periodic-bgpclustering-ha-only-carbon/381/log.html.gz#s1-s2-t10-k2-k2
[3] https://logs.opendaylight.org/releng/jenkins092/bgpcep-csit-3node-periodic-bgpclustering-ha-only-carbon/381/odl1_karaf.log.gz

Comment by Claudio David Gasparini [ 19/Oct/17 ]

Duplicate bug, fixed by BUG-9205.
