[BGPCEP-520] B: keepalive probably not send from odl and Conflicting modification Created: 16/Aug/16  Updated: 03/Mar/19  Resolved: 26/Aug/16

Status: Resolved
Project: bgpcep
Component/s: BGP
Affects Version/s: Bugzilla Migration
Fix Version/s: Bugzilla Migration

Type: Bug
Reporter: Peter Gubka Assignee: Milos Fabian
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 6442

 Description   

job: https://jenkins.opendaylight.org/releng/view/bgpcep/job/bgpcep-csit-1node-periodic-bgp-ingest-mixed-only-boron/208/

logs:https://logs.opendaylight.org/releng/jenkins092/bgpcep-csit-1node-periodic-bgp-ingest-mixed-only-boron/208/archives/karaf.log.gz

play.py tool started to advetize 1M routes. After the hold timer expired on the play.py side, the connection was over.
Them various exceptions appeared in the karaf.log

2016-08-16 02:30:41,849 | WARN | ult-dispatcher-4 | ShardDataTree | 173 - org.opendaylight.controller.sal-distributed-datastore - 1.5.0.SNAPSHOT | member-1-shard-default-operational: Store Tx member-1-datastore-operational-fe-0-chn-23-txn-76656: Conflicting modification for path /(urn:opendaylight:params:xml:ns:yang:bgp-rib?revision=2013-09-25)bgp-rib/rib/rib[

{(urn:opendaylight:params:xml:ns:yang:bgp-rib?revision=2013-09-25)id=example-bgp-rib}

]/peer/peer[

{(urn:opendaylight:params:xml:ns:yang:bgp-rib?revision=2013-09-25)peer-id=bgp://10.29.12.188}

].

016-08-16 02:30:41,850 | ERROR | lt-dispatcher-36 | LocalThreePhaseCommitCohort | 173 - org.opendaylight.controller.sal-distributed-datastore - 1.5.0.SNAPSHOT | Failed to prepare transaction member-1-datastore-operational-fe-0-chn-23-txn-76656 on backend
OptimisticLockFailedException

{message=Optimistic lock failed., errorList=[RpcError [message=Optimistic lock failed., severity=ERROR, errorType=APPLICATION, tag=resource-denied, applicationTag=null, info=null, cause=org.opendaylight.yangtools.yang.data.api.schema.tree.ConflictingModificationAppliedException: Node was deleted by other transaction.]]}

 Comments   
Comment by Milos Fabian [ 18/Aug/16 ]

master: https://git.opendaylight.org/gerrit/#/c/44254/

Comment by Peter Gubka [ 22/Aug/16 ]

seems like bgp keepalive still not sent

https://logs.opendaylight.org/releng/jenkins092/bgpcep-csit-1node-periodic-bgp-ingest-mixed-only-boron/212/archives/karaf.log.gz

tool log:

Unhandled exception in thread started by <function job at 0x7f7863b909b0>
Traceback (most recent call last):
File "play.py", line 1848, in job
state.perform_one_loop_iteration()
File "play.py", line 1742, in perform_one_loop_iteration
self.timer.check_peer_hold_time(self.timer.snapshot_time)
File "play.py", line 1266, in check_peer_hold_time
raise RuntimeError("Peer has overstepped the hold timer.")
RuntimeError: Peer has overstepped the hold timer.
Traceback (most recent call last):
File "play.py", line 1944, in <module>
threaded_job(arguments)
File "play.py", line 1938, in threaded_job
rpcserver.serve_forever()
File "/usr/lib/python2.7/SocketServer.py", line 236, in serve_forever
poll_interval)
File "/usr/lib/python2.7/SocketServer.py", line 155, in _eintr_retry
return func(*args)
KeyboardInterrupt

odl log:

2016-08-22 02:27:46,234 | INFO | entLoopGroup-5-1 | StrictBGPPeerRegistry | 186 - org.opendaylight.bgpcep.bgp-rib-impl - 0.7.0.SNAPSHOT | BGP Open message session parameters differ, session still accepted.
2016-08-22 02:27:46,240 | INFO | entLoopGroup-5-1 | BGPSessionImpl | 186 - org.opendaylight.bgpcep.bgp-rib-impl - 0.7.0.SNAPSHOT | BGP HoldTimer new value: 180
2016-08-22 02:27:46,247 | INFO | entLoopGroup-5-1 | BGPPeer | 186 - org.opendaylight.bgpcep.bgp-rib-impl - 0.7.0.SNAPSHOT | Session with peer 10.29.12.167 went up with tables [] and Add Path tables []
2016-08-22 02:27:46,284 | INFO | entLoopGroup-5-1 | AbstractBGPSessionNegotiator | 186 - org.opendaylight.bgpcep.bgp-rib-impl - 0.7.0.SNAPSHOT | BGP Session with peer [id: 0x1d6db583, L:/10.29.12.104:1790 - R:/10.29.12.167:17900] established successfully.
2016-08-22 02:31:05,382 | INFO | entLoopGroup-5-1 | BGPSessionImpl | 186 - org.opendaylight.bgpcep.bgp-rib-impl - 0.7.0.SNAPSHOT | End of input detected. Close the session.
2016-08-22 02:31:05,382 | INFO | entLoopGroup-5-1 | BGPPeer | 186 - org.opendaylight.bgpcep.bgp-rib-impl - 0.7.0.SNAPSHOT | Session with peer 10.29.12.167 went down
2016-08-22 02:31:05,406 | WARN | lt-dispatcher-36 | ShardDataTree | 173 - org.opendaylight.controller.sal-distributed-datastore - 1.5.0.SNAPSHOT | member-1-shard-default-operational: Store Tx member-1-datastore-operational-fe-0-chn-23-txn-74196: Conflicting modification for path /(urn:opendaylight:params:xml:ns:yang:bgp-rib?revision=2013-09-25)bgp-rib/rib/rib[

{(urn:opendaylight:params:xml:ns:yang:bgp-rib?revision=2013-09-25)id=example-bgp-rib}

]/peer/peer[

{(urn:opendaylight:params:xml:ns:yang:bgp-rib?revision=2013-09-25)peer-id=bgp://10.29.12.167}

].
2016-08-22 02:31:05,410 | INFO | entLoopGroup-5-1 | BGPSessionImpl | 186 - org.opendaylight.bgpcep.bgp-rib-impl - 0.7.0.SNAPSHOT | Closing session: BGPSessionImpl

{channel=[id: 0x1d6db583, L:/10.29.12.104:1790 ! R:/10.29.12.167:17900], state=UP}

2016-08-22 02:31:05,416 | ERROR | lt-dispatcher-23 | TransactionChainProxy | 173 - org.opendaylight.controller.sal-distributed-datastore - 1.5.0.SNAPSHOT | Tx: member-1-datastore-operational-fe-0-chn-23-txn-74197 - ready future failed for previous Tx member-1-datastore-operational-fe-0-chn-23-txn-74197
2016-08-22 02:31:05,412 | ERROR | lt-dispatcher-22 | LocalThreePhaseCommitCohort | 173 - org.opendaylight.controller.sal-distributed-datastore - 1.5.0.SNAPSHOT | Failed to prepare transaction member-1-datastore-operational-fe-0-chn-23-txn-74196 on backend
OptimisticLockFailedException

{message=Optimistic lock failed., errorList=[RpcError [message=Optimistic lock failed., severity=ERROR, errorType=APPLICATION, tag=resource-denied, applicationTag=null, info=null, cause=org.opendaylight.yangtools.yang.data.api.schema.tree.ConflictingModificationAppliedException: Node was deleted by other transaction.]]}

at org.opendaylight.controller.cluster.datastore.ShardDataTree.processNextTransaction(ShardDataTree.java:531)[173:org.opendaylight.controller.sal-distributed-datastore:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.ShardDataTree.startCanCommit(ShardDataTree.java:559)[173:org.opendaylight.controller.sal-distributed-datastore:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.SimpleShardDataTreeCohort.canCommit(SimpleShardDataTreeCohort.java:81)[173:org.opendaylight.controller.sal-distributed-datastore:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.CohortEntry.canCommit(CohortEntry.java:98)[173:org.opendaylight.controller.sal-distributed-datastore:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator.handleCanCommit(ShardCommitCoordinator.java:237)[173:org.opendaylight.controller.sal-distributed-datastore:1.5.0.SNA

Comment by Milos Fabian [ 23/Aug/16 ]

stable/boron: https://git.opendaylight.org/gerrit/#/c/44536/

Comment by Milos Fabian [ 23/Aug/16 ]

master: https://git.opendaylight.org/gerrit/#/c/44547/

Comment by Ajay L [ 25/Aug/16 ]

Good catch with the epoll mode! Wondering if similar fix is needed in few other places, esp. PCEPDispatcherImpl.java

./pcep/pcc-mock/src/main/java/org/opendaylight/protocol/pcep/pcc/mock/protocol/PCCDispatcherImpl.java: b.option(ChannelOption.MAX_MESSAGES_PER_READ, 1);
./pcep/impl/src/main/java/org/opendaylight/protocol/pcep/impl/PCEPDispatcherImpl.java: b.childOption(ChannelOption.MAX_MESSAGES_PER_READ, 1);
./bgp/rib-impl/src/main/java/org/opendaylight/protocol/bgp/rib/impl/BGPDispatcherImpl.java: bootstrap.option(ChannelOption.MAX_MESSAGES_PER_READ, 1);
./bgp/rib-impl/src/main/java/org/opendaylight/protocol/bgp/rib/impl/BGPDispatcherImpl.java: serverBootstrap.option(ChannelOption.MAX_MESSAGES_PER_READ, 1);
./bgp/rib-impl/src/test/java/org/opendaylight/protocol/bgp/rib/impl/TestClientDispatcher.java: bootstrap.option(ChannelOption.MAX_MESSAGES_PER_READ, 1);

Comment by Milos Fabian [ 26/Aug/16 ]

Yes, you are right. The PCE's/PCC's dispatcher should be modified too.
I will file a new bug for them

Generated at Wed Feb 07 19:13:19 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.