Type: Bug
Resolution: Unresolved
Priority: High
The regression is detected here:
https://jenkins.opendaylight.org/releng/job/bgpcep-csit-1node-bgp-ingest-mixed-all-neon/
The BGP scale test uses the play.py script to set up a BGP session and inject 500K prefixes into the controller:
python play.py --amount 500000 --myip=10.30.171.84 --myport=17900 --peerip=10.30.170.47 --peerport=1790 --insert=10 --withdraw=9 --prefill 10 --update single --info --results bgp.csv &> play.py.out
After ~3 minutes (the hold timer), the BGP script raises this error:
2019-05-06 10:08:44,090 INFO BGP-Dummy-1: Iteration: 296000 - total remaining prefixes: 203991
2019-05-06 10:08:44,726 ERROR BGP-Dummy-1: Peer has overstepped the hold timer.
Unhandled exception in thread started by <function job at 0x7fefba965938>
Traceback (most recent call last):
  File "play.py", line 2066, in job
    state.perform_one_loop_iteration()
  File "play.py", line 1958, in perform_one_loop_iteration
    self.timer.check_peer_hold_time(self.timer.snapshot_time)
  File "play.py", line 1429, in check_peer_hold_time
    raise RuntimeError("Peer has overstepped the hold timer.")
RuntimeError: Peer has overstepped the hold timer.
Traceback (most recent call last):
  File "play.py", line 2168, in <module>
    threaded_job(arguments)
  File "play.py", line 2162, in threaded_job
    rpcserver.serve_forever()
  File "/usr/lib/python2.7/SocketServer.py", line 231, in serve_forever
    poll_interval)
  File "/usr/lib/python2.7/SocketServer.py", line 150, in _eintr_retry
    return func(*args)
KeyboardInterrupt
And the session is disconnected:
2019-05-06T10:08:46,632 | INFO | epollEventLoopGroup-10-1 | BGPSessionImpl | 242 - org.opendaylight.bgpcep.bgp-rib-impl - 0.11.1.SNAPSHOT | End of input detected. Close the session.
2019-05-06T10:08:46,633 | INFO | epollEventLoopGroup-10-1 | BGPPeer | 242 - org.opendaylight.bgpcep.bgp-rib-impl - 0.11.1.SNAPSHOT | Session with peer 10.30.171.99 went down
2019-05-06T10:08:46,633 | INFO | epollEventLoopGroup-10-1 | BGPPeer | 242 - org.opendaylight.bgpcep.bgp-rib-impl - 0.11.1.SNAPSHOT | Closing session with peer
2019-05-06T10:08:46,650 | INFO | epollEventLoopGroup-10-1 | AbstractPeer | 242 - org.opendaylight.bgpcep.bgp-rib-impl - 0.11.1.SNAPSHOT | Closed per Peer /(urn:opendaylight:params:xml:ns:yang:bgp-rib?revision=2018-03-29)bgp-rib/rib/rib[{(urn:opendaylight:params:xml:ns:yang:bgp-rib?revision=2018-03-29)id=example-bgp-rib}]/peer/peer[{(urn:opendaylight:params:xml:ns:yang:bgp-rib?revision=2018-03-29)peer-id=bgp://10.30.171.99}] removed
2019-05-06T10:08:46,653 | INFO | epollEventLoopGroup-10-1 | AbstractPeer | 242 - org.opendaylight.bgpcep.bgp-rib-impl - 0.11.1.SNAPSHOT | Closing peer chain Uri{_value=bgp://10.30.171.99}
2019-05-06T10:08:46,659 | INFO | epollEventLoopGroup-10-1 | BGPSessionImpl | 242 - org.opendaylight.bgpcep.bgp-rib-impl - 0.11.1.SNAPSHOT | Closing session: BGPSessionImpl{channel=[id: 0x79806baf, L:/10.30.171.133:1790 ! R:/10.30.171.99:17900], state=UP}
According to the play.py logs, the controller does not send any KEEPALIVE messages while it is learning the prefixes; this is the main reason the test fails. See the attached test tool logs: when the test works, the controller sends one KEEPALIVE roughly every minute.
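For readers unfamiliar with the mechanism, here is a rough sketch of how a BGP hold timer behaves. It is not the code from play.py: the class name `HoldTimer`, its constructor arguments, and the injectable clock are hypothetical, the 180-second value is assumed from the "~3 minutes" observed above, and the sketch is written for Python 3 (play.py itself runs on Python 2.7). The one-third ratio follows the suggestion in RFC 4271, which is consistent with the one-KEEPALIVE-per-minute behaviour seen in the working run.

```python
import time


class HoldTimer:
    """Illustrative hold-timer tracker (hypothetical, not taken from play.py)."""

    def __init__(self, hold_time=180.0, now=time.monotonic):
        # hold_time: seconds the peer may stay silent (assumed 180 s here).
        self.hold_time = hold_time
        # RFC 4271 suggests sending KEEPALIVEs at one third of the hold time.
        self.keepalive_interval = hold_time / 3.0
        self._now = now
        self.last_heard = now()

    def message_received(self):
        # Any KEEPALIVE or UPDATE from the peer restarts the hold timer.
        self.last_heard = self._now()

    def check_peer_hold_time(self):
        # Fail the session if the peer has been silent for longer than hold_time.
        if self._now() - self.last_heard > self.hold_time:
            raise RuntimeError("Peer has overstepped the hold timer.")


# Usage with a fake clock so the behaviour can be followed step by step.
clock = [0.0]
timer = HoldTimer(hold_time=180.0, now=lambda: clock[0])

clock[0] = 60.0
timer.message_received()      # peer sent a KEEPALIVE after one minute
clock[0] = 239.0
timer.check_peer_hold_time()  # 179 s of silence: still within the hold time

clock[0] = 241.0
try:
    timer.check_peer_hold_time()  # 181 s of silence: the timer expires
    expired = False
except RuntimeError:
    expired = True
```

In this model, a controller that stays silent for the whole prefix-learning phase never triggers `message_received()`, so the check fails once the hold time has elapsed, which matches the error reported above.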