|
please see the beryllium result for 10x100 prefixes: https://jenkins.opendaylight.org/sandbox/job/bgpcep-csit-1node-periodic-bgp-ingest-only-beryllium/2/artifact/karaf.log/*view*/
2016-02-02 11:30:37,947 | WARN | lt-dispatcher-20 | SimpleShardDataTreeCohort | 143 - org.opendaylight.controller.sal-distributed-datastore - 1.3.0.SNAPSHOT | Store Tx member-1-chn-39-txn-53: Data validation failed for path /(urn:opendaylight:params:xml:ns:yang:bgp-rib?revision=2013-09-25)bgp-rib/rib/rib[
{(urn:opendaylight:params:xml:ns:yang:bgp-rib?revision=2013-09-25)id=example-bgp-rib}
]/peer/peer[
{(urn:opendaylight:params:xml:ns:yang:bgp-rib?revision=2013-09-25)peer-id=bgp://127.0.0.4}
].
org.opendaylight.yangtools.yang.data.api.schema.tree.ModifiedNodeDoesNotExistException: Node /(urn:opendaylight:params:xml:ns:yang:bgp-rib?revision=2013-09-25)bgp-rib/rib/rib[
{(urn:opendaylight:params:xml:ns:yang:bgp-rib?revision=2013-09-25)id=example-bgp-rib}
]/peer/peer[
{(urn:opendaylight:params:xml:ns:yang:bgp-rib?revision=2013-09-25)peer-id=bgp://127.0.0.4}
] does not exist. Cannot apply modification to its children.
2016-02-02 11:30:38,105 | WARN | oupCloseable-6-6 | DefaultChannelPipeline | 102 - io.netty.common - 4.0.33.Final | An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
java.lang.IllegalStateException: Transaction chain has failed
at com.google.common.base.Preconditions.checkState(Preconditions.java:173)[39:com.google.guava:18.0.0]
at org.opendaylight.controller.cluster.databroker.DOMBrokerTransactionChain.checkNotFailed(DOMBrokerTransactionChain.java:73)[143:org.opendaylight.controller.sal-distributed-datastore:1.3.0.SNAPSHOT]
at org.opendaylight.controller.cluster.databroker.DOMBrokerTransactionChain.submit(DOMBrokerTransactionChain.java:84)[143:org.opendaylight.controller.sal-distributed-datastore:1.3.0.SNAPSHOT]
at org.opendaylight.controller.cluster.databroker.AbstractDOMBrokerWriteTransaction.submit(AbstractDOMBrokerWriteTransaction.java:137)[143:org.opendaylight.controller.sal-distributed-datastore:1.3.0.SNAPSHOT]
Test suite console:
===================
bgpcep-bgp-ingest.txt.Manypeers Peercount :: BGP performance of ingesting f...
==============================================================================
Check_For_Empty_Ipv4_Topology_Before_Talking :: Wait for example-i... | PASS |
------------------------------------------------------------------------------
Reconfigure_ODL_To_Accept_Connections :: Configure BGP peer module... | PASS |
------------------------------------------------------------------------------
Start_Talking_BGP_Manager :: Start Python manager to connect speak... | PASS |
------------------------------------------------------------------------------
Wait_For_Ipv4_Topology :: Wait until example-ipv4-topology reaches... 226.0s
2016-02-02 11:30:36: actual (expected) prefix count is 0 (1000)
2016-02-02 11:30:46: actual (expected) prefix count is 14 (1000)
2016-02-02 11:30:56: actual (expected) prefix count is 14 (1000)
2016-02-02 11:31:06: actual (expected) prefix count is 14 (1000)
2016-02-02 11:31:16: actual (expected) prefix count is 14 (1000)
2016-02-02 11:31:26: actual (expected) prefix count is 14 (1000)
2016-02-02 11:31:36: actual (expected) prefix count is 14 (1000)
2016-02-02 11:31:46: actual (expected) prefix count is 14 (1000)
2016-02-02 11:31:56: actual (expected) prefix count is 14 (1000)
2016-02-02 11:32:06: actual (expected) prefix count is 14 (1000)
2016-02-02 11:32:16: actual (expected) prefix count is 14 (1000)
2016-02-02 11:32:27: actual (expected) prefix count is 14 (1000)
2016-02-02 11:32:37: actual (expected) prefix count is 14 (1000)
2016-02-02 11:32:47: actual (expected) prefix count is 14 (1000)
2016-02-02 11:32:57: actual (expected) prefix count is 14 (1000)
2016-02-02 11:33:07: actual (expected) prefix count is 14 (1000)
2016-02-02 11:33:17: actual (expected) prefix count is 14 (1000)
2016-02-02 11:33:27: actual (expected) prefix count is 14 (1000)
2016-02-02 11:33:37: actual (expected) prefix count is 14 (1000)
2016-02-02 11:33:47: actual (expected) prefix count is 14 (1000)
2016-02-02 11:33:57: actual (expected) prefix count is 14 (1000)
2016-02-02 11:34:07: actual (expected) prefix count is 14 (1000)
2016-02-02 11:34:17: actual (expected) prefix count is 14 (1000)
2016-02-02 11:34:27: actual (expected) prefix count is 14 (1000)
This test fails due to https://bugs.opendaylight.org/show_bug.cgi?id=5185
Keyword 'Check_Ipv4_Topology' failed after retrying for 3 minutes 46 seconds. The last error was: 14 != 1000
|
|
+ source /w/workspace/bgpcep-csit-1node-periodic-bgp-ingest-only-beryllium/test/csit/scripts/controllermem_6g.sh
++ CONTROLLERMEM=6144m
garbage collector - no info in logs anyway this suite is just a part of the whole BGP performance suite = the same as for the rest ...
... current sandbox result passed the topology check: https://jenkins.opendaylight.org/sandbox/job/bgpcep-csit-1node-periodic-bgp-ingest-only-beryllium/1/console
Wait_For_Ipv4_Topology :: Wait until example-ipv4-topology reaches... 49620.0s
2016-02-08 08:29:08: actual (expected) prefix count is 0 (1000000)
2016-02-08 08:29:18: actual (expected) prefix count is 4210 (1000000)
...
2016-02-08 09:20:05: actual (expected) prefix count is 1000000 (1000000)
------------------------------------------------------------------------------
Check_Logs_For_Updates :: Check BGP peer logs for received updates. | FAIL |
Keyword 'Check_File_For_Occurence' failed after retrying for 20 seconds. The last error was: '0 >= 2' should be true.
and failed on one of bgp peer update count ...
|
|
On top of possible (still not verified) memory pressure on ODL_SYSTEM, there seem to be problems on robot slave too [0]:
Output: /w/workspace/bgpcep-csit-1node-periodic-bgp-ingest-only-stable-lithium/output.xml
Traceback (most recent call last):
File "/tmp/tmp.UyJmSZoTFJrobot_venv/bin/pybot", line 6, in <module>
run_cli(sys.argv[1:])
File "/tmp/tmp.UyJmSZoTFJrobot_venv/lib/python2.7/site-packages/robot/run.py", line 466, in run_cli
RobotFramework().execute_cli(arguments)
File "/tmp/tmp.UyJmSZoTFJrobot_venv/lib/python2.7/site-packages/robot/utils/application.py", line 45, in execute_cli
rc = self._execute(arguments, options)
File "/tmp/tmp.UyJmSZoTFJrobot_venv/lib/python2.7/site-packages/robot/utils/application.py", line 87, in _execute
error, details = get_error_details(exclude_robot_traces=False)
File "/tmp/tmp.UyJmSZoTFJrobot_venv/lib/python2.7/site-packages/robot/utils/error.py", line 46, in get_error_details
details = ErrorDetails(exclude_robot_traces=exclude_robot_traces)
File "/tmp/tmp.UyJmSZoTFJrobot_venv/lib/python2.7/site-packages/robot/utils/error.py", line 59, in ErrorDetails
raise exc_value
MemoryError
Robot VM now uses only 2GB RAM, so scaling suite should be reviewed for memory inefficiencies.
[0] https://jenkins.opendaylight.org/releng/view/bgpcep/job/bgpcep-csit-1node-periodic-bgp-ingest-only-stable-lithium/116/console
|
|
Testing locally with Beryllium SR2 and latest - 10 rr-clients, each advertise 10k prefixes.
The log contain a lot of "Failed to acquire operation permit for transaction" and " Generation of DataTreeCandidateNode events took longer than expected. Elapsed time"
After some time, ODL runs out of memory:
2016-04-29 11:26:20,980 | WARN | oupCloseable-2-6 | BGPSessionImpl | 185 - org.opendaylight.bgpcep.bgp-rib-impl - 0.5.2.SNAPSHOT | Failed to send message Update [_attributes=Attributes [_asPath=AsPath{getSegments=[], augmentations={}}, _clusterId=ClusterId{getCluster=[Ipv4Address [_value=127.0.0.1]], augmentations={}}, _localPref=LocalPref{getPref=100, augmentations={}}, _origin=Origin{getValue=Igp, augmentations={}}, _originatorId=OriginatorId{getOriginator=Ipv4Address [_value=127.0.0.1], augmentations={}}, _unrecognizedAttributes=[], augmentation=[Attributes1 [_mpReachNlri=MpReachNlri [_advertizedRoutes=AdvertizedRoutes [_destinationType=DestinationIpv4Case [_destinationIpv4=DestinationIpv4 [_ipv4Prefixes=[Ipv4Prefixes [_prefix=Ipv4Prefix [_value=2.0.210.48/28], augmentation=[]]], augmentation=[]], augmentation=[]], augmentation=[]], _afi=class org.opendaylight.yang.gen.v1.urn.opendaylight.params.xml.ns.yang.bgp.types.rev130919.Ipv4AddressFamily, _cNextHop=Ipv4NextHopCase{getIpv4NextHop=Ipv4NextHop{getGlobal=Ipv4Address [_value=192.0.2.1], augmentations={}}, augmentations={}}, _safi=class org.opendaylight.yang.gen.v1.urn.opendaylight.params.xml.ns.yang.bgp.types.rev130919.UnicastSubsequentAddressFamily, augmentation=[]]]]], augmentation=[]] to socket io.netty.handler.codec.EncoderException: java.lang.OutOfMemoryError: GC overhead limit exceeded
Uncaught error from thread [opendaylight-cluster-data-notification-dispatcher-56] shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[opendaylight-cluster-data]
2016-04-29 11:26:29,548 | WARN | on-dispatcher-52 | OperationLimiter | 175 - org.opendaylight.controller.sal-distributed-datastore - 1.3.2.SNAPSHOT | Failed to acquire operation permit for transaction member-1-chn-5-txn-3699-1461921917380
java.lang.OutOfMemoryError: GC overhead limit exceeded
Uncaught error from thread [opendaylight-cluster-data-akka.actor.default-dispatcher-17] shutting down JVM since 'akka.jvm-exit-o2016-04-29 11:26:37,565 | WARN | on-dispatcher-52 | OperationLimiter | 175 - org.opendaylight.controller.sal-distributed-datastore - 1.3.2.SNAPSHOT | Failed to acquire operation permit for transaction member-1-chn-5-txn-3699-1461921917380
n-fatal-error' is enabled for ActorSystem[opendaylight-cluster-data]
at org.opendaylight.yang.gen.v1.urn.opendaylight.params.xml.ns.yang.bgp.multiprotocol.rev130919.Attributes1Builder.build(Attributes1Builder.java:38)
at org.opendaylight.protocol.bgp.rib.spi.AbstractRIBSupport.buildUpdate(AbstractRIBSupport.java:241)
2016-04-29 11:26:45,944 | WARN | on-dispatcher-52 | OperationLimiter | 175 - org.opendaylight.controller.sal-distributed-datastore - 1.3.2.SNAPSHOT | Failed to acquire operation permit for transaction member-1-chn-5-txn-3699-1461921917380
at org.opendaylight.protocol.bgp.rib.impl.AdjRibOutListener.advertise(AdjRibOutListener.java:143)
at org.opendaylight.protocol.bgp.rib.impl.AdjRibOutListener.processRouteChange(AdjRibOutListener.java:114)
at org.opendaylight.protocol.bgp.rib.impl.AdjRibOutListener.processSupportedFamilyRoutes(AdjRibOutListener.java:95)
at org.opendaylight.protocol.bgp.rib.impl.AdjRibOutListener.onDataTreeChanged(AdjRibOutListener.java:87)
at org.opendaylight.controller.cluster.datastore.DataTreeChangeListenerActor.dataChanged(DataTreeChangeListenerActor.java:53)
at org.opendaylight.controller.cluster.datastore.DataTreeChangeListenerActor.handleReceive(DataTreeChangeListenerActor.java:37)
at org.opendaylight.controller.cluster.common.actor.AbstractUntypedActor.onReceive(AbstractUntypedActor.java:34)
at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)
at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:97)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
java.lang.OutOfMemoryError: GC overhead limit exceeded
Initial try-outs shows that pressure can be lowered by narrowing LocRibWriter DCN listener scope to Effective-RIB-In tables only.
|