[CONTROLLER-653] Cbench throughput mode kills the MD-SAL data store Created: 29/Jul/14  Updated: 16/Sep/14  Due: 07/Aug/14  Resolved: 16/Sep/14

Status: Resolved
Project: controller
Component/s: mdsal
Affects Version/s: Helium
Fix Version/s: None

Type: Bug
Reporter: Jan Medved Assignee: Tom Pantelis
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: Mac OS
Platform: Macintosh


External issue ID: 1446

 Description   

To reproduce:

1. Route the drop test feedback through the MD-SAL data store. From the controller OSGi console, type:
dropAllPackets on

2. Run cbench in throughput mode, for example:

cbench -c 192.168.162.1 -p 6633 -m 10000 -l 10 -s 16 -M 10000 -t

The throughput is very low (about 2.6% of the throughput cbench reports in latency mode when going through the data store). After the test finishes, CPU utilization remains very high for several minutes (I had to kill the controller after 5 min because I ran out of time). This indicates a massive queue buildup somewhere in the system.

I am adding this bug to MD-SAL because throughput mode through the RPC loopback is stable and performs very well. It's the loopback through the data store that knocks out the controller.



 Comments   
Comment by Tom Pantelis [ 30/Jul/14 ]

It may be the single-threaded datastore commits that are the bottleneck. I'm working on off-loading the notifications with https://bugs.opendaylight.org/show_bug.cgi?id=1430.

Unfortunately we can't get any debugging info to see the internal executor/queue stats. It would be nice to have JMX bean wrappers for the internal executors et al so we can view stats via the JConsole. I can look into that.

Comment by Jan Medved [ 30/Jul/14 ]

I violently second the request for stats. I've been asking for whatever MD-SAL stats I can get for over a year now.

Comment by Tom Pantelis [ 07/Aug/14 ]

I've prototyped changes to make some stats available via JMX (JConsole).

Stats for the thread pool executors:

activeThreadCount
currentThreadPoolSize
largestThreadPoolSize
maxThreadPoolSize
currentQueueSize
maxQueueSize
completedTaskCount
totalTaskCount

Stats are available for the following executors:

Single-threaded commit coordinator
Commit Future listener notification
Config data store 3-phase commit
Config data store DataChangeListener notification
Operational data store 3-phase commit
Operational data store DataChangeListener notification
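
Most of the executor stats listed above correspond to getters that java.util.concurrent.ThreadPoolExecutor already exposes. The following is a minimal sketch of how such a snapshot could be collected, not the code from the actual changes. The class name ExecutorStatsSketch and the method snapshot are hypothetical, and reading maxQueueSize as the queue's capacity (current size plus remaining capacity) is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only; not taken from the controller code base.
final class ExecutorStatsSketch {

    // Collects a point-in-time view of a thread pool using standard JDK getters.
    static Map<String, Long> snapshot(ThreadPoolExecutor executor) {
        BlockingQueue<Runnable> queue = executor.getQueue();
        long queued = queue.size();

        Map<String, Long> stats = new LinkedHashMap<>();
        stats.put("activeThreadCount", (long) executor.getActiveCount());
        stats.put("currentThreadPoolSize", (long) executor.getPoolSize());
        stats.put("largestThreadPoolSize", (long) executor.getLargestPoolSize());
        stats.put("maxThreadPoolSize", (long) executor.getMaximumPoolSize());
        stats.put("currentQueueSize", queued);
        // Assumption: "max queue size" means capacity, i.e. queued + free slots.
        stats.put("maxQueueSize", queued + queue.remainingCapacity());
        // Both task counts are approximations according to the JDK documentation.
        stats.put("completedTaskCount", executor.getCompletedTaskCount());
        stats.put("totalTaskCount", executor.getTaskCount());
        return stats;
    }
}
```

A steadily growing currentQueueSize on one of these executors during a cbench throughput run would point at the stage where the backlog builds up.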

I also added commit stats:

totalCommits
longestCommitTime
shortestCommitTime
averageCommitTime
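
The four commit stats above can be kept with a handful of counters updated once per commit. This is a sketch under assumptions rather than the implementation in the Gerrit changes: the class CommitStatsSketch and its method names are hypothetical, and times are assumed to be recorded in nanoseconds.

```java
// Hypothetical tracker for illustration only; not taken from the controller code base.
final class CommitStatsSketch {
    private long totalCommits;
    private long totalNanos;
    private long longestNanos;
    private long shortestNanos = Long.MAX_VALUE;

    // Called once per finished commit with its elapsed time in nanoseconds.
    synchronized void recordCommit(long elapsedNanos) {
        totalCommits++;
        totalNanos += elapsedNanos;
        longestNanos = Math.max(longestNanos, elapsedNanos);
        shortestNanos = Math.min(shortestNanos, elapsedNanos);
    }

    synchronized long getTotalCommits() {
        return totalCommits;
    }

    synchronized long getLongestCommitTime() {
        return longestNanos;
    }

    // Reports 0 until the first commit has been recorded.
    synchronized long getShortestCommitTime() {
        return totalCommits == 0 ? 0 : shortestNanos;
    }

    // Mean commit time in nanoseconds; 0.0 when nothing has been recorded.
    synchronized double getAverageCommitTime() {
        return totalCommits == 0 ? 0.0 : (double) totalNanos / totalCommits;
    }
}
```

A caller would typically take System.nanoTime() before and after a commit and pass the difference to recordCommit. Synchronized methods keep the sketch simple; with a single-threaded commit coordinator, contention on this lock should be low.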

Comment by Tom Pantelis [ 13/Aug/14 ]

Gerrits for stat work:

https://git.opendaylight.org/gerrit/#/c/9837/
https://git.opendaylight.org/gerrit/#/c/9797/

Comment by Tom Pantelis [ 11/Sep/14 ]

I added stats under this bug, but it was originally created to address performance issues. There have been performance fixes since then. Do we need to keep this bug open?

Comment by Tom Pantelis [ 16/Sep/14 ]

There have been a lot of performance enhancements made upstream since this bug was opened, and it appears the IMDS is in pretty good shape. This bug ended up being used to add stats.

Generated at Wed Feb 07 19:53:33 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.