[OPNFLWPLUG-802] Cbench test - ODL throughput drops by increasing the number of switches Created: 18/Oct/16  Updated: 27/Sep/21  Resolved: 16/Oct/17

Status: Resolved
Project: OpenFlowPlugin
Component/s: General
Affects Version/s: None
Fix Version/s: None

Type: Bug
Reporter: Mohamad Darianian Assignee: Luis Gomez
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Attachments: Cbench Used Commands.txt, Installed features of TP test.txt, Yourkit Java Profiler Error (When Profiling ODL).txt
External issue ID: 6983

 Description   

ODL version: Beryllium-SR2
Benchmarking Tool: Cbench

I ran a throughput test with cbench against ODL and ONOS and noticed a huge gap between the numbers from the two controllers. Overall, ONOS throughput is almost 10x higher and its latency almost 2x better than ODL's.

When scaling the number of switches from 1 to 8, throughput improves (the maximum observed throughput is with 8 switches). However, throughput drops when the number of switches is increased beyond 8.

Following the above observation, I found that ONOS and ODL use the same I/O library (Netty), and ODL even uses a newer version. Using more CPU threads, assigning more memory, and tuning the Java heap size do not change the numbers much (less than 5%).

Considering that these two controllers use the same I/O library (which is what the cbench throughput test exercises), and that allocating more resources to ODL does not improve performance, I suspect there may be a bug in the ODL code.

Cheers,
Mohamad



 Comments   
Comment by Mohamad Darianian [ 18/Oct/16 ]

Attachment Yourkit Java Profiler Error (When Profiling ODL).txt has been added with description: Java profiler error in TP test

Comment by Mohamad Darianian [ 18/Oct/16 ]

Attachment Installed features of TP test.txt has been added with description: ODL installed features for the throughput test

Comment by Mohamad Darianian [ 18/Oct/16 ]

Attachment Cbench Used Commands.txt has been added with description: Cbench commands

Comment by Luis Gomez [ 20/Oct/16 ]

Thanks Mohamad, I will take a deeper look at the cbench test next week, once I am done with the longevity and scalability test refactor.

Comment by Luis Gomez [ 16/Oct/17 ]

This was never updated in the bug, but there was a mail thread:

--------------------------------------------------------------------------------------

I recently ran some cbench tests with 16 switches on my laptop. ONOS Goldeneye was ~50% faster (60K vs 40K) than ODL Beryllium; however, ODL Boron was ~50% faster (~90K) than ONOS.

I used this command: cbench -c 192.168.0.1 -t -m 12000 -M 100 -l 5 -s 16 -D 5000

Of course, other cbench options (more switches, etc.) could produce different numbers, which is why I am asking Mohamad to post the cbench commands.
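
For reference, a minimal sketch (not part of the original thread) of how the command above could be repeated across several switch counts so that results stay comparable. The helper names `cbench_command` and `sweep` and the list of switch counts are hypothetical; the flag meanings in the comments follow cbench's usage text as commonly documented and should be checked against the installed build.

```python
import shlex


def cbench_command(controller, switches, ms_per_test=12000, macs=100,
                   loops=5, delay_ms=5000):
    """Build a throughput-mode cbench command as an argument list.

    Defaults mirror the command quoted in the thread.
    """
    return [
        "cbench",
        "-c", controller,        # controller address
        "-t",                    # throughput mode
        "-m", str(ms_per_test),  # length of each test in milliseconds
        "-M", str(macs),         # unique source MAC addresses per switch
        "-l", str(loops),        # number of test loops
        "-s", str(switches),     # number of emulated switches
        "-D", str(delay_ms),     # delay (ms) before testing starts
    ]


def sweep(controller, switch_counts=(1, 2, 4, 8, 16, 32)):
    """Return one shell-ready command line per switch count (assumed values)."""
    return [shlex.join(cbench_command(controller, n)) for n in switch_counts]


if __name__ == "__main__":
    for line in sweep("192.168.0.1"):
        print(line)
```

Keeping every option except `-s` fixed makes it possible to attribute a change in throughput to the switch count alone.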

BR/Luis


I was wondering whether there is any conclusion or report that explains the limitation(s) of cbench for ODL performance evaluation. In essence, I'm trying to understand why, regardless of the limitation(s) of stress-testing tools (cbench, MT-cbench, etc.), ODL throughput is not that good compared to other controllers and drops drastically as the number of switches increases.
I would be grateful if you could shed some light on these questions.

Cheers,
Mohamad

On Fri, Sep 30, 2016 at 3:00 PM, Luis Gomez <ecelgp@gmail.com> wrote:
Hi Mohamad,

We use the cbench test in ODL only to detect performance regressions, not to obtain controller numbers, because of cbench's multiple limitations. Please check this report, which uses other tools such as MT-Cbench and Multinet:

https://raw.githubusercontent.com/wiki/intracom-telecom-sdn/nstat/files/ODL_performance_report_v1.2.pdf

Also please let me know if the results of this report match your observation.

BR/Luis

On Sep 30, 2016, at 9:57 AM, Mohamad Darianian <mohamad.drnn@gmail.com> wrote:

Hi Luis,

Hope all is well. My colleague and I posted to the ODL mailing lists about an ODL performance issue (specifically its throughput) a while ago, but didn't get any helpful feedback. The post is here: https://lists.opendaylight.org/pipermail/opendaylight-users/2016-September/000656.html

I saw that you're very active on the mailing lists and on bugs.opendaylight. Hence, I thought I'd drop you a line to see whether you know how to fix this issue, or the reason behind ODL's poor performance when the number of switches increases. This issue has also been highlighted in a few studies before. The version we are using is Beryllium-SR2.

I would greatly appreciate your help. Thank you!

Cheers,

Mohamad Darianian
Sr. Network/Security Engineer

--------------------------------------------------------------------------------------

Comment by Luis Gomez [ 16/Oct/17 ]

I think we can close this, as the Boron release showed good numbers.

Generated at Wed Feb 07 20:33:25 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.