[CONTROLLER-710] Increasing RAM usage over time Created: 20/Aug/14  Updated: 30/Oct/17  Resolved: 05/May/15

Status: Resolved
Project: controller
Component/s: mdsal
Affects Version/s: Helium
Fix Version/s: None

Type: Bug
Reporter: Daniel Farrell Assignee: Unassigned
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: Linux
Platform: PC


Attachments: Increasing RAM usage.png (PNG), Screen Shot 2014-08-26 at 7.04.49 PM.png (PNG)
External issue ID: 1591

 Description   

A number of devs have observed increasing RAM usage over time, in (at least) some cases at a very substantial rate.

An easy way to replicate this is to run WCBench[1] against ODL for a while, then use the included `stats.py` script to build a graph of used RAM per run (`./stats.py -g ram`; the RAM data is collected automatically).
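
For reference, a rough sketch of that replication workflow. Only the repository URL and `./stats.py -g ram` come from this report; the `wcbench.sh` invocation is an assumption, so check the project's README for the actual flags.

git clone https://github.com/dfarrell07/wcbench   # URL from this report
cd wcbench
./wcbench.sh -h     # assumption: prints usage; pick flags for repeated runs against ODL
./stats.py -g ram   # from this report: graph used RAM per run (data collected automatically)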

It's possible that this has to do with logging being enabled. It's also possible that this isn't harmful (unused RAM is wasted RAM). I'm not trying to wave a "serious problem" flag, just to document this behavior and hopefully get someone (maybe me when I have time) to give it some attention.

[1]: https://github.com/dfarrell07/wcbench



 Comments   
Comment by Daniel Farrell [ 20/Aug/14 ]

Attachment Increasing RAM usage.png has been added with description: Graph that shows increasing RAM usage over time, produced by WCBench

Comment by Abhijit Kumbhare [ 20/Aug/14 ]

Daniel - any reason you think this is openflowplugin rather than the controller?

Comment by Abhijit Kumbhare [ 20/Aug/14 ]

Changed the assignment to the controller project.

Comment by Daniel Farrell [ 20/Aug/14 ]

> Daniel - any reason you think this is openflowplugin rather than the controller?

I asked CASP3R about this; he suggested openflowplugin. Honestly, I'm not sure.

Comment by Jan Medved [ 27/Aug/14 ]

Attachment Screen Shot 2014-08-26 at 7.04.49 PM.png has been added with description: Profiler screen shot

Comment by Jan Medved [ 27/Aug/14 ]

Hi Daniel,

I hooked up the controller to a profiler and let it run cbench (dropAllPacketsRpc) for an extended period of time. Please see the output in the attached screenshot. You can see the old gen memory slowly growing. The heap is being garbage collected regularly. You can also see that no major garbage collections are happening.

At around the 1 hour 58 minute mark I manually triggered garbage collection from the profiler. You can see a major GC happening (the graph at the bottom of the screenshot), with all memory coming back down to pretty much where it was at the beginning of the cbench run.
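
(For anyone reproducing this without a GUI profiler, a minimal sketch using the standard JDK command-line tools; the sampling interval and the `<pid>` placeholder are mine, not from this report.)

jps -l                     # find the controller's JVM pid
jstat -gcutil <pid> 5000   # sample GC stats every 5 s: watch the O (old gen) column grow while FGC (full GC count) stays flat
jcmd <pid> GC.run          # manually trigger a full GC, analogous to forcing GC from the profiler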

I built the controller from today's master of the openflowplugin project, and used the following JVM parameters for the run:

> ./run.sh -Xmx4G -XX:+UseG1GC -XX:MaxPermSize=512m

I also set the logging level to ERROR and removed the 'simple forwarding' and 'arp handler' bundles from the distribution before running the test (they interfere with the test: although the loopback is set with dropAllPacketsRpc, these components still get packet_in notifications, and I am not sure what they are doing). The controller was running natively on a Mac; cbench was running in a Linux VM on the same Mac as the controller.
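
(As an aside, a possible way to take those bundles out of the picture without editing the distribution, assuming the legacy non-Karaf distribution's OSGi console is available after run.sh starts; the exact bundle names shown by `ss` may differ from the labels used above.)

ss                      # at the osgi> prompt: list bundles and their ids
stop <bundle-id>        # stop the simple forwarding / arp handler bundle for the test
uninstall <bundle-id>   # or remove it from the running framework entirely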

Another observation is that performance remains stable (i.e. I see no performance degradation over the course of 2 hours).

Comment by Daniel Farrell [ 19/Nov/14 ]

I'm not seeing this in Helium 0.2.1. Lots of details on this WCBench wiki page[1].

[1]: https://github.com/dfarrell07/wcbench/wiki/Helium-0.2.1-Performance-Results#results-0

Comment by Carol Sanders [ 04/May/15 ]

This bug is part of the project to move all ADSAL-associated component bugs to ADSAL.
