[OPNFLWPLUG-156] The Finisher queue size is unbounded and causes the controller to run out of memory under stress Created: 12/May/14  Updated: 27/Sep/21  Resolved: 13/May/14

Status: Resolved
Project: OpenFlowPlugin
Component/s: General
Affects Version/s: None
Fix Version/s: None

Type: Bug
Reporter: Jan Medved Assignee: Jan Medved
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: Mac OS
Platform: Macintosh


Attachments: Zip Archive Class-list.zip     Zip Archive Object-explorer.zip    
External issue ID: 985

 Description   

The TicketFinisher queue is unbounded (created with Integer.MAX_VALUE). Now that the MD-SAL queues apply backpressure, the plugin's Finisher queue will grow indefinitely and cause the controller to run out of memory under stress.

To reproduce, run the cbench throughput test on a controller that uses the OF1.3 plugin, for example:

cbench -c 192.168.162.1 -p 6633 -m 1000 -l 10 -s 16 -M 100000 -

This bug is similar to OPNFLWPLUG-119.

The Finisher queue should be created with a maximum number of elements (say 500 or 1000), and that maximum should be configurable.
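A minimal sketch of what a bounded, configurable queue could look like, using only standard java.util.concurrent classes. The class name FinisherQueueFactory, the property name finisher.queue.capacity, and the default of 1000 are assumptions made for illustration; they are not taken from the plugin's code.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical helper for illustration; not the plugin's actual class. */
final class FinisherQueueFactory {
    /** Hypothetical system property name, assumed for this sketch only. */
    static final String CAPACITY_PROPERTY = "finisher.queue.capacity";
    /** Assumed default, matching the upper value suggested in this ticket. */
    static final int DEFAULT_CAPACITY = 1000;

    private FinisherQueueFactory() {
    }

    /** Reads the capacity from a system property, falling back to the default. */
    static int configuredCapacity() {
        // Integer.getInteger returns the default when the property is unset or not numeric.
        int capacity = Integer.getInteger(CAPACITY_PROPERTY, DEFAULT_CAPACITY);
        // A non-positive capacity would be rejected by ArrayBlockingQueue, so fall back.
        return capacity > 0 ? capacity : DEFAULT_CAPACITY;
    }

    /** Creates a queue that can never hold more than the configured number of elements. */
    static <T> BlockingQueue<T> newQueue() {
        return new ArrayBlockingQueue<>(configuredCapacity());
    }
}
```

With this shape, starting the JVM with -Dfinisher.queue.capacity=500 would select the smaller limit without a code change.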



 Comments   
Comment by Jan Medved [ 12/May/14 ]

Changed the queue initialization to limit the Finisher queue to 1000 elements in https://git.opendaylight.org/gerrit/6885.
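For illustration, a bound only helps if the producer reacts when the queue is full. The sketch below shows one possible reaction, a timed offer that reports failure so the caller can slow down or drop work. It is not the code from the Gerrit change; the class and method names are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo class; it does not mirror the plugin's implementation. */
final class BoundedEnqueueDemo {
    private BoundedEnqueueDemo() {
    }

    /**
     * Tries to enqueue an item, waiting up to timeoutMillis for free space.
     * Returns false if the queue stayed full or the thread was interrupted.
     */
    static <T> boolean enqueue(BlockingQueue<T> queue, T item, long timeoutMillis) {
        try {
            return queue.offer(item, timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still observe it.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Capacity of 1000 as in the comment above; no consumer is running here.
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>(1000);
        int rejected = 0;
        for (int i = 0; i < 1500; i++) {
            if (!enqueue(queue, i, 1)) {
                rejected++;
            }
        }
        System.out.println("queued=" + queue.size() + " rejected=" + rejected);
    }
}
```

The alternative, BlockingQueue.put, blocks the producer until space is available, which pushes the backpressure further upstream instead of reporting a failure.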

The Finisher queue no longer grows without bound, but another out-of-memory condition now manifests itself somewhere in the parser; see the attached memory/object dump from the YourKit profiler.

Comment by Jan Medved [ 12/May/14 ]

Attachment Class-list.zip has been added with description: Class list dump from YourKit

Comment by Jan Medved [ 12/May/14 ]

Showing the objects in the hash map that were not garbage collected.

Comment by Jan Medved [ 12/May/14 ]

Attachment Object-explorer.zip has been added with description: Object Dump from YourKit

Comment by Jan Medved [ 12/May/14 ]

(In reply to Jan Medved from comment #0)

One more thing: with this change the controller survives 3-4 runs of cbench in throughput mode, whereas before it locked up after one run.

Comment by Robert Varga [ 12/May/14 ]

I ran a simple 'start SP' test and profiled it. It turns out we do have a leak in yangtools: one of the node implementations is holding a reference which it should not. We have already pushed a few improvements to memory usage; expect memory usage to drop by ~120MB once the fix is in. The leak could also explain why NETCONF has such sucky scaling numbers.

Comment by Robert Varga [ 12/May/14 ]

BUG-987 is tracking the yangtools leak.

Comment by Michal Rehak [ 13/May/14 ]

https://git.opendaylight.org/gerrit/6885

Comment by Michal Rehak [ 13/May/14 ]

please verify
