[OPNFLWPLUG-597] Longevity Li plugin: controller Out Of Memory after 20 runs Created: 19/Jan/16  Updated: 27/Sep/21  Resolved: 30/Jan/16

Status: Resolved
Project: OpenFlowPlugin
Component/s: General
Affects Version/s: None
Fix Version/s: None

Type: Bug
Reporter: Luis Gomez Assignee: Robert Varga
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 5001

 Description   

The current CI longevity test shows the Li plugin running out of memory after 20 runs of bringing up 200 switches in the network:

https://jenkins.opendaylight.org/releng/view/openflowplugin/job/openflowplugin-csit-1node-periodic-longevity-lithium-redesign-only-beryllium/

BR/Luis



 Comments   
Comment by Robert Varga [ 24/Jan/16 ]

Can we reconfigure the test to capture a memory dump?

Comment by Robert Varga [ 25/Jan/16 ]

Memory allocation failure when creating a thread – this points to failure to allocate thread stack (8MB by default), which actually lies outside of the heap.

2016-01-25 12:10:54,210 | INFO | ol-7903-thread-1 | SalRoleServiceImpl | 168 - org.opendaylight.openflowplugin.impl - 0.2.0.SNAPSHOT | RoleChangeTask called on device:openflow:124 OFPRole:BECOMEMASTER
2016-01-25 12:10:54,810 | INFO | ol-7905-thread-1 | SalRoleServiceImpl | 168 - org.opendaylight.openflowplugin.impl - 0.2.0.SNAPSHOT | RoleChangeTask called on device:openflow:125 OFPRole:BECOMEMASTER

Noting the high sequence number, I suspect somebody somewhere is allocating threadpools and not shutting them down, leading to a huge number of threads which are sitting idle.
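
As an illustration of the sequence-number reasoning, here is a minimal Java sketch. It assumes the truncated thread names in the log (`ol-7903-thread-1`) come from the JDK default thread factory, which names threads `pool-N-thread-M` and increments N for every new pool in the JVM. The class and method names (`PoolNameDemo`, `workerNameOfNewPool`, `poolSequence`) are hypothetical and not part of openflowplugin.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PoolNameDemo {
    /** Creates a fresh single-threaded executor and returns its worker thread's name. */
    static String workerNameOfNewPool() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = () -> Thread.currentThread().getName();
            return pool.submit(task).get();
        } catch (Exception e) {
            throw new IllegalStateException("could not read worker thread name", e);
        } finally {
            // Shut down here so this demo itself does not leak threads.
            pool.shutdown();
        }
    }

    /** Extracts N from a name of the form "pool-N-thread-M". */
    static int poolSequence(String threadName) {
        return Integer.parseInt(threadName.split("-")[1]);
    }

    public static void main(String[] args) {
        String first = workerNameOfNewPool();
        String second = workerNameOfNewPool();
        // The pool number grows with every executor created in this JVM.
        System.out.println(first + " then " + second);
    }
}
```

A pool number in the thousands therefore suggests that thousands of executors have been created since the JVM started. It does not by itself prove they are still alive, which is why a thread dump is the natural next check.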

Comment by Robert Varga [ 26/Jan/16 ]

Boils down to SalRoleServiceImpl allocating a single-threaded executor, which is not being closed at all.

master: https://git.opendaylight.org/gerrit/33521
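
The leak pattern described above can be sketched as follows. This is not the actual `SalRoleServiceImpl` code or the merged patch; `RoleServiceSketch` and its methods are hypothetical names used only to show an object that owns a single-threaded executor and releases it in `close()`.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class RoleServiceSketch implements AutoCloseable {
    // One worker thread per service instance (hypothetical stand-in for a per-device service).
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    void submitRoleChange(Runnable task) {
        executor.execute(task);
    }

    /** Without this call the worker thread stays parked for as long as the service is referenced. */
    @Override
    public void close() {
        executor.shutdown();
    }

    /** Returns true if the executor terminated within the timeout. */
    boolean awaitStopped(long timeoutMillis) {
        try {
            return executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        RoleServiceSketch service = new RoleServiceSketch();
        service.submitRoleChange(() -> { });
        // Not closed yet: the idle worker thread is still alive.
        System.out.println("stopped before close: " + service.awaitStopped(200));
        service.close();
        System.out.println("stopped after close: " + service.awaitStopped(2000));
    }
}
```

If such a service is created every time a switch connects and is never closed, each test run adds idle threads, and each thread reserves its own stack outside the heap, which matches the allocation failure seen when creating a thread.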

Comment by Robert Varga [ 26/Jan/16 ]

boron: https://git.opendaylight.org/gerrit/33521
boron: https://git.opendaylight.org/gerrit/33560

beryllium: https://git.opendaylight.org/gerrit/33551
beryllium: https://git.opendaylight.org/gerrit/33562

Comment by Michal Rehak [ 27/Jan/16 ]

merged

Comment by Luis Gomez [ 30/Jan/16 ]

Robert, I do not know how you do it in yangtools, but in the openflowplugin project we ask the bug reporter to retest before closing the bug.

Comment by Robert Varga [ 30/Jan/16 ]

Well, this has been the workflow on all projects I have participated in ever since OpenDaylight came into existence.

What is the Bugzilla state for 'the patch merged, waiting for confirmation that the bug is gone' and what does the RESOLVED/VERIFIED state stand for? Is there a wiki page describing these for openflowplugin?

Comment by Luis Gomez [ 30/Jan/16 ]

OK, you are right: there is no project instruction, and Bugzilla does not seem to have states for what I believe are good and common practices in a bug/issue workflow. I hope this will be better handled in Jira.

Generated at Wed Feb 07 20:32:53 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.