[OVSDB-345] thread leak when connecting/disconnecting ovs nodes in a loop Created: 27/May/16  Updated: 15/Jun/16  Resolved: 15/Jun/16

Status: Resolved
Project: ovsdb
Component/s: Library
Affects Version/s: unspecified
Fix Version/s: None

Type: Bug
Reporter: Jamo Luhrsen Assignee: Dileep Ranganathan
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 5976

 Description   

just do this in a loop:

ovs-vsctl set-manager tcp:${ODL_IP}:6640
sleep 1;
ovs-vsctl del-manager
sleep 1;

Each connection spawns some threads, and they are not cleaned up,
neither on disconnection nor after waiting for some period of time.

The installed feature is odl-ovsdb-southbound-impl-rest.



 Comments   
Comment by Anil Vishnoi [ 27/May/16 ]

Adding more details from the mail discussion.

This is what I did:

1) Started the controller with ovsdb
2) Connected the profiler
3) Once the controller was up and stable, took the first thread dump
4) Then started the script that sets and deletes the manager (ovs-vsctl set-manager tcp:192.168.122.1:6640; sleep 4; ovs-vsctl show; sleep 4; ovs-vsctl del-manager; ovs-vsctl show) 200 times
5) Went to sleep
6) The next morning, stopped the controller; the thread monitor was constant, which means the controller had cooled down after these 200 connections/disconnections
7) Took another thread dump

You can look at these thread dumps at the following URLs:
First Thread Dump (Step 3 above) : https://gist.githubusercontent.com/vishnoianil/c87bc620f0b35a87cd698bcadc244cc4/raw/4010fe1b758015cb2fd1f6f0d801eaafd171dcbd/first-thread-dump.txt
Last Thread Dump (Step 7 above) : https://gist.githubusercontent.com/vishnoianil/48d3be92630d93f4e5317474b01a681e/raw/75da31ef97a0e59431c1f7d02d62d8d7f3894363/last-thread-dump
Threads that are additional in Last thread dump compared to First Thread Dump : https://gist.githubusercontent.com/vishnoianil/56b9e5ce8b8b25203d09d1a4b7d67049/raw/3f108b62a0cef6c9d4681636fb8b664ada17e81b/newly-created-and-not-reaped-threads.txt

This is what I see:

Total Threads in First Dump : 98
Total Threads in Last Dump : 313
Additional Threads spawned : ~215
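
The comparison of the two dumps can also be scripted. A rough sketch, assuming every thread entry starts with a header line such as Thread "pool-95-thread-1": (the format quoted in this comment; other dump formats would need a different pattern). The class and method names are made up for illustration and are not part of ovsdb:

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ThreadDumpDiff {
    // Assumed header format: optional word "Thread", then the thread name in double quotes.
    private static final Pattern HEADER = Pattern.compile("^\\s*(?:Thread\\s+)?\"([^\"]+)\"");

    /** Collects the thread names found on header lines; stack frame lines are ignored. */
    static Set<String> threadNames(List<String> dumpLines) {
        Set<String> names = new LinkedHashSet<>();
        for (String line : dumpLines) {
            Matcher m = HEADER.matcher(line);
            if (m.find()) {
                names.add(m.group(1));
            }
        }
        return names;
    }

    /** Names present in the last dump but not in the first one. */
    static Set<String> addedThreads(List<String> first, List<String> last) {
        Set<String> added = threadNames(last);
        added.removeAll(threadNames(first));
        return added;
    }

    /** Groups names by family: every run of digits becomes "N" (pool-95-thread-1 -> pool-N-thread-N). */
    static Map<String, Integer> countByFamily(Set<String> names) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : names) {
            counts.merge(name.replaceAll("\\d+", "N"), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Tiny inline sample; real input would be the lines of the two dump files.
        List<String> first = Arrays.asList("Thread \"main\":", "Thread \"pool-1-thread-1\":");
        List<String> last = Arrays.asList(
                "Thread \"main\":",
                "Thread \"pool-1-thread-1\":",
                "Thread \"pool-95-thread-1\":",
                "at sun.misc.Unsafe.park(boolean, long)",
                "Thread \"pool-96-thread-1\":",
                "Thread \"PassiveConnection-3\":");
        System.out.println(countByFamily(addedThreads(first, last)));
    }
}
```

Grouping by family makes a leak stand out: a family whose count grows with the number of connect/disconnect cycles is the one to look at.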

I see three categories of threads:

Thread "PassiveConnection-X" (9 threads) – These are expected.
Thread "nioEventLoopGroup-5-X" ( 7 Threads) – These are expected as well for netty library.

Thread "pool-XX-thread-Y" (197 threads) – These threads are the problem. Each thread has the same stack, and that stack belongs to an executor pool worker thread.
Thread "pool-95-thread-1":
at sun.misc.Unsafe.park(boolean, long)
at java.util.concurrent.locks.LockSupport.park(java.lang.Object) (line: 175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() (line: 2039)
at java.util.concurrent.LinkedBlockingQueue.take() (line: 442)
at java.util.concurrent.ThreadPoolExecutor.getTask() (line: 1067)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) (line: 1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run() (line: 617)
at java.lang.Thread.run() (line: 745)
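
For reference, this is the stack of an idle ThreadPoolExecutor worker whose pool was never shut down: it parks in LinkedBlockingQueue.take() waiting for the next task. A minimal sketch of how such threads pile up, assuming (this is not confirmed anywhere in this issue) that some code creates one default-named executor per connection and never calls shutdown(). ExecutorLeakDemo and its methods are hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class ExecutorLeakDemo {

    /** Counts live threads that carry the JDK's default "pool-N-thread-M" name. */
    static long defaultPoolThreads() {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.getName().matches("pool-\\d+-thread-\\d+"))
                .count();
    }

    /** Simulates connect/disconnect cycles that each create an executor and never shut it down. */
    static List<ExecutorService> leakyCycles(int cycles) {
        List<ExecutorService> created = new ArrayList<>();
        try {
            for (int i = 0; i < cycles; i++) {
                ExecutorService executor = Executors.newFixedThreadPool(1);
                created.add(executor);
                // Stand-in for per-connection work. Running it starts the worker thread,
                // which then waits in LinkedBlockingQueue.take() for more tasks.
                executor.submit(() -> { }).get();
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
        return created;
    }

    /** The missing cleanup: shut every executor down, then wait until at most target pool threads remain. */
    static void shutDownAll(List<ExecutorService> executors, long target) {
        try {
            for (ExecutorService executor : executors) {
                executor.shutdown();
                executor.awaitTermination(5, TimeUnit.SECONDS);
            }
            // A worker can still be alive for an instant after its pool reports termination.
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (defaultPoolThreads() > target && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        long baseline = defaultPoolThreads();
        List<ExecutorService> executors = leakyCycles(5);
        System.out.println("extra pool threads after 5 cycles: " + (defaultPoolThreads() - baseline));
        shutDownAll(executors, baseline);
        System.out.println("extra pool threads after shutdown: " + (defaultPoolThreads() - baseline));
    }
}
```

The usual remedies for this pattern are to call shutdown() when the connection goes away, or to share one long-lived executor across connections instead of creating one per connection.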

As of now I am not sure whether we are creating these threads or whether one of the core controller components is causing them. I am working on a patch that names all executor pools in the ovsdb code, so that we can figure out whether these "pool-XX-thread-Y" threads are created by us or by someone else. But it is indeed a thread leak somewhere.
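
Naming a pool comes down to passing a ThreadFactory to the executor. The patch itself is not shown in this issue, so the following is only a plain-JDK sketch of the idea; NamedThreadFactory and the "ovsdb-example-pool" prefix are hypothetical and not taken from the ovsdb code:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

final class NamedThreadFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger(1);
    // Delegate to the default factory so daemon flag, priority and thread group stay unchanged.
    private final ThreadFactory delegate = Executors.defaultThreadFactory();

    NamedThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable task) {
        Thread thread = delegate.newThread(task);
        // Replace the anonymous "pool-N-thread-M" name with one that identifies the owner.
        thread.setName(prefix + "-" + counter.getAndIncrement());
        return thread;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor =
                Executors.newFixedThreadPool(1, new NamedThreadFactory("ovsdb-example-pool"));
        try {
            Callable<String> whoAmI = () -> Thread.currentThread().getName();
            System.out.println(executor.submit(whoAmI).get());
        } finally {
            executor.shutdown();
        }
    }
}
```

With every ovsdb pool named this way, any thread still showing up as "pool-XX-thread-Y" in a dump would have to come from code outside ovsdb.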

Comment by Dileep Ranganathan [ 07/Jun/16 ]

https://git.opendaylight.org/gerrit/#/c/39917/

Comment by Sam Hague [ 14/Jun/16 ]

be: https://git.opendaylight.org/gerrit/#/c/39986/
b: https://git.opendaylight.org/gerrit/#/c/39917/

Comment by Jamo Luhrsen [ 14/Jun/16 ]

I'm going to bring a CSIT test case to check for this. Once I do, I'll
mark this verified/fixed.

Comment by Anil Vishnoi [ 14/Jun/16 ]

I was about to ask for this and you sensed it, thanks a lot Jamo. I need to create the wiki page that lists all the bugs for which we need to write integration tests. I will do that as soon as possible.

Comment by Sam Hague [ 15/Jun/16 ]

Anil, Jamo,

add the test cases to the spreadsheet Venkat and Josh started: https://docs.google.com/spreadsheets/d/1n4yCFc9kogkkRrWwFjrVJyDPeO4lu79PZtwjuourDWg/edit#gid=0
