[CONTROLLER-2028] Thousands of guava Finalizer threads in waiting Created: 24/Jan/22  Updated: 04/Feb/22  Resolved: 03/Feb/22

Status: Resolved
Project: controller
Component/s: None
Affects Version/s: 4.0.7
Fix Version/s: None

Type: Bug Priority: Medium
Reporter: Martin Sunal Assignee: Unassigned
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File 10000kdump10k.txt     Text File 2000kdump10k.txt     Text File 5000kdump10k.txt     Text File 7000kdump10k.txt     Text File before-mount-dump10k.txt    

 Description   

Distribution: onap-karaf-0.15.1
Features installed: odl-netconf-topology, odl-restconf-nb-rfc8040, odl-mdsal-apidocs, jolokia

Situation:

Mounting 10 000 netconf-testtool devices (by sending the PUT mount requests one after another) leaves 10 000 threads in the WAITING state.

Thread dumps (jstack <pid>) were captured while mounting the 10k devices:

before-mount-dump10k.txt - thread dump after ODL starts, before sending PUT requests
2000kdump10k.txt - thread dump after 2k PUT mount requests
5000kdump10k.txt - thread dump after 5k PUT mount requests
7000kdump10k.txt - thread dump after 7k PUT mount requests
10000kdump10k.txt - thread dump after 10k PUT mount requests
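
To compare the dumps listed above, the number of threads per thread name can be counted in each file. The following is a minimal sketch, not something that was run as part of this report. The class name ThreadDumpCounter is hypothetical, and the sketch assumes the jstack header format shown in the snippet in this description (quoted thread name followed by " #<id>").

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
public class ThreadDumpCounter {
    // Assumed jstack header format: "<thread name>" #<id> ...
    // Header lines without " #<id>" (some JVM-internal threads) are skipped.
    private static final Pattern HEADER = Pattern.compile("^\"([^\"]+)\" #\\d+");

    /** Counts thread headers per thread name in the lines of one jstack dump. */
    public static Map<String, Integer> countByName(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = HEADER.matcher(line);
            if (m.find()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        String name = "com.google.common.base.internal.Finalizer";
        for (String file : args) {
            Map<String, Integer> counts = countByName(Files.readAllLines(Path.of(file)));
            System.out.println(file + ": " + counts.getOrDefault(name, 0) + " Finalizer threads");
        }
    }
}
```

Passing the attached dump files as arguments would print one count per file, which shows whether the number of Finalizer threads grows with the number of mounted devices.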

 

Snippet of a thread in waiting:

"com.google.common.base.internal.Finalizer" #4549 daemon prio=5 os_prio=0 cpu=0.09ms elapsed=76.30s tid=0x00007f4e8c1f9800 nid=0x7a78 in Object.wait()  [0x00007f4d50f50000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(java.base@11.0.12/Native Method)
        - waiting on <no object reference available>
        at java.lang.ref.ReferenceQueue.remove(java.base@11.0.12/ReferenceQueue.java:155)
        - waiting to re-lock in wait() <0x00000007272f7b60> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(java.base@11.0.12/ReferenceQueue.java:176)
        at com.google.common.base.internal.Finalizer.run(Finalizer.java:145)
        at java.lang.Thread.run(java.base@11.0.12/Thread.java:829)

I found only one place in the code that uses Finalizer internally:
https://github.com/opendaylight/controller/blob/37e9a32e285a174443c312ef257fc1df359b50d9/opendaylight/md-sal/sal-distributed-datastore/src/main/java/org/opendaylight/controller/cluster/datastore/TransactionContextCleanup.java#L33

This is based on the Guava documentation for Finalizer:
https://guava.dev/releases/7.0/api/docs/com/google/common/base/internal/Finalizer.html
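
For illustration, the stack trace above (a daemon thread blocked in ReferenceQueue.remove()) can be approximated with the JDK alone. This is not Guava's code and not the controller's code. It is a sketch of the general pattern under the assumption that each queue gets its own cleanup thread, so creating N queues leaves N threads in WAITING. The class name QueueThreadSketch and the thread names are made up for this example.

```java
import java.lang.ref.ReferenceQueue;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a "one cleanup thread per reference queue" design.
public class QueueThreadSketch {
    /** Creates `count` queues and starts one daemon thread per queue. */
    public static List<Thread> startQueues(int count) {
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            ReferenceQueue<Object> queue = new ReferenceQueue<>();
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        queue.remove(); // blocks until a reference is enqueued
                    }
                } catch (InterruptedException e) {
                    // interrupted: let the thread exit
                }
            }, "sketch-finalizer-" + i);
            t.setDaemon(true);
            t.start();
            threads.add(t);
        }
        return threads;
    }

    /** Counts how many of the given threads are currently in the WAITING state. */
    public static long countWaiting(List<Thread> threads) {
        return threads.stream().filter(t -> t.getState() == Thread.State.WAITING).count();
    }

    public static void main(String[] args) throws InterruptedException {
        List<Thread> threads = startQueues(5);
        Thread.sleep(500); // give the threads time to block in remove()
        System.out.println(countWaiting(threads) + " of " + threads.size() + " threads WAITING");
        threads.forEach(Thread::interrupt);
    }
}
```

If the dumps show one such thread per mounted device, the thing to look for would be a reference queue (or equivalent) being created per device or per session rather than once.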

 



 Comments   
Comment by Robert Varga [ 03/Feb/22 ]

I cannot reproduce the problem with the information provided.
Furthermore, even if I could, the referenced field cannot reasonably be the cause of the issue: it is a static field, and the JLS is rather clear on how those work.

Comment by Martin Sunal [ 04/Feb/22 ]

I tried it on a different system and cannot reproduce it there.

If I figure out the conditions under which it can be reproduced, I will reopen this issue.

Generated at Wed Feb 07 19:57:01 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.