[GENIUS-114] Genius karaf shutdown deadlocks Created: 19/Feb/18 Updated: 19/Feb/18 Resolved: 19/Feb/18 |
|
| Status: | Resolved |
| Project: | genius |
| Component/s: | General |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Highest |
| Reporter: | Michael Vorburger | Assignee: | Unassigned |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
| Issue Links: |
| Description |
|
We currently cannot cleanly shut down Karaf with genius installed anymore. To reproduce:
cd genius/karaf
git checkout master
git pull
mvn clean package
target/assembly/bin/karaf
feature:install odl-genius
shutdown
This "hangs" - until you Ctrl-C it. Attached is the jstack from this, showing a few non-daemon threads I suspect are responsible for this. I'll open separate new linked issues for problems in projects outside of genius likely causing this. |
| Comments |
| Comment by Michael Vorburger [ 19/Feb/18 ] |
$ cat ~/GENIUS-114-jstack.txt | grep os_prio= | grep -v daemon
"Framework stop" #329 prio=5 os_prio=0 tid=0x00007fbd640073a0 nid=0x3e6e waiting for monitor entry [0x00007fbd47c9b000]
"nioEventLoopGroup-4-1" #291 prio=10 os_prio=0 tid=0x00007fbdbc146f60 nid=0x3d81 runnable [0x00007fbd45080000]
"Thread-39" #283 prio=5 os_prio=0 tid=0x00007fbd6c028780 nid=0x3d79 in Object.wait() [0x00007fbd45884000]
"transaction-invoker-impl-0" #280 prio=5 os_prio=0 tid=0x00007fbd843ceb00 nid=0x3d76 waiting on condition [0x00007fbd45d87000]
"pool-35-thread-2" #246 prio=5 os_prio=0 tid=0x00007fbd80081940 nid=0x3d53 waiting on condition [0x00007fbd47e9e000]
"pool-35-thread-1" #245 prio=5 os_prio=0 tid=0x00007fbd80080710 nid=0x3d52 waiting on condition [0x00007fbd47d9d000]
"config-blank-txn-0" #162 prio=5 os_prio=0 tid=0x00007fbdd0063770 nid=0x3d01 waiting on condition [0x00007fbd4ce48000]
"config-bundle-tracker-0" #157 prio=5 os_prio=0 tid=0x00007fbd54b5a350 nid=0x3cfc waiting on condition [0x00007fbd4e170000]
"RMI Reaper" #77 prio=5 os_prio=0 tid=0x00007fbd781be090 nid=0x3c90 in Object.wait() [0x00007fbd4ebec000]
"features-1-thread-1" #28 prio=5 os_prio=0 tid=0x00007fbd5813b4f0 nid=0x3c61 waiting on condition [0x00007fbd8a50b000]
"Karaf Lock Monitor Thread" #17 prio=5 os_prio=0 tid=0x00007fbdec5d4140 nid=0x3c55 waiting on condition [0x00007fbd8bb43000]
"Active Thread: Equinox Container: 5e136c7a-ea62-4200-96f7-f3a0acddaff9" #13 prio=5 os_prio=0 tid=0x00007fbdec48a880 nid=0x3c52 waiting on condition [0x00007fbdb010d000]
"main" #1 prio=5 os_prio=0 tid=0x00007fbdec00a8d0 nid=0x3c40 in Object.wait() [0x00007fbdf343e000]
"VM Thread" os_prio=0 tid=0x00007fbdec084e00 nid=0x3c45 runnable
"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007fbdec01fe80 nid=0x3c41 runnable
"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007fbdec0212f0 nid=0x3c42 runnable
"GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00007fbdec022760 nid=0x3c43 runnable
"GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007fbdec023bd0 nid=0x3c44 runnable
"VM Periodic Task Thread" os_prio=0 tid=0x00007fbdec2b0a40 nid=0x3c4e waiting on condition

I've started creating issues ... actually I'm not so sure that these non-daemon threads are what's causing this. Can others chime in here? Non-daemon threads would only block shutdown AFTER the main thread returns, right? The main thread is stuck, however:

"main" #1 prio=5 os_prio=0 tid=0x00007fbdec00a8d0 nid=0x3c40 in Object.wait() [0x00007fbdf343e000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000807aadf0> (a java.util.concurrent.atomic.AtomicReference)
    at org.eclipse.osgi.container.SystemModule.waitForStop(SystemModule.java:168)
    - locked <0x00000000807aadf0> (a java.util.concurrent.atomic.AtomicReference)
    at org.eclipse.osgi.internal.framework.EquinoxBundle$SystemBundle.waitForStop(EquinoxBundle.java:250)
    at org.eclipse.osgi.launch.Equinox.waitForStop(Equinox.java:181)
    at org.apache.karaf.main.Main.awaitShutdown(Main.java:631)
    at org.apache.karaf.main.Main.main(Main.java:189)

I do not understand what this is waiting for without digging much deeper, which I don't currently have time for. |
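For anyone who wants to run the same daemon/non-daemon check from inside a JVM instead of grepping a jstack dump, here is a minimal sketch. It only uses `Thread.getAllStackTraces()` and `Thread.isDaemon()` from the JDK. The class name `NonDaemonThreads` and the thread name `hypothetical-worker` are made up for illustration and are not part of genius or Karaf.

```java
import java.util.List;
import java.util.stream.Collectors;

class NonDaemonThreads {

    /** Returns the sorted names of all live non-daemon threads in this JVM. */
    static List<String> liveNonDaemonThreadNames() {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.isAlive() && !t.isDaemon())
                .map(Thread::getName)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical worker: a non-daemon thread keeps the JVM alive
        // after main() returns, until the thread itself finishes.
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(500);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "hypothetical-worker");
        worker.setDaemon(false);
        worker.start();

        // Prints the names of the threads that would delay JVM exit.
        System.out.println(liveNonDaemonThreadNames());
        worker.join();
    }
}
```

Note that this only lists Java threads that would delay JVM exit once `main` returns; it says nothing about why `main` itself is still blocked.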
| Comment by Tom Pantelis [ 19/Feb/18 ] |
|
The stuck thread is in Object.wait() inside waitForStop. I see no other threads that are really doing anything, so I suspect whatever code was supposed to notify the Object didn't for some reason, possibly due to an exception/error path that skipped the notify. I think we'd need to see the karaf log that hopefully shows a smoking gun exception. |
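The suspected failure mode (an error path that skips the notify, leaving the waiter parked forever) can be sketched in plain Java. This is not the Equinox `SystemModule` code. `StopSignal`, `waitForStop(long)`, `markStopped()` and `stop(Runnable)` are hypothetical names chosen for illustration. The sketch puts the notify in a `finally` block so a failing shutdown step cannot strand the waiter, and it bounds the wait with a timeout.

```java
class StopSignal {
    private final Object lock = new Object();
    private boolean stopped = false;

    /**
     * Waits until the stop has been signalled or the timeout elapses.
     * Returns true if the stop was observed, false on timeout or interrupt.
     */
    boolean waitForStop(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        synchronized (lock) {
            // Loop on the condition to tolerate spurious wakeups.
            while (!stopped) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    return false; // timed out; never call wait(0), which waits forever
                }
                try {
                    lock.wait(remaining);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        }
    }

    /** Records the stop and wakes every waiter. */
    private void markStopped() {
        synchronized (lock) {
            stopped = true;
            lock.notifyAll();
        }
    }

    /** Runs the shutdown work and signals waiters even if the work throws. */
    void stop(Runnable shutdownWork) {
        try {
            shutdownWork.run();
        } finally {
            markStopped();
        }
    }
}
```

If `markStopped()` were called after `shutdownWork.run()` without the `finally`, an exception from the shutdown work would propagate past it and a thread blocked in an untimed wait would never wake up, which matches the symptom described above.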
| Comment by Michael Vorburger [ 19/Feb/18 ] |
|
> I think we'd need to see the karaf log that hopefully shows a smoking gun exception.
Unfortunately I wiped it, thinking this was easy to reproduce, and now I cannot reproduce it anymore: it works?! Maybe a timing issue... I do hit other (weird) issues if I run shutdown too quickly, before diag returns empty. Closing this as CANNOT REPRO; if others hit this again, please reopen with new details. |