[NETCONF-577] sshd-2.0.0 seems to be stuck on shutdown Created: 29/Oct/18 Updated: 07/Jan/19 Resolved: 07/Jan/19 |
|
| Status: | Resolved |
| Project: | netconf |
| Component/s: | netconf |
| Affects Version/s: | Neon, Fluorine SR1 |
| Fix Version/s: | Neon, Fluorine SR2 |
| Type: | Bug | Priority: | High |
| Reporter: | Robert Varga | Assignee: | Tomas Cere |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Description |
|
This is seen in SFT: 2018-10-29T16:33:39,505 | ERROR | Framework stop | BeanRecipe | 18 - org.apache.aries.blueprint.core - 1.8.3 | The blueprint bean netconfMdsalServer in bundle mdsal-netconf-ssh/1.5.1.SNAPSHOT incorrectly threw an exception from its destroy method. java.net.SocketTimeoutException: Failed to receive closure confirmation within 60000 millis at org.apache.sshd.server.SshServer.stop(SshServer.java:341) [132:org.apache.sshd.core:2.0.0] at org.opendaylight.netconf.ssh.SshProxyServer.close(SshProxyServer.java:106) [102:mdsal-netconf-ssh:1.5.1.SNAPSHOT] at org.opendaylight.netconf.ssh.NetconfNorthboundSshServer.close(NetconfNorthboundSshServer.java:82) [102:mdsal-netconf-ssh:1.5.1.SNAPSHOT] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:?] at org.apache.aries.blueprint.utils.ReflectionUtils.invoke(ReflectionUtils.java:299) [18:org.apache.aries.blueprint.core:1.8.3] at org.apache.aries.blueprint.container.BeanRecipe.invoke(BeanRecipe.java:980) [18:org.apache.aries.blueprint.core:1.8.3] at org.apache.aries.blueprint.container.BeanRecipe.destroy(BeanRecipe.java:887) [18:org.apache.aries.blueprint.core:1.8.3] at org.apache.aries.blueprint.container.BlueprintRepository.destroy(BlueprintRepository.java:329) [18:org.apache.aries.blueprint.core:1.8.3] at org.apache.aries.blueprint.container.BlueprintContainerImpl.destroyComponents(BlueprintContainerImpl.java:745) [18:org.apache.aries.blueprint.core:1.8.3] at org.apache.aries.blueprint.container.BlueprintContainerImpl.tidyupComponents(BlueprintContainerImpl.java:956) [18:org.apache.aries.blueprint.core:1.8.3] at org.apache.aries.blueprint.container.BlueprintContainerImpl.destroy(BlueprintContainerImpl.java:890) [18:org.apache.aries.blueprint.core:1.8.3] at org.apache.aries.blueprint.container.BlueprintExtender$3.run(BlueprintExtender.java:325) [18:org.apache.aries.blueprint.core:1.8.3] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:?] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:?] at org.apache.aries.blueprint.container.BlueprintExtender.destroyContainer(BlueprintExtender.java:346) [18:org.apache.aries.blueprint.core:1.8.3] at org.apache.aries.blueprint.container.BlueprintExtender.access$400(BlueprintExtender.java:68) [18:org.apache.aries.blueprint.core:1.8.3] at org.apache.aries.blueprint.container.BlueprintExtender$BlueprintContainerServiceImpl.destroyContainer(BlueprintExtender.java:633) [18:org.apache.aries.blueprint.core:1.8.3] at org.opendaylight.controller.blueprint.BlueprintBundleTracker.shutdownAllContainers(BlueprintBundleTracker.java:286) [193:org.opendaylight.controller.blueprint:0.9.1.SNAPSHOT] at org.opendaylight.controller.blueprint.BlueprintBundleTracker.bundleChanged(BlueprintBundleTracker.java:191) [193:org.opendaylight.controller.blueprint:0.9.1.SNAPSHOT] at org.eclipse.osgi.internal.framework.BundleContextImpl.dispatchEvent(BundleContextImpl.java:908) [?:?] at org.eclipse.osgi.framework.eventmgr.EventManager.dispatchEvent(EventManager.java:230) [?:?] at org.eclipse.osgi.framework.eventmgr.ListenerQueue.dispatchEventSynchronous(ListenerQueue.java:148) [?:?] at org.eclipse.osgi.internal.framework.EquinoxEventPublisher.publishBundleEventPrivileged(EquinoxEventPublisher.java:213) [?:?] at org.eclipse.osgi.internal.framework.EquinoxEventPublisher.publishBundleEvent(EquinoxEventPublisher.java:120) [?:?] at org.eclipse.osgi.internal.framework.EquinoxEventPublisher.publishBundleEvent(EquinoxEventPublisher.java:112) [?:?] at org.eclipse.osgi.internal.framework.EquinoxContainerAdaptor.publishModuleEvent(EquinoxContainerAdaptor.java:168) [?:?] at org.eclipse.osgi.container.Module.publishEvent(Module.java:476) [?:?] at org.eclipse.osgi.container.Module.doStop(Module.java:634) [?:?] at org.eclipse.osgi.container.Module.stop(Module.java:498) [?:?] at org.eclipse.osgi.container.SystemModule.stop(SystemModule.java:202) [?:?] at org.eclipse.osgi.internal.framework.EquinoxBundle$SystemBundle$EquinoxSystemModule$1.run(EquinoxBundle.java:165) [?:?] at java.lang.Thread.run(Thread.java:748) [?:?] |
| Comments |
| Comment by Robert Varga [ 29/Oct/18 ] |
|
This can be seen during container shutdown, too, and seems to be cause by the following: "Framework stop" #205 daemon prio=5 os_prio=0 tid=0x00007ff94c04f000 nid=0x4a69 in Object.wait() [0x00007ff9d9d85000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) at org.apache.aries.blueprint.container.ReferenceRecipe.getService(ReferenceRecipe.java:211) - locked <0x00000000d8b44be0> (a java.lang.Object) at org.apache.aries.blueprint.container.ReferenceRecipe.access$000(ReferenceRecipe.java:56) at org.apache.aries.blueprint.container.ReferenceRecipe$ServiceDispatcher.call(ReferenceRecipe.java:306) at Proxy376b1c8f_149c_43ab_a8ef_84bcff61daa3.execute(Unknown Source) at sun.nio.ch.AsynchronousChannelGroupImpl.executeOnPooledThread(AsynchronousChannelGroupImpl.java:188) at sun.nio.ch.Invoker.invokeIndirectly(Invoker.java:212) at sun.nio.ch.UnixAsynchronousServerSocketChannelImpl.implClose(UnixAsynchronousServerSocketChannelImpl.java:118) at sun.nio.ch.AsynchronousServerSocketChannelImpl.close(AsynchronousServerSocketChannelImpl.java:111) at org.apache.sshd.common.io.nio2.Nio2Acceptor.unbind(Nio2Acceptor.java:113) at org.apache.sshd.common.io.nio2.Nio2Acceptor.unbind(Nio2Acceptor.java:99) at org.apache.sshd.common.io.nio2.Nio2Acceptor.close(Nio2Acceptor.java:141) at org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:56) at org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:45) at org.apache.sshd.common.future.AbstractSshFuture.notifyListener(AbstractSshFuture.java:159) at org.apache.sshd.common.future.DefaultSshFuture.addListener(DefaultSshFuture.java:167) at org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:57) at org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:45) at org.apache.sshd.common.util.closeable.SequentialCloseable.doClose(SequentialCloseable.java:69) at org.apache.sshd.common.util.closeable.SimpleCloseable.close(SimpleCloseable.java:63) at org.apache.sshd.common.util.closeable.AbstractInnerCloseable.doCloseImmediately(AbstractInnerCloseable.java:46) at org.apache.sshd.common.util.closeable.AbstractCloseable.close(AbstractCloseable.java:83) at org.apache.sshd.server.SshServer.stop(SshServer.java:339) at org.opendaylight.netconf.ssh.SshProxyServer.close(SshProxyServer.java:106) at org.opendaylight.netconf.ssh.NetconfNorthboundSshServer.close(NetconfNorthboundSshServer.java:82) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.aries.blueprint.utils.ReflectionUtils.invoke(ReflectionUtils.java:299) at org.apache.aries.blueprint.container.BeanRecipe.invoke(BeanRecipe.java:980) at org.apache.aries.blueprint.container.BeanRecipe.destroy(BeanRecipe.java:887) at org.apache.aries.blueprint.container.BlueprintRepository.destroy(BlueprintRepository.java:329) at org.apache.aries.blueprint.container.BlueprintContainerImpl.destroyComponents(BlueprintContainerImpl.java:745) at org.apache.aries.blueprint.container.BlueprintContainerImpl.tidyupComponents(BlueprintContainerImpl.java:956) at org.apache.aries.blueprint.container.BlueprintContainerImpl.destroy(BlueprintContainerImpl.java:890) at org.apache.aries.blueprint.container.BlueprintExtender$3.run(BlueprintExtender.java:325) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at org.apache.aries.blueprint.container.BlueprintExtender.destroyContainer(BlueprintExtender.java:346) at org.apache.aries.blueprint.container.BlueprintExtender.access$400(BlueprintExtender.java:68) at org.apache.aries.blueprint.container.BlueprintExtender$BlueprintContainerServiceImpl.destroyContainer(BlueprintExtender.java:633) at org.opendaylight.controller.blueprint.BlueprintBundleTracker.shutdownAllContainers(BlueprintBundleTracker.java:286) at org.opendaylight.controller.blueprint.BlueprintBundleTracker.bundleChanged(BlueprintBundleTracker.java:191) at org.eclipse.osgi.internal.framework.BundleContextImpl.dispatchEvent(BundleContextImpl.java:908) at org.eclipse.osgi.framework.eventmgr.EventManager.dispatchEvent(EventManager.java:230) at org.eclipse.osgi.framework.eventmgr.ListenerQueue.dispatchEventSynchronous(ListenerQueue.java:148) at org.eclipse.osgi.internal.framework.EquinoxEventPublisher.publishBundleEventPrivileged(EquinoxEventPublisher.java:213) at org.eclipse.osgi.internal.framework.EquinoxEventPublisher.publishBundleEvent(EquinoxEventPublisher.java:120) at org.eclipse.osgi.internal.framework.EquinoxEventPublisher.publishBundleEvent(EquinoxEventPublisher.java:112) at org.eclipse.osgi.internal.framework.EquinoxContainerAdaptor.publishModuleEvent(EquinoxContainerAdaptor.java:168) at org.eclipse.osgi.container.Module.publishEvent(Module.java:476) at org.eclipse.osgi.container.Module.doStop(Module.java:634) at org.eclipse.osgi.container.Module.stop(Module.java:498) at org.eclipse.osgi.container.SystemModule.stop(SystemModule.java:202) at org.eclipse.osgi.internal.framework.EquinoxBundle$SystemBundle$EquinoxSystemModule$1.run(EquinoxBundle.java:165) at java.lang.Thread.run(Thread.java:748) i.e. it would seem we are re-entering the threadpool injected via blueprint, which is not accessible. |
| Comment by Robert Varga [ 29/Oct/18 ] |
2018-10-29T16:27:39,469 | INFO | Framework stop | BlueprintExtender | 18 - org.apache.aries.blueprint.core - 1.8.3 | Destroying BlueprintContainer for bundle org.opendaylight.controller.netty-event-executor-config/0.9.1.SNAPSHOT 2018-10-29T16:27:39,472 | INFO | Framework stop | BlueprintExtender | 18 - org.apache.aries.blueprint.core - 1.8.3 | Destroying BlueprintContainer for bundle org.opendaylight.aaa.cert/0.8.1.SNAPSHOT 2018-10-29T16:27:39,483 | INFO | Framework stop | BlueprintExtender | 18 - org.apache.aries.blueprint.core - 1.8.3 | Destroying BlueprintContainer for bundle mdsal-netconf-tcp/1.5.1.SNAPSHOT 2018-10-29T16:27:39,498 | INFO | Framework stop | BlueprintExtender | 18 - org.apache.aries.blueprint.core - 1.8.3 | Destroying BlueprintContainer for bundle mdsal-netconf-ssh/1.5.1.SNAPSHOT 2018-10-29T16:27:39,509 | INFO | opendaylight-cluster-data-akka.actor.default-dispatcher-46 | LocalActorRef | 75 - com.typesafe.akka.slf4j - 2.5.11 | Message [org.opendaylight.controller.remote.rpc.registry.RpcRegistry$Messages$RemoveRoutes] without sender to Actor[akka://opendaylight-cluster-data/user/rpc/registry#720141878] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'. 2018-10-29T16:32:39,502 | INFO | Framework stop | ReferenceRecipe | 18 - org.apache.aries.blueprint.core - 1.8.3 | Timeout expired when waiting for mandatory OSGi service reference (&(type=global-event-executor)(objectClass=io.netty.util.concurrent.EventExecutor)) 2018-10-29T16:32:39,504 | WARN | Framework stop | DefaultCloseFuture | 54 - org.apache.sshd.core - 1.6.0 | notifyListener(DefaultCloseFuture[id=SshServer[2cbcef70]][value=true]) failed (ServiceUnavailableException) to invoke org.apache.sshd.common.util.closeable.SequentialCloseable$1@b918d23: Timeout expired when waiting for mandatory OSGi service reference: (&(type=global-event-executor)(objectClass=io.netty.util.concurrent.EventExecutor)) this eans that netty-event-executor-config is being destroyed before the SSH server, explaining why this is happening. |
| Comment by Robert Varga [ 31/Oct/18 ] |
|
This looks to be related to the sshd-netty use, I will investigate further. |
| Comment by Robert Varga [ 28/Nov/18 ] |
|
Ran out of cycles and will not be able to look at this this year. Jakub can you take a look? |
| Comment by Jakub Morvay [ 29/Nov/18 ] |
|
rovarga Ok, I will have a look at this. |
| Comment by Richard Kosegi [ 03/Dec/18 ] |
|
Can we raise priority on this, please? It is affecting integration tests in our environment that are restarting ODL and are expecting that shutdown will happen in some time. |
| Comment by Robert Varga [ 03/Jan/19 ] |
|
Perhaps not using netty, but NIO2 would solve this. Netty is not buying us anything here, really, as we shuffle NETCONF subchannel data synchronously, anyway. |
| Comment by Tomas Cere [ 07/Jan/19 ] |
| Comment by Tomas Cere [ 07/Jan/19 ] |
|
Fluorine: |