[NETCONF-577] sshd-2.0.0 seems to be stuck on shutdown Created: 29/Oct/18  Updated: 07/Jan/19  Resolved: 07/Jan/19

Status: Resolved
Project: netconf
Component/s: netconf
Affects Version/s: Neon, Fluorine SR1
Fix Version/s: Neon, Fluorine SR2

Type: Bug Priority: High
Reporter: Robert Varga Assignee: Tomas Cere
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by NETCONF-583 Karaf takes 6 minutes to shutdown whe... Resolved

 Description   

This is seen in SFT:

2018-10-29T16:33:39,505 | ERROR | Framework stop   | BeanRecipe                       | 18 - org.apache.aries.blueprint.core - 1.8.3 | The blueprint bean netconfMdsalServer in bundle mdsal-netconf-ssh/1.5.1.SNAPSHOT incorrectly threw an
exception from its destroy method.
java.net.SocketTimeoutException: Failed to receive closure confirmation within 60000 millis
        at org.apache.sshd.server.SshServer.stop(SshServer.java:341) [132:org.apache.sshd.core:2.0.0]
        at org.opendaylight.netconf.ssh.SshProxyServer.close(SshProxyServer.java:106) [102:mdsal-netconf-ssh:1.5.1.SNAPSHOT]
        at org.opendaylight.netconf.ssh.NetconfNorthboundSshServer.close(NetconfNorthboundSshServer.java:82) [102:mdsal-netconf-ssh:1.5.1.SNAPSHOT]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:?]
        at org.apache.aries.blueprint.utils.ReflectionUtils.invoke(ReflectionUtils.java:299) [18:org.apache.aries.blueprint.core:1.8.3]
        at org.apache.aries.blueprint.container.BeanRecipe.invoke(BeanRecipe.java:980) [18:org.apache.aries.blueprint.core:1.8.3]
        at org.apache.aries.blueprint.container.BeanRecipe.destroy(BeanRecipe.java:887) [18:org.apache.aries.blueprint.core:1.8.3]
        at org.apache.aries.blueprint.container.BlueprintRepository.destroy(BlueprintRepository.java:329) [18:org.apache.aries.blueprint.core:1.8.3]
        at org.apache.aries.blueprint.container.BlueprintContainerImpl.destroyComponents(BlueprintContainerImpl.java:745) [18:org.apache.aries.blueprint.core:1.8.3]
        at org.apache.aries.blueprint.container.BlueprintContainerImpl.tidyupComponents(BlueprintContainerImpl.java:956) [18:org.apache.aries.blueprint.core:1.8.3]
        at org.apache.aries.blueprint.container.BlueprintContainerImpl.destroy(BlueprintContainerImpl.java:890) [18:org.apache.aries.blueprint.core:1.8.3]
        at org.apache.aries.blueprint.container.BlueprintExtender$3.run(BlueprintExtender.java:325) [18:org.apache.aries.blueprint.core:1.8.3]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:?]
        at org.apache.aries.blueprint.container.BlueprintExtender.destroyContainer(BlueprintExtender.java:346) [18:org.apache.aries.blueprint.core:1.8.3]
        at org.apache.aries.blueprint.container.BlueprintExtender.access$400(BlueprintExtender.java:68) [18:org.apache.aries.blueprint.core:1.8.3]
        at org.apache.aries.blueprint.container.BlueprintExtender$BlueprintContainerServiceImpl.destroyContainer(BlueprintExtender.java:633) [18:org.apache.aries.blueprint.core:1.8.3]
        at org.opendaylight.controller.blueprint.BlueprintBundleTracker.shutdownAllContainers(BlueprintBundleTracker.java:286) [193:org.opendaylight.controller.blueprint:0.9.1.SNAPSHOT]
        at org.opendaylight.controller.blueprint.BlueprintBundleTracker.bundleChanged(BlueprintBundleTracker.java:191) [193:org.opendaylight.controller.blueprint:0.9.1.SNAPSHOT]
        at org.eclipse.osgi.internal.framework.BundleContextImpl.dispatchEvent(BundleContextImpl.java:908) [?:?]
        at org.eclipse.osgi.framework.eventmgr.EventManager.dispatchEvent(EventManager.java:230) [?:?]
        at org.eclipse.osgi.framework.eventmgr.ListenerQueue.dispatchEventSynchronous(ListenerQueue.java:148) [?:?]
        at org.eclipse.osgi.internal.framework.EquinoxEventPublisher.publishBundleEventPrivileged(EquinoxEventPublisher.java:213) [?:?]
        at org.eclipse.osgi.internal.framework.EquinoxEventPublisher.publishBundleEvent(EquinoxEventPublisher.java:120) [?:?]
        at org.eclipse.osgi.internal.framework.EquinoxEventPublisher.publishBundleEvent(EquinoxEventPublisher.java:112) [?:?]
        at org.eclipse.osgi.internal.framework.EquinoxContainerAdaptor.publishModuleEvent(EquinoxContainerAdaptor.java:168) [?:?]
        at org.eclipse.osgi.container.Module.publishEvent(Module.java:476) [?:?]
        at org.eclipse.osgi.container.Module.doStop(Module.java:634) [?:?]
        at org.eclipse.osgi.container.Module.stop(Module.java:498) [?:?]
        at org.eclipse.osgi.container.SystemModule.stop(SystemModule.java:202) [?:?]
        at org.eclipse.osgi.internal.framework.EquinoxBundle$SystemBundle$EquinoxSystemModule$1.run(EquinoxBundle.java:165) [?:?]
        at java.lang.Thread.run(Thread.java:748) [?:?]


 Comments   
Comment by Robert Varga [ 29/Oct/18 ]

This can be seen during container shutdown, too, and seems to be cause by the following:

"Framework stop" #205 daemon prio=5 os_prio=0 tid=0x00007ff94c04f000 nid=0x4a69 in Object.wait() [0x00007ff9d9d85000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at org.apache.aries.blueprint.container.ReferenceRecipe.getService(ReferenceRecipe.java:211)
    - locked <0x00000000d8b44be0> (a java.lang.Object)
    at org.apache.aries.blueprint.container.ReferenceRecipe.access$000(ReferenceRecipe.java:56)
    at org.apache.aries.blueprint.container.ReferenceRecipe$ServiceDispatcher.call(ReferenceRecipe.java:306)
    at Proxy376b1c8f_149c_43ab_a8ef_84bcff61daa3.execute(Unknown Source)
    at sun.nio.ch.AsynchronousChannelGroupImpl.executeOnPooledThread(AsynchronousChannelGroupImpl.java:188)
    at sun.nio.ch.Invoker.invokeIndirectly(Invoker.java:212)
    at sun.nio.ch.UnixAsynchronousServerSocketChannelImpl.implClose(UnixAsynchronousServerSocketChannelImpl.java:118)
    at sun.nio.ch.AsynchronousServerSocketChannelImpl.close(AsynchronousServerSocketChannelImpl.java:111)
    at org.apache.sshd.common.io.nio2.Nio2Acceptor.unbind(Nio2Acceptor.java:113)
    at org.apache.sshd.common.io.nio2.Nio2Acceptor.unbind(Nio2Acceptor.java:99)
    at org.apache.sshd.common.io.nio2.Nio2Acceptor.close(Nio2Acceptor.java:141)
    at org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:56)
    at org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:45)
    at org.apache.sshd.common.future.AbstractSshFuture.notifyListener(AbstractSshFuture.java:159)
    at org.apache.sshd.common.future.DefaultSshFuture.addListener(DefaultSshFuture.java:167)
    at org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:57)
    at org.apache.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:45)
    at org.apache.sshd.common.util.closeable.SequentialCloseable.doClose(SequentialCloseable.java:69)
    at org.apache.sshd.common.util.closeable.SimpleCloseable.close(SimpleCloseable.java:63)
    at org.apache.sshd.common.util.closeable.AbstractInnerCloseable.doCloseImmediately(AbstractInnerCloseable.java:46)
    at org.apache.sshd.common.util.closeable.AbstractCloseable.close(AbstractCloseable.java:83)
    at org.apache.sshd.server.SshServer.stop(SshServer.java:339)
    at org.opendaylight.netconf.ssh.SshProxyServer.close(SshProxyServer.java:106)
    at org.opendaylight.netconf.ssh.NetconfNorthboundSshServer.close(NetconfNorthboundSshServer.java:82)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.aries.blueprint.utils.ReflectionUtils.invoke(ReflectionUtils.java:299)
    at org.apache.aries.blueprint.container.BeanRecipe.invoke(BeanRecipe.java:980)
    at org.apache.aries.blueprint.container.BeanRecipe.destroy(BeanRecipe.java:887)
    at org.apache.aries.blueprint.container.BlueprintRepository.destroy(BlueprintRepository.java:329)
    at org.apache.aries.blueprint.container.BlueprintContainerImpl.destroyComponents(BlueprintContainerImpl.java:745)
    at org.apache.aries.blueprint.container.BlueprintContainerImpl.tidyupComponents(BlueprintContainerImpl.java:956)
    at org.apache.aries.blueprint.container.BlueprintContainerImpl.destroy(BlueprintContainerImpl.java:890)
    at org.apache.aries.blueprint.container.BlueprintExtender$3.run(BlueprintExtender.java:325)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at org.apache.aries.blueprint.container.BlueprintExtender.destroyContainer(BlueprintExtender.java:346)
    at org.apache.aries.blueprint.container.BlueprintExtender.access$400(BlueprintExtender.java:68)
    at org.apache.aries.blueprint.container.BlueprintExtender$BlueprintContainerServiceImpl.destroyContainer(BlueprintExtender.java:633)
    at org.opendaylight.controller.blueprint.BlueprintBundleTracker.shutdownAllContainers(BlueprintBundleTracker.java:286)
    at org.opendaylight.controller.blueprint.BlueprintBundleTracker.bundleChanged(BlueprintBundleTracker.java:191)
    at org.eclipse.osgi.internal.framework.BundleContextImpl.dispatchEvent(BundleContextImpl.java:908)
    at org.eclipse.osgi.framework.eventmgr.EventManager.dispatchEvent(EventManager.java:230)
    at org.eclipse.osgi.framework.eventmgr.ListenerQueue.dispatchEventSynchronous(ListenerQueue.java:148)
    at org.eclipse.osgi.internal.framework.EquinoxEventPublisher.publishBundleEventPrivileged(EquinoxEventPublisher.java:213)
    at org.eclipse.osgi.internal.framework.EquinoxEventPublisher.publishBundleEvent(EquinoxEventPublisher.java:120)
    at org.eclipse.osgi.internal.framework.EquinoxEventPublisher.publishBundleEvent(EquinoxEventPublisher.java:112)
    at org.eclipse.osgi.internal.framework.EquinoxContainerAdaptor.publishModuleEvent(EquinoxContainerAdaptor.java:168)
    at org.eclipse.osgi.container.Module.publishEvent(Module.java:476)
    at org.eclipse.osgi.container.Module.doStop(Module.java:634)
    at org.eclipse.osgi.container.Module.stop(Module.java:498)
    at org.eclipse.osgi.container.SystemModule.stop(SystemModule.java:202)
    at org.eclipse.osgi.internal.framework.EquinoxBundle$SystemBundle$EquinoxSystemModule$1.run(EquinoxBundle.java:165)
    at java.lang.Thread.run(Thread.java:748)

i.e. it would seem we are re-entering the threadpool injected via blueprint, which is not accessible.

Comment by Robert Varga [ 29/Oct/18 ]
2018-10-29T16:27:39,469 | INFO  | Framework stop   | BlueprintExtender                | 18 - org.apache.aries.blueprint.core - 1.8.3 | Destroying BlueprintContainer for bundle org.opendaylight.controller.netty-event-executor-config/0.9.1.SNAPSHOT
2018-10-29T16:27:39,472 | INFO  | Framework stop   | BlueprintExtender                | 18 - org.apache.aries.blueprint.core - 1.8.3 | Destroying BlueprintContainer for bundle org.opendaylight.aaa.cert/0.8.1.SNAPSHOT
2018-10-29T16:27:39,483 | INFO  | Framework stop   | BlueprintExtender                | 18 - org.apache.aries.blueprint.core - 1.8.3 | Destroying BlueprintContainer for bundle mdsal-netconf-tcp/1.5.1.SNAPSHOT
2018-10-29T16:27:39,498 | INFO  | Framework stop   | BlueprintExtender                | 18 - org.apache.aries.blueprint.core - 1.8.3 | Destroying BlueprintContainer for bundle mdsal-netconf-ssh/1.5.1.SNAPSHOT
2018-10-29T16:27:39,509 | INFO  | opendaylight-cluster-data-akka.actor.default-dispatcher-46 | LocalActorRef                    | 75 - com.typesafe.akka.slf4j - 2.5.11 | Message [org.opendaylight.controller.remote.rpc.registry.RpcRegistry$Messages$RemoveRoutes] without sender to Actor[akka://opendaylight-cluster-data/user/rpc/registry#720141878] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
2018-10-29T16:32:39,502 | INFO  | Framework stop   | ReferenceRecipe                  | 18 - org.apache.aries.blueprint.core - 1.8.3 | Timeout expired when waiting for mandatory OSGi service reference (&(type=global-event-executor)(objectClass=io.netty.util.concurrent.EventExecutor))
2018-10-29T16:32:39,504 | WARN  | Framework stop   | DefaultCloseFuture               | 54 - org.apache.sshd.core - 1.6.0 | notifyListener(DefaultCloseFuture[id=SshServer[2cbcef70]][value=true]) failed (ServiceUnavailableException) to invoke org.apache.sshd.common.util.closeable.SequentialCloseable$1@b918d23: Timeout expired when waiting for mandatory OSGi service reference: (&(type=global-event-executor)(objectClass=io.netty.util.concurrent.EventExecutor))

this eans that netty-event-executor-config is being destroyed before the SSH server, explaining why this is happening.

Comment by Robert Varga [ 31/Oct/18 ]

This looks to be related to the sshd-netty use, I will investigate further.

Comment by Robert Varga [ 28/Nov/18 ]

Ran out of cycles and will not be able to look at this this year. Jakub can you take a look?

Comment by Jakub Morvay [ 29/Nov/18 ]

rovarga Ok, I will have a look at this.

Comment by Richard Kosegi [ 03/Dec/18 ]

Can we raise priority on this, please?

It is affecting integration tests in our environment that are restarting ODL and are expecting that shutdown will happen in some time.

Comment by Robert Varga [ 03/Jan/19 ]

Perhaps not using netty, but NIO2 would solve this. Netty is not buying us anything here, really, as we shuffle NETCONF subchannel data synchronously, anyway.

Comment by Tomas Cere [ 07/Jan/19 ]

https://git.opendaylight.org/gerrit/#/c/79268/

Comment by Tomas Cere [ 07/Jan/19 ]

Fluorine:

https://git.opendaylight.org/gerrit/#/c/79270/

Generated at Wed Feb 07 20:15:22 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.