[NETCONF-832] SSH global request returns failure/ unsupported state Created: 12/Nov/21  Updated: 04/May/22  Resolved: 22/Dec/21

Status: Resolved
Project: netconf
Component/s: netconf
Affects Version/s: None
Fix Version/s: 1.13.6, 2.0.12

Type: Bug Priority: Medium
Reporter: Lena Peuker Assignee: Ivan Hrasko
Resolution: Done Votes: 1
Labels: pt
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Relates
relates to NETCONF-681 Netconf Callhome SSH: drop connections Resolved
Priority: Normal

 Description   

It seems that SSH-global-requests / SSH-keepalive mechanisms are not correctly handled and return an error.

RFC keepalive@example.com8071] (s. S7) specifies specific keepalive-mechanisms in order to establish a persisent connection. For a connection, that is established via SSH,  global requests ( SSH_MSG_GLOBAL_REQUEST ) with request-name and want-reply are exchanged. RFC 4254]specifies these SSH global requests. If the client supports these mechanisms it should return SSH_MSG_REQUEST_SUCCESS or in failure case / unsupported state SSH_MSG_REQUEST_FAILURE.

In general it looks that ODL sends SSH_MSG_REQUEST_FAILURE and handels the global request SSH_MSG_GLOBAL_REQUEST as unsupported.

Scenario:

  • Netconf-connection is established via callhome. Device is correctly mounted and hello-messages are exchanged.
  • Device sends global requests SSH_MSG_GLOBAL_REQUEST to the client / ODL.
  • ODL logs these requests as unsupported.
  • Device receives SSH_MSG_REQUEST_FAILURE-response and drops connection because of missing SSH_MSG_REQUEST_SUCCESS-response.
08:43:09.147 TRACE [nioEventLoopGroup-4-2] decode(ClientSessionImpl[admin@/127.0.0.1:35381]) packet #15 [chunk #1](27/27) 50 00 00 00 15 6b 65 65 70 61 6c 69 76 65 40 65 78 61 6d 70 6c 65 2e 63 6f 6d 01                                                                                                                   P....keepalive@example.com.
08:43:09.151 TRACE [nioEventLoopGroup-4-2] doHandleMessage(ClientSessionImpl[admin@/127.0.0.1:35381]) process SSH_MSG_GLOBAL_REQUEST
08:43:09.152 DEBUG [nioEventLoopGroup-4-2] globalRequest(ClientConnectionService[ClientSessionImpl[admin@/127.0.0.1:35381]]) received SSH_MSG_GLOBAL_REQUEST keepalive@example.com want-reply=true
08:43:09.155 DEBUG [nioEventLoopGroup-4-2] process(ClientConnectionService[ClientSessionImpl[admin@/127.0.0.1:35381]]) request=keepalive@example.com, want-reply=true
08:43:09.157 TRACE [nioEventLoopGroup-4-2] globalRequest(ClientConnectionService[ClientSessionImpl[admin@/127.0.0.1:35381]]) OpenSshHostKeysHandler#process(keepalive@example.com)[want-reply=true] : Unsupported
08:43:09.159 WARN [nioEventLoopGroup-4-2] handleUnknownRequest(ClientConnectionService[ClientSessionImpl[admin@/127.0.0.1:35381]]) unknown global request: keepalive@example.com
08:43:09.162 DEBUG [nioEventLoopGroup-4-2] sendGlobalResponse(ClientConnectionService[ClientSessionImpl[admin@/127.0.0.1:35381]])[keepalive@example.com] result=Unsupported, want-reply=true
08:43:09.163 DEBUG [nioEventLoopGroup-4-2] encode(ClientSessionImpl[admin@/127.0.0.1:35381]) packet #12 sending command=82[SSH_MSG_REQUEST_FAILURE] len=1
08:43:09.164 TRACE [nioEventLoopGroup-4-2] encode(ClientSessionImpl[admin@/127.0.0.1:35381]) packet #12 [chunk #1](1/1) 52                                                                                                                                                                                                 R
08:43:09.166 TRACE [nioEventLoopGroup-4-2] encode(ClientSessionImpl[admin@/127.0.0.1:35381]) packet #12 command=82[SSH_MSG_REQUEST_FAILURE] len=28, pad=26, mac=BaseMac[HmacSHA256] -  block=32/32 bytes, encrypt-then-mac=false
08:43:09.169 TRACE [nioEventLoopGroup-4-2] decode(ClientSessionImpl[admin@/127.0.0.1:35381]) packet #16 [chunk #1](27/27) 50 00 00 00 15 6b 65 65 70 61 6c 69 76 65 40 65 78 61 6d 70 6c 65 2e 63 6f 6d 01                                                                                                                   P....keepalive@example.com.
08:43:09.171 TRACE [nioEventLoopGroup-4-2] doHandleMessage(ClientSessionImpl[admin@/127.0.0.1:35381]) process SSH_MSG_GLOBAL_REQUEST
08:43:09.178 DEBUG [nioEventLoopGroup-4-2] globalRequest(ClientConnectionService[ClientSessionImpl[admin@/127.0.0.1:35381]]) received SSH_MSG_GLOBAL_REQUEST keepalive@example.com want-reply=true
08:43:09.179 DEBUG [nioEventLoopGroup-4-2] process(ClientConnectionService[ClientSessionImpl[admin@/127.0.0.1:35381]]) request=keepalive@example.com, want-reply=true
08:43:09.180 TRACE [nioEventLoopGroup-4-2] globalRequest(ClientConnectionService[ClientSessionImpl[admin@/127.0.0.1:35381]]) OpenSshHostKeysHandler#process(keepalive@example.com)[want-reply=true] : Unsupported
08:43:09.181 WARN [nioEventLoopGroup-4-2] handleUnknownRequest(ClientConnectionService[ClientSessionImpl[admin@/127.0.0.1:35381]]) unknown global request: keepalive@example.com
08:43:09.182 DEBUG [nioEventLoopGroup-4-2] sendGlobalResponse(ClientConnectionService[ClientSessionImpl[admin@/127.0.0.1:35381]])[keepalive@example.com] result=Unsupported, want-reply=true
08:43:09.184 DEBUG [nioEventLoopGroup-4-2] encode(ClientSessionImpl[admin@/127.0.0.1:35381]) packet #13 sending command=82[SSH_MSG_REQUEST_FAILURE] len=1
08:43:09.184 TRACE [nioEventLoopGroup-4-2] encode(ClientSessionImpl[admin@/127.0.0.1:35381]) packet #13 [chunk #1](1/1) 52                                                                                                                                                                                                 R
08:43:09.185 TRACE [nioEventLoopGroup-4-2] encode(ClientSessionImpl[admin@/127.0.0.1:35381]) packet #13 command=82[SSH_MSG_REQUEST_FAILURE] len=28, pad=26, mac=BaseMac[HmacSHA256] -  block=32/32 bytes, encrypt-then-mac=false
08:43:09.187 DEBUG [nioEventLoopGroup-4-2] close(ClientSessionImpl[admin@/127.0.0.1:35381]) Closing immediately
08:43:09.188 DEBUG [nioEventLoopGroup-4-2] signalAuthFailure(ClientSessionImpl[admin@/127.0.0.1:35381]) type=SshException, signalled=false: Session is being closed
08:43:09.190 DEBUG [nioEventLoopGroup-4-2] SSH Session ClientSessionImpl[admin@/127.0.0.1:35381] closed
08:43:09.193 DEBUG [nioEventLoopGroup-4-2] sessionClosed(ClientSessionImpl[admin@/127.0.0.1:35381]) un-tracked
08:43:09.194 TRACE [nioEventLoopGroup-4-2] doClose(org.opendaylight.netconf.shaded.sshd.common.util.closeable.SequentialCloseable$1@5612e771) closing ParallelCloseable[DefaultCloseFuture[id=ClientSessionImpl[admin@/127.0.0.1:35381]][value=null]] immediately=true
08:43:09.195 TRACE [nioEventLoopGroup-4-2] doClose(true) pending closeables: 2
08:43:09.196 DEBUG [nioEventLoopGroup-4-2] close(ClientConnectionService[ClientSessionImpl[admin@/127.0.0.1:35381]]) Closing immediately
08:43:09.196 DEBUG [nioEventLoopGroup-4-2] stopHeartBeat(ClientSessionImpl[admin@/127.0.0.1:35381]) no heartbeat to stop
08:43:09.197 TRACE [nioEventLoopGroup-4-2] doClose(true) pending closeables: 2
08:43:09.198 DEBUG [nioEventLoopGroup-4-2] close(ChannelSubsystem[id=0, recipient=0]-ClientSessionImpl[admin@/127.0.0.1:35381][netconf]) Closing immediately
08:43:09.198 DEBUG [nioEventLoopGroup-4-2] close(ChannelSubsystem[id=0, recipient=0]-ClientSessionImpl[admin@/127.0.0.1:35381][netconf]) prevent sending EOF
08:43:09.199 DEBUG [nioEventLoopGroup-4-2] Closing Window[client/local](ChannelSubsystem[id=0, recipient=0]-ClientSessionImpl[admin@/127.0.0.1:35381][netconf])
08:43:09.199 DEBUG [nioEventLoopGroup-4-2] Closing Window[client/remote](ChannelSubsystem[id=0, recipient=0]-ClientSessionImpl[admin@/127.0.0.1:35381][netconf])
08:43:09.201 TRACE [nioEventLoopGroup-4-2] doClose(org.opendaylight.netconf.shaded.sshd.common.util.closeable.SequentialCloseable$1@59307114) closing SequentialCloseable[DefaultCloseFuture[id=Builder][value=null]] immediately=true
08:43:09.201 TRACE [nioEventLoopGroup-4-2] doClose(org.opendaylight.netconf.shaded.sshd.common.util.closeable.SequentialCloseable$1@4a2d8fab) closing FuturesCloseable[DefaultCloseFuture[id=ChannelSubsystem[id=0, recipient=-1]-ClientSessionImpl[admin@/127.0.0.1:35381][netconf]][value=null]] immediately=true
08:43:09.202 TRACE [nioEventLoopGroup-4-2] doClose(org.opendaylight.netconf.shaded.sshd.common.util.closeable.SequentialCloseable$1@4a2d8fab) closing [DefaultCloseFuture[id=ChannelSubsystem[id=0, recipient=0]-ClientSessionImpl[admin@/127.0.0.1:35381][netconf]][value=null]] immediately=true
08:43:09.203 TRACE [nioEventLoopGroup-4-2] doClose(org.opendaylight.netconf.shaded.sshd.common.util.closeable.SequentialCloseable$1@4a2d8fab) closing ParallelCloseable[DefaultCloseFuture[id=Builder][value=null]] immediately=true
08:43:09.205 TRACE [nioEventLoopGroup-4-2] doClose(true) pending closeables: 2
08:43:09.205 DEBUG [nioEventLoopGroup-4-2] close([ChannelSubsystem[id=0, recipient=0]-ClientSessionImpl[admin@/127.0.0.1:35381][netconf]] cmd=SSH_MSG_CHANNEL_DATA) Closing immediately
08:43:09.206 DEBUG [nioEventLoopGroup-4-2] close([ChannelSubsystem[id=0, recipient=0]-ClientSessionImpl[admin@/127.0.0.1:35381][netconf]] cmd=SSH_MSG_CHANNEL_DATA)[Immediately] closed
08:43:09.207 TRACE [nioEventLoopGroup-4-2] doClose(true) completed pending: 1
08:43:09.207 TRACE [nioEventLoopGroup-4-2] doClose(true) pending closeables: 2
08:43:09.208 DEBUG [nioEventLoopGroup-4-2] close(ChannelAsyncInputStream[ChannelSubsystem[id=0, recipient=0]-ClientSessionImpl[admin@/127.0.0.1:35381][netconf]]) Closing immediately
08:43:09.209 DEBUG [nioEventLoopGroup-4-2] Ssh session dropped on channel: netconf
org.opendaylight.netconf.shaded.sshd.common.SshException: Closed
   at org.opendaylight.netconf.shaded.sshd.common.channel.ChannelAsyncInputStream.preClose(ChannelAsyncInputStream.java:97) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.AbstractCloseable.close(AbstractCloseable.java:94) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.ParallelCloseable.doClose(ParallelCloseable.java:65) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.SimpleCloseable.close(SimpleCloseable.java:63) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:56) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:45) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.future.AbstractSshFuture.notifyListener(AbstractSshFuture.java:173) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.future.DefaultSshFuture.addListener(DefaultSshFuture.java:166) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:57) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:45) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.future.AbstractSshFuture.notifyListener(AbstractSshFuture.java:173) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.future.DefaultSshFuture.addListener(DefaultSshFuture.java:166) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:57) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:45) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.SequentialCloseable.doClose(SequentialCloseable.java:69) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.SimpleCloseable.close(SimpleCloseable.java:63) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:56) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:45) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.SequentialCloseable.doClose(SequentialCloseable.java:69) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.SimpleCloseable.close(SimpleCloseable.java:63) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.AbstractInnerCloseable.doCloseImmediately(AbstractInnerCloseable.java:48) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.AbstractCloseable.close(AbstractCloseable.java:95) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.ParallelCloseable.doClose(ParallelCloseable.java:65) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.SimpleCloseable.close(SimpleCloseable.java:63) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.AbstractInnerCloseable.doCloseImmediately(AbstractInnerCloseable.java:48) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.AbstractCloseable.close(AbstractCloseable.java:95) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.ParallelCloseable.doClose(ParallelCloseable.java:65) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.SimpleCloseable.close(SimpleCloseable.java:63) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:56) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.SequentialCloseable$1.operationComplete(SequentialCloseable.java:45) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.SequentialCloseable.doClose(SequentialCloseable.java:69) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.SimpleCloseable.close(SimpleCloseable.java:63) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.AbstractInnerCloseable.doCloseImmediately(AbstractInnerCloseable.java:48) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.util.closeable.AbstractCloseable.close(AbstractCloseable.java:95) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.common.session.helpers.AbstractSessionIoHandler.sessionClosed(AbstractSessionIoHandler.java:46) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.netty.NettyIoSession.channelInactive(NettyIoSession.java:244) [bundleFile:?]
   at org.opendaylight.netconf.shaded.sshd.netty.NettyIoSession$Adapter.channelInactive(NettyIoSession.java:290) [bundleFile:?]
   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) [bundleFile:4.1.56.Final]
   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) [bundleFile:4.1.56.Final]
   at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241) [bundleFile:4.1.56.Final]
   at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1405) [bundleFile:4.1.56.Final]
   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) [bundleFile:4.1.56.Final]
   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) [bundleFile:4.1.56.Final]
   at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:901) [bundleFile:4.1.56.Final]
   at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:819) [bundleFile:4.1.56.Final]
   at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) [bundleFile:4.1.56.Final]
   at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) [bundleFile:4.1.56.Final]
   at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:497) [bundleFile:4.1.56.Final]
   at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [bundleFile:4.1.56.Final]
   at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [bundleFile:4.1.56.Final]
   at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [bundleFile:4.1.56.Final]
   at java.lang.Thread.run(Unknown Source) [?:?]


 Comments   
Comment by Ivan Hrasko [ 26/Nov/21 ]

lenapeuk What is the NETCONF version you are using? Are you able to reproduce this issue with netopeer/netopeer2? Can you reproduce the issue when device is connected "manually" by adding its configuration into netconf topology or the problem appears only when using call-home feature?

Comment by Lena Peuker [ 29/Nov/21 ]

ivanhrasko: Thanks for you response. Actually the NetworkFunction we are working with is not using netopeer instead confd is used. The issue / error that the global requests are unsupported also occures if we mount the device manually, but basically the connection is not dropped but this is related to the implementation on the device side.

Comment by Ivan Hrasko [ 29/Nov/21 ]

I am afraid it can be a problem on the device. I am running the following command with running ODL Aluminium SR3 (NETCONF version 1.9.3) controller:

ssh -vvv -p 2830 -o ServerAliveInterval=2 -o ServerAliveCountMax=3 admin@localhost -s netconf

And I am getting the response as follows: 

debug3: send packet: type 80
debug3: receive packet: type 81
debug3: send packet: type 80
debug3: receive packet: type 81
debug3: send packet: type 80
debug3: receive packet: type 81
debug3: send packet: type 80
debug3: receive packet: type 81
debug3: send packet: type 80
debug3: receive packet: type 81

If my assumptions are correct then 80 stands for SSH_MSG_GLOBAL_REQUEST and 81 stands for SSH_MSG_REQUEST_SUCCESS. Thus I think that ODL controller works as expected.

 

Comment by Lena Peuker [ 29/Nov/21 ]

ivanhrasko Thanks for the info. I will also check this on our side and analyze in more details

 

Comment by Ivan Hrasko [ 29/Nov/21 ]

Thanks, it would really help if you are able to check the versions you're using.

Comment by Robert Varga [ 29/Nov/21 ]

https://datatracker.ietf.org/doc/html/rfc4254#section-9 :

            SSH_MSG_GLOBAL_REQUEST                  80
            SSH_MSG_REQUEST_SUCCESS                 81
            SSH_MSG_REQUEST_FAILURE                 82
            SSH_MSG_CHANNEL_OPEN                    90
            SSH_MSG_CHANNEL_OPEN_CONFIRMATION       91
            SSH_MSG_CHANNEL_OPEN_FAILURE            92
            SSH_MSG_CHANNEL_WINDOW_ADJUST           93
            SSH_MSG_CHANNEL_DATA                    94
            SSH_MSG_CHANNEL_EXTENDED_DATA           95
            SSH_MSG_CHANNEL_EOF                     96
            SSH_MSG_CHANNEL_CLOSE                   97
            SSH_MSG_CHANNEL_REQUEST                 98
            SSH_MSG_CHANNEL_SUCCESS                 99
            SSH_MSG_CHANNEL_FAILURE                100 

 

Comment by Lena Peuker [ 30/Nov/21 ]

ivanhrasko It is version 1.0

Comment by Ivan Hrasko [ 30/Nov/21 ]

So NETCONF version is 1.0.0-Beryllium? Are you still using ODL Beryllium?

Comment by Robert Varga [ 30/Nov/21 ]

Is that the version of the VNF or the version of OpenDaylight stack (or ONAP SDNC, or what is it?)

Comment by Lena Peuker [ 30/Nov/21 ]

This is the version of the NF-device. We actually use ONAP SDNC istanbul release with ODL silicon-version.

Comment by Ivan Hrasko [ 30/Nov/21 ]

I have checked and the command:

ssh -vvv -p 2830 -o ServerAliveInterval=2 -o ServerAliveCountMax=3 admin@localhost -s netconf

works as expected also with ODL Silicon SR2.

Comment by Ivan Hrasko [ 02/Dec/21 ]

lenapeuk Sure, some configuration has to be done on both the controller and the device to setup the call-home correctly. For example the controller needs to have device's SSH public key and device needs to have call-home feature enabled as well. Do you remember the steps how you configured the both? Which requests have you send to ODL?

 

POST /restconf/config/odl-netconf-callhome-server:netconf-callhome-server/allowed-devices
{
 "device": {
 "unique-id": "ssh-device-to-discover",
 "ssh-client-params": {
 "host-key": "device's public key here"
 }
 }
}

 

Comment by Lena Peuker [ 03/Dec/21 ]

Hi ivanhrasko, thanks for the info. We have exactly used these Endpoint to add the the device to the allowed device by passing the required parameters (SSH-public-key, id). Basically this works fine as expected, so we can see that hello-capabilities are exchanged, yand-models are loaded and that the device is mounted under the specified unique-id. But after a few minutes / seconds we run into this exception, that the ssh-session is closed.

Actually we try to validate if it is possible to reproduce it via ssh / other netconf-client.

Comment by Ivan Hrasko [ 03/Dec/21 ]

lenapeuk I have setup SSH Call-Home session with Netopeer2 device and I have observed the logs like this:

2021-12-03T14:55:21,530 | DEBUG | nioEventLoopGroup-4-2 | AsyncSshHandlerReader | 281 - org.opendaylight.netconf.netty-util - 1.13.4 | Ssh session dropped on channel: netconf
org.opendaylight.netconf.shaded.sshd.common.SshException: Closed
 at org.opendaylight.netconf.shaded.sshd.common.channel.ChannelAsyncInputStream.preClose(ChannelAsyncInputStream.java:97) ~[bundleFile:?]
 at org.opendaylight.netconf.shaded.sshd.common.util.closeable.AbstractCloseable.close(AbstractCloseable.java:94) ~[bundleFile:?]
 at org.opendaylight.netconf.shaded.sshd.common.util.closeable.ParallelCloseable.doClose(ParallelCloseable.java:65) ~[bundleFile:?]
 at org.opendaylight.netconf.shaded.sshd.common.util.closeable.SimpleCloseable.close(SimpleCloseable.java:63) ~[bundleFile:?]

To appear every 10 minutes. Every 10 minutes the connection is terminated and established new one on different port.

 

I expect you hit this idle timeout/disconnection and the fact that you see SSH global requests unsupported/failed is:

a) not related to the disconnection itself

b) it is a consequence of terminated connection.

 

Is this explanation applicable to your problem? Do you see anything else/different happening?

Comment by Ivan Hrasko [ 03/Dec/21 ]

Other netconf clients work?

Comment by Ivan Hrasko [ 03/Dec/21 ]

Its also possible that timeout appears as a consequence of not correctly implemented heart-beat as you said

Comment by Ivan Hrasko [ 03/Dec/21 ]

Anyway, I think that also unsupported response should be enough as heart-beat and it can be a problem of the device that it does not accept it.

Comment by Vivekanandan Muthukrishnan [ 06/Dec/21 ]

Hi lenapeuk . I assume the CALL HOME is coming from devices behind NAT. Can you please check with your router admin to validate the default NAT entries timeout. It could be related to NAT entries being timed out which consequently terminates the TCP sessions for every 10 minutes also. I guess, we can consider increasing the NAT timeout and check if that helps here.

Most router settings will change NAT default timer to few seconds to few minutes. This will end up closing existing TCP connections behind NAT. It seems to be the case with your device for every 10 minutes also.

Comment by Ivan Hrasko [ 09/Dec/21 ]

lenapeuk I have tried call home with ConfD Basic (for free) and I have observed that it sends no SSH_MSG_GLOBAL_REQUEST. Have you configured it to send it somehow?

In general, every device is different, nor Netopeer2 nor ConfD in default configuration sends SSH_MSG_GLOBAL_REQUEST.
I believe this is the reason why NETCONF solves the problem of keeping the connection to device alive by its own mechanism.

This mechanism was turned off for call home mounted devices - I have enabled it by the gerrit change: https://git.opendaylight.org/gerrit/c/netconf/+/98872
With gerrit change NETCONF sends keep-alive message to the call home device every two minutes.

IMO the SSH_MSG_REQUEST_FAILURE response is good enough according to Section 4 of RFC 4254. Its just not needed/supported - that's not reason for the device to drop connection.

When you apply the proposed fix you can turn off sending SSH_MSG_GLOBAL_REQUEST and connection stays alive.

If you anyway would to use SSH_MSG_GLOBAL_REQUEST instead of NETCONF own keep-alive mechanism - let me know - we need to discuss it - because it is a new feature then.

Comment by Ivan Hrasko [ 09/Dec/21 ]

Yes, that is one of the reasons why we need some keep-alive - heartbeats mechanism to keep connection alive. The current 10 minutes in ODL are configured here: https://github.com/apache/mina-sshd/blob/2772c7c8f6afb8c53546ca803501f52118bd0491/sshd-core/src/main/java/org/apache/sshd/core/CoreModuleProperties.java#L319

Comment by Lena Peuker [ 09/Dec/21 ]

ivanhrasko Thanks you very much for the update. From our understanding it is configured on Device-side. Actually we try to check the device in more details.  With regards to Section 4 of RFC 4254, we have a simular understanding, so maybe there might be some issues on device side.

Comment by Lena Peuker [ 04/Jan/22 ]

Hi ivanhrasko, after some detailed testing and some further investigations, we found out the issue seems not be related to ODL or confd and we had some deeper problems with network setup of the netconf-device. It took some time to identify this because e.g. with python netconf-client it works probably and we could only see this behaviour and the drop of the connection inside ODL. So with the fix on network side after the callhome was triggered, the connection remains stable (we still see the warnings related to the global-requests, but we see it just as a warning but nothing related to our inital problem). Thank you very much for you support and your investigations

Generated at Wed Feb 07 23:08:52 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.