Uploaded image for project: 'netconf'
  1. netconf
  2. NETCONF-1067

Unrecoverable NETCONF session negotiation failure

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Done
    • Icon: High High
    • 4.0.9, 5.0.9
    • 5.0.6, 4.0.8
    • netconf
    • None

      An exception may get raised from the channel pipeline during NETCONF session negotiation and once this happens the channel negotiation cannot complete successfully even without the cause that originally triggered the exception.

      This is a such case where AbstractNetconfSessionNegotiator.exceptionCaught() is called during a session negotiation:

      2023-06-27T00:32:36,057 | INFO  | globalWorkerGroup-3-3 | AbstractNetconfSessionNegotiator | 265 - org.opendaylight.netconf.netty-util - 6.0.0.SNAPSHOT | Unexpected error during negotiation on channel [id: 0x2d4c45fc]
      io.netty.handler.codec.DecoderException: org.xml.sax.SAXParseException; lineNumber: 28; columnNumber: 3; The element type "nc:rpc-reply" must be terminated by the matching end-tag "</nc:rpc-reply>".
      	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:499) ~[bundleFile:4.1.94.Final]
      	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290) ~[bundleFile:4.1.94.Final]
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[bundleFile:4.1.94.Final]
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[bundleFile:4.1.94.Final]
      	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[bundleFile:4.1.94.Final]
      	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346) ~[bundleFile:4.1.94.Final]
      	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:333) ~[bundleFile:4.1.94.Final]
      	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:454) ~[bundleFile:4.1.94.Final]
      	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290) ~[bundleFile:4.1.94.Final]
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[bundleFile:4.1.94.Final]
      	at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:61) ~[bundleFile:4.1.94.Final]
      	at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:425) ~[bundleFile:4.1.94.Final]
      	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) ~[bundleFile:4.1.94.Final]
      	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) ~[bundleFile:4.1.94.Final]
      	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) ~[bundleFile:4.1.94.Final]
      	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569) ~[bundleFile:4.1.94.Final]
      	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[bundleFile:4.1.94.Final]
      	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[bundleFile:4.1.94.Final]
      	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[bundleFile:4.1.94.Final]
      	at java.lang.Thread.run(Thread.java:833) ~[?:?]
      

      This was triggered by a corrupted hello message from the device:

      <!-- No zombies were killed during the creation of this user interface -->
      <!-- user MSEFULL, class j-MSEFULL -->
      <nc:hello xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0">
         <nc:capabilities>
          <nc:capability>urn:ietf:params:netconf:base:1.0</nc:capability>
          <nc:capability>urn:ietf:params:netconf:capability:candidate:1.0</nc:capability>
          <nc:capability>urn:ietf:params:netconf:capability:confirmed-commit:1.0</nc:capability>
          <nc:capability>urn:ietf:params:netconf:capability:validate:1.0</nc:capability>
          <nc:capability>urn:ietf:params:netconf:capability:url:1.0?scheme=http,ftp,file</nc:capability>
          <nc:capability>urn:ietf:params:xml:ns:netconf:base:1.0?module=ietf-netconf&amp;revision=2011-06-01</nc:capability>
          <nc:capability>urn:ietf:params:xml:ns:netconf:capability:candidate:1.0</nc:capability>
          <nc:capability>urn:ietf:params:xml:ns:netconf:capability:confirmed-commit:1.0</nc:capability>
          <nc:capability>urn:ietf:params:xml:ns:netconf:capability:validate:1.0</nc:capability>
          <nc:capability>urn:ietf:params:xml:ns:netconf:capability:url:1.0?scheme=http,ftp,file</nc:capability>
          <nc:capability>urn:ietf:params:xml:ns:yang:ietf-inet-types?module=ietf-inet-types&amp;revision=2013-07-15</nc:capability>
          <nc:capability>urn:ietf:params:xml:ns:yang:ietf-netconf-monitoring</nc:capability>
          <nc:capability>http://xml.juniper.net/netconf/junos/1.0</nc:capability>
          <nc:capability>http://xml.juniper.net/dmi/system/1.0</nc:capability>
      <nc:rpc-reply xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0">
      <rpc-error>
      <error-type>protocol</error-type>
      <error-tag>operation-failed</error-tag>
      <error-severity>error</error-severity>
      <error-message>
      Failed to unlink YANG path : /var/run/db/yangs
      </error-message>
      </rpc-error>
      </rpc-reply>
        </nc:capabilities>
        <nc:session-id>90451</nc:session-id>
      </nc:hello>
      

      This exception causes negotiation failure and in turn changes the state of NetconfSessionPromise to 'failure'. Once this happens the state cannot be changed while controller keeps retrying session negotiations meaninglessly & indefinitely with max-connection-attempts set to 0.

      In most cases the only way to get out of this repeated session negotiation failure and properly reestablish a NETCONF session with the device is to restart the controller, which is very disruptive in the production environment.

            sangwookha Sangwook Ha
            sangwookha Sangwook Ha
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

              Created:
              Updated:
              Resolved: