netconf / NETCONF-470

Device access can fail shortly after cluster member is killed

Details

    • Bug
    • Status: Resolved
    • Resolution: Cannot Reproduce
    • None
    • None
    • netconf
    • None
    • Operating System: All
      Platform: All

    • 9148

    Description

      This manifests as a Robot failure, especially in the "all" tests, on both Carbon and Nitrogen.

      For example, in this failure [0], the POST fails with:
      java.lang.IllegalStateException: Can't create ProxyReadTransaction
      at org.opendaylight.netconf.topology.singleton.impl.ProxyDOMDataBroker.newReadWriteTransaction(ProxyDOMDataBroker.java:92)
      ...

      The real cause is:
      Caused by: java.util.concurrent.TimeoutException: Futures timed out after [5 seconds]
      at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:223)
      at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:227)
      at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
      at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
      at scala.concurrent.Await$.result(package.scala:190)
      at scala.concurrent.Await.result(package.scala)
      at org.opendaylight.netconf.topology.singleton.impl.ProxyDOMDataBroker.newReadWriteTransaction(ProxyDOMDataBroker.java:84)
      at org.opendaylight.netconf.sal.restconf.impl.BrokerFacade.commitConfigurationDataPost(BrokerFacade.java:475)
      ...

      According to karaf.log [1], new leaders had not yet been elected at that time, so the Akka ask is expected to time out.

      The data broker now supports a tell-based protocol, which is designed to work in such cases. Because netconf does not use the data broker, it should either improve its own code to offer similar functionality, or at least document that access to mounted devices can fail randomly during cluster HA events.

      Robot tests can be relaxed (by waiting for new leaders) if Netconf behavior is not going to be improved soon.
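
      As a rough illustration of "waiting for new leaders" from the caller's side, the sketch below retries an operation while it throws IllegalStateException (the exception type seen in the trace above). The class name, method name, attempt count and delay are hypothetical and are not part of netconf or the Robot suites; the failing operation is simulated rather than calling any real netconf API.

```java
import java.util.concurrent.Callable;

// Hypothetical helper: not part of netconf or its test suites.
public final class RetryUntilLeaderElected {

    /**
     * Calls the operation until it succeeds or maxAttempts is reached.
     * Only IllegalStateException is retried; any other exception propagates at once.
     */
    public static <T> T retry(Callable<T> operation, int maxAttempts, long delayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IllegalStateException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.call();
            } catch (IllegalStateException e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Give the cluster time to elect new leaders before the next try.
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated device access that fails twice, as if no leader were elected yet.
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("Can't create ProxyReadTransaction");
            }
            return "ok";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

      A fixed delay keeps the sketch short; a real test would more likely poll the cluster state until leaders are reported and only then issue the request.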

      [0] https://logs.opendaylight.org/releng/jenkins092/netconf-csit-3node-clustering-all-carbon/399/log.html.gz#s1-s9-t7-k2-k1-k1-k4-k7-k1
      [1] https://logs.opendaylight.org/releng/jenkins092/netconf-csit-3node-clustering-all-carbon/399/odl2_karaf.log.gz

          People

            Kostiantyn Kostiantyn Nosach
            vrpolak Vratko Polak