bgpcep / BGPCEP-221

"IllegalStateException: MpReachNlri codec not available" when pushing 10k routes or more

    • Type: Bug
    • Resolution: Done
    • Bugzilla Migration 1.0
    • Bugzilla Migration
    • BGP
    • None
    • Operating System: All
      Platform: All

    • 3186
    • High

      A "heisenbug" which occurs about 90% of the time. When a large number of routes is pushed, BGP dies and this exception is logged for every update. The 90% figure was reported to me for a case with 20 routes with linkstate; what I tested is a case with 10k routes without linkstate, which hits the bug in roughly 6 runs out of 7. The problem seems less likely to occur when fewer routes are pushed (with the probability approaching zero at around 2000 routes) and much more likely to occur when linkstate is involved.

      Steps to reproduce:
      1. ODL_ROOT=<where_your_ODL_installation_lives>
      2. mkdir -p $ODL_ROOT/etc/opendaylight/karaf
      3. cp $ODL_ROOT/system/org/opendaylight/bgpcep/bgp-controller-config/*/bgp-controller-config-0.4.0-SNAPSHOT.xml $ODL_ROOT/etc/opendaylight/karaf/41-bgp-example.xml
      4. Uncomment the deactivated "single BGP peer" section in the newly created file ($ODL_ROOT/etc/opendaylight/karaf/41-bgp-example.xml).
      5. Boot ODL.
      6. Install features "odl-restconf", "odl-bgpcep-bgp-all" and "odl-netconf-connector-all".
      7. Wait for ODL to fully load (run "top" in another console and wait until CPU usage of the massive Java process stays below 5%).
      8. Get the tool from https://git.opendaylight.org/gerrit/#/c/19603/3/test/tools/fastbgp/play.py
      9. python play.py --gencount=10000
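
For repeated attempts, steps 2 and 3 above can be scripted. The following is a minimal sketch only: the function name `install_example_config` is hypothetical, and it assumes that exactly one version directory matches the `*` in the source path.

```python
import glob
import os
import shutil


def install_example_config(odl_root):
    """Create etc/opendaylight/karaf and copy the sample BGP config into it.

    Mirrors steps 2 and 3. The function name is hypothetical; the paths are
    the ones given in the steps.
    """
    target_dir = os.path.join(odl_root, "etc", "opendaylight", "karaf")
    os.makedirs(target_dir, exist_ok=True)  # same effect as mkdir -p

    pattern = os.path.join(
        odl_root, "system", "org", "opendaylight", "bgpcep",
        "bgp-controller-config", "*",
        "bgp-controller-config-0.4.0-SNAPSHOT.xml",
    )
    matches = sorted(glob.glob(pattern))
    if len(matches) != 1:
        # Assumption: the wildcard is expected to match a single version dir.
        raise RuntimeError(
            "expected exactly one config file, found %d" % len(matches))

    target = os.path.join(target_dir, "41-bgp-example.xml")
    shutil.copyfile(matches[0], target)
    return target
```

Step 4 (uncommenting the peer section) still has to be done by hand in the copied file.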

      When the bug hits (if it does not, reboot ODL and try again), no routes make it to the RIB or the topology (use curl with the appropriate restconf URL to verify this), and the log file contains a heavy load of exceptions like this:

      2015-05-11 15:13:10,329 | WARN | oupCloseable-3-3 | DefaultChannelPipeline | 149 - io.netty.common - 4.0.26.Final | An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
      java.lang.IllegalStateException: MpReachNlri codec not available
      at com.google.common.base.Preconditions.checkState(Preconditions.java:173)[94:com.google.guava:18.0.0]
      at org.opendaylight.protocol.bgp.rib.impl.RIBSupportContextImpl.serialiazeReachNlri(RIBSupportContextImpl.java:168)[259:org.opendaylight.bgpcep.bgp-rib-impl:0.4.0.SNAPSHOT]
      at org.opendaylight.protocol.bgp.rib.impl.RIBSupportContextImpl.writeRoutes(RIBSupportContextImpl.java:133)[259:org.opendaylight.bgpcep.bgp-rib-impl:0.4.0.SNAPSHOT]
      at org.opendaylight.protocol.bgp.rib.impl.TableContext.writeRoutes(TableContext.java:49)[259:org.opendaylight.bgpcep.bgp-rib-impl:0.4.0.SNAPSHOT]
      at org.opendaylight.protocol.bgp.rib.impl.AdjRibInWriter.updateRoutes(AdjRibInWriter.java:229)[259:org.opendaylight.bgpcep.bgp-rib-impl:0.4.0.SNAPSHOT]
      at org.opendaylight.protocol.bgp.rib.impl.BGPPeer.onMessage(BGPPeer.java:120)[259:org.opendaylight.bgpcep.bgp-rib-impl:0.4.0.SNAPSHOT]
      at org.opendaylight.protocol.bgp.rib.impl.BGPPeer.onMessage(BGPPeer.java:65)[259:org.opendaylight.bgpcep.bgp-rib-impl:0.4.0.SNAPSHOT]
      at org.opendaylight.protocol.bgp.rib.impl.BGPSessionImpl.handleMessage(BGPSessionImpl.java:217)[259:org.opendaylight.bgpcep.bgp-rib-impl:0.4.0.SNAPSHOT]
      at org.opendaylight.protocol.bgp.rib.impl.BGPSessionImpl.handleMessage(BGPSessionImpl.java:53)[259:org.opendaylight.bgpcep.bgp-rib-impl:0.4.0.SNAPSHOT]
      at org.opendaylight.protocol.framework.AbstractProtocolSession.channelRead0(AbstractProtocolSession.java:53)[151:org.opendaylight.controller.protocol-framework:0.6.0.SNAPSHOT]
      at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)[148:io.netty.transport:4.0.26.Final]
      at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)[148:io.netty.transport:4.0.26.Final]
      at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)[148:io.netty.transport:4.0.26.Final]
      at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)[174:io.netty.codec:4.0.26.Final]
      at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)[148:io.netty.transport:4.0.26.Final]
      at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)[148:io.netty.transport:4.0.26.Final]
      at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)[174:io.netty.codec:4.0.26.Final]
      at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)[148:io.netty.transport:4.0.26.Final]
      at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)[148:io.netty.transport:4.0.26.Final]
      at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)[148:io.netty.transport:4.0.26.Final]
      at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)[148:io.netty.transport:4.0.26.Final]
      at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)[148:io.netty.transport:4.0.26.Final]
      at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)[148:io.netty.transport:4.0.26.Final]
      at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)[148:io.netty.transport:4.0.26.Final]
      at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)[148:io.netty.transport:4.0.26.Final]
      at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)[149:io.netty.common:4.0.26.Final]
      at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)[149:io.netty.common:4.0.26.Final]
      at java.lang.Thread.run(Unknown Source)[:1.7.0_67]
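
To tell quickly whether a given run hit the bug, the log can be scanned for the exception line shown in the trace above. This is a rough sketch; the function name `count_codec_failures` is hypothetical, and the log location is an assumption (the Karaf default of `$ODL_ROOT/data/log/karaf.log` may not apply to every installation).

```python
# Exception line exactly as it appears in the trace above.
MARKER = "java.lang.IllegalStateException: MpReachNlri codec not available"


def count_codec_failures(lines):
    """Return how many of the given log lines contain the exception marker.

    `lines` can be any iterable of strings, for example an open log file.
    """
    return sum(1 for line in lines if MARKER in line)
```

For example, `count_codec_failures(open(path))` with `path` pointing at the log file returns 0 for a clean run; since the exception is logged for every update, a failed run should give a large count.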

            dkutenicsova Dana Kutenicsova
            jbehran@cisco.com Jozef Behran