|
Along with the PCEP session going down, there is an unhandled PCErr message noticed in the karaf logs of the controller. Please find the snapshot of the logs seen:
2016-04-08 02:24:50,208 | WARN | oupCloseable-6-2 | tateful07TopologySessionListener | 282 - org.opendaylight.bgpcep.pcep-topology-provider - 0.4.4.Lithium-SR4 | Unhandled PCErr message PcerrMessage [_errors=[Errors [_errorObject=ErrorObject [_type=6, _value=10, _ignore=false, _processingRule=false, augmentation=[]], augmentation=[]]], augmentation=[]].
2016-04-08 02:24:50,208 | INFO | oupCloseable-6-2 | AbstractTopologySessionListener | 282 - org.opendaylight.bgpcep.pcep-topology-provider - 0.4.4.Lithium-SR4 | Unhandled message Pcerr [_pcerrMessage=PcerrMessage [_errors=[Errors [_errorObject=ErrorObject [_type=6, _value=10, _ignore=false, _processingRule=false, augmentation=[]], augmentation=[]]], augmentation=[]], augmentation=[]] on session PCEPSessionImpl
{channel=[id: 0xd566f018, /10.18.132.98:22756 => /10.18.162.90:4189], localOpen=Open [_deadTimer=120, _keepalive=30, _sessionId=0, _tlvs=Tlvs [augmentation=[Tlvs1 [_srPceCapability=SrPceCapability [_msd=0, augmentation=[]]], Tlvs1 [_stateful=Stateful [_lspUpdateCapability=true, augmentation=[Stateful1 [_initiation=true]]]]]], augmentation=[]], remoteOpen=Open [_deadTimer=30, _keepalive=5, _sessionId=1, _tlvs=Tlvs [augmentation=[Tlvs1 [], Tlvs1 [_stateful=Stateful [_lspUpdateCapability=true, augmentation=[Stateful1 [_initiation=true]]]]]], _version=ProtocolVersion [_value=1], _ignore=false, _processingRule=false, augmentation=[]]}
|
|
Attachment SRP_object_included.pcap has been added with description: PCAP files with SRP objects in an update packet.
|
|
Attachment SRP_object_missing.pcap has been added with description: PCAP file with SRP object NOT sent in an UPDATE packet
|
|
The SRP is mandatory object in RCUpd message, hence error message is expected.
Ref.: https://tools.ietf.org/html/draft-ietf-pce-stateful-pce-07#section-7.2
The warning in logs is basically harmless, just notifying user that received error message is not binded to any request (as there is no SRP in Error message), so no action is done in the application.
Could you please provide a piece of the log when session went down? Did PCC requested the session closing? How did you managed to send update without SRP object?
|
|
Hi Milos,
We changed the controller code to not send the SRP object. We wanted to generate PCEP error type 6 and value 10. Find the snapshot of the log that shows the PCEP session went down unexpectedly. I am attaching the entire karaf file as well to this bug. Let me know if you would need any further information.
2016-04-08 02:10:48,747 | WARN | oupCloseable-6-1 | AbstractTopologySessionListener | 282 - org.opendaylight.bgpcep.pcep-topology-provider - 0.4.4.Lithium-SR4 | Session PCEPSessionImpl
{channel=[id: 0xc238978e, /10.18.132.98:57951 :> /10.18.162.90:4189], localOpen=Open [_deadTimer=120, _keepalive=30, _sessionId=0, _tlvs=Tlvs [augmentation=[Tlvs1 [_stateful=Stateful [_lspUpdateCapability=true, augmentation=[Stateful1 [_initiation=true]]]], Tlvs1 [_srPceCapability=SrPceCapability [_msd=0, augmentation=[]]]]], augmentation=[]], remoteOpen=Open [_deadTimer=30, _keepalive=5, _sessionId=1, _tlvs=Tlvs [augmentation=[Tlvs1 [_stateful=Stateful [_lspUpdateCapability=true, augmentation=[Stateful1 [_initiation=true]]]], Tlvs1 []]], _version=ProtocolVersion [_value=1], _ignore=false, _processingRule=false, augmentation=[]]}
went down unexpectedly
java.io.IOException: End of input detected. Close the session.
at org.opendaylight.protocol.pcep.impl.PCEPSessionImpl.endOfInput(PCEPSessionImpl.java:266)[268:org.opendaylight.bgpcep.pcep-impl:0.4.4.Lithium-SR4]
at org.opendaylight.protocol.framework.AbstractProtocolSession.channelInactive(AbstractProtocolSession.java:40)[54:org.opendaylight.controller.protocol-framework:0.6.4.Lithium-SR4]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:237)[51:io.netty.transport:4.0.26.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:223)[51:io.netty.transport:4.0.26.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:294)[73:io.netty.codec:4.0.26.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:237)[51:io.netty.transport:4.0.26.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:223)[51:io.netty.transport:4.0.26.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:294)[73:io.netty.codec:4.0.26.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:237)[51:io.netty.transport:4.0.26.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:223)[51:io.netty.transport:4.0.26.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:829)[51:io.netty.transport:4.0.26.Final]
at io.netty.channel.AbstractChannel$AbstractUnsafe$7.run(AbstractChannel.java:610)[51:io.netty.transport:4.0.26.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)[52:io.netty.common:4.0.26.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)[51:io.netty.transport:4.0.26.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)[52:io.netty.common:4.0.26.Final]
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)[52:io.netty.common:4.0.26.Final]
at java.lang.Thread.run(Thread.java:745)[:1.7.0_95]
2016-04-08 02:10:48,749 | WARN | oupCloseable-6-1 | DefaultChannelPipeline | 52 - io.netty.common - 4.0.26.Final | An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
java.lang.IllegalStateException: New transaction PingPongTransaction
{delegate=org.opendaylight.controller.cluster.databroker.DOMBrokerReadWriteTransaction@4abfcdd1}
raced with transacion PingPongTransaction
{delegate=org.opendaylight.controller.cluster.databroker.DOMBrokerReadWriteTransaction@655a0f70}
at org.opendaylight.controller.md.sal.dom.broker.impl.PingPongTransactionChain.slowAllocateTransaction(PingPongTransactionChain.java:136)[167:org.opendaylight.controller.sal-broker-impl:1.2.4.Lithium-SR4]
at org.opendaylight.controller.md.sal.dom.broker.impl.PingPongTransactionChain.allocateTransaction(PingPongTransactionChain.java:148)[167:org.opendaylight.controller.sal-broker-impl:1.2.4.Lithium-SR4]
|
|
Attachment pcep_session_down.log has been added with description: Karaf Log
|
|
Adding one more snapshot to this bug. When a LSP update request without SRP object was sent from the controller it was waiting for a response.
|
|
Attachment PCEP_UPDATE_without_SRP.jpg has been added with description: LSP update request without SRP object waiting for a response
|
|
(In reply to Ajay Chhabria from comment #3)
> Hi Milos,
>
> We changed the controller code to not send the SRP object. We wanted to
> generate PCEP error type 6 and value 10. Find the snapshot of the log that
> shows the PCEP session went down unexpectedly. I am attaching the entire
> karaf file as well to this bug. Let me know if you would need any further
> information.
>
> 2016-04-08 02:10:48,747 | WARN | oupCloseable-6-1 |
> AbstractTopologySessionListener | 282 -
> org.opendaylight.bgpcep.pcep-topology-provider - 0.4.4.Lithium-SR4 | Session
> PCEPSessionImpl
{channel=[id: 0xc238978e, /10.18.132.98:57951 :>
> /10.18.162.90:4189], localOpen=Open [_deadTimer=120, _keepalive=30,
> _sessionId=0, _tlvs=Tlvs [augmentation=[Tlvs1 [_stateful=Stateful
> [_lspUpdateCapability=true, augmentation=[Stateful1 [_initiation=true]]]],
> Tlvs1 [_srPceCapability=SrPceCapability [_msd=0, augmentation=[]]]]],
> augmentation=[]], remoteOpen=Open [_deadTimer=30, _keepalive=5,
> _sessionId=1, _tlvs=Tlvs [augmentation=[Tlvs1 [_stateful=Stateful
> [_lspUpdateCapability=true, augmentation=[Stateful1 [_initiation=true]]]],
> Tlvs1 []]], _version=ProtocolVersion [_value=1], _ignore=false,
> _processingRule=false, augmentation=[]]}
went down unexpectedly
> java.io.IOException: End of input detected. Close the session.
> at
> org.opendaylight.protocol.pcep.impl.PCEPSessionImpl.
> endOfInput(PCEPSessionImpl.java:266)[268:org.opendaylight.bgpcep.pcep-impl:0.
> 4.4.Lithium-SR4]
> at
> org.opendaylight.protocol.framework.AbstractProtocolSession.
> channelInactive(AbstractProtocolSession.java:40)[54:org.opendaylight.
> controller.protocol-framework:0.6.4.Lithium-SR4]
> at
> io.netty.channel.AbstractChannelHandlerContext.
> invokeChannelInactive(AbstractChannelHandlerContext.java:237)[51:io.netty.
> transport:4.0.26.Final]
> at
> io.netty.channel.AbstractChannelHandlerContext.
> fireChannelInactive(AbstractChannelHandlerContext.java:223)[51:io.netty.
> transport:4.0.26.Final]
> at
> io.netty.handler.codec.ByteToMessageDecoder.
> channelInactive(ByteToMessageDecoder.java:294)[73:io.netty.codec:4.0.26.
> Final]
> at
> io.netty.channel.AbstractChannelHandlerContext.
> invokeChannelInactive(AbstractChannelHandlerContext.java:237)[51:io.netty.
> transport:4.0.26.Final]
> at
> io.netty.channel.AbstractChannelHandlerContext.
> fireChannelInactive(AbstractChannelHandlerContext.java:223)[51:io.netty.
> transport:4.0.26.Final]
> at
> io.netty.handler.codec.ByteToMessageDecoder.
> channelInactive(ByteToMessageDecoder.java:294)[73:io.netty.codec:4.0.26.
> Final]
> at
> io.netty.channel.AbstractChannelHandlerContext.
> invokeChannelInactive(AbstractChannelHandlerContext.java:237)[51:io.netty.
> transport:4.0.26.Final]
> at
> io.netty.channel.AbstractChannelHandlerContext.
> fireChannelInactive(AbstractChannelHandlerContext.java:223)[51:io.netty.
> transport:4.0.26.Final]
> at
> io.netty.channel.DefaultChannelPipeline.
> fireChannelInactive(DefaultChannelPipeline.java:829)[51:io.netty.transport:4.
> 0.26.Final]
> at
> io.netty.channel.AbstractChannel$AbstractUnsafe$7.run(AbstractChannel.java:
> 610)[51:io.netty.transport:4.0.26.Final]
> at
> io.netty.util.concurrent.SingleThreadEventExecutor.
> runAllTasks(SingleThreadEventExecutor.java:357)[52:io.netty.common:4.0.26.
> Final]
> at
> io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)[51:io.netty.
> transport:4.0.26.Final]
> at
> io.netty.util.concurrent.SingleThreadEventExecutor$2.
> run(SingleThreadEventExecutor.java:111)[52:io.netty.common:4.0.26.Final]
> at
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.
> run(DefaultThreadFactory.java:137)[52:io.netty.common:4.0.26.Final]
> at java.lang.Thread.run(Thread.java:745)[:1.7.0_95]
> 2016-04-08 02:10:48,749 | WARN | oupCloseable-6-1 | DefaultChannelPipeline
> | 52 - io.netty.common - 4.0.26.Final | An exceptionCaught() event was
> fired, and it reached at the tail of the pipeline. It usually means the last
> handler in the pipeline did not handle the exception.
> java.lang.IllegalStateException: New transaction
> PingPongTransaction
{delegate=org.opendaylight.controller.cluster.databroker.
> DOMBrokerReadWriteTransaction@4abfcdd1}
raced with transacion
> PingPongTransaction
{delegate=org.opendaylight.controller.cluster.databroker.
> DOMBrokerReadWriteTransaction@655a0f70}
> at
> org.opendaylight.controller.md.sal.dom.broker.impl.PingPongTransactionChain.
> slowAllocateTransaction(PingPongTransactionChain.java:136)[167:org.
> opendaylight.controller.sal-broker-impl:1.2.4.Lithium-SR4]
> at
> org.opendaylight.controller.md.sal.dom.broker.impl.PingPongTransactionChain.
> allocateTransaction(PingPongTransactionChain.java:148)[167:org.opendaylight.
> controller.sal-broker-impl:1.2.4.Lithium-SR4]
The log says the session was terminated because of expired dead-timer, which mean the ODL PCE did not received any message from PCC for some time (30s in this case), however something goes wrong at closing as there are some exceptions cumming from MD-SAL.
Could you check what is happening on the router (PCC) side?
|
|
(In reply to Ajay Chhabria from comment #5)
> Adding one more snapshot to this bug. When a LSP update request without SRP
> object was sent from the controller it was waiting for a response.
This infinite waiting problem is tracked in https://bugs.opendaylight.org/show_bug.cgi?id=5612
|
|
Thank You Milos for the comments. I now repeated the test without configuring the dead timer on the XRv. I think the default value is 120 seconds. I now see the following exception:
2016-04-12 11:20:12,008 | INFO | oupCloseable-5-3 | AbstractTopologySessionListener | 282 - org.opendaylight.bgpcep.pcep-topology-provider - 0.4.4.Lithium-SR4 | Unhandled message Pcerr [_pcerrMessage=PcerrMessage [_errors=[Errors [_errorObject=ErrorObject [_type=6, _value=10, _ignore=false, _processingRule=false, augmentation=[]], augmentation=[]]], augmentation=[]], augmentation=[]] on session PCEPSessionImpl
{channel=[id: 0xfafea19e, /10.18.132.94:60011 => /10.18.160.123:4189], localOpen=Open [_deadTimer=120, _keepalive=30, _sessionId=1, _tlvs=Tlvs [augmentation=[Tlvs1 [_srPceCapability=SrPceCapability [_msd=0, augmentation=[]]], Tlvs1 [_stateful=Stateful [_lspUpdateCapability=true, augmentation=[Stateful1 [_initiation=true]]]]]], augmentation=[]], remoteOpen=Open [_deadTimer=120, _keepalive=30, _sessionId=2, _tlvs=Tlvs [augmentation=[Tlvs1 [], Tlvs1 [_stateful=Stateful [_lspUpdateCapability=true, augmentation=[Stateful1 [_initiation=true]]]]]], _version=ProtocolVersion [_value=1], _ignore=false, _processingRule=false, augmentation=[]]}
2016-04-12 11:20:21,728 | INFO | oupCloseable-5-2 | AbstractSessionNegotiator | 54 - org.opendaylight.controller.protocol-framework - 0.6.4.Lithium-SR4 | Failed to send message Pcerr [_pcerrMessage=PcerrMessage [_errors=[Errors [_errorObject=ErrorObject [_type=1, _value=2, augmentation=[]], augmentation=[]]], augmentation=[]], augmentation=[]]
java.nio.channels.ClosedChannelException
2016-04-12 11:20:21,730 | WARN | oupCloseable-5-2 | DefaultPromise | 52 - io.netty.common - 4.0.26.Final | An exception was thrown by org.opendaylight.protocol.framework.AbstractSessionNegotiator$1.operationComplete()
java.lang.IllegalStateException: complete already: DefaultPromise@19303a72(failure(java.io.IOException: Connection reset by peer)
at io.netty.util.concurrent.DefaultPromise.setFailure(DefaultPromise.java:418)[52:io.netty.common:4.0.26.Final]
at org.opendaylight.protocol.framework.AbstractSessionNegotiator.negotiationFailed(AbstractSessionNegotiator.java:53)[54:org.opendaylight.controller.protocol-framework:0.6.4.Lithium-SR4]
at org.opendaylight.protocol.framework.AbstractSessionNegotiator$1.operationComplete(AbstractSessionNegotiator.java:69)[54:org.opendaylight.controller.protocol-framework:0.6.4.Lithium-SR4]
at org.opendaylight.protocol.framework.AbstractSessionNegotiator$1.operationComplete(AbstractSessionNegotiator.java:64)[54:org.opendaylight.controller.protocol-framework:0.6.4.Lithium-SR4]
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)[52:io.netty.common:4.0.26.Final]
at io.netty.util.concurrent.DefaultPromise.notifyLateListener(DefaultPromise.java:621)[52:io.netty.common:4.0.26.Final]
at io.netty.util.concurrent.DefaultPromise.addListener(DefaultPromise.java:138)[52:io.netty.common:4.0.26.Final]
at io.netty.channel.DefaultChannelPromise.addListener(DefaultChannelPromise.java:93)[51:io.netty.transport:4.0.26.Final]
at io.netty.channel.DefaultChannelPromise.addListener(DefaultChannelPromise.java:28)[51:io.netty.transport:4.0.26.Final]
at org.opendaylight.protocol.framework.AbstractSessionNegotiator.sendMessage(AbstractSessionNegotiator.java:63)[54:org.opendaylight.controller.protocol-framework:0.6.4.Lithium-SR4]
at org.opendaylight.protocol.pcep.impl.AbstractPCEPSessionNegotiator.sendErrorMessage(AbstractPCEPSessionNegotiator.java:145)[268:org.opendaylight.bgpcep.pcep-impl:0.4.4.Lithium-SR4]
at org.opendaylight.protocol.pcep.impl.AbstractPCEPSessionNegotiator.access$100(AbstractPCEPSessionNegotiator.java:45)[268:org.opendaylight.bgpcep.pcep-impl:0.4.4.Lithium-SR4]
at org.opendaylight.protocol.pcep.impl.AbstractPCEPSessionNegotiator$1.run(AbstractPCEPSessionNegotiator.java:167)[268:org.opendaylight.bgpcep.pcep-impl:0.4.4.Lithium-SR4]
at io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38)[52:io.netty.common:4.0.26.Final]
at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:120)[52:io.netty.common:4.0.26.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)[52:io.netty.common:4.0.26.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)[51:io.netty.transport:4.0.26.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)[52:io.netty.common:4.0.26.Final]
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)[52:io.netty.common:4.0.26.Final]
at java.lang.Thread.run(Thread.java:745)[:1.7.0_75]
Caused by: java.nio.channels.ClosedChannelException
|
|
The PCEP-ERROR Type=1 Value=2 -> "no Open message received before the expiration of the OpenWait timer".
Looks like something wrong is going on there...
I will try to reproduce it with the latest Beryllium.
|
|
The problem happens when ODL PCE receive unhanded message (i.e. Error message w/o SRP) -> new transaction is started but never submitted, hence it cause race-condition between transactions.
|
|
The session is closed by remote end. The exception mentioned above, is risen when session goes down.
|
|
master: https://git.opendaylight.org/gerrit/#/c/37806/
|
|
Beryllium cherry-pick
https://git.opendaylight.org/gerrit/#/c/38308/2
|
Generated at Wed Feb 07 19:13:02 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.