[OPNFLWPLUG-471] NullPointerException from OpenflowPlugin.application.StatisticsManagerImpl.java:304 Created: 29/May/15  Updated: 27/Sep/21  Resolved: 02/Jun/15

Status: Resolved
Project: OpenFlowPlugin
Component/s: General
Affects Version/s: None
Fix Version/s: None

Type: Bug
Reporter: Vasanthan Balasubramaniyan Assignee: Vasanthan Balasubramaniyan
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 3431

 Description   

Build# 1901
OpenvSwitch# version 2.3.1

Problem:
Statistics Manager thrown with NullPointerException or Statstics Manager killed.

As Summary,
Observation#1 : 64 switch, TSDR collection is successful, but thrown NullPointerException on Statistics Manger
Observation#2: 127 switch, occasionally TSDR collection partially successful, but got OpenFlowPlugin Exception
Observation#3: 127 Switch, Most consistent TSDR collection doesn’t successful (Waited for 4 hours for same)

Observation#1 (with #64 switches got NullPointerException)
===================================================
Exception in thread "pool-28-thread-1" java.lang.RuntimeException: java.lang.NullPointerException
at com.lmax.disruptor.FatalExceptionHandler.handleEventException(FatalExceptionHandler.java:45)
at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:147)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
at org.opendaylight.openflowplugin.applications.statistics.manager.impl.StatisticsManagerImpl.disconnectedNodeUnregistration(StatisticsManagerImpl.java:304)
at org.opendaylight.openflowplugin.applications.statistics.manager.impl.StatNodeRegistrationImpl.disconnectFlowCapableNode(StatNodeRegistrationImpl.java:144)
at org.opendaylight.openflowplugin.applications.statistics.manager.impl.StatNodeRegistrationImpl.onNodeRemoved(StatNodeRegistrationImpl.java:167)
at org.opendaylight.yangtools.yang.binding.util.NotificationListenerInvoker.invokeNotification(NotificationListenerInvoker.java:91)
at org.opendaylight.controller.md.sal.binding.impl.BindingDOMNotificationListenerAdapter.onNotification(BindingDOMNotificationListenerAdapter.java:44)
at org.opendaylight.controller.md.sal.dom.broker.impl.DOMNotificationRouterEvent.deliverNotification(DOMNotificationRouterEvent.java:50)
at org.opendaylight.controller.md.sal.dom.broker.impl.DOMNotificationRouter$1.onEvent(DOMNotificationRouter.java:68)
at org.opendaylight.controller.md.sal.dom.broker.impl.DOMNotificationRouter$1.onEvent(DOMNotificationRouter.java:65)
at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
... 3 more
============================================================
Observation#2 (Sometimes particall collection happening with 127 switch,but exception as below)
===========================================================
015-05-28 15:11:15,290 | WARN | OFmsgFinisher-0 | TicketFinisherImpl | 254 - org.opendaylight.openflowplugin - 0.1.0.SNAPSHOT | processing (translate, publish) of ticket failed
java.util.concurrent.ExecutionException: java.lang.NullPointerException
at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)[61:com.google.guava:18.0.0]
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)[61:com.google.guava:18.0.0]
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)[61:com.google.guava:18.0.0]
at org.opendaylight.openflowplugin.openflow.md.queue.TicketFinisherImpl.run(TicketFinisherImpl.java:50)[254:org.opendaylight.openflowplugin:0.1.0.SNAPSHOT]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)[:1.7.0_75]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)[:1.7.0_75]
at java.lang.Thread.run(Thread.java:745)[:1.7.0_75]
Caused by: java.lang.NullPointerException
2015-05-28 15:11:15,292 | WARN | OFmsgProcessor-2 | TicketProcessorFactoryImpl | 254 - org.opendaylight.openflowplugin - 0.1.0.SNAPSHOT | translation problem: null
2015-05-28 15:11:15,292 | WARN | OFmsgFinisher-0 | TicketFinisherImpl | 254 - org.opendaylight.openflowplugin - 0.1.0.SNAPSHOT | processing (translate, publish) of ticket failed
java.util.concurrent.ExecutionException: java.lang.NullPointerException
at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)[61:com.google.guava:18.0.0]
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)[61:com.google.guava:18.0.0]
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)[61:com.google.guava:18.0.0]
at org.opendaylight.openflowplugin.openflow.md.queue.TicketFinisherImpl.run(TicketFinisherImpl.java:50)[254:org.opendaylight.openflowplugin:0.1.0.SNAPSHOT]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)[:1.7.0_75]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)[:1.7.0_75]
at java.lang.Thread.run(Thread.java:745)[:1.7.0_75]
Caused by: java.lang.NullPointerException
==========================================================================
Observation#3# TSDR Collection itself doesn’t happening and got the following message from Statistics Manager.I believe Statistics Manager doesn’t functioning properly.
sudo mn --topo tree,7 --switch ovsk,protocols=OpenFlow13 --controller remote,ip=10.16.148.232 (i.e 127 switches)
=================================================================
2015-05-28 20:58:46,685 | INFO | pool-32-thread-1 | StatisticsManagerImpl | 259 - org.opendaylight.openflowplugin.applications.statistics-manager - 0.1.0.SNAPSHOT | No existing collector found for new node. Creating a new collector for KeyedInstanceIdentifier

{targetType=interface org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node, path=[org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.Nodes, org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node[key=NodeKey [_id=Uri [_value=openflow:45]]]]}

2015-05-28 20:58:46,840 | INFO | pool-32-thread-1 | StatisticsManagerImpl | 259 - org.opendaylight.openflowplugin.applications.statistics-manager - 0.1.0.SNAPSHOT | No existing collector found for new node. Creating a new collector for KeyedInstanceIdentifier

{targetType=interface org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node, path=[org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.Nodes, org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node[key=NodeKey [_id=Uri [_value=openflow:124]]]]}

2015-05-28 20:58:46,923 | INFO | pool-32-thread-1 | StatisticsManagerImpl | 259 - org.opendaylight.openflowplugin.applications.statistics-manager - 0.1.0.SNAPSHOT | No existing collector found for new node. Creating a new collector for KeyedInstanceIdentifier

{targetType=interface org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node, path=[org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.Nodes, org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node[key=NodeKey [_id=Uri [_value=openflow:37]]]]}

2015-05-28 21:00:06,279 | INFO | l for user karaf | ZooKeeper | 298 - org.apache.hadoop.zookeeper - 3.3.5 | Initiating client connection, connectString=localhost:2181 sessionTimeout=180000 watcher=hconnection
2015-05-28 21:00:06,280 | INFO | raf-SendThread() | ClientCnxn | 298 - org.apache.hadoop.zookeeper - 3.3.5 | Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181
2015-05-28 21:00:06,281 | INFO | l for user karaf | RecoverableZooKeeper | 301 - wrap_mvn_org.apache.hbase_hbase_0.94.15 - 0.0.0 | The identifier of this process is 2960@user-01
2015-05-28 21:00:06,282 | INFO | (localhost:2181) | ClientCnxn | 298 - org.apache.hadoop.zookeeper - 3.3.5 | Socket connection established to localhost/0:0:0:0:0:0:0:1:2181, initiating session
2015-05-28 21:00:06,307 | INFO | (localhost:2181) | ClientCnxn | 298 - org.apache.hadoop.zookeeper - 3.3.5 | Session establishment complete on server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x14d9b1e5f8c0015, negotiated timeout = 40000



 Comments   
Comment by Kamal Rameshan [ 29/May/15 ]

Its happening as sometimes we see more than 1 nodeRemoved notifications come in.

Working on the fix.

Comment by Kamal Rameshan [ 29/May/15 ]

Hi Vasanthan,

Check the latest and mark it as closed after verification.

Thanks
Kamal

Comment by Vasanthan Balasubramaniyan [ 01/Jun/15 ]

Tested on Build#2064 (integration).

Tested with upto 254 switches doesn't generated this exception and TSDR collection was successful.

Note: Unable to move the status from "Waiting for Review" to "Closed".Pls close this bugzilla once review completed.

Comment by Vasanthan Balasubramaniyan [ 01/Jun/15 ]

On Build#2064 with CpQD switch this issue is visible.

Note: openvswitch with 2.3.1 issue not visible!

========Exception================

OT | translation problem: null
2015-06-01 17:00:55,206 | WARN | OFmsgFinisher-0 | TicketFinisherImpl | 254 - org.opendaylight.openflowplugin - 0.1.0.SNAPSHOT | processing (translate, publish) of ticket failed
java.util.concurrent.ExecutionException: java.lang.NullPointerException
at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)[61:com.google.guava:18.0.0]
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)[61:com.google.guava:18.0.0]
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)[61:com.google.guava:18.0.0]
at org.opendaylight.openflowplugin.openflow.md.queue.TicketFinisherImpl.run(TicketFinisherImpl.java:50)[254:org.opendaylight.openflowplugin:0.1.0.SNAPSHOT]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)[:1.7.0_75]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)[:1.7.0_75]
at java.lang.Thread.run(Thread.java:745)[:1.7.0_75]
Caused by: java.lang.NullPointerException
at org.opendaylight.openflowplugin.openflow.md.core.sal.convertor.ActionConvertor.toMDSalActions(ActionConvertor.java:944)[254:org.opendaylight.openflowplugin:0.1.0.SNAPSHOT]
at org.opendaylight.openflowplugin.openflow.md.core.sal.convertor.OFToMDSalFlowConvertor.toSALInstruction(OFToMDSalFlowConvertor.java:71)[254:org.opendaylight.openflowplugin:0.1.0.SNAPSHOT]
at org.opendaylight.openflowplugin.openflow.md.core.sal.convertor.FlowStatsResponseConvertor.toSALFlowStats(FlowStatsResponseConvertor.java:103)[254:org.opendaylight.openflowplugin:0.1.0.SNAPSHOT]
at org.opendaylight.openflowplugin.openflow.md.core.sal.convertor.FlowStatsResponseConvertor.toSALFlowStatsList(FlowStatsResponseConvertor.java:49)[254:org.opendaylight.openflowplugin:0.1.0.SNAPSHOT]
at org.opendaylight.openflowplugin.openflow.md.core.translator.MultipartReplyTranslator.translate(MultipartReplyTranslator.java:134)[254:org.opendaylight.openflowplugin:0.1.0.SNAPSHOT]

Comment by Kamal Rameshan [ 01/Jun/15 ]

This does not seem to be related to the statistics bug.

It is happening in the converter when converting actions present in the flow to md-sal actions.

Seems like the action-list inside the instruction of the flow is null.

Comment by Vasanthan Balasubramaniyan [ 02/Jun/15 ]

(In reply to Kamal Rameshan from comment #5)
> This does not seem to be related to the statistics bug.
>
> It is happening in the converter when converting actions present in the flow
> to md-sal actions.
>
> Seems like the action-list inside the instruction of the flow is null.

Does required raised another bugzilla in MD-SAL?

Comment by Kamal Rameshan [ 02/Jun/15 ]

Are you asking that do we need a separate bug to capture this issue?

I would say yes. And not under MDSAL but under Openflowplugin.

Also under that bug description , please give as much details as possible, so that we can reproduce.
For instance if you can post the flow xml which is causing the flow-stats to break it would be helpful.

I am marking this issue as resolved.

Generated at Wed Feb 07 20:32:34 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.