[NETVIRT-758] Many Null Pointer Exceptions for NAT and ACL Services seen on Carbon CSIT runs Created: 03/Jul/17  Updated: 03/May/18  Resolved: 11/Jul/17

Status: Resolved
Project: netvirt
Component/s: General
Affects Version/s: Carbon
Fix Version/s: None

Type: Bug
Reporter: Vivekanandan Narasimhan Assignee: Vivekanandan Narasimhan
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 8793

 Description   

Many NPEs were seen in ACLService and NATService in recent Carbon CSIT runs:

https://logs.opendaylight.org/releng/jenkins092/netvirt-csit-1node-openstack-newton-nodl-v2-gate-stateful-carbon/965/odl1_karaf.log.gz



 Comments   
Comment by Vivekanandan Narasimhan [ 04/Jul/17 ]

This issue has the potential to cause upstream CSIT failures, so I am bumping this bug to critical.

Comment by Aswin Suryanarayanan [ 04/Jul/17 ]

The following patch fixes the NPEs in programConntrackRecircRules and syncSpecificAclFlow in AclService:

https://git.opendaylight.org/gerrit/#/c/59900/

Comment by Shashidhar R [ 04/Jul/17 ]

The commit below fixes the NPE observed in bindAcl() of AclService:
https://git.opendaylight.org/gerrit/#/c/59916/

Comment by Sam Hague [ 05/Jul/17 ]

https://git.opendaylight.org/gerrit/59887

Comment by A H [ 05/Jul/17 ]

A patch was submitted to fix this bug in Carbon SR1: https://git.opendaylight.org/gerrit/#/c/59887/
https://git.opendaylight.org/gerrit/#/c/59900
https://git.opendaylight.org/gerrit/#/c/59910

To better assess the impact of this bug and fix, could someone from your team please help us identify the following:
Regression: Is this bug a regression of functionality/performance/feature compared to Carbon?
Severity: Could you elaborate on the severity of this bug? Is this a BLOCKER such that we cannot release Carbon SR1 without it?
Workaround: Is there a workaround such that we can write a release note instead?
Testing: Could you also elaborate on the testing of this patch? How extensively has this patch been tested? Is it covered by any unit tests or system tests?
Impact: Does this fix impact any dependent projects?

Comment by Stephen Kitt [ 05/Jul/17 ]

(In reply to A H from comment #5)
> Regression: Is this bug a regression of functionality/performance/feature
> compared to Carbon?

Not sure, it’s probably also present in Carbon.

> Severity: Could you elaborate on the severity of this bug? Is this a
> BLOCKER such that we cannot release Carbon SR1 without it?

From our point of view, yes: these errors cause severe misbehaviour in NetVirt, rendering the controller useless in setups where NetVirt is used and the conditions causing these NPEs arise. Given their frequency in CSIT, that’s likely to happen in production setups too.

> Workaround: Is there a workaround such that we can write a release note
> instead?

No.

> Testing: Could you also elaborate on the testing of this patch? How
> extensively has this patch been tested? Is it covered by any unit tests or
> system tests?

It’s covered by CSIT.

> Impact: Does this fix impact any dependent projects?

No.

Comment by A H [ 05/Jul/17 ]

Please help elaborate on the regression. After the cutoff, the release team only permits blocker bugs with a true regression in functionality from a previous release. My understanding from reading the bug description is that the NPEs did not appear before but have started appearing in Carbon SR1. This suggests to me that there is some demonstrable regression.

(In reply to Stephen Kitt from comment #6)

Comment by A H [ 11/Jul/17 ]

Has this blocker bug been verified as fixed in Carbon SR1 Build 20170711?

Generated at Wed Feb 07 20:22:23 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.