[NEUTRON-159] Sporadic NeutronNetworkJAXBTest & NeutronFirewallJAXBTest failures Created: 10/Apr/18 Updated: 10/Oct/18 Resolved: 10/Oct/18 |
|
| Status: | Resolved |
| Project: | neutron |
| Component/s: | neutron-spi |
| Affects Version/s: | Oxygen, Fluorine |
| Fix Version/s: | Oxygen-SR4, Fluorine-SR1, Neon |
| Type: | Bug | Priority: | Medium |
| Reporter: | Michael Vorburger | Assignee: | Michael Vorburger |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Description |
|
As raised on https://lists.opendaylight.org/pipermail/neutron-dev/2018-March/001633.html, and seen again e.g. in https://jenkins.opendaylight.org/releng/job/neutron-maven-verify-fluorine-mvn33-openjdk8/25/console for the (unrelated) minor change https://git.opendaylight.org/gerrit/#/c/70354/, we're occasionally, but regularly enough, hitting these weird NeutronNetworkJAXBTest & NeutronFirewallJAXBTest failures:
Failed tests:
  NeutronFirewallJAXBTest.test_NeutronFirewallPolicy_JAXB:61 NeutronFirewallPolicy JAXB Test 2: Testing tenant_id failed expected:<aa902936679e4ea29bfe1158e3450a13> but was:<null>
  NeutronFirewallJAXBTest.test_NeutronFirewall_JAXB:31 NeutronFirewall JAXB Test 2: Testing tenant_id failed expected:<aa902936679e4ea29bfe1158e3450a13> but was:<null>
  NeutronFloatingIpJAXBTest.test_NeutronFloatingIp_JAXB:34 NeutronFloatingIp JAXB Test 2: Testing tenant_id failed expected:<4969c491a3c74ee4af974e6d800c62de> but was:<null>
  NeutronLoadBalancerHealthMonitorJAXBTest.test_NeutronLoadBalancerHealthMonitor_JAXB:58 NeutronLoadBalancerHealthMonitor JAXB Test 10: Testing tenant_id failed expected:<00045a7b-796b-4f26-9cf9-9e82d248fda7> but was:<null>
  NeutronLoadBalancerJAXBTest.test_NeutronLoadBalancer_JAXB:48 NeutronLoadBalancer JAXB Test 8: Testing tenant_id failed expected:<4969c491a3c74ee4af974e6d800c62de> but was:<null>
  NeutronLoadBalancerListenerJAXBTest.test_NeutronLoadBalancerListener_JAXB:62 NeutronLoadBalancerListener JAXB Test 9: Testing tenant_id failed expected:<11145a7b-796b-4f26-9cf9-9e82d248fda7> but was:<null>
  NeutronLoadBalancerPoolJAXBTest.test_NeutronLoadBalancerPool_JAXB:47 NeutronLoadBalancerPool JAXB Test 7: Testing Tenant_id failed expected:<1a3e005cf9ce40308c900bcb08e5320c> but was:<null>
  NeutronLoadBalancerPoolMemberJAXBTest.test_NeutronLoadBalancerPoolMember_JAXB:47 NeutronLoadBalancerPoolMember JAXB Test 7: Testing tenant_id failed expected:<00045a7b-796b-4f26-9cf9-9e82d248fda7> but was:<null>
  NeutronMeteringLabelJAXBTest.test_NeutronMeteringLabel_JAXB:34 NeutronMeteringLabel JAXB Test 4: Testing tenant_id failed expected:<9bacb3c5d39d41a79512987f338cf177> but was:<null>
  NeutronNetworkJAXBTest.test_NeutronNetwork_MultipleProvider_JAXB:79 NeutronNetwork JAXB Test 2: Testing tenant_id failed expected:<9bacb3c5d39d41a79512987f338cf177> but was:<null>
  NeutronNetworkJAXBTest.test_NeutronNetwork_SingleProvider_JAXB:35 NeutronNetwork JAXB Test 2: Testing tenant_id failed expected:<9bacb3c5d39d41a79512987f338cf177> but was:<null>
  NeutronNetworkQosJAXBTest.test_NeutronNetworkQos_JAXB:33 NeutronNetwork JAXB Test 2: Testing tenant_id failed expected:<9bacb3c5d39d41a79512987f338cf177> but was:<null>
  NeutronPortJAXBTest.test_NeutronPort_JAXB:39 NeutronPort JAXB Test 2: Testing tenant_id failed expected:<9bacb3c5d39d41a79512987f338cf177> but was:<null>
  NeutronPortQosJAXBTest.test_PortQosEnabled_JAXB:42 NeutronPort JAXB Test 2: Testing tenant_id failed expected:<9bacb3c5d39d41a79512987f338cf177> but was:<null>
  NeutronPortSecurityJAXBTest.test_NeutronPortSecurityDefault_JAXB:77->test_PortSecurityEnabled_JAXB:87 NeutronPort JAXB Test 2: Testing tenant_id failed expected:<9bacb3c5d39d41a79512987f338cf177> but was:<null>
  NeutronPortSecurityJAXBTest.test_NeutronPortSecurityDisabled_JAXB:68->test_PortSecurityEnabled_JAXB:87 NeutronPort JAXB Test 2: Testing tenant_id failed expected:<9bacb3c5d39d41a79512987f338cf177> but was:<null>
  NeutronPortSecurityJAXBTest.test_NeutronPortSecurityEnabled_JAXB:63->test_PortSecurityEnabled_JAXB:87 NeutronPort JAXB Test 2: Testing tenant_id failed expected:<9bacb3c5d39d41a79512987f338cf177> but was:<null>
  NeutronQosJAXBTest.test_NeutronQosPolicy_JAXB:40 NeutronQosPolicy JAXB Test 2: Testing tenant_id failed expected:<aa902936679e4ea29bfe1158e3450a13> but was:<null>
  NeutronRouterJAXBTest.test_NeutronRouter_JAXB:45 NeutronFloatingIp JAXB Test 5: Testing tenant_id failed expected:<aa902936679e4ea29bfe1158e3450a13> but was:<null>
  NeutronSFCFlowClassifierJAXBTest.test_NeutronSFCFlowClassifier_JAXB:39 NeutronSFCFlowClassifier JAXB Test 2: Testing tenant_id failed expected:<4969c491a3c74ee4af974e6d800c62de> but was:<null>
  NeutronSFCPortChainJAXBTest.test_NeutronSFCPortChain_JAXB:38 NeutronSFCPortChain JAXB Test 2: Testing tenant_id failed expected:<4969c491a3c74ee4af974e6d800c62de> but was:<null>
  NeutronSFCPortPairGroupJAXBTest.test_NeutronSFCPortPairGroup_JAXB:32 NeutronSFCPortPairGroup JAXB Test 2: Testing tenant_id failed expected:<4969c491a3c74ee4af974e6d800c62de> but was:<null>
  NeutronSFCPortPairJAXBTest.test_NeutronSFCPortPair_JAXB:35 NeutronSFCPortPair JAXB Test 2: Testing tenant_id failed expected:<4969c491a3c74ee4af974e6d800c62de> but was:<null>
  NeutronSecurityGroupJAXBTest.test_NeutronSecurityGroup_JAXB:39 NeutronSecurityGroup JAXB Test 4: Testing port range min failed expected:<b4f50856753b4dc6afee5fa6b9b6c550> but was:<null>
  NeutronSecurityRuleJAXBTest.test_NeutronSecurityRule_JAXB:74 NeutronSecurityRule JAXB Test 10: Testing tenant id failed expected:<e4f50856753b4dc6afee5fa6b9b6c550> but was:<null>
  NeutronSubnetJAXBTest.test_NeutronSubnet_JAXB:47 NeutronSubnet JAXB Test 2: Testing tenant_id failed expected:<379ffe2b9cda498d9e17b319733ec889> but was:<null>
  NeutronTapFlowJAXBTest.test_NeutronTapFlow_JAXB:33 NeutronTapFlow JAXB Test 2: Testing tenant_id failed expected:<aa902936679e4ea29bfe1158e3450a13> but was:<null>
  NeutronTapServiceJAXBTest.test_NeutronTapService_JAXB:31 NeutronTapService JAXB Test 2: Testing tenant_id failed expected:<aa902936679e4ea29bfe1158e3450a13> but was:<null>
  NeutronTrunkJAXBTest.test_NeutronTrunk_JAXB:43 NeutronTrunk JAXB Test 5: Testing tenant_id failed expected:<cc3641789c8a4304abaa841c64f638d9> but was:<null>
  NeutronVpnIkePolicyJAXBTest.test_NeutronVpnIkePolicy_JAXB:33 NeutronVpnIkePolicy JAXB Test 2: Testing tenant id failed expected:<ccb81365fe36411a9011e90491fe1330> but was:<null>
  NeutronVpnIpSecPolicyJAXBTest.test_NeutronVpnIPSecPolicy_JAXB:34 NeutronVpnIpSecPolicy JAXB Test 2: Testing tenant id failed expected:<ccb81365fe36411a9011e90491fe1330> but was:<null>
  NeutronVpnIpSecSiteConnectionJAXBTest.test_NeutronVpnIPSecSiteConnection_JAXB:39 NeutronVpnIpSecSiteConnection JAXB Test 2: Testing tenant id failed expected:<ccb81365fe36411a9011e90491fe1330> but was:<null>
  NeutronVpnServiceJAXBTest.test_NeutronVPNService_JAXB:45 NeutronVpnService JAXB Test 6: Testing Tenant Id failed expected:<ccb81365fe36411a9011e90491fe1330> but was:<null>
Tests in error:
  NeutronFirewallJAXBTest.test_NeutronFirewallRule_JAXB:89 NullPointer
Tests run: 62, Failures: 33, Errors: 1, Skipped: 0 |
| Comments |
| Comment by Michael Vorburger [ 10/Apr/18 ] |
|
I've run NeutronNetworkJAXBTest about 20'000 times locally (using the org.opendaylight.infrautils.testutils.RunUntilFailureRule) but cannot reproduce this. How can this simple test fail only on Jenkins, and only every now and then? That is typically indicative of a concurrency or timing issue, but I don't see how that could apply here; this is REALLY weird, because the failing tests are really quite simple: just some trivial-looking JAXB JSON unmarshalling followed by asserts. So I've had a closer look, more because I'm intrigued by the mystery, although I guess in theory this could be a real problem at runtime in production as well.
The 33 failures (in NeutronNetworkJAXBTest and the other JAXB tests listed above) occur because a NeutronObject's getTenantID() is SOMETIMES null; however, the assert on getID(), which all the failing tests run just before, passes. What's so special about this tenantID? It has an "if isEmpty() return null" check in its getter... are there some known concurrency issues with JAXB where this could cause problems? I'm going to re-order the asserts in the tests to put tenantID last, and see if ALL other properties did get unmarshalled correctly, whenever this hits us next...
The NeutronFirewallJAXBTest.test_NeutronFirewallRule_JAXB:89 failure is an NPE where JaxbTestHelper.jaxbUnmarshall returns null, so that's a little bit different (the entire object, not just one property). JaxbTestHelper has nothing obviously wrong that I can see. Maybe close the reader? Cache the JAXBContext? |
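The getter behaviour described in this comment (return null when the stored value is empty) can be sketched in plain Java. The class `TenantScoped` and its fields are hypothetical stand-ins, not the actual NeutronObject code; the sketch only illustrates that anything reading the property through such a getter cannot tell an empty value from one that was never set.

```java
// Hypothetical stand-in for an object with a normalising tenant ID getter.
// This is NOT the real NeutronObject; names and fields are assumptions.
final class TenantScoped {
    private String id;
    private String tenantID;

    public String getID() {
        return id;
    }

    public void setID(String id) {
        this.id = id;
    }

    // Mirrors the "if isEmpty() return null" check mentioned in the comment:
    // an empty stored value is reported as null to every caller.
    public String getTenantID() {
        if (tenantID != null && tenantID.isEmpty()) {
            return null;
        }
        return tenantID;
    }

    public void setTenantID(String tenantID) {
        this.tenantID = tenantID;
    }

    public static void main(String[] args) {
        TenantScoped o = new TenantScoped();
        o.setID("example-id");
        o.setTenantID("");
        // The plain getter returns what was set; the normalising one does not.
        System.out.println("id=" + o.getID() + " tenant=" + o.getTenantID());
    }
}
```

If a binding framework ever consulted the getter while populating the object, the normalisation could interact with it in ways a plain getter would not; whether that is what happens here is not established by this issue.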
| Comment by Michael Vorburger [ 10/Apr/18 ] |
|
> re-order the asserts in the tests to put tenantID last, and see if ALL
E.g. in NeutronLoadBalancerHealthMonitorJAXBTest, NeutronLoadBalancerPoolMemberJAXBTest, NeutronSFCPortPairGroupJAXBTest, NeutronMeteringLabelJAXBTest and more, this coincidentally was already done like this, so all other properties are asserted on just fine there, and only getTenantID() then returns null; this supports the theory that there is some weird issue related specifically to the tenant_id getter... hm. https://git.opendaylight.org/gerrit/#/c/70706/ will eventually confirm this for good, but I would say there is a high likelihood that that custom getter is somehow occasionally causing havoc. |
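An alternative to re-ordering the asserts is to compare every property and report all mismatches at once, so that a single sporadic run shows whether tenant_id is the only property affected. The following is a minimal sketch under that assumption; `PropertyDiff` and the map-based representation are hypothetical and are not part of the Neutron test code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical helper: collects every mismatching property instead of
// stopping at the first failed assert.
final class PropertyDiff {

    // Returns one message per expected property whose actual value differs.
    // A property missing from 'actual' is treated as null.
    static List<String> mismatches(Map<String, Object> expected, Map<String, Object> actual) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Object> e : expected.entrySet()) {
            Object got = actual.get(e.getKey());
            if (!Objects.equals(e.getValue(), got)) {
                out.add(e.getKey() + ": expected:<" + e.getValue() + "> but was:<" + got + ">");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> expected = new LinkedHashMap<>();
        expected.put("id", "example-id");
        expected.put("tenant_id", "example-tenant");

        Map<String, Object> actual = new LinkedHashMap<>();
        actual.put("id", "example-id");
        actual.put("tenant_id", null); // simulates the sporadic null

        for (String line : mismatches(expected, actual)) {
            System.out.println(line);
        }
    }
}
```

A test would then fail only if the returned list is non-empty, and the failure message would list every affected property.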
| Comment by Michael Vorburger [ 10/Apr/18 ] |
|
On the off chance (probably unlikely, but you never know) that this is some wacky sporadic bug in the EclipseLink Moxy JAXB implementation we use for the JSON processing, let's try to bump it to the latest in |
| Comment by Michael Vorburger [ 17/Apr/18 ] |
|
This happened again today, but only on 1 of 5 of my Neutron changes that were built successfully today. But c/70717 with the Moxy version bump is not yet merged; I need that to go in before looking any further. |
| Comment by Michael Vorburger [ 05/Jul/18 ] |
|
This recently happened again on stable oxygen, twice, on autorelease-release-oxygen/343 and autorelease-release-oxygen/339. I had a closer look through all autorelease-release-fluorine builds for the last month, and it's a data point worth noting that it hasn't happened on master anymore. So while strictly speaking this is not conclusive proof of course, it supports the theory (or at least doesn't contradict it) that my earlier |
| Comment by Michael Vorburger [ 05/Jul/18 ] |
|
Closing this issue now, as it's not been seen on master Fluorine in a while (see above), and hoping that |
| Comment by Michael Vorburger [ 03/Sep/18 ] |
|
Seen again today on stable/oxygen on https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/autorelease-release-oxygen/408/ |
| Comment by Michael Vorburger [ 03/Sep/18 ] |
|
c/75508 does another bump of the EclipseLink Moxy JAXB impl, 2.7.1 → 2.7.3; if we are exceptionally lucky, that fixes some issue. But what we (someone) should really do here is try to run these occasionally failing tests under infrautils' RunUntilFailureRule and see if the failure can be reproduced running overnight, then debug. |
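The "run until it fails" idea can be sketched without any test framework. This is a plain-Java illustration only, not infrautils' RunUntilFailureRule (whose API is not shown in this issue); the class `RepeatUntilFailure` and its method are hypothetical.

```java
import java.util.concurrent.Callable;

// Hypothetical helper: repeats a check until it fails or a run limit is hit.
final class RepeatUntilFailure {

    // Runs 'check' up to maxRuns times. Returns the 1-based number of the
    // first failing run, or -1 if every run passed. A thrown exception or
    // AssertionError (e.g. an NPE from a null unmarshalling result) counts
    // as a failure.
    static int firstFailingRun(Callable<Boolean> check, int maxRuns) {
        for (int run = 1; run <= maxRuns; run++) {
            boolean ok;
            try {
                ok = check.call();
            } catch (Exception | AssertionError e) {
                ok = false;
            }
            if (!ok) {
                return run;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Stand-in check that starts failing on its third invocation.
        int run = firstFailingRun(() -> ++calls[0] < 3, 1000);
        System.out.println("first failing run: " + run);
    }
}
```

Since the failure has so far only shown up on Jenkins, such a loop would presumably need to run there (or in a comparably loaded environment) rather than on a developer machine.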
| Comment by Michael Vorburger [ 10/Sep/18 ] |
|
Seen again today on neon on https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/autorelease-release-neon/34/ and I just spent 10 minutes having another closer look, e.g. at NeutronTrunkJAXBTest: that getTenantID() really cannot be null; I'm very puzzled. |
| Comment by Michael Vorburger [ 10/Sep/18 ] |
|
Looked again more into this PITA, I still don't really understand how it could happen, but I have just pushed three changes which may help with this tenant_id empty/null business (which, historically, is due to https://bugs.opendaylight.org/show_bug.cgi?id=4775 and its old https://git.opendaylight.org/gerrit/#/c/31324 and https://git.opendaylight.org/gerrit/#/c/31361/) ... let's try to:
If we don't see it on master for say 2 weeks, then cherry-pick to stable/fluorine and stable/oxygen. |
| Comment by Michael Vorburger [ 02/Oct/18 ] |
|
> If we don't see it on master for say 2 weeks, then cherry-pick to stable/fluorine and stable/oxygen.
Done today. |