[CONTROLLER-1070] Clustering: Robot integration tests failing Created: 15/Dec/14 Updated: 24/Jan/15 Resolved: 24/Jan/15 |
|
| Status: | Resolved |
| Project: | controller |
| Component/s: | mdsal |
| Affects Version/s: | Helium |
| Fix Version/s: | None |
| Type: | Bug | ||
| Reporter: | Tom Pantelis | Assignee: | Tom Pantelis |
| Resolution: | Won't Do | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: | Operating System: All |
| External issue ID: | 2516 |
| Priority: | Normal |
| Description |
|
The clustering integration tests have been failing for a while now. The "010 Credential Authentication" AAA test always fails - I assume this is an issue with the test setup. Of more concern are the sporadic failures, specifically "Inventory Scalability OF10". An example can be seen at The "Get Stats for a node" test case looks for "flow-capable-node-connector-statistics" in the REST output. This query is repeated for 2 minutes, waiting for it to succeed.

Looking at the karaf.log, it appears the following OptimisticLockFailedException corresponds to the test timeout failure:

2014-12-12 15:30:56,993 | WARN | lt-dispatcher-30 | InMemoryDOMDataStore | 145 - org.opendaylight.controller.sal-inmemory-datastore - 1.2.0.SNAPSHOT | Store Tx: member-1-shard-inventory-operational-441 Conflicting modification for /(urn:opendaylight:inventory?revision=2013-08-19)nodes/node/node[ {(urn:opendaylight:inventory?revision=2013-08-19)id=openflow:2}].
at com.google.common.util.concurrent.Futures$ImmediateFailedFuture.get(Futures.java:183)[52:com.google.guava:14.0.1]
...
2014-12-12 15:30:57,003 | WARN | ds-oper-thread-0 | StatisticsManagerImpl | 153 - org.opendaylight.controller.md.statistics-manager - 1.2.0.SNAPSHOT | Unhandled exception during processing statistics. Restarting transaction chain.

Note the "Node was deleted by other transaction." error message. This appears to indicate that some code (StatisticsManager?) is supposed to put stats under an OF node (id=openflow:2) but some parent node doesn't exist (probably the "openflow:2" Node itself), either because:

It doesn't appear to be #3 because I don't see any previous commit failures in the log.

There are also sporadic failures in "Compatible.AD SAL NSF OF10" - https://jenkins.opendaylight.org/integration/job/integration-master-csit-cluster-min/205/robot/report/log.html - where "FlowProgrammer.Check flow in flow stats" and "StatisticsManager.get port stats" fail. These same tests have not been failing without clustering. |
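For readers unfamiliar with how the "Get Stats for a node" check behaves, here is a minimal sketch of the poll-until-timeout pattern described above, written in Python. It is not the actual Robot keyword used by the test suite. The function name `wait_until_contains`, its parameters, and the 2-second polling interval are assumptions made for illustration; only the 2-minute timeout and the searched-for string come from the description.

```python
import time
from typing import Callable


def wait_until_contains(
    fetch: Callable[[], str],
    needle: str,
    timeout_s: float = 120.0,   # the description says the query is repeated for 2 minutes
    interval_s: float = 2.0,    # assumed polling interval, not taken from the test suite
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call fetch() repeatedly until its output contains needle or the timeout expires.

    fetch is a hypothetical stand-in for the REST query issued by the test.
    Returns True if needle was seen, False if the deadline passed first.
    """
    deadline = clock() + timeout_s
    while True:
        if needle in fetch():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

With a check like this, a stats write that is rejected and only retried after the transaction chain restarts can leave the expected string absent for the whole polling window, which would show up as the timeout failure described above.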
| Comments |
| Comment by Tom Pantelis [ 05/Jan/15 ] |
|
The OptimisticLockFailedExceptions occur w/o clustering as well and aren't related to the test failures. Anil identified an issue in the StatisticsManager that is fixed by https://bugs.opendaylight.org/show_bug.cgi?id=2551. This appears to be the cause of the intermittent test failures. The issue is not actually related to clustering, although the failures only seemed to occur with clustering enabled; that may be coincidence, or clustering may change the timing enough to make the failures more likely. I'll leave this open for a bit to verify the integration tests succeed. |
| Comment by Tom Pantelis [ 24/Jan/15 ] |
|
There are still intermittent test failures, but I've also seen similar failures in the integration-master-csit-compatible-min tests w/o clustering. Closing this bug... |