[DAEXIM-11] CSIT blocker: daexim-csit-3node-clustering-basic-only-fluorine Created: 24/Oct/18 Updated: 25/Oct/18 Resolved: 24/Oct/18 |
|
| Status: | Resolved |
| Project: | daexim |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Highest |
| Reporter: | Ariel Adam | Assignee: | Shaleen Saxena |
| Resolution: | Won't Do | Votes: | 0 |
| Labels: | csit:stable:blocker | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Description |
|
Looking at the daily stable CI jobs (https://jenkins.opendaylight.org/releng/view/csit-stable-f/), we observed that the daexim-csit-3node-clustering-basic-only-fluorine job had a number of failures.
|
| Comments |
| Comment by Shaleen Saxena [ 24/Oct/18 ] |
|
This looks like an intermittent failure in cluster bring-up. |
| Comment by Ariel Adam [ 24/Oct/18 ] |
|
If this is a random failure, we would appreciate it if you could remove the failing tests or stabilize them. We need to reach a point where this subset always passes, so that it can serve as the basis for a gate. |
| Comment by Richard Kosegi [ 24/Oct/18 ] |
|
The problem is definitely outside of daexim. The keyword "ClusterManagement.Start_Members_From_List_Or_All" fails to reach the last node in the cluster. There is nothing we can do about it other than disable the failing test (but that would only hide the real issue).

The log shows:

2018-10-23T03:06:37,040 | ERROR | opendaylight-cluster-data-akka.actor.default-dispatcher-40 | ActorSystemImpl | 43 - com.typesafe.akka.slf4j - 2.5.11 | Uncaught error from thread [opendaylight-cluster-data-akka.actor.default-dispatcher-32]: org/w3c/dom/ElementTraversal, shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[opendaylight-cluster-data]

This is similar to https://jira.opendaylight.org/browse/NETCONF-576. In any case, we can't fix that in daexim.
|
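For triage, the fatal Akka error quoted in the comment above can be picked out of a Karaf log mechanically. The following is only a minimal sketch, assuming the pipe-separated layout of the quoted line (timestamp, level, thread, logger, bundle, message). The names `FatalAkkaError` and `find_fatal_akka_errors` are hypothetical and are not part of daexim or the CSIT suites. It only detects the symptom; it does not address the underlying cluster problem.

```python
import re
from typing import Iterable, Iterator, NamedTuple


class FatalAkkaError(NamedTuple):
    """Hypothetical record for one fatal Akka error found in a log."""
    timestamp: str
    thread: str
    cause: str


# Matches the message text seen in the quoted log line.
_FATAL = re.compile(
    r"Uncaught error from thread \[(?P<thread>[^\]]+)\]: "
    r"(?P<cause>.+?), shutting down JVM"
)


def find_fatal_akka_errors(lines: Iterable[str]) -> Iterator[FatalAkkaError]:
    """Yield fatal Akka errors from pipe-separated log lines (assumed layout)."""
    for line in lines:
        # Assumed layout: timestamp | level | thread | logger | bundle | message
        fields = [field.strip() for field in line.split("|", 5)]
        if len(fields) != 6 or fields[1] != "ERROR":
            continue
        match = _FATAL.search(fields[5])
        if match:
            yield FatalAkkaError(fields[0], match["thread"], match["cause"])


# The log line quoted in this issue, used as sample input.
SAMPLE = (
    "2018-10-23T03:06:37,040 | ERROR | "
    "opendaylight-cluster-data-akka.actor.default-dispatcher-40 | "
    "ActorSystemImpl | 43 - com.typesafe.akka.slf4j - 2.5.11 | "
    "Uncaught error from thread "
    "[opendaylight-cluster-data-akka.actor.default-dispatcher-32]: "
    "org/w3c/dom/ElementTraversal, shutting down JVM since "
    "'akka.jvm-exit-on-fatal-error' is enabled for "
    "ActorSystem[opendaylight-cluster-data]"
)

if __name__ == "__main__":
    for error in find_fatal_akka_errors([SAMPLE]):
        print(error.timestamp, error.thread, error.cause)
```

Running such a scan over the logs of each cluster member would show which node shut its JVM down and with which cause, which may help confirm whether a failed run matches the NETCONF-576 pattern.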
| Comment by Richard Kosegi [ 24/Oct/18 ] |
|
There is nothing to be done in daexim, and the problem is already tracked in |
| Comment by Ariel Adam [ 25/Oct/18 ] |
|
I appreciate the detailed answer. I will raise it today on the TSC call to decide how to proceed. The fact that it's the same problem as in netconf is a very important point. |