[NETVIRT-1572] Connectivity to VMs lost after ODLs are brought up and down in a particular sequence. Created: 13/Mar/19 Updated: 14/Jan/20 |
|
| Status: | Open |
| Project: | netvirt |
| Component/s: | None |
| Affects Version/s: | Magnesium |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Medium |
| Reporter: | Jaya Priyadarshini | Assignee: | Srinivas Rachakonda |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | csit:3node | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
Steps to reproduce: 13) Take down ODL2 and ODL3
Logs ================================================== https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/builder-copy-sandbox-logs/644
|
| Comments |
| Comment by Srinivas Rachakonda [ 18/Dec/19 ] |
|
Test is still failing.
|
| Comment by Nishchya Gupta [ 14/Jan/20 ] |
|
Hi Srini,
The steps in this test case involve bringing nodes/shards up and down multiple times; below is the sequence of node/shard restarts in this suite.
- Bring down the default shard leader
- Bring up the leader for the default shard
- Down, then up, odl1
- Down, then up, odl2
- Down, then up, odl3
- Down odl1 and odl2
- Up odl1 and odl2
- Down odl2 and odl3
- Up odl2 and odl3
But in none of the above cases do we verify, after bringing the nodes/shards back up, what the state of the shards is and which node owns each shard.
However, in the logs I can see the lines below multiple times. As far as I know, this indicates that multiple shard owners may be present at the same time. It even looks as if the nodes are in different clusters, i.e. two clusters exist at the same time, which results in inconsistency, and we observe multiple failures after that.

2019-12-05T04:47:46,895 | INFO | opendaylight-cluster-data-akka.actor.default-dispatcher-19 | Cluster(akka://opendaylight-cluster-data) | 47 - com.typesafe.akka.slf4j - 2.5.25 | Cluster Node [akka.tcp://opendaylight-cluster-data@10.30.170.59:2550] - Node [akka.tcp://opendaylight-cluster-data@10.30.170.59:2550] is JOINING itself (with roles [member-1, dc-default]) and forming new cluster
2019-12-05T04:47:46,898 | INFO | opendaylight-cluster-data-akka.actor.default-dispatcher-19 | Cluster(akka://opendaylight-cluster-data) | 47 - com.typesafe.akka.slf4j - 2.5.25 | Cluster Node [akka.tcp://opendaylight-cluster-data@10.30.170.59:2550] - is the new leader among reachable nodes (more leaders may exist)

It would be better to change our script to record the shard owner details after every restart, so we know on which restart it actually fails. It would also be good to add someone from the clustering/Akka team to look into this, since we do not have expertise on the clustering side.
Regards, Nishchya |
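The suggested check, recording shard ownership after every restart and flagging split-brain, could be sketched as follows. This is a minimal illustration, not the actual CSIT tooling: the member names, and the response shape (a dict with a "value" mapping carrying "RaftState" and "Leader", similar to what ODL's Jolokia shard MBeans return) are assumptions for the example.

```python
# Sketch (assumption): after each restart step, collect every member's view of
# the default shard and flag the case where more than one member claims to be
# the Raft Leader, i.e. a possible split-brain.

def find_leaders(shard_views):
    """shard_views: {member_name: parsed_shard_status_dict}.

    Returns the list of members whose view of the shard reports
    RaftState == "Leader". More than one entry suggests split-brain.
    """
    return [
        member
        for member, view in shard_views.items()
        if view.get("value", {}).get("RaftState") == "Leader"
    ]

# Fabricated example views, mimicking the failure described above, where two
# nodes each formed their own cluster and both claim shard leadership:
views = {
    "member-1": {"value": {"RaftState": "Leader",
                           "Leader": "member-1-shard-default-operational"}},
    "member-2": {"value": {"RaftState": "Leader",
                           "Leader": "member-2-shard-default-operational"}},
    "member-3": {"value": {"RaftState": "Follower",
                           "Leader": "member-2-shard-default-operational"}},
}

leaders = find_leaders(views)
if len(leaders) > 1:
    print("Possible split-brain, multiple leaders:", leaders)
```

Running such a check after each of the nine restart steps listed above would pinpoint which restart first leaves the cluster with conflicting leaders.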