[CONTROLLER-1442] Clustering: Inconsistent behaviour in basic flow installation for SR2 most of times fails at operational datastore Created: 03/Nov/15  Updated: 28/Jan/16  Resolved: 28/Jan/16

Status: Resolved
Project: controller
Component/s: clustering
Affects Version/s: Lithium
Fix Version/s: None

Type: Bug
Reporter: Sanjib Mohapatra Assignee: Moiz Raja
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Attachments: File ODL TR.rar    
External issue ID: 4574

 Description   

Clustering: Inconsistent behaviour in basic flow installation for SR2. Most of the time (8 out of 10 attempts), the flows fail to appear in the operational datastore.

1.
Using the SR2 build, I started a 5-node cluster with the IPs below. The deploy_odl.py script was used to start it.
[
controller-1 = 10.183.181.41
controller-2 = 10.183.181.42
controller-3 = 10.183.181.43
controller-4 = 10.183.181.44
controller-5 = 10.183.181.45
]

cd /home/mininet/integration/test/tools/clustering/cluster-deployer/

./deploy_odl.py --clean --distribution=/home/mininet/controller-Li/distribution-karaf-0.3.2-Lithium-SR2.zip --rootdir=/home/mininet/controller-Li --hosts=10.183.181.41,10.183.181.42,10.183.181.43,10.183.181.44,10.183.181.45 --user=root --password=rootroot --template=/multi-node-test

2. I observed that ports 2550 and 2551 came up properly on all the cluster nodes.

root@mininet-vm:~# netstat -na | grep 2550
tcp6 0 0 10.183.181.41:2550 :::* LISTEN
tcp6 0 0 10.183.181.41:48109 10.183.181.44:2550 ESTABLISHED
tcp6 0 0 10.183.181.41:42152 10.183.181.45:2550 ESTABLISHED
tcp6 0 0 10.183.181.41:2550 10.183.181.44:44754 ESTABLISHED
tcp6 0 0 10.183.181.41:34346 10.183.181.43:2550 ESTABLISHED
tcp6 0 0 10.183.181.41:36484 10.183.181.42:2550 ESTABLISHED
tcp6 0 0 10.183.181.41:2550 10.183.181.43:51115 ESTABLISHED
tcp6 0 0 10.183.181.41:2550 10.183.181.45:45284 ESTABLISHED
tcp6 0 0 10.183.181.41:2550 10.183.181.42:50675 ESTABLISHED
root@mininet-vm:~# netstat -na | grep 2551
tcp6 0 0 10.183.181.41:2551 :::* LISTEN
tcp6 0 0 10.183.181.41:2551 10.183.181.45:33739 ESTABLISHED
tcp6 0 0 10.183.181.41:48196 10.183.181.43:2551 ESTABLISHED
tcp6 0 0 10.183.181.41:2551 10.183.181.42:55729 ESTABLISHED
tcp6 0 0 10.183.181.41:53302 10.183.181.45:2551 ESTABLISHED
tcp6 0 0 10.183.181.41:56779 10.183.181.42:2551 ESTABLISHED
tcp6 0 0 10.183.181.41:2551 10.183.181.43:36887 ESTABLISHED
tcp6 0 0 10.183.181.41:2551 10.183.181.44:45111 ESTABLISHED
tcp6 0 0 10.183.181.41:53763 10.183.181.44:2551 ESTABLISHED
root@mininet-vm:~#
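The netstat check above has to be repeated by hand on every node. A minimal Python sketch of the same check run from one machine is shown here; it only tests that a TCP connection to each port succeeds. The names `CLUSTER_PORTS`, `port_is_open` and `closed_ports` are hypothetical helpers introduced for illustration and are not part of deploy_odl.py or any ODL tool.

```python
import socket

# Cluster node IPs as listed in step 1 of this report.
HOSTS = [
    "10.183.181.41",
    "10.183.181.42",
    "10.183.181.43",
    "10.183.181.44",
    "10.183.181.45",
]

# The two ports checked with netstat in this report.
CLUSTER_PORTS = (2550, 2551)


def port_is_open(host, port, timeout=2.0, connect=socket.create_connection):
    """Return True if a TCP connection to (host, port) can be opened.

    `connect` is injectable so the function can be exercised without a network.
    """
    try:
        conn = connect((host, port), timeout)
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
    conn.close()
    return True


def closed_ports(hosts, ports, timeout=2.0, connect=socket.create_connection):
    """Return the (host, port) pairs that did not accept a connection."""
    return [
        (host, port)
        for host in hosts
        for port in ports
        if not port_is_open(host, port, timeout, connect)
    ]
```

Calling `closed_ports(HOSTS, CLUSTER_PORTS)` from a machine that can reach the cluster would return an empty list when every node is listening on both ports. A successful connection only shows that the port is open, not that the cluster members have joined each other.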

3. Installed the following features on all cluster nodes.

feature:install odl-clustering-test-app
feature:install odl-restconf-all
feature:install odl-mdsal-all
feature:install odl-openflowplugin-all-li

OF ports come up on 3 nodes (controller-3, controller-4 and controller-5) but not on controller-1 and controller-2.

In SR2, most of the time the OF ports do not come up on some nodes in the cluster due to ODL CONTROLLER-1441. To work around it, I have to restart 2 nodes, so I restarted controller-1 and controller-2.

4. Once the system is stable (meaning all necessary ports are up, e.g. OF ports, RPC port, config data ports, etc.):

From jconsole I observed that controller-4 is the leader. I connected the mininet switch to the follower controller-2 and pushed flows from controller-1. I observed that all flows are configured in the switch.

In the config datastore the flows are replicated on all 5 nodes; however, in the operational datastore the flows are not detected. I see this problematic behaviour 8 out of 10 times. The flows are rarely installed in the OVS switch.

Please find the attached logs of all the cluster nodes.

root@mininet-vm:/home/mininet/integration/test/tools/odl-mdsal-clustering-tests/clustering-performance-test# ./inventory_crawler.py --auth --host 10.183.181.42 --datastore config
Crawling 'http://10.183.181.42:8181/restconf/config/opendaylight-inventory:nodes'

Totals:
Nodes: 1
Reported flows: 0
Found flows: 10
root@mininet-vm:/home/mininet/integration/test/tools/odl-mdsal-clustering-tests/clustering-performance-test# ./inventory_crawler.py --auth --host 10.183.181.42 --datastore operational
Crawling 'http://10.183.181.42:8181/restconf/operational/opendaylight-inventory:nodes'

Totals:
Nodes: 1
Reported flows: 9
Found flows: 0
root@mininet-vm:/home/mininet/integration/test/tools/odl-mdsal-clustering-tests/clustering-performance-test# ./inventory_crawler.py --auth --host 10.183.181.44 --datastore config
Crawling 'http://10.183.181.44:8181/restconf/config/opendaylight-inventory:nodes'

Totals:
Nodes: 1
Reported flows: 0
Found flows: 10
root@mininet-vm:/home/mininet/integration/test/tools/odl-mdsal-clustering-tests/clustering-performance-test# ./inventory_crawler.py --auth --host 10.183.181.45 --datastore config
Crawling 'http://10.183.181.45:8181/restconf/config/opendaylight-inventory:nodes'

Totals:
Nodes: 1
Reported flows: 0
Found flows: 10
root@mininet-vm:/home/mininet/integration/test/tools/odl-mdsal-clustering-tests/clustering-performance-test# ./inventory_crawler.py --auth --host 10.183.181.41 --datastore operational
Crawling 'http://10.183.181.41:8181/restconf/operational/opendaylight-inventory:nodes'

Totals:
Nodes: 1
Reported flows: 9
Found flows: 0
root@mininet-vm:/home/mininet/integration/test/tools/odl-mdsal-clustering-tests/clustering-performance-test# cd /home/mininet/controller-Li/deploy/current/odl/data/log
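The config/operational mismatch in the crawler output above can be checked mechanically. The following is a minimal Python sketch that parses the "Totals" section printed by inventory_crawler.py and reports how many flows found in the config datastore are missing from the operational datastore. The function names and the sample strings are hypothetical and introduced for illustration; the samples copy the totals reported for 10.183.181.42 in this report.

```python
import re

# Matches the counter lines printed under "Totals:" by inventory_crawler.py,
# in the format shown in this report.
TOTALS_PATTERN = re.compile(
    r"^(Nodes|Reported flows|Found flows):[ \t]*(\d+)[ \t]*$", re.MULTILINE
)


def parse_totals(crawler_output):
    """Extract the counters from one crawler run as a dict of ints."""
    return {name: int(value) for name, value in TOTALS_PATTERN.findall(crawler_output)}


def flows_missing_in_operational(config_output, operational_output):
    """Return config 'Found flows' minus operational 'Found flows'."""
    config = parse_totals(config_output)
    operational = parse_totals(operational_output)
    return config.get("Found flows", 0) - operational.get("Found flows", 0)


# Totals copied from the runs against 10.183.181.42 above.
CONFIG_SAMPLE = "Totals:\nNodes: 1\nReported flows: 0\nFound flows: 10\n"
OPERATIONAL_SAMPLE = "Totals:\nNodes: 1\nReported flows: 9\nFound flows: 0\n"

missing = flows_missing_in_operational(CONFIG_SAMPLE, OPERATIONAL_SAMPLE)
print(f"Flows in config but not in operational: {missing}")
```

With the totals from this report, the sketch reports 10 missing flows, matching the observation that the flows reach the config datastore but are not found in the operational one.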



 Comments   
Comment by Sanjib Mohapatra [ 03/Nov/15 ]

Attachment ODL TR.rar has been added with description: All 5 Cluster nodes karaf log

Comment by Ryan Goulding [ 19/Jan/16 ]

Do we know if this occurs in Beryllium?

Comment by Ryan Goulding [ 26/Jan/16 ]

This pertains to a 5-node cluster, which is not tested upstream. Is this failing in a 3-node cluster? Can you report back if this occurs in Beryllium w/ 3 nodes?

Comment by Anil Vishnoi [ 26/Jan/16 ]

Which plugin are you using for testing: the old He plugin (current default) or the new Li (alternate) plugin? The current default plugin doesn't support clustering in Lithium; only the alternate Li plugin supports it in Lithium.

Comment by Sanjib Mohapatra [ 28/Jan/16 ]

I have tested using the Beryllium build (both Lithium and Helium plugins). The issue is no longer seen in a 3-node cluster.

Generated at Wed Feb 07 19:55:33 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.