Details
-
Bug
-
Status: Resolved
-
Resolution: Cannot Reproduce
-
Lithium
-
None
-
None
-
Operating System: All
Platform: All
-
4572
-
Normal
Description
Clustering Issue : OF ports (6653,6633) not coming up in couple of nodes after all required OF bundle installed in 5 node cluster ( seen very very frequently 9 outof 10 times in SR2. Please refer below steps.
1.
Using SR2 build, Starting a 5 node cluster ( c1, c2, c3, c4 & c5) using deploy script as below.
./deploy_odl.py --clean --distribution=/home/mininet/controller-Li/distribution-karaf-0.3.2-Lithium-SR2.zip --rootdir=/home/mininet/controller-Li --hosts=10.183.181.41,10.183.181.42,10.183.181.43,10.183.181.44,10.183.181.45 --user=root --password=rootroot --template=/multi-node-test
2.
Installed following bundles in all nodes one after another, after ensuring (odl-cluster-data & odl-cluster-rpc Ports) 2550 & 2551 are up.
root@mininet-vm:~# netstat -na | grep 2550
tcp6 0 0 10.183.181.41:2550 :::* LISTEN
tcp6 0 0 10.183.181.41:2550 10.183.181.45:59936 ESTABLISHED
tcp6 0 0 10.183.181.41:38763 10.183.181.42:2550 ESTABLISHED
tcp6 0 0 10.183.181.41:2550 10.183.181.43:60935 ESTABLISHED
tcp6 0 0 10.183.181.41:32937 10.183.181.43:2550 ESTABLISHED
tcp6 0 0 10.183.181.41:47800 10.183.181.45:2550 ESTABLISHED
tcp6 0 0 10.183.181.41:54118 10.183.181.44:2550 ESTABLISHED
tcp6 0 0 10.183.181.41:2550 10.183.181.44:34235 ESTABLISHED
tcp6 0 0 10.183.181.41:2550 10.183.181.42:51515 ESTABLISHED
root@mininet-vm:~# netstat -na | grep 2551
tcp6 0 0 10.183.181.41:2551 :::* LISTEN
tcp6 0 0 10.183.181.41:2551 10.183.181.42:33333 ESTABLISHED
tcp6 0 0 10.183.181.41:2551 10.183.181.44:36155 ESTABLISHED
tcp6 0 0 10.183.181.41:2551 10.183.181.43:49365 ESTABLISHED
tcp6 0 0 10.183.181.41:35548 10.183.181.42:2551 ESTABLISHED
tcp6 0 0 10.183.181.41:58923 10.183.181.44:2551 ESTABLISHED
tcp6 0 0 10.183.181.41:39818 10.183.181.45:2551 ESTABLISHED
tcp6 0 373 10.183.181.41:42423 10.183.181.43:2551 ESTABLISHED
tcp6 0 0 10.183.181.41:2551 10.183.181.45:40387 ESTABLISHED
feature:install odl-clustering-test-app
feature:install odl-restconf-all
feature:install odl-mdsal-all
feature:install odl-openflowplugin-all-li
3. OF ports 6633, 6653 came up in all 4 nodes (c2, c3, c4, c5) excpet c1.
c1:
===
root@mininet-vm:~# netstat -na | grep 665*
tcp 0 0 0.0.0.0:6644 0.0.0.0:* LISTEN
unix 3 [ ] STREAM CONNECTED 9662
unix 3 [ ] STREAM CONNECTED 1866
unix 3 [ ] STREAM CONNECTED 8566 @/com/ubuntu/upstart
c2, c3, c4, c5:
===============
root@mininet-vm:~# netstat -na | grep 665*
tcp 0 0 0.0.0.0:6644 0.0.0.0:* LISTEN
tcp6 0 0 :::6633 :::* LISTEN
tcp6 0 0 :::6653 :::* LISTEN
4. checked suspected ERROR logs
2015-11-02 22:13:17,361 | ERROR | config-pusher | ConfigTransactionControllerImpl | 125 - org.opendaylight.controller.config-manager - 0.3.2.Lithium-SR2 | Commit failed on ModuleIdentifier
{factoryName='openflow-plugin-provider-impl', instanceName='openflow-plugin-provider-impl'} in transaction TransactionIdentifier{name='ConfigTransaction-86-88'}java.lang.IllegalStateException: java.util.concurrent.ExecutionException: TransactionCommitFailedException{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-inventory-operational currently has no leader. Try again later.]]}
at org.opendaylight.openflowplugin.impl.device.DeviceManagerImpl.<init>(DeviceManagerImpl.java:155)
at org.opendaylight.openflowplugin.impl.OpenFlowPluginProviderImpl.initialize(OpenFlowPluginProviderImpl.java:169)
at org.opendaylight.yang.gen.v1.urn.opendaylight.params.xml.ns.yang.config.openflow.plugin.impl.rev150327.OpenFlowProviderModule.createInstance(OpenFlowProviderModule.java:40)
at org.opendaylight.controller.config.spi.AbstractModule.getInstance(AbstractModule.java:73)
at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)[:1.7.0_65]
at java.lang.reflect.Method.invoke(Method.java:606)[:1.7.0_65]
at org.opendaylight.controller.config.manager.impl.dependencyresolver.DependencyResolverManager$ModuleInvocationHandler.handleInvocation(DependencyResolverManager.java:150)
5. Giving grep output of all errors in logfile. Also attaching karaf log files.
root@mininet-vm:/home/mininet/controller-Li/deploy/current/odl/data/log# grep ERROR karaf.log.1
2015-11-02 22:05:11,708 | ERROR | Event Dispatcher | FeatureDeploymentListener | 21 - org.apache.karaf.deployer.features - 3.0.3 | Unable to install features
2015-11-02 22:10:27,389 | ERROR | config-pusher | DistributedDataStore | 191 - org.opendaylight.controller.sal-distributed-datastore - 1.2.2.Lithium-SR2 | Shared leaders failed to settle in 90 seconds, giving up
2015-11-02 22:11:57,500 | ERROR | config-pusher | DistributedDataStore | 191 - org.opendaylight.controller.sal-distributed-datastore - 1.2.2.Lithium-SR2 | Shared leaders failed to settle in 90 seconds, giving up
2015-11-02 22:12:35,456 | ERROR | CommitFutures-1 | NetconfDeviceDatastoreAdapter | 243 - org.opendaylight.controller.sal-netconf-connector - 1.2.2.Lithium-SR2 | RemoteDevice{controller-config}: TransactionChain(org.opendaylight.controller.md.sal.binding.impl.BindingDOMTransactionChainAdapter@adef923) DOM-CHAIN-0-0 FAILED!
TransactionCommitFailedException{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-inventory-operational currently has no leader. Try again later.]]}
2015-11-02 22:12:35,253 | ERROR | CommitFutures-0 | NetconfDeviceDatastoreAdapter | 243 - org.opendaylight.controller.sal-netconf-connector - 1.2.2.Lithium-SR2 | RemoteDevice{controller-config}: Transaction(init) DOM-CHAIN-0-0 FAILED!
TransactionCommitFailedException{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-inventory-operational currently has no leader. Try again later.]]}
2015-11-02 22:12:36,644 | ERROR | CommitFutures-4 | NetconfDeviceTopologyAdapter | 243 - org.opendaylight.controller.sal-netconf-connector - 1.2.2.Lithium-SR2 | RemoteDevice{controller-config}: Transaction(init) DOM-CHAIN-1-0 FAILED!
TransactionCommitFailedException{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-topology-operational currently has no leader. Try again later.]]}
2015-11-02 22:12:36,449 | ERROR | CommitFutures-3 | NetconfDeviceTopologyAdapter | 243 - org.opendaylight.controller.sal-netconf-connector - 1.2.2.Lithium-SR2 | RemoteDevice{controller-config}: TransactionChain(org.opendaylight.controller.md.sal.binding.impl.BindingDOMTransactionChainAdapter@dd96fd0) DOM-CHAIN-1-0 FAILED!
TransactionCommitFailedException{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-topology-operational currently has no leader. Try again later.]]}
java.util.concurrent.ExecutionException: TransactionCommitFailedException{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-topology-operational currently has no leader. Try again later.]]}
Caused by: TransactionCommitFailedException{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-topology-operational currently has no leader. Try again later.]]}
2015-11-02 22:13:02,207 | ERROR | sing-executor-11 | ExecutionList | 72 - com.google.guava - 18.0.0 | RuntimeException while executing runnable com.google.common.util.concurrent.Futures$6@3d203dd5 with executor INSTANCE
2015-11-02 22:13:17,353 | ERROR | config-pusher | DeviceManagerImpl | 280 - org.opendaylight.openflowplugin.impl - 0.1.2.Lithium-SR2 | Creation of node failed.
java.util.concurrent.ExecutionException: TransactionCommitFailedException{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-inventory-operational currently has no leader. Try again later.]]}
Caused by: TransactionCommitFailedException{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-inventory-operational currently has no leader. Try again later.]]}
2015-11-02 22:13:17,361 | ERROR | config-pusher | ConfigTransactionControllerImpl | 125 - org.opendaylight.controller.config-manager - 0.3.2.Lithium-SR2 | Commit failed on ModuleIdentifier{factoryName='openflow-plugin-provider-impl', instanceName='openflow-plugin-provider-impl'}
in transaction TransactionIdentifier
{name='ConfigTransaction-86-88'}java.lang.IllegalStateException: java.util.concurrent.ExecutionException: TransactionCommitFailedException
{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-inventory-operational currently has no leader. Try again later.]]}Caused by: java.util.concurrent.ExecutionException: TransactionCommitFailedException
{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-inventory-operational currently has no leader. Try again later.]]}Caused by: TransactionCommitFailedException
{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-inventory-operational currently has no leader. Try again later.]]}2015-11-02 22:13:17,464 | ERROR | config-pusher | ConfigRegistryImpl | 125 - org.opendaylight.controller.config-manager - 0.3.2.Lithium-SR2 | Configuration Transaction failed on 2PC, server is unhealthy
Caused by: java.lang.IllegalStateException: java.util.concurrent.ExecutionException: TransactionCommitFailedException
Caused by: java.util.concurrent.ExecutionException: TransactionCommitFailedException
{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-inventory-operational currently has no leader. Try again later.]]}Caused by: TransactionCommitFailedException
{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-inventory-operational currently has no leader. Try again later.]]}root@mininet-vm:/home/mininet/controller-Li/deploy/current/odl/data/log# grep ERROR karaf.log