Uploaded image for project: 'controller'
  1. controller
  2. CONTROLLER-1441

Clustering : OF ports (6653,6633) not coming up in couple of nodes after all required OF bundle installated in 5 node cluster ( seen very frequently 9 of 10 times in SR2)

    XMLWordPrintable

Details

    • Bug
    • Status: Resolved
    • Resolution: Cannot Reproduce
    • Lithium
    • None
    • clustering
    • None
    • Operating System: All
      Platform: All

    • 4572
    • Normal

    Description

      Clustering Issue : OF ports (6653,6633) not coming up in couple of nodes after all required OF bundle installed in 5 node cluster ( seen very very frequently 9 outof 10 times in SR2. Please refer below steps.

      1.
      Using SR2 build, Starting a 5 node cluster ( c1, c2, c3, c4 & c5) using deploy script as below.

      ./deploy_odl.py --clean --distribution=/home/mininet/controller-Li/distribution-karaf-0.3.2-Lithium-SR2.zip --rootdir=/home/mininet/controller-Li --hosts=10.183.181.41,10.183.181.42,10.183.181.43,10.183.181.44,10.183.181.45 --user=root --password=rootroot --template=/multi-node-test

      2.
      Installed following bundles in all nodes one after another, after ensuring (odl-cluster-data & odl-cluster-rpc Ports) 2550 & 2551 are up.

      root@mininet-vm:~# netstat -na | grep 2550
      tcp6 0 0 10.183.181.41:2550 :::* LISTEN
      tcp6 0 0 10.183.181.41:2550 10.183.181.45:59936 ESTABLISHED
      tcp6 0 0 10.183.181.41:38763 10.183.181.42:2550 ESTABLISHED
      tcp6 0 0 10.183.181.41:2550 10.183.181.43:60935 ESTABLISHED
      tcp6 0 0 10.183.181.41:32937 10.183.181.43:2550 ESTABLISHED
      tcp6 0 0 10.183.181.41:47800 10.183.181.45:2550 ESTABLISHED
      tcp6 0 0 10.183.181.41:54118 10.183.181.44:2550 ESTABLISHED
      tcp6 0 0 10.183.181.41:2550 10.183.181.44:34235 ESTABLISHED
      tcp6 0 0 10.183.181.41:2550 10.183.181.42:51515 ESTABLISHED
      root@mininet-vm:~# netstat -na | grep 2551
      tcp6 0 0 10.183.181.41:2551 :::* LISTEN
      tcp6 0 0 10.183.181.41:2551 10.183.181.42:33333 ESTABLISHED
      tcp6 0 0 10.183.181.41:2551 10.183.181.44:36155 ESTABLISHED
      tcp6 0 0 10.183.181.41:2551 10.183.181.43:49365 ESTABLISHED
      tcp6 0 0 10.183.181.41:35548 10.183.181.42:2551 ESTABLISHED
      tcp6 0 0 10.183.181.41:58923 10.183.181.44:2551 ESTABLISHED
      tcp6 0 0 10.183.181.41:39818 10.183.181.45:2551 ESTABLISHED
      tcp6 0 373 10.183.181.41:42423 10.183.181.43:2551 ESTABLISHED
      tcp6 0 0 10.183.181.41:2551 10.183.181.45:40387 ESTABLISHED

      feature:install odl-clustering-test-app
      feature:install odl-restconf-all
      feature:install odl-mdsal-all
      feature:install odl-openflowplugin-all-li

      3. OF ports 6633, 6653 came up in all 4 nodes (c2, c3, c4, c5) excpet c1.

      c1:
      ===
      root@mininet-vm:~# netstat -na | grep 665*
      tcp 0 0 0.0.0.0:6644 0.0.0.0:* LISTEN
      unix 3 [ ] STREAM CONNECTED 9662
      unix 3 [ ] STREAM CONNECTED 1866
      unix 3 [ ] STREAM CONNECTED 8566 @/com/ubuntu/upstart

      c2, c3, c4, c5:
      ===============
      root@mininet-vm:~# netstat -na | grep 665*
      tcp 0 0 0.0.0.0:6644 0.0.0.0:* LISTEN
      tcp6 0 0 :::6633 :::* LISTEN
      tcp6 0 0 :::6653 :::* LISTEN

      4. checked suspected ERROR logs

      2015-11-02 22:13:17,361 | ERROR | config-pusher | ConfigTransactionControllerImpl | 125 - org.opendaylight.controller.config-manager - 0.3.2.Lithium-SR2 | Commit failed on ModuleIdentifier

      {factoryName='openflow-plugin-provider-impl', instanceName='openflow-plugin-provider-impl'} in transaction TransactionIdentifier{name='ConfigTransaction-86-88'}
      java.lang.IllegalStateException: java.util.concurrent.ExecutionException: TransactionCommitFailedException{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-inventory-operational currently has no leader. Try again later.]]}
      at org.opendaylight.openflowplugin.impl.device.DeviceManagerImpl.<init>(DeviceManagerImpl.java:155)
      at org.opendaylight.openflowplugin.impl.OpenFlowPluginProviderImpl.initialize(OpenFlowPluginProviderImpl.java:169)
      at org.opendaylight.yang.gen.v1.urn.opendaylight.params.xml.ns.yang.config.openflow.plugin.impl.rev150327.OpenFlowProviderModule.createInstance(OpenFlowProviderModule.java:40)
      at org.opendaylight.controller.config.spi.AbstractModule.getInstance(AbstractModule.java:73)
      at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)[:1.7.0_65]
      at java.lang.reflect.Method.invoke(Method.java:606)[:1.7.0_65]
      at org.opendaylight.controller.config.manager.impl.dependencyresolver.DependencyResolverManager$ModuleInvocationHandler.handleInvocation(DependencyResolverManager.java:150)

      5. Giving grep output of all errors in logfile. Also attaching karaf log files.

      root@mininet-vm:/home/mininet/controller-Li/deploy/current/odl/data/log# grep ERROR karaf.log.1
      2015-11-02 22:05:11,708 | ERROR | Event Dispatcher | FeatureDeploymentListener | 21 - org.apache.karaf.deployer.features - 3.0.3 | Unable to install features
      2015-11-02 22:10:27,389 | ERROR | config-pusher | DistributedDataStore | 191 - org.opendaylight.controller.sal-distributed-datastore - 1.2.2.Lithium-SR2 | Shared leaders failed to settle in 90 seconds, giving up
      2015-11-02 22:11:57,500 | ERROR | config-pusher | DistributedDataStore | 191 - org.opendaylight.controller.sal-distributed-datastore - 1.2.2.Lithium-SR2 | Shared leaders failed to settle in 90 seconds, giving up
      2015-11-02 22:12:35,456 | ERROR | CommitFutures-1 | NetconfDeviceDatastoreAdapter | 243 - org.opendaylight.controller.sal-netconf-connector - 1.2.2.Lithium-SR2 | RemoteDevice{controller-config}: TransactionChain(org.opendaylight.controller.md.sal.binding.impl.BindingDOMTransactionChainAdapter@adef923) DOM-CHAIN-0-0 FAILED!
      TransactionCommitFailedException{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-inventory-operational currently has no leader. Try again later.]]}
      2015-11-02 22:12:35,253 | ERROR | CommitFutures-0 | NetconfDeviceDatastoreAdapter | 243 - org.opendaylight.controller.sal-netconf-connector - 1.2.2.Lithium-SR2 | RemoteDevice{controller-config}: Transaction(init) DOM-CHAIN-0-0 FAILED!
      TransactionCommitFailedException{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-inventory-operational currently has no leader. Try again later.]]}
      2015-11-02 22:12:36,644 | ERROR | CommitFutures-4 | NetconfDeviceTopologyAdapter | 243 - org.opendaylight.controller.sal-netconf-connector - 1.2.2.Lithium-SR2 | RemoteDevice{controller-config}: Transaction(init) DOM-CHAIN-1-0 FAILED!
      TransactionCommitFailedException{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-topology-operational currently has no leader. Try again later.]]}
      2015-11-02 22:12:36,449 | ERROR | CommitFutures-3 | NetconfDeviceTopologyAdapter | 243 - org.opendaylight.controller.sal-netconf-connector - 1.2.2.Lithium-SR2 | RemoteDevice{controller-config}: TransactionChain(org.opendaylight.controller.md.sal.binding.impl.BindingDOMTransactionChainAdapter@dd96fd0) DOM-CHAIN-1-0 FAILED!
      TransactionCommitFailedException{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-topology-operational currently has no leader. Try again later.]]}
      java.util.concurrent.ExecutionException: TransactionCommitFailedException{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-topology-operational currently has no leader. Try again later.]]}
      Caused by: TransactionCommitFailedException{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-topology-operational currently has no leader. Try again later.]]}
      2015-11-02 22:13:02,207 | ERROR | sing-executor-11 | ExecutionList | 72 - com.google.guava - 18.0.0 | RuntimeException while executing runnable com.google.common.util.concurrent.Futures$6@3d203dd5 with executor INSTANCE
      2015-11-02 22:13:17,353 | ERROR | config-pusher | DeviceManagerImpl | 280 - org.opendaylight.openflowplugin.impl - 0.1.2.Lithium-SR2 | Creation of node failed.
      java.util.concurrent.ExecutionException: TransactionCommitFailedException{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-inventory-operational currently has no leader. Try again later.]]}
      Caused by: TransactionCommitFailedException{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-inventory-operational currently has no leader. Try again later.]]}
      2015-11-02 22:13:17,361 | ERROR | config-pusher | ConfigTransactionControllerImpl | 125 - org.opendaylight.controller.config-manager - 0.3.2.Lithium-SR2 | Commit failed on ModuleIdentifier{factoryName='openflow-plugin-provider-impl', instanceName='openflow-plugin-provider-impl'}

      in transaction TransactionIdentifier

      {name='ConfigTransaction-86-88'}

      java.lang.IllegalStateException: java.util.concurrent.ExecutionException: TransactionCommitFailedException

      {message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-inventory-operational currently has no leader. Try again later.]]}

      Caused by: java.util.concurrent.ExecutionException: TransactionCommitFailedException

      {message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-inventory-operational currently has no leader. Try again later.]]}

      Caused by: TransactionCommitFailedException

      {message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-inventory-operational currently has no leader. Try again later.]]}

      2015-11-02 22:13:17,464 | ERROR | config-pusher | ConfigRegistryImpl | 125 - org.opendaylight.controller.config-manager - 0.3.2.Lithium-SR2 | Configuration Transaction failed on 2PC, server is unhealthy
      Caused by: java.lang.IllegalStateException: java.util.concurrent.ExecutionException: TransactionCommitFailedException

      {message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-inventory-operational currently has no leader. Try again later.]]}

      Caused by: java.util.concurrent.ExecutionException: TransactionCommitFailedException

      {message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-inventory-operational currently has no leader. Try again later.]]}

      Caused by: TransactionCommitFailedException

      {message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-1-shard-inventory-operational currently has no leader. Try again later.]]}

      root@mininet-vm:/home/mininet/controller-Li/deploy/current/odl/data/log# grep ERROR karaf.log

      Attachments

        No reviews matched the request. Check your Options in the drop-down menu of this sections header.

        Activity

          People

            moraja@cisco.com Moiz Raja
            sanjib.mohapatra@ericsson.com Sanjib Mohapatra
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: