CONTROLLER-1866

circuit breaker timed out; datastore shutdown

Details

    Description

      A failure is showing up in the netvirt 1node CSIT job, where the output of the
      karaf CLI command "showSvcStatus" reports the DATASTORE in ERROR state:

      Timestamp: Tue Oct 09 21:02:42 UTC 2018
      Node IP Address: 10.30.170.157
      System is operational: false
      System ready state: ACTIVE
        OPENFLOW : OPERATIONAL
        IFM : OPERATIONAL
        ITM : OPERATIONAL
        ELAN : OPERATIONAL
        OVSDB : OPERATIONAL
        DATASTORE : ERROR java.lang.reflect.UndeclaredThrowableException
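
A minimal sketch of how a CSIT check could flag this condition automatically, assuming the "showSvcStatus" output keeps the "NAME : STATE" layout shown above. The function name `failing_services` and the embedded sample are illustrative only and are not part of karaf or OpenDaylight.

```python
import re

# Sample copied from the showSvcStatus output in this report.
SAMPLE = """\
Timestamp: Tue Oct 09 21:02:42 UTC 2018
Node IP Address: 10.30.170.157
System is operational: false
System ready state: ACTIVE
  OPENFLOW : OPERATIONAL
  IFM : OPERATIONAL
  ITM : OPERATIONAL
  ELAN : OPERATIONAL
  OVSDB : OPERATIONAL
  DATASTORE : ERROR java.lang.reflect.UndeclaredThrowableException
"""


def failing_services(status_text):
    """Return {service: state and detail} for services that are not OPERATIONAL.

    Hypothetical helper: it only assumes that service lines look like
    "NAME : STATE [optional detail]" with an upper-case service name.
    """
    failing = {}
    for line in status_text.splitlines():
        match = re.match(r"^\s*([A-Z_]+)\s+:\s+(\S+)\s*(.*)$", line)
        if match and match.group(2) != "OPERATIONAL":
            detail = (match.group(2) + " " + match.group(3)).strip()
            failing[match.group(1)] = detail
    return failing


if __name__ == "__main__":
    for name, detail in failing_services(SAMPLE).items():
        print(name, "->", detail)
```

Header lines such as "Timestamp:" are skipped because they do not match the upper-case "NAME : STATE" pattern, so only the DATASTORE entry is reported for this sample.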
      

      Looking at the karaf.log, the reason seems to be that persisting a shard
      entry failed with "Circuit Breaker Timed out", after which some cluster/akka
      logic shut down the datastore.

      2018-10-09T20:58:22,469 | ERROR | opendaylight-cluster-data-akka.actor.default-dispatcher-39 | Shard                            | 228 - org.opendaylight.controller.sal-clustering-commons - 1.7.4 | Failed to persist event type [org.opendaylight.controller.cluster.raft.persisted.SimpleReplicatedLogEntry] with sequence number [78318] for persistenceId [member-1-shard-default-config].
      akka.pattern.CircuitBreaker$$anon$1: Circuit Breaker Timed out.
      2018-10-09T20:58:22,515 | INFO  | opendaylight-cluster-data-shard-dispatcher-215 | Shard                            | 228 - org.opendaylight.controller.sal-clustering-commons - 1.7.4 | Stopping Shard member-1-shard-default-config
      2018-10-09T20:58:22,517 | WARN  | opendaylight-cluster-data-akka.actor.default-dispatcher-70 | LocalThreePhaseCommitCohort      | 235 - org.opendaylight.controller.sal-distributed-datastore - 1.7.4 | Failed to prepare transaction member-1-datastore-config-fe-0-txn-65215-0 on backend
      java.lang.RuntimeException: Transaction aborted due to shutdown.
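
To find the same sequence in other CSIT runs, a rough sketch like the following could scan a karaf.log for persistence failures caused by the circuit breaker. It assumes the pipe-separated layout of the lines quoted above (timestamp, level, thread, logger, bundle, message) and that exception lines follow their log entry without pipes. The helper names `parse_karaf_log` and `circuit_breaker_failures` are hypothetical.

```python
import re

# Lines copied from the karaf.log excerpt in this report.
SAMPLE_LOG = [
    "2018-10-09T20:58:22,469 | ERROR | opendaylight-cluster-data-akka.actor.default-dispatcher-39 | Shard                            | 228 - org.opendaylight.controller.sal-clustering-commons - 1.7.4 | Failed to persist event type [org.opendaylight.controller.cluster.raft.persisted.SimpleReplicatedLogEntry] with sequence number [78318] for persistenceId [member-1-shard-default-config].",
    "akka.pattern.CircuitBreaker$$anon$1: Circuit Breaker Timed out.",
    "2018-10-09T20:58:22,515 | INFO  | opendaylight-cluster-data-shard-dispatcher-215 | Shard                            | 228 - org.opendaylight.controller.sal-clustering-commons - 1.7.4 | Stopping Shard member-1-shard-default-config",
    "2018-10-09T20:58:22,517 | WARN  | opendaylight-cluster-data-akka.actor.default-dispatcher-70 | LocalThreePhaseCommitCohort      | 235 - org.opendaylight.controller.sal-distributed-datastore - 1.7.4 | Failed to prepare transaction member-1-datastore-config-fe-0-txn-65215-0 on backend",
    "java.lang.RuntimeException: Transaction aborted due to shutdown.",
]


def parse_karaf_log(lines):
    """Group raw lines into entries; lines without six fields attach to the previous entry."""
    entries = []
    for raw in lines:
        parts = [p.strip() for p in raw.split("|", 5)]
        if len(parts) == 6:
            timestamp, level, _thread, logger, _bundle, message = parts
            entries.append({
                "timestamp": timestamp,
                "level": level,
                "logger": logger,
                "message": message,
                "extra": [],  # exception or stack trace lines that follow
            })
        elif entries and raw.strip():
            entries[-1]["extra"].append(raw.strip())
    return entries


def circuit_breaker_failures(entries):
    """Return (timestamp, persistenceId) for entries followed by a circuit breaker timeout."""
    hits = []
    for entry in entries:
        if any("Circuit Breaker Timed out" in line for line in entry["extra"]):
            match = re.search(r"persistenceId \[([^\]]+)\]", entry["message"])
            hits.append((entry["timestamp"], match.group(1) if match else None))
    return hits


if __name__ == "__main__":
    for timestamp, shard in circuit_breaker_failures(parse_karaf_log(SAMPLE_LOG)):
        print(timestamp, shard)
```

For a real log, the list could be replaced by the lines of the file, for example `open("karaf.log").read().splitlines()`; the path is an assumption.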
      

      This is not necessarily a heavy job, so I do not suspect that it is unable
      to keep up with writing to disk, which I think is one reason this might happen.
       

          People

            jluhrsen Jamo Luhrsen