controller / CONTROLLER-1703

Tweak Akka and Java timeouts to a reasonable compromise between stability and failure detection


    • Type: Improvement
    • Resolution: Unresolved
    • 10.0.0
    • None
    • clustering
    • Operating System: All
      Platform: All

      There are several bugs (such as CONTROLLER-1645) which track failures caused by an unexpected UnreachableMember.
      One hypothesis is that these can happen when the cluster is under load: members have multiple (or large) messages to process, so they are late to read heartbeats from their peers.

      To test functional bug fixes, we frequently run with increased Akka timeouts, for example with [0].

      But large Akka timeouts also seem to have downsides. This Improvement is to make sure the various (default) timeouts within ODL are consistent and suitable for performance tests.
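
      For orientation, the trade-off can be sketched with the Akka cluster failure-detector settings. This is only an illustration: the keys are standard Akka configuration keys, but the values are hypothetical and are not taken from [0] or from any ODL default.

      ```
      # Hypothetical values, for illustration only (not the values used in [0]).
      akka.cluster.failure-detector {
        # How often each member sends heartbeats to the peers it monitors.
        heartbeat-interval = 1 s

        # Phi threshold: higher values mean fewer false UnreachableMember
        # events, but slower detection of members that really are down.
        threshold = 12.0

        # How long heartbeats may be delayed (for example while a busy member
        # is processing large messages) before the delay counts as an anomaly.
        acceptable-heartbeat-pause = 10 s
      }
      ```

      Raising threshold or acceptable-heartbeat-pause makes a loaded member less likely to be marked unreachable, at the cost of reacting later to a genuine failure.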

      This is an umbrella bug; specific symptoms will be described in child bugs.

      [0] https://git.opendaylight.org/gerrit/#/c/57699/5

            rovarga Robert Varga
            vrpolak Vratko Polak