controller / CONTROLLER-1703

Tweak Akka and Java timeouts to a reasonable compromise between stability and failure detection

Details

    • Improvement
    • Status: In Review
    • Resolution: Unresolved
    • None
    • 8.0.5, 9.0.1
    • clustering
    • Operating System: All
      Platform: All

    Description

      Several bugs (such as CONTROLLER-1645) track failures caused by an unexpected UnreachableMember event.
      One hypothesis is that these happen when the cluster is under load: members have many (or large) messages to process, so they are late to read heartbeats from their peers.

      To test functional bug fixes, we frequently run with increased Akka timeouts, for example with [0].

      However, large Akka timeouts also seem to have downsides, since a member that has really failed is detected later. This Improvement is to make sure the various (default) timeouts within ODL are consistent with each other and suitable for performance tests.
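
      To make the trade-off concrete, here is a rough Python sketch of a phi-accrual style check. It is an illustration only, not ODL or Akka code: the function names are made up for this example, the logistic approximation is modelled on how Akka's failure detector is understood to work, and the numbers (1 s heartbeat interval, 3 s acceptable pause, threshold 8.0, 100 ms minimum standard deviation) are assumed to match Akka's reference defaults rather than taken from [0].

```python
import math

def phi(elapsed_ms, mean_ms, std_dev_ms):
    """Suspicion level for a peer whose last heartbeat arrived elapsed_ms ago."""
    y = (elapsed_ms - mean_ms) / std_dev_ms
    # Clamp the exponent so math.exp cannot overflow or underflow to zero.
    exponent = max(-700.0, min(700.0, -y * (1.5976 + 0.070566 * y * y)))
    e = math.exp(exponent)
    if elapsed_ms > mean_ms:
        return -math.log10(e / (1.0 + e))
    return -math.log10(1.0 - 1.0 / (1.0 + e))

def is_unreachable(stall_ms, interval_ms, pause_ms, threshold=8.0, min_std_ms=100.0):
    """True if a peer silent for stall_ms would be marked unreachable.

    Assumes a steady heartbeat history, so the expected gap is the heartbeat
    interval plus the acceptable pause, with the minimum standard deviation.
    """
    return phi(stall_ms, interval_ms + pause_ms, min_std_ms) >= threshold

def detection_delay_ms(interval_ms, pause_ms, threshold=8.0, min_std_ms=100.0):
    """Smallest silence (in 10 ms steps) that trips the detector, or None."""
    for elapsed in range(0, 60_000, 10):
        if is_unreachable(elapsed, interval_ms, pause_ms, threshold, min_std_ms):
            return elapsed
    return None

# A member busy for 5 s under load, with a short and a long acceptable pause.
print(is_unreachable(5_000, 1_000, 3_000))   # short pause: flagged
print(is_unreachable(5_000, 1_000, 10_000))  # long pause: tolerated
print(detection_delay_ms(1_000, 3_000), detection_delay_ms(1_000, 10_000))
```

      Under these assumptions, the larger pause tolerates the 5 s stall but delays detection of a member that is really gone by roughly the same amount the pause was increased, which is the compromise this issue is about.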

      This is an umbrella bug; specific symptoms will be described in child bugs.

      [0] https://git.opendaylight.org/gerrit/#/c/57699/5

            People

              rovarga Robert Varga
              vrpolak Vratko Polak