controller / CONTROLLER-1757

Singleton leader chasing exhausts heap space in a few hours


    • Type: Bug
    • Resolution: Done
    • None
    • Carbon
    • clustering
    • None
    • Operating System: All
      Platform: All

    • 9054

      This bug is not (yet) present in Carbon code; it affects changes proposed around the SR2 branch lock. Reporting it because it will probably prevent some fixes from being merged into the SR2 candidate build.

      The exact build where this bug happens is [0], which was intended to fix MDSAL-275. It does fix that issue, but apparently there is a memory leak somewhere.

      Logs for the Sandbox run are here [1]. The karaf.log files show that UnreachableMember events start happening around three and a half hours into the test (corresponding to GC pauses of 5 seconds or more), and the gclogs directories show that members 1 and 3 end with an allocation failure not recoverable by GC around 19 hours after the test starts. It is not clear whether heap dumps were created; they certainly have not been archived.
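
      For anyone re-checking the gclogs, a minimal sketch of how pauses of 5 seconds or more could be picked out. It assumes the logs use the HotSpot JDK 8 detail format, where each event ends with "[Times: user=... sys=..., real=N.NN secs]"; that format has not been confirmed for this run. The function name, the threshold parameter and the sample lines are hypothetical and are not taken from the archived logs.

```python
import re

# Assumption: each GC event line carries "real=N.NN secs" (JDK 8 PrintGCDetails style).
REAL_TIME = re.compile(r"real=(\d+(?:\.\d+)?) secs")


def long_pauses(lines, threshold_secs=5.0):
    """Return (line_number, seconds) for every GC event at or above the threshold."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = REAL_TIME.search(line)
        if match is None:
            continue  # not a GC timing line
        seconds = float(match.group(1))
        if seconds >= threshold_secs:
            hits.append((number, seconds))
    return hits


# Hypothetical sample lines for illustration only, not real log content.
sample = [
    "[GC (Allocation Failure) ... [Times: user=0.10 sys=0.00, real=0.03 secs]",
    "[Full GC (Allocation Failure) ... [Times: user=20.10 sys=0.20, real=5.42 secs]",
]
print(long_pauses(sample))
```

      Running the same function over an open gclog file (a file object iterates line by line) would list the events that line up with the UnreachableMember timestamps in karaf.log.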

      Patches that were included in the build are: [2], [3] (with its ancestors) and [4].

      [0] https://nexus.opendaylight.org/content/repositories/opendaylight.snapshot/org/opendaylight/integration/integration/distribution/distribution-karaf/0.6.2-SNAPSHOT/distribution-karaf-0.6.2-20170823.082806-47.zip
      [1] https://logs.opendaylight.org/sandbox/jenkins091/controller-csit-3node-cs-chasing-leader-longevity-only-carbon/14/
      [2] https://git.opendaylight.org/gerrit/#/c/61420/18
      [3] https://git.opendaylight.org/gerrit/#/c/62170/4
      [4] https://git.opendaylight.org/gerrit/#/c/62140/1

            rovarga Robert Varga
            vrpolak Vratko Polak