CONTROLLER-1900

Performance regression in cluster registration


      The problem seems to be in all branches:

      https://jenkins.opendaylight.org/releng/job/controller-csit-3node-clustering-ask-all-neon/247/robot/controller-clustering-ask.txt/Chasing%20The%20Leader/Unregister_Candidates_And_Validate_Criteria/

      https://jenkins.opendaylight.org/releng/job/controller-csit-3node-clustering-tell-all-fluorine/220/robot/controller-clustering-tell.txt/Chasing%20The%20Leader/Unregister_Candidates_And_Validate_Criteria/

      In Fluorine it started around May 23rd, so there are multiple suspect patches:

      https://git.opendaylight.org/gerrit/#/q/branch:stable/fluorine+project:controller

      The test itself does:

      1) Start a singleton registration flap on every controller instance with this RPC: /restconf/operations/odl-mdsal-lowlevel-control:register-flapping-singleton

      2) Maintain the flap for 60 secs.

      3) Stop the flap on every controller instance: /restconf/operations/odl-mdsal-lowlevel-control:unregister-flapping-singleton

      4) Get flap count from above RPC response: <output xmlns="tag:opendaylight.org,2017:controller:yang:lowlevel:control"><flap-count>83</flap-count></output>

      5) Add all the flaps for the 3 controller instances and divide the total by 60 secs.
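      A minimal sketch of steps 4 and 5, assuming the three unregister responses have already been collected as XML strings. The function names are hypothetical and the sample simply reuses the response quoted in step 4 for all three instances; this is not the actual Robot test code.

```python
import xml.etree.ElementTree as ET

# Namespace copied from the RPC output quoted in step 4.
NS = "tag:opendaylight.org,2017:controller:yang:lowlevel:control"


def parse_flap_count(response_xml: str) -> int:
    """Extract <flap-count> from one unregister-flapping-singleton response."""
    root = ET.fromstring(response_xml)
    node = root.find(f"{{{NS}}}flap-count")
    if node is None or node.text is None:
        raise ValueError("response has no flap-count element")
    return int(node.text)


def flaps_per_second(responses, duration_secs=60.0):
    """Sum the flap counts of all instances and divide by the flap duration."""
    total = sum(parse_flap_count(r) for r in responses)
    return total / duration_secs


# Illustration only: the same sample response is reused for all 3 instances.
sample = (
    '<output xmlns="tag:opendaylight.org,2017:controller:yang:lowlevel:control">'
    "<flap-count>83</flap-count></output>"
)
rate = flaps_per_second([sample] * 3)
print(f"{rate:.2f} flaps/sec")  # 249 flaps / 60 secs -> 4.15 flaps/sec
```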

      Before the regression, the controller handled ~50 flaps/sec; after the regression, it handles fewer than 5 flaps/sec.

            ecelgp Luis Gomez