Details
-
Bug
-
Status: Resolved
-
Resolution: Done
-
Operating System: All
Platform: All
-
5464
Description
While running scalability tests against the OpenFlowPlugin codebase (stable/lithium branch, -li plugin) with statistics collection enabled, I encountered the following issue:
- OvS 2.3.x or earlier versions:
Scalability seems fine, or at least goes up to 400+ switches.
- OvS 2.4.x or newer versions:
Scalability is capped at around 45-50 switches.
At the tipping point, the switches close their connections and then send hello messages. Because ODL is already in bad shape at that point, it goes into a crazy loop where it tries to re-establish the connection for every switch, one by one, and fails each time. This used to end in an OOM error.
This crazy loop behaviour was recently fixed with Bug-4957; now, at the tipping point, ODL goes crazy for a bit, then recovers and stabilizes, although all switches remain disconnected and no new connection is possible.
Here are some logs, and a Yourkit Java Profiler snapshot: https://www.dropbox.com/sh/1zz1x6i1bl5uor8/AACHJfML-RqvOk7vFI5U-haJa?dl=0
I will set up tests in the ODL infra to track progress and to get a better picture of scalability performance. Work started here: https://git.opendaylight.org/gerrit/#/c/35813/
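For anyone who wants to try reproducing the switch ramp-up by hand, here is a minimal sketch. It is not the test from the Gerrit change above. It assumes plain OvS bridges stand in for the switches, that the controller listens on tcp:127.0.0.1:6633, and that OpenFlow 1.3 is wanted. The helper name `ramp_commands` and the `s<N>` bridge naming are hypothetical. It only builds the `ovs-vsctl` command lines, so they can be reviewed before being run.

```python
def ramp_commands(count, controller="tcp:127.0.0.1:6633", prefix="s"):
    """Build ovs-vsctl commands that create `count` bridges on one controller.

    The controller address, OpenFlow version and bridge prefix are
    assumptions made for illustration; adjust them to the setup under test.
    """
    commands = []
    for i in range(1, count + 1):
        bridge = f"{prefix}{i}"
        # Create the bridge; --may-exist makes the command safe to re-run.
        commands.append(["ovs-vsctl", "--may-exist", "add-br", bridge])
        # Restrict the bridge to OpenFlow 1.3 (assumed, not stated in the report).
        commands.append(["ovs-vsctl", "set", "bridge", bridge, "protocols=OpenFlow13"])
        # Point the bridge at the controller under test.
        commands.append(["ovs-vsctl", "set-controller", bridge, controller])
    return commands


if __name__ == "__main__":
    # Print the commands for 50 bridges, roughly the tipping point reported
    # above for OvS 2.4.x.
    for cmd in ramp_commands(50):
        print(" ".join(cmd))
```

Raising `count` in steps while watching the controller log should show roughly where connections start to drop.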