Details
Type: Bug
Status: In Progress
Priority: High
Resolution: Unresolved
Description
There are some sporadic failures in the external_network suite around PNF.
As of now, I see two types:
1) the initial ping to the PNF in our apex job fails, but the flow rule appears to be
programmed eventually, so subsequent test cases pass.
2) the ping to the PNF fails after floating IP assignment is done in our non-conntrack
(e.g. "controller") job.
Some things to note:
- in the apex failure example here, the first ping from instance 1 to the PNF fails and
the flow rule to hit the PNF is not there, but the 2nd ping from instance 2 passes and
the flow rule is then seen on both compute nodes. From that point on, all further test
cases pass. That seems to indicate either very slow learning/programming, or
that the initial ping from instance 1 never triggered the path that learns and
programs the proper flow.
- the apex job is conntrack based
- the "controller" job skips the initial ping test case, as it's not expected to work
in the non-conntrack job anyway. But when the controller job fails, the flow tables
captured throughout the whole test suite show that the PNF flow is never programmed.
example job here
- we don't ever seem to hit this type of sporadic failure in the conntrack job that
is based on devstack, only in the apex job. One major difference is how soon
the tests start after ODL is brought up. In the apex job it's a matter of a few
minutes at most, whereas in the devstack based job ODL can be running for 30-40m
before devstack is done stacking and tests begin.
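
One way to tell "slow programming" apart from "never programmed" would be to poll for the PNF flow before the first ping and record how long it takes to appear. The following is only a minimal sketch: `wait_for` and its parameters are hypothetical names, not part of the existing suite, and the check that inspects the flow tables is left to the caller.

```python
import time


def wait_for(check, timeout=60.0, interval=2.0):
    """Poll check() until it returns True or `timeout` seconds elapse.

    Returns the elapsed seconds on success (may be 0.0), or None on timeout.
    `check` is a caller-supplied callable; for this bug it would be a
    hypothetical function that looks for the PNF flow in the flow tables.
    """
    start = time.monotonic()
    while True:
        if check():
            return time.monotonic() - start
        if time.monotonic() - start >= timeout:
            return None
        time.sleep(interval)


if __name__ == "__main__":
    # Stand-in check that succeeds on the third poll, for illustration only.
    attempts = {"n": 0}

    def fake_flow_present():
        attempts["n"] += 1
        return attempts["n"] >= 3

    elapsed = wait_for(fake_flow_present, timeout=10.0, interval=0.1)
    if elapsed is None:
        print("PNF flow never appeared within the timeout")
    else:
        print("PNF flow appeared after %.1fs" % elapsed)
```

A None result after a generous timeout would point at the flow never being programmed, while a large but finite value would point at slow programming.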