-
Bug
-
Resolution: Won't Do
-
High
-
None
There are sporadic failures due to connectivity issues in our -upstream-stateful-snat-conntrack-oxygen- job.
Multiple tempest.scenario tests are failing sporadically. I have created a new label,
"csit:snat-conntrack", to make these failures easier to group.
I am filing this issue for a non-tempest failure that happened in an Oxygen release
candidate job. A subsequent job did not fail. Since the overall nature of these failures
is the same, I'm hoping there is a single root cause (or just a few) whose fix will clean up
all the failures, in tempest and otherwise.
After poking around a little and comparing the flow table on a passing run's compute node 0 with the
same node on the failing run, I think some flows are missing in the failing run. For
example, the failing run has no flow in table 36, but I see this in the passing job:
cookie=0x9001392, duration=486.839s, table=36, n_packets=916, n_bytes=90892, priority=5,tun_id=0x5e actions=write_metadata:0x1392000000/0xfffffffff000000,goto_table:51
Also, there are some 8 flows in table=51 in the passing job, but only 4 in the failing job.
This could be explained by a few things. Maybe the extra flows are leftovers and should
not even be there in the passing job. Or maybe the instances created in the test are
on different compute nodes, resulting in the difference. Either way, I just wanted to point
it out.
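The manual comparison described above could be automated roughly as follows. This is only a sketch, not a tool the CSIT jobs use: the function names are hypothetical, and it assumes both flow dumps are available as plain text in the format of the flow line quoted above. It strips the per-run counters (duration, n_packets, n_bytes) before comparing, and it assumes cookies are stable between runs, which may not hold.

```python
import re

# Matches the "table=N" field, but not "goto_table:N" in the actions.
TABLE_RE = re.compile(r"\btable=(\d+)")
# Fields that differ from run to run and would hide otherwise identical flows.
VOLATILE_RE = re.compile(
    r"\b(?:duration|n_packets|n_bytes|idle_age|hard_age)=[^,\s]+,?\s*"
)


def normalize_flow(line):
    """Strip per-run counters so the same flow compares equal across dumps."""
    return VOLATILE_RE.sub("", line).strip()


def flows_by_table(dump_text):
    """Map table number -> set of normalized flow lines found in a dump."""
    tables = {}
    for line in dump_text.splitlines():
        match = TABLE_RE.search(line)
        if match:
            table = int(match.group(1))
            tables.setdefault(table, set()).add(normalize_flow(line))
    return tables


def missing_flows(passing_dump, failing_dump):
    """Return {table: flows in the passing dump that the failing dump lacks}."""
    passing = flows_by_table(passing_dump)
    failing = flows_by_table(failing_dump)
    result = {}
    for table, flows in passing.items():
        diff = flows - failing.get(table, set())
        if diff:
            result[table] = diff
    return result
```

Calling `missing_flows(passing_text, failing_text)` with the two dumps from compute node 0 would list, per table, the flows that exist only in the passing run. Running it in the other direction would show flows present only in the failing run. Neither result says whether the extra flows are leftovers or whether the instances landed on different compute nodes; it only narrows down where to look.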