[OPNFLWPLUG-1056] Default tables missing Created: 10/Dec/18 Updated: 09/Jul/19 Resolved: 09/Jul/19 |
|
| Status: | Resolved |
| Project: | OpenFlowPlugin |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Neon, Sodium |
| Type: | Bug | Priority: | Medium |
| Reporter: | Sam Hague | Assignee: | Somashekhar Javalagi |
| Resolution: | Done | Votes: | 0 |
| Labels: | csit:failures | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
In the CSIT job, the default table is missing, so the suite setup fails.
|
| Comments |
| Comment by Faseela K [ 10/Dec/18 ] |
|
shague : Why is this a GENIUS JIRA? The tables missing default flows are 18, 60, and 45. Those are netvirt-programmed tables. And if they are already programmed by netvirt, this is most likely an openflowplugin bug. |
| Comment by Sam Hague [ 10/Dec/18 ] |
|
Thanks, moved to ofp. |
| Comment by Somashekhar Javalagi [ 13/Dec/18 ] |
|
shague I have added a patch with debug logs. Can you please run the CSIT on it? |
| Comment by Jamo Luhrsen [ 20/Dec/18 ] |
|
We need to figure out which job sees this most frequently and then try to reproduce it there. I'll see what I can figure out, but if anyone else knows, please comment. |
| Comment by Jamo Luhrsen [ 20/Dec/18 ] |
|
Looks like the 3node cluster jobs see this more frequently than others. This tempest one I will create a sandbox job that runs without any test cases, in a loop, |
| Comment by Jamo Luhrsen [ 20/Dec/18 ] |
|
Here's the sandbox job to try and reproduce this. Note that it |
| Comment by Jamo Luhrsen [ 21/Dec/18 ] |
|
Was able to recreate with the distro created in the debug patch. Here is a link to the robot failure where you can see table=45 was not found on one of the nodes. This is a clustered job, so there are three karaf logs to look at: The node with the missing table=45 was the first compute node, and its ovsdb UUID was 46c9d66c-60a6-4da3-8b58-2c4831689600. I think |
| Comment by Arunprakash D [ 06/Feb/19 ] |
|
DeviceContext writes node information to the oper inventory. RoleContext is responsible for the device's mastership election and the ownership change callback.
FRM registers for the ownership callback and is notified once a master is elected for a device. In some cases, RoleContext takes time to complete the ownership election, but DeviceContext goes ahead and writes the node information to the oper inventory. Apps that listen for the node DTCL then push default table flows, which FRM might drop because it has not yet received the ownership callback. The new implementation would have DeviceContext wait for the mastership election to complete and only then write the switch information to the oper inventory. This ensures FRM always has the mastership details when it receives flow information. |
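
To make the ordering problem described above concrete, here is a minimal toy model in Python. It is not openflowplugin code: the class names `Frm` and `App`, the function `connect_device`, and the node id `openflow:1` are all hypothetical stand-ins, and the only facts taken from this ticket are the table numbers (18, 45, 60) and the two event orderings. It assumes the race reduces to which of two events happens first: the node write to the oper inventory or the ownership callback to FRM.

```python
class Frm:
    """Toy stand-in for FRM: programs flows only for nodes it knows it masters."""

    def __init__(self):
        self.mastered = set()
        self.programmed = []
        self.dropped = []

    def on_ownership_change(self, node_id):
        # Ownership callback: a master has been elected for this node.
        self.mastered.add(node_id)

    def add_flow(self, node_id, table):
        # Flows arriving before the ownership callback are dropped.
        if node_id in self.mastered:
            self.programmed.append((node_id, table))
        else:
            self.dropped.append((node_id, table))


class App:
    """Toy stand-in for an app that reacts to a node appearing in the oper inventory."""

    DEFAULT_TABLES = (18, 45, 60)  # tables named in this ticket

    def __init__(self, frm):
        self.frm = frm

    def on_node_added(self, node_id):
        # Node listener fires: push the default flow for each table.
        for table in self.DEFAULT_TABLES:
            self.frm.add_flow(node_id, table)


def connect_device(node_id, frm, app, wait_for_mastership):
    """Replay one device connection with a slow mastership election.

    wait_for_mastership=False models the reported behaviour (node written first);
    wait_for_mastership=True models the proposed behaviour (election first).
    """
    def write_node():
        app.on_node_added(node_id)

    def elect_master():
        frm.on_ownership_change(node_id)

    steps = [elect_master, write_node] if wait_for_mastership else [write_node, elect_master]
    for step in steps:
        step()


if __name__ == "__main__":
    for wait in (False, True):
        frm = Frm()
        connect_device("openflow:1", frm, App(frm), wait_for_mastership=wait)
        print(wait, "programmed:", frm.programmed, "dropped:", frm.dropped)
```

In this sketch, writing the node first leaves all three default flows in `dropped`, while waiting for the election first leaves all three in `programmed`. The real fix would involve asynchronous callbacks rather than a fixed step list, so treat this only as an illustration of the ordering.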