[OPNFLWPLUG-350] Flow present in configuration data store but not operational when running GBP DevStack integration or Docker integration. Created: 30/Jan/15 Updated: 27/Sep/21 Resolved: 18/Jun/15 |
|
| Status: | Resolved |
| Project: | OpenFlowPlugin |
| Component/s: | General |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | ||
| Reporter: | Thomas Bachman | Assignee: | Michal Rehak |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Operating System: Mac OS |
||
| Attachments: |
|
||||||||||||
| Issue Links: |
|
||||||||||||
| External issue ID: | 2657 | ||||||||||||
| Priority: | Low | ||||||||||||
| Description |
|
Refer to this email: https://lists.opendaylight.org/pipermail/openflowplugin-dev/2015-January/002464.html |
| Comments |
| Comment by Thomas Bachman [ 30/Jan/15 ] |
|
I have a way to reproduce this problem, somewhat reliably. Follow the directions here: https://wiki.opendaylight.org/view/Group_Policy:Building_and_Running_Docker You will only need a single VM to test this – you can just create the VM, configure the test scripts (i.e. IP address of the controller and append --policy in start-poc.sh, IP address of the tunnel in infrastructure_config.py), and run ./start-poc.sh (as sudo or root). When this comes up, run the following command: ovs-ofctl dump-flows s1 | wc -l You should see the number 54 (there are 53 flows installed, but the number is 54 b/c wc counts a header message that is dumped with this command). You should be able to recreate the failure by then running the following commands: ./docker-clean.sh You'll know you've hit the failure case when the output of this: ovs-ofctl dump-flows s1 | wc -l shows 53 instead of 54. You should then observe 53 flows in the config data store, but only 52 flows in the operational data store. |
| Comment by Keith Burns [ 30/Jan/15 ] |
|
We believe the root cause of this, is that when a port is deleted via "ovs-vsctl", a packet is generated by OVS (marked in the trace), but the operational datastore does not remove this "port" (node-connector). We rely on the datastore being correct in order to know where we instantiate flows. |
| Comment by Keith Burns [ 30/Jan/15 ] |
|
Attachment port-delete-marked.pcapng has been added with description: Wireshark Trace of DeletePort |
| Comment by Thomas Bachman [ 03/Feb/15 ] |
|
The missing flow seems to appear in table 2. Below is a dump of the flows that are present in table 2: root@odlgbp1:~/work/groupbasedpolicy/util/dockerTestOfOverlay# ovs-ofctl dump-flows s1 table=2 Here is the flow that's missing from table 2: |
| Comment by Keith Burns [ 03/Feb/15 ] |
|
Please look for any flow in CONFIG AFTER with "35:02" (mac address of an endpoint). You can see this flow in CONFIG-AFTER but not in OPERATIONAL-AFTER. Adding karaf logs in another file. |
| Comment by Keith Burns [ 03/Feb/15 ] |
|
Attachment datastores_and_flows.tar.gz has been added with description: Config and Operational datastores before and after along with flow table |
| Comment by Keith Burns [ 03/Feb/15 ] |
|
Attachment karaflogs.tar.gz has been added with description: Karaf logs |
| Comment by Keith Burns [ 03/Feb/15 ] |
|
In the config and operational data stores zip it would be helpful for you to start with :
|
| Comment by Keith Burns [ 03/Feb/15 ] |
|
It should also be noted that if you do a:
|
| Comment by Thomas Bachman [ 04/Feb/15 ] |
|
This is the output of the REST calls to inventory – both config and operational – when the flow is missing. There is also a set of both config and operational for just the table with the missing flow (table 2). |
| Comment by Thomas Bachman [ 04/Feb/15 ] |
|
Attachment inventory.tar.gz has been added with description: Another capture of full oper/config inventory when this happens, and just the table 2 oper/config when this happens (i.e. the table with the missing flow) |
| Comment by Thomas Bachman [ 04/Feb/15 ] |
|
This has the karaf logs for that run. Note that I had to create and destroy the ports multiple times to make this happen, so you'll see them come and go a few times. The last run has the failure. |
| Comment by Thomas Bachman [ 04/Feb/15 ] |
|
Attachment karaf-logs.tar.gz has been added with description: Here are the karaf log files for that run |
| Comment by Keith Burns [ 04/Feb/15 ] |
|
After working with Michal Rehak we determined that for some reason flows are making it to CONFIG but not making it to OPER (ran LOG.info in frm.add() as well). |
| Comment by Keith Burns [ 04/Feb/15 ] |
|
Attachment port-delete-port-add.tar.gz has been added with description: Wireshark trace of port-add-delete and flowmods |
| Comment by Abhijit Kumbhare [ 12/Mar/15 ] |
|
Any update on this Michal? |
| Comment by Thomas Bachman [ 14/Apr/15 ] |
|
FYI – we found another way to make this happen, very reliably. Using master, run the POC, and do the following: PUT http://controllerIp:port/restconf/config/policy:tenants/policy:tenant/f5c7d344-d1c7-4208-8531-2c2693657e12/l3-context/c0b0edd4-c391-4816-bd07-bf5bb73e3334 body: ] followed by: http://controllerIp:port/restconf/config/policy:tenants/policy:tenant/f5c7d344-d1c7-4208-8531-2c2693657e12/l2-bridge-domain/7b796915-adf4-4356-b5ca-de005ac410c1 ] |
| Comment by Abhijit Kumbhare [ 05/Jun/15 ] |
|
Lowering the priorityto normal and milestone Lithium last RC (to fix only if time permitting) - as GBP has a workaround as per Thomas Bachman. |
| Comment by Abhijit Kumbhare [ 12/Jun/15 ] |
|
Just saw that this has been marked as blocking 3462 (which itself is marked as a blocker). Kamal - is this something you can look into with Michal's or Anil's help (Anil is back from the vacation on Monday). Of course - Michal if you can work on this it will be great. |
| Comment by Keith Burns [ 12/Jun/15 ] |
|
Yep, may as well get this over with. This has gone on far too long. We see flows in CONF, they aren't in OPER. We have to reset controller to get them into OVS. We have tried everything from modifying our matchers, setting up a standard FlowId pattern that automagically generates a FlowId based off the Match passed to it, and we still see this. |
| Comment by Luis Gomez [ 15/Jun/15 ] |
|
So looking at the flows that get alien id (weird flow id) in operational I see 2 issues: 1) What Anil commented this morning that current config<->operational matching function does not look at extensions matches. The workaround for this is to use flow coookies. 2) It seems the mac address mask is not reported in operational and that could also impact the flow matching function: config: } operational: "ethernet-match": { } |
| Comment by Keith Burns [ 18/Jun/15 ] |
|
Was unable to reproduce in lab environment with latest GBP stable/lithium. Assuming that becomes part of the RC tonight, I'm ok after that with closing this. Already updated blocking bug list. |