[OPNFLWPLUG-790] failed to update port status - all new OVS ports fail from this point Created: 09/Oct/16 Updated: 27/Sep/21 Resolved: 19/Jun/17 |
|
| Status: | Resolved |
| Project: | OpenFlowPlugin |
| Component/s: | General |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | ||
| Reporter: | Arnon Yahel | Assignee: | Arnon Yahel |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Operating System: All |
||
| Attachments: |
|
||||||||||||
| Issue Links: |
|
||||||||||||
| External issue ID: | 6908 | ||||||||||||
| Description |
|
openflowplugin fails to update port status on creation of VM using netvirt. this is the relevant lines in the jaraf.log |
| Comments |
| Comment by Arnon Yahel [ 09/Oct/16 ] |
|
Attachment karaf.logs.tar has been added with description: karaf logs |
| Comment by dotans [ 08/Nov/16 ] |
|
I also encountered that problem: |
| Comment by dotans [ 08/Nov/16 ] |
|
Attachment qsvc.zip has been added with description: q-svc log (fail @08/11, 9:22) |
| Comment by dotans [ 08/Nov/16 ] |
|
Attachment karaf.zip has been added with description: Karaf log (fail @08/11, 9:22) |
| Comment by dotans [ 13/Nov/16 ] |
|
This bug is reproduced when using this script: ;do |
| Comment by Koby Aizer [ 16/Nov/16 ] |
|
Attachment karaf.zip has been added with description: karaf.log - TransactionChainManager in DEBUG level |
| Comment by Alon Kochba [ 06/Dec/16 ] |
|
openflowplugin can't figure it out - need to try to recreate with a running tcpdump and debug logs for whole openflowplugin.impl From Shuva: "Sorry for the late response, got pulled into multiple things. Analysing the logs attached to this bug I was not able to come to any conclusion. To analyse the issue further we need the logs by setting the log level as I assume that we see the issue when the number of switches(DPNs) are high, hence we need to check what is the typical delay between the MASTERSHIP grant and the ports coming up and adjust the delay accordingly. We can also get the relevant information in a wireshark capture that would demarcate when the ROLE_REQ is processed by the switch and when the PORT STATUS is sent." |
| Comment by Alon Kochba [ 11/Dec/16 ] |
|
This was reproduced using Dotan's instructions, with only 1 compute (one devstack machine that is "all in one" - controller+compute). An updated karaf.log with all openflowplugin set to DEBUG was provided via mail on 12/8 (too large to attach here), as requested by Shuva. Updating to blocker to at least get openflowplugin's feedback as this bug has blocked us from proceeding several times. Thanks. |
| Comment by A H [ 13/Dec/16 ] |
|
Is this only for Carbon or does it apply to Boron-SR2 as well? |
| Comment by Alon Kochba [ 13/Dec/16 ] |
|
(In reply to A H from comment #10) This applies to Boron SR2. When it occurs, that DPN is stuck in a zombie state and cannot be used further. |
| Comment by A H [ 19/Dec/16 ] |
|
As per the discussion on the following email thread below, this bug is being retargeted for Boron SR3: https://lists.opendaylight.org/pipermail/openflowplugin-dev/2016-December/006490.html |
| Comment by Abhijit Kumbhare [ 11/Jan/17 ] |
|
Is this still blocking? |
| Comment by Koby Aizer [ 11/Jan/17 ] |
|
I think we can say this is no longer blocking (at least for netvirt), since we have a workaround (disabling statistics polling in openflowplugin). I believe it is still critical, as statistics collection is on by default, and users may encounter those issues which make the entire OVS unusable. CC'd Sam & Alon - maybe they'd like to comment as well. |
| Comment by Tomas Slusny [ 21/Mar/17 ] |
|
Changing importance to critical from blocking as Koby said it is non blocking anymore. |
| Comment by Sam Hague [ 22/Mar/17 ] |
|
Is this on boron or just master? Is the workaround a simple config option? What deployment and setup does this bug happen? Answers to the above would help to decide lowering the priority. |
| Comment by Tomas Slusny [ 28/Mar/17 ] |
|
Can you test this again on this patch please? https://git.opendaylight.org/gerrit/#/c/53905/ I added collecting of early port status updates and after initial statistics gathering is done these collected port status updates will be written to operational datastore. |
| Comment by dotans [ 28/Mar/17 ] |
|
I will test this patch. |
| Comment by dotans [ 02/Apr/17 ] |
|
I'm sorry for the delay- |
| Comment by Tomas Slusny [ 03/Apr/17 ] |
|
Okay great |
| Comment by dotans [ 04/Apr/17 ] |
|
It looks good. |
| Comment by Tomas Slusny [ 24/Apr/17 ] |
|
So patch was merged, is it working and can this one be closed? |
| Comment by Tomas Slusny [ 24/Apr/17 ] |
|
Last patch will probably not work properly with statistics gathering disabled, so I need to make another one that tries to solve this better. |
| Comment by Tomas Slusny [ 27/Apr/17 ] |
| Comment by Tomas Slusny [ 17/May/17 ] |
|
Patch was merged, but there is also cherry-pick for stable/carbon, so it needs to be updated and merged too before this can be closed: https://git.opendaylight.org/gerrit/#/c/56132/ |
| Comment by Tomas Slusny [ 05/Jun/17 ] |
|
Closing this. |
| Comment by Tomas Slusny [ 05/Jun/17 ] |
|
Forgot that this is still waiting for carbon merge, so leaving this in review. |