NetVirt CSIT Improvements
(NETVIRT-1086)
| Status: | In Progress |
| Project: | netvirt |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Fluorine-SR2, Neon |
| Type: | Sub-task | Priority: | Medium |
| Reporter: | Jamo Luhrsen | Assignee: | Jamo Luhrsen |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
| Description |
|
initially, just add the call. we can save the results to a file and log them like we do with other but hopefully, at some point, we can fail if the output is non-empty, which probably |
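A rough sketch of the two stages described above: first only save the output of the call to a file, and later optionally fail when that output is non-empty. This is only an illustration in Python, not the actual CSIT change; `save_and_check`, its parameters, and the log file name are hypothetical, and it assumes that an empty output means nothing was reported.

```python
import tempfile
from pathlib import Path


def save_and_check(output: str, log_file: Path, fail_on_output: bool = False) -> bool:
    """Save the console output to log_file and report whether it was non-empty.

    Hypothetical helper: assumes an empty (or whitespace-only) output means
    that nothing was reported by the call.
    """
    # Stage one: always keep the raw output so it can be inspected later.
    log_file.write_text(output)
    non_empty = bool(output.strip())
    # Stage two (off by default): turn a non-empty output into a failure.
    if non_empty and fail_on_output:
        raise AssertionError(f"unexpected output, see {log_file}")
    return non_empty


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "trace_transactions.log"  # hypothetical file name
        print(save_and_check("", log))
```

Keeping `fail_on_output` off by default matches the "initially, just add the call" step; flipping it on is the later, stricter step.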
| Comments |
| Comment by Jamo Luhrsen [ 18/May/18 ] |
|
I was remembering a jira ticket about this task that should have comments or updates. this one is
Anyway, smalleni mentioned he might want to look at this as a task to get familiar with CSIT,
However, we were able to quickly run this patch in the sandbox again and I can confirm again that it
smalleni, feel like digging around? |
| Comment by Michael Vorburger [ 22/May/18 ] |
|
I'm interested in getting this done, so motivated to help. Had a look at the "troubles/failures" linked above - 13 CSIT failures? What could perhaps be useful as a next step is if one of you could look through the karaf.log (is it here?), and somehow |
| Comment by Jamo Luhrsen [ 23/May/18 ] |
|
ok, so first of all MY BAD in that the job I linked to with failures did not actually even install the odl-mdsal-trace feature. I did look at the karaf.log (yes, vorburger, you have the right place for those logs) but nothing stood out. The first few failures are end-user kind of failures where openstack instances have connectivity problems. We could
Anyway, just thinking out loud for now. I am running this again here so let's wait on that before we dig any further. yes, I am running it with odl-mdsal-trace. |
| Comment by Jamo Luhrsen [ 24/May/18 ] |
|
the new job finished, had failures and really did do the mdsal-trace feature install. All the logs are here
I have not dug very deep, but I did notice a few things:
let me know what you think vorburger and we can think of ideas to keep digging. |
| Comment by Michael Vorburger [ 04/Jun/18 ] |
|
jluhrsen as per internal / private email: What I'm most interested in first here (we'll get back to transaction leaks after) is that you've managed to hit "frozen class"
Just for the record here and as previously stated elsewhere: It COULD be that frozen ONLY happens with features-mdsal-trace... If you are able to try it once completely without and once with odl-mdsal-trace, we ( |
| Comment by Michael Vorburger [ 04/Jun/18 ] |
|
> in the final call to trace:transactions we can see a LOT (I think it's a lot) of unclosed transactions
I've opened |
| Comment by Jamo Luhrsen [ 04/Jun/18 ] |
|
working on it with this sandbox job |
| Comment by Jamo Luhrsen [ 04/Jun/18 ] |
|
I ran one job with the mdsal-trace feature installed and one job without that feature. It looks like the failures are the same, so I no longer suspect this feature is causing any more problems. I'm guessing the failures (which aren't expected) are coming because of all the extra trace:transactions commands via ssh to karaf console. We'll have to dig in to that idea next. |
| Comment by Jamo Luhrsen [ 05/Jun/18 ] |
|
vorburger pinged me on IRC to ask if this jira was about fluorine and/or oxygen. this is an int/test patch and by |
| Comment by Michael Vorburger [ 06/Jun/18 ] |
|
Running this on top of Fluorine AND Oxygen is great IMHO! |
| Comment by Michael Vorburger [ 30/Jul/18 ] |
|
|
| Comment by Jamo Luhrsen [ 31/Oct/18 ] |
|
I finally had some time to look at this one again. vorburger, over in
Looks like there are other open transactions from the get go, but they do not increase. So possibly those are leak-able, but just not with the normal netvirt csit flow.
Unfortunately, something about this work (installing odl-mdsal-trace and issuing that karaf cli on each test case teardown) is causing a problem, but in the meantime we probably want to figure out if we have a leak to fix, and if not, how to deal with it (growing from |
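To make the "open from the get go, but they do not increase" distinction concrete, here is a minimal Python sketch that compares two samples of open-transaction counts and keeps only the entries that grew. The function name and the dictionary keys are hypothetical, and how the counts would be extracted from the trace:transactions output is deliberately left out, since that format is not shown in this thread.

```python
from typing import Dict


def growing_entries(baseline: Dict[str, int], later: Dict[str, int]) -> Dict[str, int]:
    """Return the entries whose open-transaction count increased, with the increase.

    Keys are hypothetical labels for where a transaction was opened.
    Entries that are present but stable (open from the start, not growing)
    are left out.
    """
    growth = {}
    for key, count in later.items():
        delta = count - baseline.get(key, 0)
        if delta > 0:
            growth[key] = delta
    return growth
```

Under that assumption, taking one sample early in the run and one at the end would separate stable open transactions from ones that keep growing.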
| Comment by Michael Vorburger [ 31/Oct/18 ] |
|
I've opened |
| Comment by Jamo Luhrsen [ 31/Oct/18 ] |
|
great thanks. if it's less critical (as in, won't get fixed) then we'll need to ignore it when checking because we don't want to have |
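One possible way to "ignore it when checking", sketched in Python: drop every output line that matches a known, accepted entry and only look at what is left. The function name and the substring-based matching are assumptions for illustration, not how the CSIT suite actually does it.

```python
from typing import Iterable, List


def unexpected_lines(output: str, ignored: Iterable[str]) -> List[str]:
    """Return the non-blank output lines that match none of the ignored substrings.

    Hypothetical helper: 'ignored' holds substrings identifying known entries
    that are not going to be fixed and should not fail the check.
    """
    patterns = list(ignored)
    return [
        line
        for line in output.splitlines()
        if line.strip() and not any(p in line for p in patterns)
    ]
```

A check built on this would fail only when the returned list is non-empty, so a known entry stays visible in the saved output without breaking every run.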
| Comment by Jamo Luhrsen [ 21/Dec/18 ] |
|
~2 months later, taking another look. Trying to put this in the apex job first. step one complete