[BGPCEP-493] Topology freezes or slows down due to java.util.concurrent.TimeoutException Created: 19/Jul/16 Updated: 03/Mar/19 Resolved: 10/Aug/16 |
|
| Status: | Resolved |
| Project: | bgpcep |
| Component/s: | BGP |
| Affects Version/s: | Bugzilla Migration |
| Fix Version/s: | Bugzilla Migration |
| Type: | Bug | ||
| Reporter: | Vratko Polak | Assignee: | Claudio David Gasparini |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: | Operating System: All |
| External issue ID: | 6237 | ||||||||
| Description |
|
Possibly a bug in the distributed datastore or another offset-0 component. This is a Boron regression in the Singlepeer Changecount suite. Symptoms differ slightly between the "mixed" job [0] (topology practically freezes) and the "non-mixed" job [1] (topology only slows down). The TimeoutException occurs in karaf.log, followed by several other ERRORs. This does not happen in the Prefix Count suite (or anywhere in Beryllium tests). Segments of the logs will be attached, as the whole logs are very big and even the segments take too many lines to paste here. Both segments end with two occurrences of the "raced with transaction PingPongTransaction" error; in the real logs there were many more such occurrences. [0] https://jenkins.opendaylight.org/releng/view/bgpcep/job/bgpcep-csit-1node-periodic-bgp-ingest-mixed-only-boron/ |
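Because the full logs are too large to read by hand, the pattern described above (a TimeoutException followed by further ERRORs and repeated "raced with" messages) could be counted with a small script. The following is only a sketch under stated assumptions: the function name `summarize_karaf_log`, the pipe-delimited `| ERROR |` level column, and the sample lines are illustrative and are not taken from the attached logs.

```python
import re
from typing import Iterable

# Exception class name as given in the issue title.
TIMEOUT_MARKER = "java.util.concurrent.TimeoutException"
# Assumption: the error text contains "raced with transac... PingPongTransaction";
# the pattern tolerates spelling variants of "transaction".
RACE_PATTERN = re.compile(r"raced with transac\w*\s+PingPongTransaction")
# Assumption: the log level appears as a pipe-delimited column.
ERROR_PATTERN = re.compile(r"\|\s*ERROR\s*\|")


def summarize_karaf_log(lines: Iterable[str]) -> dict:
    """Count timeouts, ERROR lines after the first timeout, and race messages."""
    first_timeout_line = None
    timeouts = 0
    errors_after_timeout = 0
    races = 0
    for number, line in enumerate(lines, start=1):
        is_timeout = TIMEOUT_MARKER in line
        if is_timeout:
            timeouts += 1
        if first_timeout_line is None:
            # Remember where the first timeout appears; do not count it as "after".
            if is_timeout:
                first_timeout_line = number
        elif ERROR_PATTERN.search(line):
            errors_after_timeout += 1
        if RACE_PATTERN.search(line):
            races += 1
    return {
        "first_timeout_line": first_timeout_line,
        "timeouts": timeouts,
        "errors_after_timeout": errors_after_timeout,
        "races": races,
    }


# Hypothetical sample lines, for illustration only.
sample = [
    "2016-07-19 10:00:00,000 | INFO  | thread-1 | Example | route update applied",
    "2016-07-19 10:00:05,000 | ERROR | thread-2 | Example | java.util.concurrent.TimeoutException: timed out",
    "2016-07-19 10:00:06,000 | ERROR | thread-3 | Example | raced with transaction PingPongTransaction{...}",
    "2016-07-19 10:00:07,000 | ERROR | thread-3 | Example | raced with transaction PingPongTransaction{...}",
]
summary = summarize_karaf_log(sample)
print(summary)
```

For a real log, the function can be fed an open file object (for example `summarize_karaf_log(open("karaf.log", errors="replace"))`), which streams the file line by line and avoids loading a multi-gigabyte log into memory.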
| Comments |
| Comment by Vratko Polak [ 19/Jul/16 ] |
|
Attachment logs_20160719.tar.xz has been added with description: Archive with the promised karaf.log segments |
| Comment by Milos Fabian [ 29/Jul/16 ] |
|
Route injection (application peer) use case: error reproduced with Boron. With the BGP topology provider unplugged, the test sometimes passes (no errors, the RPC finishes). The same test runs with Beryllium (latest snapshot) without any problems. More investigation is needed. |
| Comment by Milos Fabian [ 01/Aug/16 ] |
|
Memory footprint analysis and comparison (Beryllium vs Boron) shows a significant rise in Boron's memory consumption. This results in the observed timeout errors, which are caused by excessive GC activity (application inactivity). |
| Comment by Peter Gubka [ 04/Aug/16 ] |
|
Attachment karaf1.log.zip has been added with description: NPE karaf log |
| Comment by Peter Gubka [ 04/Aug/16 ] |
|
I forgot to add a larger comment. The attached log is a part of the whole log; it has an NPE as the first exception. The log is taken from https://logs.opendaylight.org/sandbox/jenkins091/bgpcep-csit-1node-periodic-bgp-ingest-mixed-only-boron/5/archives/. Unfortunately, the full log is 11M zipped (1.4G as text) and will be deleted during the weekend. What happened in that suite: the tool started to advertise routes, and after around 500k routes the tool-side output looked like: Hopefully the attached log will show what problems occurred in ODL. |
| Comment by Claudio David Gasparini [ 05/Aug/16 ] |
| Comment by Claudio David Gasparini [ 05/Aug/16 ] |
|
Boron Be |
| Comment by Milos Fabian [ 10/Aug/16 ] |
|
stable/boron: https://git.opendaylight.org/gerrit/#/c/43590/ |