[CONTROLLER-1449] Reads of medium-to-large data sets fail on nodes with replica shards Created: 12/Nov/15 Updated: 16/Feb/16 Resolved: 16/Feb/16 |
|
| Status: | Resolved |
| Project: | controller |
| Component/s: | clustering |
| Affects Version/s: | Beryllium |
| Fix Version/s: | None |
| Type: | Bug | ||
| Reporter: | Jan Medved | Assignee: | Tom Pantelis |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Operating System: All |
||
| Attachments: |
dsbenchmark.py |
| External issue ID: | 4627 |
| Priority: | Highest |
| Description |
|
When running dsbenchmark on a node with a replica shard, the READ operation on a medium-sized list fails. When I try to do many reads (of items in a list) against a remote leader (i.e. in my 3-node test cluster I issued the reads on 10.194.126.98 or 10.194.126.99, which are replica nodes), the operation never finishes. The dsbenchmark READ test dumps a 10k-element list into the data store and then tries to read the elements one by one. A list of 1,000 items works fine; a list of 10,000 items does not. Even with the 10k-item list, I can see the list items through RESTCONF on the leader node (by issuing a REST read request on 10.194.126.97), but the programmatic read from a remote node does not work. To reproduce, install dsbenchmark and run the attached script first on the leader node and then on the replica nodes with the following command line: ./dsbenchmark.py --host 10.194.126.98 --txtype SIMPLE-TX --inner 1 --optype READ --warmup 1 --runs 3 --total 10000 --ops 10 100 |
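For reference, the per-item read pattern the READ test exercises is roughly the sketch below. This is a minimal illustration only, assuming the Beryllium-era MD-SAL binding API (DataBroker/ReadOnlyTransaction); InnerList, the innerListPath() helper, and LOG are hypothetical stand-ins for the actual dsbenchmark YANG-generated types, instance identifiers, and logger.

    // Sketch of reading list elements one by one with a read-only transaction.
    // InnerList and innerListPath(i) are hypothetical stand-ins for the
    // dsbenchmark-generated binding classes and instance identifiers;
    // LOG is assumed to be an SLF4J logger.
    ReadOnlyTransaction tx = dataBroker.newReadOnlyTransaction();
    try {
        for (long i = 0; i < numItems; i++) {
            InstanceIdentifier<InnerList> path = innerListPath(i);
            Optional<InnerList> item =
                    tx.read(LogicalDatastoreType.CONFIGURATION, path).checkedGet();
            if (!item.isPresent()) {
                LOG.warn("Inner list item {} not found", i);
            }
        }
    } catch (ReadFailedException e) {
        LOG.error("Read failed", e);
    } finally {
        tx.close();
    }

With the 10k-item list, it is this loop of per-item reads issued from a replica node that never completes.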
| Comments |
| Comment by Jan Medved [ 12/Nov/15 ] |
|
Attachment dsbenchmark.py has been added with description: script to drive dsbenchmark |
| Comment by Gary Wu [ 25/Jan/16 ] |
|
Sorry, it looks like I may not have time to work on this in the immediate future. |
| Comment by Tom Pantelis [ 16/Feb/16 ] |
|
Submitted https://git.opendaylight.org/gerrit/#/c/34757/. The problem was that the read-only transaction was being closed prematurely because the wrong referent was used for the PhantomReference cleanup mechanism, so the more reads that were done, the greater the chance that the referent would be GC'ed and the transaction closed underneath them. After the fix, a test run with 50K items completed successfully. |
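For readers unfamiliar with the failure mode described above: this style of cleanup registers a PhantomReference against a referent object and auto-closes the transaction once that referent becomes unreachable. If the referent is not the object the caller actually keeps a strong reference to, the GC can collect it while reads are still in flight, and the sweep closes the transaction mid-use. The following is a minimal, self-contained sketch of the general pattern, not the actual controller code; names like TransactionCleaner are illustrative.

    import java.lang.ref.PhantomReference;
    import java.lang.ref.ReferenceQueue;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Illustrative sketch of PhantomReference-based auto-close (not the actual
    // controller code). A sweep closes the tracked transaction once the chosen
    // referent is garbage collected; picking a referent the caller does not
    // hold on to means the GC may collect it while the transaction is still in
    // use, closing it prematurely.
    final class TransactionCleaner {
        private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
        private final Map<PhantomReference<Object>, AutoCloseable> tracked =
                new ConcurrentHashMap<>();

        // Close 'transaction' once 'referent' becomes unreachable.
        void track(Object referent, AutoCloseable transaction) {
            tracked.put(new PhantomReference<Object>(referent, queue), transaction);
        }

        // Called periodically (or from a dedicated thread) to close transactions
        // whose referents have been collected.
        void sweep() throws Exception {
            PhantomReference<?> ref;
            while ((ref = (PhantomReference<?>) queue.poll()) != null) {
                AutoCloseable tx = tracked.remove(ref);
                if (tx != null) {
                    tx.close(); // premature if the caller is still reading through tx
                }
            }
        }
    }

Per the fix description above, the correction amounts to registering the cleanup against the object the application actually holds, so the referent stays reachable for exactly as long as the transaction is in use.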