[CONTROLLER-263] MDSAL Notification pool threads not getting spawned besides the core threads Created: 01/Apr/14 Updated: 25/Jul/23 Resolved: 23/Apr/14 |
|
| Status: | Resolved |
| Project: | controller |
| Component/s: | mdsal |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | ||
| Reporter: | Kamal Rameshan | Assignee: | Kamal Rameshan |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: | Operating System: Mac OS |
| External issue ID: | 645 |
| Priority: | Normal |
| Description |
|
The MDSAL default notification pool is currently configured with 4 core threads and an unbounded work queue. During cbench tests, it is observed that the number of threads working on the queue never goes beyond 4. |
| Comments |
| Comment by Kamal Rameshan [ 01/Apr/14 ] |
|
ThreadPoolExecutor only creates threads beyond the core pool size when the work queue is full and rejects the offered task. Since the notification queue is unbounded, it never fills up, so no new threads are ever spawned besides the 4 core ones. |
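A minimal, self-contained sketch of the behavior described above, assuming an unbounded LinkedBlockingQueue; the maximum pool size, task count, and sleep durations are illustrative values for the demo, not the actual MDSAL notification pool settings:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class UnboundedQueueDemo {
    public static void main(String[] args) throws InterruptedException {
        // 4 core threads, up to 32 max threads, unbounded queue.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4, 32, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>());

        // Submit far more work than 4 threads can drain quickly.
        for (int i = 0; i < 1_000; i++) {
            pool.execute(() -> {
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        Thread.sleep(500);
        // Prints 4: threads beyond the core size are only created when
        // offering to the queue fails, which never happens here.
        System.out.println("pool size = " + pool.getPoolSize());
        pool.shutdownNow();
    }
}
```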
| Comment by Robert Varga [ 02/Apr/14 ] |
|
I am not sure whether having an unbounded queue is the correct behavior here. |
| Comment by Robert Varga [ 02/Apr/14 ] |
|
Sorry for the incomplete comment. While I see the problem here, and the fix yields somewhat better throughput, I am afraid we are not introducing a backpressure mechanism, and at some point we'll see producers overwhelm consumers, creating a huge backlog. I think we need a more complete solution, probably a per-producer queue and some sort of fair-queuing dispatch threadpool, which would: 1) prevent a single producer from monopolizing the notification dispatch threadpool. I am thinking something along the lines of token buckets. Tony: where would be a good place to put such a rate-limiting mechanism? Kamal: can you introduce a knob which would place a limit on the queue? ThreadPoolExecutor.CallerRunsPolicy looks like a good first cut at the rejection policy in that case. |
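A rough sketch of the kind of knob suggested above, assuming a bounded ArrayBlockingQueue with CallerRunsPolicy as the rejection policy; the class name, method signature, and keep-alive value are hypothetical, not the actual MDSAL configuration:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedNotificationPool {
    /**
     * Builds a notification executor with a configurable queue limit.
     * When the queue is full, CallerRunsPolicy makes the submitting
     * (producer) thread execute the notification itself, which acts as
     * crude backpressure.
     */
    public static ThreadPoolExecutor create(int coreThreads, int maxThreads, int queueLimit) {
        return new ThreadPoolExecutor(
                coreThreads, maxThreads,
                60, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueLimit),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }
}
```

Note that running rejected work on the producer's thread is exactly the thread-boundary concern raised later in this thread.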
| Comment by Kamal Rameshan [ 02/Apr/14 ] |
|
The patch is to fix a bug in the current design. I am in total agreement that an unbounded queue is not the approach we want to take, as an OOM error is not far off. Rather than creating a queue per producer, my take would be to put a RateLimiter (com.google.common.util.concurrent) on each producer. We still need to come up with the throttling logic. I can take this up as an enhancement if we agree on the design. But for now, I think we should have a fix for the threadpool, since it is currently broken. |
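A minimal sketch of the per-producer RateLimiter idea, using com.google.common.util.concurrent.RateLimiter; the class name and the permits-per-second value are hypothetical, since the throttling logic has not been agreed upon:

```java
import com.google.common.util.concurrent.RateLimiter;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PerProducerRateLimiter {
    // One RateLimiter per producer, all sharing the same configured rate.
    private final Map<Object, RateLimiter> limiters = new ConcurrentHashMap<>();
    private final double permitsPerSecond;

    public PerProducerRateLimiter(double permitsPerSecond) {
        this.permitsPerSecond = permitsPerSecond;
    }

    /** Blocks the producer until it is allowed to publish another notification. */
    public void acquire(Object producer) {
        limiters.computeIfAbsent(producer, p -> RateLimiter.create(permitsPerSecond))
                .acquire();
    }
}
```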
| Comment by Kamal Rameshan [ 03/Apr/14 ] |
|
Working on a patch to implement CallerRunsPolicy or some other way to throttle. Will run some cbench tests to see whether it impacts performance significantly. |
| Comment by Kamal Rameshan [ 07/Apr/14 ] |
|
The CallerRunsPolicy implementation causes thread boundaries to be violated and also introduces performance issues. I will raise a separate 'enhancement' bug to introduce a throttle as Robert suggested. For now, just fixing the thread pool for the service release. |
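A minimal sketch illustrating the thread-boundary problem with CallerRunsPolicy: once the pool rejects a task, that task runs on the submitting (producer) thread rather than a pool thread. The tiny pool size and SynchronousQueue are chosen only to force rejection in the demo and are not the MDSAL settings:

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CallerRunsDemo {
    public static void main(String[] args) {
        // Single-threaded pool with no queue capacity, so the second
        // submission is rejected while the first task is still running.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0, TimeUnit.SECONDS,
                new SynchronousQueue<>(),
                new ThreadPoolExecutor.CallerRunsPolicy());

        Runnable task = () -> {
            System.out.println("running on " + Thread.currentThread().getName());
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        pool.execute(task);  // runs on the single pool thread
        pool.execute(task);  // rejected -> runs on "main", i.e. the producer thread
        pool.shutdown();
    }
}
```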