[CONTROLLER-638] Concurrency issue in RpcProviderRegistryImpl.java Created: 18/Jul/14  Updated: 23/Jul/14  Resolved: 23/Jul/14

Status: Resolved
Project: controller
Component/s: mdsal
Affects Version/s: Helium
Fix Version/s: None

Type: Bug
Reporter: Devin Avery Assignee: Tony Tkacik
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: Mac OS
Platform: PC


External issue ID: 1390

 Description   

With gerrit https://git.opendaylight.org/gerrit/#/c/9093/ we unwittingly fixed a concurrency issue with the publicProxies variable. However, the same concurrency issues exist with rpcRouters as well.

Basically, we synchronize when we write the the map, but we do not synchronize when we read from it. This can lead to inconsistent behavior in this class.



 Comments   
Comment by Tony Tkacik [ 21/Jul/14 ]

Devin, I believe there is not an issue,

there is
1. read, lock, read, write mechanism,
so if unsynchronized read misses, we synchronize and read again
if it miss, create value.

Comment by Devin Avery [ 21/Jul/14 ]

Hi Tony -

This is actually NOT thread safe for a few reason.

1) addRpcImplementation, registerRouterInstantiationListener, method is not synchronized. So even if another thread created the RpcRouter it is not guaranteed to be available in that method.

2) The "read, lock, read, write mechanism" is called double-check locking and that is NOT thread safe. See this for the full details: http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html. So for that reason the getRpcService method is not thread safe.

I changed this back to a critical bug because concurrency issues are really hard to track down, fix and test. So if we find one we should fix it.

Comment by Tom Pantelis [ 22/Jul/14 ]

Devin is correct. This is a form of DCL which is not thread-safe. There must be synchronization between the writes and the reads in order to guarantee reading threads see the writes.

Worse, you could end up with the infamous HashMap endless loop (see http://confusion.tweakblogs.net/blog/2019/nasty-java-hashmap-race-condition.html) if an unsynchronized get or put on one thread happens simultaneous with a put on another thread.

Comment by Tom Pantelis [ 22/Jul/14 ]

Actually the DCL mechanism in this case is OK from a visibility standpoint - if the unsync'ed get returns null, the thread will hit the synchronized block which will result in previous writes by other threads done in that sync block to be visible. However there's likely compiler re-ordering optimizations that could cause issues here.

Regardless, the potential for infinite loop in the HashMap is a definite issue. Basically you must synchronize every HashMap operation (get, put, remove etc) or bad things can happen.

Comment by Tony Tkacik [ 22/Jul/14 ]

https://git.opendaylight.org/gerrit/#/c/9221/

Generated at Wed Feb 07 19:53:31 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.