hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Paul Cowan <co...@aconex.com>
Subject Re: Review Request: HBASE-2468: Improvements to prewarm META cache on clients.
Date Wed, 09 Jun 2010 00:59:08 GMT
Hi tsuna,

>> Changing this class to use ConcurrentHashMap, and completely removing
>> synchronization, won't work, as it's written. The lazy-loading in
>> regionCachePreFetchEnabled() (get - if null, put) won't work correctly with
>> a ConcurrentHashMap without external synchronization.
> I'm not sure to understand why, can you explain?

No problem. With that change, there would be a potential race condition:

Thread 1                                Thread 2
----------                              -------------
enter regionCachePrefetchEnabled
call ConcurrentMap.get() - get null
                                         enter disableRegionCachePrefetch
                                         exit disableRegionCachePrefetch
exit regionCachePrefetchEnabled

which means that after one (nominally 'read-only') call to 
regionCachePrefetchEnabled and one call to disableRegionCachePrefetch 
have completed, the value in the cache for the table is true, when it 
really should be false.

The get-nullcheck-put triad really needs to be atomic, which requires 
either external synchronization or use of some of the other features of 
ConcurrentHashMap like below.

> Slide 31 of the aforementioned presentation reads "Calls putIfAbsent
> every time it needs to read a value" "Unfortunately, this usage is
> very common" so I think this is not a good idea.
> I'm still fairly new to Java so I tend to trust what people like
> Joshua Bloch write, but I can be convinced otherwise :)

That's purely for performance reasons -- you're right, I should have 
    get, if (result == null) {putIfAbsent, get}
rather than
    putIfAbsent, get

Both are correct but as Josh talks about on that slide (and the 
subsequent one) it's needless contention. I would be almost certain that 
it's still faster than a synchronized block, but not as much as other 
options. Should have mentioned that.

Of the 4 options I mentioned (putIfAbsent(), get() and default return 
value if missing, CopyOnWriteArraySet, ConcurrentHashSet), the 
putIfAbsent() method is likely the slowest (given the usage pattern 
anyway), and (more importantly, in my opinion) the least clear anyway. I 
think the ConcurrentSet solution is much cleaner.

Hope that makes thing clear.



View raw message