directory-api mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Emmanuel Lecharny <>
Subject Re: DN and valueOf( "<a DN>" ) method
Date Mon, 09 Aug 2010 14:53:39 GMT
  On 8/9/10 4:16 PM, Matthew Swift wrote:
> On 28/07/10 13:07, Emmanuel Lecharny wrote:
>>  On 7/28/10 11:31 AM, Stefan Seelmann wrote:
>>>> I was thinking lately about the DN class. I know that OpenDS (and 
>>>> probably
>>>> UnboundId, but not sure) has a DN.valueOf( "<a DN>" ) factory that

>>>> returns a
>>>> DN potentially leveraging a cache associated to a ThreadLocal.
>>> ...
>>>> I don't think it's such a good idea :
>>>> - first, as it's ThreadLocal based, you will have as many cache as 
>>>> you have
>>>> threads processing requests. Not sure it competes with a unique 
>>>> cache, not
>>>> sure either we can't use the memory in a better way...
>>> An advantage to use ThreadLocal is that you don't need to synchronize
>>> access to the cache Could be worth to measure the performance
>> Using ConcurrentHashMap should not be a major performance penalty. I 
>> mean, it *will* be more costly than not having any synchronization 
>> but it sounds acceptable.
> Unfortunately a CHM won't help either since you need to manage cache 
> eviction, assuming that you want the cache to have a finite size. 
> LinkedHashMap has an eviction strategy which can be defined by 
> overriding the removeEldestEntry method, but unfortunately LHM is not 
> thread safe.

Doh !!! I should have thought about it immediately... That's the problem 
when you are pushing random thoughts on the ML instead of *really* 
coding them.
>> difference, I wonder if the OpenDS team did some performance analysis?
> I did some testing some time back and I have forgotten the exact 
> figures that I got. I do remember finding a substantial performance 
> improvement when parsing DNs when caching is enabled - something like 
> 30ns with caching vs 300ns without for DNs containing 4 RDN components 
> (i.e. about an order of magnitude IIRC).
> We implement our DNs using a recursive RDN + parent DN structure so we 
> are usually able to fast track the decoding process to just a single 
> RDN for DNs having a common ancestor (pretty common).

Our idea was that the first step would be to quickly compute the DN (as 
a String) hashcode, and check if it has already been parsed. If not, 
then we fallback to a plain parsing. But having the low level DN stored 
in the cache is a good idea.

There are definitively many options, we should conduct some perf tests 
based on real world DN to see what's the best.
> We opted for the ThreadLocal approach due to the synchronization 
> limitations of using a single global cache. However, I have often 
> worried about this approach as it will not scale for applications 
> having large numbers of threads, resulting in OOM exceptions.
yes, true. But this is also a fast-track solution, bringing immediate 
> Another approach I have thought about is to use a single global 
> two-level cache comprising of a fixed size array of LinkedHashMaps 
> (think of it as a Map of Maps) each one having its own 
> synchronization. We then distribute the DNs over the LHMs and amortize 
> the synchronization costs across multiple locks (in a similar manner 
> to CHM).
Another aspect we are interested in is the pining of frequently used DN 
(cn=schema, etc). Not sure it worth the effort though...
> This idea needs testing. In particular, we'd need to figure out the 
> optimal array size (i.e. number of locks / LHMs). For example, 
> distributing the cache over 16 LHMs is not going to help much for 
> really big multi-threaded apps containing 16000+ threads (1000 threads 
> contending per lock).
But are you going to have 16K threads anyway ?
> A major problem with this approach if we choose to use it in the 
> OpenDS SDK is that common ancestor DNs (e.g. "dc=com") are going to 
> end up in a single LHM so, using our current design (RDN + parent DN), 
> all decoding attempts will usually end up contending on the same lock 
> anyway :-( So we may need to change our DN implementation to better 
> cope with this caching strategy.
> We are not alone though: a concurrent Map implementation which can be 
> used for caching in a similar manner to LHM is one of the most 
> frequently demanded enhancements to the java.util.concurrent library.
There might be some other data structure available, we may need to do 
some research in this area...

Emmanuel L├ęcharny

View raw message