directory-dev mailing list archives

From Kiran Ayyagari <kayyag...@apache.org>
Subject Re: Cache and partitions...
Date Tue, 10 Sep 2013 12:55:11 GMT
On Mon, Sep 9, 2013 at 11:13 AM, Emmanuel Lécharny <elecharny@gmail.com> wrote:

> On 9/9/13 4:46 AM, Kiran Ayyagari wrote:
> > On Sun, Sep 8, 2013 at 1:38 PM, Emmanuel Lécharny <elecharny@gmail.com> wrote:
> >
> >> Hi guys,
> >>
> >> we need to use a cache in the partitions, for entries, aliases, and also
> >> for whatever cache the partitions could need (assuming that this can be
> >> configurable: for instance, JDBM has its own cache, but Mavibot does not).
> >>
> >> This cache could be handled by the CacheService, which is created in the
> >> ApacheDsService, the DefaultDirectoryServiceFactory, or in the
> >> DefaultDirectoryService if it's not injected in this class.
> >>
> > AbstractBTreePartition already has support for an entry cache, and the Index
> > implementations likewise support storing tuples in the cache.
>
> Actually, AbstractBTreePartition has the following methods:
>
>     public void addToCache( String id, Entry entry )
>     {
>     }
>
> and
>     public Entry lookupCache( String id )
>     {
>         return null;
>     }
>
> As you can see, this class is ready to support an entry cache, but does not
> have any...
>
Yes, I just left them empty so that subclasses can decide whether to use an
entry cache or not, instead of forcing one on them.
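
For illustration only, here is a minimal sketch of how a subclass might opt
in to an entry cache by overriding the two hooks. SketchPartition and
CachingSketchPartition are hypothetical names, and Object stands in for the
real Entry type so that the example is self-contained; this is not the
actual ApacheDS code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for AbstractBTreePartition: the cache hooks are no-ops.
abstract class SketchPartition
{
    public void addToCache( String id, Object entry )
    {
        // intentionally empty: no entry cache unless a subclass provides one
    }

    public Object lookupCache( String id )
    {
        // a null result means "not cached", the caller then reads from the backend
        return null;
    }
}

// Hypothetical subclass that chooses to keep an entry cache.
class CachingSketchPartition extends SketchPartition
{
    // assumption: a plain unbounded map; a real cache would bound and evict
    private final Map<String, Object> entryCache = new ConcurrentHashMap<>();

    @Override
    public void addToCache( String id, Object entry )
    {
        entryCache.put( id, entry );
    }

    @Override
    public Object lookupCache( String id )
    {
        return entryCache.get( id );
    }
}
```

A partition that does not override the hooks simply never caches, which is
the default behaviour described above.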

> The indexes can be configured so that their internal cache can benefit
> from this configuration, but again, this is partition implementation
> dependent. Now, thinking about it, as Mavibot is pretty much a standalone
> project, I'm not sure it is worth propagating the ApacheDS cache to Mavibot.
>
Mavibot should use the cache for storing its internal constructs (Node,
PageIO, etc.), and a Mavibot-based partition can maintain an entry cache
alongside Mavibot's internal cache (which is invisible to the partition).
Mavibot can also be written so that it accepts an external cache manager,
or creates one when none is available (say, in standalone mode), so ApacheDS
can inject a cache manager from its cache service.
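
A rough sketch of the "inject or create" idea, under stated assumptions:
SketchCacheManager and SketchRecordManager are hypothetical names, and the
cache manager is reduced to a map of named maps rather than the real
Ehcache manager. It only shows the wiring, not the actual Mavibot API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical minimal cache manager: hands out named caches.
class SketchCacheManager
{
    private final Map<String, Map<Object, Object>> caches = new ConcurrentHashMap<>();

    public Map<Object, Object> getCache( String name )
    {
        // create the named cache on first use, then always return the same one
        return caches.computeIfAbsent( name, n -> new ConcurrentHashMap<>() );
    }
}

// Hypothetical stand-in for the Mavibot side.
class SketchRecordManager
{
    private final SketchCacheManager cacheManager;
    private final boolean ownsCacheManager;

    // Standalone mode: nobody injects a cache manager, so one is created.
    public SketchRecordManager()
    {
        this( null );
    }

    // Embedded mode: the server passes in the manager from its cache service.
    public SketchRecordManager( SketchCacheManager injected )
    {
        this.ownsCacheManager = ( injected == null );
        this.cacheManager = ownsCacheManager ? new SketchCacheManager() : injected;
    }

    public SketchCacheManager getCacheManager()
    {
        return cacheManager;
    }

    // Only a manager created here should be shut down here.
    public boolean ownsCacheManager()
    {
        return ownsCacheManager;
    }
}
```

Either way there is a single cache manager in play: the injected one when
embedded, or the self-created one when standalone.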

> The question then is to know if it's a big deal to have 2 ehCache
> instances : one for the server and one in mavibot.
>
No, there will be only one cache manager (either injected into or created
by Mavibot).

>
> >
> >> On a side note, the CacheService is a wrapper on top of EhCache, which
> >> is, IMO, not good enough: it should be an interface, with some factory
> >> which creates various CacheService instances, one of them
> >> being based on EhCache. In case we want to use another cache later, or
> >> design our own, then we would just instantiate the correct CacheService
> >> instance using the factory.
> >>
> > +0 IMHO, I don't see any gain in this, _very_ few might go to this extent
> > of changing the cache
>
> The gain is nil, but in case we decide later to switch to another cache
> (if the ehcache license changes to something incompatible with AL 2.0, for
> instance, like what we had with the Tanuki wrapper), then it would be
> easier to have an interface.
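
To make the interface-plus-factory proposal concrete, here is a small
sketch. All names (SketchCacheService, MapBackedCacheService,
SketchCacheServiceFactory, the "map" kind) are hypothetical, and the only
implementation shown is backed by plain maps; an EhCache-backed one would
be a second implementation behind the same interface.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical cache abstraction: callers never see the underlying library.
interface SketchCacheService
{
    void put( String cacheName, Object key, Object value );

    Object get( String cacheName, Object key );
}

// Placeholder implementation backed by maps, for illustration only.
class MapBackedCacheService implements SketchCacheService
{
    private final Map<String, Map<Object, Object>> caches = new ConcurrentHashMap<>();

    @Override
    public void put( String cacheName, Object key, Object value )
    {
        caches.computeIfAbsent( cacheName, n -> new ConcurrentHashMap<>() ).put( key, value );
    }

    @Override
    public Object get( String cacheName, Object key )
    {
        Map<Object, Object> cache = caches.get( cacheName );
        return ( cache == null ) ? null : cache.get( key );
    }
}

// Hypothetical factory: the one place that knows which implementation is used.
class SketchCacheServiceFactory
{
    static SketchCacheService create( String kind )
    {
        switch ( kind )
        {
            case "map":
                return new MapBackedCacheService();

            // an EhCache-backed implementation would be one more case here
            default:
                throw new IllegalArgumentException( "Unknown cache kind: " + kind );
        }
    }
}
```

Switching to another cache later would then mean adding an implementation
and a factory case, without touching the partitions.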
> >
> >> That being said, I think the CacheService should be propagated down to
> >> the partitions for them to use it - or not.
> >>
> > This is already the case.
>
> Ahhh, you are right !
>
>
> >> The CacheService should also be configurable through the LDIF
> >> configuration file - we don't necessarily need to make the CacheService
> >> a fully configured system, because then we would face a chicken/egg problem:
> >> how do we read the configuration if we can't configure the CacheService,
> >> which will be used by the LdifPartition?
> >>
> > One aspect of the cache service is that it copies the cache configuration
> > file to the config folder (not sure if this code is still present, but it
> > was before).
>
> We have to clarify this part. There is an XML file that is used to
> configure the cache, but we also have some parameters in the partition
> and index configurations that are used for the cache (but they are not
> targeting the same cache).
>
Correct, the configuration present in the LDIF configuration is for the
lower-level cache of the respective backend, and the XML file-based one is
for the partition level. I am sure this will be confusing, and we should fix
it by converging the configuration parameters (a big TODO).

>
> At this point, the CacheService must be started before the
> DirectoryService is started, which can be a problem. OTOH, we can't
> really wait before starting the CacheService, because it's being
> propagated to the partitions, including the configuration partition.
> That means we need an external cache configuration.
>
> As I said, it's a typical chicken/egg situation. How do we solve it ?
>
I think by converging both configurations, but this needs to be discussed
further.

>
> >
> >> Those are elements to think about, because they are pretty critical from
> >> the performance POV. However, this is by no means urgent: this won't fix
> >> a bug, it will just make the server run faster.
> >>
> >> I suggest we focus on decoupling the CacheService we have from its
> >> EhCache implementation atm, so that the API does not have to be modified
> >> after the next release, and that's it. I also suggest making this
> >> CacheService available in the Partitions, even if it's not used.
> >>
> > Again, IMO plugging in this kind of mechanism may not be of great help,
> > just more work on a feature that may never be used. I believe Ehcache is
> > the best available cache with a compatible license, and unless we try to
> > write our own we don't need this new feature.
> This is exactly what I have in mind. For one simple reason: ehCache is
> good, but it uses some synchronization: basically, ehCache is a
> synchronized List over a ConcurrentHashMap. Every access to the cache
> needs to lock the list globally. There are other ways to get better
> performance, by avoiding the use of a global lock. Google's
> ConcurrentLinkedHashMap is clearly a winner here: it uses CAS (Compare
> And Swap [1]), which is a lock-free mechanism to ensure concurrency.
>
> Now, I'm *not* proposing to use this class now. It's really just an
> investigation I'm doing here, as I'm trying to add a cache for aliases,
> which is not an urgent problem.
>
Yes, I have seen this cache in the Guava project, but apart from this cache
implementation the rest is of no interest to us (at least as of now), so I
didn't consider it (also, the Guava jar is bigger than the Ehcache jar).
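
For reference, the compare-and-swap pattern mentioned above can be sketched
with the JDK's AtomicLong. This is only a generic illustration of a
lock-free update loop; CasMax is a hypothetical class and says nothing
about how any particular cache library is implemented internally.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical example: track the largest value seen, without any lock.
class CasMax
{
    private final AtomicLong max = new AtomicLong( Long.MIN_VALUE );

    public void offer( long candidate )
    {
        while ( true )
        {
            long current = max.get();

            if ( candidate <= current )
            {
                // nothing to update
                return;
            }

            // succeeds only if no other thread changed the value since we read it
            if ( max.compareAndSet( current, candidate ) )
            {
                return;
            }

            // another thread won the race: re-read and retry
        }
    }

    public long get()
    {
        return max.get();
    }
}
```

A thread that loses the race retries instead of blocking, which is what
avoids a global lock.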

> IMO, the urgent task is to get rid of the WeakReference usage in
> Mavibot, to replace it by ehCache. That's the *only* critical task. But
> I like to analyse what we have and what to improve in the future
> versions :-)
>
+1

> Thanks for your valuable inputs, Kiran !
>
>
>
> [1] http://en.wikipedia.org/wiki/Compare-and-swap
>
> --
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com
>
>


-- 
Kiran Ayyagari
http://keydap.com
