jackrabbit-users mailing list archives

From "sbarriba" <sbarr...@yahoo.co.uk>
Subject RE: JackRabbit Caching: BundleCache vs ItemManager vs CacheManager
Date Tue, 01 Jul 2008 17:11:10 GMT
Hi Marcel et al,
Three suggestions come to mind from this (perhaps for the developer list):

1) The ItemManager should use soft references rather than weak references.
Otherwise a PooledSessionInView pattern is not really effective, because
pooled (but unused) sessions have their weakly referenced caches cleared by
the GC almost immediately.

2) The CacheManager config needs to be externalised so it can be changed
within the XML config, rather than only programmatically.

3) It's worth considering a caching library (e.g. Ehcache) for the
BundleCache at least. As a case study, we have multiple GB of binaries in
BLOBs in the database, and the BundleCache (at 100MB+) spends 2 hours after
each restart filling /tmp. It would be great to use a caching library that
supports a persistent cache. Obviously externalBlobs helps here.
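
To make point 1 concrete, here is a minimal sketch of the difference. It is
not Jackrabbit code: RefCache and its methods are hypothetical names, and it
uses plain java.lang.ref rather than the ReferenceMap the ItemManager uses.
A weakly referenced value becomes eligible for clearing as soon as nothing
else holds it strongly, whereas a softly referenced value is normally kept
until the JVM is short of memory, which is what an idle pooled session would
want.

```java
import java.lang.ref.Reference;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical per-session item cache; NOT Jackrabbit's ItemManager.
final class RefCache<K, V> {
    private final Map<K, Reference<V>> entries = new ConcurrentHashMap<>();
    private final Function<V, Reference<V>> wrapper;

    private RefCache(Function<V, Reference<V>> wrapper) {
        this.wrapper = wrapper;
    }

    // Values may be cleared at the next GC once no strong reference remains.
    static <K, V> RefCache<K, V> weak() {
        return new RefCache<K, V>(value -> new WeakReference<V>(value));
    }

    // Values are cleared at the GC's discretion, typically under memory pressure.
    static <K, V> RefCache<K, V> soft() {
        return new RefCache<K, V>(value -> new SoftReference<V>(value));
    }

    void put(K key, V value) {
        entries.put(key, wrapper.apply(value));
    }

    // Returns the cached value, or null if it was never cached or has been cleared.
    V get(K key) {
        Reference<V> ref = entries.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            // The referent was collected; drop the stale entry.
            entries.remove(key, ref);
        }
        return value;
    }
}
```

Switching a session cache from RefCache.weak() to RefCache.soft() changes
only when the GC is allowed to drop entries; callers still have to handle a
null from get() in both cases.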

Regards,
Shaun

-----Original Message-----
From: Marcel Reutegger [mailto:marcel.reutegger@gmx.net] 
Sent: 01 July 2008 09:47
To: users@jackrabbit.apache.org
Subject: Re: JackRabbit Caching: BundleCache vs ItemManager vs CacheManager

Hi,

sbarriba wrote:
> ..        PersistenceManager Cache: 
> 
> o   The "bundleCacheSize" determines how many nodes the PersistenceManager
> will cache. As this determines the lifetime of the references to the
> temporary BLOB cache, if it's not large enough BLOBs will be continually
> read from the database (if using externalBlobs=false).
> 
> o   Configurable in <PersistenceManager> XML block
> 
> o   Default size 8MB
> 
> o   This cache is shared by all sessions.
> 
> o   Synchronised access using the ISMLocking strategy e.g. Default or
> FineGrained

correct, but there's additional synchronization in the persistence manager
using conventional synchronized methods. e.g. see
AbstractBundlePersistenceManager.load(NodeId)

> ..        Session ItemManager Cache: 
> 
> o   Items are cached from the underlying persistence manager on a per
> session basis.
> 
> o   Limit cannot be set.

not sure, but I think this cache is also managed (at least partially) by the
CacheManager.

> o   Uses a ReferenceMap which can be emptied by the JVM GC as required

that's the 'other part' that manages the cache ;)

items that are still referenced in the application will force the reference
map to keep the respective ItemState instances (using weak references).

> o   Synchronised access using the itemCache object
> 
> ..        CacheManager Cache:
> 
> o   Limit can only be set programmatically via the Workspace cacheManager
> 
> o   http://wiki.apache.org/jackrabbit/CacheManager
> 
> o   Defaults to 16MB
> 
> o   It's not clear as yet how the CacheManager relates, if at all, to the
> ItemManager cache

this only happens indirectly. see above.

> 2 questions:
> 
> ..        What is the purpose of the CacheManager and which caches does it
> actually control?

It controls *all* the caches that contain ItemState instances.

> ..        For example, for a workspace with 100,000 nodes what is an
> appropriate setting for the Cache Manager?

I guess that depends on your JVM heap settings and the usage pattern. if you
have a lot of random reads over nearly all 100k nodes and performance is
critical you may consider caching all of them. have a look at
ItemState.calculateMemoryFootprint() for a formula on how the memory
consumption is calculated.
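
As a rough illustration of that sizing question, the arithmetic can be
sketched as below. CacheSizing and its methods are hypothetical, and the
average bytes per item state is a placeholder you would have to measure or
derive from ItemState.calculateMemoryFootprint(); it is not a Jackrabbit
figure.

```java
// Back-of-envelope cache sizing; the per-state footprint is an assumed input.
final class CacheSizing {

    // Bytes needed to keep every item state cached at the given average footprint.
    static long bytesToCacheAll(long stateCount, long avgBytesPerState) {
        return Math.multiplyExact(stateCount, avgBytesPerState);
    }

    // How many item states of the given average footprint fit in a cache limit.
    static long statesThatFit(long cacheBytes, long avgBytesPerState) {
        return cacheBytes / avgBytesPerState;
    }
}
```

With a placeholder of 1,024 bytes per state, bytesToCacheAll(100_000, 1024)
is 102,400,000 bytes (about 98 MB), while a 16MB limit gives
statesThatFit(16L * 1024 * 1024, 1024) = 16,384 states. Whether the larger
figure is acceptable depends on the heap you can spare.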

regards
  marcel


