jackrabbit-users mailing list archives

From "Stefan Guggisberg" <stefan.guggisb...@day.com>
Subject Re: JackRabbit Caching: BundleCache vs ItemManager vs CacheManager
Date Wed, 16 Jul 2008 12:24:57 GMT
hi sean

On Tue, Jul 1, 2008 at 7:11 PM, sbarriba <sbarriba@yahoo.co.uk> wrote:
> Hi Marcel et al,
> 3 suggestions come to mind from this (perhaps for the develop list):
> 1) the ItemManager should be using Soft References rather than Weak
> References otherwise a PooledSessionInView pattern is not really effective
> as, pooled (but unused) sessions have their caches cleared immediately by
> the GC (using weak references).

ItemManager caches ItemImpl instances. the 'cache' guarantees that there's
no more than 1 ItemImpl instance per item id and session. weak references
are ideal for this task. ItemManager is not meant to be a 'cache' since
ItemImpl instance creation is IMO not performance critical. i remember that
i once experimented with soft references but they tended to fill the heap
pretty fast since soft references are typically cleared only when you're
near an OOM error...
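
to illustrate the 'at most 1 instance per id' guarantee, here is a minimal sketch of a registry that holds its values through weak references. WeakItemRegistry and its methods are hypothetical names made up for this example; this is not the actual ItemManager code, only the general technique under that assumption:

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical stand-in for a per-session item registry; not Jackrabbit code. */
final class WeakItemRegistry<K, V> {
    // Values are held weakly: an entry never keeps its item alive on its own.
    private final Map<K, WeakReference<V>> items = new HashMap<>();

    /** Returns the live instance for the id, creating one only if none is still reachable. */
    synchronized V getOrCreate(K id, Function<K, V> factory) {
        WeakReference<V> ref = items.get(id);
        V item = (ref != null) ? ref.get() : null;
        if (item == null) {
            // Either never created, or the previous instance was garbage collected.
            item = factory.apply(id);
            items.put(id, new WeakReference<>(item));
        }
        return item;
    }

    /** Drops entries whose item has already been collected. */
    synchronized void purge() {
        items.values().removeIf(ref -> ref.get() == null);
    }
}
```

as long as the application holds a strong reference to an item, every lookup for that id returns the same instance. once the application lets go, the GC is free to reclaim it, so the registry enforces identity without pinning memory.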

ItemState caches are a different matter. LocalItemStateManager and
SharedItemStateManager do cache ItemState instances for performance
reasons. please take a look at the javadoc which should explain
why they're using weak references internally instead of soft references:



> 2) the CacheManager config needs to be externalised so it can be changed
> within the XML config, not programmatically.
> 3) it's worth considering using a caching library (e.g. ehcache) for the
> BundleCache at least? As a case study we've got multi-GB of binaries in
> BLOBs in the database and the BundleCache (at 100MB+) spends 2 hours after
> each restart filling /tmp. It would be great to use a caching library which
> supported a persistent cache etc. Obviously externalBlobs helps here.
> Regards,
> Shaun
> -----Original Message-----
> From: Marcel Reutegger [mailto:marcel.reutegger@gmx.net]
> Sent: 01 July 2008 09:47
> To: users@jackrabbit.apache.org
> Subject: Re: JackRabbit Caching: BundleCache vs ItemManager vs CacheManager
> Hi,
> sbarriba wrote:
>> ..        PersistenceManager Cache:
>> o   The "bundleCacheSize" determines how many nodes the PersistenceManager
>> will cache. As this determines the lifetime of the references to the
>> temporary BLOB cache, if it's not large enough BLOBs will be continually
>> read from the database (if using externalBlobs=false).
>> o   Configurable in <PersistenceManager> XML block
>> o   Default size 8MB
>> o   This cache is shared by all sessions.
>> o   Synchronised access using the ISMLocking strategy e.g. Default or
>> FineGrained
> correct, but there's additional synchronization in the persistence manager
> using conventional synchronized methods. e.g. see
> AbstractBundlePersistenceManager.load(NodeId)
>> ..        Session ItemManager Cache:
>> o   Items are cached from the underlying persistence manager on a per
>> session basis.
>> o   Limit cannot be set.
> not sure, but I think this cache is also managed (at least partially) by the
> CacheManager.
>> o   Uses a ReferenceMap which can be emptied by the JVM GC as required
> that's the 'other part' that manages the cache ;)
> items that are still referenced in the application will force the reference
> map to keep the respective ItemState instances (using weak references).
>> o   Synchronised access using the itemCache object
>> ..        CacheManager Cache:
>> o   Limit can only be set programmatically via the Workspace cacheManager
>> o   http://wiki.apache.org/jackrabbit/CacheManager
>> o   Defaults to 16MB
>> o   It's not clear as yet how the CacheManager relates, if at all, to the
>> ItemManager cache
> this only happens indirectly. see above.
>> 2 questions:
>> ..        What is the purpose of the CacheManager and which caches does it
>> actually control?
> It controls *all* the caches that contain ItemState instances.
>> ..        For example, for a workspace with 100,000 nodes what is an
>> appropriate setting for the Cache Manager?
> I guess that depends on your JVM heap settings and the usage pattern. if you
> have a lot of random reads over nearly all 100k nodes and performance is
> critical you may consider caching all of them. have a look at
> ItemState.calculateMemoryFootprint() for a formula on how the memory
> consumption is calculated.
> regards
>  marcel
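
for a rough feel of the sizing question above, here is a back-of-the-envelope sketch. the per-item figure of 1024 bytes is a placeholder assumption for this example, not a value taken from ItemState.calculateMemoryFootprint(), and CacheSizing is a hypothetical helper, so substitute a number measured on your own content:

```java
/** Hypothetical sizing helper; the per-item footprint is an assumed input. */
final class CacheSizing {
    /** Bytes needed to hold itemCount items at the assumed average footprint. */
    static long estimateBytes(long itemCount, long assumedBytesPerItem) {
        return Math.multiplyExact(itemCount, assumedBytesPerItem);
    }

    /** Number of items that fit into a cache of the given size. */
    static long itemsThatFit(long cacheBytes, long assumedBytesPerItem) {
        return cacheBytes / assumedBytesPerItem;
    }

    public static void main(String[] args) {
        long perItem = 1024L; // ASSUMPTION: placeholder average footprint per item
        System.out.println(estimateBytes(100_000L, perItem));          // bytes to cache 100k items
        System.out.println(itemsThatFit(16L * 1024 * 1024, perItem));  // items in a 16MB cache
    }
}
```

under that assumed footprint, 100,000 items need 102,400,000 bytes (roughly 98MB), while a 16MB limit holds 16,384 items. the real ratio depends entirely on the actual per-item footprint and on how much heap the JVM can spare.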
