cocoon-dev mailing list archives

From Vadim Gritsenko <>
Subject Re: StoreJanitor (was: Re: Moving reduced version of CachingSource to core | Configuration issues)
Date Thu, 05 Apr 2007 12:48:15 GMT
Ard Schrijvers wrote:
> 1) How it works and its intention (I think :-) ): The StoreJanitor was
> originally invented to monitor Cocoon's memory usage and does this by
> checking some memory values every X (default 10) seconds. Besides the fact
> that I doubt users know that it is quite important to configure the store
> janitor correctly,

It is stressed in several places. If you don't set at the very least the max 
heap size, you are in for trouble.

> I stick to the defaults and use a heapsize of just a
> little lower than the JVM max memory.

You also need the min free memory and the interval set according to the site & its usage.
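For reference, the janitor's parameters live in cocoon.xconf; a sketch along these lines (values are illustrative and parameter names should be checked against your Cocoon version):

```xml
<store-janitor class="org.apache.cocoon.components.store.impl.StoreJanitorImpl">
  <!-- act only when free memory drops below this amount (bytes) -->
  <parameter name="freememory" value="2048000"/>
  <!-- ...and the heap has grown to this size (bytes);
       set it close to, but below, the JVM's -Xmx -->
  <parameter name="heapsize" value="66600000"/>
  <!-- seconds between checks -->
  <parameter name="cleanupthreadinterval" value="10"/>
  <!-- fraction of a store's entries removed per run -->
  <parameter name="percent_to_free" value="10"/>
</store-janitor>
```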

> Now, every 10 seconds, the StoreJanitor checks whether
> (getJVM().totalMemory() >= getMaxHeapSize() && getJVM().freeMemory() <
> getMinFreeMemory()) is true, and if so, the next store is chosen (compared
> to the previous one) and entries are removed from this store. (I saw a post that
> in trunk not one single store is chosen anymore, but an equal part of all of
> them is being removed, right?)

In both branch and trunk, two algorithms are supported.
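The quoted condition boils down to a two-part test; here is a minimal Java sketch of it (method and threshold names are mine, for illustration only — the real logic lives in Cocoon's StoreJanitorImpl):

```java
// Sketch of the per-interval low-memory check described above.
// Names are illustrative, not Cocoon's actual API.
public class JanitorCheckSketch {

    /** True only when the heap has hit its configured ceiling AND
     *  little free memory remains inside it. */
    public static boolean memoryLow(long totalMemory, long freeMemory,
                                    long maxHeapSize, long minFreeMemory) {
        return totalMemory >= maxHeapSize && freeMemory < minFreeMemory;
    }

    public static void main(String[] args) {
        Runtime jvm = Runtime.getRuntime();
        // Example thresholds: 60 MB heap ceiling, 2 MB minimum free.
        System.out.println(memoryLow(jvm.totalMemory(), jvm.freeMemory(),
                                     60L * 1024 * 1024, 2L * 1024 * 1024));
    }
}
```

Note that both halves must hold: a momentarily grown heap with plenty of free memory inside it does not trigger eviction.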

> 2) My observations: When running high-traffic sites and rendering them live
> (only mod_cache in between, which holds pages for 5 to 10 min) like [1] or
> [2], checking every X sec for a JVM to be low on memory doesn't make
> sense to me. At the moment of checking, the JVM might be perfectly sound but
> just needed some extra memory for a moment; in that case, the StoreJanitor
> is removing items from the cache when it is not needed. Also, when the JVM is really
> in trouble but the StoreJanitor is not checking for 5 more seconds... this
> might be too long for a JVM in a high-traffic site when it is low on memory.

That's the problem with your configuration. It is also a problem in the janitor -- 
but it can only be fixed after Java 5 is made an option.

> - Since there is no way to remove cache entries from the cache impl in use
> via the cache's own eviction policy,

Huh??? That's not the case for me.

> - Once the JVM gets low on memory, and the StoreJanitor is needed, it is
> quite likely that from that moment on, the StoreJanitor runs *every* 10
> seconds, and keeps removing cache entries which you perhaps don't want to be
> removed, like compiled stylesheets.

Since they are not used (the janitor removes the least recently used entries), that's 
perfectly fine by me.
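To illustrate the least-recently-used point, here is a toy LRU store built on an access-ordered LinkedHashMap (this is a sketch of the eviction idea, not Cocoon's store code): an entry touched recently, such as an in-use compiled stylesheet, survives eviction.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;

// Toy LRU store: an access-ordered LinkedHashMap iterates oldest-first,
// so removing from the head drops the least recently used entries.
public class LruSketch {

    public static <K, V> void evictOldest(LinkedHashMap<K, V> store, int count) {
        Iterator<K> it = store.keySet().iterator();
        while (count-- > 0 && it.hasNext()) {
            it.next();
            it.remove();
        }
    }

    public static void main(String[] args) {
        // accessOrder = true makes iteration order follow last access.
        LinkedHashMap<String, String> store = new LinkedHashMap<>(16, 0.75f, true);
        store.put("stylesheet", "compiled-xslt");
        store.put("page-a", "html");
        store.put("page-b", "html");
        store.get("stylesheet");       // recently used: moves to the tail
        evictOldest(store, 2);         // drops page-a and page-b
        System.out.println(store.keySet()); // the stylesheet survives
    }
}
```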

> 1) Suppose, from one store (or, since
> trunk, from multiple stores) 10% (default) is removed. This 10% is of the
> number of memory cache entries. I quite frequently happen to have only 200
> entries in memory for each store (I have added *many* different stores to
> enable all we wanted in a high-traffic environment) and the rest is disk
> store. Now suppose the JVM, which has 512 Mb of memory, is low on memory,
> and removes 10% of 200 entries = 20 entries, helping me zero! These memory
> entries are my most important ones, so, on the next request, they are either
> added again, or I have a hit from the disk cache, implying that the cache will
> put this cache entry in memory again. If I were to use 2000 memory items, I am
> very sure the 200 cleaned items would be put back in memory before the
> next StoreJanitor runs.

This sounds like a problem in your configuration.
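For what it's worth, the arithmetic in the quoted scenario (the percentage applies only to in-memory entries, so a small memory store yields a tiny eviction) can be sketched as follows; the method name is mine, not Cocoon's:

```java
// Sketch of the percent-to-free arithmetic from the scenario above.
// Illustrates why 10% of a small memory store frees almost nothing
// on a 512 MB heap.
public class PercentToFreeSketch {

    public static int entriesToFree(int memoryEntries, int percentToFree) {
        return memoryEntries * percentToFree / 100;
    }

    public static void main(String[] args) {
        System.out.println(entriesToFree(200, 10));  // 20 entries: barely helps
        System.out.println(entriesToFree(2000, 10)); // 200 entries, likely re-cached
                                                     // before the next janitor run
    }
}
```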

> 2) I am not sure if in trunk you can configure whether
> the StoreJanitor should leave one store alone, like the
> DefaultTransientStore.

No. It should be a configuration parameter on the store, IIUC.

> In this store, typically, compiled stylesheets and
> i18n resource bundles end up. Since these files are needed on virtually every
> request, I would rather the StoreJanitor did not remove from this store.

Quite often you need to purge the i18n catalog for a country which is no longer 
using your website. Similarly, the janitor can purge the stylesheet for a document 
type you are no longer using.

> I
> think the StoreJanitor does so, leaving my "critical app" in an even worse
> state, and on the next request the hardly improved JVM needs to recompile
> stylesheets and i18n resource bundles. 3) What if the JVM being low is not
> because of the stores... For example, you have added some component which has
> some problems you did not know about, and that component is the real reason for
> your OOM.

Janitor or not, if you have buggy code, nothing will help you. You have to fix 
the bug, or the site goes down regardless of the janitor.

> 4) By default, probably most people are using ehcache. Naturally,
> overflow-to-disk is true. In a high traffic site, the number of cache keys
> can grow enormously

I have not used it on a live site yet, so no comment. It still, though, points to the 
importance of configuration, including changing the configuration away from ehcache.

> --------o0o--------
> The rules I try to follow to avoid having the StoreJanitor run:
> 1) use readers in noncaching pipelines and use expires headers on them to avoid
> cache/memory pollution

Better: there is Apache HTTPD for that.

> 2) use a different store for repository binary sources
> which has only a disk store part and no memory part (cached-binary: protocol
> added)

Doesn't that result in some frequently used binary resources always being read from disk?

> 3) use a different store for repository sources then for pipeline
> cache

Hm, what are the benefits?


> 4) replaced the abstract double mapping event registry to use
> weak references and let the JVM clean up my event registry
> 5) (4) gave me
> undesired behavior by removing weakrefs in combination with ehcache when
> overflowing items to disk (I could not reproduce this, but it seems that my
> references to cache keys got lost). Testing with JCSCache solved this problem,
> gave me faster response times, and gave me, for free, the ability to limit the
> number of disk cache entries. A disadvantage of the weak references is that I
> disabled persistent caches across JVM restarts, but I never wanted this anyway
> (it might be implemented quite easily, but could lead to long start-up times)
> 6) JCSCache has a complex configuration IMO. Therefore, I added default
> configurations to choose from, for example:
> [1] [2]
