cocoon-dev mailing list archives

From Geoff Howard <coc...@leverageweb.com>
Subject Re: Fixing store design (long) (was Re: CocoonForms server sizing?)
Date Sat, 06 Dec 2003 17:20:24 GMT
Overall agreement, just a few points to keep in mind below...

Sylvain Wallez wrote:
 > Geoff Howard wrote:
 >
 >> Bruno Dumon wrote:
 >>
 >>> On Wed, 2003-12-03 at 12:24, Joerg Heinicke wrote:
 >>>
 >>>> On 03.12.2003 10:01, Leszek Gawron wrote:
 >> ....
 >>
 >>> There's also another problem: the store used to cache the stylesheets
 >>> apparently tries to serialize the cached items to disk, as reported
 >>> here:
 >>> http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=106969948018306&w=2
 >>>
 >>> just had a quick look into it, it seems that if we would set the
 >>> use-persistent-cache option of the transient store to false this
 >>> should be fixed. I don't know if this would be acceptable though
 >>> (depends on which other components put items in that store).
 >>
 >> Unless I'm misunderstanding you, the cacheable pipelines put cached
 >> results there not only on container shutdown, but after they are
 >> bumped off the bottom of the MRU stack in memory.  Disabling this
 >> feature by default would be bad I think.
 >
 > This subject comes up again regularly, and I would like to solve it
 > definitively.
 >
 > Cocoon currently has two stores:
 > - a transient store: <transient-store> in cocoon.xconf,
 > Store.TRANSIENT_STORE in Java code
 > - a persistent store: <persistent-store>, Store.PERSISTANT_STORE which
 > equals Store.ROLE (more on this below)

I have always assumed that the labels "Transient" and "Persistent" were
attempts to generalize the concepts "in memory" and "on disk", and that
the deeper contract the terms seem to imply was an unfortunate side
effect.  Still, your proposal makes sense and I agree we should pursue
it.

....

> Pipeline caching
> ----------------
> The CachingPipeline uses a Cache component to load/save cached 
> responses. The only implementation of Cache, CacheImpl, uses a store 
> which is... Store.TRANSIENT_STORE!!!

Actually, there are several implementations of Cache, all in simultaneous
use now.  Carsten made the Cache that each pipeline uses configurable a
few months ago, and uses one (or more?) overloaded Cache implementations
in the portal.

Also, the eventcache (really an event-aware cache) is currently
implemented as an overloaded CacheImpl which adds some additional
processing to support event-based invalidation of cached objects.

I haven't thought through how that would impact your proposal, if at all.

 > Transient-cache's "maxobjects"
 > ------------------------------
 > The transient cache has a "maxobjects" of 100, meaning that at most 100
 > non-serializable objects will be kept in memory. This is obviously too
 > low, especially considering that pipeline content also goes in this
 > cache, and that Cocoon pre-analyses lots of things (stylesheets,
 > jxtemplates, XSP logicsheets, woody form definitions) that would benefit
 > from being kept longer in memory.
 >
 > And what's the point of having a store-janitor that is supposed to flush
 > the stores when memory is low if there is such a low hard limit?

Of course the maxobjects limit of 100 is configurable, and some limit is
necessary even in your proposal, isn't it?  Perhaps we should now change
the default, but there must be some configurable limit, no?

And if this is true, separating the Stores could make memory
configuration harder to manage, because there would be multiple
collections to tune.  I'm also thinking ahead to Stefano's adaptive
cache.  As complicated as that is, wouldn't separating the caches make
coordinating resources across them more complex?
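
To make the "configurable limit" point concrete, here is a minimal sketch
of a single in-memory collection bounded by a maxobjects-style setting.
It is written in modern Java and is not Cocoon code: the class name
BoundedMemoryStoreSketch and its constructor parameter are assumptions
made up for illustration. It relies only on java.util.LinkedHashMap in
access order, so the least recently used entry is the one dropped.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class, not part of Cocoon or Excalibur.
// One bounded collection: when more than maxObjects entries are held,
// the least recently accessed entry is evicted.
final class BoundedMemoryStoreSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxObjects;

    BoundedMemoryStoreSketch(int maxObjects) {
        // accessOrder = true: iteration order runs from least to most
        // recently accessed, so the "eldest" entry is the LRU one.
        super(16, 0.75f, true);
        this.maxObjects = maxObjects;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion.
        return size() > maxObjects;
    }
}
```

With one such collection there is a single number to tune; with several
separate stores, each one needs its own limit and they compete for the
same heap.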

What if we use a pluggable FlushStrategy (like Validity) which would 
allow different types of objects (transient, transient->persistent) to 
all go in the same Store?

VolatileFlushStrategy := never go to persistent store
PersistentFlushStrategy := go to persistent store when janitor requests
ReluctantFlushStrategy := only go to persistent store on container shutdown
TrickyAdaptiveFlushStrategy := cost weighted decision about whether to 
go persistent or not, and if so how early.
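
A rough sketch of how the first three strategies could look, in modern
Java. Everything here is hypothetical: the FlushStrategy interface, the
FlushEvent enum and SingleStoreSketch are names invented for this
illustration, not existing Cocoon or Excalibur APIs. Two assumptions are
baked in: an entry that may be flushed on a janitor request may also be
saved at shutdown, and a janitor request leaves non-persistable entries
in memory (what to do with them is an open question). The adaptive
strategy is left out.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// All names below are hypothetical, for illustration only.
enum FlushEvent { JANITOR_REQUEST, CONTAINER_SHUTDOWN }

interface FlushStrategy {
    /** May an entry with this strategy be written to the persistent store now? */
    boolean mayPersist(FlushEvent event);
}

final class VolatileFlushStrategy implements FlushStrategy {
    public boolean mayPersist(FlushEvent event) { return false; }
}

final class PersistentFlushStrategy implements FlushStrategy {
    // Assumption: persisting on janitor request implies persisting on shutdown too.
    public boolean mayPersist(FlushEvent event) { return true; }
}

final class ReluctantFlushStrategy implements FlushStrategy {
    public boolean mayPersist(FlushEvent event) {
        return event == FlushEvent.CONTAINER_SHUTDOWN;
    }
}

// One store holding all kinds of objects; the strategy travels with each entry.
final class SingleStoreSketch {
    private static final class Item {
        final Object value;
        final FlushStrategy strategy;
        Item(Object value, FlushStrategy strategy) {
            this.value = value;
            this.strategy = strategy;
        }
    }

    private final Map<String, Item> memory = new LinkedHashMap<>();
    // Stand-in for the on-disk store.
    private final Map<String, Object> persistent = new LinkedHashMap<>();

    void store(String key, Object value, FlushStrategy strategy) {
        memory.put(key, new Item(value, strategy));
    }

    void flush(FlushEvent event) {
        Iterator<Map.Entry<String, Item>> it = memory.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Item> e = it.next();
            if (e.getValue().strategy.mayPersist(event)) {
                persistent.put(e.getKey(), e.getValue().value);
                it.remove();
            }
        }
        // At shutdown nothing survives in memory; volatile entries are simply lost.
        if (event == FlushEvent.CONTAINER_SHUTDOWN) {
            memory.clear();
        }
    }

    boolean inMemory(String key) { return memory.containsKey(key); }
    boolean onDisk(String key) { return persistent.containsKey(key); }
}
```

The point of the design is that the store stays a single collection with
a single memory budget, and the per-object policy decides what happens
at flush time.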

This is just off the top of my head and I haven't thought it through
carefully, but do you all see my point about the potential downside of
splitting the actual Store?

 > Private caches all over the place
 > ---------------------------------
 > I mentioned above components that pre-analyze files like jxtemplate,
 > woody form definitions, flowscript, etc. Now if we look closer at these
 > components, we see that each of them has its own private cache (often a
 > static Map). This means that every loaded file is kept in memory
 > forever, even if only used once in the system lifetime, and even if the
 > corresponding file is actually deleted!

I don't know those situations specifically, but this sounds like
something that can and should be fixed whether the Store changes or not.
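
For illustration only, here is one way a private static Map could be
replaced by a cache that at least notices deleted or modified files. This
is a sketch in modern Java under stated assumptions: FileBackedCacheSketch
and its loader parameter are hypothetical names, and the plain
java.io.File timestamp check is a stand-in for whatever validity
mechanism the real components would use. It does not bound the number of
entries; that would still be the job of the shared store.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical class, not part of Cocoon.
// Caches a pre-analysed object per file, and forgets it when the file
// is deleted or its last-modified timestamp changes.
final class FileBackedCacheSketch<V> {
    private static final class Item<V> {
        final V value;
        final long lastModified;
        Item(V value, long lastModified) {
            this.value = value;
            this.lastModified = lastModified;
        }
    }

    private final Map<String, Item<V>> items = new HashMap<>();
    private final Function<File, V> loader;

    FileBackedCacheSketch(Function<File, V> loader) {
        this.loader = loader;
    }

    synchronized V get(File file) {
        String key = file.getPath();
        if (!file.exists()) {
            // The file is gone: drop the stale entry instead of keeping it forever.
            items.remove(key);
            return null;
        }
        long stamp = file.lastModified();
        Item<V> item = items.get(key);
        if (item == null || item.lastModified != stamp) {
            // First use, or the file changed since it was analysed: reload.
            item = new Item<>(loader.apply(file), stamp);
            items.put(key, item);
        }
        return item.value;
    }

    synchronized int size() {
        return items.size();
    }
}
```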

...
                          --- oOo ---
 >
 > Conclusion
 > ----------
 > The proposed changes aim to clarify the respective roles of the various
 > stores, and make them behave as they should according to their names
 > (e.g. transient is really transient). This should allow us to better
 > understand what's going on in the system, and optimize memory usage for
 > better scalability.
 >
 > So, what do you think?

Hope I haven't clouded the discussion...

Geoff

