cocoon-dev mailing list archives

From Sylvain Wallez
Subject Re: Fixing store design (long) (was Re: CocoonForms server sizing?)
Date Sat, 06 Dec 2003 21:38:53 GMT
Geoff Howard wrote:

> Overall agreement, just a few points to keep in mind below...
> Sylvain Wallez wrote:


> > Cocoon currently has two stores:
> > - a transient store: <transient-store> in cocoon.xconf,
> > Store.TRANSIENT_STORE in Java code
> > - a persistent store: <persistent-store>, Store.PERSISTANT_STORE which
> > equals Store.ROLE (more on this below)
> I have always assumed that the labels "Transient" and "Persistent" 
> were attempts to generalize the concepts "in memory" and "on disk" and 
> the fact that the terms seem to imply a deeper contract was an 
> unfortunate side effect.  Still, your proposal makes sense and I agree 
> we should pursue it.

That's how I understand them also, but the fact that the persistent role 
equals the general store role makes the distinction rather useless...

> .....
>> Pipeline caching
>> ----------------
>> The CachingPipeline uses a Cache component to load/save cached 
>> responses. The only implementation of Cache, CacheImpl, uses a store 
>> which is... Store.TRANSIENT_STORE!!!
> Actually, there are several implementations of Cache all in 
> simultaneous use now.  Carsten made the Cache each pipeline uses 
> configurable a few months ago and uses a(some?) overloaded Cache 
> implementation in the portal.

I couldn't find anything but the "type" attribute on <map:pipeline>. 
This attribute chooses the pipeline implementation, but not the store. 
Is there something else that I missed?

> Also, the eventcache (really event aware cache) is currently 
> implemented as an overloaded CacheImpl which adds some additional 
> processing to support event-based invalidation of cached objects.

Damn, missed that one. But it relies on the store looked up in its 
superclass, and so will also benefit from the changes.

> I haven't thought through how that would impact your proposal if at all.

I don't think there will be any impact, since what I'm suggesting will 
change the Store role used by CacheImpl, but not the actual behaviour 
defined by this role.

> > Transient-cache's "maxobjects"
> > ------------------------------
> > The transient cache has a "maxobjects" of 100, meaning that at most 100
> > non-serializable objects will be kept in memory. This is obviously too
> > low, especially considering that pipeline content also goes in this
> > cache, and that Cocoon pre-analyses lots of things (stylesheets,
> > jxtemplates, XSP logicsheets, woody form definitions) that would
> > benefit from being kept longer in memory.
> >
> > And what's the point of having a store-janitor that is supposed to flush
> > the stores when memory is low if there is such a low hard limit?
> Of course the 100 maxobjects is configurable and is necessary even in 
> your proposal isn't it?  Perhaps we now change the default, but there 
> must be some configurable limit, no?

A hard limit makes sense IMO only for the memory front-end of a 
two-stage cache. For the in-memory cache, we had better let the 
store-janitor do its job based not on the number of stored objects, but 
on the actual JVM memory consumption.
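
As a rough illustration of the point above, here is a minimal Java sketch of a janitor that frees entries based on available heap rather than on an object count. FlushableStore and MemoryJanitor are hypothetical names invented for this sketch, not Cocoon's actual store-janitor API; only the java.lang.Runtime calls are real. The free-memory reading is injected so that the logic can be exercised without actually filling the heap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

/** Hypothetical stand-in for a store that can release entries on request. */
interface FlushableStore {
    int size();
    void free(); // release one entry, if any
}

/** Hypothetical janitor: frees entries while available heap is below a threshold. */
class MemoryJanitor {
    private final List<FlushableStore> stores = new ArrayList<>();
    private final LongSupplier freeBytes;
    private final long minFreeBytes;

    MemoryJanitor(LongSupplier freeBytes, long minFreeBytes) {
        this.freeBytes = freeBytes;
        this.minFreeBytes = minFreeBytes;
    }

    /** Janitor wired to the real JVM heap figures. */
    static MemoryJanitor forJvm(long minFreeBytes) {
        Runtime rt = Runtime.getRuntime();
        // Available heap = max heap - (allocated heap - free part of allocated heap)
        return new MemoryJanitor(
            () -> rt.maxMemory() - (rt.totalMemory() - rt.freeMemory()),
            minFreeBytes);
    }

    void register(FlushableStore store) {
        stores.add(store);
    }

    /** Runs one check; returns the number of entries freed. */
    int check() {
        int freed = 0;
        boolean progress = true;
        // Stop as soon as memory has recovered or no store has anything left.
        while (freeBytes.getAsLong() < minFreeBytes && progress) {
            progress = false;
            for (FlushableStore s : stores) {
                if (s.size() > 0) {
                    s.free();
                    freed++;
                    progress = true;
                }
            }
        }
        return freed;
    }
}
```

With this shape, no store needs a "maxobjects" setting of its own: how much stays in memory is decided by the single threshold passed to the janitor.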

> And if this is true, separating the Stores could lead to harder to 
> manage memory configuration because there are multiple collections.  
> I'm also thinking ahead to Stefano's adaptive cache.  As complicated 
> as that is, wouldn't separating the caches make it more complex 
> coordinating resources across them?
> What if we use a pluggable FlushStrategy (like Validity) which would 
> allow different types of objects (transient, transient->persistent) to 
> all go in the same Store?
> VolatileFlushStrategy := never go to persistent store
> PersistentFlushStrategy := go to persistent store when janitor requests
> ReluctantFlushStrategy := only go to persistent store on container 
> shutdown
> TrickyAdaptiveFlushStrategy := cost weighted decision about whether to 
> go persistent or not, and if so how early.
> This is just off the top of my head - haven't thought it through 
> carefully but do you all see my point of the potential downside of 
> splitting the actual Store?

I see your point. But if I understand it correctly, your proposal 
requires components to give the store a hint about the flush strategy 
that should be applied to the stored object. But how will components 
choose the correct strategy, and how will the store handle a myriad of 
different strategy implementations? Furthermore, I think the flush 
strategy is a concern of the Store, and that components should just give 
a hint about the storage-related properties of the objects they store.

We can consider that choosing between the various store roles is a way 
to indicate this.

> > Private caches all over the place
> > ---------------------------------
> > I mentioned above components that pre-analyze files like jxtemplate,
> > woody form definitions, flowscript, etc. Now if we look closer at these
> > components, we see that each of them has its own private cache (often a
> > static Map). This means that every loaded file is kept in memory
> > forever, even if only used once in the system lifetime, and even if the
> > corresponding file is actually deleted!
> Don't know those situations specifically, but this sounds like it 
> can/should be fixed whether the Store changes or not?

Yes and no: most of these objects aren't serializable, and having a 
"persistent" transient store doesn't encourage component writers to use 
it...

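To make the problem with the private static Maps concrete, here is a small Java sketch of the behaviour they lack: entries are dropped when the source file is deleted, reloaded when it changes, and evicted when unused. ParsedSourceCache is a hypothetical class written for this illustration, not an existing Cocoon component. The LRU bound is only there to keep the sketch self-contained; in the approach discussed in this thread, eviction would be left to the store-janitor.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical cache for pre-analyzed files, keyed by source URI. */
class ParsedSourceCache<V> {
    /** A cached object together with the source timestamp it was built from. */
    private static final class Cached<T> {
        final T value;
        final long lastModified;
        Cached(T value, long lastModified) {
            this.value = value;
            this.lastModified = lastModified;
        }
    }

    private final Map<String, Cached<V>> entries;

    ParsedSourceCache(final int maxEntries) {
        // accessOrder=true keeps the least recently used entry first,
        // so removeEldestEntry evicts objects that are no longer used.
        this.entries = new LinkedHashMap<String, Cached<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Cached<V>> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /**
     * @param lastModified current timestamp of the source, or 0 if it no
     *                     longer exists (the convention used by
     *                     java.io.File.lastModified())
     * @return the cached or freshly loaded object, or null if the source is gone
     */
    synchronized V get(String uri, long lastModified, Function<String, V> loader) {
        if (lastModified == 0L) {
            entries.remove(uri); // source deleted: drop the stale object
            return null;
        }
        Cached<V> cached = entries.get(uri);
        if (cached == null || cached.lastModified != lastModified) {
            // Not cached yet, or the source changed: (re)load it.
            cached = new Cached<>(loader.apply(uri), lastModified);
            entries.put(uri, cached);
        }
        return cached.value;
    }

    synchronized int size() {
        return entries.size();
    }
}
```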
> ....
>                          --- oOo ---
> >
> > Conclusion
> > ----------
> > The proposed changes want to clarify the respective roles of the various
> > stores, and make them behave as they should according to their names
> > (e.g. transient is really transient). This should allow us to better
> > understand what's going on in the system, and optimize memory usage for
> > a better scalability.
> >
> > So, what do you think?
> Hope I haven't clouded the discussion...

Not at all! There's still room for improvement in the flush strategy, 
but we have a problem to solve today. If future evolutions require a 
single store with storage hints, we'll just have to turn the 
implementations of the various store roles we have today into proxies 
to the global cache. Each store would then delegate storage to this 
global cache with a particular hint for the flush strategy.
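
The proxy idea can be sketched in Java roughly as follows. StorageHint, GlobalCache and StoreProxy are hypothetical names made up for this illustration and do not exist in Cocoon; the "disk" tier is simulated with a second map. The point is only that each store role keeps its current interface towards components and contributes nothing but its hint.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical storage hint; the names are illustrative only. */
enum StorageHint { TRANSIENT, PERSISTENT }

/** Hypothetical single backing cache that records the hint with each object. */
class GlobalCache {
    private static final class Slot {
        final Object value;
        final StorageHint hint;
        Slot(Object value, StorageHint hint) {
            this.value = value;
            this.hint = hint;
        }
    }

    private final Map<Object, Slot> memory = new HashMap<>();
    private final Map<Object, Object> disk = new HashMap<>(); // stands in for a real disk tier

    synchronized void store(Object key, Object value, StorageHint hint) {
        disk.remove(key); // a fresh value supersedes any flushed copy
        memory.put(key, new Slot(value, hint));
    }

    synchronized Object get(Object key) {
        Slot slot = memory.get(key);
        return slot != null ? slot.value : disk.get(key);
    }

    /** Low-memory flush: transient entries are discarded, persistent ones move to disk. */
    synchronized void flush() {
        for (Map.Entry<Object, Slot> e : memory.entrySet()) {
            if (e.getValue().hint == StorageHint.PERSISTENT) {
                disk.put(e.getKey(), e.getValue().value);
            }
        }
        memory.clear();
    }
}

/** What a store role could become: a thin proxy that only adds its hint. */
class StoreProxy {
    private final GlobalCache cache;
    private final StorageHint hint;

    StoreProxy(GlobalCache cache, StorageHint hint) {
        this.cache = cache;
        this.hint = hint;
    }

    void store(Object key, Object value) {
        cache.store(key, value, hint);
    }

    Object get(Object key) {
        return cache.get(key);
    }
}
```

Note that in this sketch all proxies share one key space, so a real implementation would have to decide whether keys from different roles may collide.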


Sylvain Wallez                                  Anyware Technologies 
{ XML, Java, Cocoon, OpenSource }*{ Training, Consulting, Projects }
Orixo, the opensource XML business alliance
