Message-ID: <3FD20FD8.5070809@leverageweb.com>
Date: Sat, 06 Dec 2003 12:20:24 -0500
From: Geoff Howard
To: dev@cocoon.apache.org
Subject: Re: Fixing store design (long) (was Re: CocoonForms server sizing?)
References: <20031201181043.8296.qmail@web41905.mail.yahoo.com> <011601c3b86e$6cc32eb0$eb4bac89@alexk> <1070401958.20678.80.camel@yum.ot> <20031203090152.GB8197@wlkp.org> <3FCDC7D8.1060304@gmx.de> <1070620527.20679.131.camel@yum.ot> <3FD07DF7.4030806@leverageweb.com>

Overall agreement, just a few points to keep in mind below...

Sylvain Wallez wrote:

> Geoff Howard wrote:
>
>> Bruno Dumon wrote:
>>
>>> On Wed, 2003-12-03 at 12:24, Joerg Heinicke wrote:
>>>
>>>> On 03.12.2003 10:01, Leszek Gawron wrote:
>> ....
>>
>>> There's also another problem: the store used to cache the stylesheets
>>> apparently tries to serialize the cached items to disk, as reported
>>> here:
>>> http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=106969948018306&w=2
>>>
>>> just had a quick look into it, it seems that if we would set the
>>> use-persistent-cache option of the transient store to false this
>>> should be fixed. I don't know if this would be acceptable though
>>> (depends on which other components put items in that store).
>>
>> Unless I'm misunderstanding you, the cacheable pipelines put cached
>> results there not only on container shutdown, but after they are
>> bumped off the bottom of the MRU stack in memory. Disabling this
>> feature by default would be bad I think.
>
> This subject comes again regularly, and I would like to solve definitively.
>
> Cocoon currently has two stores:
> - a transient store: in cocoon.xconf,
>   Store.TRANSIENT_STORE in Java code
> - a persistent store: , Store.PERSISTANT_STORE which
>   equals Store.ROLE (more on this below)

I have always assumed that the labels "Transient" and "Persistent" were
attempts to generalize the concepts "in memory" and "on disk" and the
fact that the terms seem to imply a deeper contract was an unfortunate
side effect. Still, your proposal makes sense and I agree we should
pursue it.

....

> Pipeline caching
> ----------------
> The CachingPipeline uses a Cache component to load/save cached
> responses. The only implementation of Cache, CacheImpl, uses a store
> which is... Store.TRANSIENT_STORE!!!

Actually, there are several implementations of Cache all in
simultaneous use now. Carsten made the Cache each pipeline uses
configurable a few months ago and uses a(some?) overloaded Cache
implementation in the portal. Also, the eventcache (really event aware
cache) is currently implemented as an overloaded CacheImpl which adds
some additional processing to support event-based invalidation of
cached objects. I haven't thought through how that would impact your
proposal if at all.

> Transient-cache's "maxobjects"
> ------------------------------
> The transient cache has a "maxobjects" of 100, meaning that at most 100
> non-serializable objects will be kept in memory. This is obviously too
> low, furthermore considering that pipeline content also goes in this
> cache, and that a Cocoon pre-analyses lots of things (stylesheets,
> jxtemplates, XSP logicsheets, woody form definitions) that would benefit
> of being kept longer in memory.
>
> And what's the point of having a store-janitor that is supposed to flush
> the stores when memory is low if there is such a low hard limit?

Of course the 100 maxobjects is configurable and is necessary even in
your proposal isn't it? Perhaps we now change the default, but there
must be some configurable limit, no?
And if this is true, separating the Stores could lead to harder to
manage memory configuration because there are multiple collections.
I'm also thinking ahead to Stefano's adaptive cache. As complicated as
that is, wouldn't separating the caches make it more complex
coordinating resources across them?

What if we use a pluggable FlushStrategy (like Validity) which would
allow different types of objects (transient, transient->persistent) to
all go in the same Store?

VolatileFlushStrategy := never go to persistent store
PersistentFlushStrategy := go to persistent store when janitor requests
ReluctantFlushStrategy := only go to persistent store on container
    shutdown
TrickyAdaptiveFlushStrategy := cost weighted decision about whether to
    go persistent or not, and if so how early.

This is just off the top of my head - haven't thought it through
carefully but do you all see my point of the potential downside of
splitting the actual Store?

> Private caches all over the place
> ---------------------------------
> I mentioned above components that pre-analyze files like jxtemplate,
> woody form definitions, flowscript, etc. Now if we look closer at these
> components, we see that each of them has its own private cache (often a
> static Map). This means that every loaded file is kept in memory
> forever, even if only used once in the system lifetime, and even if the
> corresponding file is actually deleted!

Don't know those situations specifically, but this sounds like it
can/should be fixed whether the Store changes or not?

...

--- oOo ---
>
> Conclusion
> ----------
> The proposed changes want to clarify the respective roles of the various
> stores, and make them behave as they should according to their names
> (e.g. transient is really transient). This should allow us to better
> understand what's going on in the system, and optimize memory usage for
> a better scalability.
>
> So, what do you think?

Hope I haven't clouded the discussion...

Geoff
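
Appended illustration: the pluggable FlushStrategy idea above could be
sketched roughly as follows. This is only a rough sketch under stated
assumptions, not an existing Cocoon or Excalibur API. Every type and
name here (FlushStrategySketch, Trigger, Decision, Entry,
keysToPersist) is hypothetical. The behaviour of the "persistent"
strategy on shutdown and of the "reluctant" strategy on a janitor
request is also assumed, since the message does not specify it. The
cost-weighted adaptive strategy is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: none of these types exist in Cocoon. */
public class FlushStrategySketch {

    /** Events that ask an entry where it should go (hypothetical names). */
    public enum Trigger { JANITOR_REQUEST, CONTAINER_SHUTDOWN }

    /** Possible outcomes for a single cached entry. */
    public enum Decision { KEEP_IN_MEMORY, MOVE_TO_PERSISTENT, DISCARD }

    /** Per-entry policy, attached to a cached object much as a Validity is. */
    public interface FlushStrategy {
        Decision decide(Trigger trigger);
    }

    /** Never goes to the persistent store; dropped when asked to flush. */
    public static final FlushStrategy VOLATILE = trigger -> Decision.DISCARD;

    /** Goes to the persistent store on a janitor request (and, by assumption, on shutdown). */
    public static final FlushStrategy PERSISTENT = trigger -> Decision.MOVE_TO_PERSISTENT;

    /** Goes to the persistent store only on shutdown; staying in memory otherwise is an assumption. */
    public static final FlushStrategy RELUCTANT = trigger ->
            trigger == Trigger.CONTAINER_SHUTDOWN
                    ? Decision.MOVE_TO_PERSISTENT
                    : Decision.KEEP_IN_MEMORY;

    /** One entry in the single shared store: a key plus its flush policy. */
    public static final class Entry {
        final String key;
        final FlushStrategy strategy;

        public Entry(String key, FlushStrategy strategy) {
            this.key = key;
            this.strategy = strategy;
        }
    }

    /** Returns the keys whose strategy asks to be written to the persistent store. */
    public static List<String> keysToPersist(List<Entry> entries, Trigger trigger) {
        List<String> result = new ArrayList<>();
        for (Entry entry : entries) {
            if (entry.strategy.decide(trigger) == Decision.MOVE_TO_PERSISTENT) {
                result.add(entry.key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // All three kinds of object live in the same store.
        List<Entry> store = Arrays.asList(
                new Entry("compiled-stylesheet", VOLATILE),
                new Entry("pipeline-response", PERSISTENT),
                new Entry("form-definition", RELUCTANT));

        System.out.println(keysToPersist(store, Trigger.JANITOR_REQUEST));
        System.out.println(keysToPersist(store, Trigger.CONTAINER_SHUTDOWN));
    }
}
```

With these assumed semantics, a janitor request would persist only the
"pipeline-response" entry, while a container shutdown would also
persist "form-definition". The point of the design is that there is one
store with one size limit, and the policy travels with each entry.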