From: Geoff Howard
To: dev@cocoon.apache.org
Date: Fri, 18 Jul 2003 08:28:05 -0400
Subject: Re: [RT] Adaptive Caching

Well, since Peter's dragged me into this... ;)

Hunsberger, Peter wrote:
> Stefano Mazzocchi writes (and writes, and writes,
> and writes):
>
>>WARNING: this RT is long! and very dense, so I suggest you to
>>turn on your printer.

Stefano, I started writing a response about 5 minutes after getting
your original RT, but began to suspect I hadn't fully understood it,
and I haven't had time to go back over it in more detail. I'm very
interested in this and have been following the discussion, but have
been waiting to see if I really "get" it before speaking. The
following would help me (and maybe others?) understand:

Which of the following does your RT address:
- Deciding when the overhead of caching is worthwhile on a given item
  (and which part of the overhead: the act of storing, or the
  resource use).
- Deciding when to purge the cache (aka a better StoreJanitor/MRU).

In the first scenario I'd have trouble seeing how this calculation
could be any less costly than the current one. But only testing would
tell for sure, and I'll be very interested to see it develop.

The second scenario has little to argue against it. I missed, however,
whether the frequency of matching requests can be taken into account.
In other words, if I have 100 reports whose cost weighs high but which
are only requested several times a month and are reasonable to have to
wait for, and other items with a smaller cost that are requested
thousands of times daily, can I come up with a cost function that
favors the latter?

...

>>Final note: we are discussing resources which are produced
>>using a "cacheable" pipeline *ONLY*. If the pipeline is not
>>cacheable (means: it's not entirely composed of cache-aware
>>components) caching never takes place.

...

> At first it would seem that if there is no way to determine the ergodic
> period of a fragment there is no reason to cache it! However, there is
> an alternative method of using the cache (which Geoff Howard has been
> working on) which is to have an event invalidated cache. In this model
> cache validity is determined by some event external to the production of
> the cached fragment and the cached fragment has no natural ergodic
> period.
> Such fragments still fit mostly within the model given here:
> although we do not know when the external event may transpire we can
> still determine that it is more efficient to regenerate the fragment
> from scratch than retain it in cache.

Another interesting thing about this kind of setup is that if you
commit to it, you could get out of validity calculations altogether.
If it's still in the cache, serve it. I will be experimenting with
this to see if it gives any benefit in practice.

> If a cache invalidating event transpires then, for such fragments, it
> may also make sense to push the new version of the fragment into the
> cache at that time. Common use cases might be CMSs where authoring or
> editing events are expensive and rare (eg. regen Javadoc). In our case,
> we have a large set of metadata that is expensive to generate but rarely
> updated. This metadata is global across all users and if there are
> resources available we want it in the cache.
>
> This points out that in order to push something into cache one wants to
> make the same calculation as the cache manager would make to expire it
> from cache; is it more efficient to push a new version of this now? If
> not there may eventually be a pull request at which point the normal
> cache evaluation will determine how long to keep the new fragment
> cached.

This would be better IMHO if it were left to the cache's discretion
whether to cache the pushed update or not. If the item is currently
cached, pushing the update makes sense; otherwise it does not. For
instance, if I update an entire table with rows which never get
requested, you wouldn't want them pushed into the cache, especially at
the expense of more valuable entries.

Geoff
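
For illustration only, here is a rough sketch of the ideas discussed
above. It is not Cocoon code: the class EventInvalidatedCache, its
methods, and the function retention_score are all hypothetical names,
and the score (regeneration cost times request frequency) is just one
assumed way to let a cost function favor cheap but frequently
requested items over expensive but rarely requested ones.

```python
from dataclasses import dataclass, field


def retention_score(regen_cost_ms: float, requests_per_day: float) -> float:
    """Hypothetical score: regeneration time saved per day by keeping an entry."""
    return regen_cost_ms * requests_per_day


@dataclass
class EventInvalidatedCache:
    """Toy cache whose entries stay valid until an external event removes them."""
    entries: dict = field(default_factory=dict)        # key -> cached value
    keys_by_event: dict = field(default_factory=dict)  # event name -> set of keys

    def put(self, key, value, event):
        # Store a value and remember which event invalidates it.
        self.entries[key] = value
        self.keys_by_event.setdefault(event, set()).add(key)

    def get(self, key):
        # No validity calculation: if it is still in the cache, serve it.
        return self.entries.get(key)

    def invalidate(self, event):
        # The external event drops every entry registered against it.
        for key in self.keys_by_event.pop(event, set()):
            self.entries.pop(key, None)

    def push(self, key, value):
        # The cache's discretion (assumed policy): accept a pushed update
        # only for a key it already holds, so never-requested items do
        # not displace more valuable entries.
        if key in self.entries:
            self.entries[key] = value
            return True
        return False


cache = EventInvalidatedCache()
cache.put("metadata", "v1", event="metadata-updated")
cache.push("metadata", "v2")   # accepted: the key is currently cached
cache.push("row-42", "x")      # refused: the key was never requested
```

With this scoring, a 50 ms item requested 2000 times a day outranks a
60-second report requested about three times a month, which is the
preference asked about above. A real policy would also have to weigh
entry size and available memory.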