cocoon-dev mailing list archives

From Berin Loritsch <blorit...@apache.org>
Subject Re: AW: [C2]: Proposal for caching
Date Fri, 26 Jan 2001 13:48:54 GMT
Stuart Roebuck wrote:
> 
> On Friday, January 26, 2001, at 09:23 AM, Carsten Ziegeler wrote:
> 
> I can see that FileGenerator creates some difficulties because:
> 
> 1. You can only check if it has changed by checking it, and the process of
> connecting and checking is going to be the bottleneck that the check is
> there to avoid.

If you have FileMonitors that share a background thread, they can poll a
list of files and mark whether each one has changed, at a configurable
interval (on a stable site hours or days are good, on a volatile site
minutes should be fine, and on an insanely changing site you could do
seconds).
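
A minimal sketch of the shared-thread polling idea, written in modern Java
rather than anything available in Cocoon at the time. The class name
FileMonitorPool and all of its methods are hypothetical. One daemon thread
stamps each watched file as changed or not, so request threads only read a
flag and never touch the file system.

```java
import java.io.File;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: one background thread polls every registered file. */
class FileMonitorPool {
    /** Per-file state: the last timestamp seen and a "changed" flag. */
    private static final class Entry {
        volatile long lastSeen;
        volatile boolean changed;
        Entry(long lastSeen) { this.lastSeen = lastSeen; }
    }

    private final Map<File, Entry> entries = new ConcurrentHashMap<>();
    private final ScheduledExecutorService poller =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "file-monitor");
            t.setDaemon(true); // do not keep the JVM alive just for polling
            return t;
        });

    /** The period is the tuning knob: days, hours, minutes, or seconds. */
    FileMonitorPool(long period, TimeUnit unit) {
        poller.scheduleAtFixedRate(this::pollOnce, period, period, unit);
    }

    void watch(File file) {
        entries.putIfAbsent(file, new Entry(file.lastModified()));
    }

    /** One polling pass over all watched files, comparing date stamps. */
    void pollOnce() {
        for (Map.Entry<File, Entry> e : entries.entrySet()) {
            long stamp = e.getKey().lastModified();
            Entry state = e.getValue();
            if (stamp != state.lastSeen) {
                state.lastSeen = stamp;
                state.changed = true;
            }
        }
    }

    /** Cheap check for request threads; unknown files count as changed. */
    boolean hasChanged(File file) {
        Entry state = entries.get(file);
        return state == null || state.changed;
    }

    /** Call after the cached output has been regenerated. */
    void acknowledge(File file) {
        Entry state = entries.get(file);
        if (state != null) state.changed = false;
    }

    void shutdown() { poller.shutdownNow(); }
}
```

A file that was never registered is reported as changed, which errs on the
side of regenerating rather than serving stale output.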

> 2. It may not be possible to determine whether the resource has changed
> other than doing a byte comparison with the previous version.

With files, it is done by checking the date stamp.  If someone issues a
UNIX 'touch' command and changes the date stamp, then it is considered
to have changed.  Anything more than that is too complex.  There are
issues with XSP pages or other dynamic generators.

> However, I'm not sure that there is any need for these difficulties to
> manifest themselves on the sitemap.  I think they should remain
> 'difficulties' for the developer of the FileGenerator validator method.

I agree.

> Firstly, if in doubt I think all sitemap components should respond to the
> validator method call with a "I think I've changed" response.  This may
> sometimes lead to inefficiency but it ensures that the output is always
> 'as expected'.  With the model we are talking about, the actual caching
> mechanism is hidden, but we could potentially have general caching
> controls which would allow for things like an across the board 10 second
> cache delay - ie. further requests for the same page would come from cache
> (regardless of validation) for up to 10 seconds after the last generated
> response.

That sounds good--except for XSP pages.  The problem is that they are
designed to change content based on certain parameters at run time.
For example, if I have a site that displays sensitive information, I
don't want to broadcast one client's data to anyone who views the page
within the next 10 seconds.
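
To make the concern concrete, here is a rough sketch of a blanket
time-to-live cache in which the key carries the user as well as the URL.
TimedCache, its key format, and the explicit clock argument are all
assumptions made for illustration, not part of any Cocoon API. With the
user in the key, one client's page is never handed to another, even inside
the 10 second window.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a time-to-live cache with a per-user key. */
class TimedCache {
    private static final class Slot {
        final String content;
        final long storedAt;
        Slot(String content, long storedAt) {
            this.content = content;
            this.storedAt = storedAt;
        }
    }

    private final Map<String, Slot> slots = new HashMap<>();
    private final long ttlMillis;

    TimedCache(long ttlMillis) { this.ttlMillis = ttlMillis; }

    /** Pass a null userId only for pages that are identical for everyone. */
    static String key(String url, String userId) {
        return userId == null ? url : url + "|user=" + userId;
    }

    /** The caller supplies the clock value, which keeps the sketch testable. */
    void put(String key, String content, long nowMillis) {
        slots.put(key, new Slot(content, nowMillis));
    }

    /** Returns the cached content if it is younger than the TTL, else null. */
    String get(String key, long nowMillis) {
        Slot slot = slots.get(key);
        if (slot == null || nowMillis - slot.storedAt >= ttlMillis) {
            return null;
        }
        return slot.content;
    }
}
```

A cache keyed on the URL alone would return the first client's page to
every later visitor until the entry expired.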

> Having said that, and thinking on my feet, this leads to a problem...  How
> do you distinguish between unique page requests?  It is not enough to use
> the HTTP request URL to distinguish unique page requests, because we may
> be delivering different pages based on the browser type, session
> information, cookies, etc.  *But*, there is a distinction, worth making,
> between "is this request cached" and "is this cached request up-to-date".
> So, I think components need to be able to respond to three important
> requests in the pipeline:

>         give me your output (here's the input)
>         give me a unique cache key (here's the input)
>         have you changed (here's the input)
> 
> The first request is what components are doing already.
> 
> The third request is what we've been talking about - the component is
> being asked to indicate whether or not its output has changed given the
> input (and any other inputs it has internally).
> 
> The second request is the new one.  This is asking the component to
> generate a unique key that identifies a unique set of inputs (including
> internal inputs) but ignoring time.  In other words, the response to this
> is effectively, "Here is a unique key which can be used to lookup my
> output in a cache, this key guarantees that you will get the right item
> out of the cache, but I'm not guaranteeing that the item is up to date".

Sounds good.
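
The three requests could be sketched roughly as the interface below. The
names CacheableComponent, PipelineCache, and serve are hypothetical, and
plain strings stand in for the real pipeline input and output. The key
answers "is this request cached", and the validity check answers "is this
cached request up-to-date".

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical interface for the three requests; not a Cocoon API. */
interface CacheableComponent {
    /** Request 1: give me your output (here's the input). */
    String generate(String input);

    /** Request 2: give me a unique cache key (here's the input). */
    String cacheKey(String input);

    /** Request 3: have you changed (here's the input). */
    boolean hasChanged(String input);
}

/** Hypothetical cache that consults the key first and validity second. */
class PipelineCache {
    private final Map<String, String> store = new HashMap<>();

    String serve(CacheableComponent component, String input) {
        String key = component.cacheKey(input);
        String cached = store.get(key);
        // A hit is only usable if the component says it has not changed.
        if (cached != null && !component.hasChanged(input)) {
            return cached;
        }
        String fresh = component.generate(input);
        store.put(key, fresh);
        return fresh;
    }
}
```

A component that always answers "changed" still works correctly with this
cache; it simply never benefits from it.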

> So the fileGenerator component would always return the same key for the
> same URL, but (in the absence of any more sophisticated logic) would
> always return "I think I've changed" to the validation call.  (I say in
> the absence of more sophisticated logic, because there is no reason why
> fileGenerator couldn't utilise the cache to compare the remote resource
> with the last cached version and accurately determine whether the remote
> resource had changed.  Whilst this wouldn't remove the bottleneck of
> requesting the remote file, it might reduce CPU overhead if it removed the
> need to carry out further processing further down the pipeline.)
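
The "more sophisticated logic" mentioned in the quoted text might look
something like this sketch, which keeps the last fetched bytes per cache
key and compares them with the new fetch. ComparingValidator is a
hypothetical name, and the fetch itself is assumed to have happened
already, so the network cost remains; only the downstream processing is
saved when the bytes match.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical validator that byte-compares against the last version. */
class ComparingValidator {
    private final Map<String, byte[]> lastSeen = new HashMap<>();

    /**
     * Records the fetched bytes and reports whether they differ from the
     * previous copy stored under the same key. A key seen for the first
     * time counts as changed.
     */
    boolean hasChanged(String cacheKey, byte[] fetched) {
        // clone() so later mutation of the caller's array cannot leak in
        byte[] previous = lastSeen.put(cacheKey, fetched.clone());
        return previous == null || !Arrays.equals(previous, fetched);
    }
}
```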
