cocoon-dev mailing list archives

From Miles Elam <mi...@pcextremist.com>
Subject Re: invalid caching problem
Date Sat, 17 May 2003 02:56:14 GMT
Gianugo Rabellino wrote:

> Pretty much on purpose, just like httpd mod_expires does. It would be 
> pretty easy, actually, to spot a forced reload, which is triggered by 
> a "Cache-Control: max-age=0" header, but I'm not sure that it makes 
> really sense: the whole idea is letting the user specify an override 
> (like "I know better than the caching algorithm or the browser when 
> this resource has to be refreshed") over both caching algos and 
> browser requests. But yes, it might make sense to add this too (even 
> if for 99% of request this would turn out into unnecessary parsing of 
> HTTP headers). 


Doesn't this take control away from the site administrator?  Let's say 
for example that the reason the expires attribute was set in the sitemap 
was because a particular resource is conspicuously expensive.  The 
expires value would actually be a lock on the pipeline generation.  Now 
let's say that some remote user can simply run wget with the "--header" 
switch specifying "Cache-Control: max-age=0".  Assuming that the site 
has such a publicly available resource (and many sites do and are 
usually easily identified), this could be turned into a denial of 
service with minimal outlay (no need for multiple boxes to attack the 
server) couldn't it?
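
To make the "lock" idea concrete, here is a minimal sketch in Java of an
expires check that deliberately ignores the client's Cache-Control header.
ExpiresCache, serve and pipelineRuns are hypothetical names made up for
illustration, not Cocoon's actual classes; the only point is that a
"max-age=0" request costs the same as any other request until the entry
expires.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical expires-based cache; NOT Cocoon's real implementation. */
class ExpiresCache {
    private static final class Entry {
        final byte[] body;
        final long expiresAt;
        Entry(byte[] body, long expiresAt) {
            this.body = body;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long ttlMillis;

    /** Counts how often the expensive pipeline actually ran. */
    int pipelineRuns = 0;

    ExpiresCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /**
     * Serves a URI. The clientCacheControl argument (e.g. "max-age=0") is
     * accepted but intentionally never consulted: only the expiry set by
     * the site administrator decides whether the pipeline runs.
     */
    synchronized byte[] serve(String uri, String clientCacheControl,
                              long nowMillis, Supplier<byte[]> pipeline) {
        Entry entry = entries.get(uri);
        if (entry == null || nowMillis >= entry.expiresAt) {
            pipelineRuns++;
            entry = new Entry(pipeline.get(), nowMillis + ttlMillis);
            entries.put(uri, entry);
        }
        return entry.body; // otherwise: an expires check and a byte array dump
    }
}
```

With a four-hour expiry, a thousand forced reloads inside that window
trigger no additional pipeline runs in this sketch; honoring the header
would turn each one into a full regeneration.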

If the site administrator decides that, for whatever reason, a 
particular resource does not need to be up-to-the-minute, that the 
changes need only be registered once every four hours or however long is 
deemed appropriate, what right does a client have to dictate that the 
site administrator is wrong?  A resource with an expiration on it is a 
constant-time resource: no matter how expensive the constituent parts of 
its pipeline are compared to any other, serving it is always just an 
expires check and a byte array dump.  It is by this metric that a site 
developer would 
estimate the load a machine could handle.  If clients can just hit 
shift-reload to bypass an explicit caching directive (think of all of 
the "First Post!"-ers on slashdot), your performance metrics might as 
well be useless.  You might as well put in functionality that says that 
a client can selectively tell the server to ignore the cache and re-run 
the raw pipeline.  IMHO of course...  The only purpose I could see 
this serving would be to invalidate a cache entry, but that's not a 
_public_ concern on 99.999% of all web sites.  (I would have said 100%, 
but I'm sure someone out there is doing something this weird 
*somewhere*.)  As Gianugo stated,

> My suggestion, however, is to use expires only on production, as part 
> of the release cycle optimization. 


The only other case I can think of is, instead of a site-wide clearing 
of cache, an element-specific clearing of cache when things get 
"stuck."  But then again, as I think of it, that seems like a band-aid 
for a potential bug in the caching system.  If you can simply set a cron 
job to clear a particular URI's cache entry, it makes it much easier to 
pretend that a bug doesn't exist and ignore it for an extended period of 
time -- until the poor bastard who takes your place at a particular 
job/position/task doesn't know about your undocumented and long-lived 
cron hack that you've long since forgotten or stopped caring about.  If the bug 
is that annoying, you will (a) work to patch it yourself or (b) sure as 
hell recompile it as soon as someone else patches it in CVS.   Just as 
well that someone would have to go through the trouble of editing the 
sitemap to reset expiration and thus reset the cache value.  It's 
extremely annoying.  Squeaky wheels get the grease first.  ...In theory 
of course.  ;-)

>> 1) There are never any 'date' headers in the Response, so you never 
>> get a 'conditional-GET' from the Browser and consequentially never 
>> send a 304 response, meaning we always send the page. (This just 
>> isn't implemented yet, right?)
>
>
> Yes, probably it might be the case to consider merging the lates 304 
> patch from Miles with the expires code. Something to work on... 


Shoot...  I hadn't thought of that case when I was working on the 
patch.  It shouldn't be too hard at all now that all of the 
CachedResponse objects have a last-modified timestamp.  I'll see if I 
can't whip out a simple fix.
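
As a rough illustration of what such a fix would have to decide, here is a
minimal sketch in Java of the conditional-GET check. ConditionalGet and
statusFor are hypothetical names, and the long argument stands in for
whatever last-modified timestamp the cached response carries; none of this
is taken from the actual patch.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

/** Hypothetical helper; NOT the real 304 patch. */
class ConditionalGet {

    /**
     * Returns 304 if the client's copy is still current, otherwise 200.
     *
     * @param ifModifiedSince    raw If-Modified-Since header value, or null
     * @param lastModifiedMillis last-modified time of the cached response
     */
    static int statusFor(String ifModifiedSince, long lastModifiedMillis) {
        if (ifModifiedSince == null) {
            return 200; // unconditional request: send the full response
        }
        try {
            long since = ZonedDateTime
                    .parse(ifModifiedSince, DateTimeFormatter.RFC_1123_DATE_TIME)
                    .toInstant()
                    .toEpochMilli();
            // HTTP dates have one-second resolution, so drop the milliseconds
            // before comparing.
            long lastModifiedSeconds = (lastModifiedMillis / 1000) * 1000;
            return lastModifiedSeconds <= since ? 304 : 200;
        } catch (DateTimeParseException e) {
            return 200; // unparseable date: ignore it and send the full response
        }
    }
}
```

For the browser to send If-Modified-Since at all, the original response
has to carry a Last-Modified header, which is the missing piece described
in point 1 of the quoted report.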

>> 2) You cannot force-reload during the expiry period of an expiring 
>> Pipeline.
>
> Again, on purpose. :-) But this can be changed. 


I really hope it isn't.

- Miles Elam


