httpd-dev mailing list archives

From Colm MacCarthaigh <c...@stdlib.net>
Subject Re: RFC: can I make mod_cache per-dir?
Date Mon, 15 Aug 2005 12:29:13 GMT
On Mon, Aug 15, 2005 at 01:50:14PM +0200, Graham Leggett wrote:
> > 	It can't reliably disable caching for a directory.
> 
> Proxy has a mechanism to do this, cache should have a similar mechanism.
> Does CacheDisable not do this?

That's per-location, not per-directory. If multiple URIs map to the
same directory, I have to add per-location directives for each one. But
I'm *really* screwed when multiple vhosts with different aliases point
to the same directory.

This is the situation I find myself in on ftp.heanet.ie, where, say,

	ftp.ie.debian.org/debian/pool/ 

is also:

	ftp.heanet.ie/mirrors/ftp.debian.org/pool/
	ftp.heanet.ie/pub/pool/
	ftp.*.debian.org/debian/pool/
	apt.heanet.ie/pool/

and the list goes on, that's just four :/ But you get the idea :) I can
hack it by using mod_expires on a per-directory basis, but I don't want
to prevent remote caching - just local.
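
To make the per-location versus per-directory difference concrete, here is a
minimal sketch in Python (not httpd code). The alias table, the on-disk path
/srv/debian/pool/ and the function names are all assumptions made up for
illustration. The point is only that a decision keyed on the mapped directory
needs one rule, however many vhost/URL aliases lead to it.

```python
# Hypothetical alias table: (vhost, URL prefix) -> directory on disk.
# The /srv/debian/pool/ path is an assumption, not the real layout.
ALIASES = {
    ("ftp.ie.debian.org", "/debian/pool/"): "/srv/debian/pool/",
    ("ftp.heanet.ie", "/mirrors/ftp.debian.org/pool/"): "/srv/debian/pool/",
    ("ftp.heanet.ie", "/pub/pool/"): "/srv/debian/pool/",
    ("apt.heanet.ie", "/pool/"): "/srv/debian/pool/",
}

# A single per-directory rule covers every alias above.
NO_LOCAL_CACHE_DIRS = {"/srv/debian/pool/"}


def map_to_filename(host, path):
    """Stand-in for the URL-to-filename mapping stage."""
    for (alias_host, prefix), directory in ALIASES.items():
        if host == alias_host and path.startswith(prefix):
            return directory + path[len(prefix):]
    return None


def locally_cacheable(host, path):
    """Decide on the mapped filename rather than on the request URL."""
    filename = map_to_filename(host, path)
    if filename is None:
        return True
    return not any(filename.startswith(d) for d in NO_LOCAL_CACHE_DIRS)
```

Doing the same thing per-location would need one rule per key in ALIASES,
repeated in each vhost.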
	
> > 	It's about 99.9% useless for a forward proxy configuration. ;-)
> 
> If so, then this is a bug that should be fixed.

That behaviour is particularly annoying. mod_cache insists that the URL
string start with a /, and it compares parsed_uri.path, so:

	http://www.apache.org/
	http://www.PlaceIDontWantToCache.com/

are exactly the same as far as mod_cache's url checking stage goes.
Though that's easy enough to fix as-is.
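
A rough illustration of the problem in Python, not the actual mod_cache code:
both function names are hypothetical, and the host-aware variant is just one
possible shape for a fix. The first check looks only at the path component, so
the host never enters into it.

```python
from urllib.parse import urlsplit


def path_only_match(url, prefix):
    """Hypothetical check that, like the behaviour described, sees only the path."""
    return urlsplit(url).path.startswith(prefix)


def host_aware_match(url, prefix_url):
    """Assumed alternative: compare scheme and host as well as the path prefix."""
    u, p = urlsplit(url), urlsplit(prefix_url)
    return (
        u.scheme == p.scheme
        and (u.hostname or "") == (p.hostname or "")
        and u.path.startswith(p.path)
    )
```

With the prefix "/", path_only_match() accepts both URLs above, while
host_aware_match() with the prefix "http://www.apache.org/" accepts only the
first.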

> > 	It involves some fairly pants linear searches through the url lists,
> > 	which means not a hope of implementing complex configurations while
> > 	keeping the performance mod_cache is supposed to be for :-/
> 
> These URL lists are not very long (or am I misunderstanding you?), have
> you got some profiling numbers to show this?

I just gave up on ftp.heanet.ie after I reached over 300 CacheDisable
entries. Didn't run any profiling numbers though.
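
For what it's worth, here is a sketch in Python of the kind of scan I mean,
next to one possible alternative. Neither is the mod_cache implementation; the
names are made up, and the alternative assumes every configured prefix ends in
"/" and matches on whole path segments, which is slightly stricter than a plain
string prefix. The first function touches every entry per request; the second
does one hash lookup per path segment, independent of how many entries exist.

```python
def linear_disabled(path, prefixes):
    """Scan every configured prefix for each request: O(number of entries)."""
    return any(path.startswith(p) for p in prefixes)


def segment_disabled(path, prefix_set):
    """Look up each ancestor of the path in a set: O(path depth).

    Assumes every entry in prefix_set ends with "/".
    """
    pos = 0
    while True:
        pos = path.find("/", pos)
        if pos == -1:
            return False
        pos += 1  # include the slash in the candidate prefix
        if path[:pos] in prefix_set:
            return True


# 300 made-up entries plus the one we care about.
prefixes = ["/mirror%d/pool/" % i for i in range(300)] + ["/debian/pool/"]
prefix_set = set(prefixes)
```

Both give the same answer for a request like /debian/pool/main/x.deb, but only
the first gets slower as the list grows.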

> The case of caching / would only really make sense if an entire site
> was dynamically generated, but this is the exception rather than the
> rule.

There are other cases in which it makes a lot of sense (it certainly
does for me). But you're right, I guess the majority case isn't going
to be "/" once mod_cache becomes widely deployed.

> An attempt at open() can be very expensive compared to some of the
> performance improvements httpd already has (caching file handles, that
> sort of thing).

In general open() isn't expensive if it doesn't succeed, and if the
open() succeeds, it gets used by mod_cache anyway - there is no
penalty.

> The cost is likely to be greater than comparing lists of URLs, so this
> would very likely be a step backwards performance wise.

Even for a failed open() this is going to be true.

> At no point can an admin be expected to know anything that's not
> immediately obvious. If the admin said "I am removing the cache option
> by commenting out this directive", then httpd should immediately
> disable the cache option.

That's fair enough.

> > As I was saying; What I've done gets rid of the CacheEnable and
> > CacheDisable directives, and instead lets you do this;
> 
> What you seem to be wanting to do is support the concept where there
> is no URL at all, ie "cache everything in the current scope, be it
> virtual|directory|location|whatever".

Yes.

> In other words, extending the current behavior from:
> 
>  CacheEnable disk /blah
> 
> (cache everything below /blah) to support something like this:
> 
>   <Location /blah> CacheEnable disk * </Location>
> 
> (cache everything regardless of the scope). This "*" means "we don't
> care for doing a URL check, cache everything within the current
> scope".
> 
> This is a small incremental change to the current config.

But not to the code. It's pretty much impossible to implement in code
without running the handler later, or replicating the entire url mapping
stage.

> There is going to have to be a very compelling case for fundamentally
> changing the current config method when problems can be solved with
> some more simple changes, and saving on some URL comparisons doesn't
> seem to me to be a good enough reason at this stage, unless some
> performance numbers can tell us otherwise?

My main reason is that I can't enable or disable caching on a
per-directory or per-file basis. 

-- 
Colm MacCárthaigh                        Public Key: colm+pgp@stdlib.net
