httpd-dev mailing list archives

From Colm MacCarthaigh <>
Subject RFC: can I make mod_cache per-dir?
Date Mon, 15 Aug 2005 08:55:34 GMT

mod_cache configurability sucks big-time. CacheEnable adds yet another
location mapping scheme for administrators to deal with, but this scheme
lacks basic flexibility;

	It can't reliably disable caching for a directory. 

	It's about 99.9% useless for a forward proxy configuration. ;-)

	It can't do regex matching, unlike every other part of Apache.

	It involves some fairly pants linear searches through the url lists, 
	which means not a hope of implementing complex configurations while 
	keeping the performance mod_cache is supposed to be for :-/

Unfortunately, I want to do some pretty complex things, including all of
the above, and I've bitten the bullet and have achieved a rough implementation
by throwing away the CacheEnable and CacheDisable directives, and
completely changing the basic configuration of mod_cache. *cough*. 

I'm guessing that the majority of CacheEnable instances out there in the
world probably take "/" as their url argument. For this case, the
changes I've made speed things up. For other cases there is some small
potential slowdown, for example if you had only;

	CacheEnable disk /wiki/

Previously mod_cache would have done a url match at the handle stage and
if it didn't match, that would have been that. With this patch, it
instead looks up the url with the caching provider directly. This has
two consequences; 

	1. It means all requests are hit with the cost of a lookup 
	   in the cache provider, but this shouldn't be expensive.
	   It's already what most sites are doing. And even with
	   mod_disk_cache it's relatively painless, just a hashcalc
	   and an attempt at open(). 

	   Either way, the url match functionality at this stage can 
	   be added back trivially, but I decided not to in my patch
	   because it's so confusing to have.

	2. If an admin re-configures with caching enabled for fewer
	   locations than they had previously, they have to know to 
	   either clear the cache or to know that the entities will 
	   still get served from the cache until they have expired. 
	   The patch includes a new Caching user guide, for this and 
	   other reasons.
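
The "hashcalc and an attempt at open()" lookup described in point 1 can be sketched roughly as follows. This is a Python illustration under stated assumptions, not mod_disk_cache's real code: the SHA-256 file name, the flat cache directory, and the names `cache_path` and `lookup` are all invented for the example.

```python
import hashlib
import os


def cache_path(cache_root, url):
    # Assumption: the file name is a hex digest of the URL, stored in a
    # flat directory. The real on-disk layout may well differ.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    return os.path.join(cache_root, digest)


def lookup(cache_root, url):
    """Return the cached body for url, or None on a cache miss."""
    try:
        with open(cache_path(cache_root, url), "rb") as fh:
            return fh.read()
    except FileNotFoundError:
        # A miss costs one hash calculation and one failed open().
        return None
```

Nothing in `lookup` consults the configuration, which is why point 2 arises: an entity stored under an earlier configuration is still found and served until it expires or the cache is cleared.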

As I was saying; What I've done gets rid of the CacheEnable and
CacheDisable directives, and instead lets you do this;

	# Cache everything to memory, or then disk
	CacheContent mem disk

	# Cache content for /foo/ to disk only
	<Location /foo/>
	    CacheContent disk
	</Location>

	# Don't cache these files at all
	<LocationMatch "^/foo/.*\.txt$">
	    CacheContent disk off
	</LocationMatch>

	<Proxy *>
	    # Only cache to disk
	    CacheContent disk
	</Proxy>

	<Proxy http://securityupdates/dist/>
	    # Don't cache the list of security updates, ever
	    CacheContent off
	</Proxy>

	<VirtualHost foobar>
	    # This vhost should never be cached
	    CacheContent off
	</VirtualHost>
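
As a rough model of how such per-dir settings would resolve for a given URL, the following Python sketch applies matching sections in configuration order, with later matches overriding earlier ones. It is an assumption-laden illustration rather than the patch's logic: the `sections` table and `effective_providers` are hypothetical, only the first three settings from the example are mirrored, and the `.txt` section is read as "off" (an empty provider list).

```python
import re

# (pattern, providers) pairs in configuration order.
sections = [
    (re.compile(r"^/"), ["mem", "disk"]),       # server-wide default
    (re.compile(r"^/foo/"), ["disk"]),          # <Location /foo/>
    (re.compile(r"^/foo/.*\.txt$"), []),        # .txt files: caching off
]


def effective_providers(url):
    """Return the provider list from the last section matching url."""
    result = []
    for pattern, providers in sections:
        if pattern.search(url):
            result = providers
    return result
```

Under this model `/foo/notes.txt` ends up with no providers, `/foo/page.html` with disk only, and everything else with memory and then disk.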

But I'm still not finished, and I'd like some advice on what next. The
per-dir information isn't available at the quick-handle stage, so the
mod_cache handle has to rely on per-server config to decide which
providers to try and use for serving content. (right now I've simply
hard-coded mem and disk).

There are two options for doing this;

	1. Register any providers used by CacheContent at the config
	   stage in the per-server conf. Has the advantage of reducing
	   the amount of directives involved and minimising admin
	   confusion. Disadvantages; Makes using CacheContent in 
	   htaccess files a bit iffy, there would have to be a 
	   CacheContent directive in the base config files first.
	   Controlling the order in which providers are tried would
	   also be a bit of a pain.

	2. Adding another directive. "CacheEnable" makes the most
	   sense as a name, but it would also be a change in its
	   behaviour. So "CacheServe" as a name might be an option.
	   This would be a per-server directive, that says;

		"CacheEnable mem disk"
	   Which would mean serve from memory, or then disk (i.e. in
	   that order) for this server. 
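
The "serve from memory, or then disk" behaviour of option 2 amounts to trying an ordered per-server provider list until one has the entity. A minimal Python sketch, with hypothetical in-memory dictionaries standing in for the real providers and made-up names (`providers`, `serve`):

```python
# Hypothetical stand-ins for the "mem" and "disk" cache providers.
providers = {
    "mem": {"/a": b"a-from-mem"},
    "disk": {"/a": b"a-from-disk", "/b": b"b-from-disk"},
}


def serve(order, url, providers=providers):
    """Try each named provider in order.

    Returns (provider name, body) for the first hit, or None if no
    provider has the URL and the request must be handled normally.
    """
    for name in order:
        body = providers[name].get(url)
        if body is not None:
            return name, body
    return None
```

Here `serve(["mem", "disk"], url)` corresponds to the per-server directive above: the order of the arguments is the order in which the providers are consulted.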

I vastly prefer 2. myself, but I'd like to know what hope (if any) have
I of getting major changes to directives and the basic configuration of
a module committed? And also, people's thoughts on the trade-off of not
performing a url comparison at the handle stage.

Colm MacCárthaigh                        Public Key:
