httpd-dev mailing list archives

From "Graham Leggett"
Subject Re: RFC: can I make mod_cache per-dir?
Date Mon, 15 Aug 2005 11:50:14 GMT
Colm MacCarthaigh said:

> mod_cache configurability sucks big-time. CacheEnable adds yet another
> location mapping scheme for administrators to deal with, but this scheme
> lacks basic flexibility;

The config scheme for mod_cache mirrors that of mod_proxy, from where the
cache originated, allowing configs like this:

ProxyPass /blah http://some-backend/blah
CacheEnable mem /blah

ScriptAlias /cgi-bin /var/some/cgi-bin
CacheEnable disk /cgi-bin

> 	It can't reliably disable caching for a directory.

Proxy has a mechanism to do this, cache should have a similar mechanism.
Does CacheDisable not do this?

> 	It's about 99.9% useless for a forward proxy configuration. ;-)

If so, then this is a bug that should be fixed.

> 	It can't do regex matching, unlike every other part of Apache.


> 	It involves some fairly pants linear searches through the url lists,
> 	which means not a hope of implementing complex configurations while
> 	keeping the performance mod_cache is supposed to be for :-/

These URL lists are not very long (or am I misunderstanding you?). Have
you got some profiling numbers to show this?
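
For reference, the kind of linear search under discussion can be sketched
roughly as follows. This is a minimal Python illustration, not the mod_cache
source: the function name, the data layout, and the rule that a matching
disable entry wins are all assumptions made for the example.

```python
# Hypothetical sketch only: find_provider and its arguments are
# illustrative names, not identifiers from mod_cache.
def find_provider(url, enables, disables):
    """Return the provider type for url, or None if caching does not apply.

    enables:  list of (provider_type, url_prefix) pairs, in config order.
    disables: list of url_prefix strings.
    """
    # Assumption: a matching CacheDisable-style prefix switches caching off.
    for prefix in disables:
        if url.startswith(prefix):
            return None
    # Otherwise the first matching CacheEnable-style prefix is used.
    for provider, prefix in enables:
        if url.startswith(prefix):
            return provider
    return None


# Mirrors the two CacheEnable lines above, plus one made-up disable entry.
enables = [("mem", "/blah"), ("disk", "/cgi-bin")]
disables = ["/cgi-bin/private"]
```

Each request costs one string comparison per configured entry, so the cost
grows with the number of directives, not with the size of the cache.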

> Unfortunately, I want to do some pretty complex things, including all of
> the above, and I've bitten the bullet and achieved a rough implementation
> by throwing away the CacheEnable and CacheDisable directives, and
> completely changing the basic configuration of mod_cache. *cough*.

I don't think throwing out the current config syntax and redoing it from
scratch is really necessary - rather fix what's broken incrementally.

> I'm guessing that the majority of CacheEnable instances out there in the
> world probably take "/" as their url argument.

Why would this be the case?

It only makes sense to cache "expensive" server operations, probably
limited to the /cgi-bin directory, or to cache ProxyPass directives.
Apache is already really good at shipping static data, so caching
everything will probably only serve to slow things down rather than speed
them up.

The case of caching / would only really make sense if an entire site was
dynamically generated, but this is the exception rather than the rule.

> For this case, the
> changes I've made speed things up. For other cases there is some small
> potential slowdown, for example if you had only;
> 	CacheEnable disk /wiki/
> Previously mod_cache would have done a url match at the handle stage and
> if it didn't match, that would have been that. With this patch, it
> instead looks up the url with the caching provider directly. This has
> two consequences;
> 	1. It means all requests are hit with the cost of a lookup
> 	   in the cache provider, but this shouldn't be expensive.
> 	   It's already what most sites are doing. And even with
> 	   mod_disk_cache it's relatively painless, just a hashcalc
> 	   and an attempt at open().

An attempt at open() can be very expensive compared to some of the
performance improvements httpd already has (caching file handles, that
sort of thing).

The cost is likely to be greater than comparing lists of URLs, so this
would very likely be a step backwards performance-wise.
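
One way to get numbers for this is a small timing harness. The Python sketch
below is only a stand-in for measuring the real thing: the hashing scheme,
the flat directory layout, and every function name are assumptions, and
mod_disk_cache's actual on-disk format is not modelled.

```python
import hashlib
import os
import tempfile
import timeit


def cache_path(root, url):
    # Stand-in for the "hashcalc": map the URL to a file name under root.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    return os.path.join(root, digest)


def provider_lookup(root, url):
    # Stand-in for the "attempt at open()": even a miss costs a system call.
    try:
        with open(cache_path(root, url), "rb"):
            return True
    except FileNotFoundError:
        return False


def prefix_scan(prefixes, url):
    # The existing style of check: compare the URL against each prefix.
    return any(url.startswith(p) for p in prefixes)


def compare(n=10000):
    # Time n cache misses each way; returns (scan_seconds, open_seconds).
    url = "/static/a.png"
    with tempfile.TemporaryDirectory() as root:
        t_open = timeit.timeit(lambda: provider_lookup(root, url), number=n)
    t_scan = timeit.timeit(lambda: prefix_scan(["/wiki/"], url), number=n)
    return t_scan, t_open
```

Calling compare() on the target platform gives a first indication of the
relative cost; figures from httpd itself would still be needed to settle it.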

> 	   Either way, the url match functionality at this stage can
> 	   be added back trivially, but I decided not to in my patch
> 	   because it's so confusing to have.
> 	2. If an admin re-configures with caching enabled for fewer
> 	   locations than they had previously, they have to know to
> 	   either clear the cache or to know that the entities will
> 	   still get served from the cache until they have expired.
> 	   The patch includes a new Caching user guide, for this and
> 	   other reasons.


At no point can an admin be expected to know anything that's not
immediately obvious. If the admin said "I am removing the cache option by
commenting out this directive", then httpd should immediately disable the
cache option.

> As I was saying; What I've done gets rid of the CacheEnable and
> CacheDisable directives, and instead lets you do this;

What you seem to be wanting to do is support the concept where there is no
URL at all, i.e. "cache everything in the current scope", whatever that
scope happens to be.

In other words, extending the current behavior from:

 CacheEnable disk /blah

(cache everything below /blah) to support something like this:

  <Location /blah>
    CacheEnable disk *
  </Location>

(cache everything regardless of the URL). This "*" means "we don't care
for doing a URL check, cache everything within the current scope".

This is a small incremental change to the current config.
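
In matching terms the change amounts to one extra case. A minimal Python
sketch, assuming the "*" syntax proposed here (it is not an existing
CacheEnable argument) and using a made-up function name:

```python
# Hypothetical sketch: url_matches is an illustrative name, and "*" is the
# wildcard proposed in this message, not current mod_cache behaviour.
def url_matches(pattern, url):
    if pattern == "*":
        # No URL check: whatever reached this scope is cacheable.
        return True
    # Current behaviour: cache everything below the configured prefix.
    return url.startswith(pattern)
```

The enclosing container has already decided which requests reach the
directive, so the wildcard case does no per-request URL comparison.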

> I vastly prefer 2. myself, but I'd like to know what hope (if any) have
> I of getting major changes to directives and the basic configuration of
> a module committed? And also, people's thoughts on the trade-off of not
> performing a url comparison at the handle stage.

There is going to have to be a very compelling case for fundamentally
changing the current config method when problems can be solved with some
more simple changes, and saving on some URL comparisons doesn't seem to me
to be a good enough reason at this stage, unless some performance numbers
can tell us otherwise?

