httpd-dev mailing list archives

From "Graham Leggett" <minf...@sharp.fm>
Subject Re: mod_disk_cache summarization
Date Tue, 24 Oct 2006 12:18:01 GMT
On Tue, October 24, 2006 12:59 pm, Niklas Edmundsson wrote:

> * More assorted small cleanups (mostly error handling).

Error handling patches are welcome and encouraged, don't wait :)

> * Allow disk cache to realise that a (large) file is the same
>    regardless of which URL is used to access it. Reduces cache disk
>    usage a lot for sites like ours that's known by ftp.acc.umu.se,
>    ftp.se.debian.org, ftp.gnome.org, se.releases.ubuntu.com,
>    releases.mozilla.org and so on.

Perhaps this could be as simple as using ServerName and ServerAlias
(unless the name of the site is part of the URL, which will happen in the
forward proxy case) to reduce the cached URL to a canonical form before
storing and/or retrieving from the cache.
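
As a rough illustration of the idea only (this is not mod_disk_cache code, and
the names CANONICAL_HOST, ALIASES and canonical_cache_key are made up for the
example), a sketch in Python of reducing a URL to a canonical form before it is
used as a cache key might look like this. It assumes the aliases serve identical
content under identical paths, which may not hold for every mirror name:

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical configuration: one canonical name (what ServerName would supply)
# and the alias names (what ServerAlias would supply) serving the same files.
CANONICAL_HOST = "ftp.acc.umu.se"
ALIASES = {"ftp.se.debian.org", "ftp.gnome.org"}


def canonical_cache_key(url):
    """Return url with a known alias host replaced by the canonical host."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()  # hostname is already lowercased
    if host in ALIASES or host == CANONICAL_HOST:
        netloc = CANONICAL_HOST
        if parts.port is not None:
            # Keep an explicit port so different ports stay different keys.
            netloc = f"{CANONICAL_HOST}:{parts.port}"
        parts = parts._replace(netloc=netloc)
    # URLs for hosts we do not know about pass through unchanged.
    return urlunsplit(parts)
```

The same function would be applied both when storing and when looking up an
entry, so that requests arriving under any alias map to one cached copy.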

> * Add option to not try to remove cache directories in the cache
>    structure. IMHO, this should never be needed since the cache
>    directory should not be excessively deep (which the broken defaults
>    leads to). Davi had a fix for the cache dir layout I think, and I
>    personally think that neither mod_disk_cache nor htcacheclean should
>    do rmdir.

It makes sense that mod_disk_cache shouldn't do it, but perhaps it should
be tunable for htcacheclean.

> * Eventually add option to have header and body in the same cachefile.

Is there an advantage to this? IIRC Brian reported that a body in a
separate file can take advantage of sendfile, and is as a result much
faster.

> While working with this I have understood that there are two rather
> different uses for mod_disk_cache: either as a cache in a proxy or as
> a way to make a FTP-server frontend reduce load of its file server
> backend.

My key interest to date has been using the cache to improve performance of
the httpd server, including an httpd reverse proxy.

However, having heard from people who have used other proxies, it's
starting to look like httpd's mod_disk_cache is one of the few (if not
the only) caches that is (close to) fully RFC2616 compliant.

Ideally the cache should work as both a forward and a reverse cache,
without any special configuration either way.

> For the FTP-server frontend usage we see the following
> characteristics: Large files, relatively few requests/s. It's
> important to keep files that are frequently accessed in cache (they
> might be large), hence have cache filesystem mounted with atime and
> clean cache based on atime. This works nicely for us using XFS, and
> cleaning by atime is much quicker and uses less resources than
> htcacheclean.
>
> Others here are more clued on the proxy-cache-usecase, but as I
> understand it the keywords are many small files, many requests/s so
> need to mount with noatime and use htcacheclean.
>
> Tuning tips in the documentation for these rather different cases
> would probably be appreciated.

A more formal cache cleanup process needs to be fleshed out, giving the
options above both as options in code, and as documentation as you say.

Your experience and Brian's sit at the two extremes of high volume
caching: one is few hits on large files, the other many hits on small
files. Comparing the two should make for some useful tuning information.
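
To make the atime-based cleaning described above concrete, here is a minimal
sketch in Python. It is not htcacheclean and not anything shipped with httpd.
The function name clean_by_atime and its parameters are invented for the
example, and it assumes the cache filesystem is mounted so that atime is
actually updated:

```python
import os


def clean_by_atime(cache_root, max_bytes):
    """Delete least recently accessed files until total size <= max_bytes.

    Returns the list of removed paths, oldest access time first.
    """
    entries = []
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)  # stat() itself does not update atime
            entries.append((st.st_atime, st.st_size, path))

    total = sum(size for _atime, size, _path in entries)
    removed = []
    for _atime, size, path in sorted(entries):  # oldest atime first
        if total <= max_bytes:
            break
        os.remove(path)
        total -= size
        removed.append(path)
    return removed
```

The sketch treats every file on its own and never removes directories. A real
cleaner would probably need to remove a cached entry's header and body files
together, and to cope with files disappearing while it runs.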

Regards,
Graham
--


