httpd-dev mailing list archives

From Niklas Edmundsson <>
Subject Re: mod_disk_cache summarization
Date Tue, 24 Oct 2006 12:48:45 GMT
On Tue, 24 Oct 2006, Graham Leggett wrote:

>> * Allow disk cache to realise that a (large) file is the same
>>    regardless of which URL is used to access it. Reduces cache disk
>>    usage a lot for sites like ours that's known by,
>> and so on.
> Perhaps this could be as simple as using ServerName and ServerAlias
> (unless the name of the site is part of the URL, which will happen in the
> forward proxy case) to reduce the cached URL to a canonical form before
> storing and or retrieving from the cache.

We have a few different servernames depending on which site it's 
serving (needs to cater for official download locations and so on) so 
I guess that won't help much.
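
To illustrate the suggestion above, here is a minimal sketch (not mod_disk_cache code) of reducing a URL to a canonical form from a ServerName/ServerAlias table before using it as a cache key. The vhost table, host names and the function name canonical_cache_key are all hypothetical. It also shows the limitation: aliases of one vhost collapse to one key, but a file served under two different ServerNames still gets two keys.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical vhost table: canonical ServerName -> its ServerAlias names.
VHOSTS = {
    "www.example.org": {"example.org", "web.example.org"},
    "download.example.org": {"dl.example.org"},
}

def canonical_cache_key(url, vhosts=VHOSTS):
    """Rewrite the host part of url to the matching ServerName, if any."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    for server_name, aliases in vhosts.items():
        if host == server_name or host in aliases:
            host = server_name
            break
    # Keep an explicit port if the URL carried one.
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    # Drop the fragment; it is never sent to the server.
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, ""))

# Aliases of the same vhost share a key.
same_a = canonical_cache_key("http://example.org/pub/file.iso")
same_b = canonical_cache_key("http://web.example.org/pub/file.iso")
# A different ServerName serving the same file still yields another key.
other = canonical_cache_key("http://dl.example.org/pub/file.iso")
```

With several distinct ServerNames pointing at the same files, the last case is the one that matters, so this approach alone would not deduplicate the cache.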

>> * Add option to not try to remove cache directories in the cache
>>    structure. IMHO, this should never be needed since the cache
>>    directory should not be excessively deep (which the broken defaults
>>    lead to). Davi had a fix for the cache dir layout I think, and I
>>    personally think that neither mod_disk_cache nor htcacheclean should
>>    do rmdir.
> It makes sense that mod_disk_cache shouldn't do it, but perhaps it should
> be tunable for htcacheclean.

Arguably. But if you ever need to remove directories in the cache
hierarchy you should really start to wonder why they were created in
the first place...

>> * Eventually add option to have header and body in the same cachefile.
> Is there an advantage to this? IIRC Brian reported that a body in a
> separate file can take advantage of sendfile, and is as a result much
> faster.

We use combined header/body, and sendfile works flawlessly. Linux 
sendfile has problems when writing to a sendfile():d file with 
mmap, and all sendfiles have problems with overlapping 

The main advantage is half the number of inodes and that by removing 
one file you get rid of both the header and body. I suspect that the 
performance gain is minimal though.
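
As a rough sketch of the combined header/body idea: one file holds a length-prefixed header block followed by the body, so there is one inode per entry and a single unlink removes both. The layout below (4-byte length, JSON headers) and the names write_entry and read_entry are assumptions for illustration only, not the on-disk format of mod_disk_cache or of our patches.

```python
import json
import struct

def write_entry(path, headers, body):
    """Store headers and body in one file: [4-byte length][headers][body]."""
    hdr = json.dumps(headers).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("!I", len(hdr)))  # header length, network byte order
        f.write(hdr)
        f.write(body)

def read_entry(path):
    """Return (headers, body_offset, body) for a file made by write_entry."""
    with open(path, "rb") as f:
        (hdr_len,) = struct.unpack("!I", f.read(4))
        headers = json.loads(f.read(hdr_len).decode("utf-8"))
        body_offset = 4 + hdr_len  # where the body starts in the file
        body = f.read()
    return headers, body_offset, body
```

The body offset is the interesting part: a sendfile-style call takes a starting offset, so the body can still be sent straight from the combined file without copying it out first.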

> A more formal cache cleanup process needs to be fleshed out, giving the
> options above both as options in code, and as documentation as you say.
> Your experience and Brian's are at opposite extremes for
> high volume caches: one has low hits on large files, the other high hits
> on small files. This should make for some useful tuning information.

The extreme difference is what makes me think that we should 
acknowledge that they exist and provide the relevant knobs where 
necessary. As it looks right now, those knobs tend to be more 
OS/filesystem specific, but that might change as this evolves.

  Niklas Edmundsson, Admin @ {acc,hpc2n}      |
  Buy a 486-33 you can reboot faster..
