httpd-dev mailing list archives

From Niklas Edmundsson <>
Subject Re: Thundering herd patch for mod_cache
Date Tue, 12 Jan 2010 09:54:46 GMT
On Sun, 10 Jan 2010, Graham Leggett wrote:

>> - In ap_cache_try_lock - do we really need the hashed directories
>> hardwired?  I thought most modern filesystems had abandoned
>> linear search, so that kind of thing is redundant!
>> At least make it optional - perhaps an additional arg to
>> the Cachelock directive.
> I was trying to avoid yet-another-directive. The locks files are very
> short lived, so there should be few of them at any given time. Is it
> reasonable to assume that no modern platforms suffer linear search?
> (*cough* Windows *cough* :) ).

In general, the defaults for the hashed dirs (for mod_disk_cache at 
least) are insanely over-provisioned. You generally end up with one 
file per directory, even on a cache filesystem holding lots and lots 
of files. CacheDirLength 1 and CacheDirLevels 2 give 4096 directories 
(64^2), which should be enough for most sites/filesystems.
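For concreteness, a config fragment along those lines (the CacheRoot 
path here is made up, not a recommendation):

```apache
# Hypothetical example of shallower cache-directory hashing:
CacheRoot      /var/cache/httpd/disk_cache
CacheDirLength 1
CacheDirLevels 2
# Each level consumes CacheDirLength characters of the hashed key,
# so 2 levels of 1 character give 64^2 = 4096 leaf directories.
```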

Also, the default behaviour of removing cache directories makes no 
sense: if a directory ends up needing removal, it shouldn't have been 
created in the first place.

That said, I don't think it would be a good idea to remove the hashing 
entirely. Having a sane number of files per directory is rather nice 
if you happen to run ls in it, among other things.

>> - Do we need to use lock files like this?  Not I think in
>> every case: with mod_disk_cache we already have files we
>> could lock (and create if they don't already exist).
> This is how I started to look at the problem, but I discovered there is
> ultimately no overlap between the thundering herd lock and
> mod_disk_cache. The thundering herd lock locks URLs right at the start
> of the request, while mod_disk_cache decides to cache and create files
> at the very end of the request. If we tried to lock cache files in
> mod_disk_cache, we would still leave a wide gap of time between the
> start of the request and the time the backend server sent a response,
> which in many cases can be many hundreds of milliseconds later, and
> enough to allow hundreds or thousands of requests through depending on load.

You might remember my mod_disk_cache jumbopatch, which works around 
the same issue (and more). There were a number of odd issues I had to 
sidestep in rather ugly ways, and even though it does its job 
beautifully, the code is hairy and bloated.

I'm no fan of lock files, but this seems a more elegant way to do it 
than patching things up in mod_disk_cache. As the locks are only 
runtime metadata, the performance-inclined might want to put them on a 
tmpfs/RAM filesystem.
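For anyone who hasn't read the patch: the basic try-lock idea is just 
exclusive creation of a file named after the URL. A minimal sketch, 
assuming O_EXCL-style semantics (function names, the md5 naming, and 
the lock directory are all hypothetical, not the patch's actual code):

```python
import hashlib
import os
import tempfile

# Performance-minded admins might point this at a tmpfs mount instead.
LOCK_DIR = tempfile.gettempdir()


def _lock_path(url: str) -> str:
    """Derive a lock-file name from the URL (md5 chosen arbitrarily)."""
    return os.path.join(LOCK_DIR, hashlib.md5(url.encode()).hexdigest() + ".lck")


def try_lock(url: str) -> bool:
    """Try to become the one request that refreshes `url`.

    Returns True if we won the lock, False if another request holds it.
    O_CREAT | O_EXCL makes creation atomic: exactly one caller succeeds.
    """
    try:
        fd = os.open(_lock_path(url), os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def unlock(url: str) -> None:
    """Release the lock once the refresh is done (locks are short-lived)."""
    os.unlink(_lock_path(url))
```

The short lifetime of these files is what makes the scheme cheap: at 
any instant there is roughly one lock file per URL currently being 
refreshed, not per URL cached.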

>> - Your ELSE clause (cache_util, line 623) for requests
>> for a stale object that is being refreshed by another
>> request serves the stale object and adds a warning.
>> Is this the best thing to do?  What about waiting for
>> the file to be fetched?  Should still be quicker than
>> going to the backend in parallel with the other request.
> If the lock kept you waiting for 1000 milliseconds, and 1000 requests
> arrived in that time[1], you could very quickly tie up all the httpd
> children and start rejecting requests. As we have a stale cached copy of
> the request at hand, we might as well serve the stale copy in the mean
> time and shed those 1000 requests, rather than leaving them backed up
> behind us.
> [1] These are the kinds of numbers we have in the environment we have
> that needed the thundering herd lock.

We see similar bursts on an Ubuntu/Firefox release, and we're dealing 
with rather large files. OK, most of the burstiness is due to 
download managers trying to spawn gazillions of connections, but the 
issue is indeed real.
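The serve-stale trade-off described above can be sketched roughly like 
this; everything here (names, the in-memory `refreshing` set standing 
in for the on-disk lock files) is hypothetical, not the actual mod_cache 
code:

```python
# Hypothetical sketch: serve a stale copy while exactly one request
# revalidates, instead of queueing everyone up behind the lock.

def serve(url, cache, refreshing, fetch_backend):
    """Return (body, warning); warning is set only when stale is served."""
    entry = cache.get(url)
    if entry and not entry["stale"]:
        return entry["body"], None          # fresh hit
    if entry and url in refreshing:
        # Another request holds the thundering-herd lock. Rather than tie
        # up this child waiting, serve the stale copy with an RFC 2616
        # Warning header (110 Response is stale).
        return entry["body"], "110 Response is stale"
    refreshing.add(url)                     # we won the lock: go to backend
    try:
        body = fetch_backend(url)
        cache[url] = {"body": body, "stale": False}
        return body, None
    finally:
        refreshing.discard(url)
```

With a burst of 1000 requests for one stale URL, only the lock holder 
goes to the backend; the other 999 are shed immediately with the stale 
body instead of occupying httpd children.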

  Niklas Edmundsson, Admin @ {acc,hpc2n}      |
  Take pot to the beach; leave no tern unstoned.
