httpd-dev mailing list archives

From Niklas Edmundsson <ni...@acc.umu.se>
Subject Re: mod_cache: store_body() bites off more than it can chew
Date Mon, 06 Sep 2010 14:49:34 GMT
On Mon, 6 Sep 2010, Paul Fee wrote:

> If mod_disk_cache's on disk format is changing, now may be an opportunity to
> investigate some options to improve performance of httpd as a caching proxy.
>
> Currently headers and data are in separate files.  If they were in a single
> file, the operating system is given more indication that these two items are
> tightly coupled.  For example, when the headers are read in, the O/S can
> readahead and buffer part of the body.
>
> A difficulty with this could be refreshing the headers after a response to a
> conditional GET.  If the headers are at the start of the file and they
> change size, then they may overwrite the start of the existing body.  You
> could leave room for expansion (risks wasted space and may not be enough) or
> you could put the headers at the end of the file (may not benefit from
> readahead).

I tried to go the single-file route, but after having banged my head
against the above issue and others while trying to design/implement
something that would work for read-while-caching using only O_EXCL
file locking, I did some benchmarking, found out that the gain was
minimal, and reverted to having a separate header and body file.
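
For readers unfamiliar with O_EXCL-style locking, here is a minimal
sketch of the general idea in Python. It is not code from the patchset:
the function name try_claim and the lock file name are hypothetical,
and it only shows the "first creator wins" property that O_CREAT |
O_EXCL provides, not the read-while-caching logic itself.

```python
import os
import tempfile


def try_claim(path: str) -> bool:
    """Return True if this caller created `path`, False if it already existed.

    O_CREAT | O_EXCL makes creation fail when the file is already there,
    so at most one caller can win the right to populate a cache entry.
    """
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    except FileExistsError:
        return False  # someone else is already caching this entry
    os.close(fd)
    return True


if __name__ == "__main__":
    # Hypothetical file name, used only for this demonstration.
    lock = os.path.join(tempfile.mkdtemp(), "entry.body")
    print(try_claim(lock))  # first caller creates the file
    print(try_claim(lock))  # second caller finds it already present
```

A caller that loses the race would then read the file the winner is
still writing, which is where the single-file layout got complicated.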

What DID matter VERY MUCH regarding performance was the totally bogus
defaults which affect the number of directories mod_disk_cache
creates. CacheDirLength 1 and CacheDirLevels 2 gives you 4096
directories (64^2) that hold files, which is enough for many millions
of files even on an fs that isn't too good at coping with many entries
in a directory. With the defaults you tend to end up with one
directory for each query, which is far from optimal.
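
The arithmetic behind that can be sketched as follows. This is only an
illustration: the function names are made up here, and the 64-character
alphabet per directory-name character is an assumption taken from the
64^2 figure above.

```python
def leaf_directories(dir_levels: int, dir_length: int, alphabet: int = 64) -> int:
    """Number of directories at the deepest level of the cache tree.

    Each level offers alphabet**dir_length possible names, and there are
    dir_levels nested levels. The alphabet size of 64 is an assumption.
    """
    return (alphabet ** dir_length) ** dir_levels


def files_per_directory(total_files: int, dir_levels: int, dir_length: int) -> float:
    """Average number of files per leaf directory, assuming an even spread."""
    return total_files / leaf_directories(dir_levels, dir_length)


if __name__ == "__main__":
    # CacheDirLevels 2, CacheDirLength 1, as suggested above.
    print(leaf_directories(2, 1))                 # 4096
    print(files_per_directory(10_000_000, 2, 1))  # 2441.40625
    # A deeper, wider layout for comparison: far more directories
    # than files for any realistic cache size.
    print(leaf_directories(3, 2))                 # 68719476736
```

With ten million cached files, the 2-level/1-character layout averages
about 2400 files per directory, whereas the deeper layout has so many
possible directories that nearly every file ends up alone in its own.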

Also, set CacheRemoveDirectories false because otherwise 
mod_disk_cache creates and deletes directories all the time which is a 
total waste of time. If you need to delete cache dirs then you have 
tuned yourself into the wrong corner, so IMHO that part of 
mod_disk_cache is plainly wrong.

Oh, this rant is based on xfs on Linux, from when I was hacking on our
large-file-cache-patchset. The basics should apply to most other
fs/os combos too ;)

> On a similar theme, would filesystem extended attributes be suitable for
> storing the headers?  The cache file's contents would be the entity body.  A
> problem with this approach could be portability.  However the APR could
> abstract this, reverting to separate files on platforms/filesystems that
> didn't offer extended attributes.
>
> http://en.wikipedia.org/wiki/Extended_file_attributes
>
> I haven't tested extended attributes to see if they offer performance gains
> over separate header and body files.  However it seems cleaner to have both
> parts in one file.  It should also eliminate race conditions where
> headers/body could get out of sync.

I'm honestly not sure you will get any massive performance gains, only
benchmarks will tell :) The consistency issues should be cleaner to
handle, though.

Also, you will/might lose any possibility to have multiple headers 
pointing to the same body (classic example is multiple URLs resulting 
in the same plain file).

/Nikke
-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      |     nikke@acc.umu.se
---------------------------------------------------------------------------
  IBM stands for Inferior But Marketable.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
