httpd-dev mailing list archives

From "Graham Leggett" <>
Subject Re: svn commit: r468373 - in /httpd/httpd/trunk: CHANGES modules/cache/mod_cache.c modules/cache/mod_cache.h modules/cache/mod_disk_cache.c modules/cache/mod_disk_cache.h modules/cache/mod_mem_cache.c
Date Mon, 30 Oct 2006 13:48:18 GMT
On Mon, October 30, 2006 2:44 pm, Nick Kew wrote:

> Hang on!  Where's the file coming from?  If it's local and static,
> what is mod_cache supposed to gain you?  And if not, what put it
> in a (single) file bucket before it reached mod_cache?

In the case of the person who reported this issue, the file is coming from
a very slow but very large NFS-mounted, multi-terabyte filesystem, and is
served as a normal file. 10% of the files are responsible for 90% of the
traffic, so it makes sense to have a small, fast disk cache in front of
the slow NFS filesystem.

The normal serve-a-static-file content generator puts the entire file into
a single bucket.

>> - apr_bucket_read() assumes that a bucket will only ever be read once.
>> In so doing, it may morph buckets into heap buckets while reading,
>> when buckets are too large to be read in one go. This behaviour is
>> undocumented (I plan to fix that).
> Yes.  But what is reading them?

First, the cache_body() hook, for the purposes of caching the file.

Second, the next filter in the filter stack.
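The morphing behaviour described above can be modelled in plain C roughly as
follows. The types and names here are hypothetical simplifications for
illustration only, not the real APR bucket API: a "file bucket" covers a byte
range, and reading it morphs the first chunk into a heap bucket while leaving
a shorter file bucket for the remainder.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified model of an APR-style bucket (NOT the real
 * APR API): reading a large file bucket morphs it into a heap bucket
 * holding the first chunk, chaining a file bucket for the rest. */
#define CHUNK 8192

typedef enum { BUCKET_FILE, BUCKET_HEAP } bucket_type;

typedef struct bucket {
    bucket_type type;
    size_t start, length;   /* byte range within the (simulated) file */
    char *heap_data;        /* set once morphed into a heap bucket */
    struct bucket *next;    /* remainder of the original range */
} bucket;

/* Read at most CHUNK bytes: the bucket itself morphs to heap, and a new
 * file bucket covering the unread remainder is inserted after it. */
static size_t bucket_read(bucket *b, const char *file_contents)
{
    size_t n = b->length < CHUNK ? b->length : CHUNK;
    if (b->type == BUCKET_FILE && b->length > CHUNK) {
        bucket *rest = calloc(1, sizeof(*rest));
        rest->type = BUCKET_FILE;
        rest->start = b->start + CHUNK;
        rest->length = b->length - CHUNK;
        rest->next = b->next;
        b->next = rest;
        b->length = CHUNK;
    }
    b->heap_data = malloc(n);
    memcpy(b->heap_data, file_contents + b->start, n);
    b->type = BUCKET_HEAP;  /* the original bucket has morphed */
    return n;
}
```

This is why reading the same bucket twice is problematic: after the first
read, the original file bucket no longer exists in its original form.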

> If mod_disk_cache gets a single file bucket as input, does it
> actually need to read the file?  It can send the file bucket
> down the chain as-is, having given it a filesystem entry in
> cache space.
> OK, that falls down if the cache's filespace is not on the same
> disc as the file bucket.  But that in itself is a major overhead,
> and begs my first question: what is mod_cache supposed to gain?
> Mod_cache fronting a jukebox?  Right, then you do want to copy
> the file: can't the cache filter itself pass buckets as it reads
> them?  Of course it can.  But just because this case exists
> doesn't mean the cache filter should insist on reading every
> file bucket it gets!

How will the cache filter cache the bucket without reading the bucket?

> OK, how about this for an alternative: introduce an apr_bucket_clone
> method, that works by reference-counting and lazy copying, and
> in the case of a file bucket, asynchronous copying.  The filter
> can clone the bucket, pass one copy on immediately, and save the
> other: then the save will actually read the file if and only
> if it's copying between filesystems, and the filter chain can
> use sendfile.
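A rough sketch of what such an apr_bucket_clone might look like, in plain C
with illustrative types: cloning bumps a reference count and shares the
underlying data, and a private copy is only materialised when a writer
actually needs one. None of these names exist in APR; this is a sketch of
the proposal, not an implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical refcounted payload shared between cloned buckets. */
typedef struct shared_data {
    int refcount;
    char *bytes;
    size_t len;
} shared_data;

typedef struct lazy_bucket {
    shared_data *data;
} lazy_bucket;

/* Clone by reference: no data is copied, just a refcount bump. */
static lazy_bucket clone_bucket(lazy_bucket *b)
{
    b->data->refcount++;
    lazy_bucket c = { b->data };
    return c;
}

/* Lazy copy: materialise a private copy only when this holder needs
 * its own (e.g. the cache copying between filesystems); until then
 * the other copy can be passed down the chain and use sendfile. */
static void make_private(lazy_bucket *b)
{
    if (b->data->refcount > 1) {
        shared_data *d = malloc(sizeof(*d));
        d->refcount = 1;
        d->len = b->data->len;
        d->bytes = malloc(d->len);
        memcpy(d->bytes, b->data->bytes, d->len);
        b->data->refcount--;
        b->data = d;
    }
}
```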
> I haven't thought this through: I put it forward as the kind of
> proposal that might fix the problem without breaking Justin's
> expectations.

You are basically describing Niklas' solution as originally proposed.

His code singled out file buckets, and when one was found it triggered
special code that knew how to copy a large file bucket to the cache using a
normal file copy, from the fd buried in the file bucket to the fd of the
cached file.

This was objected to by various people, who believed the cache should not
isolate buckets and treat them specially in this way.

An alternative (and in testing, better performing) method was to write the
original bucket regardless of type to the cache, and then replace the (now
heap) bucket with an equivalent bucket pointing at the cached file. As
this cached file has just been written, it is present in kernel caching
buffers. Reading back from this file is therefore very fast.

This removed the need to read the same bucket twice - the second read is
of a brand new file bucket.

It also allowed us to read the response to completion very quickly,
serving the many-orders-of-magnitude-slower network client at leisure from
the cache file.

This has enormous advantages for mod_cgi and other heavy processes - the
heavy process can complete and release resources, without waiting for the
slow client.

This is how the disk cache works right now this minute.

> Really?  So if a DVD image comes in 8K chunks from mod_proxy,
> mod_cache is going to buffer everything?  Erm .... why?

It won't - so far only the file content generator puts large chunks of
content in one bucket.

But that's no guarantee that some other content generator won't do the
same.
> Are you saying mod_cache enforces that?  Or mod_disk_cache?
> In the latter case, there's always the option of introducing
> a new provider for large files.

Joe proposed that; it's not a bad idea.

> OK, I plead guilty to not reviewing them.  Did you motivate review
> by accompanying them with an explanation (as above) of what
> brokenness they fixed?

The patches fixed a number of issues, including but not limited to:

- The issue above (caching of large files causes out of memory, timeouts)

- The thundering herd issue for request bodies. This issue is
significantly more complex than the issue above; I think it would be best
to discuss it in a separate thread.

