httpd-dev mailing list archives

From Graham Leggett <minf...@sharp.fm>
Subject Re: mod_cache summary and plan
Date Sun, 29 Oct 2006 12:44:09 GMT
Davi Arnaut wrote:

> . Problem:

You have described two separate problems below.

> For a moment forget about file buckets and large files; what's really at
> stake is proxy/cache brigade management when the arrival rate is too
> high (e.g. a single 4.7GB file bucket, or high-rate input data to be
> consumed by a relatively low-rate consumer).
> 
> By operating as a normal output filter, mod_cache must deal with
> potentially large brigades of (possibly non-stock) bucket types created
> by other filters in the chain.

This first problem has largely been solved, bar some testing.

The solution was to pass the output filter through to the save_body() hook, 
and let the save_body() code decide for itself the best time to write the 
bucket(s) to the network.

For example, in the disk cache, the apr_bucket_read() loop reads the 4.7GB 
file in 4MB chunks. Each chunk is cached, then written to the network, then 
cleaned up. Rinse, repeat.

Previously, save_body() was expected to save all 4.7GB to the cache, and 
then only write the first byte to the network possibly minutes later.

If a filter ahead of the cache converted file buckets into heap buckets for 
any reason (for example mod_deflate), then save_body() would try to hold 
4.7GB of heap buckets in RAM to pass to the network later, and boom.

How mod_disk_cache chooses to send data to the network is an entirely 
separate issue, detailed below.

> The problem arises from the fact that the mod_disk_cache store function
> traverses the brigade by itself, reading each bucket in order to write
> its contents to disk, potentially filling memory with large chunks
> of data allocated/created by the bucket type's read function (e.g. file
> bucket).

To put this another way:

The core problem in the old cache code was the assumption that it was 
practical to call apr_bucket_read() on the same data _twice_ - once during 
caching, and once during the network write.

This assumption isn't valid, thus the recent fixes.

> . Constraints:
> 
> No threads/forked processes.
> Bucket type specific workarounds won't work.
> No core changes/knowledge, easily back-portable fixes are preferable.
> 
> . Proposed solution:
> 
> File buffering (or a part of Graham's last approach).
> 
> The solution consists of using the cache file as an output buffer by
> splitting the buckets into smaller chunks and writing them to disk. Once
> written (apr_file_write_full), a new file bucket is created with the
> offset and size of the just-written buffer. The old bucket is deleted.
> 
> After that, the new bucket is inserted into a temporary (empty) brigade
> and sent down the output filter stack for (probably) network i/o.
> 
> At a quick glance, this solution may sound absurd -- the chunk is
> already in memory, and the output filter might need it again in memory
> soon. But there's no silver bullet, and it's a simple enough approach to
> solve the growing memory problem without incurring performance
> penalties.

As soon as apr_file_write_full() returns, the data just saved to the disk 
cache is also sitting in the kernel's buffer cache - meaning that the 
corresponding apr_bucket_read() in the network code afterwards reads data 
that is already cached in kernel memory.

In performance testing, on files small enough to be buffered by the 
kernel (a few MB), the initial part of the download after caching is 
very fast.

What this technique guarantees is that regardless of the source of the 
response - be it a file, a CGI, or a proxy - what gets written to the 
network is always a file, and so always takes advantage of kernel-based 
file performance features.
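To illustrate the idea, a minimal sketch of the per-chunk step follows.
Again this is only a sketch under assumptions, not the actual patch:
cache_fd must be open for reading as well as writing, and *offset is
assumed to track how much has been appended to the cache file so far.

static apr_status_t pass_from_cache_file(ap_filter_t *next,
                                         apr_file_t *cache_fd,
                                         apr_off_t *offset,
                                         const char *data, apr_size_t len,
                                         apr_pool_t *pool,
                                         apr_bucket_alloc_t *ba)
{
    apr_bucket_brigade *tmp = apr_brigade_create(pool, ba);
    apr_size_t written;
    apr_bucket *file_e;
    apr_status_t rv;

    /* Write the chunk to the cache file first... */
    rv = apr_file_write_full(cache_fd, data, len, &written);
    if (rv != APR_SUCCESS) {
        return rv;
    }

    /* ...then point a file bucket at the bytes just written.  When the
     * network code reads this bucket it normally hits the kernel page
     * cache rather than going back to disk. */
    file_e = apr_bucket_file_create(cache_fd, *offset, len, pool, ba);
    *offset += len;

    APR_BRIGADE_INSERT_TAIL(tmp, file_e);
    rv = ap_pass_brigade(next, tmp);
    apr_brigade_cleanup(tmp);
    return rv;
}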

Regards,
Graham
--
