httpd-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Davi Arnaut <>
Subject Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
Date Thu, 26 Oct 2006 11:46:58 GMT
Graham Leggett wrote:
> On Wed, October 25, 2006 11:48 pm, Davi Arnaut wrote:
>> No, we start with a file bucket on the brigade, I will try to explain
>> what happens to see if we are talking about the same thing.
>> Suppose we have a brigade containing a a file bucket, and the file size
>> is 4.7GB. We want to read it fully.
>> When we call apr_bucket_read() on this bucket, we end-up calling the
>> bucket read function (file_bucket_read). What does the bucket file read do
>> ?
>> If mmap is supported, it mmaps APR_MMAP_LIMIT bytes (4MB) by creating a
>> new mmap bucket and splits the bucket. So, after calling read on a file
>> bucket you have two buckets on the brigade. The first one is the mmap
>> bucket and last is file bucket with a updated offset.
>> The same thing happens if mmap is not supported, but the bucket type
>> will be a heap bucket. If we don't delete or flush those implicitly
>> created buckets they will keep the whole file in memory, but one single
>> read will not put the entire file on memory.
>> What Joe's patch does is remove this first implicitly created bucket
>> from the brigade, placing it on the brigade on a temporary brigade for
>> sending it to the client.
> Ok, this makes more sense, but in its current form the temporary brigade
> should not be necessary.

But it would be better. We don't need to keep creating brigades with
apr_brigade_split(), we could only move the buckets to a temporary
brigade. Take a look at apr_brigade_split:

APU_DECLARE(apr_bucket_brigade *) apr_brigade_split(apr_bucket_brigade
*b, apr_bucket *e)
    apr_bucket_brigade *a;
    apr_bucket *f;

    a = apr_brigade_create(b->p, b->bucket_alloc);
    /* Return an empty brigade if there is nothing left in
     * the first brigade to split off
    if (e != APR_BRIGADE_SENTINEL(b)) {
        f = APR_RING_LAST(&b->list);
        APR_RING_UNSPLICE(e, f, link);
        APR_RING_SPLICE_HEAD(&a->list, e, f, apr_bucket, link);


    return a;

If we used a tmpbb we could replace the apr_brigade_split with something
similar to:

    if (e != APR_BRIGADE_SENTINEL(b)) {
        f = APR_RING_LAST(&b->list);
        APR_RING_UNSPLICE(e, f, link);
        APR_RING_SPLICE_HEAD(&a->list, e, f, apr_bucket, link);

And instead of destroying the brigade, we could only do a call to:


> While disk cache goes through the brigade, it replaces buckets on the
> incoming brigade with a bucket referring to the cached file.
> I think a number of the thorny memory problems in mod_*cache have come
> about because the brigade is read twice - once by store_body, and once by
> ap_core_output_filter.

Maybe. Buckets are reference counted and get reused pretty quickly.

> The replacing of the buckets with file buckets in the brigade by disk
> cache means the brigade will only be read once - by save_body().

Right, I liked your approach. I just wanted to be sure everyone is on
the same page on this issue.

>> That's why splitting the brigade with magical values (16MB) is not such
>> a good idea, because the bucket type knows betters and will split the
>> bucket anyway.
> Right, the picture is getting clearer.
> Look closer at the cache code in mod_cache, right at the end of the
> CACHE_SAVE filter.
> Notice how store_body() is called, followed by ap_pass_brigade(). Notice
> how store_body() is expected to handle the complete brigade, all 4.7GB of
> it, before ap_pass_brigade() has a look see. Now - create a 4.7GB file on
> your drive. Copy this file to a second filename. Time this and tell me how
> long it takes. Do you see the further problem?


> The split-before-save_body patch wasn't described well enough by me - it
> is designed to prevent save_body() from having to handle bucket sizes that
> are impractically large to handle by a cache, both in terms of memory, and
> in terms of time.

That is clear.

Davi Arnaut

View raw message