httpd-dev mailing list archives

From Graham Leggett
Subject Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
Date Wed, 25 Oct 2006 20:21:26 GMT
Joe Orton wrote:

> There is no other acceptable solution AFAICS.  Buffering the entire 
> brigade (either to disk, or into RAM as the current code does) before 
> writing to the client is not OK, polling on buckets is not possible, 
> using threads is not OK, using non-blocking writes up the output filter 
> chain is not possible.  Any other ideas?

I managed to solve this problem last night.

Took a while and a lot of digging to figure it out, but in the end it is 
relatively simple.

The ap_core_output_filter helps us out:

     /* Scan through the brigade and decide whether to attempt a write,
      * based on the following rules:
      *  1) The new_bb is null: Do a nonblocking write of as much as
      *     possible: do a nonblocking write of as much data as possible,
      *     then save the rest in ctx->buffered_bb.  (If new_bb == NULL,
      *     it probably means that the MPM is doing asynchronous write
      *     completion and has just determined that this connection
      *     is writable.)

Brigades handed to the output filter are written to the network with a 
non blocking write. Any parts of the brigades which cannot be written 
without blocking are set aside to be sent the next time the filter is 
invoked with more data.

There is a catch - the output filter will only set aside a certain number 
of non-file buckets before it enforces a blocking write to clear the 
backlog and keep memory usage down. The solution to this catch is to 
ensure that you always write file buckets to the network.

This way, the output filter will never block [1].
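
As a rough illustration of the catch described above, here is a toy model in plain C. It is only a sketch: the types, the function and the limit of 4 are all made up for this example and are not the APR bucket API or the real threshold used by ap_core_output_filter.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for bucket types; NOT the APR bucket API. */
enum toy_bucket_kind { TOY_FILE, TOY_MEMORY };

struct toy_bucket {
    enum toy_bucket_kind kind;
    size_t len;
};

/* Hypothetical limit, chosen only for illustration. */
#define TOY_MAX_NONFILE_BUCKETS 4

/* Returns 1 if the toy filter would fall back to a blocking write,
 * i.e. if the backlog holds too many non-file buckets to set aside.
 * File buckets never count towards the limit. */
int toy_must_block(const struct toy_bucket *backlog, size_t n)
{
    size_t nonfile = 0;

    assert(backlog != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (backlog[i].kind != TOY_FILE) {
            nonfile++;
        }
    }
    return nonfile >= TOY_MAX_NONFILE_BUCKETS;
}
```

In this model a backlog made up only of file buckets never triggers the blocking path, however long it gets, which is the property relied on above.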

Enter mod_disk_cache.

One of the last things mod_disk_cache does after saving the body is to 
replace whatever buckets were just written, regardless of bucket type [2], 
in the brigade with a file bucket pointing at the cached file and 
containing the exact same data.

This behaviour has two side effects. First, responses no longer hang 
around in RAM waiting to be sent to a slow client; they can now sit on 
disk [3]. This potentially improves performance for "expensive" 
processes like CGI, which can go away immediately instead of hanging 
around waiting for slow clients. Second, the bucket handed to the 
output filter is a file bucket - and therefore can be set aside and 
handled with non-blocking writes by the output filter.

Now, enter mod_cache.

None of the above would mean anything if the file buckets being sent 
consisted of a single 4.7GB bucket. In this case, save_body() would 
only finish after 4.7GB was written to disk, and the network write would 
only start after the first complete invocation of save_body(). By 
that point the browser has got bored and is long gone.

Oops. What will we do?

But mod_cache no longer passes 4.7GB file buckets to the providers, it 
now splits them up into buckets of a maximum size defaulting to 16MB.

So 16MB at a time gets written to the cache, then written to the 
network without blocking, then the next 16MB gets written to the cache, 
and so on. Suddenly the write-to-cache, then write-to-network problem 
is gone, and without threads, and without fork.
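
To make the arithmetic of the split concrete, here is a small sketch in plain C. The helper names and the toy_piece struct are hypothetical and exist only for this example; this is not the mod_cache code, and the 16MB figure is simply the default mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Default maximum piece size mentioned above (16MB). */
#define TOY_MAX_PIECE (INT64_C(16) * 1024 * 1024)

/* One piece of the original file: a byte range. Hypothetical type. */
struct toy_piece {
    int64_t offset;
    int64_t len;
};

/* How many pieces of at most max_piece bytes cover total bytes. */
int64_t toy_piece_count(int64_t total, int64_t max_piece)
{
    assert(total >= 0 && max_piece > 0);
    return (total + max_piece - 1) / max_piece;
}

/* Byte range of piece number index (0-based). Every piece is
 * max_piece bytes long except possibly the last one. */
struct toy_piece toy_piece_at(int64_t total, int64_t max_piece, int64_t index)
{
    struct toy_piece p;

    assert(index >= 0 && index < toy_piece_count(total, max_piece));

    p.offset = index * max_piece;
    p.len = total - p.offset;
    if (p.len > max_piece) {
        p.len = max_piece;
    }
    return p;
}
```

With these helpers a 250MB file comes out as 16 pieces: fifteen of 16MB and a final one of 10MB. Each piece is small enough to be written to the cache and then handed on to the network before the next one is touched.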

Run a wget on a 250MB file. Watch it being downloaded and cached at the 
same time: the size of the file in the cache tracks the size of the 
download reported by wget. Run a second wget on the same file moments 
later. Watch that wget quickly read the file from the cache up to where 
the first wget is running, and then watch it track the first wget's 
progress from that point on. Run cmp on the original file, the 
downloaded files, and the cache body: they are all the same.

Works like a charm.

The work is not finished. There are alternate use cases that need to be 
checked. Some alternate use cases are not practical to handle, and we 
must make decisions on these.

This code however is based on code running in production right now, so 
bugs should be reasonably clear and straightforward.

I need some help on the behaviour of the brigades, especially with the 
cleanup of the brigades so they don't hang around for the entire request 

I also need help replacing some of the less savory solutions that people 
are not happy with, like fstat/sleep.

Please don't mail me any more about copy_body(). This function is no 
longer necessary and will be removed next. Hopefully the above will 
explain why copy_body() was attempted in the first place, as flawed as 
it was. It was committed as is because it was a prerequisite of Niklas' 
second patch, which is a critical component of the above. It was better 
to commit then change, rather than never commit the first or second 
patches, and never get anywhere.

[1] testing shows this is the case. More testing is needed to make sure 
this is true in all cases.

[2] I need to teach mod_disk_cache to handle metadata buckets more 

[3] Again, I need some help making sure brigades are cleared when they 
should be, and there are no leaks in mod_disk_cache.

