From Graham Leggett
Subject Re: mod_cache: store_body() bites off more than it can chew
Date Sun, 12 Sep 2010 23:13:06 GMT
On 06 Sep 2010, at 11:00 PM, Paul Querna wrote:

> Isn't this problem an artifact of how all bucket brigades work, and is
> present in all output filter chains?
> An output filter might be called multiple times, but a single bucket
> can still contain a 4gb chunk easily.
> It seems to me it would be better to think about this holistically
> down the entire output filter chain, rather than building in special
> case support for this inside mod_cache's internal methods?

In the cache case, having thought about it a bit, the in and out
brigades are probably unavoidable. The cache is a special case in that
it wants to write the data twice: once to the cache, and a second time
to the rest of the filter stack.

Right now, the cache is forced to read the complete brigade in order
to cache it, with no option to give up early. The cache also has no
choice but to keep the buckets in the brigade so that they can be
passed a second time up the filter stack, so it cannot delete buckets
as it goes like you normally would. Read one 4GB file bucket in the
cache, and in the process the file bucket gets morphed into half a
million heap buckets, oops.

With two brigades, one in and one out, buckets can be removed from the
in brigade as they are consumed, as normal, and moved to the out
brigade. The cache can quit at any time, and the code following knows
what data to write to the network (out), and what data to loop round
and resend to the cache (in). The cache provider could choose to quit
and ask to be called again either because writing took too long, or
because too much data was read (and in the process became heap
buckets); either reason is fine.
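
To make the two-brigade idea concrete, here is a rough sketch in plain
C. It is not the mod_cache code and does not use APR: toy_brigade,
toy_store_body(), toy_cache_and_send() and the TOY_MAX_PER_CALL budget
are all hypothetical names invented for illustration, and a "brigade"
is reduced to a byte count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bucket brigade: only a byte count. */
typedef struct {
    size_t bytes;
} toy_brigade;

/* Hypothetical budget: the most the cache will chew on per call. */
#define TOY_MAX_PER_CALL 32768

/*
 * Hypothetical store_body(): consume at most TOY_MAX_PER_CALL bytes
 * from `in`, "write" them to the cache, and move them to `out`.
 * Whatever is left in `in` has not been cached yet.
 * Returns the number of bytes cached in this call.
 */
size_t toy_store_body(toy_brigade *in, toy_brigade *out)
{
    size_t n = in->bytes < TOY_MAX_PER_CALL ? in->bytes : TOY_MAX_PER_CALL;
    in->bytes -= n;   /* removed from in as it is consumed */
    out->bytes += n;  /* ready to go to the rest of the filter stack */
    return n;
}

/*
 * Caller loop: keep handing `in` back to the cache until it is empty,
 * sending `out` onward after every call.
 * Returns the number of toy_store_body() calls made. *sent receives
 * the total passed onward, *max_held the most data ever held in `out`.
 */
size_t toy_cache_and_send(size_t total, size_t *sent, size_t *max_held)
{
    toy_brigade in = { total };
    toy_brigade out = { 0 };
    size_t calls = 0;

    *sent = 0;
    *max_held = 0;
    while (in.bytes > 0) {
        toy_store_body(&in, &out);
        calls++;
        if (out.bytes > *max_held) {
            *max_held = out.bytes;
        }
        /* Stand-in for passing `out` down the filter stack. */
        *sent += out.bytes;
        out.bytes = 0;
    }
    assert(*sent == total);  /* every byte was cached and sent once */
    return calls;
}
```

With these made-up numbers, toy_cache_and_send(100000, &sent, &held)
makes four calls and never holds more than 32768 bytes in the out
brigade, instead of holding the whole body at once.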

That said, following on your suggestion of thinking about this in the  
general sense, it would be really nice if the filter stack had the  
option to say "I have bitten off as much of the brigade as I am  
prepared to chew on right now, and the leftovers are still in the  
brigade, can you call me back with this data, maybe with more data  
added, and I'll try to swallow some more?".

In theory, that would mean all handlers (or entities that sent data)
would no longer be allowed to make the blind assumption that the
filter stack was willing to consume every possible set of buckets the
handler wanted to send. The stack would have the right to go "I'm
full, give me a second to chew on this".

This wouldn't need separate brigades, probably just a return code that  
meant EAGAIN, and that was expected to be honoured by handlers.
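
As a rough sketch of what honouring such a return code might look
like, again in plain C with no APR: fs_brigade, fs_pass_brigade(),
fs_handler_send(), FS_EAGAIN and FS_APPETITE are hypothetical names,
not existing httpd API, and the brigade is again only a byte count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes; FS_EAGAIN means "call me back". */
enum fs_status { FS_SUCCESS = 0, FS_EAGAIN = 1 };

/* Hypothetical stand-in for a bucket brigade: only a byte count. */
typedef struct {
    size_t bytes;
} fs_brigade;

/* Hypothetical limit on what the stack swallows per call. */
#define FS_APPETITE 16384

/*
 * Hypothetical filter stack entry point: consume up to FS_APPETITE
 * bytes, leave the leftovers in `bb`, and return FS_EAGAIN if any
 * data remains to be passed in again.
 */
enum fs_status fs_pass_brigade(fs_brigade *bb, size_t *consumed)
{
    size_t n = bb->bytes < FS_APPETITE ? bb->bytes : FS_APPETITE;
    bb->bytes -= n;
    *consumed += n;
    return bb->bytes > 0 ? FS_EAGAIN : FS_SUCCESS;
}

/*
 * Handler that generates `total` bytes, `chunk` bytes at a time, and
 * honours FS_EAGAIN by passing the same brigade again rather than
 * assuming it was consumed. Returns the number of calls made into
 * the stack; *consumed receives the bytes the stack accepted.
 */
size_t fs_handler_send(size_t total, size_t chunk, size_t *consumed)
{
    fs_brigade bb = { 0 };
    size_t produced = 0;
    size_t calls = 0;
    enum fs_status rv = FS_SUCCESS;

    assert(chunk > 0);
    *consumed = 0;
    while (produced < total || bb.bytes > 0) {
        /* Only generate more data once the stack has caught up. */
        if (rv == FS_SUCCESS && produced < total) {
            size_t left = total - produced;
            size_t n = left < chunk ? left : chunk;
            bb.bytes += n;
            produced += n;
        }
        rv = fs_pass_brigade(&bb, consumed);
        calls++;
    }
    assert(*consumed == total);
    return calls;
}
```

This handler waits until the stack has caught up before adding more
data; a handler could equally top up the brigade before calling back,
as long as the leftovers stay in it.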


