httpd-dev mailing list archives

From Ruediger Pluem <rpl...@apache.org>
Subject Re: cache: the store_body interface
Date Mon, 30 Oct 2006 21:13:09 GMT


On 10/30/2006 07:53 PM, Graham Leggett wrote:
> Justin Erenkrantz wrote:
> 
>>> 1) change the store_body interface to allow the storage provider direct
>>> access to f->next, so it can flush buckets up the output filter chain
>>> when they have been stored.  As seen on trunk.

I am with Justin here. Dealing with the filter chain should not be the business
of the storage provider, as it duplicates code and makes the implementation of
providers too complex and error-prone IMHO.

>>>
>>> 2) keep the interface as-is, but read buckets in mod_cache and partition
>>> the brigade manually; only pass a "small" brigade with known-length
>>> buckets to the provider. (so no morphing and no arbitrary memory
>>> consumption)

As far as I can see this small brigade would only contain the following bucket
types (pipe and file buckets would get morphed due to apr_bucket_read):

heap
transient
mmap


>>>
>>> 3) change the interface: deal with the buckets entirely in mod_cache and
>>> just pass (char *,size_t) pairs to store_body

Compared to 2) this is a more straightforward interface, as 2) effectively
reduces to the same thing, only disguised in the more complex data structures
of buckets and brigades.
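
To make the comparison concrete, here is a minimal sketch of what 3) could look
like. It is not the actual mod_cache API: cache_provider, mem_store, chunk and
cache_store_chunks are hypothetical names, and the chunk array merely stands in
for the data mod_cache would get from reading each bucket.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical provider interface for option 3: no buckets, no brigades. */
typedef struct cache_provider {
    /* Returns 0 on success, non-zero on failure. */
    int (*store_body)(void *handle, const char *data, size_t len);
} cache_provider;

/* Toy in-memory provider, standing in for a disk or memory cache. */
typedef struct mem_store {
    char buf[256];
    size_t used;
} mem_store;

static int mem_store_body(void *handle, const char *data, size_t len)
{
    mem_store *s = handle;

    if (len > sizeof(s->buf) - s->used)
        return -1;                      /* would overflow the toy buffer */
    memcpy(s->buf + s->used, data, len);
    s->used += len;
    return 0;
}

/* Stand-in for the (char *, size_t) pair obtained from one bucket read. */
typedef struct chunk {
    const char *data;
    size_t len;
} chunk;

/* The caller (mod_cache in the proposal) owns the loop over the data and
 * hands the provider nothing but plain buffers. */
static int cache_store_chunks(const cache_provider *p, void *handle,
                              const chunk *chunks, size_t n)
{
    size_t i;

    assert(p != NULL && p->store_body != NULL);
    for (i = 0; i < n; i++) {
        int rv;

        if (chunks[i].len == 0)
            continue;                   /* nothing to store for empty data */
        rv = p->store_body(handle, chunks[i].data, chunks[i].len);
        if (rv != 0)
            return rv;                  /* stop at the first provider error */
    }
    return 0;
}
```

The provider in this sketch never sees where the data came from, which is
exactly the property (and the limitation) discussed for 3).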

>>>
>>> 4) change the interface: pass some abstract "flush-me" callback in,
>>> which the provider can call to pass up then delete the bucket.
>>> (apr_brigade_flush doesn't quite fit the bill unfortunately)

Just curious: Why do you think it does not fit the bill? Because it requires
a brigade instead of a bucket or because we possibly would need to pass the
filter to it as ctx?
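
For discussion, a rough sketch of how I read 4), under my own assumptions:
flush_me_fn, item, downstream and provider_store_body are made-up names, item
stands in for a bucket and downstream for whatever sits behind f->next. The
point is only that the provider gets an opaque callback plus a ctx pointer and
never touches the filter chain itself.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a bucket: a piece of data with a known length. */
typedef struct item {
    const char *data;
    size_t len;
} item;

/* Hypothetical "flush-me" callback: pass one stored item up the chain.
 * ctx is opaque to the provider; the caller could put the filter in it. */
typedef int (*flush_me_fn)(void *ctx, const item *it);

/* Stand-in for the next filter: just counts what it was handed. */
typedef struct downstream {
    size_t items_sent;
    size_t bytes_sent;
} downstream;

static int pass_downstream(void *ctx, const item *it)
{
    downstream *d = ctx;

    d->items_sent++;
    d->bytes_sent += it->len;
    return 0;
}

/* Provider side: "store" each item (here only counted in *stored), then
 * ask the caller to flush it. The provider knows nothing about filters. */
static int provider_store_body(size_t *stored, const item *items, size_t n,
                               flush_me_fn flush, void *flush_ctx)
{
    size_t i;

    assert(stored != NULL && flush != NULL);
    for (i = 0; i < n; i++) {
        int rv;

        *stored += items[i].len;        /* pretend the data was written */
        rv = flush(flush_ctx, &items[i]);
        if (rv != 0)
            return rv;                  /* propagate downstream errors */
    }
    return 0;
}
```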

>>>
>>> IMO:
>>>
>>> if you're going to be reading buckets from the brigade in mod_cache, you
>>> might as well go the whole hog and do (3), and stop exposing the
>>> provider to buckets or brigades at all.  This will prevent the provider
>>> from doing any particular optimisations based on content type (like
>>> copying FILE buckets); feature or bug, take your pick.

I wouldn't call it a bug, but possibly a lack of feature that the API makes it
impossible to create a provider that can do things like calling splice/tee on
Linux or any similar thing on another OS to speed up copying data from the
backend disk to the cache disk (a corner case, I know). OTOH we have no hard
numbers that tell us how much performance we really lose if we write (possibly)
MMAPed data back to the cache file. And I think Colm proved by his tests that this
way (which is effectively used on 2.2.x today) is not all too slow ;-)
BTW: Does anybody know if MMAP for writing files is possible / makes sense /
improves performance?
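
On the "possible" part only: on POSIX systems a file can be written through a
shared, writable mapping. The sketch below is plain POSIX, not APR or httpd
code, and write_via_mmap is a hypothetical helper name. It says nothing about
whether this makes sense or is faster for a cache file.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Write len bytes to path through a shared writable mapping.
 * Returns 0 on success, -1 on any failure. */
static int write_via_mmap(const char *path, const char *data, size_t len)
{
    int fd;
    int rv;
    void *map;

    assert(path != NULL && data != NULL);
    if (len == 0)
        return -1;                      /* mmap() rejects zero-length maps */

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* The file must already have its final size before it is mapped. */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memcpy(map, data, len);             /* the "write" is a memory copy */
    rv = msync(map, len, MS_SYNC);      /* push the dirty pages to the file */

    if (munmap(map, len) != 0)
        rv = -1;
    if (close(fd) != 0)
        rv = -1;
    return rv;
}
```

Note that the final size has to be known (or the file repeatedly extended and
remapped) before writing, which is one practical difference to write().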


>>
>>
>> #3 gets my vote.  I hate bucket brigades anyway.  ;-)
> 
> 
> Thinking about this a bit, the fact that buckets can be "too weird" at
> times, and every cache provider has to now care about this weirdness
> while caching, letting mod_cache deal with the weirdness and letting the
> cache providers deal with a trivial buffer is probably the way to go.
> 
> #3 it is.

If anybody can point out the downsides of #4, you will get me into the #3 group ;-).


Regards

Rüdiger
