httpd-dev mailing list archives

From "Roy T. Fielding" <field...@kiwi.ICS.UCI.EDU>
Subject Re: [PATCH] Filter registration.
Date Fri, 28 Jul 2000 02:50:59 GMT
>"Roy T. Fielding" <fielding@kiwi.ICS.UCI.EDU> wrote:
>>Tony Finch <dot@dotat.at> wrote:
>>>I think the lifetime of a bucket should be independent of the request
>>>handler function because otherwise you lose a lot of advantages of the
>>>abstraction. I wouldn't consider it to be a properly first-class data
>>>type if you can't return it from a function.
>>
>>Hmmm, I disagree -- you just have to make it part of the abstraction.
>>If we say that the bucket always has to be emptied before it can be
>>returned, and we think of the write call always "returning" the
>>bucket when it returns, then this does match the abstraction.
>
>I don't think this is useful because a lot of the data that you put
>into a bucket has a longer lifetime than only one trip down and up the
>call stack, and you want to be able to make the most of that property.

That is the data inside the bucket.  The bucket structure itself is
rarely usable across invocations because of the way filters work -- they
end up either transforming the data (in which case the bucket goes away)
or storing the data in a way that is efficient for later processing
(in which case the bucket will most likely go away).

When the write gets to the stream-end, the last filter (BUFF today) must
look at the data and decide whether to hold onto it or send it.  If the
module was implemented sensibly, the passed bucket brigade will either
have the entire response or at least one large-write worth of data.
So, either the last layer is going to write all of that data or it
is going to store all that data (pipelining a short response or a
poorly developed module's short data).  In order to avoid memory
fragmentation and iovecs > 16, the safest thing for the bottom layer
to do is combine the small data buckets into a single buffer.  This
also has the effect of improving the layer 2 cache performance, or
at least that's what I gathered from all of Dean's notes.

Think of it this way: a bucket is a framing context that points to
a piece of data somewhere and identifies what functions can be
applied to that data.  Splitting a bucket results in two buckets
pointing to the same data (possibly half of which has been converted
because some data sources, like files, won't support multiple pointers).
A bucket is like a generic, portable FILE * structure.

In some cases it is best to allocate the bucket structure from a
heap, and in others it is best to allocate it from the stack.  We could
easily choose either one (provided the allocations come from the right
pool in the heap case, and provided that the filter calls downstream
filters on a write for the stack case).  Greg's point, I believe, is
that it is easier to manage and more efficient on the common case
for the bucket structure to always be allocated from the stack, or at
least to assume that is the case within receiving filters.  We then only
have to worry about managing the lifetime of the data within the bucket.
Try it and see. 

>>It also makes a certain degree of sense.  What we really care about
>>being persistent is the contents of the bucket, not the bucket itself.
>
>But the bucket exists to hold the metadata for its contents, and so
>the bucket should exist as long as the contents do. I'm not very happy
>with the current names for the bucket colours because I find them
>confusing. Instead I prefer these names; some of them are less
>necessary than others so won't be implemented right away:
>
>IMMORTAL
>	data that's around for ever, like literal strings
>TRANSIENT
>	short-lifetime data (e.g. in a buffer that will be re-written
>	when the call stack is unwound)
>POOL
>	data in a pool, e.g. a variable in the environment table in
>	the request pool
>HEAP
>	data on the C heap that must be free()d when the bucket is
>	destroyed
>MMAP
>	mmap()ed data that must be munmap()ed when the bucket is
>	destroyed
>FILE
>	a region of a file (not in memory yet) for sendfile()
>PIPE
>	data that will (in the future) come from a pipe e.g. a CGI
>
>The basic idea is that when a bucket is created the contents that you
>pass to the creation function become the responsibility of the bucket.
>There are a couple of exceptions (the immortal and transient colours)
>for reasons of optimisation (and because the free operation is a
>no-op), and in the pool case the bucket shares responsibility with the
>pool code.

It's an interesting idea, but all you are doing is naming the function
that is set as the free() pointer for the bucket.  Rather than coloring
buckets by their behavior, we should be coloring them by their interface.
That is, the significant distinctions between bucket types are:

   how do you read the data within this bucket
   what are you supposed to do with the data read (metadata vs data)
   how do you set aside this bucket for the next call

Aside from those issues, all you need is the appropriate function pointers.

>>Consider, for example, real-life bucket brigades.  The bucket is
>>passed downstream with water and then back up empty.
>
>This analogy doesn't really work because we don't always completely
>empty the buckets at the bottom.

Sure we do -- if the bucket content doesn't get tossed onto the fire,
then it is poured into a trough near the fire.  Only the bucket returns.

....Roy
