apr-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Cliff Woolley <cliffwool...@yahoo.com>
Subject Re: brigade/bucket splitting (was: Re: cvs commit: apr-util STATUS)x
Date Tue, 12 Dec 2000 05:12:56 GMT

--- Greg Stein <gstein@lyra.org> wrote:
> The reason that I asked for a bucket return value was so that we could split
> the brigade at point P3/4, and get a pointer to B4 back (the start of the
> copy). We split at point P6/7 and get P7 back (the bucket after the last to
> copy).

Oooohhh.... yeah, actually, Greg has a really great point here.  Something I myself
had seriously doubted when thinking about ap_brigade_split_offset() was the amount
of bookkeeping required to use it when you desire multiple splits.  Looking at it
Greg's way solves this problem.

The byterange filter actually is a good example of the general-use case we're trying
to provide for, since it does exactly the sorts of things we would want to do in the
general case: access pre-determined byte ranges (perhaps even overlapping) in an
arbitrary order.  (To demonstrate the generality of the situation, the example
OtherBill gave of a GIF filter that accessed certain image frames in the GIF steam
would do the same thing (though I doubt either of these would need overlapping).)

The bookkeeping problem when you actually split the brigade into two brigades is
seen when you start splitting the original brigade into many pieces: on the second
call to brigade_split_offset, you have to account for the non-zero absolute value of
the first byte in the second brigade (does that make sense?).  It gets really
complicated when you're splitting the brigade into many, many pieces... how do you
manage to keep the little brigades in order and know which one to split and where to
split it upon the next call to brigade_split_offset?  It's a big old mess.

But if you do as Greg suggests and just PARTITION the brigade at the given point by
splitting the middle bucket and returning a pointer to the new bucket after the
split, then all the byterange filter (and others like it) has to do is for each
range keep a pointer to the first and last bucket in the range (where the "last"
bucket is the result of the split, so it's an exclusive bound; maybe it would be
better to say that we keep a pointer to the first and "first-after" buckets...
whatever, you pick the names.)... that way, when you want to retrieve the data, in a
range, all you do is loop like so:
    for (e=first; e!=last; e=AP_BUCKET_NEXT(e)) { ... }

This sounds like a really terrific plan to me.  Note that it also supports
overlapping regions of the brigade for free... each region has no knowledge of the
others.  It's a pretty trivial modification to turn ap_brigade_split_offset() into
ap_brigade_partition() (seems like a good name to me)... all you have to do is
change the return value in the success case to be the ap_bucket* that split_offset
would have passed into ap_brigade_split().  Of course, that also means a simplistic
caller that really does just want to split a brigade into two brigades at a given
offset can emulate the ap_brigade_split_offset() very easily using
ap_brigade_partition() by just passing its return value into ap_brigade_split().

I really like this idea.  Illustrative patch to follow shortly.


Do You Yahoo!?
Yahoo! Shopping - Thousands of Stores. Millions of Products.

View raw message