httpd-dev mailing list archives

Subject Re: Implementing split() on pipe buckets?
Date Sun, 12 Nov 2000 04:21:28 GMT
On Sat, 11 Nov 2000, Cliff Woolley wrote:
> --- wrote:
> > This patch fixes the problem of a pipe split where there is enough data in
> > the pipe to split at the desired length, because it splits the pipe into a
> > heap and a pipe bucket at the correct place.  What if there isn't enough
> > data to do the split at the correct place?  This patch doesn't handle that
> > case.  All the other bucket types return an error if the split is past the
> > end of the bucket.
> That's why I mentioned in my preceding comments that pipe_split() might want to
> check the value of *len after calling pipe_readn(), comparing it to the value it
> passed in.  If the two differ, then there wasn't enough data in the pipe to split at
> the right place.  In that case, APR_EINVAL would be returned.  The pipe bucket would
> have been morphed in the process, but that's going to happen as soon as you read it
> anyway.

But there's more to it than that.  There may be enough information coming,
but it may not be there yet.

> > The only logical way to deal with pipe and socket splits is to read the
> > data and then split it.  The split functions can't do this, because that
> > changes the definition of split, so it is up to the program itself.
> But pipe_read() IS a split function, if you think about it.  Its only shortcomings
> if it wants to become pipe_split() are that it needs to obey the passed in length
> and that it needs to have the right parameter list.  That's all!  When you read from
> pipe_readn(), it does the split for you!  There's no need to try to split, find out
> you can, then read... why not just let the pipe bucket do the read for you?

Because that changes the definition of the function.  The split function
takes a single bucket and turns it into two buckets of the same type,
pointing to two different sections of the data.  By changing split into
the function you have implemented, you have taken control out of the hands
of the program and put it in the hands of a function that may or may not
do what you want.  You haven't gained anything by making the change, except
a single if statement (which is replaced by a couple of ifs inside the
split function).  The change makes the definition of a function depend on
the bucket type, and that is not good.
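
For reference, the definition of split described above can be sketched for
an in-memory bucket as follows.  This is only an illustration: struct
mem_bucket and mem_bucket_split() are hypothetical names, not the real
bucket API.  The point is that nothing is read or copied, and a split past
the end is a definite error because the length is known up front.

```c
#include <stddef.h>

/* Hypothetical stand-in for an in-memory bucket: a view onto shared data.
 * This is NOT the real bucket structure, just enough to show the idea. */
struct mem_bucket {
    const char *data;   /* start of this bucket's section of the data */
    size_t      length; /* number of bytes in this section */
};

/* Split *b at `point`: *b keeps [0, point), *rest gets [point, length).
 * Returns 0 on success, -1 if `point` is past the end of the bucket.
 * Both buckets end up pointing into the same underlying buffer. */
int mem_bucket_split(struct mem_bucket *b, size_t point,
                     struct mem_bucket *rest)
{
    if (point > b->length) {
        return -1;  /* length is known, so this can be rejected outright */
    }
    rest->data   = b->data + point;
    rest->length = b->length - point;
    b->length    = point;
    return 0;
}
```

Splitting exactly at the end still succeeds in this sketch and leaves a
zero-length second bucket.  A pipe has no known length to compare against,
so it has no equivalent of the check in the first if.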

> If I understand your argument, it's just that you dislike the idea of there being
> any side effects of split() (ie, that the pipe bucket gets morphed into a heap+pipe
> sequence of buckets).  But read() already has that side effect!  A pipe doesn't do

I dislike having the split function be overloaded.  It doesn't make any
sense to split a pipe or socket bucket, since they hold an incalculable
amount of data.  There are so many cases to deal with that it isn't
worth it.  This should just fall back to a read and then a split.  Allow
me to explain all of the conditions that will have to be handled:

Read succeeds and enough data is returned to split at the expected place.

Read succeeds, but not enough data is returned.  This is an error
condition, but a second read would return enough information to complete
the split.
Read succeeds, but not enough data is returned, neither does the second
read, because this is a long-lived CGI request.  A third or fourth read
would have returned enough data to fulfill the split properly.

Read fails, there is no more data coming from the pipe.  This is an error.

Read succeeds, but not enough data, and there is no more on the
pipe.  This is an error.

There are at least five conditions that must be handled, and this gets
more complicated when you start adding in subsequent reads and whether
they should block or not.
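
To make the cases concrete, here is a minimal sketch of what the
program-side handling might look like.  It uses plain POSIX read() on a
file descriptor instead of the bucket API, and every name in it
(fill_for_split, enum fill_result, max_reads) is made up for illustration.
It also assumes that "no more data coming" shows up as read() returning 0.
Whether each read blocks is left to the caller, through the mode of the
descriptor.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical outcome codes, one per situation the caller must handle. */
enum fill_result {
    FILL_OK,        /* enough data arrived; the caller can now split */
    FILL_NOT_YET,   /* still short after max_reads; more may arrive later */
    FILL_EOF_SHORT, /* some data arrived, then the pipe ended */
    FILL_EOF_EMPTY, /* the pipe ended with no data at all */
    FILL_ERROR      /* read() failed for some other reason */
};

/* Read from fd until `want` bytes are in buf, the pipe ends, or max_reads
 * attempts have been made.  *got is set to the number of bytes read. */
enum fill_result fill_for_split(int fd, char *buf, size_t want,
                                int max_reads, size_t *got)
{
    *got = 0;
    for (int i = 0; i < max_reads && *got < want; i++) {
        ssize_t n = read(fd, buf + *got, want - *got);
        if (n > 0) {
            *got += (size_t)n;
        } else if (n == 0) {
            /* End of pipe: a short result can never be completed. */
            return *got > 0 ? FILL_EOF_SHORT : FILL_EOF_EMPTY;
        } else if (errno != EINTR && errno != EAGAIN
                   && errno != EWOULDBLOCK) {
            return FILL_ERROR;
        }
        /* Interrupted or nothing available yet: counts as one attempt. */
    }
    return *got >= want ? FILL_OK : FILL_NOT_YET;
}
```

Even this small loop has to pick a retry limit and decide what a short
read means, and those are decisions only the calling program can make
sensibly.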

This is best handled by the program itself, not by the bucket code.  Want
to throw another wrench in the works?  Usually split returns two buckets,
but with pipe and socket buckets, it is possible to come to the end of the
pipe or socket, so split would return a single bucket.  Yes, it is
possible to split the other bucket types at the very end of the bucket,
but in that case we always return two buckets: one with the data, and a
second, zero-length bucket.  The pipe and socket buckets destroy
themselves automatically when there is no more data.


Ryan Bloom               
406 29th St.
San Francisco, CA 94131
