apr-dev mailing list archives

From Greg Stein <gst...@lyra.org>
Subject Re: apr_bucket_simple_split
Date Mon, 27 Aug 2001 08:29:04 GMT
On Sun, Aug 26, 2001 at 04:48:48PM -0700, Ryan Bloom wrote:
>...
> > Greg Stein wrote:
> > > Untrue. Please explain why a PIPE bucket cannot be split at byte 100?
> > > Sure, it doesn't know its length, but it can easily read in 100 bytes,
> > > give you that, and leave itself as the second part of the split.

> OtherBill wrote:
> > I agree here, we can split an unknown length pipe bucket at a known point.
> > I'd suggest that we really want a constant, APR_BUCKET_UNKNOWN_LEN (which
> > needs to map to MAX_SIZE_T or MAX_OFF_T, see below).  But I'm not certain
> > that the apr_bucket_split_simple() should learn how to do this, sounds like
> > a job for apr_bucket_split_indeterminate() or something.
> 
> No, it is not possible to split a bucket with an unknown length.  I have explained
> this at great length, many times.

Two things: (1) I don't recall these "corrective explanations", and (2)
explaining multiple times, over and over, doesn't mean it becomes true.

That said... let's return to the problem.

> Bucket split is a non-destructive operation,

Who said? Bucket split means "split this data into two buckets, so that I
can manipulate the two pieces."

> but splitting a bucket without a length changes the make-up of the brigade. 
> Instead of having two buckets of the same type, you can end up having
> multiple heap buckets, and one pipe bucket.

Who cares what the bucket types are? The *whole point* of the system is to
isolate the type of the data from the user. If bucket A happens to have a
different type from bucket B, then who cares? Nobody *should* care
whatsoever.

By definition, bucket->split creates two buckets from one. Why does it
matter that the two resulting buckets have different types from the
original?

Are you saying that my GREG bucket cannot be split into JACOB and STEIN
buckets? That I *must* split them into GREG and GREG buckets? That if I do
otherwise, I will have to report to the FSF for Anti-Freedom Lashings? ;-)

> Yes, I agree that splitting a PIPE or SOCKET bucket is easy in the simple case.
> My problem is the non-trivial case.  What do you do when there is an error, or
> when you can't read enough data from the bucket?  How many times do you
> try to read?

These are not problems which prevent the definition of splitting an
indeterminate length bucket. The split operation can return errors. If a
problem occurs during the split, then return an error.

Any bucket can generate an error trying to split itself. Pipes and sockets
are not unique in that regard. I can say right now that a "database record"
bucket is going to exist in some third-party module, some time. You can bet
that will return errors from any bucket function -- that DB connection could
drop any time.

That said, what should a bucket do on a split with a short read? It could
return an error, it could return a "warning" with APR_INCOMPLETE, or it
could be totally undefined.

> It is best to leave these decisions to the filter that is doing the
> splitting.

The filter certainly cannot handle these kinds of problems. How could it
possibly know what is happening within the bucket? It shouldn't have to
know. It can try to split it, and it can receive success or an error.

Requiring it to know more than that breaks the abstraction of the buckets.
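
To make the abstraction argument concrete, here is a minimal sketch of what the filter's side of a split could look like. Every name in it (toy_bucket, toy_status, heap_split, db_split, filter_split_at) is hypothetical and is not the APR bucket API. The toy models only the status path: the filter calls split through the bucket and looks at nothing but the returned status.

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical; this is not the APR bucket API. */
typedef enum { TOY_SUCCESS = 0, TOY_EINVAL, TOY_EIO } toy_status;

typedef struct toy_bucket toy_bucket;
struct toy_bucket {
    const char *type_name;                            /* for debugging only */
    toy_status (*split)(toy_bucket *b, size_t point); /* per-type split */
    size_t length;
    int connection_dropped;                           /* simulated backend failure */
};

/* Data already in memory: the only possible failure is a bad split point. */
static toy_status heap_split(toy_bucket *b, size_t point)
{
    if (point > b->length)
        return TOY_EINVAL;
    b->length = point;  /* keep the first part; building the second bucket
                           is omitted from this toy */
    return TOY_SUCCESS;
}

/* A "database record" style bucket: its backend can fail at any time. */
static toy_status db_split(toy_bucket *b, size_t point)
{
    (void)point;
    if (b->connection_dropped)
        return TOY_EIO;
    return TOY_SUCCESS;
}

/* The filter's view: no switch on bucket type, only success or an error. */
static toy_status filter_split_at(toy_bucket *b, size_t point)
{
    assert(b != NULL && b->split != NULL);
    return b->split(b, point);
}
```

Note that filter_split_at() never inspects type_name. Whether the failure came from a bad offset or a dropped connection, the filter sees a status and nothing else.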


Let's return to some previous points...

> OtherBill wrote:
> > I agree here, we can split an unknown length pipe bucket at a known point.
> > I'd suggest that we really want a constant, APR_BUCKET_UNKNOWN_LEN (which
> > needs to map to MAX_SIZE_T or MAX_OFF_T, see below).  But I'm not certain
> > that the apr_bucket_split_simple() should learn how to do this, sounds like
> > a job for apr_bucket_split_indeterminate() or something.

The APR_BUCKET_UNKNOWN_LEN is an excellent idea. Much better than the
free-floating "-1" constants throughout the code. The symbol is much more
descriptive, and it is more resilient to changes in the underlying bucket
system.
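
As a rough illustration of the difference a symbol makes, here is a small sketch. TOY_BUCKET_UNKNOWN_LEN, toy_bucket and toy_length_is_known are hypothetical stand-ins, not existing APR definitions, and mapping the constant to SIZE_MAX is an assumption based on the MAX_SIZE_T suggestion above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the proposed APR_BUCKET_UNKNOWN_LEN.
 * SIZE_MAX has the same value as (size_t)-1, so existing "-1" checks
 * on a size_t length field would keep working. */
#define TOY_BUCKET_UNKNOWN_LEN SIZE_MAX

typedef struct {
    size_t length;   /* TOY_BUCKET_UNKNOWN_LEN when the length is not known */
} toy_bucket;

/* Reads better than "b->length != -1" scattered through the code. */
static int toy_length_is_known(const toy_bucket *b)
{
    assert(b != NULL);
    return b->length != TOY_BUCKET_UNKNOWN_LEN;
}
```

If the sentinel ever had to change, only the #define would need touching, which is the resilience argument in a nutshell.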

Note that split_simple() will probably never be called in the case of an
unknown length. Simple buckets essentially have all their data already.
Pipes and sockets (those with unknown lengths) will generally have their own
split functions. And your idea of split_indeterminate actually works well
here: read N bytes (which creates a HEAP bucket or somesuch), then "split".
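
The "read N bytes, then split" idea can be sketched with a toy model. All names below (toy_bucket, toy_pipe_split, TOY_UNKNOWN_LEN, the TOY_* status codes) are hypothetical and not the APR API, and the "pipe" is simulated by an in-memory string. Returning an incomplete-style status on a short read is just one of the possible policies, picked arbitrarily for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy model; none of these names are the APR API. */
#define TOY_UNKNOWN_LEN ((size_t)-1)

typedef enum { TOY_SUCCESS = 0, TOY_INCOMPLETE, TOY_ENOMEM } toy_status;
typedef enum { TOY_HEAP, TOY_PIPE } toy_type;

typedef struct toy_bucket {
    toy_type type;
    size_t length;        /* TOY_UNKNOWN_LEN while the bucket is a pipe */
    char *heap_data;      /* TOY_HEAP: owned copy of the bytes */
    const char *stream;   /* TOY_PIPE: simulated unread input */
    size_t stream_left;   /* TOY_PIPE: bytes the simulated pipe still holds */
    struct toy_bucket *next;
} toy_bucket;

/* Split a pipe bucket at `point`: read `point` bytes into a new heap bucket
 * returned in *first, and leave `src` itself as the second part. */
static toy_status toy_pipe_split(toy_bucket *src, size_t point,
                                 toy_bucket **first)
{
    assert(src != NULL && first != NULL && src->type == TOY_PIPE);
    *first = NULL;

    /* Short read. The toy can see the end of its simulated stream up front;
     * a real pipe would only find this out by reading. */
    if (src->stream_left < point)
        return TOY_INCOMPLETE;

    toy_bucket *heap = malloc(sizeof *heap);
    if (heap == NULL)
        return TOY_ENOMEM;
    heap->heap_data = malloc(point ? point : 1);
    if (heap->heap_data == NULL) {
        free(heap);
        return TOY_ENOMEM;
    }

    memcpy(heap->heap_data, src->stream, point);  /* the "read N bytes" step */
    heap->type = TOY_HEAP;
    heap->length = point;
    heap->stream = NULL;
    heap->stream_left = 0;
    heap->next = src;              /* heap part sits in front of the pipe */

    src->stream += point;          /* the pipe keeps the unread remainder */
    src->stream_left -= point;
    *first = heap;
    return TOY_SUCCESS;
}
```

After a successful call the caller holds a HEAP bucket of exactly `point` bytes followed by the original PIPE bucket, still of unknown length. The two halves have different types, and nothing about the split is harmed by that.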


This falls back to a previous point. Note that read() changes the makeup of
a brigade. If we were so concerned about brigade makeup and stability, then
I'd be *much* more surprised to find that a read() changed the brigade, than
finding that a split() did. If anybody wants to complain about split()
changing bucket types, then they probably ought to start with complaining
about how read() can change types of buckets.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/
