apr-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ryan Bloom <...@covalent.net>
Subject Re: cvs commit: apr-util/buckets apr_buckets_socket.c
Date Wed, 10 Oct 2001 15:11:08 GMT
On Wednesday 10 October 2001 07:55 am, Justin Erenkrantz wrote:
> On Wed, Oct 10, 2001 at 06:59:32AM -0700, Ryan Bloom wrote:
> > 	do
> > 	    do stuff so that we don't wait until there is data before
> > 	    we process what we have now.
> > 	while (apr_bucket_read == APR_EAGAIN)
> >
> > to:
> > 	do
> > 	    do stuff just like above
> > 	whild (apr_bucket_read == APR_SUCCESS &&
> > 		APR_BUCKET_FIRST()->length == 0)
> >
> > Libraries should return errors, because they are valid information that
> > the caller can use.  They also do not break the abstraction, because we
> > have essentially said that all bucket _can_ return EAGAIN, although
> > anybody who looks at the code will realize quickly that only socket and
> > pipe bucket ever _will_ return EAGAIN.
> See, I think that's my complaint.  Now, we have to always handle
> EAGAIN in every case where we read the buckets to be correct to the
> abstraction.  Something just seems wrong about having to add this for
> a special case where we could reasonably dictate how to handle it
> in the special case.

If you want to be 100% correct, yeah we do.  But we have to handle it
anyway, because pipe buckets can return EAGAIN as well.  In fact, any
bucket with a non-determinant length can return EAGAIN.

> I have a feeling that rather than fixing this, we're just going to
> ditch the socket bucket entirely (what you and Greg have suggested).
> And, I'd rather not do that because it means that the socket bucket
> isn't doing what we want.  And, if it isn't doing what we want, I
> think we need to stop and ask why.

Either way, you have to handle the EAGAIN case.  Either you have to
handle it because the socket returned it to you, or the bucket did.  There
is no difference in the code you have to write.  The reality is that if you
put in the code in this patch, then the special case still exists, but it is
hidden, making it harder to find, but just as important to know about.
The special case moves from "EAGAIN returned" to "SUCCES returned,
and first bucket has 0 length, if the second bucket has an indeterminant
length".  That is painful to check.

We should move away from the socket bucket because it isn't necessary
anymore.  We don't need a socket bucket, because the socket is being
handled completely inside of the core input filter.

> I think in the example of CORE_IN, it would just completely ignore
> EAGAIN and return a 0-length brigade.  In my mind, that just seems
> the only rational decision to make there (perhaps not in the general
> case).  The core filters should not be doing anything to slow down
> processing.

Yeah, most filters will want to just keep going, but this patch makes the
shortcut very hard to see.  If we can just check EAGAIN, then we can
easily branch to the spot where we return the 0 length brigade.  With
this patch, we must check the length of the first and second buckets
to know that we are in the EAGAIN case.

Imagine an output filter that wants to process the data coming from a 
CGI script.  If it gets EAGAIN, then it just passes what it has, and when
the function returns, it calls apr_bucket_read again.  That logic is this

	if (apr_bucket_read == APR_EAGAIN) {
Assuming that logic is in a while loop, this just works.  With this patch,
it becomes

	if (apr_bucket_read == APR_EAGAIN &&
	    APR_BUCKET_FIRST->lenght == 0 &&
	    APR_BUCKET_NEXT(APR_BUCKET_FIRST)->length == -1) {

> But, I'm not sure how any reader of the buckets will know to do a
> select/poll/etc *without* violating the knowledge that the bucket
> is a socket-based bucket.  Then, it'd also have to retrieve the
> socket from the bucket and then do a select/poll/etc on it.  And,
> you'd have to handle the pipe case differently as well.  I guess
> I see the violation occurring when the caller tries to "resolve"
> the EAGAIN not with the EAGAIN itself.  The only way the caller
> can do that (from my understanding) is to know what the storage
> type is.  And, that's verboten, IMHO.

9 times out ot 10, we shouldn't be polling to get the data.  The bucket
code does that for us if you call it with a blocking read, and that
will either pop when the timeout pops or as soon as there is one
byte of data.  There is no reason for the caller to ever do the poll itself.

Ryan Bloom				rbb@apache.org
Covalent Technologies			rbb@covalent.net

View raw message