httpd-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Justin Erenkrantz <jerenkra...@apache.org>
Subject Re: SSL Input Filter bogosity
Date Thu, 31 Oct 2002 09:37:29 GMT
--On Thursday, October 31, 2002 12:19 AM -0600 "William A. Rowe, Jr." 
<wrowe@apache.org> wrote:

> OpenSSL 0.9.6g does so.  Why shouldn't we?

Because OpenSSL is a library, we're not.

> However, if we don't have inl worth of bytes, and they are sitting
> ready (on the socket) shouldn't we fetch them?  Forget the GETLINE
> bogosity, it means nothing to SSL.

No.  I think you are missing the point of having this code understand 
AP_MODE_GETLINE.  When we try to read a line, we don't have a way of 
really knowing how much to read.  So, we try to read a maximum of 8k 
(whatever AP_IOBUFSIZE is).  That merely defines what our maximum 
line length is (actually that is the maximum that mod_ssl will ever 
return on a read).  The generic read code doesn't have logic for 
determining when to stop.  That code is only contained in the logic 
that understands AP_MODE_GETLINE (does the memchr call).

Yet, in all probability, the chances are that the line is going to be 
very short (consider HTTP headers).  Therefore, as a critical 
optimization, we don't necessarily want to read the full 8k, but we 
want to try to see if we have enough with what we already have.  So, 
when we read anything with AP_MODE_GETLINE, we should exit out of our 
generic read call and then check to see if we've satisfied the 
getline requirements.

Removing the short-circuits is going to make all AP_MODE_GETLINE 
calls block for 8k of data every time.  That's going to be 
unacceptable when we're reading the headers (which is a blocking 
read) as we may end up reading too much data and we will end up 
reading past the headers (which commonly take less than 8k of data). 
When reading from the SSL socket generically, AP_MODE_GETLINE calls 
can only safely handle the output of one socket read, and I believe, 
if there was any data left over from the last read, we should attempt 
to check if that was a line as well (again, chances are that it is 
when called with AP_MODE_GETLINE).  Once we have the output of that 
socket read, the getline-aware logic can then determine if it is 
enough (it saw the LF), or to read some more (given the AP_IOBUFSIZE 
constraint).  Doing blind blocking as your patch does can't work.

Therefore, I believe these AP_MODE_GETLINE optimizations must stay in.

As a hunch, I would believe it would be better to return partial data 
up the filter chain sooner rather than waiting for the entirety even 
for AP_MODE_READBYTES calls as well.  There is no assumption that a 
filter *must* return all of the bytes it was requested.  On a 
blocking read, the only guarantee is that it should return at least 
one byte.  If we're really waiting around for the network, I think 
it'd be best to compute what we already have and then once we're 
ready, try to read from the network again.  Of course, I have no 
numbers to prove this thought, but I wouldn't be shocked if it is.

>> We should do a conditional on the APR_BRIGADE_EMPTY() check if
>> inbio->block is non-blocking.  It's considered a design error if a
>> filter returns an empty brigade on a blocking call.
>
> Who said we are blocking?  This could be a SPECULATIVE call
> with a NONBLOCKING request, no?

Huh?  We have the socket mode as we are passing that parameter to 
ap_get_brigade.  So, of course, we know if we are blocking.  Again, 
the common case is that we are blocking.

>> Should the APR_BUCKET_IS_EOS rather be APR_BUCKET_IS_METADATA?
>> Not sure here.  Perhaps.
>
> Hmmm.  I was thinking about the METADATA case.
>
> Do you suppose we should percolate METADATA buckets back out
> to the filter_read of SSL?  I suppose metadata should just go
> through unharmed.

I think so.  We really don't have a case where METADATA originates 
from the core, but it's possible.

> However, we have to react to EOS, since that's the end of the input
> available to the SSL pump.

Yes, as a special case of METADATA.

> Getline means nothing in the context of fetching bytes off of
> an SSL socket.  We must shove the raw bytes into the SSL
> pump in order to return any sort of data (SPECULATIVE, READBYTES,
> GETLINE, etc) from the SSL pump.  The raw bytes just need
> to be fetched.
>
> Of course, it's generally nonblocking, so if we don't get a hit
> from the socket, the new code just returns whatever we got.

No, it's not generally non-blocking (see ap_rgetline_core for the 
initial call to ap_get_brigade - it is blocking).  So, attempting to 
read the full 8k when we don't have that available will not work. 
Again, the AP_MODE_GETLINE is important even within the generic 
reading of the code - see above.

> No, there really isn't.  It was impossible to look at inbio.rv when
> we needed too, since ctx wasn't passed, but the SSL ctx.

Then, change the function parameters.  I'm just concerned that we're 
going to be substantially adding to another function.  mod_ssl has a 
lot of places where functions go on and on and on and on and on. 
Breaking it into little pieces isn't that bad of an idea - this patch 
is making a function considerably longer and more complicated.

If you want to do something *really* productive with mod_ssl to help 
it be more understandable, please go and change those bizzaro 
function names to actually mean something.  ssl_io_filter_Input. 
Gah.  -- justin

Mime
View raw message