httpd-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Bill Stoddard <b...@wstoddard.com>
Subject Re: [PATCH] Move http header parsing into the http_input_filter
Date Thu, 27 Feb 2003 01:46:12 GMT
Justin Erenkrantz wrote:
> --On Tuesday, February 25, 2003 8:09 PM -0500 Bill Stoddard 
> <bill@wstoddard.com> wrote:
> 
>> Ok, I know what I was thinking. Check out tmp_bb in ap_read_request
>> in protocol.c (unpatched version). We create tmp_bb, do
>> read_request_line, ap_get_mime_headers then destroy tmp_bb.  My
>> patch does essentially the same thing.  If there is a request body
>> to read, whoeever needs to read it will create a new brigade. In
>> this regard, my patch does not create any new bogosity. (and no, we
>> do not throw away the first read).
> 
> 
> No, it's not the same thing.  One is: 'read the request line and headers 
> using this temporary brigade,' the other is: 'read part of the input 
> body into this brigade - oh, by the way, feel free to read the headers 
> and status while you are at it, and, oh, don't feel that I require any 
> data.'

No, I think it is the same.  The request body (or the next request in a 
set of pipelined requests) is set aside in the http_input_filter_ctx_t 
in the HTTP_IN filter after the headers are parsed.   The next call to 
ap_get_brigade to fetch the body will fetch any setaside buckets then 
call down into core_in for more bytes as needed.

I probably cannot say this enough... I've not done any work to get the 
code to handle a request body, but AFAIK, it should not be difficult to 
do.

> As I've said before, the patch uses ap_get_brigade only as a way to 
> drive the input filters before we are ready to read the body.  
Call me dense but the unpatched code in ap_read_request sure appears to 
do the same thing.

 > To me, it
> indicates a strong disconnect with how the filters are designed. Filters 
> are meant to be called asynchronously as needed (pulled), but the 
> request parsing must happen synchronously and at a specific point in time.
> 
> I don't fully understand why the status and header parsing must belong 
> in an input filter (hammer, meet balloon).  I think the patch could be 
> rewritten removing the context of an input filter and still see 
> significant performance benefits.  At the time the request is read, 
> there should be no other input filters, so I don't believe there would 
> be significant overhead to not being an input filter.  -- justin

One of the keys behind why this patch improves performance is the 
elimination of the repeat calls to ap_rgetline() to read the header 
fields. ap_rgetline calls ap_get_brigade(AP_MODE_GETLINE) which drived 
multiple trips into the core input filter. The patch enables 
ap_http_filter to make a -single- call to 
ap_get_brigade(AP_MODE_READBYTES) (and a single entry to core_in) and 
works on parsing the header fields out of the returned brigade. So how 
do you solve the problem of ap_get_brigade(AP_MODE_READBYTES) returning 
bytes beyond the header fields?  AFAIK there are two ways to solve this:

1. push the extra bytes back to the core input filter
2. setaside the extra bytes in the ap_http_filter context so that they 
can be fetched later
My patch implements 2) which is essentially the same thing 
core_input_filter does. core_in will issue a bucket read on the socket 
and that read may return request body bytes. If core_in is entered with 
mode AP_MODE_GETLINE, only a single line is returned and the other bytes 
are setaside in the core_input_filter context. My patch just moves that 
setaside function up one level to a protocol level filter.

Bill



Mime
View raw message