Return-Path:
Delivered-To: apmail-new-httpd-archive@apache.org
Received: (qmail 39215 invoked by uid 500); 13 Oct 2000 01:00:31 -0000
Mailing-List: contact new-httpd-help@apache.org; run by ezmlm
Precedence: bulk
Reply-To: new-httpd@apache.org
list-help:
list-unsubscribe:
list-post:
Delivered-To: mailing list new-httpd@apache.org
Received: (qmail 39202 invoked from network); 13 Oct 2000 01:00:29 -0000
X-Authentication-Warning: koj: rbb owned process doing -bs
Date: Thu, 12 Oct 2000 18:01:31 -0700 (PDT)
From: rbb@covalent.net
X-Sender: rbb@koj
To: "'new-httpd@apache.org'"
Subject: RE: cvs commit: apache-2.0/src/main http_protocol.c
In-Reply-To:
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Rating: locus.apache.org 1.6.2 0/1000/N

> There are three goals.  One is to only read the correct amount of data
> from the transfer protocol, a second is to avoid reading a request
> which is larger than a predefined limit (denial of service avoidance),
> and the third is to provide a content length to mod_cgi as part of
> the CGI protocol.  The latter requires internal server buffering if
> the request is chunked (something we've been putting off til 2.0).
>
> I think the answer is to implement something that meets those goals,
> even if it means that input filtering cannot be used with some resources
> (like CGI) or possibly even with some protocols (like HTTP/1.1).
> Perhaps the HTTP input filter needs to start as monolithic by default
> and "break itself up" for those requests that can handle streams.

Okay, then I am missing something big here.  The design I have discussed
meets all three of these goals, and it does so without being monolithic.
At least it does, IMHO.

The idea is relatively simple.  We have a bottom-most layer that
understands how to read from the network.  This is the core_input_filter.
Then we have a filter that understands two modes of operation: header
data and body data.  Header data is read one line at a time.  The amount
of body data read from the network is determined by how much is
requested.  Basically, if the request says the content-length is 500, we
allow the filter to read up to 500 bytes.  We can return more than that,
but we only read 500 from the network.  If we read less, then the
difference between the two is returned to the calling function.  On the
next call, the calling function passes down that same value to indicate
we are still looking for the end of those 500 bytes.

If we are in chunking mode, then we read one line (header data) to get
the amount of data in the chunk, and then call for that amount of data.
Do this in a loop to get all the chunks.  (Both modes are sketched in
code at the end of this mail.)  The framing is handled at the topmost
layer, and it is all governed by the content-length.  The only problem
with this is that we have to buffer data at the ap_get_client_block
level.

The other solution is to also add an EOS bucket at the bottom-most level
once we have read all the data.  This allows us to buffer in the
top-most filter instead of in ap_get_client_block, but it doesn't really
buy us anything else.  Greg seems to like the second; I am ambivalent
about which we use, but the marker bucket has to be an EOS, not an EOF.
EOF has a different meaning, although an EOF should also trigger an EOS
being sent.

Ryan

_______________________________________________________________________________
Ryan Bloom                                      rbb@apache.org
406 29th St.
San Francisco, CA  94131
-------------------------------------------------------------------------------
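
Below is a minimal, self-contained C sketch of the content-length
bookkeeping described above.  The names (core_input_read, http_body_read)
and the canned wire buffer are illustrative stand-ins, not the actual
Apache 2.0 filter API; the point is only how the caller's countdown is
passed down on each call and the unread difference is handed back.

    #include <stdio.h>
    #include <string.h>

    /* Stand-in for the core_input_filter: hands out bytes from a canned
     * "network" buffer so the sketch is self-contained. */
    static const char wire[] = "0123456789abcdefghij<next-request>";
    static long wire_pos = 0;

    static long core_input_read(char *buf, long want)
    {
        long avail = (long)(sizeof(wire) - 1) - wire_pos;
        long n = (want < avail) ? want : avail;
        memcpy(buf, wire + wire_pos, (size_t)n);
        wire_pos += n;
        return n;
    }

    /* Read up to bufsize bytes of body data, but never past the declared
     * content-length.  *remaining is the caller's countdown: it is passed
     * down on every call, and the unread difference is handed back so the
     * caller knows it is still looking for the end of those bytes. */
    static long http_body_read(long *remaining, char *buf, long bufsize)
    {
        long want = (*remaining < bufsize) ? *remaining : bufsize;
        long n;

        if (want <= 0) {
            return 0;                /* the whole body has been consumed */
        }
        n = core_input_read(buf, want);
        *remaining -= n;             /* difference returned to the caller */
        return n;
    }

    int main(void)
    {
        char buf[8];
        long remaining = 20;         /* Content-Length: 20 */
        long n;

        while ((n = http_body_read(&remaining, buf, (long)sizeof(buf))) > 0) {
            printf("got %ld byte(s): %.*s\n", n, (int)n, buf);
        }
        /* "<next-request>" is still sitting unread in the wire buffer:
         * we stopped at exactly 20 bytes. */
        return 0;
    }

Because each read is capped at min(remaining, bufsize), the filter can
never pull more than the declared 20 bytes off the wire, which is exactly
the framing and denial-of-service property the quoted goals call for.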
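
A companion sketch of chunking mode, under the same stand-in assumptions
(read_line and the canned stream are hypothetical, not real httpd code):
one header-data line yields the chunk size, then exactly that much body
data is demanded, looping until the zero-size chunk, which is where an
EOS (not EOF) marker would be sent up.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Canned chunked stream: two chunks (sizes 5 and 3 in hex) and then
     * the zero-size chunk that ends the body. */
    static const char wire[] = "5\r\nhello\r\n3\r\nxyz\r\n0\r\n\r\n";
    static long wire_pos = 0;

    static long core_input_read(char *buf, long want)
    {
        long avail = (long)(sizeof(wire) - 1) - wire_pos;
        long n = (want < avail) ? want : avail;
        memcpy(buf, wire + wire_pos, (size_t)n);
        wire_pos += n;
        return n;
    }

    /* Header-data mode: read one line, up to CRLF, as the filter would
     * when it is looking for a chunk-size line. */
    static long read_line(char *buf, long bufsize)
    {
        long n = 0;
        char c;

        while (n < bufsize - 1 && core_input_read(&c, 1) == 1) {
            if (c == '\n') break;
            if (c != '\r') buf[n++] = c;
        }
        buf[n] = '\0';
        return n;
    }

    int main(void)
    {
        char line[64], data[64];

        for (;;) {
            long size, got = 0;

            read_line(line, (long)sizeof(line));     /* chunk-size line */
            size = strtol(line, NULL, 16);
            if (size == 0) {
                /* Zero-size chunk: the body is over.  This is where an
                 * EOS marker would be sent up -- EOS, not EOF, because
                 * the connection may carry another request. */
                printf("EOS\n");
                break;
            }
            /* Body-data mode: demand exactly this chunk's worth of data.
             * (The size is trusted here for brevity; real code would
             * bound it against the buffer and a request limit.) */
            while (got < size) {
                long r = core_input_read(data + got, size - got);
                if (r <= 0) break;   /* wire exhausted: a true EOF */
                got += r;
            }
            printf("chunk of %ld: %.*s\n", size, (int)size, data);
            read_line(line, (long)sizeof(line));     /* trailing CRLF */
        }
        return 0;
    }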