Message-ID: <88e286470801301922n162b955at431b0354c1a30597@mail.gmail.com>
Date: Thu, 31 Jan 2008 14:22:21 +1100
From: "Graham Dumpleton"
To: modules-dev@httpd.apache.org
Subject: Re: Reading of input after headers sent and 100-continue.
In-Reply-To: <003901c863b4$1111ee30$0501a8c0@T60>
References: <88e286470801291628g6460053ejeed734de499784e9@mail.gmail.com>
 <003901c863b4$1111ee30$0501a8c0@T60>

On 31/01/2008, Brian Smith wrote:
>
> > -----Original Message-----
> > From: Graham Dumpleton [mailto:graham.dumpleton@gmail.com]
> > Sent: Tuesday, January 29, 2008 4:29 PM
> > To: modules-dev@httpd.apache.org
> > Subject: Reading of input after headers sent and 100-continue.
> >
> > The HTTP output filter will send a 100 result back to a
> > client when the first attempt to read input occurs and an
> > Expect header with 100-continue was received. I.e., from
> > http_filters.c we have:
> >
> >     if ((ctx->state == BODY_CHUNK ||
> >         (ctx->state == BODY_LENGTH && ctx->remaining > 0)) &&
> >         f->r->expecting_100 && f->r->proto_num >= HTTP_VERSION(1,1)) {
>
> This is from ap_http_filter(). If you look at http_core.c, you can see
> that it is registered as an input filter, not an output filter.

I knew what I meant, it just didn't come out right. I blame the keyboard.
:-)

> So, if you never read from the input brigade, the "100 continue" will
> never be sent. I'm not sure if the module needs to just ignore the
> input brigade, or actively throw it away, though.
>
> > The problem then is if only after having sent some response
> > content and triggering the response headers to be sent one
> > actually goes to read the input, then the HTTP output filter
> > above is still sending the 100 status response string. In
> > other words, the 100 response status string is appearing in
> > the middle of the actual response content.
>
> "Doctor, it hurts when I do this!" :)
>
> If a module is sending a response before a 100 continue has been sent,
> then it shouldn't read from the input brigade, because it is going
> against the HTTP spec.

Can you point to the specific bit of the HTTP specification which says
that? To me, section 8.2.3 appears to contain slightly conflicting
statements. In particular it says:

"""Because of the presence of older implementations, the protocol allows
ambiguous situations in which a client may send "Expect: 100-continue"
without receiving either a 417 (Expectation Failed) status or a 100
(Continue) status. Therefore, when a client sends this header field to
an origin server (possibly via a proxy) from which it has never seen a
100 (Continue) status, the client SHOULD NOT wait for an indefinite
period before sending the request body."""

Effectively, if a 200 response came back, it seems to suggest that the
client should still send the request body, just that it 'SHOULD NOT wait
for an indefinite period'. It doesn't explicitly say that the client
shouldn't still send the request body if another response code comes
back.

This is what I have seen with curl as a client. If one sends back a 200
response without reading any input, curl still sends the request
content, but there is a noticeable pause while some timeout expires, and
only at that point does it send the request content.
In other words, curl doesn't send it as soon as it sees the 200
response, but it does still send it.

So technically, if the client still has to send the request content,
something could still read it. It would not be ideal that there is a
delay depending on what the client does, but it would still be possible
from what I read of this section.

But then, later it says:

"""Upon receiving a request which includes an Expect request-header
field with the "100-continue" expectation, an origin server MUST either
respond with 100 (Continue) status and continue to read from the input
stream, or respond with a final status code. The origin server MUST NOT
wait for the request body before sending the 100 (Continue) response. If
it responds with a final status code, it MAY close the transport
connection or it MAY continue to read and discard the rest of the
request. It MUST NOT perform the requested method if it returns a final
status code."""

The critical bit here I guess is:

"""If it responds with a final status code, it MAY close the transport
connection or it MAY continue to read and discard the rest of the
request."""

This suggests that the server can discard the request body if the
handler didn't try to read it before returning a response.

What it means by:

"""It MUST NOT perform the requested method if it returns a final status
code."""

I am not quite sure, because if the response headers were returned by
the handler you are already in the process of performing the requested
method, so how can you now not do it?

What is also a bit worrying to me is that what a handler is allowed to
do for a request can change based on the presence of 100-continue,
something which is out of the control of the handler and the web server
receiving the request.

Specifically, if 100-continue is not present and the client therefore
sent the request body anyway, then technically there is nothing to stop
the handler reading the input after the response headers have been sent.
For example, the handler may generate response headers for the same
content length and only then start reading input and returning it as the
response body. From what you are saying, it seems that if 100-continue
is present this wouldn't be allowed, and that to ensure correct
behaviour the handler would have to read at least some of the request
body before sending back the response headers.

Thus it doesn't seem that clear to me what can and can't be done, unless
there is some other section in the RFC which describes it.

> > My question then is, what should a handler do if it is trying
> > to generate response content (non buffered), before having
> > attempted to read any input, ie., what is the correct way to
> > stop Apache still sending the 100 status response for the
> > 100-continue header? I know that setting r->expecting_100 to
> > 0 at time that first response content is being sent will
> > prevent it, but is there something else that should be done
> > instead?
>
> Since ap_http_filter is an input filter only, it should be enough to
> just avoid reading from the input brigade. (AFAICT, anyway.)

In other words, block the handler from reading, potentially raising an
error in the process.

Except that, to be fair and consistent, you would have to apply the same
rule even if 100-continue isn't present. My concern is whether doing
that would break some existing code, even if it is just a simple test
program that echoes back the request body as the response body.

> > BTW, this is partly theoretical in that I have no actual code
> > that is doing this, but technically in systems like
> > mod_python or mod_wsgi where one doesn't know what the Python
> > application code running on top is doing, a user could
> > trigger this situation.
>
> The module can provide an interface to the input and output brigades
> that prevents the application from doing this. mod_wsgi is doing this
> already.
> As I mentioned on the Web-SIG list, it is difficult to have a
> uniform, automatic mechanism for doing this for all request methods, or
> even a uniform way of doing it for a particular method. So, it basically
> has to be left up to the handler/application.

All too confusing. :-(

Graham
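
To make the r->expecting_100 workaround from the thread concrete, here is
a rough, self-contained model of it. It is only a sketch under stated
assumptions: every sketch_* name and the struct are made-up stand-ins,
not httpd's real request_rec or filter context, and the condition is
copied from the http_filters.c excerpt quoted in the message. It shows
the mechanics only and says nothing about whether clearing the flag is
correct per section 8.2.3, which is the open question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for httpd's HTTP_VERSION() macro. */
#define SKETCH_HTTP_VERSION(major, minor) (1000 * (major) + (minor))

/* Hypothetical stand-in for the body states tested in the quoted code. */
enum sketch_body_state {
    SKETCH_BODY_NONE,
    SKETCH_BODY_LENGTH,
    SKETCH_BODY_CHUNK
};

/* Hypothetical request model: only the fields the quoted condition uses. */
struct sketch_request {
    int expecting_100;            /* models r->expecting_100 */
    int proto_num;                /* models r->proto_num */
    enum sketch_body_state state; /* models ctx->state */
    long remaining;               /* models ctx->remaining */
};

/*
 * Same shape as the condition quoted from ap_http_filter(): returns true
 * if a first read of the request body would emit the interim 100 status.
 */
bool sketch_read_would_send_100(const struct sketch_request *r)
{
    assert(r != NULL);
    bool has_body = r->state == SKETCH_BODY_CHUNK ||
                    (r->state == SKETCH_BODY_LENGTH && r->remaining > 0);
    return has_body && r->expecting_100 &&
           r->proto_num >= SKETCH_HTTP_VERSION(1, 1);
}

/*
 * The workaround mentioned in the thread: clear the expectation at the
 * point where the first response content is about to be sent, so that a
 * later read cannot inject a 100 status into the response body.
 */
void sketch_begin_response(struct sketch_request *r)
{
    assert(r != NULL);
    r->expecting_100 = 0;
}
```

A handler wrapper would call sketch_begin_response() just before writing
its first response data; after that, sketch_read_would_send_100() returns
false for that request, so a late read of the body would no longer
trigger the interim response.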