Reply-To: dev@httpd.apache.org
Date: Mon, 3 Jun 2002 23:31:58 -0700
From: Justin Erenkrantz <jerenkrantz@apache.org>
To: Greg Stein <gstein@lyra.org>
Cc: dev@httpd.apache.org
Subject: Re: [PATCH] ap_discard_request_body() can't be called more than once
Message-ID: <20020603233158.A19485@apache.org>
Mail-Followup-To: Justin Erenkrantz <jerenkrantz@apache.org>,
	Greg Stein <gstein@lyra.org>, dev@httpd.apache.org
References: <20020602164041.N19485@apache.org> <20020603172913.D2689@lyra.org>
In-Reply-To: <20020603172913.D2689@lyra.org>;
 from gstein@lyra.org on Mon, Jun 03, 2002 at 05:29:14PM -0700

On Mon, Jun 03, 2002 at 05:29:14PM -0700, Greg Stein wrote:
> This is all getting *way* too complicated.

I agree, but I think this is something that we've let slide for a
while now.  Only now are we trying to get this code correct.

We currently call ap_discard_request_body without understanding
what it means to discard the request body.  Lots of handlers are
calling discard in the wrong places without understanding the
ramifications of doing so.

> I recall seeing an email where somebody suggested putting a flag in the
> request_rec to determine whether HTTP_IN had seen an EOS or not. Bleh. At a
> minimum, that would go into the context for HTTP_IN.

Here's my core question: "Can HTTP_IN return EOS more than once?"

So, if two places in the request path try to read the request
entity, what do we return to the second guy since the first caller
read until EOS?  (This is closely related to my question about
the double #exec cgi mod_include request.)

> I think that the right answer is that when the request_rec is about to go
> away, that any unread body content should be read by the framework. It would
> be really nice to have filter logic to say "call me when I'm done" so that
> HTTP_IN could go ahead and read the rest of the request right then.

Could we get into any "deadlock" situations at the network layer if
both input and output socket buffers were full?  Imagine if we didn't
discard a large request body until the very end and we had a large
response - could the TCP window close on us, since the client can't
send any more until we read (and ACK) the input, and may not read
our response until it has finished sending?

> If that were done, then you wouldn't have to worry about a bunch of rules
> all over the place, about when to call it, to avoid double-calls, etc. In
> fact, all of the calls could just go away...
> 
> [ note that double-calling ap_discard_request_body() is quite fine. ]

Because ap_discard_request_body() becomes a no-op, right?
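
To make the question concrete, here is a rough sketch of the behavior
I have in mind.  This is not httpd code: mock_http_in_ctx, mock_init,
mock_read and mock_discard are all hypothetical stand-ins, and the
eos_seen flag living in the filter context is an assumption.  The idea
is only that once the reader has returned EOS it keeps returning EOS,
so a second discard has nothing left to do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the HTTP_IN filter context; not httpd code. */
typedef struct {
    const char *body;   /* next unread byte of the request body */
    size_t remaining;   /* bytes still unread */
    int eos_seen;       /* set once EOS has been returned to a caller */
} mock_http_in_ctx;

void mock_init(mock_http_in_ctx *ctx, const char *body)
{
    assert(ctx != NULL && body != NULL);
    ctx->body = body;
    ctx->remaining = strlen(body);
    ctx->eos_seen = 0;
}

/* Read up to len bytes into buf.  A return of 0 means EOS, and every
 * later call keeps returning 0 instead of touching the input again. */
size_t mock_read(mock_http_in_ctx *ctx, char *buf, size_t len)
{
    size_t n;

    assert(ctx != NULL && buf != NULL);
    if (ctx->eos_seen) {
        return 0;               /* EOS is sticky */
    }
    n = ctx->remaining < len ? ctx->remaining : len;
    if (n == 0) {
        ctx->eos_seen = 1;      /* remember that EOS was delivered */
        return 0;
    }
    memcpy(buf, ctx->body, n);
    ctx->body += n;
    ctx->remaining -= n;
    return n;
}

/* Throw away whatever is left of the body; returns the bytes dropped.
 * If EOS was already seen, the loop exits at once: a no-op. */
size_t mock_discard(mock_http_in_ctx *ctx)
{
    char buf[8];
    size_t n, total = 0;

    while ((n = mock_read(ctx, buf, sizeof(buf))) > 0) {
        total += n;
    }
    return total;
}
```

With this shape, the first mock_discard() call drains the body and the
second one returns 0 immediately, without any caller having to track
whether a discard has already happened.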

> My vote would be to put something into ap_finalize_request_protocol(). (the
> problem, of course, is recovering the HTTP_IN context; short of that,
> putting the flag into the request_rec or maybe the 'core_request_config'
> structure (hmm; the latter would be better).

Isn't HTTP_IN's context still present in the request chain?  Why
would it have been destroyed?  -- justin