Return-Path: Delivered-To: apmail-httpd-dev-archive@httpd.apache.org Received: (qmail 21942 invoked by uid 500); 4 Oct 2001 20:52:55 -0000 Mailing-List: contact dev-help@httpd.apache.org; run by ezmlm Precedence: bulk Reply-To: dev@httpd.apache.org list-help: list-unsubscribe: list-post: Delivered-To: mailing list dev@httpd.apache.org Received: (qmail 21926 invoked from network); 4 Oct 2001 20:52:54 -0000 Message-ID: <3BBCCC61.7090307@apache.org> Date: Thu, 04 Oct 2001 13:53:53 -0700 From: Ian Holsman User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.4+) Gecko/20011003 X-Accept-Language: en-us MIME-Version: 1.0 To: Aaron Bannert , dev@httpd.apache.org Subject: Re: [PATCH] some fixes for ap_http_filter References: <20011004134534.E13476@clove.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Rating: daedalus.apache.org 1.6.2 0/1000/N Aaron Bannert wrote: > On Thu, Oct 04, 2001 at 01:14:31PM -0700, Justin Erenkrantz wrote: > >>On Thu, Oct 04, 2001 at 12:52:17PM -0700, Aaron Bannert wrote: >> >>>I came across a few things that needed attention in my review of the >>>http filter code: >>> >>Cool. >> > > [hey! Aren't you supposed to be in class? ;) ] I've been debugging the proxy problem with the input filters, and it also seems to be screwing up in this function as well. The proxy calls this function with a NULL request in the input filter, causing the pool-alloc to fail (invalid access) by the looks of it the 'request' variable in the input_filters stucture is NULL all the way back in process_connection. I was wondering how this happens? (BTW.. this stuff was working with the CVS of Sep-20, before the input filter changes went in) > > >>>- ctx->remaining was uninitialized. >>> >>We don't care what ctx->remaining is unless we have a special >>parsing type which sets ctx->remaining. But, yeah, this may >>make more sense with my below comment... >> > > It seemed to me that we were making assumptions about our input. There > were code paths that could be reached where ctx->remaining would be > used before being initialized. > > >>>- I'm a little uncomfortable with the Content-Length: parsing, please >>> see my comments in the diff. >>> >>I didn't touch that loop. =) But, yeah, once we see a digit, >>we should read until we don't see a digit. >> > > I don't really know where the loop came from, I just wanna make sure it's > safe. (more comments below) > > >>>- a logic reduction in a do-while loop (reduces the number of times >>> we call APR_BRIGADE_EMPTY by half). >>>- made the ctx->remaining condition check more robust. >>> >>We should never return more than ctx->remaining. That's the whole >>point. =) So, checking it for negative value may not be necessary. >>In fact, if it is negative, we have major problems. If you want >>to place an assert there - because it should never happen... >> > > "should" being the operative word here. It was not obvious to me how > we ensure ctx->remaining is always non-negative. We are trusting our > input variables on a filter where we aren't supposed to know what our > neighbors are. Maybe I've missed some crutial piece of logic that > would quell my fears, but robust >= reliable. :) > > >>Some quick comments inline... >> >> >>>Index: modules/http/http_protocol.c >>>=================================================================== >>>RCS file: /home/cvspublic/httpd-2.0/modules/http/http_protocol.c,v >>>retrieving revision 1.369 >>>diff -u -r1.369 http_protocol.c >>>--- modules/http/http_protocol.c 2001/10/02 21:13:41 1.369 >>>+++ modules/http/http_protocol.c 2001/10/04 18:34:25 >>>@@ -504,6 +504,7 @@ >>> if (!ctx) { >>> const char *tenc, *lenp; >>> f->ctx = ctx = apr_palloc(f->r->pool, sizeof(*ctx)); >>>+ ctx->remaining = 0; >>> ctx->state = BODY_NONE; >>> >>> tenc = apr_table_get(f->r->headers_in, "Transfer-Encoding"); >>>@@ -523,26 +524,32 @@ >>> else if (lenp) { >>> const char *pos = lenp; >>> >>>+ /* XXX: Is this safe? I can do "Content-Length: 3 3 3" ... */ >>> while (apr_isdigit(*pos) || apr_isspace(*pos)) >>> ++pos; >>> if (*pos == '\0') { >>> ctx->state = BODY_LENGTH; >>> ctx->remaining = atol(lenp); >>>+ /* XXX: Are there any platforms where apr_off_t != long? >>>+ aka: Would it be possible to overflow here? */ >>>+ /* XXX: Can we get here with an invalid number? (ie <0) */ >>> >>As I said, this was just copied from the content-length parsing routines >>we already had. So, possibly. =) >> > > No prob, just bringing it to our attention. > I thought it was kind of neat that I could do: > > curl -H "Content-Length: 3 3 3 3 3 3 3 3 3" -d "foo" [some url] and it > would work fine. This part is NBD ("flexible on input, strict on output"), > I'm really much more concerned about the implicit type cast from a long > to an off_t (apr_off_t). What would happen if we had a platform where > those didn't match, and someone could get an overflow (and potentially > a negative number) into ctx->remaining? Would bad things happen? That's > why I made the next change: > > > >>> } >>> } >>> } >>> >>>- if (!ctx->remaining) >>>+ if (ctx->remaining <= 0) >>> >>I think this negative check is a bit bogus for reasons above... >> > > They are both logically correct, but now it makes fewer assumptions > about the input. > > >>> { >>> switch (ctx->state) >>> { >>> case BODY_NONE: >>> break; >>> case BODY_LENGTH: >>>+ /* We've already consumed the Content-length size, so flush. */ >>> e = apr_bucket_eos_create(); >>> APR_BRIGADE_INSERT_TAIL(b, e); >>> return APR_SUCCESS; >>> case BODY_CHUNK: >>>+ /* Done with this chunk, go get the next chunk length. */ >>> { >>> char line[30]; >>> >>>@@ -560,8 +567,7 @@ >>> ctx->state = BODY_CHUNK; >>> ctx->remaining = get_chunk_size(line); >>> >>>- if (!ctx->remaining) >>>- { >>>+ if (ctx->remaining <= 0) { >>> >>No. =) Java weenie. >> > > Who's the Java weenie? This is the same change as above with the additional > benefit of meeting our style guide. > > (btw, Where do you get this java thing from?) > > >>> /* Handle trailers by calling get_mime_headers again! */ >>> e = apr_bucket_eos_create(); >>> APR_BRIGADE_INSERT_TAIL(b, e); >>>@@ -572,13 +578,13 @@ >>> } >>> } >>> >>>- rv = ap_get_brigade(f->next, b, mode, readbytes); >>>- >>>- if (rv != APR_SUCCESS) >>>+ if ((rv = ap_get_brigade(f->next, b, mode, readbytes)) != APR_SUCCESS) { >>> return rv; >>>+ } >>> >>I don't like this style. Blech. It makes the line too long... >> > > That one was a little gratuitous, sorry. ;) > > >>> >>>- if (ctx->state != BODY_NONE) >>>+ if (ctx->state != BODY_NONE) { >>> >>No need for braces here... >> > > It's a safety thing. If you're willing to commit my patch I'll take them > out. :) > > >>> ctx->remaining -= *readbytes; >>>+ } >>> >>> return APR_SUCCESS; >>> } >>>@@ -1360,19 +1366,17 @@ >>> &core_module); >>> apr_bucket_brigade *bb = req_cfg->bb; >>> >>>- do { >>>- if (APR_BRIGADE_EMPTY(bb)) { >>>- if (ap_get_brigade(r->input_filters, bb, AP_MODE_BLOCKING, >>>- &r->remaining) != APR_SUCCESS) { >>>- /* if we actually fail here, we want to just return and >>>- * stop trying to read data from the client. >>>- */ >>>- r->connection->keepalive = -1; >>>- apr_brigade_destroy(bb); >>>- return -1; >>>- } >>>+ while (APR_BRIGADE_EMPTY(bb)) { >>>+ if (ap_get_brigade(r->input_filters, bb, AP_MODE_BLOCKING, >>>+ &r->remaining) != APR_SUCCESS) { >>>+ /* if we actually fail here, we want to just return and >>>+ * stop trying to read data from the client. >>>+ */ >>>+ r->connection->keepalive = -1; >>>+ apr_brigade_destroy(bb); >>>+ return -1; >>> } >>>- } while (APR_BRIGADE_EMPTY(bb)); >>>+ } >>> >>We purposely chose to ignore ap_get_client_block this go around. >>I had a complete rewrite for it, but put it on hold until the >>filters are more stable. So, this entire section of code is >>getting thrown out as soon as someone gets to it. So, yeah, >>it sucks majorly. -- justin >> > > I'm a filter newbie, so I have no idea what parts are in flux or not. Just > been proofreading the code and you know how much I love to make the logic > faster :) > > -aaron > >