httpd-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Greg Stein <gst...@lyra.org>
Subject Re: [PATCH] Take 3 of httpd filter rewrite...
Date Wed, 26 Sep 2001 08:37:55 GMT
Hmm. This doesn't seem to address some of the concerns that I detailed in
Msg-ID: <20010924024418.D4050@lyra.org>

On Mon, Sep 24, 2001 at 02:03:11PM -0700, Justin Erenkrantz wrote:
> It seems that this patch is growing by leaps and bounds each time
> I post.  =-)

:-)

Actually, http_protocol.c shrank dramatically.

>...
> - Change support code to allow the http_filter to be a
>   request filter (how the request is setup initially).

I think the request filter should be inserted during ap_read_request, after
the headers are read. There is no reason to insert it before then. It only
means you have to write nasty code to get out of the way of reading the
request line and the headers.

>...
> --- server/core.c	2001/09/19 05:52:42	1.61
> +++ server/core.c	2001/09/24 20:46:51
>...
> +        /* seed the brigade with the client socket. */
>          e = apr_bucket_socket_create(f->c->client_socket);
> +        APR_BRIGADE_INSERT_TAIL(ctx->b, e);

Per another email, we might want to consider *not* creating a SOCKET bucket.
It just means that we have to transfer buckets from ctx->b to (param) b. We
may as well read from the socket directly into a new bucket on b.

>...
> +    if (mode == AP_MODE_PEEK) {
> +        apr_bucket *e;
> +        const char *str, *c;
> +        apr_size_t length;
> +
> +        /* The purpose of this loop is to ignore any CRLF (or LF) at the end

Note that we should insert a big ass ### right here. AP_MODE_PEEK should be
just that: peek at the data. Not a magic incantation to delete any extra
CRLF sequences in the input. If we *are* intended to do that, then the
symbol should be AP_MODE_TOSS_CRLF.

That said: keep this code in the patch -- I'm not suggesting it should not
be here (it was a problem before, and can continue to be a problem). It is
just that we should note how bogus this is :-)

>...
> +    /* If readbytes is -1, we want to just read everything until the end

Who does this? And why is it done, rather than just shutting down the
connection? Further, I think we should have a symbolic constant for this.

Again: not a problem with the patch, but with the original code. Please drop
a comment somewhere to fix this.

>...
> +    if (*readbytes == -1) {
> +        apr_bucket *e;
> +        apr_off_t total;
> +        const char *str;
> +        apr_size_t len;

This "len" overrides a declaration further out. In fact, there are several
declarations of "len" and "length" in this function. Just do one, and reuse
it for each of the reads. Same for "str".

> +        APR_BRIGADE_FOREACH(e, ctx->b) {
> +            /* We don't care about these values.  We just want to force the
> +             * lower level to just read it. */
> +            apr_bucket_read(e, &str, &len, APR_BLOCK_READ);
> +        }
> +        APR_BRIGADE_CONCAT(b, ctx->b);
> +
> +        /* Force a recompute of the length */
> +        apr_brigade_length(b, 1, &total);

Um. The above loop sucks. ctx->b is *only* a SOCKET bucket. We know that as
a given since that is a private brigade that we populated. Further, the
above loop will read *everything* on that socket into memory. Man... talk
about Denial of Service. "Hey! Apache! Take this gigabyte of data!"

Whoever is issuing that -1 read should rethink things. We cannot simply read
the entire contents of a socket into memory.

Okay... so we drop yet another comment about a hack. But while we're leaving
hacks around here, you may as well toss the loop. apr_brigade_length() is
going to do exactly the same thing -- the loop is redundant. Just do the
CONCAT after calling length()

>...
> +    /* readbytes == 0 is "read a single line". otherwise, read a block. */
> +    if (*readbytes) {
> +        apr_off_t total;
> +        apr_bucket *e;
> +        apr_bucket_brigade *newbb;
> +
> +        newbb = NULL;
> +
> +        apr_brigade_partition(ctx->b, *readbytes, &e);
> +        /* Must do split before CONCAT */
> +        if (e != APR_BRIGADE_SENTINEL(ctx->b)) {
> +            newbb = apr_brigade_split(ctx->b, e);
> +        }

This feels like our APIs are wrong. We really ought to be able to partition
then split. Checking for the sentinel seems like too much "inside
knowledge." The split() ought to be able to assume a sentinel means "no
splitting".

>...
> +    /* we are reading a single LF line, e.g. the HTTP headers */
> +    while (!APR_BRIGADE_EMPTY(ctx->b)) {
> +        e = APR_BRIGADE_FIRST(ctx->b);
> +        if ((rv = apr_bucket_read(e, (const char **)&buff, &len,

Change the declaration of buff to be const char * and you won't need this
dumb cast. No need for it to be char* in the first place.

> +                                  mode)) != APR_SUCCESS) {
> +            return rv;
> +        }
> +
> +        pos = memchr(buff, APR_ASCII_LF, len);

Please move the declaration for pos into this block. I had to search a long
ways back to find that one. And it should probably be const char * also.

>...
> --- modules/http/http_core.c	2001/08/25 23:43:18	1.282
> +++ modules/http/http_core.c	2001/09/24 20:46:51
>...
> +static int ap_http_create_req(request_rec *r)
> +{
> +    if (!r->main)
> +    {
> +        ap_add_input_filter("HTTP_IN", NULL, r, r->connection);
> +    }
> +    return OK;
> +}

Per my previous email, I think this should be inserted at a different point
in the code flow. It will also remove the need to hook create_request.

>...
> --- modules/http/http_protocol.c	2001/09/17 13:12:37	1.362
> +++ modules/http/http_protocol.c	2001/09/24 20:46:51
> @@ -481,265 +481,117 @@
>      return AP_HTTP_METHODS[methnum];
>  }
>  
> -struct dechunk_ctx {
> -    apr_size_t chunk_size;
> -    apr_size_t bytes_delivered;
> +static long get_chunk_size(char *);
> +
> +typedef struct http_filter_ctx {
> +    int status;

See previous comments about "status". Basically: this is a nasty hack and
the comments (or lack thereof!) do nothing to note that, nor to elucidate
what is going on.

>...
> +        BODY_NONE   /* must have value of zero */,

Doesn't need to be zero...  (see comments before)

>...
> +apr_status_t ap_http_filter(ap_filter_t *f, apr_bucket_brigade *b, ap_input_mode_t mode,
apr_off_t *readbytes)

Wow. This function seems like it is going to be dramatically simpler. Hard
to see through all of this mess, though :-)


I'm +1 on the patch, but with the caveat that *many* more comments need to
be added. There are still hacks all over the place (from before Justin's
patch, and at least one new one). I tried to comment the hacks before;
thankfully, Justin's patch removes most of those. But there are still more,
and we can't just let those slide...

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Mime
View raw message