httpd-modules-dev mailing list archives

From "Graham Dumpleton" <graham.dumple...@gmail.com>
Subject Re: read POST body
Date Thu, 12 Apr 2007 22:07:22 GMT
On 13/04/07, Arturo 'Buanzo' Busleiman <buanzo@buanzo.com.ar> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
>
> Hi group!
>
> For mod_auth_openpgp I need to read the POST body. During my research (googling, archives
> of this list, apache.org, etc) I discovered three methods so far. I would like your opinions
> on the safest one, fastest one, if should DECHUNK, how much to allow for post size allocation
> (probably a configuration option, but i'd need a default value...).
>
> This is what I got: anything you can think of would be of GREAT help:
>
> Getting REQUEST BODY: (1)
> ============================
>
>         ap_setup_client_block(r, REQUEST_CHUNKED_DECHUNK);
>
>         char buffer[1024];
>
>         if ( ap_should_client_block(r) == 1 ) {
>                 while ( ap_get_client_block(r, buffer, 1024) > 0 ) {
>                         ap_rputs("Reading in buffer...<br>",r);
>                         ap_rputs(buffer,r);
>                 }
>         } else {
>                 ap_rputs("Nothing to read...<br>",r);
>         }

I can't find a reference to point to, so my memory could be wrong, but
if you use this approach, be mindful that the read buffer must be at
least large enough to hold any chunk size information, and any
trailers provided after the last null chunk, when chunked transfer
encoding is being used. This is because the HTTP filter code uses your
buffer as working space for decoding those parts of the request stream.

Someone correct me if I am wrong on this. If it is correct and you know
where I read it, other than by looking at the code, please point me to
it, as I would like to find it again.

In terms of a reasonable size to use, Apache defines HUGE_STRING_LEN
(8192), which some code uses as its default block size. Another option,
if appropriate, is to interrogate the socket to determine the buffer
size the OS uses for it. I don't know how valid this is; performing
reads of that same size may or may not be more efficient.
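
As a rough sketch of the "interrogate the socket" idea, the following
standalone C asks the OS for a descriptor's receive-buffer size through
the POSIX getsockopt() call and falls back to 8192 when the query
fails. The names pick_block_size and DEFAULT_BLOCK_SIZE are made up for
illustration. A module does not normally hold a raw descriptor, so
inside httpd the equivalent would presumably go through APR
(apr_socket_opt_get() with APR_SO_RCVBUF); that part is an assumption
and is not shown.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical fallback; same value as Apache's HUGE_STRING_LEN. */
#define DEFAULT_BLOCK_SIZE 8192

/* Return the OS receive-buffer size for fd, or DEFAULT_BLOCK_SIZE if
 * the query fails or reports a non-positive size. */
static size_t pick_block_size(int fd)
{
    int rcvbuf = 0;
    socklen_t optlen = sizeof(rcvbuf);

    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf, &optlen) != 0
        || rcvbuf <= 0) {
        return DEFAULT_BLOCK_SIZE;
    }
    return (size_t)rcvbuf;
}
```

Whether matching the socket buffer size actually helps is untested
here; the fallback keeps the read size sane either way.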

> Getting REQUEST BODY: (2)
> ============================
>
> http://httpd.apache.org/apreq/

Can't comment on this one.

> Getting REQUEST BODY: (3)
> ============================
> static int util_read(request_rec *r, const char **rbuf)
> {
>     int rc;
>
>     if ((rc = ap_setup_client_block(r, REQUEST_CHUNKED_ERROR)) != OK) {
>         return rc;
>     }
>
>     if (ap_should_client_block(r)) {
>         char argsbuffer[HUGE_STRING_LEN];
>         int rsize, len_read, rpos = 0;
>         long length = r->remaining;
>         *rbuf = ap_pcalloc(r->pool, length + 1);
>
>         ap_hard_timeout("util_read", r);
>
>         while ((len_read =
>                 ap_get_client_block(r, argsbuffer, sizeof(argsbuffer))) > 0) {
>             ap_reset_timeout(r);
>             if ((rpos + len_read) > length) {
>                 rsize = length - rpos;
>             }
>             else {
>                 rsize = len_read;
>             }
>             memcpy((char*)*rbuf + rpos, argsbuffer, rsize);
>             rpos += rsize;
>         }
>
>         ap_kill_timeout(r);
>     }
>     return rc;
> }

Even if you use REQUEST_CHUNKED_ERROR, which disallows chunked transfer
encoding and requires a content length to be specified, I am not sure
you can rely on r->remaining always being correct.

This is because a mutating input filter could actually change the
amount of data which is available but it will not update any content
length value as it may not know in advance what the final content
length may be. An example of such an input filter is one that does
decompression on request content.

In Apache 2.2, the FilterProtocol directive:

  http://httpd.apache.org/docs/2.2/mod/mod_filter.html#filterprotocol

provides a way of declaring that a particular filter changes the
content, possibly including the content length. I think it is still up
to the user to add these directives to the configuration to mark
filters as such, though, and from memory all it then does is remove the
content length header so that handlers do not believe it is correct.

Thus, although one can use r->remaining as a guide to how much buffer
space may be required, a robust handler has two options. It can be
prepared to cope with more data than that, potentially resizing its
input buffer. Alternatively, it can stop reading once that amount of
data has been read, rather than reading until no more data is
available, although that may truncate the data if a mutating input
filter is present and yields more data than the content length
specified.
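
To make the "resize the input buffer" option concrete, here is a
minimal standalone sketch, not module code. The block_reader callback
is a hypothetical stand-in for ap_get_client_block(), hint plays the
role of r->remaining, and limit is an assumed configurable cap on the
body size. read_body, mem_src and mem_reader are likewise invented
names, and plain malloc/realloc is used where a module would allocate
from r->pool.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ap_get_client_block(): fills buf with up to
 * bufsiz bytes; returns the count, 0 at end of body, negative on error. */
typedef long (*block_reader)(void *ctx, char *buf, size_t bufsiz);

/* Read a whole body into a malloc'd, NUL-terminated buffer. `hint` only
 * sizes the first allocation and is never trusted as a bound; `limit`
 * is the hard cap. Returns 0 on success, -1 on read error, allocation
 * failure, or a body larger than `limit`. */
static int read_body(block_reader read_fn, void *ctx, size_t hint,
                     size_t limit, char **out, size_t *out_len)
{
    char block[8192];                 /* same size as HUGE_STRING_LEN */
    size_t cap = hint ? hint : sizeof(block);
    size_t len = 0;
    char *data;
    long n;

    if (cap > limit) cap = limit;
    data = malloc(cap + 1);
    if (data == NULL) return -1;

    while ((n = read_fn(ctx, block, sizeof(block))) > 0) {
        size_t got = (size_t)n;
        if (got > limit - len) { free(data); return -1; }  /* over cap */
        if (len + got > cap) {
            /* More data than the hint promised: grow geometrically. */
            size_t new_cap = (cap > limit / 2) ? limit : cap * 2;
            char *tmp;
            if (new_cap < len + got) new_cap = len + got;
            tmp = realloc(data, new_cap + 1);
            if (tmp == NULL) { free(data); return -1; }
            data = tmp;
            cap = new_cap;
        }
        memcpy(data + len, block, got);
        len += got;
    }
    if (n < 0) { free(data); return -1; }

    data[len] = '\0';
    *out = data;
    *out_len = len;
    return 0;
}

/* Demo reader: hands out an in-memory string `step` bytes at a time. */
struct mem_src { const char *p; size_t left; size_t step; };

static long mem_reader(void *ctx, char *buf, size_t bufsiz)
{
    struct mem_src *s = ctx;
    size_t n = s->left < s->step ? s->left : s->step;
    if (n > bufsiz) n = bufsiz;
    memcpy(buf, s->p, n);
    s->p += n;
    s->left -= n;
    return (long)n;
}
```

Calling read_body(mem_reader, &src, 5, 1024, &body, &len) with an
11-byte source still returns all 11 bytes although the hint was 5, while
a limit of 8 makes the same call fail instead of silently truncating.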

Again, this is my understanding from studying code and reading
different bits and pieces, so someone correct me if I am wrong.
Confirmation from someone that this is correct would also be
appreciated.

Graham
