perl-modperl mailing list archives

From André Warnier
Subject Re: mod_perl and Transfer-Encoding: chunked
Date Thu, 04 Jul 2013 11:06:16 GMT
Not disregarding the other answers to your questions, but I believe that maybe one aspect 
has been neglected here.

Bill Moseley wrote:
> For requests that are chunked (Transfer-Encoding: chunked and no
> Content-Length header) calling $r->read returns *unchunked* data from the
> socket.
> That's indeed handy.  Is that mod_perl doing that un-chunking or is it
> Apache?
> But, it leads to some questions.
> First, if $r->read reads unchunked data then why is there a
> Transfer-Encoding header saying that the content is chunked?   Shouldn't
> that header be removed?   How does one know if the content is chunked or
> not, otherwise?

The real question is: does one need to know?

The transfer coding is a property of the transport, not of the content. Even an
intermediate HTTP proxy may be allowed to change it, for reasons to do with transport of
the request along a section of the network path.
It should be entirely transparent to the application receiving the data.

> Second, if there's no Content-Length header then how does one know how much
> data to read using $r->read?
> One answer is until $r->read returns zero bytes, of course.  

Indeed. That means that the end of *this* request body has been encountered.
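
To make the "read until zero bytes" rule concrete, here is a minimal sketch in Python rather than mod_perl. The function name read_body and the in-memory stream are hypothetical stand-ins for $r->read and the already de-chunked body; the sketch only illustrates the loop, not how Apache delivers the data.

```python
import io

def read_body(stream, block_size=64 * 1024):
    """Read a request body in fixed-size blocks until a read returns nothing."""
    parts = []
    while True:
        block = stream.read(block_size)
        if not block:
            # Zero bytes returned: the end of *this* request body.
            break
        parts.append(block)
    return b"".join(parts)

# Hypothetical stand-in for the de-chunked body handed to the handler.
body_stream = io.BytesIO(b'{"name": "example"}' * 10000)
body = read_body(body_stream)
```

No Content-Length is consulted anywhere in the loop; the empty read is the only end-of-body signal it relies on.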

> But, is
> that guaranteed to always be the case, even for, say, pipelined requests?

It should be, because $r concerns the present request being processed.
If there is another request pipelined onto that same connection, it is a separate request
and a different $r.

> My guess is yes because whatever is de-chunking the request knows to stop
> after reading the last chunk, trailer and empty line.   Can
> anyone elaborate on how Apache/mod_perl is doing this?

I can't really, but it should be done by something at some fairly low level.  It should be
the *first* thing which happens to the request body, before any request-level body access
is allowed.
(Similarly, at the response level, "chunking" a response body should be the last thing
happening before the response is put on the wire.)
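
As an illustration of why a de-chunker knows where to stop, here is a rough Python sketch of the chunked wire format (hex size line, data, CRLF, a zero-size last chunk, optional trailers, empty line). It is not Apache's or mod_perl's code; the function name read_chunked_body, the X-Checksum trailer and the example.org request are all made up for the example.

```python
import io

def read_chunked_body(stream):
    """Decode one chunked body from a binary stream and stop right after it."""
    body = bytearray()
    while True:
        size_line = stream.readline()
        if not size_line:
            raise ValueError("stream ended before the last chunk")
        # The chunk size is hexadecimal; anything after ';' is a chunk extension.
        size_field = size_line.split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        if size == 0:
            break  # last chunk
        data = stream.read(size)
        if len(data) != size:
            raise ValueError("truncated chunk")
        body += data
        if stream.readline() != b"\r\n":
            raise ValueError("missing CRLF after chunk data")
    # After the last chunk: optional trailer lines, then an empty line.
    trailers = []
    while True:
        line = stream.readline()
        if line == b"":
            raise ValueError("stream ended before the final empty line")
        if line == b"\r\n":
            break
        trailers.append(line.rstrip(b"\r\n"))
    return bytes(body), trailers

# One chunked body followed by a pipelined request on the same connection.
wire = io.BytesIO(
    b"4\r\nWiki\r\n"
    b"5\r\npedia\r\n"
    b"0\r\n"
    b"X-Checksum: abc\r\n"
    b"\r\n"
    b"GET /next HTTP/1.1\r\nHost: example.org\r\n\r\n"
)
body, trailers = read_chunked_body(wire)
rest = wire.read()  # the untouched pipelined request
```

The decoder consumes exactly one body and leaves the next request unread on the stream, which is the behaviour one would expect from whatever layer does this for $r.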

> Perhaps I'm approaching this incorrectly, but this is all a bit untidy.
> I'm using Catalyst and Catalyst needs a Content-Length.

I would posit then that Catalyst is wrong (or not compatible with HTTP/1.1 in that respect).

> So, I have a Plack
> Middleware component that creates a temporary file writing the buffer from
> $r->read( my $buffer, 64 * 1024 ) until that returns zero bytes.  I pass
> this file handle onto Catalyst.

So what you wrote then is a patch to Catalyst.

> Then, for some content-types, Catalyst (via HTTP::Body) writes the body to
> *another* temp file.    I don't know how Apache/mod_perl does its
> de-chunking, but I can call $r->read with a huge buffer length and Apache
> returns that.  So, maybe Apache is buffering to disk, too.
> In other words, for each tiny chunked JSON POST or PUT I'm creating two (or
> three?) temp files which doesn't seem ideal.

I realise that my comments above don't really help you in your specific predicament, but I
just felt that it was good to put things back in their place, particularly that at the $r
(request) level, you should not have to know if the request came in chunked or not.
And if a client sends a request with a chunked body, it does not necessarily arrive
chunked at the server on which the application runs.  And vice versa.
