perl-modperl mailing list archives

From André Warnier
Subject Re: mod_perl and Transfer-Encoding: chunked
Date Thu, 04 Jul 2013 16:48:56 GMT
Bill Moseley wrote:
> André, thanks for the response:
> On Thu, Jul 4, 2013 at 4:06 AM, André Warnier wrote:
>> Bill Moseley wrote:
>>> First, if $r->read reads unchunked data then why is there a
>>> Transfer-Encoding header saying that the content is chunked?   Shouldn't
>>> that header be removed?
> Looking at the RFC again the answer appears to be yes.   Look at the last
> line in this decoding example in
>    A process for decoding the "chunked" transfer-coding (section 3.6)
>    can be represented in pseudo-code as:
>        length := 0
>        read chunk-size, chunk-extension (if any) and CRLF
>        while (chunk-size > 0) {
>           read chunk-data and CRLF
>           append chunk-data to entity-body
>           length := length + chunk-size
>           read chunk-size and CRLF
>        }
>        read entity-header
>        while (entity-header not empty) {
>           append entity-header to existing header fields
>           read entity-header
>        }
>        Content-Length := length
>        Remove "chunked" from Transfer-Encoding
> Apache/mod_perl is doing the first part but not updating the headers.
> There's more on Content-Length and Transfer-Encoding here:
>    How does one know if the content is chunked or not, otherwise?
>> The real question is : does one need to know ?
> Perhaps.  That's an interesting question.   Applications probably don't
> need to care.  They should receive the body -- so for mod_perl that means
> reading data using $r->read until there's no more to read and then the app
> should never need to look at the Transfer-Encoding header -- or
> Content-Length header for that matter by that reasoning.
> It's a bit less clear if you think about Plack.  It sits between web
> servers and applications.   What should, say, a Plack Middleware component
> see in the body if the headers say Transfer-Encoding: chunked?   The
> decoding probably should happen in the server,
> but the headers would need to indicate that by removing the
> Transfer-Encoding header and adding in the Content-Length.
>>> Perhaps I'm approaching this incorrectly, but this is all a bit untidy.
>>> I'm using Catalyst and Catalyst needs a Content-Length.
>> I would posit then that Catalyst is wrong (or not compatible with HTTP 1.1
>> in that respect).
> But, Catalyst is a web application (framework) and from your point above it
> should not care about the encoding and just read the input stream by
> calling ->read().   Really, if you think about Plack, Catalyst should never
> make exceptions based on $ENV{MOD_PERL}.
> So, the separation of concerns between the web server and the app is not
> very clean.
>>> So, I have a Plack
>>> Middleware component that creates a temporary file writing the buffer from
>>> $r->read( my $buffer, 64 * 1024 ) until that returns zero bytes.  I pass
>>> this file handle onto Catalyst.
>> So what you wrote then is a patch to Catalyst.
> No, the Middleware component should be usable for any application.   And
> likewise, for any web server.  That's the point of Plack.
> Obviously, there are differences between web servers and maybe we need code
> that understands when running under mod_perl that the Transfer-Encoding:
> chunked header should be ignored, but if that code must live in Catalyst
> then that's really breaking the separation that Plack provides.
> I think the sane thing here is if Apache/mod_perl didn't provide a header
> saying the body is chunked, when it isn't.   Otherwise, code (Plack, web
> apps) that receives a set of headers and a handle to read from doesn't really
> have any choice but to believe what it is told.

I can see your point, but to me it depends at which level this add-on code "lives".
I do not know Plack or Catalyst, and do not know at which level each of them is supposed 
to "live".
But to me, if the code lives at the "web-app" level, at that point it should just consider
the request body as one piece or stream, without intervening "chunk headers".
(And it should treat the Transfer-Encoding header as informational only, maybe to
know that it should not expect a Content-Length header, and that it can only know the body
length by reading it.)
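
As a rough illustration of "knowing the body length only by reading it", here is a minimal
sketch in plain Perl. The helper name slurp_body and the in-memory filehandle standing in
for the request are assumptions made for this example; under mod_perl the loop would
presumably call $r->read( my $buffer, 64 * 1024 ) instead, as described in the quoted text.

```perl
use strict;
use warnings;

# Read everything from a handle in fixed-size blocks and return the body
# together with its length. slurp_body is a hypothetical helper name.
sub slurp_body {
    my ($fh, $block_size) = @_;
    my ($body, $length) = ('', 0);
    while (1) {
        my $n = read($fh, my $buffer, $block_size);
        die "read failed: $!" unless defined $n;
        last if $n == 0;        # zero bytes read: end of body
        $body   .= $buffer;
        $length += $n;
    }
    return ($body, $length);
}

# Stand-in for the request body (an assumption, to keep the example
# self-contained): an in-memory filehandle over a string.
my $payload = 'name=value&other=thing';
open(my $fh, '<', \$payload) or die "cannot open in-memory handle: $!";

my ($body, $length) = slurp_body($fh, 8);
print "read $length bytes\n";
```

The loop never consults Content-Length or Transfer-Encoding; the length is simply whatever
has accumulated by the time the read returns zero bytes.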

It is different in the case of a mod_perl "connection filter".  That one really sees the
stream of bytes coming from the browser: request line, headers, body chunked or not, etc.
(And it should see several requests pipelined on the same connection, one after the other,
as one stream of bytes, without any particular break between them other than what it can
figure out itself.)

But even a "request filter" (which comes before a web-app) should see the request body as
already "de-chunked" (re-assembled).
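
To make the two views concrete, the following is a minimal sketch in plain Perl of the
decoding pseudo-code from the RFC quoted earlier. It is not what Apache or mod_perl do
internally. The function name decode_chunked, the sample bytes and the %headers hash are
all assumptions for illustration, and it assumes "chunked" is the only transfer-coding
applied. $raw is roughly what a connection-level filter would see for the body; $body is
what a request filter or web-app should see.

```perl
use strict;
use warnings;

# Decode a chunked body read from a handle. Returns the de-chunked body and
# a hash ref of trailer headers. decode_chunked is a hypothetical name.
sub decode_chunked {
    my ($fh) = @_;
    my $body = '';
    my %trailers;
    while (defined(my $line = readline $fh)) {
        $line =~ s/\r?\n\z//;
        # chunk-size is hexadecimal; any chunk-extension after it is ignored
        my ($hex) = $line =~ /\A([0-9A-Fa-f]+)/
            or die "malformed chunk-size line: '$line'";
        my $size = hex $hex;
        last if $size == 0;                 # last-chunk reached
        my $n = read($fh, my $data, $size);
        die "truncated chunk" unless defined $n && $n == $size;
        $body .= $data;
        read($fh, my $crlf, 2);             # consume CRLF after chunk-data
    }
    # Optional trailer headers, terminated by an empty line
    while (defined(my $line = readline $fh)) {
        $line =~ s/\r?\n\z//;
        last if $line eq '';
        my ($name, $value) = split /:\s*/, $line, 2;
        $trailers{$name} = $value;
    }
    return ($body, \%trailers);
}

# Raw bytes as sent on the wire (made-up sample data).
my $raw = "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n";
open(my $fh, '<', \$raw) or die "cannot open in-memory handle: $!";

my %headers = ('Transfer-Encoding' => 'chunked');
my ($body, $trailers) = decode_chunked($fh);

# Last steps of the pseudo-code: merge trailers, set Content-Length,
# and drop the (only) transfer-coding.
%headers = (%headers, %$trailers);
$headers{'Content-Length'} = length $body;
delete $headers{'Transfer-Encoding'};

print "$body\n";
```

After the last three statements the headers describe the body that is actually handed on,
which is the state the quoted text argues the server should present to code further up.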

See here for example :

which I got to starting from here :
