Date: Wed, 3 Jul 2013 14:34:55 -0400 (EDT)
From: Jim Schueler
To: Bill Moseley
cc: mod_perl list
Subject: Re: mod_perl and Transfer-Encoding: chunked

I played around with chunking recently in the context of media streaming:
the client is only requesting a "chunk" of data. "Chunking" is how media
players perform a "seek". It was originally implemented for FTP transfers,
e.g., to transfer a large file in (say 10K) chunks. In the case that you
describe below, if no Content-Length is specified, that indicates "send the
remainder".

From what I know, a "chunk" request header is used this way to specify the
server response. It does not reflect anything about the data included in
the body of the request. So first, I would ask if you're confused about
this request information.

Hypothetically, some browsers might try to upload large files in small
chunks, and the "chunk" header might reflect a push transfer. I don't know
if "chunk" is ever used for this purpose. But it would require the
following characteristics:

   1. The browser would need to originally inquire if the server is
      capable of this type of request.
   2. Each chunk of data will arrive in a separate and independent HTTP
      request, not necessarily in the order they were sent.
   3. Two or more requests may be handled simultaneously by separate
      processes that can't write into a single destination.
   4. Somehow the server needs to request a resend if a chunk is missing.
      Solving this problem requires an imaginative use of HTTP.

Sounds messy. But it might be appropriate for 100M+ uploads. This *may*
reflect your situation. Can you please confirm?

For a single process, the incoming content-length is unnecessary. Buffered
I/O automatically knows when transmission is complete. The read() argument
is the buffer size, not the content length.
Whether you spool the buffer to disk or simply enlarge the buffer should be
determined by your hardware capabilities. This is standard I/O behavior
that has nothing to do with HTTP chunking. Without a "Content-Length"
header, after looping your read() operation, determine the length of the
aggregate data and pass that to Catalyst.

But if you're confident that the complete request spans several smaller
(chunked) HTTP requests, you'll need to address all the problems I've
described above, plus the problem of re-assembling the whole thing for
Catalyst. I don't know anything about Plack; maybe it can perform all this
required magic.

Otherwise, if the whole purpose of the Plack temporary file is to pass a
file handle, you can pass a buffer as a file handle. That used to require
IO::String, but the functionality is now built into the core.

By your last paragraph, I'm really lost. Since you're already passing the
request as a file handle, I'm guessing that Catalyst creates the temporary
file for the *response* body. Can you please clarify? Also, what do you
mean by "de-chunking"? Is that the same thing as re-assembling?

Wish I could give a better answer. Let me know if this helps.

  -Jim

On Tue, 2 Jul 2013, Bill Moseley wrote:

> For requests that are chunked (Transfer-Encoding: chunked and no
> Content-Length header) calling $r->read returns unchunked data from the
> socket.
> That's indeed handy. Is that mod_perl doing that un-chunking or is it
> Apache?
>
> But, it leads to some questions.
>
> First, if $r->read reads unchunked data then why is there a
> Transfer-Encoding header saying that the content is chunked? Shouldn't
> that header be removed? How does one know if the content is chunked or
> not, otherwise?
>
> Second, if there's no Content-Length header then how does one know how much
> data to read using $r->read?
>
> One answer is until $r->read returns zero bytes, of course. But, is
> that guaranteed to always be the case, even for, say, pipelined requests?
> My guess is yes because whatever is de-chunking the request knows to stop
> after reading the last chunk, trailer and empty line. Can anyone elaborate
> on how Apache/mod_perl is doing this?
>
>
> Perhaps I'm approaching this incorrectly, but this is all a bit untidy.
>
> I'm using Catalyst and Catalyst needs a Content-Length. So, I have a Plack
> Middleware component that creates a temporary file writing the buffer from
> $r->read( my $buffer, 64 * 1024 ) until that returns zero bytes. I pass
> this file handle onto Catalyst.
>
> Then, for some content-types, Catalyst (via HTTP::Body) writes the body to
> another temp file. I don't know how Apache/mod_perl does its de-chunking,
> but I can call $r->read with a huge buffer length and Apache returns that.
> So, maybe Apache is buffering to disk, too.
>
> In other words, for each tiny chunked JSON POST or PUT I'm creating two (or
> three?) temp files which doesn't seem ideal.
>
>
> --
> Bill Moseley
> moseley@hank.org
>
>
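
The two ideas discussed in this thread (loop on read() until it returns zero bytes, then hand the aggregate buffer on as a file handle instead of a temp file) can be sketched in core Perl roughly as follows. This is only an illustration under stated assumptions, not mod_perl or Plack code: `slurp_body`, the `$reader` callback, and the sample payload are hypothetical names invented here, and `$reader` merely imitates the calling shape of `$r->read( $buffer, $len )`.

```perl
use strict;
use warnings;

# Read a body of unknown length by looping until the reader reports 0 bytes.
# $reader is a hypothetical callback shaped like $r->read: it fills $_[0]
# with up to $_[1] bytes and returns the number of bytes read.
sub slurp_body {
    my ($reader, $chunk_size) = @_;
    my $body = '';
    while (1) {
        my $buffer = '';
        my $n = $reader->($buffer, $chunk_size);
        die "read failed\n" unless defined $n;
        last if $n == 0;          # zero bytes: the body is complete
        $body .= $buffer;
    }
    return $body;
}

# Stand-in for the request body (assumption: a small JSON POST).
my $payload = '{"id":1,"name":"example"}';
open my $src, '<', \$payload or die "open: $!";
my $reader = sub { read($src, $_[0], $_[1]) };

# The second argument is only the buffer size per call, not the body length.
my $body = slurp_body($reader, 8);

# With no Content-Length header, the length is known once the loop finishes.
# (length counts bytes here because the string holds undecoded bytes.)
my $content_length = length $body;

# In-memory file handle over the buffer: core Perl, no temp file needed.
open my $fh, '<', \$body or die "open: $!";
my $first = <$fh>;

print "read $content_length bytes\n";
```

For large uploads the aggregate buffer lives entirely in memory, so spooling to disk may still be the better trade-off; the sketch only targets the small JSON bodies mentioned above.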