From: "Paul Joseph Davis (JIRA)"
To: dev@couchdb.apache.org
Date: Tue, 22 Dec 2009 12:02:29 +0000 (UTC)
Subject: [jira] Commented: (COUCHDB-583) storing attachments in compressed form and serving them in compressed form if accepted by the client
Message-ID: <562358687.1261483349408.JavaMail.jira@brutus>
In-Reply-To: <844257352.1259509400633.JavaMail.jira@brutus>

[ https://issues.apache.org/jira/browse/COUCHDB-583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793593#action_12793593 ]

Paul Joseph 
Davis commented on COUCHDB-583:
-------------------------------------------

@Filipe,

> Note that using non chunked compressed body responses requires
> storing all the compressed blocks in memory and then sending each
> one to the client. This is a necessary "evil", as we only know the length
> of the compressed body after compressing all the body, and we need
> to set the "Content-Length" header for non chunked responses.

You can't buffer attachments in RAM. That's just a no-no.

Going back to the RFC, I don't really see a good place where it's spelled out what to do about Content-Encoding headers and their correspondence with Content-Length. I'm fairly certain you've got this backwards though. The Content-Length needs to be the number of bytes of gzip data used to represent the message body. Content-Length is used to delineate messages, so having them mismatched would be bad. Though, this does mean that you'll need to store the number of bytes before and after compression when you stream the attachment to disk.

Also, the compression algorithm must be specified by the Accept-Encoding header, not a URL parameter.

The "treshold_for_chunking_comp_responses" is a bit of a weird threshold. You might want to change that to something like "min_compression_length" and just have it say "files shorter than this will not be compressed". And then leave the claims on chunked vs. not to the HTTP content negotiation algorithms.

I'll be out of town for the next couple weeks so forgive me if my responses are a tad slow.
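
To make the Content-Length point concrete, here is a rough sketch in Python rather than Erlang, and not taken from the patch. It streams an attachment through gzip chunk by chunk, counts the bytes before and after compression, and then picks the length that matches what is actually sent. The names store_compressed, accepts_gzip and response_headers are made up for illustration, and the Accept-Encoding check is deliberately simplified.

```python
import io
import zlib


def store_compressed(chunks, out, level=6):
    """Stream chunks through gzip into `out` without buffering the whole body.

    Returns (identity_length, gzip_length) so both can be stored with the
    attachment. Hypothetical helper, not CouchDB code.
    """
    # wbits=31 selects the gzip container (header plus CRC32 trailer).
    comp = zlib.compressobj(level, zlib.DEFLATED, 31)
    identity_len = 0
    gzip_len = 0
    for chunk in chunks:
        identity_len += len(chunk)
        data = comp.compress(chunk)
        if data:
            out.write(data)
            gzip_len += len(data)
    tail = comp.flush()  # emit whatever the compressor is still holding
    out.write(tail)
    gzip_len += len(tail)
    return identity_len, gzip_len


def accepts_gzip(accept_encoding):
    """Simplified check: gzip is listed and not disabled with q=0."""
    for item in accept_encoding.split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != "gzip":
            continue
        for p in parts[1:]:
            if p.lower().startswith("q="):
                try:
                    return float(p[2:]) > 0
                except ValueError:
                    return False
        return True
    return False


def response_headers(accept_encoding, identity_len, gzip_len):
    """Content-Length always counts the bytes that go on the wire."""
    if accepts_gzip(accept_encoding):
        return {"Content-Encoding": "gzip", "Content-Length": str(gzip_len)}
    return {"Content-Length": str(identity_len)}


if __name__ == "__main__":
    body = [b"hello " * 1000, b"world " * 1000]
    stored = io.BytesIO()  # stands in for the attachment file on disk
    lengths = store_compressed(body, stored)
    print(response_headers("gzip, deflate", *lengths))
    print(response_headers("identity", *lengths))
```

Since both lengths are known once the upload has been written, a non-chunked response can set Content-Length up front in either case without holding the compressed blocks in memory.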
> storing attachments in compressed form and serving them in compressed form if accepted by the client
> ----------------------------------------------------------------------------------------------------
>
>                 Key: COUCHDB-583
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-583
>             Project: CouchDB
>          Issue Type: New Feature
>          Components: Database Core, HTTP Interface
>         Environment: CouchDB trunk
>            Reporter: Filipe Manana
>         Attachments: couchdb-583-trunk-3rd-try.patch, couchdb-583-trunk-4th-try-trunk.patch, couchdb-583-trunk-5th-try.patch, couchdb-583-trunk-6th-try.patch, jira-couchdb-583-1st-try-trunk.patch, jira-couchdb-583-2nd-try-trunk.patch
>
>
> This feature allows Couch to gzip compress attachments as they are being received and store them in compressed form.
> When a client asks for downloading an attachment (e.g. GET somedb/somedoc/attachment.txt), the attachment is sent in compressed form if the client's http request has gzip specified as a valid transfer encoding for the response (using the http header "Accept-Encoding"). Otherwise couch decompresses the attachment before sending it back to the client.
> Attachments are compressed only if their MIME type matches one of those listed in a separate config file. Compression level is also configurable in the default.ini file.
> This follows Damien's suggestion from 30 November:
> "Perhaps we need a separate user editable ini file to specify compressable or non-compressable files (would probably be too big for the regular ini file). What do other web servers do?
> Also, a potential optimization is to compress the file while writing to disk, and serve the compressed bytes directly to clients that can handle it, and decompressed for those that can't. For compressable types, it's a win for both disk IO for reads and writes, and CPU on read."
> Patch attached.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.